{"id":1858,"date":"2019-05-13T20:11:04","date_gmt":"2019-05-14T04:11:04","guid":{"rendered":"https:\/\/live-cometml.pantheonsite.io\/blog\/building-a-fully-reproducible-machine-learning-pipeline-with-comet-ml-and-quilt\/"},"modified":"2019-05-13T20:11:04","modified_gmt":"2019-05-14T04:11:04","slug":"building-a-fully-reproducible-machine-learning-pipeline-with-comet-ml-and-quilt","status":"publish","type":"post","link":"https:\/\/www.comet.com\/site\/blog\/building-a-fully-reproducible-machine-learning-pipeline-with-comet-ml-and-quilt\/","title":{"rendered":"Building a fully reproducible machine learning pipeline with comet.ml and Quilt"},"content":{"rendered":"\n<h4 class=\"wp-block-heading\">Classifying fruits using a Keras multi-class image classification model and Google Open Images<\/h4>\n\n\n\n<figure class=\"wp-block-image alignfull size-large\"><img decoding=\"async\" src=\"https:\/\/wordpress.comet.ml\/app\/uploads\/2021\/03\/1R0DgVptZwIgZfm-QXXKm5g.png\" alt=\"\" \/>\n<figcaption>Photo by\u00a0<a href=\"https:\/\/unsplash.com\/photos\/1cWZgnBhZRs?utm_source=unsplash&amp;utm_medium=referral&amp;utm_content=creditCopyText\" target=\"_blank\" rel=\"noreferrer noopener\">Luke Michael<\/a>\u00a0on\u00a0<a href=\"https:\/\/unsplash.com\/search\/photos\/fruit?utm_source=unsplash&amp;utm_medium=referral&amp;utm_content=creditCopyText\" target=\"_blank\" rel=\"noreferrer noopener\">Unsplash<\/a><\/figcaption>\n<\/figure>\n\n\n\n<blockquote class=\"wp-block-quote is-layout-flow wp-block-quote-is-layout-flow\">\n<p><em>This post was written in collaboration with\u00a0<\/em><a href=\"https:\/\/www.linkedin.com\/in\/aleksey-bilogur-0956a994\/\" target=\"_blank\" rel=\"noreferrer noopener\">Aleksey Bilogur<\/a><em>\u00a0from the Quilt Data team.\u00a0<\/em><a href=\"https:\/\/twitter.com\/residentmario?lang=en\" target=\"_blank\" rel=\"noreferrer noopener\">Follow Aleksey on Twitter<\/a><em>\u00a0and his personal website\u00a0<\/em><a 
href=\"http:\/\/www.residentmar.io\/\" target=\"_blank\" rel=\"noreferrer noopener\">here<\/a><em>. Follow Quilt\u00a0<\/em><a href=\"https:\/\/twitter.com\/quiltdata\" target=\"_blank\" rel=\"noreferrer noopener\">here<\/a><\/p>\n<\/blockquote>\n\n\n\n<p>The term machine learning \u2018pipeline\u2019 can suggest a one-way flow of data and transformations, but in reality, machine learning pipelines are cyclical and iterative. For a given project, a data scientist can try hundreds or even thousands of experiments before arriving at a champion model to put in production.<\/p>\n\n\n\n<p>With each iteration, it becomes harder to manage subsets and variations of your data and models. Keeping track of which model iteration ran on which dataset is key to reproducibility.<\/p>\n\n\n\n<figure class=\"wp-block-pullquote\">\n<blockquote>\n<p>Having a proper machine learning pipeline that\u00a0<a href=\"https:\/\/blog.quiltdata.com\/reproduce-a-machine-learning-model-build-in-four-lines-of-code-b4f0a5c5f8c8\" target=\"_blank\" rel=\"noreferrer noopener\">tracks specific versions of data, code, and environment<\/a>\u00a0details can not only help you easily reproduce your own model results, but also allow you to share your work with fellow data scientists or machine learning engineers who need to deploy your model.<\/p>\n<\/blockquote>\n<\/figure>\n\n\n<hr class=\"wp-block-separator is-style-dots\" \/>\n\n\n<p><strong>In this article, we\u2019ll show you how to build a simple and reproducible end-to-end machine learning pipeline using a Keras multi-class image classification model and a custom dataset crafted from Google Open Images, using\u00a0<\/strong><a href=\"https:\/\/alpha.quiltdata.com\/b\/quilt-example\/\" target=\"_blank\" rel=\"noreferrer noopener\"><strong>Quilt T4<\/strong><\/a><strong>\u00a0and\u00a0<\/strong><a href=\"http:\/\/bit.ly\/309Cll1\" target=\"_blank\" rel=\"noreferrer noopener\"><strong>comet.ml<\/strong><\/a><strong>.<\/strong><\/p>\n\n\n\n<p>You can access the full 
tutorial in\u00a0<a href=\"https:\/\/github.com\/comet-ml\/keras-fruit-classifer\" target=\"_blank\" rel=\"noreferrer noopener\">this GitHub repository<\/a>. For a walk-through of the tutorial, continue reading below \u2b07\ufe0f.<\/p>\n\n\n\n<h2 class=\"wp-block-heading\"><strong>Creating your custom dataset<\/strong><\/h2>\n\n\n\n<p>The Open Images Dataset is an attractive target for building image recognition algorithms because it is one of the largest, most accurate, and most easily accessible image recognition datasets. For image recognition tasks, Open Images contains 15 million bounding boxes for 600 categories of objects on 1.75 million images. Image labeling tasks, meanwhile, enjoy 30 million labels across almost 20,000 categories.<\/p>\n\n\n\n<p>The images come from Flickr and are of highly variable quality, as would be realistic in an applied machine learning setting.<\/p>\n\n\n\n<p>Downloading the entire Google Open Images corpus is possible, and potentially necessary if you want to build a general-purpose image classifier or bounding box algorithm. 
However, downloading\u00a0<em>everything<\/em>\u00a0is a waste if you just want a small categorical subset of the data in the corpus.\u00a0<strong>For this tutorial, we are just interested in downloading and working with fruit images.<\/strong><\/p>\n\n\n\n<div class=\"wp-block-image\">\n<figure class=\"aligncenter size-large\"><img decoding=\"async\" src=\"https:\/\/wordpress.comet.ml\/app\/uploads\/2021\/03\/1R0DgVptZwIgZfm-QXXKm5g.png\" alt=\"\" \/>\n<figcaption>View an interactive version of this plot on Quilt T4\u00a0<a href=\"https:\/\/alpha.quiltdata.com\/b\/quilt-example\/tree\/quilt\/open_fruit\/\" target=\"_blank\" rel=\"noreferrer noopener\">here<\/a><\/figcaption>\n<\/figure>\n<\/div>\n\n\n\n<p>The\u00a0<code>src\/openimager<\/code>\u00a0subfolder in\u00a0<a href=\"https:\/\/github.com\/quiltdata\/open-fruit\" target=\"_blank\" rel=\"noreferrer noopener\">the provided GitHub repository<\/a>\u00a0contains a small module that handles downloading a categorical subset of the Open Images corpus: just the images corresponding to a user-selected group of labels, and just from the set of images with bounding box information attached. It does so by downloading the source images directly from Flickr, rather than using the zipped blob files.<\/p>\n\n\n\n<p>This script will allow you to download any subset of the 600 labels that have bounding box data. 
Here\u2019s a taste of what\u2019s possible:<\/p>\n\n\n\n<p><code>football<\/code>,\u00a0<code>toy<\/code>,\u00a0<code>bird<\/code>,\u00a0<code>cat<\/code>,\u00a0<code>vase<\/code>,\u00a0<code>lemon<\/code>,\u00a0<code>dog<\/code>,\u00a0<code>elephant<\/code>,\u00a0<code>shark<\/code>,\u00a0<code>flower<\/code>,\u00a0<code>furniture<\/code>,\u00a0<code>airplane<\/code>,\u00a0<code>spoon<\/code>,\u00a0<code>bench<\/code>,\u00a0<code>swan<\/code>,\u00a0<code>peanut<\/code>,\u00a0<code>camera<\/code>,\u00a0<code>flute<\/code>,\u00a0<code>helmet<\/code>,\u00a0<code>pomegranate<\/code>,\u00a0<code>crown<\/code>\u2026<\/p>\n\n\n\n<p>For the purposes of this article, we\u2019ll limit ourselves to just the following fruit classes:<\/p>\n\n\n\n<p><code>apple<\/code>,\u00a0<code>banana<\/code>,\u00a0<code>cantaloupe<\/code>,\u00a0<code>common_fig<\/code>,\u00a0<code>grape<\/code>,\u00a0<code>lemon<\/code>,\u00a0<code>mango<\/code>,\u00a0<code>orange<\/code>,\u00a0<code>peach<\/code>,\u00a0<code>pear<\/code>,\u00a0<code>pineapple<\/code>,\u00a0<code>pomegranate<\/code>,\u00a0<code>strawberry<\/code>,\u00a0<code>tomato<\/code>,\u00a0<code>watermelon<\/code><\/p>\n\n\n\n<p>For more information on Open Images, check out the article \u2018<a href=\"https:\/\/medium.freecodecamp.org\/how-to-classify-photos-in-600-classes-using-nine-million-open-images-65847da1a319\" target=\"_blank\" rel=\"noreferrer noopener\">How to classify photos in 600 classes using nine million Open Images<\/a>\u2019.<\/p>\n\n\n\n<h3 class=\"wp-block-heading\"><strong>Preprocessing your data \u2014 and packaging it<\/strong><\/h3>\n\n\n\n<p><a href=\"https:\/\/github.com\/comet-ml\/keras-fruit-classifer\/blob\/master\/notebooks\/build-dataset.ipynb\" target=\"_blank\" rel=\"noreferrer noopener\">This annotated Jupyter notebook<\/a>\u00a0in the\u00a0<a 
href=\"https:\/\/github.com\/comet-ml\/keras-fruit-classifer\" target=\"_blank\" rel=\"noreferrer noopener\">demo GitHub repository\u00a0<\/a>does this preprocessing work. After running the notebook code, we will have an\u00a0<code>images_cropped<\/code>\u00a0folder on disk containing all of the cropped images.<\/p>\n\n\n\n<p>The packaged fruit class data, along with the pre-processed images, is easy to access via\u00a0<a href=\"https:\/\/alpha.quiltdata.com\/b\/quilt-example\/tree\/quilt\/open_fruit\/\" target=\"_blank\" rel=\"noreferrer noopener\">the Quilt T4 package<\/a>. In order to access the data, simply run this command:<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code>! pip install t4\n\nimport t4\n\nt4.Package.install('quilt\/open_fruit', registry='s3:\/\/quilt-example', dest='some\/path\/some\/where')<\/code><\/pre>\n\n\n\n<p>Looking closely at the fruit data, we can see that there is a class imbalance. There are over 26,000 samples of bananas, but only a few hundred labelled common fig or pear examples. This skew is important to note as we approach building our image classifier.<\/p>\n\n\n\n<div class=\"wp-block-image\">\n<figure class=\"aligncenter size-large\"><img decoding=\"async\" src=\"https:\/\/wordpress.comet.ml\/app\/uploads\/2021\/03\/17WEmPKEJHB1eQ0C90e2c7w.png\" alt=\"\" \/>\n<figcaption>View this plot on Quilt T4\u00a0<a href=\"https:\/\/alpha.quiltdata.com\/b\/quilt-example\/tree\/quilt\/open_fruit\/\" target=\"_blank\" rel=\"noreferrer noopener\">here<\/a><\/figcaption>\n<\/figure>\n<\/div>\n\n\n\n<h2 class=\"wp-block-heading\"><strong>Building your image classification model<\/strong><\/h2>\n\n\n\n<p>Now that we\u2019ve downloaded our fruit data from Quilt, we can begin building our image classification model! 
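<\/p>\n\n\n\n<p>One way to counteract the class imbalance noted above is to weight rarer classes more heavily during training, via the\u00a0<code>class_weight<\/code>\u00a0argument of Keras\u2019\u00a0<code>model.fit()<\/code>. Here\u2019s a minimal sketch of computing such weights, inversely proportional to class frequency (the counts below are illustrative placeholders, not the real dataset tallies):<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code># Illustrative label counts: placeholders, not the real dataset tallies\nlabel_counts = {'banana': 26000, 'strawberry': 7000, 'common_fig': 300, 'pear': 400}\n\ntotal = sum(label_counts.values())\nn_classes = len(label_counts)\n\n# Keras expects class indices as keys; here the index is the position of\n# the label in sorted order. Rarer classes receive larger weights.\nlabels = sorted(label_counts)\nclass_weight = {i: total / (n_classes * label_counts[label])\n                for i, label in enumerate(labels)}<\/code><\/pre>\n\n\n\n<p>Passing this dict as\u00a0<code>model.fit(..., class_weight=class_weight)<\/code>\u00a0scales each class\u2019s contribution to the loss, so a few hundred figs are not drowned out by tens of thousands of bananas.<\/p>\n\n\n\n<p>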
As with any machine learning project, we\u2019ll go through a few experiments to try to maximize our model\u2019s validation accuracy:<\/p>\n\n\n\n<ul class=\"wp-block-list\">\n<li><em>First, we\u2019ll start with a simple baseline convolutional neural network (CNN) model.<\/em><\/li>\n<li><em>Then, we\u2019ll try to leverage a pre-trained network (an InceptionV3 architecture, pre-trained on the ImageNet dataset) whose learned features can help us reach a higher accuracy than relying on our fruit dataset alone. We\u2019ll use transfer learning, fine-tuning the top layers of our pre-trained network.<\/em><\/li>\n<li><em>Finally, we\u2019ll do a quick overview of different approaches for optimization, including adjusting parameters like the amount of dropout, learning rate, and weight decay to see how they could contribute to model performance.<\/em><\/li>\n<\/ul>\n\n\n\n<p>The material for this tutorial was inspired by Francois Chollet\u2019s excellent post \u2018<a href=\"https:\/\/blog.keras.io\/building-powerful-image-classification-models-using-very-little-data.html\" target=\"_blank\" rel=\"noreferrer noopener\">Building powerful image classification models using very little data<\/a>\u2019. 
We\u2019ve expanded upon Chollet\u2019s example and adjusted it to reflect our multi-class classification problem space.<\/p>\n\n\n\n<p>Along with having proper data versioning from Quilt, we\u2019ll also make sure to track our results, code, and environment for our different model iterations, as this is critical to building a reproducible machine learning pipeline.<\/p>\n\n\n\n<blockquote class=\"wp-block-quote is-layout-flow wp-block-quote-is-layout-flow\">\n<p><strong>Note:\u00a0<\/strong><em>We\u2019ll be using Jupyter notebooks for this tutorial, but comet.ml has native support for both\u00a0<\/em><a href=\"https:\/\/medium.com\/comet-ml\/monitoring-machine-learning-model-results-live-from-jupyter-notebooks-765a142069bb\" target=\"_blank\" rel=\"noreferrer noopener\">Jupyter notebooks\u00a0<\/a><em>and scripts.<\/em><\/p>\n<\/blockquote>\n\n\n\n<h3 class=\"wp-block-heading\"><strong>Baseline model \u2014 Simple CNN<\/strong><\/h3>\n\n\n\n<p>For our baseline model, we are using a small CNN with three convolution layers, each using a ReLU activation and followed by a max-pooling layer. We\u2019ll include data augmentation and fairly aggressive dropout to prevent overfitting. Remember, we\u2019re not expecting our best accuracy here, so if you\u2019d like to skip this section and go straight to the pre-trained model, simply proceed to the next section below.<\/p>\n\n\n\n<p>Here are the <a href=\"https:\/\/www.comet.com\/ceceshao1\/comet-quilt-example\/d0625f21ccb34e83b9a338092016cf83\" target=\"_blank\" rel=\"noreferrer noopener\">experiment details<\/a>\u00a0for our small CNN model:<\/p>\n\n\n\n<p>Not surprisingly, our simple CNN model did not perform that well on the multi-class classification task. 
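<\/p>\n\n\n\n<p>For reference, a baseline of the shape described above can be sketched as follows (a minimal sketch using\u00a0<code>tensorflow.keras<\/code>; the input size and layer widths are assumptions, not the exact tutorial code):<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code># A small CNN: three convolution layers, each with a ReLU activation and\n# followed by max-pooling, plus aggressive dropout and a 15-way softmax.\nfrom tensorflow.keras import layers, models\n\nNUM_CLASSES = 15  # the fruit labels listed earlier\n\nmodel = models.Sequential([\n    layers.Input(shape=(128, 128, 3)),  # assumed input size\n    layers.Conv2D(32, (3, 3), activation='relu'),\n    layers.MaxPooling2D((2, 2)),\n    layers.Conv2D(32, (3, 3), activation='relu'),\n    layers.MaxPooling2D((2, 2)),\n    layers.Conv2D(64, (3, 3), activation='relu'),\n    layers.MaxPooling2D((2, 2)),\n    layers.Flatten(),\n    layers.Dense(64, activation='relu'),\n    layers.Dropout(0.5),  # fairly aggressive dropout against overfitting\n    layers.Dense(NUM_CLASSES, activation='softmax'),\n])\nmodel.compile(optimizer='rmsprop',\n              loss='categorical_crossentropy',\n              metrics=['accuracy'])<\/code><\/pre>\n\n\n\n<p>Data augmentation would be layered on top of this with Keras\u2019\u00a0<code>ImageDataGenerator<\/code>\u00a0when feeding images to\u00a0<code>model.fit()<\/code>.<\/p>\n\n\n\n<p>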
The architecture was originally designed for a binary classification task; with many times that number of classes, the network trivially needs more capacity to reach comparable performance.\u00a0<strong>Here are the metrics for one run of our model (link\u00a0<\/strong><a href=\"https:\/\/www.comet.com\/ceceshao1\/comet-quilt-example\/863e981d4189442d9c33efd5407a182a\" target=\"_blank\" rel=\"noreferrer noopener\"><strong>here<\/strong><\/a><strong>):<\/strong><\/p>\n\n\n\n<div class=\"wp-block-image\">\n<figure class=\"aligncenter size-large\"><img decoding=\"async\" src=\"https:\/\/wordpress.comet.ml\/app\/uploads\/2021\/03\/1T3qXHXMNjfRNsBwcQ0bbcA.png\" alt=\"\" \/><\/figure>\n<\/div>\n\n\n\n<p>To log your experiment results from training, set up your comet.ml account\u00a0<a href=\"http:\/\/bit.ly\/309Cll1\" target=\"_blank\" rel=\"noreferrer noopener\">here<\/a>. For each run of the model, we initialize the Comet experiment object and provide our API key and project name.<\/p>\n\n\n\n<div class=\"wp-block-image\">\n<figure class=\"aligncenter size-large\"><img decoding=\"async\" src=\"https:\/\/wordpress.comet.ml\/app\/uploads\/2021\/03\/1otJVH1p0dTRVYaWk52KaGw.png\" alt=\"\" \/><\/figure>\n<\/div>\n\n\n\n<p>Once you run\u00a0<code>model.fit()<\/code>, you\u2019ll be able to see your different model runs in comet.ml through the direct experiment URL. 
As an example for this tutorial, we have created a Comet project that you can view and interact with\u00a0<a href=\"https:\/\/www.comet.com\/ceceshao1\/comet-quilt-example\" target=\"_blank\" rel=\"noreferrer noopener\">here<\/a>.<\/p>\n\n\n\n<p>Since we\u2019re using Keras, Comet\u2019s auto-logging for popular machine learning frameworks allows us to automatically capture model details such as accuracy and loss metrics, the model\u2019s graph definition, and package dependencies \u2014 this\u00a0<em>significantly<\/em>\u00a0reduces the amount of manual logging we have to do from our end.<\/p>\n\n\n\n<h3 class=\"wp-block-heading\"><strong>Using a pre-trained model with transfer learning: InceptionV3<\/strong><\/h3>\n\n\n\n<p>A popular starting point for building image classifiers these days is to use a pre-trained network and fine-tune it with new classes of data. Let\u2019s use this approach to build our image classifier (just make sure to take note of\u00a0<a href=\"https:\/\/medium.com\/comet-ml\/approach-pre-trained-deep-learning-models-with-caution-9f0ff739010c\" target=\"_blank\" rel=\"noreferrer noopener\">these implementation details for pre-trained models<\/a>).<\/p>\n\n\n\n<p>There are several popular CNN architectures, such as VGGNet, ResNet, and AlexNet, along with a wealth of resources to read more about CNNs (see\u00a0<a href=\"https:\/\/medium.com\/@RaghavPrabhu\/understanding-of-convolutional-neural-network-cnn-deep-learning-99760835f148\" target=\"_blank\" rel=\"noreferrer noopener\">here<\/a>\u00a0and\u00a0<a href=\"https:\/\/adeshpande3.github.io\/adeshpande3.github.io\/A-Beginner%27s-Guide-To-Understanding-Convolutional-Neural-Networks\/\" target=\"_blank\" rel=\"noreferrer noopener\">here<\/a>). Keras enables users to easily access these pre-trained models (i.e. 
their weights pre-trained on ImageNet) through\u00a0<a href=\"https:\/\/keras.io\/applications\/\" target=\"_blank\" rel=\"noreferrer noopener\">keras.applications<\/a>.<\/p>\n\n\n\n<p>We selected InceptionV3 because it\u2019s both smaller than VGGNet and documented to provide higher accuracy on benchmark datasets.\u00a0<a href=\"https:\/\/codelabs.developers.google.com\/codelabs\/cpb102-txf-learning\/index.html#1\" target=\"_blank\" rel=\"noreferrer noopener\">Transfer learning with InceptionV3<\/a>\u00a0essentially means that we re-use the feature extraction portion of the model that has been trained with the ImageNet dataset and re-train the classification portion on our fruit dataset.<\/p>\n\n\n\n<div class=\"wp-block-image\">\n<figure class=\"aligncenter size-large\"><img decoding=\"async\" src=\"https:\/\/wordpress.comet.ml\/app\/uploads\/2021\/03\/img_6064e46183c13.png\" alt=\"\" \/>\n<figcaption>See\u00a0<a href=\"https:\/\/codelabs.developers.google.com\/codelabs\/cpb102-txf-learning\/index.html#5\" target=\"_blank\" rel=\"noreferrer noopener\">this helpful diagram<\/a>\u00a0on transfer learning with an Inception V3 architecture<\/figcaption>\n<\/figure>\n<\/div>\n\n\n\n<p>Here\u2019s the <a href=\"https:\/\/gist.githubusercontent.com\/ceceshao1\/cb834ed819628244093a2b61408c86e0\">code<\/a> plus\u00a0<a href=\"https:\/\/www.comet.com\/ceceshao1\/comet-quilt-example\/d0625f21ccb34e83b9a338092016cf83\" target=\"_blank\" rel=\"noreferrer noopener\">experiment details<\/a>\u00a0for our fine-tuned InceptionV3 model:<\/p>\n\n\n\n<p>Once we begin training with\u00a0<code>model.fit()<\/code>, we can use Comet to track how the model is performing in real time. We can also check to make sure that we\u2019re properly using our GPUs in the\u00a0<strong>System Metrics<\/strong>\u00a0tab. 
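<\/p>\n\n\n\n<p>Concretely, the re-use-and-re-train split described above looks something like this (a minimal sketch using\u00a0<code>tensorflow.keras<\/code>; the classification head is an assumption, not the exact tutorial code):<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code># Re-use the InceptionV3 feature extractor, freeze it, and train a new\n# classification head on the fruit classes.\nfrom tensorflow.keras import layers, models\nfrom tensorflow.keras.applications import InceptionV3\n\nNUM_CLASSES = 15\n\n# weights=None keeps this sketch light to run; pass weights='imagenet'\n# to actually re-use the features learned on ImageNet.\nbase = InceptionV3(weights=None, include_top=False, input_shape=(299, 299, 3))\nbase.trainable = False  # freeze the feature-extraction portion\n\nmodel = models.Sequential([\n    base,\n    layers.GlobalAveragePooling2D(),\n    layers.Dense(256, activation='relu'),\n    layers.Dropout(0.5),\n    layers.Dense(NUM_CLASSES, activation='softmax'),\n])\nmodel.compile(optimizer='rmsprop',\n              loss='categorical_crossentropy',\n              metrics=['accuracy'])<\/code><\/pre>\n\n\n\n<p>Fine-tuning the top layers then amounts to unfreezing the last few blocks of the base model and re-compiling with a low learning rate.<\/p>\n\n\n\n<p>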
The experiment charts in Comet update with our model\u2019s accuracy and loss metrics:<\/p>\n\n\n\n<div class=\"wp-block-image\">\n<figure class=\"aligncenter size-large\"><img decoding=\"async\" src=\"https:\/\/wordpress.comet.ml\/app\/uploads\/2021\/03\/1NvEpw78RTU2REPgaOvOY4A.gif\" alt=\"\" \/><\/figure>\n<\/div>\n\n\n\n<p>At the end of training, we\u2019ll log our model weights to Comet so we can reproduce the model later if we need to.<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code># save locally\nmodel.save_weights('inceptionv3_tuned.h5')\n\n# save to Comet Asset Tab\n# you can retrieve these weights later via the REST API\nexperiment.log_asset(file_path='.\/inceptionv3_tuned.h5', file_name='inceptionv3_tuned.h5')<\/code><\/pre>\n\n\n\n<p>If you want to retrieve the model code and have trained your model from a git directory, simply use the\u00a0<strong>Reproduce\u00a0<\/strong>button in the Comet experiment view.<\/p>\n\n\n\n<div class=\"wp-block-image\">\n<figure class=\"aligncenter size-large\"><img decoding=\"async\" src=\"https:\/\/wordpress.comet.ml\/app\/uploads\/2021\/03\/1JiSrnmEKl78JRPdMjTlDPw.png\" alt=\"\" \/><\/figure>\n<\/div>\n\n\n\n<p>The\u00a0<strong>Reproduce<\/strong>\u00a0dropdown will surface key pieces of information about your environment and git commit: everything you need to reproduce your experiment, including the actual run commands or notebook file. 
If you have uncommitted changes, we also provide you with a patch for applying your changes later.<\/p>\n\n\n\n<h3 class=\"wp-block-heading\"><strong>Evaluating the model<\/strong><\/h3>\n\n\n\n<p>In order to evaluate our image classifier model, it\u2019s useful to generate a few sample predictions and plot a confusion matrix, so we can see where our model classified certain fruits correctly and incorrectly.<\/p>\n\n\n\n<p>These images and figures would also be useful to share with teammates, so we can log them to Comet, even after the experiment is complete, using the\u00a0<code>Experiment.log_figure()<\/code>\u00a0and\u00a0<code>Experiment.log_image()<\/code>\u00a0methods (see more\u00a0<a href=\"https:\/\/www.comet.com\/docs\/python-sdk\/matplotlib\/#uploading-figures-and-plots\" target=\"_blank\" rel=\"noreferrer noopener\">here<\/a>).<\/p>\n\n\n\n<div class=\"wp-block-image\">\n<figure class=\"aligncenter size-large\"><img decoding=\"async\" src=\"https:\/\/wordpress.comet.ml\/app\/uploads\/2021\/03\/1S__Tv59AA6A9CXKzZnSCFg.png\" alt=\"\" \/>\n<figcaption>For\u00a0<a href=\"https:\/\/www.comet.com\/ceceshao1\/comet-quilt-example\/b1c98bfbeec9473ea825724f75063863\/images\" target=\"_blank\" rel=\"noreferrer noopener\">this experiment<\/a>, we\u2019ve logged some random samples from our fruit dataset. 
You can see this sample image is hardly a clear image of a strawberry (in fact, there was some preprocessing!).<\/figcaption>\n<\/figure>\n<\/div>\n\n\n\n<blockquote class=\"wp-block-quote is-layout-flow wp-block-quote-is-layout-flow\">\n<p><em>See this great resource on evaluating machine learning models from Jeremy Jordan:\u00a0<\/em><a href=\"https:\/\/www.jeremyjordan.me\/evaluating-a-machine-learning-model\/\" target=\"_blank\" rel=\"noreferrer noopener\">https:\/\/www.jeremyjordan.me\/evaluating-a-machine-learning-model\/<\/a><\/p>\n<\/blockquote>\n\n\n\n<h3 class=\"wp-block-heading\"><strong>Further optimizations<\/strong><\/h3>\n\n\n\n<p>There are several ways we could approach improving our model. Here is a non-exhaustive list of things we could try adjusting:<\/p>\n\n\n\n<ul class=\"wp-block-list\">\n<li>Type of architecture \u2014 we also provide the code for VGG16\u00a0<a href=\"https:\/\/github.com\/comet-ml\/keras-fruit-classifer\/tree\/master\/notebooks\" target=\"_blank\" rel=\"noreferrer noopener\">here<\/a><\/li>\n<li>Number of Layers \u2014 Increase network depth to give it more capacity. 
Try adding more layers, one at a time, and check for improvements<\/li>\n<li>Number of Neurons in a layer<\/li>\n<li>Adding regularization and adjusting those parameters<\/li>\n<li>Learning Rate \u2014 you can incorporate the Keras LearningRateScheduler callback (see\u00a0<a href=\"https:\/\/machinelearningmastery.com\/using-learning-rate-schedules-deep-learning-models-python-keras\/\" target=\"_blank\" rel=\"noreferrer noopener\">here<\/a>\u00a0and\u00a0<a href=\"https:\/\/www.jeremyjordan.me\/nn-learning-rate\/\" target=\"_blank\" rel=\"noreferrer noopener\">here<\/a>)<\/li>\n<li>Type of optimization \/ back-propagation technique to use<\/li>\n<li>Dropout rate<\/li>\n<li><a href=\"https:\/\/towardsdatascience.com\/boost-your-cnn-image-classifier-performance-with-progressive-resizing-in-keras-a7d96da06e20\" target=\"_blank\" rel=\"noreferrer noopener\">Progressive Resizing<\/a><\/li>\n<li><a href=\"https:\/\/www.comet.com\/parameter-optimization\" target=\"_blank\" rel=\"noreferrer noopener\">Hyperparameter optimization services<\/a><\/li>\n<\/ul>\n\n\n\n<p>As you try these different optimizations,\u00a0<a href=\"http:\/\/bit.ly\/309Cll1\" target=\"_blank\" rel=\"noreferrer noopener\">comet.ml<\/a>\u00a0allows you to create visualizations like bar charts, line plots, and parallel coordinates charts to track your experiments. These experiment-level and project-level visualizations help you quickly identify your best-performing models and understand your parameter space.<\/p>\n\n\n<hr class=\"wp-block-separator is-style-dots\" \/>\n\n\n<h2 class=\"wp-block-heading\"><strong>Your full machine learning pipeline<\/strong><\/h2>\n\n\n\n<p>If you had to share your model results or intermediate work with a fellow data scientist today, how would you do it?<\/p>\n\n\n\n<p>The benefit of using Quilt for data versioning and Comet for model versioning is that, by combining these best-in-breed tools, you can make your machine learning experiments easily accessible, trackable, and reproducible all at once.<\/p>\n\n\n\n<p><strong>Sharing a model and the code used to generate it?<\/strong>\u00a0Link your collaborator to\u00a0<a href=\"https:\/\/www.comet.com\/ceceshao1\/comet-quilt-example\/b1c98bfbeec9473ea825724f75063863\" target=\"_blank\" rel=\"noreferrer noopener\">the Comet experiment page<\/a>. Sharing the data you used? Share a link to\u00a0<a href=\"https:\/\/alpha.quiltdata.com\/b\/quilt-example\/tree\/quilt\/open_fruit\/\" target=\"_blank\" rel=\"noreferrer noopener\">the Quilt T4 package<\/a>.<\/p>\n\n\n\n<div class=\"wp-block-image\">\n<figure class=\"aligncenter\"><img decoding=\"async\" class=\"wp-image-566\" src=\"https:\/\/i1.wp.com\/blog.comet.ml\/wp-content\/uploads\/2019\/10\/data-model.png?fit=769%2C547&amp;ssl=1\" alt=\"\" \/><\/figure>\n<\/div>\n\n\n\n<p><strong>Reproducing the result locally, or using an old experiment as the starting point for a new one?<\/strong>\u00a0Get back to where you left off with this code:<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code># GET THE CODE\ngit clone https:\/\/github.com\/comet-ml\/keras-fruit-classifer\ncd keras-fruit-classifer\/\n\n# GET THE DATA\npython -c \"import t4; t4.Package.install('quilt\/open_fruit', 's3:\/\/quilt-example', dest='keras-fruit-classifier\/')\"\n\n# GET THE ENVIRONMENT\n# There are a *lot* of ways to do this: a pip requirements.txt, a\n# conda environment.yml, a Docker container...\n\n# Here's one cool way - cloning the Comet runtime\nPY_VERSION=$(python -c \"import comet_ml; print(comet_ml.API().get_experiment_system_details('01e427cedce145f8bc69f19ae9fb45bb')['python_version'])\")\n\nconda create -n my_test_env python=$PY_VERSION\nconda activate my_test_env\n\npython -c \"import comet_ml; 
print('\\n'.join(comet_ml.API().get_experiment_installed_packages('01e427cedce145f8bc69f19ae9fb45bb')))\" &gt; requirements.txt\npip install -r requirements.txt\n\n# You can also get this from comet.ml by clicking on the Download button\n\n# GET DEVELOPING\njupyter notebook<\/code><\/pre>\n\n\n\n<p>Congratulations! You\u2019ve gone beyond building a multi-class image classifier model to building a fully reproducible (and shareable) machine learning pipeline with data, code, and environment details \u2b50\ufe0f<\/p>\n\n\n\n<p><em>Thanks to\u00a0Gideon Mendels\u00a0and\u00a0Aleksey Bilogur.\u00a0<\/em><\/p>\n","protected":false},"excerpt":{"rendered":"<p>Classifying fruits using a Keras multi-class image classification model and Google Open Images &nbsp; This post was written in collaboration with\u00a0Aleksey Bilogur\u00a0from the Quilt Data team.\u00a0Follow Aleksey on Twitter\u00a0and his personal website\u00a0here. Follow Quilt\u00a0here The term machine learning \u2018pipeline\u2019 can suggest a one-way flow of data and transformations, but in reality, machine learning pipelines are [&hellip;]<\/p>\n","protected":false},"author":1,"featured_media":0,"comment_status":"closed","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"customer_name":"","customer_description":"","customer_industry":"","customer_technologies":"","customer_logo":"","footnotes":""},"categories":[6],"tags":[],"coauthors":[107],"class_list":["post-1858","post","type-post","status-publish","format-standard","hentry","category-machine-learning"],"yoast_head":"<!-- This site is optimized with the Yoast SEO Premium plugin v25.9 (Yoast SEO v25.9) - https:\/\/yoast.com\/wordpress\/plugins\/seo\/ -->\n<title>Building a fully reproducible machine learning pipeline with comet.ml and Quilt - Comet<\/title>\n<meta name=\"robots\" content=\"index, follow, max-snippet:-1, max-image-preview:large, max-video-preview:-1\" \/>\n<link rel=\"canonical\" 
href=\"https:\/\/www.comet.com\/site\/blog\/building-a-fully-reproducible-machine-learning-pipeline-with-comet-ml-and-quilt\/\" \/>\n<meta property=\"og:locale\" content=\"en_US\" \/>\n<meta property=\"og:type\" content=\"article\" \/>\n<meta property=\"og:title\" content=\"Building a fully reproducible machine learning pipeline with comet.ml and Quilt\" \/>\n<meta property=\"og:description\" content=\"Classifying fruits using a Keras multi-class image classification model and Google Open Images &nbsp; This post was written in collaboration with\u00a0Aleksey Boligur\u00a0from the Quilt Data team.\u00a0Follow Aleksey on Twitter\u00a0and his personal website\u00a0here. Follow Quilt\u00a0here The term machine learning \u2018pipeline\u2019 can suggest a one-way flow of data and transformations, but in reality, machine learning pipelines are [&hellip;]\" \/>\n<meta property=\"og:url\" content=\"https:\/\/www.comet.com\/site\/blog\/building-a-fully-reproducible-machine-learning-pipeline-with-comet-ml-and-quilt\/\" \/>\n<meta property=\"og:site_name\" content=\"Comet\" \/>\n<meta property=\"article:publisher\" content=\"https:\/\/www.facebook.com\/cometdotml\" \/>\n<meta property=\"article:published_time\" content=\"2019-05-14T04:11:04+00:00\" \/>\n<meta property=\"og:image\" content=\"https:\/\/wordpress.comet.ml\/app\/uploads\/2021\/03\/1R0DgVptZwIgZfm-QXXKm5g.png\" \/>\n<meta name=\"author\" content=\"Gideon Mendels\" \/>\n<meta name=\"twitter:card\" content=\"summary_large_image\" \/>\n<meta name=\"twitter:creator\" content=\"@Cometml\" \/>\n<meta name=\"twitter:site\" content=\"@Cometml\" \/>\n<meta name=\"twitter:label1\" content=\"Written by\" \/>\n\t<meta name=\"twitter:data1\" content=\"Gideon Mendels\" \/>\n\t<meta name=\"twitter:label2\" content=\"Est. reading time\" \/>\n\t<meta name=\"twitter:data2\" content=\"11 minutes\" \/>\n<!-- \/ Yoast SEO Premium plugin. 
-->","yoast_head_json":{"title":"Building a fully reproducible machine learning pipeline with comet.ml and Quilt - Comet","robots":{"index":"index","follow":"follow","max-snippet":"max-snippet:-1","max-image-preview":"max-image-preview:large","max-video-preview":"max-video-preview:-1"},"canonical":"https:\/\/www.comet.com\/site\/blog\/building-a-fully-reproducible-machine-learning-pipeline-with-comet-ml-and-quilt\/","og_locale":"en_US","og_type":"article","og_title":"Building a fully reproducible machine learning pipeline with comet.ml and Quilt","og_description":"Classifying fruits using a Keras multi-class image classification model and Google Open Images &nbsp; This post was written in collaboration with\u00a0Aleksey Boligur\u00a0from the Quilt Data team.\u00a0Follow Aleksey on Twitter\u00a0and his personal website\u00a0here. Follow Quilt\u00a0here The term machine learning \u2018pipeline\u2019 can suggest a one-way flow of data and transformations, but in reality, machine learning pipelines are [&hellip;]","og_url":"https:\/\/www.comet.com\/site\/blog\/building-a-fully-reproducible-machine-learning-pipeline-with-comet-ml-and-quilt\/","og_site_name":"Comet","article_publisher":"https:\/\/www.facebook.com\/cometdotml","article_published_time":"2019-05-14T04:11:04+00:00","og_image":[{"url":"https:\/\/wordpress.comet.ml\/app\/uploads\/2021\/03\/1R0DgVptZwIgZfm-QXXKm5g.png","type":"","width":"","height":""}],"author":"Gideon Mendels","twitter_card":"summary_large_image","twitter_creator":"@Cometml","twitter_site":"@Cometml","twitter_misc":{"Written by":"Gideon Mendels","Est. 
reading time":"11 minutes"},"schema":{"@context":"https:\/\/schema.org","@graph":[{"@type":"Article","@id":"https:\/\/www.comet.com\/site\/blog\/building-a-fully-reproducible-machine-learning-pipeline-with-comet-ml-and-quilt\/#article","isPartOf":{"@id":"https:\/\/www.comet.com\/site\/blog\/building-a-fully-reproducible-machine-learning-pipeline-with-comet-ml-and-quilt\/"},"author":{"name":"engineering@atre.net","@id":"https:\/\/www.comet.com\/site\/#\/schema\/person\/550ac35e8e821db8064c5bd1f0a04e6b"},"headline":"Building a fully reproducible machine learning pipeline with comet.ml and Quilt","datePublished":"2019-05-14T04:11:04+00:00","mainEntityOfPage":{"@id":"https:\/\/www.comet.com\/site\/blog\/building-a-fully-reproducible-machine-learning-pipeline-with-comet-ml-and-quilt\/"},"wordCount":1905,"publisher":{"@id":"https:\/\/www.comet.com\/site\/#organization"},"image":{"@id":"https:\/\/www.comet.com\/site\/blog\/building-a-fully-reproducible-machine-learning-pipeline-with-comet-ml-and-quilt\/#primaryimage"},"thumbnailUrl":"https:\/\/wordpress.comet.ml\/app\/uploads\/2021\/03\/1R0DgVptZwIgZfm-QXXKm5g.png","articleSection":["Machine Learning"],"inLanguage":"en-US"},{"@type":"WebPage","@id":"https:\/\/www.comet.com\/site\/blog\/building-a-fully-reproducible-machine-learning-pipeline-with-comet-ml-and-quilt\/","url":"https:\/\/www.comet.com\/site\/blog\/building-a-fully-reproducible-machine-learning-pipeline-with-comet-ml-and-quilt\/","name":"Building a fully reproducible machine learning pipeline with comet.ml and Quilt - 
Comet","isPartOf":{"@id":"https:\/\/www.comet.com\/site\/#website"},"primaryImageOfPage":{"@id":"https:\/\/www.comet.com\/site\/blog\/building-a-fully-reproducible-machine-learning-pipeline-with-comet-ml-and-quilt\/#primaryimage"},"image":{"@id":"https:\/\/www.comet.com\/site\/blog\/building-a-fully-reproducible-machine-learning-pipeline-with-comet-ml-and-quilt\/#primaryimage"},"thumbnailUrl":"https:\/\/wordpress.comet.ml\/app\/uploads\/2021\/03\/1R0DgVptZwIgZfm-QXXKm5g.png","datePublished":"2019-05-14T04:11:04+00:00","breadcrumb":{"@id":"https:\/\/www.comet.com\/site\/blog\/building-a-fully-reproducible-machine-learning-pipeline-with-comet-ml-and-quilt\/#breadcrumb"},"inLanguage":"en-US","potentialAction":[{"@type":"ReadAction","target":["https:\/\/www.comet.com\/site\/blog\/building-a-fully-reproducible-machine-learning-pipeline-with-comet-ml-and-quilt\/"]}]},{"@type":"ImageObject","inLanguage":"en-US","@id":"https:\/\/www.comet.com\/site\/blog\/building-a-fully-reproducible-machine-learning-pipeline-with-comet-ml-and-quilt\/#primaryimage","url":"https:\/\/wordpress.comet.ml\/app\/uploads\/2021\/03\/1R0DgVptZwIgZfm-QXXKm5g.png","contentUrl":"https:\/\/wordpress.comet.ml\/app\/uploads\/2021\/03\/1R0DgVptZwIgZfm-QXXKm5g.png"},{"@type":"BreadcrumbList","@id":"https:\/\/www.comet.com\/site\/blog\/building-a-fully-reproducible-machine-learning-pipeline-with-comet-ml-and-quilt\/#breadcrumb","itemListElement":[{"@type":"ListItem","position":1,"name":"Home","item":"https:\/\/www.comet.com\/site\/"},{"@type":"ListItem","position":2,"name":"Building a fully reproducible machine learning pipeline with comet.ml and Quilt"}]},{"@type":"WebSite","@id":"https:\/\/www.comet.com\/site\/#website","url":"https:\/\/www.comet.com\/site\/","name":"Comet","description":"Build Better Models 
Faster","publisher":{"@id":"https:\/\/www.comet.com\/site\/#organization"},"potentialAction":[{"@type":"SearchAction","target":{"@type":"EntryPoint","urlTemplate":"https:\/\/www.comet.com\/site\/?s={search_term_string}"},"query-input":{"@type":"PropertyValueSpecification","valueRequired":true,"valueName":"search_term_string"}}],"inLanguage":"en-US"},{"@type":"Organization","@id":"https:\/\/www.comet.com\/site\/#organization","name":"Comet ML, Inc.","alternateName":"Comet","url":"https:\/\/www.comet.com\/site\/","logo":{"@type":"ImageObject","inLanguage":"en-US","@id":"https:\/\/www.comet.com\/site\/#\/schema\/logo\/image\/","url":"https:\/\/www.comet.com\/site\/wp-content\/uploads\/2025\/01\/logo_comet_square.png","contentUrl":"https:\/\/www.comet.com\/site\/wp-content\/uploads\/2025\/01\/logo_comet_square.png","width":310,"height":310,"caption":"Comet ML, Inc."},"image":{"@id":"https:\/\/www.comet.com\/site\/#\/schema\/logo\/image\/"},"sameAs":["https:\/\/www.facebook.com\/cometdotml","https:\/\/x.com\/Cometml","https:\/\/www.youtube.com\/channel\/UCmN63HKvfXSCS-UwVwmK8Hw"]},{"@type":"Person","@id":"https:\/\/www.comet.com\/site\/#\/schema\/person\/550ac35e8e821db8064c5bd1f0a04e6b","name":"engineering@atre.net","image":{"@type":"ImageObject","inLanguage":"en-US","@id":"https:\/\/www.comet.com\/site\/#\/schema\/person\/image\/027c18177377edf459980f0cfb83706c","url":"https:\/\/secure.gravatar.com\/avatar\/d002a459a297e0d1779329318029aee19868c312b3e1f3c9ec9b3e3add2740de?s=96&d=mm&r=g","contentUrl":"https:\/\/secure.gravatar.com\/avatar\/d002a459a297e0d1779329318029aee19868c312b3e1f3c9ec9b3e3add2740de?s=96&d=mm&r=g","caption":"engineering@atre.net"},"sameAs":["https:\/\/live-cometml.pantheonsite.io"],"url":"https:\/\/www.comet.com\/site\/blog\/author\/engineeringatre-net\/"}]}},"_links":{"self":[{"href":"https:\/\/www.comet.com\/site\/wp-json\/wp\/v2\/posts\/1858","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/www.comet.com\/site\/wp-json\/wp\/v2\/p
osts"}],"about":[{"href":"https:\/\/www.comet.com\/site\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/www.comet.com\/site\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/www.comet.com\/site\/wp-json\/wp\/v2\/comments?post=1858"}],"version-history":[{"count":0,"href":"https:\/\/www.comet.com\/site\/wp-json\/wp\/v2\/posts\/1858\/revisions"}],"wp:attachment":[{"href":"https:\/\/www.comet.com\/site\/wp-json\/wp\/v2\/media?parent=1858"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/www.comet.com\/site\/wp-json\/wp\/v2\/categories?post=1858"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/www.comet.com\/site\/wp-json\/wp\/v2\/tags?post=1858"},{"taxonomy":"author","embeddable":true,"href":"https:\/\/www.comet.com\/site\/wp-json\/wp\/v2\/coauthors?post=1858"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}