{"id":1643,"date":"2018-08-06T14:56:00","date_gmt":"2018-08-06T22:56:00","guid":{"rendered":"https:\/\/live-cometml.pantheonsite.io\/blog\/building-reliable-machine-learning-models-with-cross-validation\/"},"modified":"2018-08-06T14:56:00","modified_gmt":"2018-08-06T22:56:00","slug":"building-reliable-machine-learning-models-with-cross-validation","status":"publish","type":"post","link":"https:\/\/www.comet.com\/site\/blog\/building-reliable-machine-learning-models-with-cross-validation\/","title":{"rendered":"Building reliable machine learning models with cross-validation"},"content":{"rendered":"\n<p>Cross-validation is a technique used to measure and evaluate machine learning models performance. During training we create a number of partitions of the training set and train\/test on different subsets of those partitions.<\/p>\n\n\n\n<p>Cross-validation is frequently used to train, measure and finally select a machine learning model for a given dataset because it helps assess how the results of a model will generalize to an independent data set<em> in practice<\/em>. Most importantly, cross-validation has been shown to produce models with lower bias than other methods.<\/p>\n\n\n\n<p>This tutorial will focus on one variant of cross-validation named<strong> k-fold cross-validation<\/strong>.<\/p>\n\n\n\n<p>In this tutorial we\u2019ll cover the following:<\/p>\n\n\n\n<ol class=\"wp-block-list\">\n<li>Overview of K-Fold Cross-Validation<\/li>\n<li>Example using Scikit-Learn and <a href=\"https:\/\/www.comet.com\/\" target=\"_blank\" rel=\"noopener noreferrer\" data-href=\"https:\/\/www.comet.com\/\">comet.ml<\/a><\/li>\n<\/ol>\n\n\n<hr class=\"wp-block-separator\" \/>\n\n\n<h3 class=\"wp-block-heading\">K-Fold Cross-Validation<\/h3>\n\n\n\n<p>Cross-validation is a resampling technique used to evaluate machine learning models on a limited data set.<\/p>\n\n\n\n<p>The most common use of cross-validation is the k-fold cross-validation method. 
Our training set is split into<em> K<\/em> partitions, the model is trained on <em>K-1 <\/em>partitions, and the test error is computed on the <em>Kth<\/em> partition. This is repeated for each partition, and the test errors are averaged across all <em>K<\/em> runs.<\/p>\n\n\n\n<p><strong>The same procedure is described by the following steps:<\/strong><\/p>\n\n\n\n<ol class=\"wp-block-list\">\n<li>Split the training set into K (K=10 is a common choice) partitions<\/li>\n<\/ol>\n\n\n\n<p>For each partition:<\/p>\n\n\n\n<p>2. Set the partition aside as the test set<\/p>\n\n\n\n<p>3. Train a model on the rest of the partitions<\/p>\n\n\n\n<p>4. Measure performance on the test set<\/p>\n\n\n\n<p>5. Retain the performance metric<\/p>\n\n\n\n<p>6. Explore model performance over different folds<\/p>\n\n\n\n<p>Cross-validation is commonly used since it\u2019s easy to interpret and since it generally results in a less biased and less optimistic estimate of model performance than other methods, such as a simple train\/test split. One of the biggest downsides of cross-validation is the increased training time, as we are essentially training K models instead of one.<\/p>\n\n\n\n<h3 class=\"wp-block-heading\">Cross-validation example using scikit-learn<\/h3>\n\n\n\n<p><a href=\"http:\/\/scikit-learn.org\/\" target=\"_blank\" rel=\"noopener noreferrer\" data-href=\"http:\/\/scikit-learn.org\/\"><strong>Scikit-learn<\/strong><\/a> is a popular machine learning library that also provides many tools for data sampling, model evaluation, and training. We\u2019ll use the <code>KFold<\/code> class to generate our folds. Here\u2019s a basic overview:<\/p>\n\n\n\n<pre class=\"wp-block-preformatted\"><strong>from<\/strong> <strong>sklearn.model_selection<\/strong> <strong>import<\/strong> KFold<br \/>X = [...] # My training dataset inputs\/features (as a NumPy array)<br \/>y = [...] 
# My training dataset targets<br \/><br \/>kf = KFold(n_splits=2)<br \/>kf.get_n_splits(X)<br \/><br \/><strong>for<\/strong> train_index, test_index <strong>in<\/strong> kf.split(X):<br \/>   X_train, X_test = X[train_index], X[test_index]<br \/>   y_train, y_test = y[train_index], y[test_index]<br \/>   model = train_model(X_train, y_train)<br \/>   score = eval_model(model, X_test, y_test)<\/pre>\n\n\n\n<h3 class=\"wp-block-heading\">Now let\u2019s walk through an end-to-end example using scikit-learn and comet.ml.<\/h3>\n\n\n\n<p>This example trains a text classifier on the 20 Newsgroups dataset (you can find it <a href=\"http:\/\/scikit-learn.org\/stable\/datasets\/twenty_newsgroups.html\" target=\"_blank\" rel=\"noopener noreferrer\" data-href=\"http:\/\/scikit-learn.org\/stable\/datasets\/twenty_newsgroups.html\">here<\/a>). Given a piece of text (a string), the model classifies it into one of the following classes: \u201catheism\u201d, \u201cchristian\u201d, \u201ccomputer graphics\u201d, \u201cmedicine\u201d.<\/p>\n\n\n\n<p><script src=\"https:\/\/gist.github.com\/gidim\/558a21b0cf5fccf632523bf0940a0fd4.js\"><\/script><\/p>\n\n\n\n<p>On every fold we report the accuracy to <a href=\"https:\/\/www.comet.com\/\" target=\"_blank\" rel=\"noopener noreferrer\" data-href=\"https:\/\/www.comet.com\/\">comet.ml<\/a>, and finally we report the average accuracy across all folds. 
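<\/p>

<p>The per-fold reporting loop can be sketched as follows. This is a minimal, self-contained version: a synthetic dataset and a logistic-regression classifier stand in for the gist\u2019s actual text pipeline (both are assumptions for illustration), and it prints the metrics instead of logging them to comet.ml:<\/p>

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import KFold

# Synthetic stand-in for the newsgroups features (hypothetical data).
X, y = make_classification(n_samples=300, n_features=20, random_state=42)

accuracies = []
kf = KFold(n_splits=10)
for train_index, test_index in kf.split(X):
    model = LogisticRegression(max_iter=1000)
    model.fit(X[train_index], y[train_index])
    # Per-fold accuracy -- the value reported on every fold.
    accuracies.append(model.score(X[test_index], y[test_index]))

# Final reported metric: the average accuracy across all folds.
mean_accuracy = np.mean(accuracies)
print(len(accuracies), round(mean_accuracy, 3))
```

<p>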
After the experiment finishes, we can <a href=\"https:\/\/www.comet.com\/gidim\/cross-validation\/dd73c9696cbc497cb8274abcb883e03e\/chart\" target=\"_blank\" rel=\"noopener noreferrer\" data-href=\"https:\/\/www.comet.com\/gidim\/cross-validation\/dd73c9696cbc497cb8274abcb883e03e\/chart\"><strong>visit comet.ml and examine our model<\/strong><\/a><strong>:<\/strong><\/p>\n\n\n\n<figure class=\"wp-block-image size-large\"><img decoding=\"async\" src=\"https:\/\/www.comet.com\/site\/wp-content\/uploads\/2022\/06\/12S1eAai0AiaexVCxge8zCg.png\" alt=\"\" \/><\/figure>\n\n\n\n<p>The chart above was automatically generated by comet.ml. The rightmost bar (in purple) represents the <strong>average<\/strong> <strong>accuracy across folds.<\/strong> As you can see, some folds perform significantly better than others, which shows how important k-fold cross-validation is.<\/p>\n\n\n\n<p>You might have noticed that we didn\u2019t compute the <strong>test<\/strong> accuracy. The <strong>test<\/strong> set should not be used in any way until you\u2019re completely finished with all experimentation. If we change hyperparameters or model types based on the test accuracy, we\u2019re essentially over-fitting our hyperparameters to the test distribution.<\/p>\n\n\n\n<p>Still curious about cross-validation? 
Here are some other great resources:<\/p>\n\n\n\n<ul class=\"wp-block-list\">\n<li><a href=\"https:\/\/medium.com\/u\/f374d0159316\" target=\"_blank\" rel=\"noopener noreferrer\" data-href=\"https:\/\/medium.com\/u\/f374d0159316\" data-anchor-type=\"2\" data-user-id=\"f374d0159316\" data-action-value=\"f374d0159316\" data-action=\"show-user-card\" data-action-type=\"hover\">Jason Brownlee<\/a>\u2019s \u201cGentle Introduction to Cross-validation\u201d @ <a href=\"https:\/\/machinelearningmastery.com\/k-fold-cross-validation\/\" target=\"_blank\" rel=\"noopener noreferrer\" data-href=\"https:\/\/machinelearningmastery.com\/k-fold-cross-validation\/\">https:\/\/machinelearningmastery.com\/k-fold-cross-validation\/<\/a><\/li>\n<li><a href=\"https:\/\/medium.com\/u\/a84d0e60277a\" target=\"_blank\" rel=\"noopener noreferrer\" data-href=\"https:\/\/medium.com\/u\/a84d0e60277a\" data-anchor-type=\"2\" data-user-id=\"a84d0e60277a\" data-action-value=\"a84d0e60277a\" data-action=\"show-user-card\" data-action-type=\"hover\">Prashant Gupta<\/a>\u2019s <a href=\"https:\/\/towardsdatascience.com\/cross-validation-in-machine-learning-72924a69872f\">medium post <\/a><\/li>\n<\/ul>\n\n\n\n<p><strong>Found this article useful? 
Here are some articles you might find interesting:<\/strong><\/p>\n\n\n\n<ul class=\"wp-block-list\">\n<li><a href=\"https:\/\/www.notion.so\/cometml\/Comet-ml-Release-Notes-93d864bcac584360943a73ae9507bcaa\" target=\"_blank\" rel=\"noopener noreferrer\" data-href=\"https:\/\/www.notion.so\/cometml\/Comet-ml-Release-Notes-93d864bcac584360943a73ae9507bcaa\">comet.ml Release Notes\u200a<\/a>\u2014\u200aupdated daily with new features and fixes!<\/li>\n<li><a href=\"https:\/\/medium.com\/comet-ml\/using-fasttext-and-comet-ml-to-classify-relationships-in-knowledge-graphs-e73d27b40d67\" target=\"_blank\" rel=\"noopener noreferrer\" data-href=\"https:\/\/medium.com\/comet-ml\/using-fasttext-and-comet-ml-to-classify-relationships-in-knowledge-graphs-e73d27b40d67\">Using fastText and comet.ml to classify relationships in Knowledge Graphs<\/a><\/li>\n<li><a href=\"https:\/\/medium.com\/comet-ml\/real-time-model-performance-visualizations-with-comet-ml-992fb6226cb6\" target=\"_blank\" rel=\"noopener noreferrer\" data-href=\"https:\/\/medium.com\/comet-ml\/real-time-model-performance-visualizations-with-comet-ml-992fb6226cb6\">Real-time model performance visualizations<\/a><\/li>\n<\/ul>\n\n\n<hr class=\"wp-block-separator\" \/>\n\n\n<p><strong>Gideon Mendels<\/strong> is the CEO and co-founder of <a href=\"https:\/\/www.comet.com\/\" target=\"_blank\" rel=\"noopener noreferrer\" data-href=\"https:\/\/www.comet.com\/\">comet.ml<\/a>.<\/p>\n\n\n\n<p><strong>About comet.ml\u200a\u2014\u200a<\/strong><a href=\"https:\/\/www.comet.com\/\" target=\"_blank\" rel=\"noopener noreferrer\" data-href=\"https:\/\/www.comet.com\/\">comet.ml<\/a> is doing for ML what GitHub did for code. Our lightweight SDK enables data science teams to automatically track their datasets, code changes, and experimentation history. 
This way, data scientists can easily reproduce their models and collaborate on model iteration amongst their team!<\/p>\n","protected":false},"excerpt":{"rendered":"<p>Cross-validation is a technique used to measure and evaluate machine learning models performance. During training we create a number of partitions of the training set and train\/test on different subsets of those partitions. Cross-validation is frequently used to train, measure and finally select a machine learning model for a given dataset because it helps assess [&hellip;]<\/p>\n","protected":false},"author":1,"featured_media":1645,"comment_status":"closed","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"customer_name":"","customer_description":"","customer_industry":"","customer_technologies":"","customer_logo":"","footnotes":""},"categories":[7],"tags":[],"coauthors":[107],"class_list":["post-1643","post","type-post","status-publish","format-standard","has-post-thumbnail","hentry","category-tutorials"],"yoast_head":"<!-- This site is optimized with the Yoast SEO Premium plugin v25.9 (Yoast SEO v25.9) - https:\/\/yoast.com\/wordpress\/plugins\/seo\/ -->\n<title>Building reliable machine learning models with cross-validation - Comet<\/title>\n<meta name=\"robots\" content=\"index, follow, max-snippet:-1, max-image-preview:large, max-video-preview:-1\" \/>\n<link rel=\"canonical\" href=\"https:\/\/www.comet.com\/site\/blog\/building-reliable-machine-learning-models-with-cross-validation\/\" \/>\n<meta property=\"og:locale\" content=\"en_US\" \/>\n<meta property=\"og:type\" content=\"article\" \/>\n<meta property=\"og:title\" content=\"Building reliable machine learning models with cross-validation\" \/>\n<meta property=\"og:description\" content=\"Cross-validation is a technique used to measure and evaluate machine learning models performance. During training we create a number of partitions of the training set and train\/test on different subsets of those partitions. 
Cross-validation is frequently used to train, measure and finally select a machine learning model for a given dataset because it helps assess [&hellip;]\" \/>\n<meta property=\"og:url\" content=\"https:\/\/www.comet.com\/site\/blog\/building-reliable-machine-learning-models-with-cross-validation\/\" \/>\n<meta property=\"og:site_name\" content=\"Comet\" \/>\n<meta property=\"article:publisher\" content=\"https:\/\/www.facebook.com\/cometdotml\" \/>\n<meta property=\"article:published_time\" content=\"2018-08-06T22:56:00+00:00\" \/>\n<meta property=\"og:image\" content=\"https:\/\/www.comet.com\/site\/wp-content\/uploads\/2022\/06\/cv.png\" \/>\n\t<meta property=\"og:image:width\" content=\"743\" \/>\n\t<meta property=\"og:image:height\" content=\"471\" \/>\n\t<meta property=\"og:image:type\" content=\"image\/png\" \/>\n<meta name=\"author\" content=\"Gideon Mendels\" \/>\n<meta name=\"twitter:card\" content=\"summary_large_image\" \/>\n<meta name=\"twitter:creator\" content=\"@Cometml\" \/>\n<meta name=\"twitter:site\" content=\"@Cometml\" \/>\n<meta name=\"twitter:label1\" content=\"Written by\" \/>\n\t<meta name=\"twitter:data1\" content=\"Gideon Mendels\" \/>\n\t<meta name=\"twitter:label2\" content=\"Est. reading time\" \/>\n\t<meta name=\"twitter:data2\" content=\"4 minutes\" \/>\n<!-- \/ Yoast SEO Premium plugin. -->","yoast_head_json":{"title":"Building reliable machine learning models with cross-validation - Comet","robots":{"index":"index","follow":"follow","max-snippet":"max-snippet:-1","max-image-preview":"max-image-preview:large","max-video-preview":"max-video-preview:-1"},"canonical":"https:\/\/www.comet.com\/site\/blog\/building-reliable-machine-learning-models-with-cross-validation\/","og_locale":"en_US","og_type":"article","og_title":"Building reliable machine learning models with cross-validation","og_description":"Cross-validation is a technique used to measure and evaluate machine learning models performance. 
During training we create a number of partitions of the training set and train\/test on different subsets of those partitions. Cross-validation is frequently used to train, measure and finally select a machine learning model for a given dataset because it helps assess [&hellip;]","og_url":"https:\/\/www.comet.com\/site\/blog\/building-reliable-machine-learning-models-with-cross-validation\/","og_site_name":"Comet","article_publisher":"https:\/\/www.facebook.com\/cometdotml","article_published_time":"2018-08-06T22:56:00+00:00","og_image":[{"width":743,"height":471,"url":"https:\/\/www.comet.com\/site\/wp-content\/uploads\/2022\/06\/cv.png","type":"image\/png"}],"author":"Gideon Mendels","twitter_card":"summary_large_image","twitter_creator":"@Cometml","twitter_site":"@Cometml","twitter_misc":{"Written by":"Gideon Mendels","Est. reading time":"4 minutes"},"schema":{"@context":"https:\/\/schema.org","@graph":[{"@type":"Article","@id":"https:\/\/www.comet.com\/site\/blog\/building-reliable-machine-learning-models-with-cross-validation\/#article","isPartOf":{"@id":"https:\/\/www.comet.com\/site\/blog\/building-reliable-machine-learning-models-with-cross-validation\/"},"author":{"name":"engineering@atre.net","@id":"https:\/\/www.comet.com\/site\/#\/schema\/person\/550ac35e8e821db8064c5bd1f0a04e6b"},"headline":"Building reliable machine learning models with 
cross-validation","datePublished":"2018-08-06T22:56:00+00:00","mainEntityOfPage":{"@id":"https:\/\/www.comet.com\/site\/blog\/building-reliable-machine-learning-models-with-cross-validation\/"},"wordCount":640,"publisher":{"@id":"https:\/\/www.comet.com\/site\/#organization"},"image":{"@id":"https:\/\/www.comet.com\/site\/blog\/building-reliable-machine-learning-models-with-cross-validation\/#primaryimage"},"thumbnailUrl":"https:\/\/www.comet.com\/site\/wp-content\/uploads\/2022\/06\/cv.png","articleSection":["Tutorials"],"inLanguage":"en-US"},{"@type":"WebPage","@id":"https:\/\/www.comet.com\/site\/blog\/building-reliable-machine-learning-models-with-cross-validation\/","url":"https:\/\/www.comet.com\/site\/blog\/building-reliable-machine-learning-models-with-cross-validation\/","name":"Building reliable machine learning models with cross-validation - Comet","isPartOf":{"@id":"https:\/\/www.comet.com\/site\/#website"},"primaryImageOfPage":{"@id":"https:\/\/www.comet.com\/site\/blog\/building-reliable-machine-learning-models-with-cross-validation\/#primaryimage"},"image":{"@id":"https:\/\/www.comet.com\/site\/blog\/building-reliable-machine-learning-models-with-cross-validation\/#primaryimage"},"thumbnailUrl":"https:\/\/www.comet.com\/site\/wp-content\/uploads\/2022\/06\/cv.png","datePublished":"2018-08-06T22:56:00+00:00","breadcrumb":{"@id":"https:\/\/www.comet.com\/site\/blog\/building-reliable-machine-learning-models-with-cross-validation\/#breadcrumb"},"inLanguage":"en-US","potentialAction":[{"@type":"ReadAction","target":["https:\/\/www.comet.com\/site\/blog\/building-reliable-machine-learning-models-with-cross-validation\/"]}]},{"@type":"ImageObject","inLanguage":"en-US","@id":"https:\/\/www.comet.com\/site\/blog\/building-reliable-machine-learning-models-with-cross-validation\/#primaryimage","url":"https:\/\/www.comet.com\/site\/wp-content\/uploads\/2022\/06\/cv.png","contentUrl":"https:\/\/www.comet.com\/site\/wp-content\/uploads\/2022\/06\/cv.png","width":7
43,"height":471,"caption":"Accuracy fold 0 and accuracy fold 1"},{"@type":"BreadcrumbList","@id":"https:\/\/www.comet.com\/site\/blog\/building-reliable-machine-learning-models-with-cross-validation\/#breadcrumb","itemListElement":[{"@type":"ListItem","position":1,"name":"Home","item":"https:\/\/www.comet.com\/site\/"},{"@type":"ListItem","position":2,"name":"Building reliable machine learning models with cross-validation"}]},{"@type":"WebSite","@id":"https:\/\/www.comet.com\/site\/#website","url":"https:\/\/www.comet.com\/site\/","name":"Comet","description":"Build Better Models Faster","publisher":{"@id":"https:\/\/www.comet.com\/site\/#organization"},"potentialAction":[{"@type":"SearchAction","target":{"@type":"EntryPoint","urlTemplate":"https:\/\/www.comet.com\/site\/?s={search_term_string}"},"query-input":{"@type":"PropertyValueSpecification","valueRequired":true,"valueName":"search_term_string"}}],"inLanguage":"en-US"},{"@type":"Organization","@id":"https:\/\/www.comet.com\/site\/#organization","name":"Comet ML, Inc.","alternateName":"Comet","url":"https:\/\/www.comet.com\/site\/","logo":{"@type":"ImageObject","inLanguage":"en-US","@id":"https:\/\/www.comet.com\/site\/#\/schema\/logo\/image\/","url":"https:\/\/www.comet.com\/site\/wp-content\/uploads\/2025\/01\/logo_comet_square.png","contentUrl":"https:\/\/www.comet.com\/site\/wp-content\/uploads\/2025\/01\/logo_comet_square.png","width":310,"height":310,"caption":"Comet ML, 
Inc."},"image":{"@id":"https:\/\/www.comet.com\/site\/#\/schema\/logo\/image\/"},"sameAs":["https:\/\/www.facebook.com\/cometdotml","https:\/\/x.com\/Cometml","https:\/\/www.youtube.com\/channel\/UCmN63HKvfXSCS-UwVwmK8Hw"]},{"@type":"Person","@id":"https:\/\/www.comet.com\/site\/#\/schema\/person\/550ac35e8e821db8064c5bd1f0a04e6b","name":"engineering@atre.net","image":{"@type":"ImageObject","inLanguage":"en-US","@id":"https:\/\/www.comet.com\/site\/#\/schema\/person\/image\/027c18177377edf459980f0cfb83706c","url":"https:\/\/secure.gravatar.com\/avatar\/d002a459a297e0d1779329318029aee19868c312b3e1f3c9ec9b3e3add2740de?s=96&d=mm&r=g","contentUrl":"https:\/\/secure.gravatar.com\/avatar\/d002a459a297e0d1779329318029aee19868c312b3e1f3c9ec9b3e3add2740de?s=96&d=mm&r=g","caption":"engineering@atre.net"},"sameAs":["https:\/\/live-cometml.pantheonsite.io"],"url":"https:\/\/www.comet.com\/site\/blog\/author\/engineeringatre-net\/"}]}},"_links":{"self":[{"href":"https:\/\/www.comet.com\/site\/wp-json\/wp\/v2\/posts\/1643","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/www.comet.com\/site\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/www.comet.com\/site\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/www.comet.com\/site\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/www.comet.com\/site\/wp-json\/wp\/v2\/comments?post=1643"}],"version-history":[{"count":0,"href":"https:\/\/www.comet.com\/site\/wp-json\/wp\/v2\/posts\/1643\/revisions"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/www.comet.com\/site\/wp-json\/wp\/v2\/media\/1645"}],"wp:attachment":[{"href":"https:\/\/www.comet.com\/site\/wp-json\/wp\/v2\/media?parent=1643"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/www.comet.com\/site\/wp-json\/wp\/v2\/categories?post=1643"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/www.comet.com\/site\/wp-json\/wp\/v2\/tags?post=1643"},{"taxonomy":"
author","embeddable":true,"href":"https:\/\/www.comet.com\/site\/wp-json\/wp\/v2\/coauthors?post=1643"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}