{"id":7786,"date":"2023-10-04T10:46:19","date_gmt":"2023-10-04T18:46:19","guid":{"rendered":"https:\/\/live-cometml.pantheonsite.io\/?p=7786"},"modified":"2025-04-24T17:06:08","modified_gmt":"2025-04-24T17:06:08","slug":"emotion-classification-with-spacy-v3-comet","status":"publish","type":"post","link":"https:\/\/www.comet.com\/site\/blog\/emotion-classification-with-spacy-v3-comet\/","title":{"rendered":"Emotion Classification with SpaCy v3 &#038; Comet"},"content":{"rendered":"\n<link rel=\"canonical\" href=\"https:\/\/www.comet.com\/site\/blog\/emotion-classification-with-spacy-v3-comet\">\n\n\n\n<div class=\"fh fi fj fk fl\">\n<div class=\"ab ca\">\n<div class=\"ch bg et eu ev ew\">\n<figure class=\"lw lx ly lz ma mb lt lu paragraph-image\">\n<div class=\"mc md eb me bg mf\" tabindex=\"0\" role=\"button\">\n<figure><img loading=\"lazy\" decoding=\"async\" class=\"bg mg mh c\" role=\"presentation\" src=\"https:\/\/miro.medium.com\/v2\/resize:fit:700\/0*L_TJCEXJeStjEX1U\" alt=\"\" width=\"700\" height=\"462\"><\/figure><div class=\"lt lu lv\"><picture><\/picture><\/div>\n<\/div><figcaption class=\"mi mj mk lt lu ml mm be b bf z dv\" data-selectable-paragraph=\"\">Photo by <a class=\"af mn\" href=\"https:\/\/unsplash.com\/@helloimnik?utm_source=unsplash&amp;utm_medium=referral&amp;utm_content=creditCopyText\" target=\"_blank\" rel=\"noopener ugc nofollow\">Nik<\/a> on <a class=\"af mn\" href=\"https:\/\/unsplash.com\/photos\/vSUc4FmgkDg?utm_source=unsplash&amp;utm_medium=referral&amp;utm_content=creditCopyText\" target=\"_blank\" rel=\"noopener ugc nofollow\">Unsplash<\/a><\/figcaption><\/figure>\n<p id=\"5485\" class=\"pw-post-body-paragraph mo mp fo be b mq mr ms mt mu mv mw mx my mz na nb nc nd ne nf ng nh ni nj nk fh bj\" data-selectable-paragraph=\"\">If you are a natural language processing researcher or have an interest in this field, you have surely come across <a class=\"af mn\" href=\"https:\/\/spacy.io\/\" target=\"_blank\" 
rel=\"noopener ugc nofollow\">SpaCy<\/a> or you are very close to it!<\/p>\n<p id=\"604f\" class=\"pw-post-body-paragraph mo mp fo be b mq mr ms mt mu mv mw mx my mz na nb nc nd ne nf ng nh ni nj nk fh bj\" data-selectable-paragraph=\"\">SpaCy, a free, open source natural language processing library developed in Python, is very popular for use in real products. This advanced library, which is frequently used in applications that process and understand large volumes of text, is now integrated with <a class=\"af mn\" href=\"\/signup?utm_source=heartbeat&amp;utm_medium=referral&amp;utm_campaign=AMS_US_EN_SNUP_heartbeat_CTA\" target=\"_blank\" rel=\"noopener ugc nofollow\">Comet ML<\/a>, a very useful experiment monitoring tool!<\/p>\n<p id=\"638a\" class=\"pw-post-body-paragraph mo mp fo be b mq mr ms mt mu mv mw mx my mz na nb nc nd ne nf ng nh ni nj nk fh bj\" data-selectable-paragraph=\"\">Now let\u2019s train a multi-label text classifier using SpaCy-v3 on the <a class=\"af mn\" href=\"https:\/\/huggingface.co\/datasets\/dair-ai\/emotion\" target=\"_blank\" rel=\"noopener ugc nofollow\"><strong class=\"be nl\">Huggingface \u2014 dair-ai\/emotion<\/strong><\/a> dataset and track the model trainings and record the results with Comet ML!<\/p>\n<h1 id=\"2a90\" class=\"nm nn fo be no np nq nr ns nt nu nv nw nx ny nz oa ob oc od oe of og oh oi oj bj\" data-selectable-paragraph=\"\">Emotion Recognition Dataset Overview<\/h1>\n<p id=\"09c4\" class=\"pw-post-body-paragraph mo mp fo be b mq ok ms mt mu ol mw mx my om na nb nc on ne nf ng oo ni nj nk fh bj\" data-selectable-paragraph=\"\">The dataset to be used in the application described in the blog post was created for an emotion classification task. 
For detailed information on the dataset of Twitter messages written in English, which includes <strong class=\"be nl\">six basic emotions: anger, fear, joy, love, sadness and surprise,<\/strong> please review this <a class=\"af mn\" href=\"https:\/\/aclanthology.org\/D18-1404\/\" target=\"_blank\" rel=\"noopener ugc nofollow\">article.<\/a><\/p>\n<blockquote class=\"op oq or\"><p id=\"83b8\" class=\"mo mp os be b mq mr ms mt mu mv mw mx ot mz na nb ou nd ne nf ov nh ni nj nk fh bj\" data-selectable-paragraph=\"\">\ud83d\udd39<a class=\"af mn\" href=\"https:\/\/huggingface.co\/datasets\/dair-ai\/emotion\" target=\"_blank\" rel=\"noopener ugc nofollow\"><strong class=\"be nl\">Please click<\/strong><\/a> to access the Emotion dataset via Hugging Face Datasets!<\/p><p id=\"6cba\" class=\"mo mp os be b mq mr ms mt mu mv mw mx ot mz na nb ou nd ne nf ov nh ni nj nk fh bj\" data-selectable-paragraph=\"\">\ud83d\udd39<a class=\"af mn\" href=\"https:\/\/www.icloud.com\/iclouddrive\/084E9TMZ_lykn3QhU-kIX1DDQ#merged_training\" target=\"_blank\" rel=\"noopener ugc nofollow\"><strong class=\"be nl\">Please click<\/strong><\/a> to download the Emotion dataset directly!<\/p><p id=\"9c01\" class=\"mo mp os be b mq mr ms mt mu mv mw mx ot mz na nb ou nd ne nf ov nh ni nj nk fh bj\" data-selectable-paragraph=\"\">\ud83d\udd39<a class=\"af mn\" href=\"https:\/\/paperswithcode.com\/sota\/text-classification-on-emotion\" target=\"_blank\" rel=\"noopener ugc nofollow\"><strong class=\"be nl\">Please click<\/strong><\/a> to access Papers with Code Public Leaderboard!<\/p><\/blockquote>\n<p id=\"a4ee\" class=\"pw-post-body-paragraph mo mp fo be b mq mr ms mt mu mv mw mx my mz na nb nc nd ne nf ng nh ni nj nk fh bj\" data-selectable-paragraph=\"\">\ud83d\udd38Data fields are:<\/p>\n<ul class=\"\">\n<li id=\"bec3\" class=\"mo mp fo be b mq mr ms mt mu mv mw mx my ow na nb nc ox ne nf ng oy ni nj nk oz pa pb bj\" data-selectable-paragraph=\"\"><code class=\"cw pc pd pe pf b\">text<\/code>: 
a string feature.<\/li>\n<li id=\"73f9\" class=\"mo mp fo be b mq pg ms mt mu ph mw mx my pi na nb nc pj ne nf ng pk ni nj nk oz pa pb bj\" data-selectable-paragraph=\"\"><code class=\"cw pc pd pe pf b\">label<\/code>: a classification label, with possible values including <strong class=\"be nl\">sadness (0), joy (1), love (2), anger (3), fear (4), surprise (5)<\/strong><\/li>\n<\/ul>\n<p id=\"0512\" class=\"pw-post-body-paragraph mo mp fo be b mq mr ms mt mu mv mw mx my mz na nb nc nd ne nf ng nh ni nj nk fh bj\" data-selectable-paragraph=\"\">\ud83d\udd38An example within the dataset looks like this:<\/p>\n<pre class=\"pl pm pn po pp pq pf pr bo ps ba bj\"><span id=\"fc42\" class=\"pt nn fo pf b bf pu pv l pw px\" data-selectable-paragraph=\"\"><span class=\"hljs-punctuation\">{<\/span>\n  <span class=\"hljs-attr\">\"text\"<\/span><span class=\"hljs-punctuation\">:<\/span> <span class=\"hljs-string\">\"im feeling quite sad and sorry for myself but ill snap out of it soon\"<\/span><span class=\"hljs-punctuation\">,<\/span>\n  <span class=\"hljs-attr\">\"label\"<\/span><span class=\"hljs-punctuation\">:<\/span> <span class=\"hljs-number\">0<\/span>\n<span class=\"hljs-punctuation\">}<\/span><\/span><\/pre>\n<p id=\"c2a5\" class=\"pw-post-body-paragraph mo mp fo be b mq mr ms mt mu mv mw mx my mz na nb nc nd ne nf ng nh ni nj nk fh bj\" data-selectable-paragraph=\"\">\ud83d\udd38A total of 20,000 samples in the dataset are divided into training, validation, and test sets:<\/p>\n<ul class=\"\">\n<li id=\"856e\" class=\"mo mp fo be b mq mr ms mt mu mv mw mx my ow na nb nc ox ne nf ng oy ni nj nk oz pa pb bj\" data-selectable-paragraph=\"\">training set: 16,000<\/li>\n<li id=\"097a\" class=\"mo mp fo be b mq pg ms mt mu ph mw mx my pi na nb nc pj ne nf ng pk ni nj nk oz pa pb bj\" data-selectable-paragraph=\"\">validation set: 2,000<\/li>\n<li id=\"e7e4\" class=\"mo mp fo be b mq pg ms mt mu ph mw mx my pi na nb nc pj ne nf ng pk ni nj nk oz pa pb bj\" 
data-selectable-paragraph=\"\">test set: 2,000<\/li>\n<\/ul>\n<h2 id=\"dc35\" class=\"py nn fo be no pz qa qb ns qc qd qe nw my qf qg qh nc qi qj qk ng ql qm qn qo bj\" data-selectable-paragraph=\"\">Step 1: Setup<\/h2>\n<p id=\"003f\" class=\"pw-post-body-paragraph mo mp fo be b mq ok ms mt mu ol mw mx my om na nb nc on ne nf ng oo ni nj nk fh bj\" data-selectable-paragraph=\"\">The first step is to install the necessary libraries: SpaCy for basic language processing, Comet ML for logging training runs and results, and Hugging Face\u2019s datasets library for accessing the dataset.<\/p>\n<pre class=\"pl pm pn po pp pq pf pr bo ps ba bj\"><span id=\"aba5\" class=\"pt nn fo pf b bf pu pv l pw px\" data-selectable-paragraph=\"\">!pip install <span class=\"hljs-string\">\"spacy &gt;= 3.0.6\"<\/span>\n!pip install <span class=\"hljs-string\">\"comet_ml&gt;=3.31.19\"<\/span>\n!pip install datasets<\/span><\/pre>\n<h2 id=\"3494\" class=\"py nn fo be no pz qa qb ns qc qd qe nw my qf qg qh nc qi qj qk ng ql qm qn qo bj\" data-selectable-paragraph=\"\">Step 2: Comet Environment Variables<\/h2>\n<p id=\"366f\" class=\"pw-post-body-paragraph mo mp fo be b mq ok ms mt mu ol mw mx my om na nb nc on ne nf ng oo ni nj nk fh bj\" data-selectable-paragraph=\"\">To configure Comet, you can set your credentials via environment variables as follows:<\/p>\n<pre class=\"pl pm pn po pp pq pf pr bo ps ba bj\"><span id=\"dcc2\" class=\"pt nn fo pf b bf pu pv l pw px\" data-selectable-paragraph=\"\">%env COMET_API_KEY=&lt;Your Comet API Key&gt;\n%env COMET_PROJECT_NAME=&lt;Your Comet Project Name&gt;<\/span><\/pre>\n<h2 id=\"154c\" class=\"py nn fo be no pz qa qb ns qc qd qe nw my qf qg qh nc qi qj qk ng ql qm qn qo bj\" data-selectable-paragraph=\"\">Step 3: Loading the Dataset Using Hugging Face Datasets<\/h2>\n<p id=\"c437\" class=\"pw-post-body-paragraph mo mp fo be b mq ok ms mt mu ol mw mx my om na nb nc on ne nf ng oo ni nj nk fh bj\" 
data-selectable-paragraph=\"\"><strong class=\"be nl\">\u201cDatasets\u201d <\/strong>is a library that loads datasets for audio processing, image processing, natural language processing tasks, and more, with a single line of code.<\/p>\n<p id=\"49e2\" class=\"pw-post-body-paragraph mo mp fo be b mq mr ms mt mu mv mw mx my mz na nb nc nd ne nf ng nh ni nj nk fh bj\" data-selectable-paragraph=\"\">Let\u2019s load the Emotion dataset with this library and create smaller subsets of its training, validation, and test sets as follows.<\/p>\n<pre class=\"pl pm pn po pp pq pf pr bo ps ba bj\"><span id=\"f220\" class=\"pt nn fo pf b bf pu pv l pw px\" data-selectable-paragraph=\"\"><span class=\"hljs-keyword\">from<\/span> datasets <span class=\"hljs-keyword\">import<\/span> load_dataset\n\nds = load_dataset(<span class=\"hljs-string\">\"dair-ai\/emotion\"<\/span>)\nsmall_train_dataset = ds[<span class=\"hljs-string\">\"train\"<\/span>].shuffle(seed=<span class=\"hljs-number\">34<\/span>).take(<span class=\"hljs-number\">5000<\/span>)\nsmall_val_dataset = ds[<span class=\"hljs-string\">\"validation\"<\/span>].shuffle(seed=<span class=\"hljs-number\">34<\/span>).take(<span class=\"hljs-number\">1000<\/span>)\nsmall_test_dataset = ds[<span class=\"hljs-string\">\"test\"<\/span>].shuffle(seed=<span class=\"hljs-number\">34<\/span>).take(<span class=\"hljs-number\">1000<\/span>)<\/span><\/pre>\n<p id=\"3991\" class=\"pw-post-body-paragraph mo mp fo be b mq mr ms mt mu mv mw mx my mz na nb nc nd ne nf ng nh ni nj nk fh bj\" data-selectable-paragraph=\"\">Let\u2019s store the list of label names from the dataset in a variable called <strong class=\"be nl\">\u201ccategories.\u201d<\/strong><\/p>\n<pre class=\"pl pm pn po pp pq pf pr bo ps ba bj\"><span id=\"e375\" class=\"pt nn fo pf b bf pu pv l pw px\" data-selectable-paragraph=\"\">categories = small_train_dataset.features[<span class=\"hljs-string\">\"label\"<\/span>].names<\/span><\/pre>\n<p id=\"4c25\" class=\"pw-post-body-paragraph mo mp fo be b mq mr ms mt mu mv mw mx my mz na nb nc nd ne nf ng nh ni nj nk fh bj\" 
data-selectable-paragraph=\"\">Let\u2019s build a dictionary that maps each <strong class=\"be nl\">\u201ctag name\u201d<\/strong> to its <strong class=\"be nl\">\u201ctag id\u201d <\/strong>for use when converting to the SpaCy data format.<\/p>\n<pre class=\"pl pm pn po pp pq pf pr bo ps ba bj\"><span id=\"d977\" class=\"pt nn fo pf b bf pu pv l pw px\" data-selectable-paragraph=\"\">labels_ = {}\n<span class=\"hljs-keyword\">for<\/span> index, key <span class=\"hljs-keyword\">in<\/span> <span class=\"hljs-built_in\">enumerate<\/span>(categories):\n  labels_[key] = index<\/span><\/pre>\n<h2 id=\"b604\" class=\"py nn fo be no pz qa qb ns qc qd qe nw my qf qg qh nc qi qj qk ng ql qm qn qo bj\" data-selectable-paragraph=\"\">Step 4: Convert to spaCy Data Format and Save to Disk<\/h2>\n<p id=\"5928\" class=\"pw-post-body-paragraph mo mp fo be b mq ok ms mt mu ol mw mx my om na nb nc on ne nf ng oo ni nj nk fh bj\" data-selectable-paragraph=\"\">We need to convert each text and its label into a SpaCy Doc object.<\/p>\n<pre class=\"pl pm pn po pp pq pf pr bo ps ba bj\"><span id=\"06b6\" class=\"pt nn fo pf b bf pu pv l pw px\" data-selectable-paragraph=\"\"><span class=\"hljs-keyword\">import<\/span> spacy\n<span class=\"hljs-keyword\">from<\/span> spacy.tokens <span class=\"hljs-keyword\">import<\/span> DocBin\n<span class=\"hljs-keyword\">from<\/span> tqdm <span class=\"hljs-keyword\">import<\/span> tqdm\n\n<span class=\"hljs-keyword\">def<\/span> <span class=\"hljs-title.function\">convert_spacy_dataset<\/span>(<span class=\"hljs-params\">dataset, target_file: <span class=\"hljs-built_in\">str<\/span>, labels<\/span>):\n    nlp = spacy.blank(<span class=\"hljs-string\">\"en\"<\/span>)\n    db = DocBin()\n\n    <span class=\"hljs-keyword\">for<\/span> item <span class=\"hljs-keyword\">in<\/span> tqdm(dataset):\n        doc = nlp.make_doc(item[<span class=\"hljs-string\">\"text\"<\/span>])\n        doc.cats = {label: <span class=\"hljs-number\">0<\/span> <span class=\"hljs-keyword\">for<\/span> label <span class=\"hljs-keyword\">in<\/span> labels}\n        doc.cats[labels[item[<span class=\"hljs-string\">\"label\"<\/span>]]] = <span class=\"hljs-number\">1<\/span>\n\n        db.add(doc)\n\n    db.to_disk(target_file)\n    
<span class=\"hljs-keyword\">return<\/span> db<\/span><\/pre>\n<p id=\"18c4\" class=\"pw-post-body-paragraph mo mp fo be b mq mr ms mt mu mv mw mx my mz na nb nc nd ne nf ng nh ni nj nk fh bj\" data-selectable-paragraph=\"\">Let\u2019s convert the training, validation, and test subsets to SpaCy\u2019s binary DocBin format and save each one to disk under the given file name.<\/p>\n<pre class=\"pl pm pn po pp pq pf pr bo ps ba bj\"><span id=\"5ec5\" class=\"pt nn fo pf b bf pu pv l pw px\" data-selectable-paragraph=\"\">convert_spacy_dataset(small_test_dataset, <span class=\"hljs-string\">\"test_data.spacy\"<\/span>, categories)\nconvert_spacy_dataset(small_val_dataset, <span class=\"hljs-string\">\"validation_data.spacy\"<\/span>, categories)\nconvert_spacy_dataset(small_train_dataset, <span class=\"hljs-string\">\"train_data.spacy\"<\/span>, categories)<\/span><\/pre>\n<\/div>\n<\/div>\n<\/div>\n\n\n\n<div class=\"ab ca qp qq qr qs\" role=\"separator\"><\/div>\n\n\n\n<div class=\"fh fi fj fk fl\">\n<div class=\"ab ca\">\n<div class=\"ch bg et eu ev ew\">\n<h2 id=\"17ad\" class=\"py nn fo be no pz qa qb ns qc qd qe nw my qf qg qh nc qi qj qk ng ql qm qn qo bj\" data-selectable-paragraph=\"\">Step 5: Create the Configuration File<\/h2>\n<p id=\"7b4a\" class=\"pw-post-body-paragraph mo mp fo be b mq ok ms mt mu ol mw mx my om na nb nc on ne nf ng oo ni nj nk fh bj\" data-selectable-paragraph=\"\">Training in SpaCy v3 requires a config file that defines the pipeline components and all the hyperparameters. Let\u2019s use SpaCy\u2019s built-in tools to create one. 
The following command generates a configuration for a multi-label text classifier:<\/p>\n<pre class=\"pl pm pn po pp pq pf pr bo ps ba bj\"><span id=\"b40a\" class=\"pt nn fo pf b bf pu pv l pw px\" data-selectable-paragraph=\"\">python -m spacy init config --lang en --pipeline textcat_multilabel comet_config.cfg<\/span><\/pre>\n<h2 id=\"2e28\" class=\"py nn fo be no pz qa qb ns qc qd qe nw my qf qg qh nc qi qj qk ng ql qm qn qo bj\" data-selectable-paragraph=\"\">Step 6: Integration with Comet<\/h2>\n<p id=\"2bef\" class=\"pw-post-body-paragraph mo mp fo be b mq ok ms mt mu ol mw mx my om na nb nc on ne nf ng oo ni nj nk fh bj\" data-selectable-paragraph=\"\">Let\u2019s edit the generated configuration file as follows to enable Comet ML experiment tracking and send the training logs to Comet ML.<\/p>\n<p id=\"db61\" class=\"pw-post-body-paragraph mo mp fo be b mq mr ms mt mu mv mw mx my mz na nb nc nd ne nf ng nh ni nj nk fh bj\" data-selectable-paragraph=\"\">For detailed information on the integration of SpaCy with Comet ML, you can review the <a class=\"af mn\" href=\"https:\/\/www.comet.com\/docs\/v2\/integrations\/third-party-tools\/spaCy\/\" target=\"_blank\" rel=\"noopener ugc nofollow\"><strong class=\"be nl\">official documentation<\/strong><\/a><strong class=\"be nl\">.<\/strong><\/p>\n<pre class=\"pl pm pn po pp pq pf pr bo ps ba bj\"><span id=\"0b2a\" class=\"pt nn fo pf b bf pu pv l pw px\" data-selectable-paragraph=\"\">[training.logger]\n@loggers = \"comet_ml.spacy.logger.v1\"\nworkspace = \"&lt;COMET WORKSPACE NAME&gt;\"\nproject_name = \"&lt;COMET PROJECT NAME&gt;\"\nremove_config_values = [\"paths.train\", \"paths.dev\", \"corpora.train.path\", \"corpora.dev.path\"]<\/span><\/pre><\/div><\/div><\/div>\n\n\n<pre class=\"wp-block-preformatted\">[comet]\napi_key=&lt;COMET API KEY&gt;\nproject_name=&lt;COMET PROJECT NAME&gt;<\/pre>\n\n\n\n<figure class=\"wp-block-image pl pm pn po pp mb lt lu paragraph-image\"><img decoding=\"async\" 
src=\"https:\/\/miro.medium.com\/v2\/resize:fit:700\/1*HmYSNRYP7B_EAU9HgRb4pg.png\" alt=\"\"\/><figcaption class=\"wp-element-caption\">Editing config.cfg for integration with Comet<\/figcaption><\/figure>\n\n\n\n<h2 class=\"wp-block-heading py nn fo be no pz qa qb ns qc qd qe nw my qf qg qh nc qi qj qk ng ql qm qn qo bj\" id=\"20a3\">Step 7: Model Training and Evaluation<\/h2>\n\n\n\n<p class=\"pw-post-body-paragraph mo mp fo be b mq ok ms mt mu ol mw mx my om na nb nc on ne nf ng oo ni nj nk fh bj\" id=\"4bd9\">If you have completed the previous six steps, you are now ready to train a model!<\/p>\n\n\n\n<p class=\"pw-post-body-paragraph mo mp fo be b mq mr ms mt mu mv mw mx my mz na nb nc nd ne nf ng nh ni nj nk fh bj\" id=\"168e\">Passing the config file and the prepared training and validation data to the \u201cspacy_train\u201d function starts the training:<\/p>\n\n\n\n<pre class=\"wp-block-preformatted\"><span id=\"feec\" class=\"pt nn fo pf b bf pu pv l pw px\" data-selectable-paragraph=\"\"><span class=\"hljs-keyword\">from<\/span> spacy.cli.train <span class=\"hljs-keyword\">import<\/span> train <span class=\"hljs-keyword\">as<\/span> spacy_train\n\nconfig_path = <span class=\"hljs-string\">\"comet_config.cfg\"<\/span>\noutput_model_path = <span class=\"hljs-string\">\"output_models\/\"<\/span>\n\nspacy_train(\n    config_path,\n    output_path=output_model_path,\n    overrides={\n        <span class=\"hljs-string\">\"paths.train\"<\/span>: <span class=\"hljs-string\">\"train_data.spacy\"<\/span>,\n        <span class=\"hljs-string\">\"paths.dev\"<\/span>: <span class=\"hljs-string\">\"validation_data.spacy\"<\/span>,\n    },\n    use_gpu=<span class=\"hljs-number\">0<\/span>  <span class=\"hljs-comment\"># GPU device ID; use -1 to train on CPU<\/span>\n)<\/span><\/pre>\n\n\n\n<p class=\"pw-post-body-paragraph mo mp fo be b mq mr ms mt mu mv mw mx my mz na nb nc nd ne nf ng nh ni nj nk fh bj\" id=\"faa3\">During training, the logs appear in real time under your workspace project on 
Comet.<\/p>\n\n\n\n<figure class=\"wp-block-image pl pm pn po pp mb lt lu paragraph-image\"><img decoding=\"async\" src=\"https:\/\/miro.medium.com\/v2\/resize:fit:700\/1*V6z0bT81TChVt0P1nYmxrg.png\" alt=\"\"\/><\/figure>\n\n\n\n<p class=\"pw-post-body-paragraph mo mp fo be b mq mr ms mt mu mv mw mx my mz na nb nc nd ne nf ng nh ni nj nk fh bj\" id=\"9e3b\">You can also see the training outputs in the notebook.<\/p>\n\n\n\n<figure class=\"wp-block-image pl pm pn po pp mb lt lu paragraph-image\"><img decoding=\"async\" src=\"https:\/\/miro.medium.com\/v2\/resize:fit:700\/1*QelXNMTMYpMNn8PwgzbnQQ.png\" alt=\"\"\/><\/figure>\n\n\n\n<p class=\"pw-post-body-paragraph mo mp fo be b mq mr ms mt mu mv mw mx my mz na nb nc nd ne nf ng nh ni nj nk fh bj\" id=\"555a\">When the experiment is complete, two models are created in the <code class=\"cw pc pd pe pf b\">output_models<\/code> folder:<\/p>\n\n\n\n<p class=\"pw-post-body-paragraph mo mp fo be b mq mr ms mt mu mv mw mx my mz na nb nc nd ne nf ng nh ni nj nk fh bj\" id=\"dba5\">\ud83d\udd38<strong class=\"be nl\">model-best: <\/strong>model with highest score during training iteration<\/p>\n\n\n\n<p class=\"pw-post-body-paragraph mo mp fo be b mq mr ms mt mu mv mw mx my mz na nb nc nd ne nf ng nh ni nj nk fh bj\" id=\"1826\">\ud83d\udd38<strong class=\"be nl\">model-last:<\/strong> model trained in last training iteration<\/p>\n\n\n\n<p class=\"pw-post-body-paragraph mo mp fo be b mq mr ms mt mu mv mw mx my mz na nb nc nd ne nf ng nh ni nj nk fh bj\" id=\"f3dd\">Evaluation of the model performance with the test dataset that did not participate in the training can be done as follows:<\/p>\n\n\n\n<pre class=\"wp-block-preformatted\"><span id=\"1538\" class=\"pt nn fo pf b bf pu pv l pw px\" data-selectable-paragraph=\"\">python -m spacy evaluate output_models\/model-best\/ test_data.spacy<\/span><\/pre>\n\n\n\n<p class=\"pw-post-body-paragraph mo mp fo be b mq mr ms mt mu mv mw mx my mz na nb nc nd ne nf ng nh ni nj nk fh 
bj\" id=\"8c91\">Performance of the model on the test set:<\/p>\n\n\n\n<pre class=\"wp-block-preformatted\"><span id=\"432c\" class=\"pt nn fo pf b bf pu pv l pw px\" data-selectable-paragraph=\"\">================================== Results ==================================\n\nTOK                   100.00\nTEXTCAT (macro AUC)   94.46\nSPEED                 281793\n\n=========================== Textcat F (per label) ===========================\n\n               P       R       F\nsadness    91.42   75.80   82.88\njoy        90.88   83.53   87.05\nlove       76.00   44.71   56.30\nanger      93.83   58.91   72.38\nfear       94.94   61.48   74.63\nsurprise   80.00   21.62   34.04\n\n======================== Textcat ROC AUC (per label) ========================\n\n           ROC AUC\nsadness       0.96\njoy           0.96\nlove          0.94\nanger         0.96\nfear          0.94\nsurprise      0.91<\/span><\/pre>\n\n\n\n<h2 class=\"wp-block-heading py nn fo be no pz qa qb ns qc qd qe nw my qf qg qh nc qi qj qk ng ql qm qn qo bj\" id=\"087e\">Step 8: Using the Trained Model<\/h2>\n\n\n\n<p class=\"pw-post-body-paragraph mo mp fo be b mq ok ms mt mu ol mw mx my om na nb nc on ne nf ng oo ni nj nk fh bj\" id=\"55f9\">We can now use our trained classifier to predict the emotion expressed in different texts.<\/p>\n\n\n\n<p class=\"pw-post-body-paragraph mo mp fo be b mq mr ms mt mu mv mw mx my mz na nb nc nd ne nf ng nh ni nj nk fh bj\" id=\"5549\">Let\u2019s load the best trained SpaCy model from disk and run all sample sentences through the SpaCy pipeline to get the emotion scores for each one. 
Finally, let\u2019s keep only the classes whose score is at or above the 0.5 threshold and print the results.<\/p>\n\n\n\n<pre class=\"wp-block-preformatted\"><span id=\"6861\" class=\"pt nn fo pf b bf pu pv l pw px\" data-selectable-paragraph=\"\"><span class=\"hljs-keyword\">import<\/span> spacy\n\n<span class=\"hljs-comment\"># Load the best spaCy model from disk<\/span>\ntrained_nlp = spacy.load(<span class=\"hljs-string\">\"output_models\/model-best\/\"<\/span>)\n\ntexts = [<span class=\"hljs-string\">'im feeling pretty anxious'<\/span>,\n        <span class=\"hljs-string\">'i don t feel particularly agitated'<\/span>,\n        <span class=\"hljs-string\">'i find myself in the odd position of feeling supportive of'<\/span>,\n        <span class=\"hljs-string\">'i feel so cold a href http irish'<\/span>,\n        <span class=\"hljs-string\">'im feeling very peaceful about our wedding again now after having'<\/span>\n        ]\n\n<span class=\"hljs-comment\"># Collect the per-label scores for every text<\/span>\ncategory_scores = [doc.cats <span class=\"hljs-keyword\">for<\/span> doc <span class=\"hljs-keyword\">in<\/span> trained_nlp.pipe(texts)]\n\n<span class=\"hljs-comment\"># Keep only the labels scoring at or above the threshold<\/span>\nthresh = <span class=\"hljs-number\">0.5<\/span>\n<span class=\"hljs-keyword\">for<\/span> d <span class=\"hljs-keyword\">in<\/span> category_scores:\n  <span class=\"hljs-built_in\">print<\/span>(<span class=\"hljs-built_in\">dict<\/span>((k, v) <span class=\"hljs-keyword\">for<\/span> k, v <span class=\"hljs-keyword\">in<\/span> d.items() <span class=\"hljs-keyword\">if<\/span> v &gt;= thresh))<\/span><\/pre>\n\n\n\n<pre class=\"wp-block-preformatted\"><span id=\"4dcb\" class=\"pt nn fo pf b bf pu pv l pw px\" data-selectable-paragraph=\"\">{'fear': 0.6161302924156189}\n{'anger': 0.7113955020904541}\n{'love': 0.5419280529022217}\n{'anger': 0.8137673139572144}\n{'joy': 0.6693308353424072}<\/span><\/pre>\n\n\n\n<p class=\"pw-post-body-paragraph mo mp fo be b mq mr ms mt mu mv mw mx my mz na nb nc nd ne nf ng nh ni nj nk fh bj\" id=\"be12\">To access all the code in this emotion classification application example, please review the 
<a href=\"https:\/\/drive.google.com\/file\/d\/18AbUVPPsXT9B7IoRSaxGSK4IVwjkHMH1\/view?source=post_page-----eaff310c0d7c--------------------------------\">Colab Notebook<\/a>!<\/p>\n\n\n\n<div class=\"fh fi fj fk fl\">\n<div class=\"ab ca\">\n<div class=\"ch bg et eu ev ew\">\n<p id=\"83c2\" class=\"pw-post-body-paragraph mo mp fo be b mq mr ms mt mu mv mw mx my mz na nb nc nd ne nf ng nh ni nj nk fh bj\" data-selectable-paragraph=\"\">Today, data science researchers use many machine learning frameworks and third-party tools in their work. Comet integrates with the most commonly used platforms and tools, and its modular and customizable design allows researchers to maintain flexibility to allow their machine learning platform to adapt to future changes and needs.<\/p>\n<p id=\"f435\" class=\"pw-post-body-paragraph mo mp fo be b mq mr ms mt mu mv mw mx my mz na nb nc nd ne nf ng nh ni nj nk fh bj\" data-selectable-paragraph=\"\">Please visit the <a class=\"af mn\" href=\"https:\/\/www.comet.com\/docs\/v2\/integrations\/overview\/\" target=\"_blank\" rel=\"noopener ugc nofollow\"><strong class=\"be nl\">official Comet documentation page<\/strong><\/a> to review other platforms and tools that Comet integrates with!<\/p>\n<p id=\"094b\" class=\"pw-post-body-paragraph mo mp fo be b mq mr ms mt mu mv mw mx my mz na nb nc nd ne nf ng nh ni nj nk fh bj\" data-selectable-paragraph=\"\"><a class=\"af mn\" href=\"https:\/\/heartbeat.comet.ml\/pythae-comet-e78609bf7bdd\" target=\"_blank\" rel=\"noopener ugc nofollow\"><strong class=\"be nl\">Visit here<\/strong><\/a> for my blog post about the Pythae and Comet integration with the \u201cReconstructing MNIST images\u201d dataset!<\/p>\n<\/div>\n<\/div>\n<\/div>\n","protected":false},"excerpt":{"rendered":"<p>Photo by Nik on Unsplash If you are a natural language processing researcher or have an interest in this field, you have surely come across SpaCy or you are very close to it! 
SpaCy, a free, open source natural language processing library developed in Python, is very popular for use in real products. This advanced [&hellip;]<\/p>\n","protected":false},"author":8,"featured_media":0,"comment_status":"closed","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"customer_name":"","customer_description":"","customer_industry":"","customer_technologies":"","customer_logo":"","footnotes":""},"categories":[6,7],"tags":[],"coauthors":[144],"class_list":["post-7786","post","type-post","status-publish","format-standard","hentry","category-machine-learning","category-tutorials"],"yoast_head":"<!-- This site is optimized with the Yoast SEO Premium plugin v25.9 (Yoast SEO v25.9) - https:\/\/yoast.com\/wordpress\/plugins\/seo\/ -->\n<title>Emotion Classification with SpaCy v3 &amp; Comet - Comet<\/title>\n<meta name=\"robots\" content=\"index, follow, max-snippet:-1, max-image-preview:large, max-video-preview:-1\" \/>\n<link rel=\"canonical\" href=\"https:\/\/www.comet.com\/site\/blog\/emotion-classification-with-spacy-v3-comet\/\" \/>\n<meta property=\"og:locale\" content=\"en_US\" \/>\n<meta property=\"og:type\" content=\"article\" \/>\n<meta property=\"og:title\" content=\"Emotion Classification with SpaCy v3 &amp; Comet\" \/>\n<meta property=\"og:description\" content=\"Photo by Nik on Unsplash If you are a natural language processing researcher or have an interest in this field, you have surely come across SpaCy or you are very close to it! SpaCy, a free, open source natural language processing library developed in Python, is very popular for use in real products. 
This advanced [&hellip;]\" \/>\n<meta property=\"og:url\" content=\"https:\/\/www.comet.com\/site\/blog\/emotion-classification-with-spacy-v3-comet\/\" \/>\n<meta property=\"og:site_name\" content=\"Comet\" \/>\n<meta property=\"article:publisher\" content=\"https:\/\/www.facebook.com\/cometdotml\" \/>\n<meta property=\"article:published_time\" content=\"2023-10-04T18:46:19+00:00\" \/>\n<meta property=\"article:modified_time\" content=\"2025-04-24T17:06:08+00:00\" \/>\n<meta property=\"og:image\" content=\"https:\/\/miro.medium.com\/v2\/resize:fit:700\/0*L_TJCEXJeStjEX1U\" \/>\n<meta name=\"author\" content=\"Ba\u015fak Buluz K\u00f6me\u00e7o\u011flu\" \/>\n<meta name=\"twitter:card\" content=\"summary_large_image\" \/>\n<meta name=\"twitter:creator\" content=\"@Cometml\" \/>\n<meta name=\"twitter:site\" content=\"@Cometml\" \/>\n<meta name=\"twitter:label1\" content=\"Written by\" \/>\n\t<meta name=\"twitter:data1\" content=\"Ba\u015fak Buluz K\u00f6me\u00e7o\u011flu\" \/>\n\t<meta name=\"twitter:label2\" content=\"Est. reading time\" \/>\n\t<meta name=\"twitter:data2\" content=\"7 minutes\" \/>\n<!-- \/ Yoast SEO Premium plugin. -->","yoast_head_json":{"title":"Emotion Classification with SpaCy v3 & Comet - Comet","robots":{"index":"index","follow":"follow","max-snippet":"max-snippet:-1","max-image-preview":"max-image-preview:large","max-video-preview":"max-video-preview:-1"},"canonical":"https:\/\/www.comet.com\/site\/blog\/emotion-classification-with-spacy-v3-comet\/","og_locale":"en_US","og_type":"article","og_title":"Emotion Classification with SpaCy v3 & Comet","og_description":"Photo by Nik on Unsplash If you are a natural language processing researcher or have an interest in this field, you have surely come across SpaCy or you are very close to it! SpaCy, a free, open source natural language processing library developed in Python, is very popular for use in real products. 
This advanced [&hellip;]","og_url":"https:\/\/www.comet.com\/site\/blog\/emotion-classification-with-spacy-v3-comet\/","og_site_name":"Comet","article_publisher":"https:\/\/www.facebook.com\/cometdotml","article_published_time":"2023-10-04T18:46:19+00:00","article_modified_time":"2025-04-24T17:06:08+00:00","og_image":[{"url":"https:\/\/miro.medium.com\/v2\/resize:fit:700\/0*L_TJCEXJeStjEX1U","type":"","width":"","height":""}],"author":"Ba\u015fak Buluz K\u00f6me\u00e7o\u011flu","twitter_card":"summary_large_image","twitter_creator":"@Cometml","twitter_site":"@Cometml","twitter_misc":{"Written by":"Ba\u015fak Buluz K\u00f6me\u00e7o\u011flu","Est. reading time":"7 minutes"},"schema":{"@context":"https:\/\/schema.org","@graph":[{"@type":"Article","@id":"https:\/\/www.comet.com\/site\/blog\/emotion-classification-with-spacy-v3-comet\/#article","isPartOf":{"@id":"https:\/\/www.comet.com\/site\/blog\/emotion-classification-with-spacy-v3-comet\/"},"author":{"name":"Team Comet Digital","@id":"https:\/\/www.comet.com\/site\/#\/schema\/person\/6266601170c60a7a82b3e0043fbe8ddf"},"headline":"Emotion Classification with SpaCy v3 &#038; Comet","datePublished":"2023-10-04T18:46:19+00:00","dateModified":"2025-04-24T17:06:08+00:00","mainEntityOfPage":{"@id":"https:\/\/www.comet.com\/site\/blog\/emotion-classification-with-spacy-v3-comet\/"},"wordCount":886,"publisher":{"@id":"https:\/\/www.comet.com\/site\/#organization"},"image":{"@id":"https:\/\/www.comet.com\/site\/blog\/emotion-classification-with-spacy-v3-comet\/#primaryimage"},"thumbnailUrl":"https:\/\/miro.medium.com\/v2\/resize:fit:700\/0*L_TJCEXJeStjEX1U","articleSection":["Machine Learning","Tutorials"],"inLanguage":"en-US"},{"@type":"WebPage","@id":"https:\/\/www.comet.com\/site\/blog\/emotion-classification-with-spacy-v3-comet\/","url":"https:\/\/www.comet.com\/site\/blog\/emotion-classification-with-spacy-v3-comet\/","name":"Emotion Classification with SpaCy v3 & Comet - 
Comet","isPartOf":{"@id":"https:\/\/www.comet.com\/site\/#website"},"primaryImageOfPage":{"@id":"https:\/\/www.comet.com\/site\/blog\/emotion-classification-with-spacy-v3-comet\/#primaryimage"},"image":{"@id":"https:\/\/www.comet.com\/site\/blog\/emotion-classification-with-spacy-v3-comet\/#primaryimage"},"thumbnailUrl":"https:\/\/miro.medium.com\/v2\/resize:fit:700\/0*L_TJCEXJeStjEX1U","datePublished":"2023-10-04T18:46:19+00:00","dateModified":"2025-04-24T17:06:08+00:00","breadcrumb":{"@id":"https:\/\/www.comet.com\/site\/blog\/emotion-classification-with-spacy-v3-comet\/#breadcrumb"},"inLanguage":"en-US","potentialAction":[{"@type":"ReadAction","target":["https:\/\/www.comet.com\/site\/blog\/emotion-classification-with-spacy-v3-comet\/"]}]},{"@type":"ImageObject","inLanguage":"en-US","@id":"https:\/\/www.comet.com\/site\/blog\/emotion-classification-with-spacy-v3-comet\/#primaryimage","url":"https:\/\/miro.medium.com\/v2\/resize:fit:700\/0*L_TJCEXJeStjEX1U","contentUrl":"https:\/\/miro.medium.com\/v2\/resize:fit:700\/0*L_TJCEXJeStjEX1U"},{"@type":"BreadcrumbList","@id":"https:\/\/www.comet.com\/site\/blog\/emotion-classification-with-spacy-v3-comet\/#breadcrumb","itemListElement":[{"@type":"ListItem","position":1,"name":"Home","item":"https:\/\/www.comet.com\/site\/"},{"@type":"ListItem","position":2,"name":"Emotion Classification with SpaCy v3 &#038; Comet"}]},{"@type":"WebSite","@id":"https:\/\/www.comet.com\/site\/#website","url":"https:\/\/www.comet.com\/site\/","name":"Comet","description":"Build Better Models Faster","publisher":{"@id":"https:\/\/www.comet.com\/site\/#organization"},"potentialAction":[{"@type":"SearchAction","target":{"@type":"EntryPoint","urlTemplate":"https:\/\/www.comet.com\/site\/?s={search_term_string}"},"query-input":{"@type":"PropertyValueSpecification","valueRequired":true,"valueName":"search_term_string"}}],"inLanguage":"en-US"},{"@type":"Organization","@id":"https:\/\/www.comet.com\/site\/#organization","name":"Comet ML, 
Inc.","alternateName":"Comet","url":"https:\/\/www.comet.com\/site\/","logo":{"@type":"ImageObject","inLanguage":"en-US","@id":"https:\/\/www.comet.com\/site\/#\/schema\/logo\/image\/","url":"https:\/\/www.comet.com\/site\/wp-content\/uploads\/2025\/01\/logo_comet_square.png","contentUrl":"https:\/\/www.comet.com\/site\/wp-content\/uploads\/2025\/01\/logo_comet_square.png","width":310,"height":310,"caption":"Comet ML, Inc."},"image":{"@id":"https:\/\/www.comet.com\/site\/#\/schema\/logo\/image\/"},"sameAs":["https:\/\/www.facebook.com\/cometdotml","https:\/\/x.com\/Cometml","https:\/\/www.youtube.com\/channel\/UCmN63HKvfXSCS-UwVwmK8Hw"]},{"@type":"Person","@id":"https:\/\/www.comet.com\/site\/#\/schema\/person\/6266601170c60a7a82b3e0043fbe8ddf","name":"Team Comet Digital","image":{"@type":"ImageObject","inLanguage":"en-US","@id":"https:\/\/www.comet.com\/site\/#\/schema\/person\/image\/4f0c0a8cc7c0e87c636ff6a420a6647c","url":"https:\/\/www.comet.com\/site\/wp-content\/uploads\/2023\/08\/Screen-Shot-2023-08-12-at-8.58.50-AM-96x96.png","contentUrl":"https:\/\/www.comet.com\/site\/wp-content\/uploads\/2023\/08\/Screen-Shot-2023-08-12-at-8.58.50-AM-96x96.png","caption":"Team Comet 
Digital"},"sameAs":["https:\/\/www.comet.ml\/"],"url":"https:\/\/www.comet.com\/site\/blog\/author\/teamcometdigital\/"}]}},"_links":{"self":[{"href":"https:\/\/www.comet.com\/site\/wp-json\/wp\/v2\/posts\/7786","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/www.comet.com\/site\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/www.comet.com\/site\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/www.comet.com\/site\/wp-json\/wp\/v2\/users\/8"}],"replies":[{"embeddable":true,"href":"https:\/\/www.comet.com\/site\/wp-json\/wp\/v2\/comments?post=7786"}],"version-history":[{"count":1,"href":"https:\/\/www.comet.com\/site\/wp-json\/wp\/v2\/posts\/7786\/revisions"}],"predecessor-version":[{"id":15524,"href":"https:\/\/www.comet.com\/site\/wp-json\/wp\/v2\/posts\/7786\/revisions\/15524"}],"wp:attachment":[{"href":"https:\/\/www.comet.com\/site\/wp-json\/wp\/v2\/media?parent=7786"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/www.comet.com\/site\/wp-json\/wp\/v2\/categories?post=7786"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/www.comet.com\/site\/wp-json\/wp\/v2\/tags?post=7786"},{"taxonomy":"author","embeddable":true,"href":"https:\/\/www.comet.com\/site\/wp-json\/wp\/v2\/coauthors?post=7786"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}