{"id":6436,"date":"2023-06-19T19:13:11","date_gmt":"2023-06-20T03:13:11","guid":{"rendered":"https:\/\/live-cometml.pantheonsite.io\/?p=6436"},"modified":"2025-04-24T17:15:20","modified_gmt":"2025-04-24T17:15:20","slug":"how-to-build-a-text-classification-model-using-huggingface-transformers-and-comet","status":"publish","type":"post","link":"https:\/\/www.comet.com\/site\/blog\/how-to-build-a-text-classification-model-using-huggingface-transformers-and-comet\/","title":{"rendered":"How to Build a Text Classification Model Using HuggingFace Transformers and Comet"},"content":{"rendered":"\n<div class=\"fh fi fj fk fl\">\n<div class=\"ab ca\">\n<div class=\"ch bg et eu ev ew\">\n<figure class=\"lv lw lx ly lz ma ls lt paragraph-image\">\n<div class=\"mb mc eb md bg me\" tabindex=\"0\" role=\"button\">\n<div class=\"ls lt lu\">\n<figure><img loading=\"lazy\" decoding=\"async\" class=\"bg mf mg c\" style=\"color: var(--wpex-text-2); font-family: var(--wpex-body-font-family, var(--wpex-font-sans)); font-size: var(--wpex-body-font-size, 13px);\" role=\"presentation\" src=\"https:\/\/miro.medium.com\/v2\/resize:fit:700\/1*lhYlG4J73LIP4jrz-F49pg.png\" alt=\"\" width=\"814\" height=\"543\"><\/figure><p><\/p>\n<\/div>\n<\/div>\n<\/figure>\n<p id=\"2fbe\" class=\"pw-post-body-paragraph mh mi fo be b mj mk ml mm mn mo mp mq mr ms mt mu mv mw mx my mz na nb nc nd fh bj\" data-selectable-paragraph=\"\">Text classification is an interesting part of machine learning and natural language processing that is used in business and everyday life to determine the sentiment of a text. For example, a company can use machine learning to determine whether customer reviews are positive or not. Machine learning models may now be developed in a variety of methods to classify text. 
You can create one from scratch or use a pre-trained model.<\/p>\n<p id=\"4f3f\" class=\"pw-post-body-paragraph mh mi fo be b mj mk ml mm mn mo mp mq mr ms mt mu mv mw mx my mz na nb nc nd fh bj\" data-selectable-paragraph=\"\">This article will show you how to build your own text classification model using Transformers (which include state-of-the-art pre-trained models) and how to use Comet to keep track of your model\u2019s experiments. Without further ado, let\u2019s get started!<\/p>\n<h2 id=\"bb6a\" class=\"ne nf fo be ng nh ni nj nk nl nm nn no mr np nq nr mv ns nt nu mz nv nw nx ny bj\" data-selectable-paragraph=\"\">What are Transformers?<\/h2>\n<p id=\"67e1\" class=\"pw-post-body-paragraph mh mi fo be b mj nz ml mm mn oa mp mq mr ob mt mu mv oc mx my mz od nb nc nd fh bj\" data-selectable-paragraph=\"\"><a class=\"af oe\" href=\"https:\/\/huggingface.co\/docs\/transformers\/main\/en\/index\" target=\"_blank\" rel=\"noopener ugc nofollow\">Hugging Face Transformers<\/a> delivers state-of-the-art pre-trained models that let you perform tasks on many kinds of data, such as text and audio. These models rely on a machine learning approach known as transfer learning: the model has already been trained, so you don\u2019t have to worry about developing one from scratch. This can save resources such as processing power, data sourcing, and so on. 
All you need to do is fine-tune the model so that it works effectively for your application.<\/p>\n<p id=\"431d\" class=\"pw-post-body-paragraph mh mi fo be b mj mk ml mm mn mo mp mq mr ms mt mu mv mw mx my mz na nb nc nd fh bj\" data-selectable-paragraph=\"\">For example, we can use Transformers to build a text classification model, which saves time and money because we don\u2019t have to train a model from scratch or obtain a large amount of data.<\/p>\n<p id=\"536e\" class=\"pw-post-body-paragraph mh mi fo be b mj mk ml mm mn mo mp mq mr ms mt mu mv mw mx my mz na nb nc nd fh bj\" data-selectable-paragraph=\"\">Transformers is supported by the three most prominent deep learning libraries (JAX, PyTorch, and TensorFlow) with seamless integration.<\/p>\n<h2 id=\"96bd\" class=\"ne nf fo be ng nh ni nj nk nl nm nn no mr np nq nr mv ns nt nu mz nv nw nx ny bj\" data-selectable-paragraph=\"\">What is Comet?<\/h2>\n<p id=\"1e15\" class=\"pw-post-body-paragraph mh mi fo be b mj nz ml mm mn oa mp mq mr ob mt mu mv oc mx my mz od nb nc nd fh bj\" data-selectable-paragraph=\"\"><a class=\"af oe\" href=\"https:\/\/www.comet.com\/site\/\" target=\"_blank\" rel=\"noopener ugc nofollow\">Comet<\/a> is a machine learning platform that lets you track the artifacts of your machine learning experiments, such as model metrics (e.g., accuracy score, confusion matrix), hyperparameters, and model metadata.<\/p>\n<p id=\"018b\" class=\"pw-post-body-paragraph mh mi fo be b mj mk ml mm mn mo mp mq mr ms mt mu mv mw mx my mz na nb nc nd fh bj\" data-selectable-paragraph=\"\">We will use Comet to keep track of the metrics, hyperparameters, and other artifacts of the transformer model we will build for text classification.<\/p>\n<p id=\"9ccb\" class=\"pw-post-body-paragraph mh mi fo be b mj mk ml mm mn mo mp mq mr ms mt mu mv mw mx my mz na nb nc nd fh bj\" data-selectable-paragraph=\"\">Enough talking; let\u2019s get down to business!<\/p>\n<blockquote class=\"of og oh\"><p id=\"f65b\" class=\"mh mi oi be b mj mk ml mm mn mo mp mq oj ms mt mu ok mw mx my ol na nb nc nd fh bj\" data-selectable-paragraph=\"\">To get the most out of this tutorial, click <a class=\"af oe\" href=\"https:\/\/colab.research.google.com\/drive\/1_OIqw1tHwnYYSyvrQ-VaxU4z5K4lBBI9?usp=sharing\" target=\"_blank\" rel=\"noopener ugc nofollow\">here<\/a> to download the Colab notebook so you can easily follow along.<\/p><\/blockquote>\n<h2 id=\"43f8\" class=\"ne nf fo be ng nh ni nj nk nl nm nn no mr np nq nr mv ns nt nu mz nv nw nx ny bj\" data-selectable-paragraph=\"\">Dataset<\/h2>\n<p id=\"0bcf\" class=\"pw-post-body-paragraph mh mi fo be b mj nz ml mm mn oa mp mq mr ob mt mu mv oc mx my mz od nb nc nd fh bj\" data-selectable-paragraph=\"\">The dataset we will be using is the IMDb dataset, which contains movie reviews with labels indicating whether each review is positive or negative. It is a public dataset from HuggingFace and can be accessed by installing the <code class=\"cw om on oo op b\">datasets<\/code> library. 
You can install the <code class=\"cw om on oo op b\">datasets<\/code> library using <code class=\"cw om on oo op b\">pip<\/code>.<\/p>\n<h2 id=\"1e00\" class=\"ne nf fo be ng nh ni nj nk nl nm nn no mr np nq nr mv ns nt nu mz nv nw nx ny bj\" data-selectable-paragraph=\"\">Libraries, tools, and environment<\/h2>\n<p id=\"dfe5\" class=\"pw-post-body-paragraph mh mi fo be b mj nz ml mm mn oa mp mq mr ob mt mu mv oc mx my mz od nb nc nd fh bj\" data-selectable-paragraph=\"\">These are the libraries and tools we will be using:<\/p>\n<ul class=\"\">\n<li id=\"2e4b\" class=\"mh mi fo be b mj mk ml mm mn mo mp mq oj ms mt mu ok mw mx my ol na nb nc nd oq or os bj\" data-selectable-paragraph=\"\">Environment: Google Colab for experimentation.<\/li>\n<li id=\"17fc\" class=\"mh mi fo be b mj ot ml mm mn ou mp mq oj ov mt mu ok ow mx my ol ox nb nc nd oq or os bj\" data-selectable-paragraph=\"\">Transformers: for building our state-of-the-art text classification model.<\/li>\n<li id=\"5f57\" class=\"mh mi fo be b mj ot ml mm mn ou mp mq oj ov mt mu ok ow mx my ol ox nb nc nd oq or os bj\" data-selectable-paragraph=\"\">Datasets: for loading the data we will use to fine-tune our pre-trained model.<\/li>\n<li id=\"55e0\" class=\"mh mi fo be b mj ot ml mm mn ou mp mq oj ov mt mu ok ow mx my ol ox nb nc nd oq or os bj\" data-selectable-paragraph=\"\">Comet: for tracking our model\u2019s experiments.<\/li>\n<li id=\"d038\" class=\"mh mi fo be b mj ot ml mm mn ou mp mq oj ov mt mu ok ow mx my ol ox nb nc nd oq or os bj\" data-selectable-paragraph=\"\">Scikit-learn: for evaluating model performance.<\/li>\n<li id=\"7ef9\" class=\"mh mi fo be b mj ot ml mm mn ou mp mq oj ov mt mu ok ow mx my ol ox nb nc nd oq or os bj\" data-selectable-paragraph=\"\">PyTorch: the deep learning backend for Transformers.<\/li>\n<\/ul>\n<p id=\"e6b5\" class=\"pw-post-body-paragraph mh mi fo be b mj mk ml mm mn mo mp mq mr ms mt mu mv mw mx my mz na nb nc nd fh bj\" data-selectable-paragraph=\"\">As we said earlier, Transformers is backed by the three most prominent deep learning libraries: JAX, PyTorch, and TensorFlow. For this project, we will be using PyTorch.<\/p>\n<p id=\"bff5\" class=\"pw-post-body-paragraph mh mi fo be b mj mk ml mm mn mo mp mq mr ms mt mu mv mw mx my mz na nb nc nd fh bj\" data-selectable-paragraph=\"\">You can run the code below in Colab to install the above libraries.<\/p>\n<pre>%pip install comet_ml torch datasets transformers scikit-learn<\/pre>\n<h2 id=\"9d08\" class=\"ne nf fo be ng nh ni nj nk nl nm nn no mr np nq nr mv ns nt nu mz nv nw nx ny bj\" data-selectable-paragraph=\"\">Load the dataset<\/h2>\n<p id=\"a629\" class=\"pw-post-body-paragraph mh mi fo be b mj nz ml mm mn oa mp mq mr ob mt mu mv oc mx my mz od nb nc nd fh bj\" data-selectable-paragraph=\"\">Once we\u2019ve installed the above libraries, the next step is to start building our text classification model. First, we import the necessary libraries; then we initialize our Comet experiment and name our project \u201cHugging Face Text Classification\u201d.<\/p>\n<blockquote class=\"of og oh\"><p id=\"9a23\" class=\"mh mi oi be b mj mk ml mm mn mo mp mq oj ms mt mu ok mw mx my ol na nb nc nd fh bj\" data-selectable-paragraph=\"\">Note that you will be prompted for your API key. 
If you don\u2019t have one, you can sign up <a class=\"af oe\" href=\"https:\/\/www.comet.com\/site\/\" target=\"_blank\" rel=\"noopener ugc nofollow\">here<\/a> to get one.<\/p><\/blockquote>\n<p id=\"9233\" class=\"pw-post-body-paragraph mh mi fo be b mj mk ml mm mn mo mp mq mr ms mt mu mv mw mx my mz na nb nc nd fh bj\" data-selectable-paragraph=\"\">Then we load the IMDb dataset and print it out.<\/p>\n<pre>import comet_ml\nfrom datasets import load_dataset\n\nfrom sklearn.metrics import accuracy_score, precision_recall_fscore_support\nfrom transformers import AutoTokenizer, Trainer, TrainingArguments, AutoModelForSequenceClassification, DataCollatorWithPadding\n\ncomet_ml.init(project_name=\"Hugging Face Text Classification\")\n\ndf = load_dataset(\"imdb\")\nprint(df)<\/pre>\n<figure class=\"oy oz pa pb pc ma ls lt paragraph-image\">\n<div class=\"mb mc eb md bg me\" tabindex=\"0\" role=\"button\">\n<figure><img loading=\"lazy\" decoding=\"async\" class=\"bg mf mg c\" role=\"presentation\" src=\"https:\/\/miro.medium.com\/v2\/resize:fit:700\/1*O6Vvz1cd07x0O_Lm0OAfFg.png\" alt=\"\" width=\"903\" height=\"295\"><\/figure>\n<\/div>\n<\/figure>\n<p id=\"7c67\" class=\"pw-post-body-paragraph mh mi fo be b mj mk ml mm mn mo mp mq mr ms mt mu mv mw mx my mz na nb nc nd fh bj\" data-selectable-paragraph=\"\">We can see that the output is a dictionary of datasets, including an unsupervised split for unsupervised learning applications, as well as train and test splits (50:50). 
Each split has only two features: <code class=\"cw om on oo op b\">text<\/code>, which contains the review, and <code class=\"cw om on oo op b\">label<\/code>, which holds the value 0 or 1, indicating whether the review is negative or positive.<\/p>\n<p id=\"3920\" class=\"pw-post-body-paragraph mh mi fo be b mj mk ml mm mn mo mp mq mr ms mt mu mv mw mx my mz na nb nc nd fh bj\" data-selectable-paragraph=\"\">The dataset has many observations, but we will use only a subset of it: the model will be trained on 200 rows and evaluated on 100 rows. We will narrow down the sample size later in the code.<\/p>\n<\/div>\n<\/div>\n<\/div>\n\n\n\n<div class=\"fh fi fj fk fl\">\n<div class=\"ab ca\">\n<div class=\"ch bg et eu ev ew\">\n<blockquote class=\"pp\"><p id=\"ce72\" class=\"pq pr fo be ps pt pu pv pw px py nd dv\" data-selectable-paragraph=\"\">The most important thing to keep in mind when building and deploying your model? Understanding your end-goal. <a class=\"af oe\" href=\"https:\/\/www.comet.com\/site\/blog\/industry-qa-where-most-machine-learning-projects-fail\/\" target=\"_blank\" rel=\"noopener ugc nofollow\">Read our interview with ML experts from Stanford, Google, and HuggingFace to learn more<\/a>.<\/p><\/blockquote>\n<\/div>\n<\/div>\n<\/div>\n\n\n\n<div class=\"fh fi fj fk fl\">\n<div class=\"ab ca\">\n<div class=\"ch bg et eu ev ew\">\n<h2 id=\"96b8\" class=\"ne nf fo be ng nh ni nj nk nl nm nn no mr np nq nr mv ns nt nu mz nv nw nx ny bj\" data-selectable-paragraph=\"\">Defining the Transformer Model<\/h2>\n<p id=\"104b\" class=\"pw-post-body-paragraph mh mi fo be b mj nz ml mm mn oa mp mq mr ob mt mu mv oc mx my mz od nb nc nd fh bj\" data-selectable-paragraph=\"\">As mentioned earlier, Transformers includes hundreds of state-of-the-art pre-trained models, one of which we will use to create the text classification application. 
The model we will be using is known as <code class=\"cw om on oo op b\">distilbert-base-uncased<\/code>.<\/p>\n<p id=\"7478\" class=\"pw-post-body-paragraph mh mi fo be b mj mk ml mm mn mo mp mq mr ms mt mu mv mw mx my mz na nb nc nd fh bj\" data-selectable-paragraph=\"\"><code class=\"cw om on oo op b\"><a class=\"af oe\" href=\"https:\/\/huggingface.co\/docs\/transformers\/model_doc\/distilbert\" target=\"_blank\" rel=\"noopener ugc nofollow\">distilbert-base-uncased<\/a><\/code> is a distilled version of BERT available on HuggingFace: smaller and faster than BERT while retaining most of its language-understanding performance, with the \u201cuncased\u201d variant making no distinction between upper- and lowercase text. We will fine-tune DistilBERT to identify whether a customer review is positive or negative.<\/p>\n<p id=\"40b5\" class=\"pw-post-body-paragraph mh mi fo be b mj mk ml mm mn mo mp mq mr ms mt mu mv mw mx my mz na nb nc nd fh bj\" data-selectable-paragraph=\"\">We will assign the model name to a variable and also set a <code class=\"cw om on oo op b\">random_seed<\/code> value for reproducibility.<\/p>\n<pre>PRE_TRAINED_MODEL_NAME = \"distilbert-base-uncased\"\nrandom_seed = 42<\/pre>\n<h2 id=\"b0d6\" class=\"ne nf fo be ng nh ni nj nk nl nm nn no mr np nq nr mv ns nt nu mz nv nw nx ny bj\" data-selectable-paragraph=\"\">Tokenize Dataset<\/h2>\n<p id=\"fc0d\" class=\"pw-post-body-paragraph mh mi fo be b mj nz ml mm mn oa mp mq mr ob mt mu mv oc mx my mz od nb nc nd fh bj\" data-selectable-paragraph=\"\">Now that we\u2019ve done that, the next step is to tokenize the data (i.e., get it into a form the model can interpret). Note that in the same block of code, we also select the smaller subset of data we mentioned earlier (300 rows in total). 
We can do that by typing the following:<\/p>\n<pre>tokenizer = AutoTokenizer.from_pretrained(PRE_TRAINED_MODEL_NAME)\n\ndef tokenize_function(data):\n    return tokenizer(data[\"text\"], padding=\"max_length\", truncation=True)\n\n\ntokenize_df = df.map(tokenize_function, batched=True)\ntrain_df = tokenize_df[\"train\"].shuffle(seed=random_seed).select(range(200))\ntest_df = tokenize_df[\"test\"].shuffle(seed=random_seed).select(range(100))\n\ndata_collator = DataCollatorWithPadding(tokenizer=tokenizer)<\/pre>\n<p id=\"7682\" class=\"pw-post-body-paragraph mh mi fo be b mj mk ml mm mn mo mp mq mr ms mt mu mv mw mx my mz na nb nc nd fh bj\" data-selectable-paragraph=\"\">Let\u2019s break that down line by line.<\/p>\n<p id=\"66df\" class=\"pw-post-body-paragraph mh mi fo be b mj mk ml mm mn mo mp mq mr ms mt mu mv mw mx my mz na nb nc nd fh bj\" data-selectable-paragraph=\"\">What we did in the code above was create a tokenize function that processes the text so that it is ready for the model.<\/p>\n<pre class=\"oy oz pa pb pc pz op qa qb ax qc bj\"><span id=\"2335\" class=\"ne nf fo op b ho qd qe l ie qf\" data-selectable-paragraph=\"\">tokenizer = AutoTokenizer.from_pretrained(PRE_TRAINED_MODEL_NAME)<\/span><\/pre>\n<p id=\"3396\" class=\"pw-post-body-paragraph mh mi fo be b mj mk ml mm mn mo mp mq mr ms mt mu mv mw mx my mz na nb nc nd fh bj\" data-selectable-paragraph=\"\">With the first line of code (above), we instantiated the <code class=\"cw om on oo op b\">AutoTokenizer<\/code> class and passed in the name of the pre-trained model, which we stored earlier in a variable (\u201cdistilbert-base-uncased\u201d).<\/p>\n<p id=\"357c\" class=\"pw-post-body-paragraph mh mi fo be b mj mk ml mm mn mo mp mq mr ms mt mu mv mw mx my mz na nb nc nd fh bj\" data-selectable-paragraph=\"\">Then we created a function that will tokenize the text. 
The parameters <code class=\"cw om on oo op b\">padding<\/code> and <code class=\"cw om on oo op b\">truncation<\/code> are required because, when text is collected in batches, the sequences are not all the same length, yet the model must translate each text into a fixed-size tensor. Padding adds padding tokens so that shorter sequences become the same length as the longest one, and truncation, when set to <code class=\"cw om on oo op b\">True<\/code>, cuts sequences down to the maximum length the model can handle.<\/p>\n<p id=\"8f92\" class=\"pw-post-body-paragraph mh mi fo be b mj mk ml mm mn mo mp mq mr ms mt mu mv mw mx my mz na nb nc nd fh bj\" data-selectable-paragraph=\"\">To learn more about this process click <a class=\"af oe\" href=\"https:\/\/huggingface.co\/docs\/transformers\/pad_truncation\" target=\"_blank\" rel=\"noopener ugc nofollow\">here<\/a>.<\/p>\n<p id=\"8f94\" class=\"pw-post-body-paragraph mh mi fo be b mj mk ml mm mn mo mp mq mr ms mt mu mv mw mx my mz na nb nc nd fh bj\" data-selectable-paragraph=\"\">The next part maps the function over the dataset; setting <code class=\"cw om on oo op b\">batched<\/code> to <code class=\"cw om on oo op b\">True<\/code> is also needed so that the texts are processed in batches, which is faster.<\/p>\n<p id=\"9600\" class=\"pw-post-body-paragraph mh mi fo be b mj mk ml mm mn mo mp mq mr ms mt mu mv mw mx my mz na nb nc nd fh bj\" data-selectable-paragraph=\"\">Then, as previously stated, we select the subset of data we require: 200 rows for the training set and 100 for the evaluation set.<\/p>\n<p id=\"cb8e\" class=\"pw-post-body-paragraph mh mi fo be b mj mk ml mm mn mo mp mq mr ms mt mu mv mw mx my mz na nb nc nd fh bj\" data-selectable-paragraph=\"\">The last line creates the data collator the model will use to assemble batches. 
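To make this concrete, here is a minimal pure-Python sketch of what padding and truncation do to two sequences of different lengths. This is not from the notebook: the token ids, pad id, and maximum length below are invented for illustration, and the real tokenizer handles all of this internally.

```python
# Sketch of how padding and truncation normalize sequence lengths.
# The token ids here are made up; a real tokenizer produces them from text.
PAD_ID = 0
MAX_LEN = 8

def pad_or_truncate(ids, max_len=MAX_LEN, pad_id=PAD_ID):
    # Truncate anything longer than max_len ...
    ids = ids[:max_len]
    # ... and right-pad anything shorter, so every sequence
    # in the batch ends up exactly max_len tokens long.
    return ids + [pad_id] * (max_len - len(ids))

short = [101, 2023, 3185, 102]                               # 4 tokens
long = [101, 2023, 3185, 2001, 2307, 1998, 2986, 2135, 102]  # 9 tokens

batch = [pad_or_truncate(s) for s in (short, long)]
print(batch)  # both rows now have length 8
```

The short sequence gains trailing pad tokens and the long one loses its tail, so the batch can be stacked into one fixed-size tensor.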
We\u2019ll see how this works at the end of the article.<\/p>\n<h2 id=\"3bf2\" class=\"ne nf fo be ng nh ni nj nk nl nm nn no mr np nq nr mv ns nt nu mz nv nw nx ny bj\" data-selectable-paragraph=\"\">Instantiating the Transformer Model<\/h2>\n<p id=\"a4c2\" class=\"pw-post-body-paragraph mh mi fo be b mj nz ml mm mn oa mp mq mr ob mt mu mv oc mx my mz od nb nc nd fh bj\" data-selectable-paragraph=\"\">Once the data is ready, we can build our transformer model. To do that, we type the following:<\/p>\n<pre>model = AutoModelForSequenceClassification.from_pretrained(\n    PRE_TRAINED_MODEL_NAME, num_labels=2\n)<\/pre>\n<h2 id=\"7d91\" class=\"ne nf fo be ng nh ni nj nk nl nm nn no mr np nq nr mv ns nt nu mz nv nw nx ny bj\" data-selectable-paragraph=\"\">Creating the Trainer<\/h2>\n<p id=\"1ee8\" class=\"pw-post-body-paragraph mh mi fo be b mj nz ml mm mn oa mp mq mr ob mt mu mv oc mx my mz od nb nc nd fh bj\" data-selectable-paragraph=\"\">Now that we\u2019ve instantiated our model, the next step is to create the Trainer object that will be used to fine-tune it. The Trainer object takes many parameters, but we will make use of the seven most important ones. 
These are:<\/p>\n<ul class=\"\">\n<li id=\"8fd6\" class=\"mh mi fo be b mj mk ml mm mn mo mp mq oj ms mt mu ok mw mx my ol na nb nc nd oq or os bj\" data-selectable-paragraph=\"\"><code class=\"cw om on oo op b\"><strong class=\"be qg\">model<\/strong><\/code>: The model for predictions.<\/li>\n<li id=\"b48a\" class=\"mh mi fo be b mj ot ml mm mn ou mp mq oj ov mt mu ok ow mx my ol ox nb nc nd oq or os bj\" data-selectable-paragraph=\"\"><code class=\"cw om on oo op b\"><strong class=\"be qg\">args<\/strong><\/code>: The arguments to tweak or fine-tune.<\/li>\n<li id=\"5be3\" class=\"mh mi fo be b mj ot ml mm mn ou mp mq oj ov mt mu ok ow mx my ol ox nb nc nd oq or os bj\" data-selectable-paragraph=\"\"><code class=\"cw om on oo op b\"><strong class=\"be qg\">data_collator<\/strong><\/code>: The function that will be used to form a batch of the training and test set.<\/li>\n<li id=\"f863\" class=\"mh mi fo be b mj ot ml mm mn ou mp mq oj ov mt mu ok ow mx my ol ox nb nc nd oq or os bj\" data-selectable-paragraph=\"\"><code class=\"cw om on oo op b\"><strong class=\"be qg\">train_dataset<\/strong><\/code>: The dataset to be used for training.<\/li>\n<li id=\"d588\" class=\"mh mi fo be b mj ot ml mm mn ou mp mq oj ov mt mu ok ow mx my ol ox nb nc nd oq or os bj\" data-selectable-paragraph=\"\"><code class=\"cw om on oo op b\"><strong class=\"be qg\">eval_dataset<\/strong><\/code>: The dataset to be used for evaluation.<\/li>\n<li id=\"edd7\" class=\"mh mi fo be b mj ot ml mm mn ou mp mq oj ov mt mu ok ow mx my ol ox nb nc nd oq or os bj\" data-selectable-paragraph=\"\"><code class=\"cw om on oo op b\"><strong class=\"be qg\">tokenizer<\/strong><\/code>: The tokenizer that is used to preprocess the data.<\/li>\n<li id=\"8ca1\" class=\"mh mi fo be b mj ot ml mm mn ou mp mq oj ov mt mu ok ow mx my ol ox nb nc nd oq or os bj\" data-selectable-paragraph=\"\"><code class=\"cw om on oo op b\"><strong class=\"be qg\">compute_metrics<\/strong><\/code>: The function that will 
be used to compute the metrics for evaluating the model.<\/li>\n<\/ul>\n<p id=\"9d63\" class=\"pw-post-body-paragraph mh mi fo be b mj mk ml mm mn mo mp mq mr ms mt mu mv mw mx my mz na nb nc nd fh bj\" data-selectable-paragraph=\"\">Two of these parameters don\u2019t exist yet: the training arguments and the function that computes metrics on the evaluation set. First, we will create the training arguments. To do so, we use the <code class=\"cw om on oo op b\">TrainingArguments<\/code> object we imported earlier and pass it the arguments we want to use. We can accomplish this by entering the code below.<\/p>\n<pre>training_arguments = TrainingArguments(\n    seed=random_seed,\n    optim=\"adamw_torch\",\n    learning_rate=5e-5,\n    num_train_epochs=1,\n    output_dir=\".\/results\",\n    overwrite_output_dir=True,\n    do_train=True,\n    do_eval=True,\n    evaluation_strategy=\"steps\",\n    eval_steps=25,\n    save_strategy=\"steps\",\n    save_total_limit=10,\n    save_steps=25\n)<\/pre>\n<p id=\"e07b\" class=\"pw-post-body-paragraph mh mi fo be b mj mk ml mm mn mo mp mq mr ms mt mu mv mw mx my mz na nb nc nd fh bj\" data-selectable-paragraph=\"\">The Transformers library provides an integration with Comet that automatically reports all metrics, logs, etc., to the Comet platform.<\/p>\n<p id=\"7133\" class=\"pw-post-body-paragraph mh mi fo be b mj mk ml mm mn mo mp mq mr ms mt mu mv mw mx my mz na nb nc nd fh bj\" data-selectable-paragraph=\"\">The <code class=\"cw om on oo op b\">TrainingArguments<\/code> class accepts many parameters (about 94 of them in total!). 
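For instance, here is a sketch of a few other commonly tuned knobs. The parameter names are real `TrainingArguments` options, but the values are arbitrary examples, not settings used in this tutorial:

```python
# A few commonly tuned TrainingArguments parameters beyond those used
# above. The names match the transformers API; the values are
# illustrative examples only, not recommendations.
extra_args = {
    "per_device_train_batch_size": 16,  # batch size per GPU/CPU during training
    "per_device_eval_batch_size": 16,   # batch size during evaluation
    "weight_decay": 0.01,               # L2-style regularization used by AdamW
    "warmup_steps": 10,                 # learning-rate warmup before decay
    "logging_steps": 25,                # how often to log training metrics
}

# These could be merged into the call above, e.g.:
# training_arguments = TrainingArguments(output_dir="./results", **extra_args)
print(sorted(extra_args))
```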
You can click <a class=\"af oe\" href=\"https:\/\/huggingface.co\/docs\/transformers\/v4.24.0\/en\/main_classes\/trainer#transformers.TrainingArguments\" target=\"_blank\" rel=\"noopener ugc nofollow\">here<\/a> to learn more about tuning your model.<\/p>\n<p id=\"9dd1\" class=\"pw-post-body-paragraph mh mi fo be b mj mk ml mm mn mo mp mq mr ms mt mu mv mw mx my mz na nb nc nd fh bj\" data-selectable-paragraph=\"\">After that, we can create the function that will be used to compute the metrics.<\/p>\n<pre>def compute_metrics(pred):\n\n    # get the global Comet experiment\n    experiment = comet_ml.get_global_experiment()\n\n    # get y_true and y_pred for the eval dataset\n    labels = pred.label_ids\n    preds = pred.predictions.argmax(-1)\n\n    # compute macro-averaged precision, recall, and F1 score\n    precision, recall, f1, _ = precision_recall_fscore_support(\n        labels, preds, average='macro')\n\n    # compute accuracy score\n    acc = accuracy_score(labels, preds)\n\n    # log the confusion matrix to Comet\n    if experiment:\n        epoch = int(experiment.curr_epoch) if experiment.curr_epoch is not None else 0\n        experiment.set_epoch(epoch)\n        experiment.log_confusion_matrix(\n            y_true=labels,\n            y_predicted=preds,\n            labels=[\"negative\", \"positive\"]\n        )\n\n    return {\"accuracy\": acc,\n            \"f1\": f1,\n            \"precision\": precision,\n            \"recall\": recall\n            }<\/pre>\n<p id=\"8ada\" class=\"pw-post-body-paragraph mh mi fo be b mj mk ml mm mn mo mp mq mr ms mt mu mv mw mx my mz na nb nc nd fh bj\" data-selectable-paragraph=\"\">Now that we are done with that, we can build our Trainer and train the model with the following code.<\/p>\n<pre>%env COMET_MODE=ONLINE\n%env COMET_LOG_ASSETS=TRUE\ntrainer = Trainer(\n    model=model,\n    args=training_arguments,\n    train_dataset=train_df,\n    eval_dataset=test_df,\n    compute_metrics=compute_metrics,\n    
data_collator=data_collator,\n)\ntrainer.train()<\/pre>\n<h2 id=\"d370\" class=\"ne nf fo be ng nh ni nj nk nl nm nn no mr np nq nr mv ns nt nu mz nv nw nx ny bj\" data-selectable-paragraph=\"\">Results<\/h2>\n<figure class=\"oy oz pa pb pc ma\">\n<div class=\"pd ig l eb\">\n<div class=\"qh pf l\"><iframe loading=\"lazy\" class=\"ek n fc dx bg\" title=\"New Recording - 11\/6\/2022, 4:14:41 PM\" src=\"https:\/\/cdn.embedly.com\/widgets\/media.html?src=https%3A%2F%2Fplayer.vimeo.com%2Fvideo%2F767808122%3Fh%3De9932545a6%26app_id%3D122963&amp;dntp=1&amp;display_name=Vimeo&amp;url=https%3A%2F%2Fvimeo.com%2F767808122%2Fe9932545a6&amp;image=https%3A%2F%2Fi.vimeocdn.com%2Fvideo%2F1541661715-aa7d4dc91d1223554adc8241be7218b6825552898a9d59ffecdd6aa20283f365-d_1280&amp;key=a19fcc184b9711e1b4764040d3dc5c07&amp;type=text%2Fhtml&amp;schema=vimeo\" width=\"1536\" height=\"864\" frameborder=\"0\" scrolling=\"no\" allowfullscreen=\"allowfullscreen\"><\/iframe><\/div>\n<\/div>\n<\/figure>\n<h1 id=\"3594\" class=\"qi nf fo be ng qj qk ql nk qm qn qo no qp qq qr qs qt qu qv qw qx qy qz ra rb bj\" data-selectable-paragraph=\"\">Conclusion<\/h1>\n<p id=\"ccf9\" class=\"pw-post-body-paragraph mh mi fo be b mj nz ml mm mn oa mp mq mr ob mt mu mv oc mx my mz od nb nc nd fh bj\" data-selectable-paragraph=\"\">In this article, you learned how to create a text classification model using Transformers and how to keep track of the model\u2019s experiments in Comet. Given that the training set was only 200 observations, our model worked reasonably well. You can likely improve its accuracy by increasing the training set size. Thank you for reading. 
You can find the link to the colab below.<\/p>\n<ul>\n<li><a href=\"https:\/\/colab.research.google.com\/drive\/1_OIqw1tHwnYYSyvrQ-VaxU4z5K4lBBI9?usp=sharing\">Text Classification<\/a><\/li>\n<li><a href=\"https:\/\/huggingface.co\/docs\/transformers\/main\/en\/index\">Transformers<\/a><\/li>\n<li><a href=\"https:\/\/www.comet.com\/site\/\">Comet<\/a><\/li>\n<\/ul>\n<\/div>\n<\/div>\n<\/div>\n","protected":false},"excerpt":{"rendered":"<p>Text classification is an interesting part of machine learning and natural language processing that is used in business and everyday life to determine the sentiment of a text. For example, a company can use machine learning to determine whether customer reviews are positive or not. Machine learning models may now be developed in a variety [&hellip;]<\/p>\n","protected":false},"author":20,"featured_media":0,"comment_status":"closed","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"customer_name":"","customer_description":"","customer_industry":"","customer_technologies":"","customer_logo":"","footnotes":""},"categories":[6,7],"tags":[],"coauthors":[108,137],"class_list":["post-6436","post","type-post","status-publish","format-standard","hentry","category-machine-learning","category-tutorials"],"yoast_head":"<!-- This site is optimized with the Yoast SEO Premium plugin v25.9 (Yoast SEO v25.9) - https:\/\/yoast.com\/wordpress\/plugins\/seo\/ -->\n<title>How to Build a Text Classification Model Using HuggingFace Transformers and Comet - Comet<\/title>\n<meta name=\"robots\" content=\"index, follow, max-snippet:-1, max-image-preview:large, max-video-preview:-1\" \/>\n<link rel=\"canonical\" href=\"https:\/\/www.comet.com\/site\/blog\/how-to-build-a-text-classification-model-using-huggingface-transformers-and-comet\/\" \/>\n<meta property=\"og:locale\" content=\"en_US\" \/>\n<meta property=\"og:type\" content=\"article\" \/>\n<meta property=\"og:title\" content=\"How to Build a Text Classification Model 
Using HuggingFace Transformers and Comet\" \/>\n<meta property=\"og:description\" content=\"Text classification is an interesting part of machine learning and natural language processing that is used in business and everyday life to determine the sentiment of a text. For example, a company can use machine learning to determine whether customer reviews are positive or not. Machine learning models may now be developed in a variety [&hellip;]\" \/>\n<meta property=\"og:url\" content=\"https:\/\/www.comet.com\/site\/blog\/how-to-build-a-text-classification-model-using-huggingface-transformers-and-comet\/\" \/>\n<meta property=\"og:site_name\" content=\"Comet\" \/>\n<meta property=\"article:publisher\" content=\"https:\/\/www.facebook.com\/cometdotml\" \/>\n<meta property=\"article:published_time\" content=\"2023-06-20T03:13:11+00:00\" \/>\n<meta property=\"article:modified_time\" content=\"2025-04-24T17:15:20+00:00\" \/>\n<meta property=\"og:image\" content=\"https:\/\/miro.medium.com\/v2\/resize:fit:700\/1*lhYlG4J73LIP4jrz-F49pg.png\" \/>\n<meta name=\"author\" content=\"Sharmila Chockalingam, Ibrahim Ogunbiyi\" \/>\n<meta name=\"twitter:card\" content=\"summary_large_image\" \/>\n<meta name=\"twitter:creator\" content=\"@Cometml\" \/>\n<meta name=\"twitter:site\" content=\"@Cometml\" \/>\n<meta name=\"twitter:label1\" content=\"Written by\" \/>\n\t<meta name=\"twitter:data1\" content=\"Sharmila Chockalingam, Ibrahim Ogunbiyi\" \/>\n\t<meta name=\"twitter:label2\" content=\"Est. reading time\" \/>\n\t<meta name=\"twitter:data2\" content=\"8 minutes\" \/>\n<!-- \/ Yoast SEO Premium plugin. 
-->","yoast_head_json":{"title":"How to Build a Text Classification Model Using HuggingFace Transformers and Comet - Comet","robots":{"index":"index","follow":"follow","max-snippet":"max-snippet:-1","max-image-preview":"max-image-preview:large","max-video-preview":"max-video-preview:-1"},"canonical":"https:\/\/www.comet.com\/site\/blog\/how-to-build-a-text-classification-model-using-huggingface-transformers-and-comet\/","og_locale":"en_US","og_type":"article","og_title":"How to Build a Text Classification Model Using HuggingFace Transformers and Comet","og_description":"Text classification is an interesting part of machine learning and natural language processing that is used in business and everyday life to determine the sentiment of a text. For example, a company can use machine learning to determine whether customer reviews are positive or not. Machine learning models may now be developed in a variety [&hellip;]","og_url":"https:\/\/www.comet.com\/site\/blog\/how-to-build-a-text-classification-model-using-huggingface-transformers-and-comet\/","og_site_name":"Comet","article_publisher":"https:\/\/www.facebook.com\/cometdotml","article_published_time":"2023-06-20T03:13:11+00:00","article_modified_time":"2025-04-24T17:15:20+00:00","og_image":[{"url":"https:\/\/miro.medium.com\/v2\/resize:fit:700\/1*lhYlG4J73LIP4jrz-F49pg.png","type":"","width":"","height":""}],"author":"Sharmila Chockalingam, Ibrahim Ogunbiyi","twitter_card":"summary_large_image","twitter_creator":"@Cometml","twitter_site":"@Cometml","twitter_misc":{"Written by":"Sharmila Chockalingam, Ibrahim Ogunbiyi","Est. 
reading time":"8 minutes"},"schema":{"@context":"https:\/\/schema.org","@graph":[{"@type":"Article","@id":"https:\/\/www.comet.com\/site\/blog\/how-to-build-a-text-classification-model-using-huggingface-transformers-and-comet\/#article","isPartOf":{"@id":"https:\/\/www.comet.com\/site\/blog\/how-to-build-a-text-classification-model-using-huggingface-transformers-and-comet\/"},"author":{"name":"Sharmila Chockalingam","@id":"https:\/\/www.comet.com\/site\/#\/schema\/person\/bc6801cf8b757256d822841d7110e9c2"},"headline":"How to Build a Text Classification Model Using HuggingFace Transformers and Comet","datePublished":"2023-06-20T03:13:11+00:00","dateModified":"2025-04-24T17:15:20+00:00","mainEntityOfPage":{"@id":"https:\/\/www.comet.com\/site\/blog\/how-to-build-a-text-classification-model-using-huggingface-transformers-and-comet\/"},"wordCount":1456,"publisher":{"@id":"https:\/\/www.comet.com\/site\/#organization"},"image":{"@id":"https:\/\/www.comet.com\/site\/blog\/how-to-build-a-text-classification-model-using-huggingface-transformers-and-comet\/#primaryimage"},"thumbnailUrl":"https:\/\/miro.medium.com\/v2\/resize:fit:700\/1*lhYlG4J73LIP4jrz-F49pg.png","articleSection":["Machine Learning","Tutorials"],"inLanguage":"en-US"},{"@type":"WebPage","@id":"https:\/\/www.comet.com\/site\/blog\/how-to-build-a-text-classification-model-using-huggingface-transformers-and-comet\/","url":"https:\/\/www.comet.com\/site\/blog\/how-to-build-a-text-classification-model-using-huggingface-transformers-and-comet\/","name":"How to Build a Text Classification Model Using HuggingFace Transformers and Comet - 
Comet","isPartOf":{"@id":"https:\/\/www.comet.com\/site\/#website"},"primaryImageOfPage":{"@id":"https:\/\/www.comet.com\/site\/blog\/how-to-build-a-text-classification-model-using-huggingface-transformers-and-comet\/#primaryimage"},"image":{"@id":"https:\/\/www.comet.com\/site\/blog\/how-to-build-a-text-classification-model-using-huggingface-transformers-and-comet\/#primaryimage"},"thumbnailUrl":"https:\/\/miro.medium.com\/v2\/resize:fit:700\/1*lhYlG4J73LIP4jrz-F49pg.png","datePublished":"2023-06-20T03:13:11+00:00","dateModified":"2025-04-24T17:15:20+00:00","breadcrumb":{"@id":"https:\/\/www.comet.com\/site\/blog\/how-to-build-a-text-classification-model-using-huggingface-transformers-and-comet\/#breadcrumb"},"inLanguage":"en-US","potentialAction":[{"@type":"ReadAction","target":["https:\/\/www.comet.com\/site\/blog\/how-to-build-a-text-classification-model-using-huggingface-transformers-and-comet\/"]}]},{"@type":"ImageObject","inLanguage":"en-US","@id":"https:\/\/www.comet.com\/site\/blog\/how-to-build-a-text-classification-model-using-huggingface-transformers-and-comet\/#primaryimage","url":"https:\/\/miro.medium.com\/v2\/resize:fit:700\/1*lhYlG4J73LIP4jrz-F49pg.png","contentUrl":"https:\/\/miro.medium.com\/v2\/resize:fit:700\/1*lhYlG4J73LIP4jrz-F49pg.png"},{"@type":"BreadcrumbList","@id":"https:\/\/www.comet.com\/site\/blog\/how-to-build-a-text-classification-model-using-huggingface-transformers-and-comet\/#breadcrumb","itemListElement":[{"@type":"ListItem","position":1,"name":"Home","item":"https:\/\/www.comet.com\/site\/"},{"@type":"ListItem","position":2,"name":"How to Build a Text Classification Model Using HuggingFace Transformers and Comet"}]},{"@type":"WebSite","@id":"https:\/\/www.comet.com\/site\/#website","url":"https:\/\/www.comet.com\/site\/","name":"Comet","description":"Build Better Models 
Faster","publisher":{"@id":"https:\/\/www.comet.com\/site\/#organization"},"potentialAction":[{"@type":"SearchAction","target":{"@type":"EntryPoint","urlTemplate":"https:\/\/www.comet.com\/site\/?s={search_term_string}"},"query-input":{"@type":"PropertyValueSpecification","valueRequired":true,"valueName":"search_term_string"}}],"inLanguage":"en-US"},{"@type":"Organization","@id":"https:\/\/www.comet.com\/site\/#organization","name":"Comet ML, Inc.","alternateName":"Comet","url":"https:\/\/www.comet.com\/site\/","logo":{"@type":"ImageObject","inLanguage":"en-US","@id":"https:\/\/www.comet.com\/site\/#\/schema\/logo\/image\/","url":"https:\/\/www.comet.com\/site\/wp-content\/uploads\/2025\/01\/logo_comet_square.png","contentUrl":"https:\/\/www.comet.com\/site\/wp-content\/uploads\/2025\/01\/logo_comet_square.png","width":310,"height":310,"caption":"Comet ML, Inc."},"image":{"@id":"https:\/\/www.comet.com\/site\/#\/schema\/logo\/image\/"},"sameAs":["https:\/\/www.facebook.com\/cometdotml","https:\/\/x.com\/Cometml","https:\/\/www.youtube.com\/channel\/UCmN63HKvfXSCS-UwVwmK8Hw"]},{"@type":"Person","@id":"https:\/\/www.comet.com\/site\/#\/schema\/person\/bc6801cf8b757256d822841d7110e9c2","name":"Sharmila Chockalingam","image":{"@type":"ImageObject","inLanguage":"en-US","@id":"https:\/\/www.comet.com\/site\/#\/schema\/person\/image\/6373027f0bdad8c3a2bd3069f1c48136","url":"https:\/\/www.comet.com\/site\/wp-content\/uploads\/2023\/08\/sharmilachockalingam-96x96.jpg","contentUrl":"https:\/\/www.comet.com\/site\/wp-content\/uploads\/2023\/08\/sharmilachockalingam-96x96.jpg","caption":"Sharmila 
Chockalingam"},"sameAs":["https:\/\/www.comet.com\/"],"url":"https:\/\/www.comet.com\/site\/blog\/author\/sharmilaccomet-com\/"}]}},"_links":{"self":[{"href":"https:\/\/www.comet.com\/site\/wp-json\/wp\/v2\/posts\/6436","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/www.comet.com\/site\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/www.comet.com\/site\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/www.comet.com\/site\/wp-json\/wp\/v2\/users\/20"}],"replies":[{"embeddable":true,"href":"https:\/\/www.comet.com\/site\/wp-json\/wp\/v2\/comments?post=6436"}],"version-history":[{"count":1,"href":"https:\/\/www.comet.com\/site\/wp-json\/wp\/v2\/posts\/6436\/revisions"}],"predecessor-version":[{"id":15610,"href":"https:\/\/www.comet.com\/site\/wp-json\/wp\/v2\/posts\/6436\/revisions\/15610"}],"wp:attachment":[{"href":"https:\/\/www.comet.com\/site\/wp-json\/wp\/v2\/media?parent=6436"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/www.comet.com\/site\/wp-json\/wp\/v2\/categories?post=6436"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/www.comet.com\/site\/wp-json\/wp\/v2\/tags?post=6436"},{"taxonomy":"author","embeddable":true,"href":"https:\/\/www.comet.com\/site\/wp-json\/wp\/v2\/coauthors?post=6436"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}