{"id":8752,"date":"2024-01-18T06:00:52","date_gmt":"2024-01-18T14:00:52","guid":{"rendered":"https:\/\/live-cometml.pantheonsite.io\/?p=8752"},"modified":"2025-04-24T17:03:30","modified_gmt":"2025-04-24T17:03:30","slug":"gemini-a-new-multimodal-ai-model-of-google","status":"publish","type":"post","link":"https:\/\/www.comet.com\/site\/blog\/gemini-a-new-multimodal-ai-model-of-google\/","title":{"rendered":"Gemini: A New Multimodal AI Model of Google"},"content":{"rendered":"\n<figure class=\"wp-block-image graf graf--figure\"><img decoding=\"async\" src=\"https:\/\/cdn-images-1.medium.com\/max\/800\/1*OG7SRMpDJPmSD-QeQ3SOtw.jpeg\" alt=\"human hand and robot hand touching\"\/><figcaption class=\"wp-element-caption\">Photo by <a class=\"markup--anchor markup--figure-anchor\" href=\"https:\/\/unsplash.com\/@omilaev?utm_content=creditCopyText&amp;utm_medium=referral&amp;utm_source=unsplash\" target=\"_blank\" rel=\"noopener\" data-href=\"https:\/\/unsplash.com\/@omilaev?utm_content=creditCopyText&amp;utm_medium=referral&amp;utm_source=unsplash\">Igor Omilaev<\/a> on&nbsp;<a class=\"markup--anchor markup--figure-anchor\" href=\"https:\/\/unsplash.com\/photos\/two-hands-touching-each-other-in-front-of-a-pink-background-gVQLAbGVB6Q?utm_content=creditCopyText&amp;utm_medium=referral&amp;utm_source=unsplash\" target=\"_blank\" rel=\"noopener\" data-href=\"https:\/\/unsplash.com\/photos\/two-hands-touching-each-other-in-front-of-a-pink-background-gVQLAbGVB6Q?utm_content=creditCopyText&amp;utm_medium=referral&amp;utm_source=unsplash\">Unsplash<\/a><\/figcaption><\/figure>\n\n\n\n<h3 class=\"wp-block-heading graf graf--h3\">Introduction<\/h3>\n\n\n\n<p class=\"graf graf--p wp-block-paragraph\">Google has taken a significant leap by introducing the <a class=\"markup--anchor markup--p-anchor\" href=\"https:\/\/blog.google\/technology\/ai\/google-gemini-ai\/#sundar-note\" target=\"_blank\" rel=\"noopener\" 
data-href=\"https:\/\/blog.google\/technology\/ai\/google-gemini-ai\/#sundar-note\"><strong class=\"markup--strong markup--p-strong\">Gemini AI<\/strong>, its latest large language model (LLM)<\/a>, to the public. This milestone will bring widespread changes across all of Google\u2019s products. With the capability to perform tasks more akin to human actions, Gemini represents a crucial stride toward achieving artificial general intelligence (AGI).<\/p>\n\n\n\n<h3 class=\"wp-block-heading graf graf--h3\">What is&nbsp;Gemini?<\/h3>\n\n\n\n<p class=\"graf graf--p wp-block-paragraph\">Google introduced Gemini, an innovative artificial intelligence (AI) system capable of intelligently comprehending and conversing about various prompts, including pictures, text, speech, music, computer code, and more. This form of AI is called a multimodal model, representing a significant advancement beyond the capability to handle only text or images.<\/p>\n\n\n\n<p class=\"graf graf--p wp-block-paragraph\">Gemini is more than just a single AI model, and one notable feature of Gemini is its capacity for visual language interpretation. It was released on <a class=\"markup--anchor markup--p-anchor\" href=\"https:\/\/blog.google\/technology\/ai\/google-gemini-ai\/#sundar-note\" target=\"_blank\" rel=\"noopener\" data-href=\"https:\/\/blog.google\/technology\/ai\/google-gemini-ai\/#sundar-note\"><strong class=\"markup--strong markup--p-strong\">06 December 2023.<\/strong><\/a><\/p>\n\n\n\n<h3 class=\"wp-block-heading graf graf--h3\">Different Versions of&nbsp;Gemini<\/h3>\n\n\n\n<p class=\"graf graf--p wp-block-paragraph\">Gemini is a flexible model that works well on various platforms, including data centers and mobile devices. 
As a result, it has been made available in three different versions.<\/p>\n\n\n\n<ol class=\"wp-block-list postList\">\n<li><strong class=\"markup--strong markup--li-strong\">Gemini Nano: <\/strong>Built for mobile devices, starting with the Google Pixel 8 Pro, Gemini Nano is a compact model for on-device AI processing. Because it runs on the phone itself, it works efficiently offline and handles tasks like suggesting chat replies and summarizing text.<\/li>\n\n\n\n<li><strong class=\"markup--strong markup--li-strong\">Gemini Pro:<\/strong> This variant powers Google\u2019s AI chatbot, Bard, providing fast responses and capable query handling. Running in Google\u2019s data centers, Gemini Pro improves Bard\u2019s reasoning, planning, and understanding capabilities.<\/li>\n\n\n\n<li><strong class=\"markup--strong markup--li-strong\">Gemini Ultra: <\/strong>Google describes it as its most capable model, reporting results that exceed the previous state of the art on most of the widely used academic benchmarks in large language model (LLM) research and development. It is designed to tackle highly complex tasks.<\/li>\n<\/ol>\n\n\n\n<h3 class=\"wp-block-heading graf graf--h3\">Gemini\u2019s Training<\/h3>\n\n\n\n<p class=\"graf graf--p wp-block-paragraph\">The training of Gemini, Google\u2019s powerful multimodal AI model, involved several vital aspects:<\/p>\n\n\n\n<h4 class=\"wp-block-heading graf graf--h4\"><strong class=\"markup--strong markup--h4-strong\">Data:<\/strong><\/h4>\n\n\n\n<ol class=\"wp-block-list postList\">\n<li><strong class=\"markup--strong markup--li-strong\">Massive and diverse:<\/strong> Gemini was trained on a vast dataset of text, code, images, audio, and other modalities. 
This diversity helps it understand different types of information and make connections between them.<\/li>\n\n\n\n<li><strong class=\"markup--strong markup--li-strong\">High-quality:<\/strong> The data was carefully curated and filtered to ensure accuracy and relevance. This helps prevent biases and promotes reliable performance.<\/li>\n<\/ol>\n\n\n\n<h4 class=\"wp-block-heading graf graf--h4\">Training Techniques:<\/h4>\n\n\n\n<ol class=\"wp-block-list postList\">\n<li><strong class=\"markup--strong markup--li-strong\">Multimodal learning:<\/strong> Gemini leverages multimodal learning, which simultaneously processes information from various modalities. This allows it to understand the relationships between different data types, leading to a richer understanding of the world.<\/li>\n\n\n\n<li><strong class=\"markup--strong markup--li-strong\">Transfer learning: <\/strong>The model also benefits from transfer learning, where knowledge gained from pre-trained models on specific tasks is transferred to new tasks. This helps it learn faster and achieve better performance.<\/li>\n<\/ol>\n\n\n\n<h4 class=\"wp-block-heading graf graf--h4\">Hardware:<\/h4>\n\n\n\n<ol class=\"wp-block-list postList\">\n<li><strong class=\"markup--strong markup--li-strong\">Custom TPUs:<\/strong> Google employed their specially designed Tensor Processing Units (TPUs) v4 and v5e. These advanced AI accelerators allowed for efficient and scalable training, handling the immense processing demands of the model.<\/li>\n<\/ol>\n\n\n\n<h3 class=\"wp-block-heading graf graf--h3\">Architecture<\/h3>\n\n\n\n<p class=\"graf graf--p wp-block-paragraph\">The researchers did not disclose the full architecture details. 
Still, they noted that <a class=\"markup--anchor markup--p-anchor\" href=\"https:\/\/assets.bwbx.io\/documents\/users\/iqjWHBFdfxIU\/r7G7RrtT6rnM\/v0\" target=\"_blank\" rel=\"noopener\" data-href=\"https:\/\/assets.bwbx.io\/documents\/users\/iqjWHBFdfxIU\/r7G7RrtT6rnM\/v0\">Gemini models are built on a transformer decoder<\/a> architecture similar to the one used in popular NLP models like GPT-3. The models are written in <a class=\"markup--anchor markup--p-anchor\" href=\"https:\/\/jax.readthedocs.io\/en\/latest\/notebooks\/quickstart.html\" target=\"_blank\" rel=\"noopener\" data-href=\"https:\/\/jax.readthedocs.io\/en\/latest\/notebooks\/quickstart.html\"><strong class=\"markup--strong markup--p-strong\">JAX<\/strong><\/a> and trained using TPUs.<\/p>\n\n\n\n<figure class=\"wp-block-image graf graf--figure\"><img decoding=\"async\" src=\"https:\/\/cdn-images-1.medium.com\/max\/800\/0*gtF2zF38pWU8ylEo.png\" alt=\"gemini ai architecture\"\/><figcaption class=\"wp-element-caption\">Image from: <a class=\"markup--anchor markup--figure-anchor\" href=\"https:\/\/assets.bwbx.io\/documents\/users\/iqjWHBFdfxIU\/r7G7RrtT6rnM\/v0\" target=\"_blank\" rel=\"noopener\" data-href=\"https:\/\/assets.bwbx.io\/documents\/users\/iqjWHBFdfxIU\/r7G7RrtT6rnM\/v0\">https:\/\/www.unite.ai\/wp-content\/<\/a><\/figcaption><\/figure>\n\n\n\n<p class=\"graf graf--p wp-block-paragraph\"><strong class=\"markup--strong markup--p-strong\">Input sequence<\/strong>: The user supplies data in various formats, including text, graphs, photos, audio, video, and 3D models.<\/p>\n\n\n\n<p class=\"graf graf--p wp-block-paragraph\"><strong class=\"markup--strong markup--p-strong\">Encoder<\/strong>: The encoder takes these inputs and transforms them into a representation the decoder can process. 
To achieve this, the various data types are converted into a single, cohesive representation.<\/p>\n\n\n\n<p class=\"graf graf--p wp-block-paragraph\"><strong class=\"markup--strong markup--p-strong\">Model<\/strong>: The model is fed the encoded inputs. The multimodal model needs no task-specific information; it simply processes the inputs according to the task at hand.<\/p>\n\n\n\n<p class=\"graf graf--p wp-block-paragraph\"><strong class=\"markup--strong markup--p-strong\">Image and text decoder: <\/strong>The decoder produces the final outputs from what the model has processed. Gemini can only produce text and image outputs at this time.<\/p>\n\n\n\n<h3 class=\"wp-block-heading graf graf--h3\">Features of&nbsp;Gemini<\/h3>\n\n\n\n<p class=\"graf graf--p wp-block-paragraph\">The Google Gemini models excel in various tasks spanning text, image, audio, and video understanding. The features below illustrate where Gemini aims to outperform other AI systems.<\/p>\n\n\n\n<ol class=\"wp-block-list postList\">\n<li><strong class=\"markup--strong markup--li-strong\">Multimodal capabilities: <\/strong>Gemini can understand and produce content across several modes. It can process text, pick out fine details in images, and identify patterns in audio. It can also interpret and navigate a variety of information sources with ease.<\/li>\n\n\n\n<li><strong class=\"markup--strong markup--li-strong\">Problem-solving and reasoning:<\/strong> Gemini has strong analytical and problem-solving skills comparable to human cognitive processes. 
It can handle complex tasks like evaluating academic work or analyzing large, complex data sets, which makes it an efficient way to assess many pieces of work at once.<\/li>\n\n\n\n<li><strong class=\"markup--strong markup--li-strong\">Advanced coding: <\/strong>The first version of Gemini can understand, explain, and generate high-quality code in the world\u2019s most widely used programming languages, including Python, Java, C++, and Go. Google positions it as one of the leading foundation models for coding, able to reason about complex information and work across multiple languages.<\/li>\n\n\n\n<li><strong class=\"markup--strong markup--li-strong\">Multiple Versions:<\/strong> The model comes in three versions: Nano, Pro, and Ultra, the most advanced. Each version targets different needs and degrees of complexity.<\/li>\n<\/ol>\n\n\n\n<h3 class=\"wp-block-heading graf graf--h3\"><strong class=\"markup--strong markup--h3-strong\">Accessing Gemini<\/strong><\/h3>\n\n\n\n<p class=\"graf graf--p wp-block-paragraph\">Once you are familiar with the basics, you can try Gemini yourself. No waiting period or beta testing is necessary; Gemini Pro is available through the <a class=\"markup--anchor markup--p-anchor\" href=\"https:\/\/bard.google.com\/chat\" target=\"_blank\" rel=\"noopener\" data-href=\"https:\/\/bard.google.com\/chat\"><strong class=\"markup--strong markup--p-strong\">Bard chatbot website<\/strong><\/a>. How you access Gemini Pro depends on how you want to use it:<\/p>\n\n\n\n<p class=\"graf graf--p wp-block-paragraph\"><strong class=\"markup--strong markup--p-strong\">Through Google AI Platform (now part of Vertex AI on Google Cloud):<\/strong><\/p>\n\n\n\n<ol class=\"wp-block-list postList\">\n<li>Sign up for Google AI Platform (AIP): If you haven\u2019t already, create an account on AIP. 
You can get started for free with trial access.<\/li>\n\n\n\n<li>Enable the Gemini Pro API: Head to the \u201cMarketplace\u201d section in AIP and search for \u201cGemini Pro.\u201d Click on it and enable the API.<\/li>\n\n\n\n<li>Use the API or Notebook: You can access Gemini Pro either through the API directly or via a pre-built Jupyter Notebook provided by Google. The notebook offers a user-friendly interface for experimenting with the model.<\/li>\n<\/ol>\n\n\n\n<p class=\"graf graf--p wp-block-paragraph\"><strong class=\"markup--strong markup--p-strong\">Through Python:<\/strong><\/p>\n\n\n\n<ol class=\"wp-block-list postList\">\n<li>Install the necessary libraries: You\u2019ll need the <code class=\"markup--code markup--li-code\"><strong class=\"markup--strong markup--li-strong\">google-cloud-aiplatform<\/strong><\/code> library and any other libraries specific to your chosen use case.<\/li>\n\n\n\n<li>Authenticate with AIP: Use your AIP credentials to authenticate with the platform through the library.<\/li>\n\n\n\n<li>Create a model object: This object (a <code class=\"markup--code markup--li-code\">GenerativeModel<\/code> in the Vertex AI SDK at the time of writing) allows you to send requests to the Gemini Pro model and receive responses.<\/li>\n\n\n\n<li>Send your prompt and interpret the result: Craft your query or task for Gemini Pro and send it through the model object. The model will return its response, which you can then analyze and use.<\/li>\n<\/ol>\n\n\n\n<h3 class=\"wp-block-heading graf graf--h3\">How does Gemini differ from other AI models, like&nbsp;GPT-4?<\/h3>\n\n\n\n<h4 class=\"wp-block-heading graf graf--h4\"><strong class=\"markup--strong markup--h4-strong\">Architecture<\/strong><\/h4>\n\n\n\n<p class=\"graf graf--p wp-block-paragraph\">GPT-4 is primarily a text model (it accepts image input but generates only text) and was designed for a wide range of textual uses, giving it flexibility across Natural Language Processing (NLP) tasks. 
Gemini has a multimodal architecture that integrates text, images, audio, and video, enabling more dynamic interactions and a wider variety of applications.<\/p>\n\n\n\n<h4 class=\"wp-block-heading graf graf--h4\"><strong class=\"markup--strong markup--h4-strong\">Learning Capability<\/strong><\/h4>\n\n\n\n<p class=\"graf graf--p wp-block-paragraph\">GPT-4 is updated incrementally through new model versions. Gemini is also a pretrained model that is updated through new releases rather than learning continuously, although products built on it, such as Bard, can draw on Google Search to bring in recent information.<\/p>\n\n\n\n<h4 class=\"wp-block-heading graf graf--h4\">Data Training<\/h4>\n\n\n\n<p class=\"graf graf--p wp-block-paragraph\">GPT-4 was trained on an extensive dataset up to a specific cut-off date, which limits its built-in knowledge of recent events. Gemini likewise has a training cut-off; up-to-date responses depend on the product it powers retrieving current information at query time.<\/p>\n\n\n\n<h4 class=\"wp-block-heading graf graf--h4\">Techniques<\/h4>\n\n\n\n<p class=\"graf graf--p wp-block-paragraph\">GPT-4 uses deep learning for text processing, which works well for various language tasks. Gemini has additionally been described as drawing on problem-solving methods inspired by AlphaGo, intended to support more sophisticated planning and reasoning in challenging tasks.<\/p>\n\n\n\n<h4 class=\"wp-block-heading graf graf--h4\">Application Scope<\/h4>\n\n\n\n<p class=\"graf graf--p wp-block-paragraph\">GPT-4 is mainly used for text-based applications, customer support, content production, and education. Gemini is expected to serve a broader range of applications, such as image processing, complex problem solving, and dynamic content creation.<\/p>\n\n\n\n<h3 class=\"wp-block-heading graf graf--h3\">Future of&nbsp;Gemini<\/h3>\n\n\n\n<p class=\"graf graf--p wp-block-paragraph\">The future of Gemini depends on how it is developed and deployed. Google will likely continue to invest in Gemini\u2019s development, improving its accuracy, expanding its knowledge base, and adding new capabilities. 
This could include:<\/p>\n\n\n\n<ol class=\"wp-block-list postList\">\n<li><strong class=\"markup--strong markup--li-strong\">Handling even more modalities:<\/strong> Integrating new data types like sensor data and virtual reality experiences.<\/li>\n\n\n\n<li><strong class=\"markup--strong markup--li-strong\">Understanding more complex tasks:<\/strong> Moving beyond text generation and code editing to reasoning, planning, and decision-making tasks.<\/li>\n<\/ol>\n\n\n\n<p class=\"graf graf--p wp-block-paragraph\">It\u2019s important to remember that AI is still developing, and predicting its long-term impact is difficult. However, by staying informed and engaging in constructive conversations about the future of AI, we can help ensure that it benefits all of humanity.<\/p>\n\n\n\n<h3 class=\"wp-block-heading graf graf--h3\">Conclusion<\/h3>\n\n\n\n<p class=\"graf graf--p wp-block-paragraph\">The landscape of AI shifts with Gemini\u2019s arrival. Its inherent flexibility, stemming from its mastery of multiple data modalities, positions it as a universal tool capable of tackling an unprecedented range of tasks. Witnessing its future development and applications promises to be a fascinating journey.<\/p>\n","protected":false},"excerpt":{"rendered":"<p>Introduction Google has taken a significant leap by introducing the Gemini AI, its latest large language model (LLM), to the public. This milestone will bring widespread changes across all of Google\u2019s products. With the capability to perform tasks more akin to human actions, Gemini represents a crucial stride toward achieving artificial general intelligence (AGI). 
What [&hellip;]<\/p>\n","protected":false},"author":84,"featured_media":0,"comment_status":"closed","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"customer_name":"","customer_description":"","customer_industry":"","customer_technologies":"","customer_logo":"","_jetpack_memberships_contains_paid_content":false,"footnotes":""},"categories":[65],"tags":[],"coauthors":[181],"class_list":["post-8752","post","type-post","status-publish","format-standard","hentry","category-llmops"],"yoast_head":"<!-- This site is optimized with the Yoast SEO Premium plugin v25.9 (Yoast SEO v25.9) - https:\/\/yoast.com\/wordpress\/plugins\/seo\/ -->\n<title>Gemini: A New Multimodal AI Model of Google - Comet<\/title>\n<meta name=\"description\" content=\"Google has taken a significant leap by introducing the Gemini AI, its latest large language model (LLM), to the public.\" \/>\n<meta name=\"robots\" content=\"index, follow, max-snippet:-1, max-image-preview:large, max-video-preview:-1\" \/>\n<link rel=\"canonical\" href=\"https:\/\/www.comet.com\/site\/blog\/gemini-a-new-multimodal-ai-model-of-google\" \/>\n<meta property=\"og:locale\" content=\"en_US\" \/>\n<meta property=\"og:type\" content=\"article\" \/>\n<meta property=\"og:title\" content=\"Gemini: A New Multimodal AI Model of Google\" \/>\n<meta property=\"og:description\" content=\"Google has taken a significant leap by introducing the Gemini AI, its latest large language model (LLM), to the public.\" \/>\n<meta property=\"og:url\" content=\"https:\/\/www.comet.com\/site\/blog\/gemini-a-new-multimodal-ai-model-of-google\" \/>\n<meta property=\"og:site_name\" content=\"Comet\" \/>\n<meta property=\"article:publisher\" content=\"https:\/\/www.facebook.com\/cometdotml\" \/>\n<meta property=\"article:published_time\" content=\"2024-01-18T14:00:52+00:00\" \/>\n<meta property=\"article:modified_time\" content=\"2025-04-24T17:03:30+00:00\" \/>\n<meta property=\"og:image\" 
content=\"https:\/\/cdn-images-1.medium.com\/max\/800\/1*OG7SRMpDJPmSD-QeQ3SOtw.jpeg\" \/>\n<meta name=\"author\" content=\"Khushboo Kumari\" \/>\n<meta name=\"twitter:card\" content=\"summary_large_image\" \/>\n<meta name=\"twitter:creator\" content=\"@Cometml\" \/>\n<meta name=\"twitter:site\" content=\"@Cometml\" \/>\n<meta name=\"twitter:label1\" content=\"Written by\" \/>\n\t<meta name=\"twitter:data1\" content=\"Khushboo Kumari\" \/>\n\t<meta name=\"twitter:label2\" content=\"Est. reading time\" \/>\n\t<meta name=\"twitter:data2\" content=\"7 minutes\" \/>\n<!-- \/ Yoast SEO Premium plugin. -->","yoast_head_json":{"title":"Gemini: A New Multimodal AI Model of Google - Comet","description":"Google has taken a significant leap by introducing the Gemini AI, its latest large language model (LLM), to the public.","robots":{"index":"index","follow":"follow","max-snippet":"max-snippet:-1","max-image-preview":"max-image-preview:large","max-video-preview":"max-video-preview:-1"},"canonical":"https:\/\/www.comet.com\/site\/blog\/gemini-a-new-multimodal-ai-model-of-google","og_locale":"en_US","og_type":"article","og_title":"Gemini: A New Multimodal AI Model of Google","og_description":"Google has taken a significant leap by introducing the Gemini AI, its latest large language model (LLM), to the public.","og_url":"https:\/\/www.comet.com\/site\/blog\/gemini-a-new-multimodal-ai-model-of-google","og_site_name":"Comet","article_publisher":"https:\/\/www.facebook.com\/cometdotml","article_published_time":"2024-01-18T14:00:52+00:00","article_modified_time":"2025-04-24T17:03:30+00:00","og_image":[{"url":"https:\/\/cdn-images-1.medium.com\/max\/800\/1*OG7SRMpDJPmSD-QeQ3SOtw.jpeg","type":"","width":"","height":""}],"author":"Khushboo Kumari","twitter_card":"summary_large_image","twitter_creator":"@Cometml","twitter_site":"@Cometml","twitter_misc":{"Written by":"Khushboo Kumari","Est. 
reading time":"7 minutes"},"schema":{"@context":"https:\/\/schema.org","@graph":[{"@type":"Article","@id":"https:\/\/www.comet.com\/site\/blog\/gemini-a-new-multimodal-ai-model-of-google#article","isPartOf":{"@id":"https:\/\/www.comet.com\/site\/blog\/gemini-a-new-multimodal-ai-model-of-google\/"},"author":{"name":"Khushboo Kumari","@id":"https:\/\/www.comet.com\/site\/#\/schema\/person\/9e9bc90fd931c322a00805c37b5dc8e8"},"headline":"Gemini: A New Multimodal AI Model of Google","datePublished":"2024-01-18T14:00:52+00:00","dateModified":"2025-04-24T17:03:30+00:00","mainEntityOfPage":{"@id":"https:\/\/www.comet.com\/site\/blog\/gemini-a-new-multimodal-ai-model-of-google\/"},"wordCount":1403,"publisher":{"@id":"https:\/\/www.comet.com\/site\/#organization"},"image":{"@id":"https:\/\/www.comet.com\/site\/blog\/gemini-a-new-multimodal-ai-model-of-google#primaryimage"},"thumbnailUrl":"https:\/\/cdn-images-1.medium.com\/max\/800\/1*OG7SRMpDJPmSD-QeQ3SOtw.jpeg","articleSection":["LLMOps"],"inLanguage":"en-US"},{"@type":"WebPage","@id":"https:\/\/www.comet.com\/site\/blog\/gemini-a-new-multimodal-ai-model-of-google\/","url":"https:\/\/www.comet.com\/site\/blog\/gemini-a-new-multimodal-ai-model-of-google","name":"Gemini: A New Multimodal AI Model of Google - Comet","isPartOf":{"@id":"https:\/\/www.comet.com\/site\/#website"},"primaryImageOfPage":{"@id":"https:\/\/www.comet.com\/site\/blog\/gemini-a-new-multimodal-ai-model-of-google#primaryimage"},"image":{"@id":"https:\/\/www.comet.com\/site\/blog\/gemini-a-new-multimodal-ai-model-of-google#primaryimage"},"thumbnailUrl":"https:\/\/cdn-images-1.medium.com\/max\/800\/1*OG7SRMpDJPmSD-QeQ3SOtw.jpeg","datePublished":"2024-01-18T14:00:52+00:00","dateModified":"2025-04-24T17:03:30+00:00","description":"Google has taken a significant leap by introducing the Gemini AI, its latest large language model (LLM), to the 
public.","breadcrumb":{"@id":"https:\/\/www.comet.com\/site\/blog\/gemini-a-new-multimodal-ai-model-of-google#breadcrumb"},"inLanguage":"en-US","potentialAction":[{"@type":"ReadAction","target":["https:\/\/www.comet.com\/site\/blog\/gemini-a-new-multimodal-ai-model-of-google"]}]},{"@type":"ImageObject","inLanguage":"en-US","@id":"https:\/\/www.comet.com\/site\/blog\/gemini-a-new-multimodal-ai-model-of-google#primaryimage","url":"https:\/\/cdn-images-1.medium.com\/max\/800\/1*OG7SRMpDJPmSD-QeQ3SOtw.jpeg","contentUrl":"https:\/\/cdn-images-1.medium.com\/max\/800\/1*OG7SRMpDJPmSD-QeQ3SOtw.jpeg"},{"@type":"BreadcrumbList","@id":"https:\/\/www.comet.com\/site\/blog\/gemini-a-new-multimodal-ai-model-of-google#breadcrumb","itemListElement":[{"@type":"ListItem","position":1,"name":"Home","item":"https:\/\/www.comet.com\/site\/"},{"@type":"ListItem","position":2,"name":"Gemini: A New Multimodal AI Model of Google"}]},{"@type":"WebSite","@id":"https:\/\/www.comet.com\/site\/#website","url":"https:\/\/www.comet.com\/site\/","name":"Comet","description":"Build Better Models Faster","publisher":{"@id":"https:\/\/www.comet.com\/site\/#organization"},"potentialAction":[{"@type":"SearchAction","target":{"@type":"EntryPoint","urlTemplate":"https:\/\/www.comet.com\/site\/?s={search_term_string}"},"query-input":{"@type":"PropertyValueSpecification","valueRequired":true,"valueName":"search_term_string"}}],"inLanguage":"en-US"},{"@type":"Organization","@id":"https:\/\/www.comet.com\/site\/#organization","name":"Comet ML, Inc.","alternateName":"Comet","url":"https:\/\/www.comet.com\/site\/","logo":{"@type":"ImageObject","inLanguage":"en-US","@id":"https:\/\/www.comet.com\/site\/#\/schema\/logo\/image\/","url":"https:\/\/www.comet.com\/site\/wp-content\/uploads\/2025\/01\/logo_comet_square.png","contentUrl":"https:\/\/www.comet.com\/site\/wp-content\/uploads\/2025\/01\/logo_comet_square.png","width":310,"height":310,"caption":"Comet ML, 
Inc."},"image":{"@id":"https:\/\/www.comet.com\/site\/#\/schema\/logo\/image\/"},"sameAs":["https:\/\/www.facebook.com\/cometdotml","https:\/\/x.com\/Cometml","https:\/\/www.youtube.com\/channel\/UCmN63HKvfXSCS-UwVwmK8Hw"]},{"@type":"Person","@id":"https:\/\/www.comet.com\/site\/#\/schema\/person\/9e9bc90fd931c322a00805c37b5dc8e8","name":"Khushboo Kumari","image":{"@type":"ImageObject","inLanguage":"en-US","@id":"https:\/\/www.comet.com\/site\/#\/schema\/person\/image\/d5766b081477ed4dc292729a8cfdf38b","url":"https:\/\/secure.gravatar.com\/avatar\/0a4a12b6e00a526ba8df6fba3b372ca0c498565db302b52ccceb6df4329d16a5?s=96&d=mm&r=g","contentUrl":"https:\/\/secure.gravatar.com\/avatar\/0a4a12b6e00a526ba8df6fba3b372ca0c498565db302b52ccceb6df4329d16a5?s=96&d=mm&r=g","caption":"Khushboo Kumari"},"url":"https:\/\/www.comet.com\/site\/blog\/author\/khushboo-writer2244gmail-com\/"}]}},"jetpack_featured_media_url":"","jetpack_sharing_enabled":true,"_links":{"self":[{"href":"https:\/\/www.comet.com\/site\/wp-json\/wp\/v2\/posts\/8752","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/www.comet.com\/site\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/www.comet.com\/site\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/www.comet.com\/site\/wp-json\/wp\/v2\/users\/84"}],"replies":[{"embeddable":true,"href":"https:\/\/www.comet.com\/site\/wp-json\/wp\/v2\/comments?post=8752"}],"version-history":[{"count":1,"href":"https:\/\/www.comet.com\/site\/wp-json\/wp\/v2\/posts\/8752\/revisions"}],"predecessor-version":[{"id":15402,"href":"https:\/\/www.comet.com\/site\/wp-json\/wp\/v2\/posts\/8752\/revisions\/15402"}],"wp:attachment":[{"href":"https:\/\/www.comet.com\/site\/wp-json\/wp\/v2\/media?parent=8752"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/www.comet.com\/site\/wp-json\/wp\/v2\/categories?post=8752"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/www.comet.com\/site\/wp-json\/wp\/v2\/tags?
post=8752"},{"taxonomy":"author","embeddable":true,"href":"https:\/\/www.comet.com\/site\/wp-json\/wp\/v2\/coauthors?post=8752"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}