{"id":6195,"date":"2023-06-15T08:07:11","date_gmt":"2023-06-15T16:07:11","guid":{"rendered":"https:\/\/live-cometml.pantheonsite.io\/?p=6195"},"modified":"2025-04-24T17:15:25","modified_gmt":"2025-04-24T17:15:25","slug":"stylegan-use-machine-learning-to-generate-and-customize-realistic-images","status":"publish","type":"post","link":"https:\/\/www.comet.com\/site\/blog\/stylegan-use-machine-learning-to-generate-and-customize-realistic-images\/","title":{"rendered":"StyleGAN: Use machine learning to generate and customize realistic images"},"content":{"rendered":"\n<link rel=\"canonical\" href=\"https:\/\/www.comet.com\/site\/blog\/stylegan-use-machine-learning-to-generate-and-customize-realistic-images\">\n\n\n\n<figure class=\"wp-block-image\"><img loading=\"lazy\" decoding=\"async\" width=\"1024\" height=\"768\" src=\"https:\/\/www.comet.com\/site\/wp-content\/uploads\/2023\/06\/1GIH1Lt7xXzK4Yk9X0y9GWA-1024x768.webp\" alt=\"\" class=\"wp-image-6196\" srcset=\"https:\/\/www.comet.com\/site\/wp-content\/uploads\/2023\/06\/1GIH1Lt7xXzK4Yk9X0y9GWA-1024x768.webp 1024w, https:\/\/www.comet.com\/site\/wp-content\/uploads\/2023\/06\/1GIH1Lt7xXzK4Yk9X0y9GWA-300x225.webp 300w, https:\/\/www.comet.com\/site\/wp-content\/uploads\/2023\/06\/1GIH1Lt7xXzK4Yk9X0y9GWA-768x576.webp 768w, https:\/\/www.comet.com\/site\/wp-content\/uploads\/2023\/06\/1GIH1Lt7xXzK4Yk9X0y9GWA-1536x1152.webp 1536w, https:\/\/www.comet.com\/site\/wp-content\/uploads\/2023\/06\/1GIH1Lt7xXzK4Yk9X0y9GWA-2048x1536.webp 2048w\" sizes=\"auto, (max-width: 1024px) 100vw, 1024px\" \/><\/figure>\n\n\n\n<p><\/p>\n\n\n\n<p class=\"pw-post-body-paragraph mv mw fo be b gm mx my mz gp na nb nc nd ne nf ng nh ni nj nk nl nm nn no np fh bj\" id=\"f1e3\">Ever wondered what the 27th letter in the English alphabet might look like? Or how your appearance would be twenty years from now? 
Or perhaps how that super-grumpy professor of yours might look with a big, wide smile on his face?<\/p>\n\n\n\n<figure class=\"wp-block-image mg mh mi mj mk mf mq mr paragraph-image\"><img decoding=\"async\" src=\"https:\/\/miro.medium.com\/v2\/resize:fit:700\/1*KumksVbLe4fi0-yOad6XLg.png\" alt=\"\"\/><figcaption class=\"wp-element-caption\">Source : <a class=\"af mu\" href=\"https:\/\/github.com\/albertpumarola\/GANimation\" target=\"_blank\" rel=\"noopener ugc nofollow\">GitHub<\/a><\/figcaption><\/figure>\n\n\n\n<p class=\"pw-post-body-paragraph mv mw fo be b gm mx my mz gp na nb nc nd ne nf ng nh ni nj nk nl nm nn no np fh bj\" id=\"b759\">Thanks to machine learning, all this is not only possible, but relatively easy to do with the inference of a powerful neural network (rather than hours spent on Photoshop). The neural networks that make this possible are termed <em class=\"nv\">adversarial networks.<\/em> Often described as one of the coolest concepts in machine learning, they are actually a set of more than one network (usually two) which are continually competing with each other (hence, <em class=\"nv\">adversarially<\/em>), producing some interesting results along the way.<\/p>\n\n\n\n<p class=\"pw-post-body-paragraph mv mw fo be b gm mx my mz gp na nb nc nd ne nf ng nh ni nj nk nl nm nn no np fh bj\" id=\"06e5\">In this article, we dive into StyleGANs, a type of generative adversarial network that \u201c<a class=\"af mu\" href=\"https:\/\/arxiv.org\/pdf\/1812.04948.pdf?source=post_elevate_sequence_page---------------------------\" target=\"_blank\" rel=\"noopener ugc nofollow\"><em class=\"nv\">enables intuitive, scale-specific control of image synthesis by learned, unsupervised separation of high-level attributes and stochastic variation<\/em><\/a>\u201d. 
Or, to put it plainly, StyleGANs switch up an image\u2019s style.<\/p>\n\n\n\n<figure class=\"wp-block-image\"><img loading=\"lazy\" decoding=\"async\" width=\"1024\" height=\"576\" src=\"https:\/\/www.comet.com\/site\/wp-content\/uploads\/2023\/06\/0FFjY20j0lYgNc8D0-1024x576.webp\" alt=\"\" class=\"wp-image-6197\" srcset=\"https:\/\/www.comet.com\/site\/wp-content\/uploads\/2023\/06\/0FFjY20j0lYgNc8D0-1024x576.webp 1024w, https:\/\/www.comet.com\/site\/wp-content\/uploads\/2023\/06\/0FFjY20j0lYgNc8D0-300x169.webp 300w, https:\/\/www.comet.com\/site\/wp-content\/uploads\/2023\/06\/0FFjY20j0lYgNc8D0-768x432.webp 768w, https:\/\/www.comet.com\/site\/wp-content\/uploads\/2023\/06\/0FFjY20j0lYgNc8D0.webp 1280w\" sizes=\"auto, (max-width: 1024px) 100vw, 1024px\" \/><\/figure>\n\n\n\n<p><\/p>\n\n\n\n<div class=\"fh fi fj fk fl\">\n<div class=\"ab ca\">\n<div class=\"ch bg et eu ev ew\">\n<h1 id=\"dfa6\" class=\"nw nx fo be ny nz oa go ob oc od gr oe of og oh oi oj ok ol om on oo op oq or bj\" data-selectable-paragraph=\"\">Introduction<\/h1>\n<p id=\"572e\" class=\"pw-post-body-paragraph mv mw fo be b gm os my mz gp ot nb nc nd ou nf ng nh ov nj nk nl ow nn no np fh bj\" data-selectable-paragraph=\"\">StyleGAN was originally an open-source project by NVIDIA to create a generative model that could output high-resolution human faces. The basis of the model was established by a <a class=\"af mu\" href=\"https:\/\/arxiv.org\/pdf\/1812.04948.pdf\" target=\"_blank\" rel=\"noopener ugc nofollow\">research paper<\/a> published by Tero Karras, Samuli Laine, and Timo Aila, all researchers at NVIDIA.<\/p>\n<p id=\"bfbb\" class=\"pw-post-body-paragraph mv mw fo be b gm mx my mz gp na nb nc nd ne nf ng nh ni nj nk nl nm nn no np fh bj\" data-selectable-paragraph=\"\">In this paper, they proposed a new architecture for the \u201cgenerator\u201d network of the GAN, which provides a new method for controlling the image generation process. 
In simple words, the generator in a StyleGAN makes small adjustments to the \u201cstyle\u201d of the image at each convolution layer in order to manipulate the image features for that layer.<\/p>\n<p id=\"cad0\" class=\"pw-post-body-paragraph mv mw fo be b gm mx my mz gp na nb nc nd ne nf ng nh ni nj nk nl nm nn no np fh bj\" data-selectable-paragraph=\"\">Moreover, this new architecture is able to separate the high-level attributes (such as a person\u2019s identity) from low-level attributes (such as their hairstyle) within an image. This separation is what allows the GAN to change some attributes without affecting others. For example, changing a person\u2019s hairstyle in a given image.<\/p>\n<p id=\"bbd9\" class=\"pw-post-body-paragraph mv mw fo be b gm mx my mz gp na nb nc nd ne nf ng nh ni nj nk nl nm nn no np fh bj\" data-selectable-paragraph=\"\">Before we dive into the specifics of how StyleGANs works, here\u2019s a list of interesting implementations, just to give you a sense of what\u2019s possible with these powerful neural networks:<\/p>\n<ul class=\"\">\n<li id=\"5190\" class=\"mv mw fo be b gm mx my mz gp na nb nc ox ne nf ng oy ni nj nk oz nm nn no np pa pb pc bj\" data-selectable-paragraph=\"\"><a class=\"af mu\" href=\"https:\/\/nanonets.com\/blog\/stylegan-got\/\" target=\"_blank\" rel=\"noopener ugc nofollow\">Generating Game of Thrones characters<\/a><\/li>\n<li id=\"0903\" class=\"mv mw fo be b gm pd my mz gp pe nb nc ox pf nf ng oy pg nj nk oz ph nn no np pa pb pc bj\" data-selectable-paragraph=\"\"><a class=\"af mu\" href=\"https:\/\/evigio.com\/post\/generating-new-watch-designs-with-stylegan\" target=\"_blank\" rel=\"noopener ugc nofollow\">Generating new watch styles<\/a><\/li>\n<li id=\"62c2\" class=\"mv mw fo be b gm pd my mz gp pe nb nc ox pf nf ng oy pg nj nk oz ph nn no np pa pb pc bj\" data-selectable-paragraph=\"\"><a class=\"af mu\" 
href=\"https:\/\/www.reddit.com\/r\/MachineLearning\/comments\/bkrn3i\/p_stylegan_trained_on_album_covers\/\" target=\"_blank\" rel=\"noopener ugc nofollow\">Generating album covers<\/a><\/li>\n<li id=\"8fb7\" class=\"mv mw fo be b gm pd my mz gp pe nb nc ox pf nf ng oy pg nj nk oz ph nn no np pa pb pc bj\" data-selectable-paragraph=\"\"><a class=\"af mu\" href=\"https:\/\/devopstar.com\/2019\/05\/21\/stylegan-pokemon-card-generator\/\" target=\"_blank\" rel=\"noopener ugc nofollow\">Generating Pok\u00e9mon characters<\/a><\/li>\n<li id=\"5d2f\" class=\"mv mw fo be b gm pd my mz gp pe nb nc ox pf nf ng oy pg nj nk oz ph nn no np pa pb pc bj\" data-selectable-paragraph=\"\"><a class=\"af mu\" href=\"https:\/\/github.com\/ak9250\/stylegan-art\" target=\"_blank\" rel=\"noopener ugc nofollow\">Generating portraits (paintings)<\/a><\/li>\n<\/ul>\n<figure class=\"mg mh mi mj mk mf\">\n<div class=\"pi is l eb\">\n<div class=\"pj pk l\"><iframe loading=\"lazy\" class=\"ek n fc dx bg\" title=\"A Style-Based Generator Architecture for Generative Adversarial Networks\" src=\"https:\/\/cdn.embedly.com\/widgets\/media.html?src=https%3A%2F%2Fwww.youtube.com%2Fembed%2FkSLJriaOumA%3Ffeature%3Doembed&amp;display_name=YouTube&amp;url=https%3A%2F%2Fwww.youtube.com%2Fwatch%3Fv%3DkSLJriaOumA&amp;image=https%3A%2F%2Fi.ytimg.com%2Fvi%2FkSLJriaOumA%2Fhqdefault.jpg&amp;key=a19fcc184b9711e1b4764040d3dc5c07&amp;type=text%2Fhtml&amp;schema=youtube\" width=\"854\" height=\"480\" frameborder=\"0\" scrolling=\"no\" allowfullscreen=\"allowfullscreen\" data-mce-fragment=\"1\"><\/iframe><\/div>\n<\/div>\n<\/figure>\n<h1 id=\"0b9f\" class=\"nw nx fo be ny nz oa go ob oc od gr oe of og oh oi oj ok ol om on oo op oq or bj\" data-selectable-paragraph=\"\">Recap: What are GANs again?<\/h1>\n<p id=\"cd12\" class=\"pw-post-body-paragraph mv mw fo be b gm os my mz gp ot nb nc nd ou nf ng nh ov nj nk nl ow nn no np fh bj\" data-selectable-paragraph=\"\">Let\u2019s first step back and refresh our knowledge 
about <a class=\"af mu\" href=\"https:\/\/heartbeat.comet.ml\/introduction-to-generative-adversarial-networks-gans-35ef44f21193\" target=\"_blank\" rel=\"noopener ugc nofollow\">Generative Adversarial Networks<\/a>. The basic GAN is composed of two separate neural networks which are in continual competition against each other (<em class=\"nv\">adversaries<\/em>).<\/p>\n<p id=\"a74f\" class=\"pw-post-body-paragraph mv mw fo be b gm mx my mz gp na nb nc nd ne nf ng nh ni nj nk nl nm nn no np fh bj\" data-selectable-paragraph=\"\">One of these, called the <em class=\"nv\">generator, <\/em>is tasked with the generation of new data instances that it creates from random noise, while the other, called a <em class=\"nv\">discriminator<\/em>, evaluates these generated instances for authenticity.<\/p>\n<p id=\"6459\" class=\"pw-post-body-paragraph mv mw fo be b gm mx my mz gp na nb nc nd ne nf ng nh ni nj nk nl nm nn no np fh bj\" data-selectable-paragraph=\"\">Both tasks are phases in the GAN\u2019s process cycle and are interdependent on each other. The generative phase is influenced by the discriminative phase\u2019s evaluation, and the discriminative phase makes comparisons between the original dataset and the generated samples.<\/p>\n<p id=\"07f5\" class=\"pw-post-body-paragraph mv mw fo be b gm mx my mz gp na nb nc nd ne nf ng nh ni nj nk nl nm nn no np fh bj\" data-selectable-paragraph=\"\">As training progresses, both networks keep getting smarter\u2014the generator at generating fake images and the discriminator at detecting their authenticity. By the time the model has been trained, the generator manages to create an image authentic enough that the discriminator can\u2019t tell if it\u2019s a fake or not. 
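This alternating loop can be seen end-to-end on a toy problem. Below is a minimal, self-contained numpy sketch (not the paper's architecture, and every name in it is illustrative): the real data is a 1-D Gaussian, the generator g(z) = a*z + b is a linear map from noise to samples, and the discriminator d(x) = sigmoid(w*x + c) is a logistic classifier; the two are updated in turn.

```python
import numpy as np

# Toy 1-D GAN illustrating the alternating adversarial loop.
# Real data ~ N(3, 1); the generator maps noise z ~ N(0, 1) through
# g(z) = a*z + b; the discriminator scores realness with a logistic unit.

def sigmoid(t):
    return 1.0 / (1.0 + np.exp(-t))

def train_toy_gan(steps=2000, batch=64, lr=0.05, seed=0):
    rng = np.random.default_rng(seed)
    a, b = 1.0, 0.0            # generator parameters
    w, c = 0.1, 0.0            # discriminator parameters
    for _ in range(steps):
        z = rng.standard_normal(batch)
        x_real = 3.0 + rng.standard_normal(batch)
        x_fake = a * z + b

        # Discriminator step: push d(real) toward 1 and d(fake) toward 0.
        d_real = sigmoid(w * x_real + c)
        d_fake = sigmoid(w * x_fake + c)
        w -= lr * np.mean(-(1.0 - d_real) * x_real + d_fake * x_fake)
        c -= lr * np.mean(-(1.0 - d_real) + d_fake)

        # Generator step: push d(fake) toward 1 (non-saturating loss).
        d_fake = sigmoid(w * x_fake + c)
        dx = -(1.0 - d_fake) * w   # gradient of -log d(x_fake) w.r.t. x_fake
        a -= lr * np.mean(dx * z)
        b -= lr * np.mean(dx)
    return a, b

a, b = train_toy_gan()
print('generator output mean after training:', round(b, 2))
```

Because b is the mean of a*z + b, watching b drift from 0 toward the real mean of 3 is a direct view of the generator learning to fool the discriminator.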
Often, this final generated image is the resulting output.<\/p>\n<figure class=\"mg mh mi mj mk mf mq mr paragraph-image\">\n<figure><img loading=\"lazy\" decoding=\"async\" class=\"bg ml mm c\" role=\"presentation\" src=\"https:\/\/miro.medium.com\/v2\/resize:fit:602\/0*LodTiw8Mc84eFtGF.png\" alt=\"\" width=\"602\" height=\"262\"><\/figure><div class=\"mq mr pl\"><picture><source srcset=\"https:\/\/miro.medium.com\/v2\/resize:fit:640\/format:webp\/0*LodTiw8Mc84eFtGF.png 640w, https:\/\/miro.medium.com\/v2\/resize:fit:720\/format:webp\/0*LodTiw8Mc84eFtGF.png 720w, https:\/\/miro.medium.com\/v2\/resize:fit:750\/format:webp\/0*LodTiw8Mc84eFtGF.png 750w, https:\/\/miro.medium.com\/v2\/resize:fit:786\/format:webp\/0*LodTiw8Mc84eFtGF.png 786w, https:\/\/miro.medium.com\/v2\/resize:fit:828\/format:webp\/0*LodTiw8Mc84eFtGF.png 828w, https:\/\/miro.medium.com\/v2\/resize:fit:1100\/format:webp\/0*LodTiw8Mc84eFtGF.png 1100w, https:\/\/miro.medium.com\/v2\/resize:fit:1204\/format:webp\/0*LodTiw8Mc84eFtGF.png 1204w\" type=\"image\/webp\" sizes=\"(min-resolution: 4dppx) and (max-width: 700px) 50vw, (-webkit-min-device-pixel-ratio: 4) and (max-width: 700px) 50vw, (min-resolution: 3dppx) and (max-width: 700px) 67vw, (-webkit-min-device-pixel-ratio: 3) and (max-width: 700px) 65vw, (min-resolution: 2.5dppx) and (max-width: 700px) 80vw, (-webkit-min-device-pixel-ratio: 2.5) and (max-width: 700px) 80vw, (min-resolution: 2dppx) and (max-width: 700px) 100vw, (-webkit-min-device-pixel-ratio: 2) and (max-width: 700px) 100vw, 602px\"><source srcset=\"https:\/\/miro.medium.com\/v2\/resize:fit:640\/0*LodTiw8Mc84eFtGF.png 640w, https:\/\/miro.medium.com\/v2\/resize:fit:720\/0*LodTiw8Mc84eFtGF.png 720w, https:\/\/miro.medium.com\/v2\/resize:fit:750\/0*LodTiw8Mc84eFtGF.png 750w, https:\/\/miro.medium.com\/v2\/resize:fit:786\/0*LodTiw8Mc84eFtGF.png 786w, https:\/\/miro.medium.com\/v2\/resize:fit:828\/0*LodTiw8Mc84eFtGF.png 828w, 
https:\/\/miro.medium.com\/v2\/resize:fit:1100\/0*LodTiw8Mc84eFtGF.png 1100w, https:\/\/miro.medium.com\/v2\/resize:fit:1204\/0*LodTiw8Mc84eFtGF.png 1204w\" sizes=\"(min-resolution: 4dppx) and (max-width: 700px) 50vw, (-webkit-min-device-pixel-ratio: 4) and (max-width: 700px) 50vw, (min-resolution: 3dppx) and (max-width: 700px) 67vw, (-webkit-min-device-pixel-ratio: 3) and (max-width: 700px) 65vw, (min-resolution: 2.5dppx) and (max-width: 700px) 80vw, (-webkit-min-device-pixel-ratio: 2.5) and (max-width: 700px) 80vw, (min-resolution: 2dppx) and (max-width: 700px) 100vw, (-webkit-min-device-pixel-ratio: 2) and (max-width: 700px) 100vw, 602px\" data-testid=\"og\"><\/picture><\/div>\n<figcaption class=\"mn mo mp mq mr ms mt be b bf z dv\" data-selectable-paragraph=\"\">Basic GAN architecture (Source : <a class=\"af mu\" href=\"https:\/\/www.freecodecamp.org\/news\/an-intuitive-introduction-to-generative-adversarial-networks-gans-7a2264a81394\/\" target=\"_blank\" rel=\"noopener ugc nofollow\">Medium<\/a>)<\/figcaption>\n<\/figure>\n<\/div>\n<\/div>\n<\/div>\n\n\n\n<div class=\"ab ca pm pn po pp\" role=\"separator\"><span style=\"color: var(--wpex-heading-color); font-size: var(--wpex-text-3xl); font-family: var(--wpex-body-font-family, var(--wpex-font-sans));\">How does the StyleGAN work?<\/span><\/div>\n\n\n\n<div class=\"fh fi fj fk fl\">\n<div class=\"ab ca\">\n<div class=\"ch bg et eu ev ew\">\n<p id=\"38e3\" class=\"pw-post-body-paragraph mv mw fo be b gm os my mz gp ot nb nc nd ou nf ng nh ov nj nk nl ow nn no np fh bj\" data-selectable-paragraph=\"\">Before diving into the changes made by the researchers to the GAN network architecture to build their StyleGAN, it\u2019s important to note that <strong class=\"be qj\">these changes pertain only to the generator network<\/strong>, thus influencing the generative process only. 
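Concretely, the paper's generator first transforms the latent code z into an intermediate code w with an 8-layer fully connected mapping network, and w then sets the per-channel scale and bias of each convolution layer's feature maps through adaptive instance normalization (AdaIN). The numpy sketch below uses tiny illustrative dimensions and random weights (the paper uses 512-dimensional z and w, and the scale/bias come from a learned affine transform of w):

```python
import numpy as np

def mapping_network(z, layers):
    # Fully connected mapping f: z -> w; each entry in `layers` is a
    # (weight, bias) pair followed by a leaky ReLU, mirroring the
    # paper's 8-layer MLP (dimensions here are illustrative).
    h = z
    for weight, bias in layers:
        h = weight @ h + bias
        h = np.where(h > 0.0, h, 0.2 * h)    # leaky ReLU
    return h

def adain(features, style_scale, style_bias, eps=1e-8):
    # Adaptive instance normalization: normalize each channel of the
    # (channels, height, width) feature map to zero mean and unit
    # variance, then rescale and shift it with the style-derived values.
    mean = features.mean(axis=(1, 2), keepdims=True)
    std = features.std(axis=(1, 2), keepdims=True)
    normalized = (features - mean) / (std + eps)
    return style_scale[:, None, None] * normalized + style_bias[:, None, None]

rng = np.random.default_rng(0)
layers = [(0.5 * rng.standard_normal((8, 8)), np.zeros(8)) for _ in range(8)]
w = mapping_network(rng.standard_normal(8), layers)
styled = adain(rng.standard_normal((4, 16, 16)),
               style_scale=np.full(4, 2.0), style_bias=np.full(4, 0.5))
```

After adain, every channel of styled has mean about 0.5 and standard deviation about 2.0 regardless of the incoming statistics, which is exactly the per-layer style control described here.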
There have been no changes to the discriminator or to the loss function, both of which remain the same as in a traditional GAN.<\/p>\n<p id=\"bfac\" class=\"pw-post-body-paragraph mv mw fo be b gm mx my mz gp na nb nc nd ne nf ng nh ni nj nk nl nm nn no np fh bj\" data-selectable-paragraph=\"\">The purpose of NVIDIA\u2019s StyleGAN is to overcome the limitations of a traditional GAN, wherein control may not be possible for individual characteristics of data, such as facial features in a photograph. The proposed model allows a user to <a class=\"af mu\" href=\"https:\/\/medium.com\/r?url=https%3A%2F%2Fheartbeat.fritz.ai%2Ftuning-machine-learning-hyperparameters-40265a35c9b8\" rel=\"noopener\">tune hyperparameters<\/a> in order to achieve such control. Moreover, it allows for a factor of <em class=\"nv\">variability<\/em> in generated images due to the addition of \u201cstyles\u201d to images at each convolution layer.<\/p>\n<figure class=\"mg mh mi mj mk mf mq mr paragraph-image\">\n<div class=\"nr ns eb nt bg nu\" tabindex=\"0\" role=\"button\">\n<figure><img loading=\"lazy\" decoding=\"async\" class=\"bg ml mm c\" role=\"presentation\" src=\"https:\/\/miro.medium.com\/v2\/resize:fit:700\/1*O16MfmqTbD0ENx3pZEJMIA.png\" alt=\"\" width=\"700\" height=\"438\"><\/figure><div class=\"mq mr qk\"><picture><source srcset=\"https:\/\/miro.medium.com\/v2\/resize:fit:640\/format:webp\/1*O16MfmqTbD0ENx3pZEJMIA.png 640w, https:\/\/miro.medium.com\/v2\/resize:fit:720\/format:webp\/1*O16MfmqTbD0ENx3pZEJMIA.png 720w, https:\/\/miro.medium.com\/v2\/resize:fit:750\/format:webp\/1*O16MfmqTbD0ENx3pZEJMIA.png 750w, https:\/\/miro.medium.com\/v2\/resize:fit:786\/format:webp\/1*O16MfmqTbD0ENx3pZEJMIA.png 786w, https:\/\/miro.medium.com\/v2\/resize:fit:828\/format:webp\/1*O16MfmqTbD0ENx3pZEJMIA.png 828w, https:\/\/miro.medium.com\/v2\/resize:fit:1100\/format:webp\/1*O16MfmqTbD0ENx3pZEJMIA.png 1100w, https:\/\/miro.medium.com\/v2\/resize:fit:1400\/format:webp\/1*O16MfmqTbD0ENx3pZEJMIA.png 
1400w\" type=\"image\/webp\" sizes=\"(min-resolution: 4dppx) and (max-width: 700px) 50vw, (-webkit-min-device-pixel-ratio: 4) and (max-width: 700px) 50vw, (min-resolution: 3dppx) and (max-width: 700px) 67vw, (-webkit-min-device-pixel-ratio: 3) and (max-width: 700px) 65vw, (min-resolution: 2.5dppx) and (max-width: 700px) 80vw, (-webkit-min-device-pixel-ratio: 2.5) and (max-width: 700px) 80vw, (min-resolution: 2dppx) and (max-width: 700px) 100vw, (-webkit-min-device-pixel-ratio: 2) and (max-width: 700px) 100vw, 700px\"><source srcset=\"https:\/\/miro.medium.com\/v2\/resize:fit:640\/1*O16MfmqTbD0ENx3pZEJMIA.png 640w, https:\/\/miro.medium.com\/v2\/resize:fit:720\/1*O16MfmqTbD0ENx3pZEJMIA.png 720w, https:\/\/miro.medium.com\/v2\/resize:fit:750\/1*O16MfmqTbD0ENx3pZEJMIA.png 750w, https:\/\/miro.medium.com\/v2\/resize:fit:786\/1*O16MfmqTbD0ENx3pZEJMIA.png 786w, https:\/\/miro.medium.com\/v2\/resize:fit:828\/1*O16MfmqTbD0ENx3pZEJMIA.png 828w, https:\/\/miro.medium.com\/v2\/resize:fit:1100\/1*O16MfmqTbD0ENx3pZEJMIA.png 1100w, https:\/\/miro.medium.com\/v2\/resize:fit:1400\/1*O16MfmqTbD0ENx3pZEJMIA.png 1400w\" sizes=\"(min-resolution: 4dppx) and (max-width: 700px) 50vw, (-webkit-min-device-pixel-ratio: 4) and (max-width: 700px) 50vw, (min-resolution: 3dppx) and (max-width: 700px) 67vw, (-webkit-min-device-pixel-ratio: 3) and (max-width: 700px) 65vw, (min-resolution: 2.5dppx) and (max-width: 700px) 80vw, (-webkit-min-device-pixel-ratio: 2.5) and (max-width: 700px) 80vw, (min-resolution: 2dppx) and (max-width: 700px) 100vw, (-webkit-min-device-pixel-ratio: 2) and (max-width: 700px) 100vw, 700px\" data-testid=\"og\"><\/picture><\/div>\n<\/div>\n<figcaption class=\"mn mo mp mq mr ms mt be b bf z dv\" data-selectable-paragraph=\"\">The generator in a traditional GAN vs the one used by NVIDIA in the StyleGAN.<\/figcaption>\n<\/figure>\n<p id=\"ac77\" class=\"pw-post-body-paragraph mv mw fo be b gm mx my mz gp na nb nc nd ne nf ng nh ni nj nk nl nm nn no np fh bj\" 
data-selectable-paragraph=\"\">The model starts by generating images at a very low resolution (something like 4&#215;4) and progressively builds up to a final resolution of 1024&#215;1024, which provides enough detail for a visually appealing image.<\/p>\n<figure class=\"mg mh mi mj mk mf mq mr paragraph-image\">\n<figure><img loading=\"lazy\" decoding=\"async\" class=\"bg ml mm c\" role=\"presentation\" src=\"https:\/\/miro.medium.com\/v2\/resize:fit:512\/0*g99OxR_QB0YPaC5C.png\" alt=\"\" width=\"512\" height=\"508\"><\/figure><div class=\"mq mr ql\"><picture><source srcset=\"https:\/\/miro.medium.com\/v2\/resize:fit:640\/format:webp\/0*g99OxR_QB0YPaC5C.png 640w, https:\/\/miro.medium.com\/v2\/resize:fit:720\/format:webp\/0*g99OxR_QB0YPaC5C.png 720w, https:\/\/miro.medium.com\/v2\/resize:fit:750\/format:webp\/0*g99OxR_QB0YPaC5C.png 750w, https:\/\/miro.medium.com\/v2\/resize:fit:786\/format:webp\/0*g99OxR_QB0YPaC5C.png 786w, https:\/\/miro.medium.com\/v2\/resize:fit:828\/format:webp\/0*g99OxR_QB0YPaC5C.png 828w, https:\/\/miro.medium.com\/v2\/resize:fit:1100\/format:webp\/0*g99OxR_QB0YPaC5C.png 1100w, https:\/\/miro.medium.com\/v2\/resize:fit:1024\/format:webp\/0*g99OxR_QB0YPaC5C.png 1024w\" type=\"image\/webp\" sizes=\"(min-resolution: 4dppx) and (max-width: 700px) 50vw, (-webkit-min-device-pixel-ratio: 4) and (max-width: 700px) 50vw, (min-resolution: 3dppx) and (max-width: 700px) 67vw, (-webkit-min-device-pixel-ratio: 3) and (max-width: 700px) 65vw, (min-resolution: 2.5dppx) and (max-width: 700px) 80vw, (-webkit-min-device-pixel-ratio: 2.5) and (max-width: 700px) 80vw, (min-resolution: 2dppx) and (max-width: 700px) 100vw, (-webkit-min-device-pixel-ratio: 2) and (max-width: 700px) 100vw, 512px\"><source srcset=\"https:\/\/miro.medium.com\/v2\/resize:fit:640\/0*g99OxR_QB0YPaC5C.png 640w, https:\/\/miro.medium.com\/v2\/resize:fit:720\/0*g99OxR_QB0YPaC5C.png 720w, 
https:\/\/miro.medium.com\/v2\/resize:fit:750\/0*g99OxR_QB0YPaC5C.png 750w, https:\/\/miro.medium.com\/v2\/resize:fit:786\/0*g99OxR_QB0YPaC5C.png 786w, https:\/\/miro.medium.com\/v2\/resize:fit:828\/0*g99OxR_QB0YPaC5C.png 828w, https:\/\/miro.medium.com\/v2\/resize:fit:1100\/0*g99OxR_QB0YPaC5C.png 1100w, https:\/\/miro.medium.com\/v2\/resize:fit:1024\/0*g99OxR_QB0YPaC5C.png 1024w\" sizes=\"(min-resolution: 4dppx) and (max-width: 700px) 50vw, (-webkit-min-device-pixel-ratio: 4) and (max-width: 700px) 50vw, (min-resolution: 3dppx) and (max-width: 700px) 67vw, (-webkit-min-device-pixel-ratio: 3) and (max-width: 700px) 65vw, (min-resolution: 2.5dppx) and (max-width: 700px) 80vw, (-webkit-min-device-pixel-ratio: 2.5) and (max-width: 700px) 80vw, (min-resolution: 2dppx) and (max-width: 700px) 100vw, (-webkit-min-device-pixel-ratio: 2) and (max-width: 700px) 100vw, 512px\" data-testid=\"og\"><\/picture><\/div>\n<figcaption class=\"mn mo mp mq mr ms mt be b bf z dv\" data-selectable-paragraph=\"\">Number of images trained for a given time with the \u201cprogressive growing\u201d method versus a traditional GAN. (Courtesy: <a class=\"af mu\" href=\"https:\/\/towardsdatascience.com\/progan-how-nvidia-generated-images-of-unprecedented-quality-51c98ec2cbd2\" target=\"_blank\" rel=\"noopener\">towardsdatascience<\/a>)<\/figcaption>\n<\/figure>\n<p id=\"c56b\" class=\"pw-post-body-paragraph mv mw fo be b gm mx my mz gp na nb nc nd ne nf ng nh ni nj nk nl nm nn no np fh bj\" data-selectable-paragraph=\"\">The main principle behind training the StyleGAN is this \u201cprogressive\u201d method, which was first used by NVIDIA in their ProGAN. It works by gradually increasing the resolution, thus ensuring that the network evolves slowly, initially learning a simple problem before <em class=\"nv\">progressing<\/em> to learning more complex problems (or, in this case, images of a higher resolution). 
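A rough sketch of that schedule: train for a fixed number of steps per phase and double the working resolution between phases, from 4x4 up to 1024x1024. The phase length below is invented for illustration, and the gradual fade-in that ProGAN applies when a new resolution is introduced is omitted.

```python
def resolution_at(step, start_res=4, final_res=1024, steps_per_phase=10000):
    # Double the output resolution after every completed phase:
    # 4 -> 8 -> 16 -> ... -> 1024, then stay at the final resolution.
    res = start_res
    for _ in range(step // steps_per_phase):
        if res >= final_res:
            break
        res *= 2
    return res

print(resolution_at(0))        # 4
print(resolution_at(25000))    # 16
print(resolution_at(10**8))    # 1024
```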
This kind of training principle ensures stability and has been shown to minimize common problems associated with GANs, such as mode collapse. It also makes certain that high-level features are worked on first before moving on to the finer details, reducing the likelihood of such features being generated wrong (which would have a more drastic effect on the final image than the other way around). StyleGANs use a similar principle, but instead of generating a single image they generate multiple ones, and this technique allows styles or features to be dissociated from each other.<\/p>\n<p id=\"4395\" class=\"pw-post-body-paragraph mv mw fo be b gm mx my mz gp na nb nc nd ne nf ng nh ni nj nk nl nm nn no np fh bj\" data-selectable-paragraph=\"\">Specifically, this method causes two images to be generated and then combined by taking low-level features from one and high-level features from the other. A mixing regularization technique is used by the generator, causing some percentage of both to appear in the output image.<\/p>\n<figure class=\"mg mh mi mj mk mf mq mr paragraph-image\">\n<figure><img loading=\"lazy\" decoding=\"async\" class=\"bg ml mm c\" role=\"presentation\" src=\"https:\/\/miro.medium.com\/v2\/resize:fit:700\/1*9iUmdxuIWclafE0-mRUjfw.gif\" alt=\"\" width=\"700\" height=\"397\"><\/figure><div class=\"mq mr qm\"><picture><source srcset=\"https:\/\/miro.medium.com\/v2\/resize:fit:640\/1*9iUmdxuIWclafE0-mRUjfw.gif 640w, https:\/\/miro.medium.com\/v2\/resize:fit:720\/1*9iUmdxuIWclafE0-mRUjfw.gif 720w, https:\/\/miro.medium.com\/v2\/resize:fit:750\/1*9iUmdxuIWclafE0-mRUjfw.gif 750w, https:\/\/miro.medium.com\/v2\/resize:fit:786\/1*9iUmdxuIWclafE0-mRUjfw.gif 786w, https:\/\/miro.medium.com\/v2\/resize:fit:828\/1*9iUmdxuIWclafE0-mRUjfw.gif 828w, https:\/\/miro.medium.com\/v2\/resize:fit:1100\/1*9iUmdxuIWclafE0-mRUjfw.gif 1100w, https:\/\/miro.medium.com\/v2\/resize:fit:1400\/1*9iUmdxuIWclafE0-mRUjfw.gif 1400w\" type=\"image\/webp\" sizes=\"(min-resolution: 
4dppx) and (max-width: 700px) 50vw, (-webkit-min-device-pixel-ratio: 4) and (max-width: 700px) 50vw, (min-resolution: 3dppx) and (max-width: 700px) 67vw, (-webkit-min-device-pixel-ratio: 3) and (max-width: 700px) 65vw, (min-resolution: 2.5dppx) and (max-width: 700px) 80vw, (-webkit-min-device-pixel-ratio: 2.5) and (max-width: 700px) 80vw, (min-resolution: 2dppx) and (max-width: 700px) 100vw, (-webkit-min-device-pixel-ratio: 2) and (max-width: 700px) 100vw, 700px\"><source srcset=\"https:\/\/miro.medium.com\/v2\/resize:fit:640\/1*9iUmdxuIWclafE0-mRUjfw.gif 640w, https:\/\/miro.medium.com\/v2\/resize:fit:720\/1*9iUmdxuIWclafE0-mRUjfw.gif 720w, https:\/\/miro.medium.com\/v2\/resize:fit:750\/1*9iUmdxuIWclafE0-mRUjfw.gif 750w, https:\/\/miro.medium.com\/v2\/resize:fit:786\/1*9iUmdxuIWclafE0-mRUjfw.gif 786w, https:\/\/miro.medium.com\/v2\/resize:fit:828\/1*9iUmdxuIWclafE0-mRUjfw.gif 828w, https:\/\/miro.medium.com\/v2\/resize:fit:1100\/1*9iUmdxuIWclafE0-mRUjfw.gif 1100w, https:\/\/miro.medium.com\/v2\/resize:fit:1400\/1*9iUmdxuIWclafE0-mRUjfw.gif 1400w\" sizes=\"(min-resolution: 4dppx) and (max-width: 700px) 50vw, (-webkit-min-device-pixel-ratio: 4) and (max-width: 700px) 50vw, (min-resolution: 3dppx) and (max-width: 700px) 67vw, (-webkit-min-device-pixel-ratio: 3) and (max-width: 700px) 65vw, (min-resolution: 2.5dppx) and (max-width: 700px) 80vw, (-webkit-min-device-pixel-ratio: 2.5) and (max-width: 700px) 80vw, (min-resolution: 2dppx) and (max-width: 700px) 100vw, (-webkit-min-device-pixel-ratio: 2) and (max-width: 700px) 100vw, 700px\" data-testid=\"og\"><\/picture><\/div>\n<figcaption class=\"mn mo mp mq mr ms mt be b bf z dv\" data-selectable-paragraph=\"\">The generation process in the ProGAN which inspired the same in StyleGAN (Source : <a class=\"af mu\" href=\"https:\/\/towardsdatascience.com\/progan-how-nvidia-generated-images-of-unprecedented-quality-51c98ec2cbd2\" target=\"_blank\" rel=\"noopener\">Towards Data Science<\/a>)<\/figcaption>\n<\/figure>\n<p 
id=\"9f5c\" class=\"pw-post-body-paragraph mv mw fo be b gm mx my mz gp na nb nc nd ne nf ng nh ni nj nk nl nm nn no np fh bj\" data-selectable-paragraph=\"\">At every convolution layer, different styles can be used to generate an image: coarse styles with a resolution from 4&#215;4 to 8&#215;8, middle styles with a resolution of 16&#215;16 to 32&#215;32, or fine styles with a resolution from 64&#215;64 to 1024&#215;1024.<\/p>\n<p id=\"d9fb\" class=\"pw-post-body-paragraph mv mw fo be b gm mx my mz gp na nb nc nd ne nf ng nh ni nj nk nl nm nn no np fh bj\" data-selectable-paragraph=\"\">Coarse styles govern high-level features such as the subject\u2019s pose in the image or the subject\u2019s hair, face shape, etc. Middle styles control aspects such as facial features. Lastly, fine styles cover details in the image such as the color of the eyes or other microstructures.<\/p>\n<figure class=\"mg mh mi mj mk mf mq mr paragraph-image\">\n<div class=\"nr ns eb nt bg nu\" tabindex=\"0\" role=\"button\">\n<figure><img loading=\"lazy\" decoding=\"async\" class=\"bg ml mm c\" role=\"presentation\" src=\"https:\/\/miro.medium.com\/v2\/resize:fit:700\/1*mDA1ms7D5NrwKXp4r2CXQQ.png\" alt=\"\" width=\"700\" height=\"801\"><\/figure><div class=\"mq mr qn\"><picture><source srcset=\"https:\/\/miro.medium.com\/v2\/resize:fit:640\/format:webp\/1*mDA1ms7D5NrwKXp4r2CXQQ.png 640w, https:\/\/miro.medium.com\/v2\/resize:fit:720\/format:webp\/1*mDA1ms7D5NrwKXp4r2CXQQ.png 720w, https:\/\/miro.medium.com\/v2\/resize:fit:750\/format:webp\/1*mDA1ms7D5NrwKXp4r2CXQQ.png 750w, https:\/\/miro.medium.com\/v2\/resize:fit:786\/format:webp\/1*mDA1ms7D5NrwKXp4r2CXQQ.png 786w, https:\/\/miro.medium.com\/v2\/resize:fit:828\/format:webp\/1*mDA1ms7D5NrwKXp4r2CXQQ.png 828w, https:\/\/miro.medium.com\/v2\/resize:fit:1100\/format:webp\/1*mDA1ms7D5NrwKXp4r2CXQQ.png 1100w, https:\/\/miro.medium.com\/v2\/resize:fit:1400\/format:webp\/1*mDA1ms7D5NrwKXp4r2CXQQ.png 1400w\" type=\"image\/webp\" 
sizes=\"(min-resolution: 4dppx) and (max-width: 700px) 50vw, (-webkit-min-device-pixel-ratio: 4) and (max-width: 700px) 50vw, (min-resolution: 3dppx) and (max-width: 700px) 67vw, (-webkit-min-device-pixel-ratio: 3) and (max-width: 700px) 65vw, (min-resolution: 2.5dppx) and (max-width: 700px) 80vw, (-webkit-min-device-pixel-ratio: 2.5) and (max-width: 700px) 80vw, (min-resolution: 2dppx) and (max-width: 700px) 100vw, (-webkit-min-device-pixel-ratio: 2) and (max-width: 700px) 100vw, 700px\"><source srcset=\"https:\/\/miro.medium.com\/v2\/resize:fit:640\/1*mDA1ms7D5NrwKXp4r2CXQQ.png 640w, https:\/\/miro.medium.com\/v2\/resize:fit:720\/1*mDA1ms7D5NrwKXp4r2CXQQ.png 720w, https:\/\/miro.medium.com\/v2\/resize:fit:750\/1*mDA1ms7D5NrwKXp4r2CXQQ.png 750w, https:\/\/miro.medium.com\/v2\/resize:fit:786\/1*mDA1ms7D5NrwKXp4r2CXQQ.png 786w, https:\/\/miro.medium.com\/v2\/resize:fit:828\/1*mDA1ms7D5NrwKXp4r2CXQQ.png 828w, https:\/\/miro.medium.com\/v2\/resize:fit:1100\/1*mDA1ms7D5NrwKXp4r2CXQQ.png 1100w, https:\/\/miro.medium.com\/v2\/resize:fit:1400\/1*mDA1ms7D5NrwKXp4r2CXQQ.png 1400w\" sizes=\"(min-resolution: 4dppx) and (max-width: 700px) 50vw, (-webkit-min-device-pixel-ratio: 4) and (max-width: 700px) 50vw, (min-resolution: 3dppx) and (max-width: 700px) 67vw, (-webkit-min-device-pixel-ratio: 3) and (max-width: 700px) 65vw, (min-resolution: 2.5dppx) and (max-width: 700px) 80vw, (-webkit-min-device-pixel-ratio: 2.5) and (max-width: 700px) 80vw, (min-resolution: 2dppx) and (max-width: 700px) 100vw, (-webkit-min-device-pixel-ratio: 2) and (max-width: 700px) 100vw, 700px\" data-testid=\"og\"><\/picture><\/div>\n<\/div>\n<figcaption class=\"mn mo mp mq mr ms mt be b bf z dv\" data-selectable-paragraph=\"\">Copying the styles corresponding to coarse spatial resolutions brings high-level aspects such as pose, general hair style, face shape, and eyeglasses from source B, while all colors (eyes, hair, lighting) and finer facial features resemble source A.<\/figcaption>\n<\/figure>\n<p 
id=\"c842\" class=\"pw-post-body-paragraph mv mw fo be b gm mx my mz gp na nb nc nd ne nf ng nh ni nj nk nl nm nn no np fh bj\" data-selectable-paragraph=\"\">The StyleGAN architecture also adds noise on a per-pixel basis after each convolution layer. This is done in order to create \u201cstochastic variation\u201d in the image. The researchers observe that adding noise in this way allows localized style changes to be applied to \u201cstochastic\u201d aspects of the image, such as wrinkles, freckles, skin pores, stubble, etc.<\/p>\n<figure class=\"mg mh mi mj mk mf mq mr paragraph-image\">\n<figure><img loading=\"lazy\" decoding=\"async\" class=\"bg ml mm c\" role=\"presentation\" src=\"https:\/\/miro.medium.com\/v2\/resize:fit:640\/1*M0mIHkS_iVT7Qtiyi989sQ.png\" alt=\"\" width=\"640\" height=\"245\"><\/figure><div class=\"mq mr qo\"><picture><source srcset=\"https:\/\/miro.medium.com\/v2\/resize:fit:640\/format:webp\/1*M0mIHkS_iVT7Qtiyi989sQ.png 640w, https:\/\/miro.medium.com\/v2\/resize:fit:720\/format:webp\/1*M0mIHkS_iVT7Qtiyi989sQ.png 720w, https:\/\/miro.medium.com\/v2\/resize:fit:750\/format:webp\/1*M0mIHkS_iVT7Qtiyi989sQ.png 750w, https:\/\/miro.medium.com\/v2\/resize:fit:786\/format:webp\/1*M0mIHkS_iVT7Qtiyi989sQ.png 786w, https:\/\/miro.medium.com\/v2\/resize:fit:828\/format:webp\/1*M0mIHkS_iVT7Qtiyi989sQ.png 828w, https:\/\/miro.medium.com\/v2\/resize:fit:1100\/format:webp\/1*M0mIHkS_iVT7Qtiyi989sQ.png 1100w, https:\/\/miro.medium.com\/v2\/resize:fit:1280\/format:webp\/1*M0mIHkS_iVT7Qtiyi989sQ.png 1280w\" type=\"image\/webp\" sizes=\"(min-resolution: 4dppx) and (max-width: 700px) 50vw, (-webkit-min-device-pixel-ratio: 4) and (max-width: 700px) 50vw, (min-resolution: 3dppx) and (max-width: 700px) 67vw, (-webkit-min-device-pixel-ratio: 3) and (max-width: 700px) 65vw, (min-resolution: 2.5dppx) and (max-width: 700px) 80vw, (-webkit-min-device-pixel-ratio: 2.5) and (max-width: 700px) 80vw, (min-resolution: 2dppx) and (max-width: 700px) 100vw, 
(-webkit-min-device-pixel-ratio: 2) and (max-width: 700px) 100vw, 640px\"><source srcset=\"https:\/\/miro.medium.com\/v2\/resize:fit:640\/1*M0mIHkS_iVT7Qtiyi989sQ.png 640w, https:\/\/miro.medium.com\/v2\/resize:fit:720\/1*M0mIHkS_iVT7Qtiyi989sQ.png 720w, https:\/\/miro.medium.com\/v2\/resize:fit:750\/1*M0mIHkS_iVT7Qtiyi989sQ.png 750w, https:\/\/miro.medium.com\/v2\/resize:fit:786\/1*M0mIHkS_iVT7Qtiyi989sQ.png 786w, https:\/\/miro.medium.com\/v2\/resize:fit:828\/1*M0mIHkS_iVT7Qtiyi989sQ.png 828w, https:\/\/miro.medium.com\/v2\/resize:fit:1100\/1*M0mIHkS_iVT7Qtiyi989sQ.png 1100w, https:\/\/miro.medium.com\/v2\/resize:fit:1280\/1*M0mIHkS_iVT7Qtiyi989sQ.png 1280w\" sizes=\"(min-resolution: 4dppx) and (max-width: 700px) 50vw, (-webkit-min-device-pixel-ratio: 4) and (max-width: 700px) 50vw, (min-resolution: 3dppx) and (max-width: 700px) 67vw, (-webkit-min-device-pixel-ratio: 3) and (max-width: 700px) 65vw, (min-resolution: 2.5dppx) and (max-width: 700px) 80vw, (-webkit-min-device-pixel-ratio: 2.5) and (max-width: 700px) 80vw, (min-resolution: 2dppx) and (max-width: 700px) 100vw, (-webkit-min-device-pixel-ratio: 2) and (max-width: 700px) 100vw, 640px\" data-testid=\"og\"><\/picture><\/div>\n<\/figure>\n<p id=\"6f0e\" class=\"pw-post-body-paragraph mv mw fo be b gm mx my mz gp na nb nc nd ne nf ng nh ni nj nk nl nm nn no np fh bj\" data-selectable-paragraph=\"\">In a traditional GAN, the generator network would obtain a random \u201clatent\u201d vector as its input, and using multiple transposed convolutions, would alter that vector into an image that would appear authentic to the discriminator. 
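<\/p>
<p class=\"pw-post-body-paragraph\">As a rough sketch of this progressive upsampling (plain Python, illustrative shapes only; this is not code from the StyleGAN repository), each transposed convolution with kernel 4, stride 2, and padding 1 doubles the spatial resolution until the reshaped latent tensor reaches the full image size:<\/p>

```python
# Illustrative sketch: output spatial size of one transposed convolution
# with kernel k, stride s, and padding p (the common DCGAN-style setting).
def deconv_size(n, k=4, s=2, p=1):
    return s * (n - 1) + k - 2 * p

size = 4            # latent vector reshaped into a 4x4 feature map
sizes = [size]
while size < 1024:  # upsample until the target 1024x1024 resolution
    size = deconv_size(size)
    sizes.append(size)

print(sizes)  # [4, 8, 16, 32, 64, 128, 256, 512, 1024]
```

<p class=\"pw-post-body-paragraph\">Eight such layers take a 4&#215;4 feature map up to 1024&#215;1024; a real generator also adjusts the number of channels at each resolution.<\/p>
<p class=\"pw-post-body-paragraph\">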
This latent vector can be thought of as a tensor representation of the image to the network.<\/p>
<p id=\"9d4c\" class=\"pw-post-body-paragraph mv mw fo be b gm mx my mz gp na nb nc nd ne nf ng nh ni nj nk nl nm nn no np fh bj\" data-selectable-paragraph=\"\">The traditional GAN doesn\u2019t allow control over the finer styling of the image: the generator follows the distribution it learned during training, as governed by high-level attributes, and is also influenced by the general \u201ctrend\u201d of its dataset (for example, a dominant hair color throughout the dataset). The most one could do is change the input latent vector and thus obtain a different result.<\/p>
<figure class=\"mg mh mi mj mk mf mq mr paragraph-image\">
<figure><img loading=\"lazy\" decoding=\"async\" class=\"bg ml mm c\" role=\"presentation\" src=\"https:\/\/miro.medium.com\/v2\/resize:fit:700\/1*4jsnaDROd4DFUoK3S0wQjQ.png\" alt=\"\" width=\"700\" height=\"334\"><\/figure>
<figcaption class=\"mn mo mp mq mr ms mt be b bf z dv\" data-selectable-paragraph=\"\">Intermediate Latent Space with a \u201cmapping network\u201d. 
(<a class=\"af mu\" href=\"https:\/\/www.lyrn.ai\/2018\/12\/26\/a-style-based-generator-architecture-for-generative-adversarial-networks\/\" target=\"_blank\" rel=\"noopener ugc nofollow\">Source<\/a>)<\/figcaption>
<\/figure>
<p id=\"6cee\" class=\"pw-post-body-paragraph mv mw fo be b gm mx my mz gp na nb nc nd ne nf ng nh ni nj nk nl nm nn no np fh bj\" data-selectable-paragraph=\"\">NVIDIA\u2019s architecture includes an intermediate \u201clatent space\u201d, which can be thought of as being \u201cdetachable\u201d. An input latent code can then be embedded into this space. The styles or features present in an image are different transformations of this same latent vector embedding, which is used to normalize the input of each convolutional layer.<\/p>
<p id=\"ab10\" class=\"pw-post-body-paragraph mv mw fo be b gm mx my mz gp na nb nc nd ne nf ng nh ni nj nk nl nm nn no np fh bj\" data-selectable-paragraph=\"\">Additional noise is also fed to the network in order to exert greater control over the finer details. When generating the output image, the user switches between latent codes at a selected point in the network, thus leading to a mixing of styles.<\/p>
<h2 id=\"aaca\" class=\"qq nx fo be ny qr qs qt ob qu qv qw oe nd qx qy qz nh ra rb rc nl rd re rf rg bj\" data-selectable-paragraph=\"\"><strong class=\"al\">Feature disentanglement<\/strong><\/h2>
<p id=\"084c\" class=\"pw-post-body-paragraph mv mw fo be b gm os my mz gp ot nb nc nd ou nf ng nh ov nj nk nl ow nn no np fh bj\" data-selectable-paragraph=\"\">Traditional GANs have a problem controlling styles or features within the same image because of something called feature entanglement. 
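<\/p>
<p class=\"pw-post-body-paragraph\">The per-layer styling described above is what the paper calls adaptive instance normalization (AdaIN): each channel of a convolution\u2019s output is normalized, then scaled and shifted by values derived from the intermediate latent vector. A minimal plain-Python sketch of the idea, with toy numbers and an illustrative function name:<\/p>

```python
from statistics import fmean, pstdev

def adain(channel, y_scale, y_bias, eps=1e-8):
    """Normalize one feature-map channel to zero mean and unit variance,
    then re-style it with a scale and bias derived from the latent code."""
    mu, sigma = fmean(channel), pstdev(channel)
    return [y_scale * (x - mu) / (sigma + eps) + y_bias for x in channel]

channel = [0.5, 1.5, 2.0, 4.0]  # toy activations for one channel
styled = adain(channel, y_scale=2.0, y_bias=1.0)

# The channel's statistics now match the injected style:
print(round(fmean(styled), 6), round(pstdev(styled), 6))  # 1.0 2.0
```

<p class=\"pw-post-body-paragraph\">In the actual architecture, the scale and bias come from a learned affine transformation of the intermediate latent vector, with one pair per feature map at every convolution.<\/p>
<p class=\"pw-post-body-paragraph\">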
As the name suggests, a GAN cannot distinguish these finer details as well as a human can, so the features become \u201centangled\u201d with each other to some extent within the GAN\u2019s frame of perception.<\/p>
<p id=\"adf3\" class=\"pw-post-body-paragraph mv mw fo be b gm mx my mz gp na nb nc nd ne nf ng nh ni nj nk nl nm nn no np fh bj\" data-selectable-paragraph=\"\">A good example would be \u201centanglement\u201d between the features of hair color and gender. If the dataset used for training has a general trend of males having short hair and females having long hair, the neural network would learn that males can only have short hair, and vice-versa for females. As a result, changing the latent vector to give a male face long hair would also end up changing the gender, leading to an image of a woman.<\/p>
<figure class=\"mg mh mi mj mk mf mq mr paragraph-image\">
<figure><img loading=\"lazy\" decoding=\"async\" class=\"bg ml mm c\" role=\"presentation\" src=\"https:\/\/miro.medium.com\/v2\/resize:fit:256\/1*dnBtRyld-uWT-1l1XDyXvQ.gif\" alt=\"\" width=\"256\" height=\"256\"><\/figure>
<figcaption class=\"mn mo mp mq mr ms mt be b bf z dv\" data-selectable-paragraph=\"\"><a class=\"af mu\" href=\"https:\/\/github.com\/Puzer\/stylegan-encoder\/issues\/1\" target=\"_blank\" rel=\"noopener ugc nofollow\">Source<\/a><\/figcaption>
<\/figure>
<p id=\"8692\" class=\"pw-post-body-paragraph mv mw fo be b gm mx my mz gp na nb nc nd ne nf ng nh ni nj nk nl nm nn no np fh bj\" data-selectable-paragraph=\"\">Using the intermediate latent space, the StyleGAN architecture lets the user make small changes to the input vector in such a way that 
the output image is not altered dramatically. A \u201cmapping network\u201d maps the input vector to an intermediate latent vector, which is then fed to the generator network.<\/p>
<figure class=\"mg mh mi mj mk mf mq mr paragraph-image\">
<figure><img loading=\"lazy\" decoding=\"async\" class=\"bg ml mm c\" role=\"presentation\" src=\"https:\/\/miro.medium.com\/v2\/resize:fit:442\/0*juPzyoeiUrATZINA.png\" alt=\"\" width=\"442\" height=\"350\"><\/figure>
<figcaption class=\"mn mo mp mq mr ms mt be b bf z dv\" data-selectable-paragraph=\"\"><a class=\"af mu\" href=\"https:\/\/www.lyrn.ai\/2018\/12\/26\/a-style-based-generator-architecture-for-generative-adversarial-networks\/\" target=\"_blank\" rel=\"noopener ugc nofollow\">Source<\/a><\/figcaption>
<\/figure>
<p id=\"b3d9\" class=\"pw-post-body-paragraph mv mw fo be b gm mx my mz gp na nb nc nd ne nf ng nh ni nj nk nl nm nn no np fh bj\" data-selectable-paragraph=\"\">The researchers used an 8-layer network for this purpose, whose input and output are both 512-dimensional vectors. In their paper, they also make the case for why these hyperparameters work best. They present two separate approaches to measuring feature disentanglement:<\/p>
<ol class=\"\">
<li id=\"e7a5\" class=\"mv mw fo be b gm mx my mz gp na nb nc ox ne nf ng oy ni nj nk oz nm nn no np rj pb pc bj\" data-selectable-paragraph=\"\"><strong class=\"be qj\">Perceptual path length <\/strong>\u2014 Calculate the difference between VGG16 embeddings of images when interpolating between two random inputs. 
A drastic change indicates that multiple features have changed together and thus might be entangled.<\/li>\n<li id=\"4208\" class=\"mv mw fo be b gm pd my mz gp pe nb nc ox pf nf ng oy pg nj nk oz ph nn no np rj pb pc bj\" data-selectable-paragraph=\"\"><strong class=\"be qj\">Linear separability <\/strong>\u2014 Classify all inputs into binary classes, such as male and female. The better the classification, the more separable the features.<\/li>\n<\/ol>\n<h1 id=\"b280\" class=\"nw nx fo be ny nz oa go ob oc od gr oe of og oh oi oj ok ol om on oo op oq or bj\" data-selectable-paragraph=\"\">GAN you try it out?<\/h1>\n<p id=\"5987\" class=\"pw-post-body-paragraph mv mw fo be b gm os my mz gp ot nb nc nd ou nf ng nh ov nj nk nl ow nn no np fh bj\" data-selectable-paragraph=\"\">Here\u2019s a chance for you to get your hands dirty. Not only has NVIDIA made the whole project open-source, but they\u2019ve also released a variety of resources, documentation, and pre-trained models for developers to play around with. A coder\u2019s paradise!<\/p>\n<p id=\"3566\" class=\"pw-post-body-paragraph mv mw fo be b gm mx my mz gp na nb nc nd ne nf ng nh ni nj nk nl nm nn no np fh bj\" data-selectable-paragraph=\"\">StyleGAN was trained on the CelebA-HQ and FFHQ datasets for one week using 8 Tesla V100 GPUs. Its implementation is in TensorFlow and can be found in NVIDIA\u2019s <a class=\"af mu\" href=\"https:\/\/github.com\/NVlabs\/stylegan\" target=\"_blank\" rel=\"noopener ugc nofollow\">GitHub repository<\/a>, made available under the <a class=\"af mu\" href=\"https:\/\/creativecommons.org\/licenses\/by-nc\/4.0\/\" target=\"_blank\" rel=\"noopener ugc nofollow\">Creative Commons BY-NC 4.0<\/a> license. This means you can use, redistribute, and adapt the material for non-commercial purposes, as long as you give appropriate credit by citing the research paper and indicating any changes made.<\/p>\n<pre># Copyright (c) 2019, NVIDIA CORPORATION. 
All rights reserved.\n#\n# This work is licensed under the Creative Commons Attribution-NonCommercial\n# 4.0 International License. To view a copy of this license, visit\n# http:\/\/creativecommons.org\/licenses\/by-nc\/4.0\/ or send a letter to\n# Creative Commons, PO Box 1866, Mountain View, CA 94042, USA.\n\n\"\"\"Minimal script for generating an image using pre-trained StyleGAN generator.\"\"\"\n\nimport os\nimport pickle\nimport numpy as np\nimport PIL.Image\nimport dnnlib\nimport dnnlib.tflib as tflib\nimport config\n\ndef main():\n    # Initialize TensorFlow.\n    tflib.init_tf()\n\n    # Load pre-trained network.\n    url = 'https:\/\/drive.google.com\/uc?id=1MEGjdvVpUsu1jB4zrXZN7Y4kBBOzizDQ' # karras2019stylegan-ffhq-1024x1024.pkl\n    with dnnlib.util.open_url(url, cache_dir=config.cache_dir) as f:\n        _G, _D, Gs = pickle.load(f)\n        # _G = Instantaneous snapshot of the generator. Mainly useful for resuming a previous training run.\n        # _D = Instantaneous snapshot of the discriminator. Mainly useful for resuming a previous training run.\n        # Gs = Long-term average of the generator. Yields higher-quality results than the instantaneous snapshot.\n\n    # Print network details.\n    Gs.print_layers()\n\n    # Pick latent vector.\n    rnd = np.random.RandomState(5)\n    latents = rnd.randn(1, Gs.input_shape[1])\n\n    # Generate image.\n    fmt = dict(func=tflib.convert_images_to_uint8, nchw_to_nhwc=True)\n    images = Gs.run(latents, None, truncation_psi=0.7, randomize_noise=True, output_transform=fmt)\n\n    # Save image.\n    os.makedirs(config.result_dir, exist_ok=True)\n    png_filename = os.path.join(config.result_dir, 'example.png')\n    PIL.Image.fromarray(images[0], 'RGB').save(png_filename)\n\nif __name__ == \"__main__\":\n    main()<\/pre>\n<figure class=\"mg mh mi mj mk mf\">\n<figcaption class=\"mn mo mp mq mr ms mt be b bf z dv\">Try out the pretrained_example.py in the StyleGAN repo. 
It downloads a pre-trained generator from Google Drive and uses it to generate an image.<\/figcaption>
<\/figure>
<p id=\"63ba\" class=\"pw-post-body-paragraph mv mw fo be b gm mx my mz gp na nb nc nd ne nf ng nh ni nj nk nl nm nn no np fh bj\" data-selectable-paragraph=\"\">The researchers trained their model on the CelebA-HQ and Flickr-Faces-HQ (FFHQ) datasets, both containing 1024&#215;1024-resolution images. They strongly advise training with 8 GPUs in order to produce similar results.<\/p>
<figure class=\"mg mh mi mj mk mf mq mr paragraph-image\">
<figure><img loading=\"lazy\" decoding=\"async\" class=\"bg ml mm c\" role=\"presentation\" src=\"https:\/\/miro.medium.com\/v2\/resize:fit:516\/1*pnFPEyLbM43_3_KYpzA1ug.png\" alt=\"\" width=\"516\" height=\"199\"><\/figure>
<figcaption class=\"mn mo mp mq mr ms mt be b bf z dv\" data-selectable-paragraph=\"\">Training times for various image resolutions, compared against the number of GPUs used.<\/figcaption>
<\/figure>
<h1 id=\"7661\" class=\"nw nx fo be ny nz oa go ob oc od gr oe of og oh oi oj ok ol om on oo op oq or bj\" data-selectable-paragraph=\"\">Results<\/h1>
<p id=\"9e63\" class=\"pw-post-body-paragraph mv mw fo be b gm os my mz gp ot nb nc nd ou nf ng nh ov nj nk nl ow nn no np fh bj\" data-selectable-paragraph=\"\">In addition to the face datasets, the researchers also applied StyleGAN to three other datasets: LSUN BEDROOMS, CARS, and CATS. 
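<\/p>
<p class=\"pw-post-body-paragraph\">The style mixing behind these figures can be sketched in plain Python (shapes only, with toy values; in the 1024&#215;1024 model, the mapping network\u2019s output is fed to 18 per-layer style inputs, two per resolution from 4&#215;4 up to 1024&#215;1024):<\/p>

```python
NUM_LAYERS = 18  # two style inputs per resolution: 4x4, 8x8, ..., 1024x1024

# Toy per-layer latents for two source images (real ones are 512-D vectors).
w_a = [["A"] * 512 for _ in range(NUM_LAYERS)]
w_b = [["B"] * 512 for _ in range(NUM_LAYERS)]

def mix_styles(w_src, w_dst, crossover):
    """Take the first `crossover` (coarse) layer styles from w_src
    and keep the remaining (finer) layers from w_dst."""
    return w_src[:crossover] + w_dst[crossover:]

# Layers 0-3 cover the 4x4 and 8x8 resolutions, so pose, face shape, and
# other high-level aspects come from B while finer styles stay from A.
mixed = mix_styles(w_b, w_a, crossover=4)
```

<p class=\"pw-post-body-paragraph\">Moving the crossover point deeper into the network shifts control from coarse attributes toward the middle and fine styles.<\/p>
<p class=\"pw-post-body-paragraph\">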
Shown in the picture below are some of the results they obtained by mixing styles from the various images.<\/p>
<p id=\"da03\" class=\"pw-post-body-paragraph mv mw fo be b gm mx my mz gp na nb nc nd ne nf ng nh ni nj nk nl nm nn no np fh bj\" data-selectable-paragraph=\"\">As we can see, in the BEDROOM dataset the coarse styles control the viewpoint of the camera, the middle styles select the particular furniture, and the fine styles deal with colors and smaller details of materials. The same concept holds true for the other two datasets. The effects of stochastic variation can be observed in the fabrics in BEDROOM, the backgrounds and headlamps in CARS, and the fur, background, and positioning of paws in CATS.<\/p>
<\/div>
<\/div>
<div class=\"mf bg\">
<figure class=\"mg mh mi mj mk mf bg paragraph-image\"><picture><img loading=\"lazy\" decoding=\"async\" class=\"bg ml mm c\" role=\"presentation\" src=\"https:\/\/miro.medium.com\/v2\/resize:fit:1855\/1*2Zct9rz8OjDSi5hRb9pgbw.png\" alt=\"\" width=\"1855\" height=\"759\"><\/picture>
<figcaption class=\"mn mo mp mq mr ms mt be b bf z dv\" data-selectable-paragraph=\"\">Left to right: the style-mixing results depicted in the research paper for the LSUN BEDROOMS, CARS, and CATS datasets, each containing 50,000 images.<\/figcaption>
<\/figure>
<\/div>
<div class=\"ab ca\">
<div class=\"ch bg et eu ev ew\">
<p id=\"2fe8\" class=\"pw-post-body-paragraph mv mw fo be b gm mx my mz gp na nb nc nd ne nf ng nh ni nj nk nl nm nn no np fh bj\" data-selectable-paragraph=\"\">StyleGAN has been widely used by developers to tinker with image datasets, and many interesting results can be found. 
From <a class=\"af mu\" href=\"https:\/\/heartbeat.comet.ml\/my-mangagan-building-my-first-generative-adversarial-network-2ec1920257e3\" target=\"_blank\" rel=\"noopener ugc nofollow\">generating anime characters<\/a> to creating brand-new fonts and alphabets in various languages, one could safely say that StyleGAN has been experimented with quite a lot.<\/p>
<p id=\"e027\" class=\"pw-post-body-paragraph mv mw fo be b gm mx my mz gp na nb nc nd ne nf ng nh ni nj nk nl nm nn no np fh bj\" data-selectable-paragraph=\"\"><a class=\"af mu\" href=\"https:\/\/thispersondoesnotexist.com\/\" target=\"_blank\" rel=\"noopener ugc nofollow\">ThisPersonDoesNotExist.com<\/a> also uses a StyleGAN to generate a fake high-resolution face each time the page is refreshed. The image below shows various characters generated by a StyleGAN trained on scripts from several languages.<\/p>
<figure class=\"mg mh mi mj mk mf mq mr paragraph-image\">
<figure><img loading=\"lazy\" decoding=\"async\" class=\"bg ml mm c\" role=\"presentation\" src=\"https:\/\/miro.medium.com\/v2\/resize:fit:700\/1*HTrmeR2Whr0I1G3-CGqmsg.png\" alt=\"\" width=\"700\" height=\"134\"><\/figure>
<figcaption class=\"mn mo mp mq mr ms mt be b bf z dv\" data-selectable-paragraph=\"\">Source : <a class=\"af mu\" href=\"https:\/\/towardsdatascience.com\/creating-new-scripts-with-stylegan-c16473a50fd0\" target=\"_blank\" rel=\"noopener\">Robert Munro in Towards Data Science<\/a><\/figcaption>
<\/figure>
<h1 id=\"0257\" class=\"nw nx fo be ny nz oa go ob oc od gr oe of og oh oi oj ok 
ol om on oo op oq or bj\" data-selectable-paragraph=\"\">Concurrent Research<\/h1>
<p id=\"4e81\" class=\"pw-post-body-paragraph mv mw fo be b gm os my mz gp ot nb nc nd ou nf ng nh ov nj nk nl ow nn no np fh bj\" data-selectable-paragraph=\"\">While this article focuses primarily on the research conducted by NVIDIA and compiled in their paper \u201c<a class=\"af mu\" href=\"https:\/\/paperswithcode.com\/paper\/a-style-based-generator-architecture-for\" target=\"_blank\" rel=\"noopener ugc nofollow\">A Style-Based Generator Architecture for Generative Adversarial Networks<\/a>\u201d, it is also worthwhile to look at more recent findings and developments that build on NVIDIA\u2019s research. This section summarizes recent deep learning work related to StyleGANs.<\/p>
<h2 id=\"0288\" class=\"qq nx fo be ny qr qs qt ob qu qv qw oe nd qx qy qz nh ra rb rc nl rd re rf rg bj\" data-selectable-paragraph=\"\"><strong class=\"al\">Semantic Image Synthesis with Spatially-Adaptive Normalization<\/strong><\/h2>
<figure class=\"mg mh mi mj mk mf mq mr paragraph-image\">
<figure><img loading=\"lazy\" decoding=\"async\" class=\"bg ml mm c\" role=\"presentation\" src=\"https:\/\/miro.medium.com\/v2\/resize:fit:700\/1*CWn06Y8VGrVqItNVwMASUw.gif\" alt=\"\" width=\"700\" height=\"340\"><\/figure>
<figcaption class=\"mn mo mp mq mr ms mt be b bf z dv\" data-selectable-paragraph=\"\">GauGAN (<a class=\"af mu\" href=\"http:\/\/nvidia-research-mingyuliu.com\/gaugan\" target=\"_blank\" rel=\"noopener ugc nofollow\">Source<\/a>)<\/figcaption>
<\/figure>
<p 
id=\"605c\" class=\"pw-post-body-paragraph mv mw fo be b gm mx my mz gp na nb nc nd ne nf ng nh ni nj nk nl nm nn no np fh bj\" data-selectable-paragraph=\"\">Similarly, another method for photorealistic style transfer is <a class=\"af mu\" href=\"https:\/\/arxiv.org\/pdf\/1903.07291v1.pdf\" target=\"_blank\" rel=\"noopener ugc nofollow\"><em class=\"nv\">Semantic Image Synthesis with Spatially-Adaptive Normalization<\/em><\/a>. SPADE, or spatially-adaptive normalization, is a layer for synthesizing images from an input semantic layout: rather than feeding the layout to the network only once as input, it injects the layout into the network at every layer. In the conventional design, the input layout is processed through stacks of convolution, normalization, and non-linearity layers, and traversing this deep stack tends to \u201cwash away\u201d the semantic information.<\/p>\n<p id=\"6947\" class=\"pw-post-body-paragraph mv mw fo be b gm mx my mz gp na nb nc nd ne nf ng nh ni nj nk nl nm nn no np fh bj\" data-selectable-paragraph=\"\">NVIDIA\u2019s GauGAN is based on this approach. It creates photorealistic images from segmentation maps, which are labeled sketches that depict the layout of a scene. Users can design their own landscapes with paintbrush and paint bucket tools, using labels like river, rock, and cloud.<\/p>\n<p id=\"7c49\" class=\"pw-post-body-paragraph mv mw fo be b gm mx my mz gp na nb nc nd ne nf ng nh ni nj nk nl nm nn no np fh bj\" data-selectable-paragraph=\"\">A style transfer algorithm allows users to apply filters \u2014 change a daytime scene to sunset, or a photograph to a painting. 
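The spatially-adaptive normalization idea can be sketched as a small PyTorch module. This is a minimal illustration, not the paper's reference implementation; the layer sizes and names here are assumptions:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class SPADE(nn.Module):
    """Toy SPADE-style layer: normalize activations, then modulate them
    spatially with scale/shift maps computed from the semantic layout."""

    def __init__(self, num_features, label_channels, hidden=128):
        super().__init__()
        # Parameter-free normalization of the incoming activations.
        self.norm = nn.BatchNorm2d(num_features, affine=False)
        # A small conv net maps the segmentation map to per-pixel
        # scale (gamma) and shift (beta) tensors.
        self.shared = nn.Sequential(
            nn.Conv2d(label_channels, hidden, kernel_size=3, padding=1),
            nn.ReLU(),
        )
        self.gamma = nn.Conv2d(hidden, num_features, kernel_size=3, padding=1)
        self.beta = nn.Conv2d(hidden, num_features, kernel_size=3, padding=1)

    def forward(self, x, segmap):
        normalized = self.norm(x)
        # Resize the semantic layout to the activation's resolution, so the
        # layout can re-enter at every layer instead of being "washed away".
        segmap = F.interpolate(segmap, size=x.shape[2:], mode="nearest")
        h = self.shared(segmap)
        return normalized * (1 + self.gamma(h)) + self.beta(h)

x = torch.randn(2, 64, 32, 32)     # intermediate generator activations
seg = torch.randn(2, 5, 256, 256)  # semantic layout with 5 label channels
out = SPADE(64, 5)(x, seg)
print(out.shape)  # torch.Size([2, 64, 32, 32])
```

Because the scale and shift vary per pixel, each region of the layout (sky, river, rock) modulates the generator's features differently.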
Users can even upload their own filters to layer onto their masterpieces, or upload custom segmentation maps and landscape images as a foundation for their artwork.<\/p>\n<\/div>\n<\/div>\n<\/div>\n\n\n\n<div class=\"fh fi fj fk fl\">\n<div class=\"ab ca\">\n<div class=\"ch bg et eu ev ew\">\n<h2 id=\"4860\" class=\"qq nx fo be ny qr qs qt ob qu qv qw oe nd qx qy qz nh ra rb rc nl rd re rf rg bj\" data-selectable-paragraph=\"\"><strong class=\"al\">Image2StyleGAN: How to Embed Images Into the StyleGAN Latent Space?<\/strong><\/h2>\n<figure class=\"mg mh mi mj mk mf mq mr paragraph-image\">\n<div class=\"nr ns eb nt bg nu\" tabindex=\"0\" role=\"button\">\n<figure><img loading=\"lazy\" decoding=\"async\" class=\"bg ml mm c\" role=\"presentation\" src=\"https:\/\/miro.medium.com\/v2\/resize:fit:700\/1*PojcqygXdVBD4p6de-xrsA.png\" alt=\"\" width=\"700\" height=\"284\"><\/figure><div class=\"mq mr ro\"><picture><source srcset=\"https:\/\/miro.medium.com\/v2\/resize:fit:640\/format:webp\/1*PojcqygXdVBD4p6de-xrsA.png 640w, https:\/\/miro.medium.com\/v2\/resize:fit:720\/format:webp\/1*PojcqygXdVBD4p6de-xrsA.png 720w, https:\/\/miro.medium.com\/v2\/resize:fit:750\/format:webp\/1*PojcqygXdVBD4p6de-xrsA.png 750w, https:\/\/miro.medium.com\/v2\/resize:fit:786\/format:webp\/1*PojcqygXdVBD4p6de-xrsA.png 786w, https:\/\/miro.medium.com\/v2\/resize:fit:828\/format:webp\/1*PojcqygXdVBD4p6de-xrsA.png 828w, https:\/\/miro.medium.com\/v2\/resize:fit:1100\/format:webp\/1*PojcqygXdVBD4p6de-xrsA.png 1100w, https:\/\/miro.medium.com\/v2\/resize:fit:1400\/format:webp\/1*PojcqygXdVBD4p6de-xrsA.png 1400w\" type=\"image\/webp\" sizes=\"(min-resolution: 4dppx) and (max-width: 700px) 50vw, (-webkit-min-device-pixel-ratio: 4) and (max-width: 700px) 50vw, (min-resolution: 3dppx) and (max-width: 700px) 67vw, (-webkit-min-device-pixel-ratio: 3) and (max-width: 700px) 65vw, (min-resolution: 2.5dppx) and (max-width: 700px) 80vw, (-webkit-min-device-pixel-ratio: 2.5) and (max-width: 700px) 80vw, 
(min-resolution: 2dppx) and (max-width: 700px) 100vw, (-webkit-min-device-pixel-ratio: 2) and (max-width: 700px) 100vw, 700px\"><source srcset=\"https:\/\/miro.medium.com\/v2\/resize:fit:640\/1*PojcqygXdVBD4p6de-xrsA.png 640w, https:\/\/miro.medium.com\/v2\/resize:fit:720\/1*PojcqygXdVBD4p6de-xrsA.png 720w, https:\/\/miro.medium.com\/v2\/resize:fit:750\/1*PojcqygXdVBD4p6de-xrsA.png 750w, https:\/\/miro.medium.com\/v2\/resize:fit:786\/1*PojcqygXdVBD4p6de-xrsA.png 786w, https:\/\/miro.medium.com\/v2\/resize:fit:828\/1*PojcqygXdVBD4p6de-xrsA.png 828w, https:\/\/miro.medium.com\/v2\/resize:fit:1100\/1*PojcqygXdVBD4p6de-xrsA.png 1100w, https:\/\/miro.medium.com\/v2\/resize:fit:1400\/1*PojcqygXdVBD4p6de-xrsA.png 1400w\" sizes=\"(min-resolution: 4dppx) and (max-width: 700px) 50vw, (-webkit-min-device-pixel-ratio: 4) and (max-width: 700px) 50vw, (min-resolution: 3dppx) and (max-width: 700px) 67vw, (-webkit-min-device-pixel-ratio: 3) and (max-width: 700px) 65vw, (min-resolution: 2.5dppx) and (max-width: 700px) 80vw, (-webkit-min-device-pixel-ratio: 2.5) and (max-width: 700px) 80vw, (min-resolution: 2dppx) and (max-width: 700px) 100vw, (-webkit-min-device-pixel-ratio: 2) and (max-width: 700px) 100vw, 700px\" data-testid=\"og\"><\/picture><\/div>\n<\/div>\n<figcaption class=\"mn mo mp mq mr ms mt be b bf z dv\" data-selectable-paragraph=\"\">Morphing between two embedded images is a good example of <a class=\"af mu\" href=\"https:\/\/arxiv.org\/pdf\/1904.03189v1.pdf\" target=\"_blank\" rel=\"noopener ugc nofollow\">latent space embedding within a StyleGAN<\/a><\/figcaption>\n<\/figure>\n<p id=\"23f9\" class=\"pw-post-body-paragraph mv mw fo be b gm mx my mz gp na nb nc nd ne nf ng nh ni nj nk nl nm nn no np fh bj\" data-selectable-paragraph=\"\">Yet another algorithm aims to <em class=\"nv\">embed <\/em>a given image into the latent space of StyleGAN. Supposedly, this embedding enables semantic image editing operations that can be applied to existing photographs. 
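The general recipe behind such embeddings is to freeze the generator and optimize a latent code until the generated output matches a target image. The sketch below is a toy, self-contained illustration of that idea only: a small linear layer stands in for StyleGAN, and all names are hypothetical.

```python
import torch

torch.manual_seed(0)
generator = torch.nn.Linear(32, 64)          # stand-in for a frozen generator
for p in generator.parameters():
    p.requires_grad_(False)                  # generator weights stay fixed

target = generator(torch.randn(1, 32))       # the "photograph" we want to embed
w = torch.zeros(1, 32, requires_grad=True)   # latent code to optimize
opt = torch.optim.Adam([w], lr=0.05)

for _ in range(500):
    opt.zero_grad()
    # Reconstruction loss between the generated image and the target.
    loss = torch.nn.functional.mse_loss(generator(w), target)
    loss.backward()
    opt.step()

# The latent now approximately reproduces the target. Morphing between two
# embedded images is then just interpolating their latents:
# generator((1 - t) * w1 + t * w2)
print(loss.item())
```

In the real setting the loss typically combines pixel and perceptual terms, but the optimization loop has the same shape.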
Taking the StyleGAN trained on the FFHQ dataset as an example, researchers were able to successfully demonstrate results for image morphing, style transfer, and expression transfer.<\/p>\n<h1 id=\"fce3\" class=\"nw nx fo be ny nz oa go ob oc od gr oe of og oh oi oj ok ol om on oo op oq or bj\" data-selectable-paragraph=\"\">Conclusion<\/h1>\n<blockquote class=\"rp rq rr\"><p id=\"3884\" class=\"mv mw nv be b gm mx my mz gp na nb nc ox ne nf ng oy ni nj nk oz nm nn no np fh bj\" data-selectable-paragraph=\"\">\u201c You can\u2019t synthesize a picture out of nothing, we assume; a picture had to be of someone. Sure a scammer could appropriate someone else\u2019s picture, but doing so is a risky strategy in a world with google reverse search and so forth. So we tend to trust pictures. A business profile with a picture obviously belongs to someone. A match on a dating site may turn out to be 10 pounds heavier or 10 years older than when a picture was taken, but if there\u2019s a picture, the person obviously exists.<\/p><p id=\"4fed\" class=\"mv mw nv be b gm mx my mz gp na nb nc ox ne nf ng oy ni nj nk oz nm nn no np fh bj\" data-selectable-paragraph=\"\">No longer. New adversarial machine learning algorithms allow people to rapidly generate synthetic \u2018photographs\u2019 of people who have never existed.\u201d<\/p><p id=\"527b\" class=\"mv mw nv be b gm mx my mz gp na nb nc ox ne nf ng oy ni nj nk oz nm nn no np fh bj\" data-selectable-paragraph=\"\">\u2014 <a class=\"af mu\" href=\"https:\/\/callingbullshit.org\/\" target=\"_blank\" rel=\"noopener ugc nofollow\"><strong class=\"be qj\"><em class=\"fo\">West and Bergstrom<\/em><\/strong><\/a><\/p><\/blockquote>\n<p id=\"6ba8\" class=\"pw-post-body-paragraph mv mw fo be b gm mx my mz gp na nb nc nd ne nf ng nh ni nj nk nl nm nn no np fh bj\" data-selectable-paragraph=\"\">StyleGAN is arguably the most powerful GAN in existence. 
With the ability to generate high-resolution synthetic images from scratch, some would call its capabilities scary.<\/p>\n<p id=\"66eb\" class=\"pw-post-body-paragraph mv mw fo be b gm mx my mz gp na nb nc nd ne nf ng nh ni nj nk nl nm nn no np fh bj\" data-selectable-paragraph=\"\">Given that we live in an age where many security systems rely on measures such as facial recognition, and images form a major part of all the data on the web, it\u2019s important for people to be aware of such technology and not to trust information unquestioningly just because it comes in image form.<\/p>\n<p id=\"f351\" class=\"pw-post-body-paragraph mv mw fo be b gm mx my mz gp na nb nc nd ne nf ng nh ni nj nk nl nm nn no np fh bj\" data-selectable-paragraph=\"\">On the one hand, such advances in machine learning render specialized skills in image manipulation and engineering redundant. On the other, they open opportunities to develop skill sets specific to the machine learning domain, for which there is currently huge demand. As ML development and adoption grow more commonplace in every sector, we can expect many more such milestones soon. Stay tuned!<\/p>\n<\/div>\n<\/div>\n<\/div>\n","protected":false},"excerpt":{"rendered":"<p>Ever wondered what the 27th letter in the English alphabet might look like? Or how your appearance would be twenty years from now? Or perhaps how that super-grumpy professor of yours might look with a big, wide smile on his face? 
Thanks to machine learning, all this is not only possible, but relatively easy to [&hellip;]<\/p>\n","protected":false},"author":37,"featured_media":0,"comment_status":"closed","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"customer_name":"","customer_description":"","customer_industry":"","customer_technologies":"","customer_logo":"","footnotes":""},"categories":[6],"tags":[],"coauthors":[149],"class_list":["post-6195","post","type-post","status-publish","format-standard","hentry","category-machine-learning"],"yoast_head":"<!-- This site is optimized with the Yoast SEO Premium plugin v25.9 (Yoast SEO v25.9) - https:\/\/yoast.com\/wordpress\/plugins\/seo\/ -->\n<title>StyleGAN: Use machine learning to generate and customize realistic images - Comet<\/title>\n<meta name=\"robots\" content=\"index, follow, max-snippet:-1, max-image-preview:large, max-video-preview:-1\" \/>\n<link rel=\"canonical\" href=\"https:\/\/www.comet.com\/site\/blog\/stylegan-use-machine-learning-to-generate-and-customize-realistic-images\/\" \/>\n<meta property=\"og:locale\" content=\"en_US\" \/>\n<meta property=\"og:type\" content=\"article\" \/>\n<meta property=\"og:title\" content=\"StyleGAN: Use machine learning to generate and customize realistic images\" \/>\n<meta property=\"og:description\" content=\"Ever wondered what the 27th letter in the English alphabet might look like? Or how your appearance would be twenty years from now? Or perhaps how that super-grumpy professor of yours might look with a big, wide smile on his face? 
Thanks to machine learning, all this is not only possible, but relatively easy to [&hellip;]\" \/>\n<meta property=\"og:url\" content=\"https:\/\/www.comet.com\/site\/blog\/stylegan-use-machine-learning-to-generate-and-customize-realistic-images\/\" \/>\n<meta property=\"og:site_name\" content=\"Comet\" \/>\n<meta property=\"article:publisher\" content=\"https:\/\/www.facebook.com\/cometdotml\" \/>\n<meta property=\"article:published_time\" content=\"2023-06-15T16:07:11+00:00\" \/>\n<meta property=\"article:modified_time\" content=\"2025-04-24T17:15:25+00:00\" \/>\n<meta property=\"og:image\" content=\"https:\/\/www.comet.com\/site\/wp-content\/uploads\/2023\/06\/1GIH1Lt7xXzK4Yk9X0y9GWA-1024x768.webp\" \/>\n<meta name=\"author\" content=\"Jamshed Khan\" \/>\n<meta name=\"twitter:card\" content=\"summary_large_image\" \/>\n<meta name=\"twitter:creator\" content=\"@Cometml\" \/>\n<meta name=\"twitter:site\" content=\"@Cometml\" \/>\n<meta name=\"twitter:label1\" content=\"Written by\" \/>\n\t<meta name=\"twitter:data1\" content=\"Jamshed Khan\" \/>\n\t<meta name=\"twitter:label2\" content=\"Est. reading time\" \/>\n\t<meta name=\"twitter:data2\" content=\"16 minutes\" \/>\n<!-- \/ Yoast SEO Premium plugin. -->","yoast_head_json":{"title":"StyleGAN: Use machine learning to generate and customize realistic images - Comet","robots":{"index":"index","follow":"follow","max-snippet":"max-snippet:-1","max-image-preview":"max-image-preview:large","max-video-preview":"max-video-preview:-1"},"canonical":"https:\/\/www.comet.com\/site\/blog\/stylegan-use-machine-learning-to-generate-and-customize-realistic-images\/","og_locale":"en_US","og_type":"article","og_title":"StyleGAN: Use machine learning to generate and customize realistic images","og_description":"Ever wondered what the 27th letter in the English alphabet might look like? Or how your appearance would be twenty years from now? 
Or perhaps how that super-grumpy professor of yours might look with a big, wide smile on his face? Thanks to machine learning, all this is not only possible, but relatively easy to [&hellip;]","og_url":"https:\/\/www.comet.com\/site\/blog\/stylegan-use-machine-learning-to-generate-and-customize-realistic-images\/","og_site_name":"Comet","article_publisher":"https:\/\/www.facebook.com\/cometdotml","article_published_time":"2023-06-15T16:07:11+00:00","article_modified_time":"2025-04-24T17:15:25+00:00","og_image":[{"url":"https:\/\/www.comet.com\/site\/wp-content\/uploads\/2023\/06\/1GIH1Lt7xXzK4Yk9X0y9GWA-1024x768.webp","type":"","width":"","height":""}],"author":"Jamshed Khan","twitter_card":"summary_large_image","twitter_creator":"@Cometml","twitter_site":"@Cometml","twitter_misc":{"Written by":"Jamshed Khan","Est. reading time":"16 minutes"},"schema":{"@context":"https:\/\/schema.org","@graph":[{"@type":"Article","@id":"https:\/\/www.comet.com\/site\/blog\/stylegan-use-machine-learning-to-generate-and-customize-realistic-images\/#article","isPartOf":{"@id":"https:\/\/www.comet.com\/site\/blog\/stylegan-use-machine-learning-to-generate-and-customize-realistic-images\/"},"author":{"name":"Jamshed Khan","@id":"https:\/\/www.comet.com\/site\/#\/schema\/person\/d18d17839bcc14e70bfe02df42a19f47"},"headline":"StyleGAN: Use machine learning to generate and customize realistic images","datePublished":"2023-06-15T16:07:11+00:00","dateModified":"2025-04-24T17:15:25+00:00","mainEntityOfPage":{"@id":"https:\/\/www.comet.com\/site\/blog\/stylegan-use-machine-learning-to-generate-and-customize-realistic-images\/"},"wordCount":2689,"publisher":{"@id":"https:\/\/www.comet.com\/site\/#organization"},"image":{"@id":"https:\/\/www.comet.com\/site\/blog\/stylegan-use-machine-learning-to-generate-and-customize-realistic-images\/#primaryimage"},"thumbnailUrl":"https:\/\/www.comet.com\/site\/wp-content\/uploads\/2023\/06\/1GIH1Lt7xXzK4Yk9X0y9GWA-1024x768.webp","articleSection":["Machine 
Learning"],"inLanguage":"en-US"},{"@type":"WebPage","@id":"https:\/\/www.comet.com\/site\/blog\/stylegan-use-machine-learning-to-generate-and-customize-realistic-images\/","url":"https:\/\/www.comet.com\/site\/blog\/stylegan-use-machine-learning-to-generate-and-customize-realistic-images\/","name":"StyleGAN: Use machine learning to generate and customize realistic images - Comet","isPartOf":{"@id":"https:\/\/www.comet.com\/site\/#website"},"primaryImageOfPage":{"@id":"https:\/\/www.comet.com\/site\/blog\/stylegan-use-machine-learning-to-generate-and-customize-realistic-images\/#primaryimage"},"image":{"@id":"https:\/\/www.comet.com\/site\/blog\/stylegan-use-machine-learning-to-generate-and-customize-realistic-images\/#primaryimage"},"thumbnailUrl":"https:\/\/www.comet.com\/site\/wp-content\/uploads\/2023\/06\/1GIH1Lt7xXzK4Yk9X0y9GWA-1024x768.webp","datePublished":"2023-06-15T16:07:11+00:00","dateModified":"2025-04-24T17:15:25+00:00","breadcrumb":{"@id":"https:\/\/www.comet.com\/site\/blog\/stylegan-use-machine-learning-to-generate-and-customize-realistic-images\/#breadcrumb"},"inLanguage":"en-US","potentialAction":[{"@type":"ReadAction","target":["https:\/\/www.comet.com\/site\/blog\/stylegan-use-machine-learning-to-generate-and-customize-realistic-images\/"]}]},{"@type":"ImageObject","inLanguage":"en-US","@id":"https:\/\/www.comet.com\/site\/blog\/stylegan-use-machine-learning-to-generate-and-customize-realistic-images\/#primaryimage","url":"https:\/\/www.comet.com\/site\/wp-content\/uploads\/2023\/06\/1GIH1Lt7xXzK4Yk9X0y9GWA-1024x768.webp","contentUrl":"https:\/\/www.comet.com\/site\/wp-content\/uploads\/2023\/06\/1GIH1Lt7xXzK4Yk9X0y9GWA-1024x768.webp"},{"@type":"BreadcrumbList","@id":"https:\/\/www.comet.com\/site\/blog\/stylegan-use-machine-learning-to-generate-and-customize-realistic-images\/#breadcrumb","itemListElement":[{"@type":"ListItem","position":1,"name":"Home","item":"https:\/\/www.comet.com\/site\/"},{"@type":"ListItem","position":2,"name":"StyleGAN: 
Use machine learning to generate and customize realistic images"}]},{"@type":"WebSite","@id":"https:\/\/www.comet.com\/site\/#website","url":"https:\/\/www.comet.com\/site\/","name":"Comet","description":"Build Better Models Faster","publisher":{"@id":"https:\/\/www.comet.com\/site\/#organization"},"potentialAction":[{"@type":"SearchAction","target":{"@type":"EntryPoint","urlTemplate":"https:\/\/www.comet.com\/site\/?s={search_term_string}"},"query-input":{"@type":"PropertyValueSpecification","valueRequired":true,"valueName":"search_term_string"}}],"inLanguage":"en-US"},{"@type":"Organization","@id":"https:\/\/www.comet.com\/site\/#organization","name":"Comet ML, Inc.","alternateName":"Comet","url":"https:\/\/www.comet.com\/site\/","logo":{"@type":"ImageObject","inLanguage":"en-US","@id":"https:\/\/www.comet.com\/site\/#\/schema\/logo\/image\/","url":"https:\/\/www.comet.com\/site\/wp-content\/uploads\/2025\/01\/logo_comet_square.png","contentUrl":"https:\/\/www.comet.com\/site\/wp-content\/uploads\/2025\/01\/logo_comet_square.png","width":310,"height":310,"caption":"Comet ML, Inc."},"image":{"@id":"https:\/\/www.comet.com\/site\/#\/schema\/logo\/image\/"},"sameAs":["https:\/\/www.facebook.com\/cometdotml","https:\/\/x.com\/Cometml","https:\/\/www.youtube.com\/channel\/UCmN63HKvfXSCS-UwVwmK8Hw"]},{"@type":"Person","@id":"https:\/\/www.comet.com\/site\/#\/schema\/person\/d18d17839bcc14e70bfe02df42a19f47","name":"Jamshed Khan","image":{"@type":"ImageObject","inLanguage":"en-US","@id":"https:\/\/www.comet.com\/site\/#\/schema\/person\/image\/27af8c26f0585e3b34635f070418a865","url":"https:\/\/www.comet.com\/site\/wp-content\/uploads\/2023\/08\/1561664362839-96x96.jpg","contentUrl":"https:\/\/www.comet.com\/site\/wp-content\/uploads\/2023\/08\/1561664362839-96x96.jpg","caption":"Jamshed 
Khan"},"url":"https:\/\/www.comet.com\/site\/blog\/author\/jamshedkhan\/"}]}},"_links":{"self":[{"href":"https:\/\/www.comet.com\/site\/wp-json\/wp\/v2\/posts\/6195","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/www.comet.com\/site\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/www.comet.com\/site\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/www.comet.com\/site\/wp-json\/wp\/v2\/users\/37"}],"replies":[{"embeddable":true,"href":"https:\/\/www.comet.com\/site\/wp-json\/wp\/v2\/comments?post=6195"}],"version-history":[{"count":1,"href":"https:\/\/www.comet.com\/site\/wp-json\/wp\/v2\/posts\/6195\/revisions"}],"predecessor-version":[{"id":15615,"href":"https:\/\/www.comet.com\/site\/wp-json\/wp\/v2\/posts\/6195\/revisions\/15615"}],"wp:attachment":[{"href":"https:\/\/www.comet.com\/site\/wp-json\/wp\/v2\/media?parent=6195"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/www.comet.com\/site\/wp-json\/wp\/v2\/categories?post=6195"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/www.comet.com\/site\/wp-json\/wp\/v2\/tags?post=6195"},{"taxonomy":"author","embeddable":true,"href":"https:\/\/www.comet.com\/site\/wp-json\/wp\/v2\/coauthors?post=6195"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}