Concept: American multinational technology company Nvidia has developed an AI model called GauGAN2 that creates landscape images from a mix of words and drawings. Users type a phrase and, according to Nvidia, the model generates the scene in real time. The deep learning model also lets users add adjectives or swap certain words to modify the picture instantly.
Nature of Disruption: GauGAN2 combines segmentation mapping, text-to-image generation, and inpainting in a single model. Artists can use text, a paintbrush tool, a paint bucket tool, or any combination of these to design their own landscapes. The platform’s style transfer algorithm lets creators apply filters that turn a photorealistic image into a painting or a daytime scene into a sunset. The model is powered by generative adversarial networks (GANs), and its text-to-image capability can be accessed through the NVIDIA AI Demos site. Users can also upload their own filters to layer onto the image, or upload custom segmentation maps and landscape images as a foundation for their artwork. The company claims that GauGAN2’s AI model was trained on 10 million high-quality landscape photographs using the NVIDIA Selene supercomputer.
Outlook: Nvidia claims that GauGAN2’s neural network produces a greater variety and higher quality of images than state-of-the-art models built specifically for text-to-image or segmentation map-to-image applications. The research model points to powerful image-generation tools that could soon be available to artists. Nvidia states that the previous version, GauGAN, is already being used to create concept art for films and video games. The company plans to make the GauGAN2 code available on GitHub while providing an interactive demo on Playground, the web hub for Nvidia’s AI and deep learning research.
This article was originally published in Verdict.co.uk