Generative Experiences: Race to the Future

What is Generative AI?

Generative AI is a branch of artificial intelligence that uses machine learning to produce new content (e.g. artwork, video, 3D worlds). A model learns to recognize patterns in its training data (e.g. a cat has two eyes, one nose, whiskers, and pointy ears) and then creates content relevant to the input it receives (e.g. "cat"). The technology is now developing rapidly, both in business and in general public use, and it has become experiential. Let’s take a look at four types of generative experiences and their future implications.

4 types of Generative Experiences

Text-to-image

Generative AI first drew wide attention through the buzz around deepfakes. Generating an image with a celebrity’s likeness is definitely attention-grabbing, but text-to-image AI has many other uses. At Iterate.ai, we’ve been using platforms like Midjourney and DALL-E to create images for marketing content and sales decks. In fact, the title image for this blog post was created using Midjourney. All it takes is arranging the right chain of adjectives and nouns in a prompt to render an image that suits our marketing materials. For us, generative AI has made content creation faster, easier, and more exciting.

Text-to-video

Text-to-video generative AI is now emerging, and two prominent tech giants, Meta and Google, are unveiling these experiences at nearly the same time. Meta first announced its Make-A-Video platform in late September of this year. The videos this AI produces are short and silent, and they have a distinctly uncanny-valley feel. Although the platform is not yet available to users, it is fascinating to see how quickly this technology is advancing. Soon after Meta’s announcement, Google demonstrated two text-to-video AI systems: Imagen Video and Phenaki. Google has been focusing its generative AI efforts on producing longer, higher-quality videos, and, as with Meta’s prototype, the results are both captivating and unsettling.

Text-to-sound

AI text-to-voice systems like Resemble.ai and Play.ht have been around for a while, but Meta is taking the idea a step further with a text-to-sound platform that generates a highly realistic audio file from a text prompt such as “whistling with wind blowing.” Google also has skin in this game with AudioLM, an AI system that can create natural-sounding music and speech when prompted with a few seconds of audio. On the open-source side, Harmonai is a community-driven organization that offers generative audio tools for music production, most notably its Dance Diffusion model. From synthetic voices to music to soothing sounds, these generative experiences are being prototyped at a rapid pace.

VR, 3D, and Metaverse

We are also starting to see generative experiences merge into the VR, 3D, and Metaverse spaces. Here, platforms like Blender make it possible for users to create their own 3D models and virtual worlds with free, open-source software. Generative AI systems once known for their images, like Stable Diffusion, are now being used to create VR worlds that users can move through freely. NVIDIA’s team is also working on generative 3D experiences with its GET3D model, which is trained on 2D images and can generate detailed 3D shapes. With VR and 3D models being generated by AI, we are even more convinced that the Metaverse will be procedurally generated. We can see all of these generative tools being used deliberately to enrich users’ Metaverse experiences.

Future Implications

As generative AI spreads across images, videos, sounds, VR, 3D, and the Metaverse, there are countless possibilities for how we will experience art, music, gaming, and digital realities. Will our future generative experiences be led by today’s tech giants, or will a new generation of tech giants emerge? Will generative AI be completely open-sourced for the general public, or will it become a tool that only companies use? And what happens when the AI becomes so well trained that it is impossible to tell whether an image or video is real or AI-generated? Meta has already addressed that last concern by stating that it plans to watermark all of its generated content so that viewers know, for example, that a clip is not a real captured video.

Generative experiences are also projected to create great economic value in the near future, potentially trillions of dollars’ worth. Sequoia Capital, an extremely successful VC firm, asserts that generative AI is our future and has the potential to change every industry that requires humans to create original content. Some investors firmly believe that generative AI is the next transformative technological shift, like the smartphone or the World Wide Web. We are keeping a close eye on these rapid developments in generative experiences.

Connect with us here to continue the conversation about generative experiences and how they can benefit your organization.

Carolina Mozee