December 15, 2023
It may seem like generative AI has replaced the metaverse in the hearts and minds of technology enthusiasts, but there is a sort of symbiotic relationship between the two technologies: AI is being increasingly integrated into metaverse applications, while XR will serve - as Microsoft exec Lili Cheng put it - as “the eyes and ears of AI.”
Moreover, advancements in artificial intelligence (AI) support advancements in XR hardware, allowing for ever lighter, more capable, and more efficient devices. AI is also vital for developing and scaling the metaverse into what we all envision it to be: A shared, persistent virtual world that responds to you, the user, and recognizes your surroundings.
Here are three ways AI is already impacting the (enterprise) metaverse:
A quick refresher on AI technology
Artificial Intelligence (AI) is a branch of computer science concerned with training machines to mimic human intelligence in order to perform tasks ranging from simple perception to complex problem solving and reasoning. AI systems learn to simulate human cognitive functions by analyzing large amounts of data, looking for patterns and creating rules or algorithms to inform decisions. The most popular and familiar applications of AI are OpenAI’s DALL-E text-to-image tool and ChatGPT. Other examples include voice assistants like Alexa and Siri, predictive text, facial recognition, and self-driving cars.
Machine learning (ML) is a subset of AI that focuses on developing algorithms that improve their own performance through experience. ML models can learn from data and make decisions or predictions without being explicitly programmed to do so. Deep learning is a subset of ML that uses artificial neural networks (ANNs) inspired by the structure of the human brain to learn from enormous data sets.
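To make the "learning from data" idea concrete, here is a toy sketch: instead of hand-coding the rule y = 2x, a one-parameter model discovers it from examples via gradient descent. The data, learning rate, and iteration count are made-up placeholders for illustration, not a real training setup.

```python
# Toy illustration of machine learning: fit y ≈ w * x from example
# data by gradient descent, rather than programming the rule directly.
data = [(1, 2), (2, 4), (3, 6)]   # examples generated by the rule y = 2x

w = 0.0                            # the model's single learned parameter
for _ in range(200):               # repeatedly "experience" the data
    for x, y in data:
        error = w * x - y          # how wrong the current prediction is
        w -= 0.05 * error * x      # nudge w to reduce that error

# After training, w is close to 2.0 - the rule was never explicitly coded.
```

The same loop, scaled up to millions of parameters and examples, is the core of how deep learning systems are trained.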
A few more terms relevant to this article: Generative AI (GenAI) is AI that can generate new, original content such as images, text, and even code, whereas conversational AI can simulate natural human conversation.
Generating XR Content with AI
Some people are using GenAI to generate marketing copy; others use it to generate art in various styles, but as the technology advances it also has the potential to help populate the metaverse with 3D content, particularly user-generated 3D content. (What would the internet be today, after all, without user-generated content?)
Creating 3D content such as a digital twin of a machine or a virtual store with virtual goods is challenging. Until recently, enterprises have found photorealistic metaverse apps like virtual training simulations to be expensive and time-consuming to develop, typically requiring special talent. For these apps to proliferate and the metaverse to succeed, we need to both accelerate and democratize 3D content creation. Regular people (i.e. non-developers) need to be able to quickly create and populate virtual environments.
The ‘dream’ is to be able to speak 3D assets, avatars, and even entire virtual worlds into existence, but GenAI is already making inroads in the creation of 3D objects from text and 2D image prompts. Google’s DreamFusion AI, for instance, is capable of turning text like “I need an office chair with two armrests and a cushion” into a 3D model; and NVIDIA offers enterprise tools for turning 2D images into 3D assets for virtual training simulations.
AI will also help bring this generated 3D content to life, adding things like texture to virtual objects so they appear and behave realistically and users can interact with them naturally. Animators and riggers today use AI-powered engines and toolkits to speed up and improve the accuracy of their work. It’s only a matter of time before AI can animate and rig complex 3D models on the fly. (Rigging is the process of creating a skeleton for a 3D model, such as an avatar, so it can move realistically.)
Training with AI in the Metaverse
While GenAI can help speed up the creation of virtual training sims and meeting rooms, conversational AI increases the realism of training with avatars in virtual environments. ChatGPT is already improving role-playing in VR, allowing users to practice for job interviews, sales pitches, and more with avatars capable of responding to their voice and providing personalized feedback.
The combo of VR and AI for training is a powerful one. We know the myriad benefits of VR training: Greater engagement and retention, the ability to practice skills over and over in a risk-free environment, etc. AI can help tailor the experience to the user, providing instructions when needed and altering the scenario in real time based on the user’s actions and responses.
The integration of ChatGPT in particular makes it possible to conduct natural two-way conversations with avatar versions of interviewers, employees, and clients, further enhancing engagement and increasing the immersiveness and thus efficacy of the virtual training scenario. Imagine: instead of asking pre-programmed interview questions, the avatar asks questions and follow-up questions tailored to your specific role, job experience, and live responses - perhaps in multiple languages! As conversational AI advances, such applications will only get more sophisticated.
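The interview role-play described above boils down to a conversation loop: the candidate's spoken answer is appended to a running message history, and a language model generates the avatar's next question from that history. Below is a minimal sketch of that loop. The `ask_llm` function is a stand-in - a real application would send the history to a chat API such as OpenAI's chat completions - and the system prompt and stubbed reply are invented for illustration so the example runs offline.

```python
# Sketch of an AI-driven interview role-play loop. `ask_llm` is a stub
# standing in for a real LLM chat API call; it fakes a follow-up
# question based on the candidate's last answer.

SYSTEM_PROMPT = (
    "You are an interviewer for a sales manager role. Ask one question "
    "at a time and follow up on the candidate's previous answer."
)

def ask_llm(messages):
    # Stub: a real implementation would send `messages` to a chat model
    # and return its reply text.
    last_answer = messages[-1]["content"]
    return f"Interesting - can you give a concrete example of that? You said: {last_answer}"

def interview_turn(history, candidate_answer):
    """Append the candidate's answer and get the avatar's next question."""
    history.append({"role": "user", "content": candidate_answer})
    question = ask_llm(history)
    history.append({"role": "assistant", "content": question})
    return question

history = [{"role": "system", "content": SYSTEM_PROMPT}]
q = interview_turn(history, "I led a team of five account executives.")
```

In a VR training app, the candidate's answer would come from speech-to-text and the returned question would be voiced by the avatar via text-to-speech; the growing `history` list is what lets the avatar's follow-ups build on earlier answers.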
Beyond conversational AI, organizations are exploring AI to analyze new hires’ progress, predict their needs, compile reports on onboarding and training trends, and more. And then there are virtual (AI) assistants in general. Multimodal AI that understands text and audio as well as objects, drawings, hand gestures, etc. is on the rise. Meta’s AI assistant is now capable of seeing and hearing through Meta’s Ray-Ban smart glasses in order to identify objects and translate languages - it’s not hard to imagine using the tech to support workers.
Comfort and Performance
You can think of AI and XR as highly compatible technologies. XR may be the more dependent one in the relationship, but when the two work together we can achieve increasingly comfortable and seamless immersive experiences.
Metaverse applications require a lot of computing power. AI helps XR devices process data and render content using less power. One example is foveated rendering, which uses eye tracking and AI to render the area you’re looking at in full detail while reducing detail in your peripheral vision, thus cutting the device’s rendering workload.
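The core idea of foveated rendering can be sketched in a few lines: given the tracked gaze point, each region of the screen is assigned a shading rate that falls off with distance from the fovea. This is an illustrative toy, not a real engine API - the radius and falloff numbers are placeholders, and production systems apply this on the GPU over tiles, not per pixel in Python.

```python
import math

def shading_rate(px, py, gaze_x, gaze_y, fovea_radius=100.0):
    """Fraction of full resolution at which to render a pixel, based on
    its distance (in pixels) from the tracked gaze point. Full rate
    inside the foveal radius; reduced rate in the periphery, which is
    where the GPU savings come from."""
    dist = math.hypot(px - gaze_x, py - gaze_y)
    if dist <= fovea_radius:
        return 1.0
    # Simple falloff: halve the rate for each additional fovea radius,
    # clamped to a floor of 1/8 rate. These numbers are made up.
    return max(0.125, 0.5 ** ((dist - fovea_radius) / fovea_radius))
```

Because the eye only resolves fine detail in a few degrees around the gaze point, the reduced peripheral detail is largely imperceptible to the wearer - the AI's job is the fast, accurate eye tracking that keeps the high-detail region locked to where you're actually looking.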
Think about it: For you to realistically experience the virtual world through an XR headset, it needs to be able to perceive, understand, and track your physical environment. Spatial mapping, object recognition, hand tracking, etc. - all powered by AI algorithms. AI can even anticipate where you’ll turn your head next, improving tracking accuracy, accelerating rendering, and saving power.
AI also enables better interactions and navigation in the metaverse. For one, AI is behind voice and gesture recognition, freeing up our hands (from controllers) in the metaverse. Apple is already conditioning consumers to use subtle gesture controls - like the double tap on its latest Apple Watch - to engage with technology. (AI helps to process the underlying sensor signals.) Advances in artificial intelligence will make our interactions with virtual content feel more and more natural.
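As a concrete (and deliberately simplified) picture of gesture recognition, here is a toy double-tap detector: it looks for two spikes in a 1-D motion-sensor signal separated by a plausible gap. The thresholds and window sizes are invented for the example - real wearables fuse accelerometer, gyroscope, and other sensor streams and classify gestures with learned models rather than hand-tuned rules.

```python
def detect_double_tap(samples, threshold=2.0, min_gap=5, max_gap=30):
    """Detect a 'double tap' in a 1-D motion-sensor signal: two spikes
    above `threshold` separated by between min_gap and max_gap samples.
    All numbers here are made-up placeholders for illustration."""
    peaks = [i for i, v in enumerate(samples) if v > threshold]
    # Collapse runs of consecutive above-threshold samples into single events.
    events = [p for j, p in enumerate(peaks) if j == 0 or p - peaks[j - 1] > 1]
    for a, b in zip(events, events[1:]):
        if min_gap <= b - a <= max_gap:
            return True
    return False

# Synthetic signal: two spikes 13 samples apart - a double tap.
signal = [0.1] * 10 + [3.0] + [0.1] * 12 + [2.8] + [0.1] * 10
```

The appeal of the hand-tuned version breaks down quickly in the real world (taps vary by user, wrist position, and motion noise), which is exactly why gesture recognition on devices like smartwatches and headsets leans on machine learning instead.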