Google’s launch of Gemini can be seen as the latest advancement in generative AI, highlighting a shift toward multimodality.
At launch, ChatGPT (GPT-3.5) revolutionized content production, and subsequent large multimodal models (LMMs) like GPT-4 and Gemini have the potential to revolutionize sectors such as manufacturing, e-commerce, and agriculture.
These new LMMs are trained on images and code, rather than on text alone. Gemini adds audio and video, allowing the AI to directly perceive the physical world.
The race is on among tech companies and open source communities to add new modalities that enhance LMMs’ industrial applications.
Such multimodal capability will be transformational for industry, says Leonid Zhukov, director of the BCG Global AI Institute.
Traditional AI is constrained by preset rules—users decide what they want the AI to do and train it for that task. While GenAI models break free from this constraint, LMMs go even further. They can take in so many forms of data that they could respond to seemingly unlimited situations in the physical world, including those that users can’t predict, Zhukov explains.
Companies’ current 10-20% efficiency gains from GenAI bots could expand into new domains with LMMs, he says.
And this is just the beginning. “Today’s LMMs can see and hear the world. Tomorrow they could also be trained on digital signals from equipment, IoT sensors, or customer transaction data—to create a complete picture of your enterprise’s health on its own, without explicit instruction,” Zhukov says.
Here are just a few potential industrial applications:
Firms need to prepare to integrate multimodal models. According to Zhukov, leaders should:
BCG X is the tech build & design unit of BCG.
Turbocharging BCG’s deep industry and functional expertise, BCG X brings together advanced tech knowledge and ambitious entrepreneurship to help organizations enable innovation at scale.
With nearly 3,000 technologists, scientists, programmers, engineers, and human-centered designers located across 80+ cities, BCG X builds and designs platforms and software to address the world’s most important challenges and opportunities.
Teaming across our practices, and in close collaboration with our clients, our end-to-end global team unlocks new possibilities. Together we’re creating the bold and disruptive products, services, and businesses of tomorrow.
The BCG Henderson Institute is Boston Consulting Group’s strategy think tank, dedicated to exploring and developing valuable new insights from business, technology, and science by embracing the powerful technology of ideas. The Institute engages leaders in provocative discussion and experimentation to expand the boundaries of business theory and practice and to translate innovative ideas from within and beyond business. For more ideas and inspiration from the Institute, please visit our website and follow us on LinkedIn and X (formerly Twitter).