After ChatGPT, Microsoft working on AI model that takes images as cues

By IANS | Updated: March 3, 2023 18:50 IST

New Delhi, March 3: As the war over artificial intelligence (AI) chatbots heats up, Microsoft has unveiled Kosmos-1, a new AI model that can respond to visual cues or images in addition to text prompts or messages.

The multimodal large language model (MLLM) can help with an array of new tasks, including image captioning, visual question answering and more.

Kosmos-1 could pave the way for the next stage beyond ChatGPT's text prompts.

"A big convergence of language, multimodal perception, action, and world modeling is a key step toward artificial general intelligence. In this work, we introduce Kosmos-1, a Multimodal Large Language Model (MLLM) that can perceive general modalities, learn in context and follow instructions," said Microsoft's AI researchers in a paper.

The paper suggests that multimodal perception, or knowledge acquisition and "grounding" in the real world, is needed to move beyond ChatGPT-like capabilities to artificial general intelligence (AGI), reports ZDNet.

"More importantly, unlocking multimodal input greatly widens the applications of language models to more high-value areas, such as multimodal machine learning, document intelligence, and robotics," the paper read.

The goal is to align perception with LLMs, so that the models are able to see and talk.

Experimental results showed that Kosmos-1 achieves impressive performance on language understanding, generation, and even OCR-free NLP, in which the model is fed document images directly.

It also showed good results in perception-language tasks, including multimodal dialogue, image captioning, and visual question answering, as well as in vision tasks such as image recognition with descriptions (specifying classification via text instructions).

Disclaimer: This post has been auto-published from an agency feed without any modifications to the text and has not been reviewed by an editor

Tags: Microsoft, AGI