On Wednesday, Microsoft Research introduced Magma, an AI foundation model that combines visual and language processing to control software interfaces and robotic systems. According to Microsoft, Magma is the first AI model that not only processes multimodal data such as text, images, and video but can also natively act on it, integrating perception and control within a single framework. This step toward agentic AI means the model can autonomously formulate plans and execute multistep tasks on a human's behalf, placing it in the same emerging landscape as agent projects like OpenAI's Operator. Microsoft developed the model in collaboration with researchers from several universities.
Given a described goal, Magma can formulate plans and execute actions to achieve it. By transferring knowledge from freely available visual and language data, Magma bridges verbal, spatial, and temporal intelligence to navigate complex tasks and settings.
Unlike many prior multimodal AI systems that require separate models for perception and control, Magma integrates these abilities into a single foundation model.