Microsoft has introduced Magma, a generative AI model designed to act autonomously in both software and physical robots, taking text, images, and video as input. Beyond interpreting information across these formats, Magma can operate user interfaces and manipulate objects directly. Microsoft plans to release parts of Magma’s code on GitHub next week so that researchers can experiment with the model and contribute to its development.
Magma is a generative AI model from Microsoft that can control software and robots, acting autonomously on multimodal input such as text and images.
Because it interprets several forms of data, the model can manage both digital interfaces and physical objects.
Next week, Microsoft plans to release portions of Magma’s code on GitHub so that researchers can experiment with and build on the technology.