#ui-automation

from Computerworld
3 days ago

Microsoft's Fara-7B brings AI agents to the PC with on-device automation

"Unlike traditional chat models that generate text-based responses, Computer Use Agent (CUA) models like Fara-7B leverage computer interfaces, such as a mouse and keyboard, to complete tasks on behalf of users," Microsoft said in a blog post. "With only 7 billion parameters, Fara-7B achieves state-of-the-art performance within its size class and is competitive with larger, more resource-intensive agentic systems that depend on prompting multiple large models."
Software development
Artificial intelligence
from InfoQ
1 month ago

Google DeepMind Launches Gemini 2.5 Computer Use Model to Power UI-Controlling AI Agents

Gemini 2.5 Computer Use enables AI agents to perceive and manipulate graphical user interfaces (clicking, typing, scrolling) via a looped screenshot-and-action API, and it shows strong benchmark performance.
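
The "looped screenshot-and-action" pattern mentioned in this summary can be illustrated with a minimal sketch. This is not the Gemini API: `Action`, `run_agent_loop`, and the three callbacks are hypothetical names invented for illustration, and the "model" here is a scripted stub standing in for a real computer-use model.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Action:
    """Hypothetical UI action; kind is "click", "type", "scroll", or "done"."""
    kind: str
    payload: str = ""


def run_agent_loop(
    take_screenshot: Callable[[], bytes],
    choose_action: Callable[[bytes, List[Action]], Action],
    perform: Callable[[Action], None],
    max_steps: int = 10,
) -> List[Action]:
    """Capture the screen, ask the model for one action, perform it, repeat."""
    history: List[Action] = []
    for _ in range(max_steps):
        screenshot = take_screenshot()               # perceive the current UI state
        action = choose_action(screenshot, history)  # model proposes the next step
        if action.kind == "done":                    # model signals task completion
            break
        perform(action)                              # act on the UI
        history.append(action)
    return history


def demo():
    """Drive a fake one-field form with a scripted stand-in for the model."""
    screen = {"text": ""}

    def take_screenshot() -> bytes:
        # A real agent would capture pixels; the fake UI just exposes its text.
        return screen["text"].encode()

    def choose_action(shot: bytes, history: List[Action]) -> Action:
        # Scripted policy (assumption): click the field, type a name, then stop.
        if not history:
            return Action("click", "name_field")
        if len(history) == 1:
            return Action("type", "Ada")
        return Action("done")

    def perform(action: Action) -> None:
        if action.kind == "type":
            screen["text"] += action.payload

    history = run_agent_loop(take_screenshot, choose_action, perform)
    return history, screen["text"]
```

The `max_steps` cap is a design choice in this sketch so that a model that never returns "done" cannot loop indefinitely.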
from The Verge
1 month ago

Google's latest AI model uses a web browser like you do

Google is previewing a new Gemini AI model designed to navigate and interact with the web via a browser, letting AI agents do things inside interfaces designed for use by people and not robots. The model, called Gemini 2.5 Computer Use, uses "visual understanding and reasoning capabilities" to analyze a user's request and carry out a task, such as filling out and submitting a form. It can be used for UI testing or for navigating interfaces built for people, where no API or other direct connection is available.
Artificial intelligence