#ui-automation

[ follow ]
Artificial intelligence
fromInfoQ
5 days ago

Google DeepMind Launches Gemini 2.5 Computer Use Model to Power UI-Controlling AI Agents

Gemini 2.5 Computer Use enables AI agents to perceive and manipulate graphical user interfaces—clicking, typing, scrolling—via a looped screenshot-and-action API, showing strong benchmark performance.
fromThe Verge
6 days ago

Google's latest AI model uses a web browser like you do

Google is previewing a new Gemini AI model designed to navigate and interact with the web via a browser, letting AI agents do things inside interfaces designed for use by people and not robots. The model, called Gemini 2.5 Computer Use, uses "visual understanding and reasoning capabilities" to analyze a user's request and carry out a task, such as filling out and submitting a form. It can be used for UI testing or navigating interfaces made for people who don't have an API or other direct connection available.
Artificial intelligence
[ Load more ]