We let OpenAI's "Agent Mode" surf the web for us: here's what happened
Briefly

"But Atlas also goes beyond the usual LLM back-and-forth with Agent Mode, a 'preview mode' feature the company says can 'get work done for you' by clicking, scrolling, and reading through various tabs. 'Agentic' AI is far from new, of course; OpenAI itself rolled out a preview of the web browsing Operator agent in January"
"I wanted to put Atlas' Agent Mode through its paces to see if it could really save me time in doing the kinds of tedious online tasks I plod through every day. In each case, I'll outline a web-based problem, lay out the Agent Mode prompt I devised to try to solve it, and describe the results. My final evaluation will rank each task on a 10-point scale, with 10 being 'did exactly what I wanted with no problems' and one being 'complete failure.'"
Atlas is a web browser integrated with ChatGPT. Its Agent Mode, a preview feature, can click, scroll, and read across tabs to complete web tasks. OpenAI previously released Operator and ChatGPT agents, and Atlas puts these agentic capabilities in front of end users. The article tests Agent Mode on routine online tasks, such as playing web games and handling tedious workflows, and scores each task on a 10-point scale, from 1 (complete failure) to 10 (did exactly what the author wanted with no problems). The initial experiments assess whether Agent Mode can interpret page content, act reliably, and save time on dynamic web interactions.
Read at Ars Technica