AI models are terrible at betting on soccer, especially xAI Grok
Briefly

"Every frontier model we evaluated lost money over the season and many experienced ruin, with the AI systematically underperforming humans in this scenario."
"There is so much hype about AI automation, but there's not a lot of measurement of putting AI into a long time horizon setting."
"If you... try AI on some real-world tasks, it does really badly... Yes, software engineering is very important and economically valuable, but there are lots of other activities with longer time horizons that are important to look at."
Every frontier AI model evaluated lost money over the season, and many experienced ruin. According to the study, AI systematically underperformed humans in this setting. The results may comfort professionals worried about AI job displacement. Many AI benchmarks are flawed because they are set in static environments, and real-world tasks reveal how poorly AI can perform. The study serves as a counterweight to Silicon Valley's excitement over AI's recent advances in programming tasks.
Read at Ars Technica