Water company spins out homegrown AI after LLMs failed it
Briefly

"We were debating between carbon cloth and cast carbon electrodes. Not being PhDs in the space, we read relevant academic papers and used LLMs like Grok and ChatGPT to validate our findings. We chose carbon cloth, which is heavily used in academic papers like the Stanford dissertation we based our initial prototypes on, due to commercial availability."
"While we were not solely relying on LLMs, they did influence our research meaningfully. LLMs chose statistics from various papers and fields (such as citing the lifespan of a carbon electrode in a capacitor) and put them together in ways that were plausible enough. Ultimately, we spent four months and $200,000 validating this material would not in fact work past pilot scale; cast carbon electrodes would be superior."
"They were confidently wrong in ways that cost us months. That material turned out to have issues that didn't exist for cast carbon electrodes, including poor conductivity, water retention that affected ion removal, and poor durability."
Waterline Development, a water desalination startup, ran into significant problems when it leaned on large language models for materials science research. The company was developing a water-battery-based desalination product and had to choose between carbon cloth and cast carbon electrodes. After reading academic papers and using LLMs such as Grok and ChatGPT to validate its findings, the team selected carbon cloth, which is widely used in academic work and commercially available. The material turned out to have poor conductivity, water retention that affected ion removal, and poor durability, none of which were issues for cast carbon electrodes. The LLMs had combined statistics from various papers and fields in ways that were plausible but ultimately misleading, and the company spent four months and $200,000 establishing that carbon cloth would not work past pilot scale and that cast carbon electrodes would be superior. The experience suggests that commercial AI models are poorly suited to multidisciplinary research that requires synthesis across different fields.
Read at The Register