"The exponential growth of scientific literature presents an increasingly acute challenge across disciplines. Hundreds of thousands of new chemical reactions are reported annually, yet translating them into actionable experiments becomes an obstacle1,2. Recent applications of large language models (LLMs) have shown promise3,4,5,6, but systems that reliably work for diverse transformations across de novo compounds have remained elusive. Here we introduce MOSAIC (Multiple Optimized Specialists for AI-assisted Chemical Prediction), a computational framework that enables chemists to harness the collective knowledge of millions of reaction protocols."
"Here we introduce MOSAIC (Multiple Optimized Specialists for AI-assisted Chemical Prediction), a computational framework that enables chemists to harness the collective knowledge of millions of reaction protocols. MOSAIC is built upon the Llama-3.1-8B-instruct architecture7, training 2,498 specialized chemical experts within Voronoi-clustered spaces. This approach delivers reproducible and executable experimental protocols with confidence metrics for complex syntheses. With an overall 71% success rate, experimental validation demonstrates the realizations of over 35 novel compounds, spanning pharmaceuticals, materials, agrochemicals, and cosmetics."
MOSAIC draws on millions of reaction protocols and builds on the Llama-3.1-8B-instruct architecture, training 2,498 specialized chemical experts within Voronoi-clustered spaces. The specialists generate reproducible, executable experimental procedures and attach confidence metrics to guide complex syntheses. Experimental validation achieved a 71% overall success rate and realized more than 35 novel compounds across pharmaceuticals, materials, agrochemicals, and cosmetics. MOSAIC aims to translate the rapidly growing chemical literature into actionable laboratory protocols and to handle diverse transformations across de novo compounds, a setting in which earlier LLM-based systems have not been reliable.
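The idea of specialists living in Voronoi-clustered spaces can be illustrated with a small sketch. A Voronoi cell is the set of points closer to one centroid than to any other, so assigning a query to a cell amounts to a nearest-centroid lookup. The code below is only an illustration under stated assumptions: the function name `route_to_specialist`, the toy 2-D centroids, and the `margin` score are hypothetical and are not taken from MOSAIC, whose embedding, clustering, and confidence metrics are not described here.

```python
import math


def route_to_specialist(embedding, centroids):
    """Assign an embedding to its Voronoi cell (the nearest centroid).

    Returns (cell_index, margin). `margin` is a hypothetical score, not
    MOSAIC's confidence metric: 0.0 on a cell boundary, 1.0 at a centroid.
    """
    if len(centroids) < 2:
        raise ValueError("need at least two centroids")
    # Distance from the query to every centroid, sorted ascending.
    ranked = sorted(
        (math.dist(embedding, centroid), index)
        for index, centroid in enumerate(centroids)
    )
    (nearest_dist, cell_index), (second_dist, _) = ranked[0], ranked[1]
    if second_dist == 0.0:
        return cell_index, 0.0  # duplicate centroids: assignment is ambiguous
    return cell_index, 1.0 - nearest_dist / second_dist


# Toy 2-D centroids standing in for learned reaction-embedding clusters.
centroids = [(0.0, 0.0), (4.0, 0.0), (0.0, 4.0)]
cell, margin = route_to_specialist((1.0, 0.0), centroids)
print(cell, round(margin, 3))  # prints: 0 0.667
```

In a system of this kind, the returned cell index would select which specialist model handles the query. A query near a cell boundary gets a low margin, which is one simple way to flag predictions that deserve less trust.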
#cheminformatics #chemical-synthesis #large-language-models #automated-experimental-protocols #voronoi-clustering
Read at www.nature.com