The researchers say they were interested in understanding what capacities large language models (LLMs) can bring to the scientific endeavor. So all of the AI systems used in this work are LLMs, mostly GPT-3.5 and GPT-4, although some others, Claude 1.3 and Falcon-40B-Instruct, were tested as well. (GPT-4 and Claude 1.3 performed the best.) But rather than using a single system to handle all aspects of the chemistry, the researchers set up distinct instances to cooperate in a division-of-labor arrangement and called it “Coscientist.”

The three systems they used are a Web searcher, a Documentation searcher (with access to lab equipment), and a Planner (which plans and gives commands to the others).
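To make that division of labor concrete, here is a minimal sketch of a planner handing commands to two searcher modules. It is an illustration only, not the researchers' code: the class name, the function names, the module keys, and the canned replies are all assumptions, and the searchers are stub functions rather than LLM instances.

```python
from typing import Callable, Dict, List, Tuple


# Hypothetical stand-ins for the two searcher modules. In a real system each
# would be backed by its own LLM instance; here they return canned strings.
def web_searcher(query: str) -> str:
    return f"[web results for: {query}]"


def documentation_searcher(query: str) -> str:
    return f"[documentation excerpts for: {query}]"


class Planner:
    """Toy planner: sends commands to named modules and logs the replies."""

    def __init__(self, modules: Dict[str, Callable[[str], str]]) -> None:
        self.modules = modules
        # Each entry is (module name, command sent, reply received).
        self.history: List[Tuple[str, str, str]] = []

    def dispatch(self, module_name: str, command: str) -> str:
        """Send one command to one module and record the exchange."""
        if module_name not in self.modules:
            raise ValueError(f"unknown module: {module_name}")
        reply = self.modules[module_name](command)
        self.history.append((module_name, command, reply))
        return reply

    def run(self, plan: List[Tuple[str, str]]) -> List[str]:
        """Execute a list of (module name, command) steps in order."""
        return [self.dispatch(name, command) for name, command in plan]


planner = Planner({"web": web_searcher, "docs": documentation_searcher})
replies = planner.run([
    ("web", "synthesis route for a target compound"),
    ("docs", "lab equipment commands"),
])
for reply in replies:
    print(reply)
```

The point of the sketch is the routing: only the planner decides which module to call and in what order, while each module handles one narrow job.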

The system received a prompt to help it understand how to identify chemicals. After that, and after making a few mistakes, it was able to synthesize what it was tasked with.