Abstract
Materials discovery requires navigating vast chemical and structural spaces while satisfying multiple, often conflicting, objectives. We present LLEMA (LLM-guided Evolution for MAterials design), a unified framework that couples the scientific knowledge embedded in large language models with chemistry-informed evolutionary rules and memory-based refinement. At each iteration, an LLM proposes crystallographically specified candidates under explicit property constraints; a surrogate-augmented oracle estimates physicochemical properties; and a multi-objective scorer updates success/failure memories to guide subsequent generations. Evaluated on 14 realistic tasks spanning electronics, energy, coatings, optics, and aerospace, LLEMA discovers candidates that are chemically plausible, thermodynamically stable, and property-aligned, achieving higher hit-rates and stronger Pareto fronts than generative and LLM-only baselines. Ablation studies confirm the importance of rule-guided generation, memory-based refinement, and surrogate prediction. By enforcing synthesizability and multi-objective trade-offs, LLEMA delivers a principled pathway to accelerate practical materials discovery.
Novel Materials Discovery Benchmark
Overview of the 14 comprehensive materials discovery tasks and their associated properties.
We introduce a new benchmark for multi-objective materials discovery, evaluating LLEMA on 14 comprehensive materials discovery tasks spanning diverse industrial applications across electronics (semiconductors, transparent conductors, thermoelectric materials), energy (battery electrodes, photovoltaics, fuel cell components), coatings (corrosion-resistant alloys, wear-resistant ceramics), optics (high-refractive materials, optical filters), and aerospace (high-temperature alloys, lightweight structural materials). Each task requires optimizing multiple competing objectives while maintaining chemical plausibility, thermodynamic stability, and synthesizability constraints. This benchmark provides a standardized evaluation framework for future materials discovery methods.
Method Overview
LLEMA is formulated as an agentic AI system for materials discovery, consisting of four interconnected components that interact in a closed-loop optimization process. Central to the system is a large language model that operates as an autonomous hypothesis-generation agent, proposing candidate materials at each iteration. Its behavior is governed by dynamically constructed prompts that encode task objectives, chemistry-informed design constraints, demonstrations from prior system trajectories, and structured output specifications, enabling systematic and adaptive exploration of the materials design space.
Overview of the LLEMA framework. An LLM proposes material candidates under task constraints, which are then evaluated and refined using chemistry-informed rules, memory-based guidance, and surrogate property prediction. The iterative process balances exploration and exploitation, enhancing multi-objective materials discovery.
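The closed loop described above (propose, evaluate, score, update memories) can be pictured with a minimal sketch. This is an illustration under stated assumptions, not LLEMA's implementation: the names `Memory`, `score`, `run_loop`, and `toy_propose` are hypothetical, the scoring rule (fraction of constraints met) is assumed, the LLM and the oracle are replaced by toy stand-ins, and the property names and values are invented.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Tuple

Candidate = str                                # stand-in for a full crystallographic specification
Properties = Dict[str, float]
Constraints = Dict[str, Tuple[float, float]]   # property name -> (lower, upper) bound


@dataclass
class Memory:
    """Success/failure memories that are fed back to the proposer."""
    successes: List[Tuple[Candidate, float]] = field(default_factory=list)
    failures: List[Tuple[Candidate, float]] = field(default_factory=list)


def score(props: Properties, constraints: Constraints) -> float:
    """Assumed scoring rule: fraction of property constraints that are satisfied."""
    met = sum(lo <= props[name] <= hi for name, (lo, hi) in constraints.items())
    return met / len(constraints)


def run_loop(
    propose: Callable[[Memory], List[Candidate]],
    oracle: Callable[[Candidate], Properties],
    constraints: Constraints,
    iterations: int,
) -> Memory:
    memory = Memory()
    for _ in range(iterations):
        for candidate in propose(memory):       # stand-in for the LLM call
            s = score(oracle(candidate), constraints)
            bucket = memory.successes if s == 1.0 else memory.failures
            bucket.append((candidate, s))
    return memory


# Toy stand-ins with made-up numbers, only to exercise the loop.
TOY_PROPERTIES = {
    "cand_A": {"band_gap": 1.2, "e_hull": 0.30},
    "cand_B": {"band_gap": 1.5, "e_hull": 0.05},
}


def toy_propose(memory: Memory) -> List[Candidate]:
    """Propose one candidate that has not been tried yet."""
    tried = {c for c, _ in memory.successes + memory.failures}
    return [c for c in TOY_PROPERTIES if c not in tried][:1]


constraints = {"band_gap": (1.0, 2.0), "e_hull": (0.0, 0.1)}
memory = run_loop(toy_propose, TOY_PROPERTIES.__getitem__, constraints, iterations=2)
print(memory.successes)  # [('cand_B', 1.0)]
print(memory.failures)   # [('cand_A', 0.5)]
```

In this sketch the proposer only uses the memories to avoid repeats; in the framework described above, the memories instead shape the prompt given to the LLM.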
LLM-Guided Generation
Leverages scientific knowledge embedded in large language models to propose chemically plausible material candidates
Chemistry-Informed Rules
Integrates domain-specific evolutionary operators for compositional substitution and crystal structure manipulation
Memory-Based Refinement
Maintains reward and error buffers to guide exploration and exploitation across generations
Multi-Objective Optimization
Balances multiple property constraints including thermodynamic stability and synthesizability
Island-Based Evolution
Manages independent evolution across multiple islands for diverse candidate exploration
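As a rough illustration of how reward and error buffers might be kept per island, the sketch below stores the top-scoring candidates and the most recent failures for each island and visits islands in turn. The `Island` class, its `capacity` parameter, the top-k retention policy, and the round-robin schedule in `island_for` are all assumptions made for the example; the text above does not specify these details.

```python
import heapq
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass
class Island:
    capacity: int = 3
    rewards: List[Tuple[float, str]] = field(default_factory=list)  # min-heap of (score, candidate)
    errors: List[Tuple[str, str]] = field(default_factory=list)     # (candidate, failure reason)

    def record_success(self, candidate: str, score: float) -> None:
        heapq.heappush(self.rewards, (score, candidate))
        if len(self.rewards) > self.capacity:
            heapq.heappop(self.rewards)          # evict the lowest-scoring entry

    def record_failure(self, candidate: str, reason: str) -> None:
        self.errors.append((candidate, reason))
        self.errors = self.errors[-self.capacity:]  # keep only the most recent failures

    def demonstrations(self) -> Tuple[List[str], List[str]]:
        """Return (candidates to imitate, best first; candidates to avoid)."""
        best = [c for _, c in sorted(self.rewards, reverse=True)]
        avoid = [c for c, _ in self.errors]
        return best, avoid


def island_for(iteration: int, islands: List[Island]) -> Island:
    """Assumed schedule: visit islands round-robin, one per iteration."""
    return islands[iteration % len(islands)]


islands = [Island(capacity=2), Island(capacity=2)]
first = island_for(0, islands)
first.record_success("cand_x", 0.2)
first.record_success("cand_y", 0.9)
first.record_success("cand_z", 0.5)
first.record_failure("cand_w", "unstable")
print(first.demonstrations())  # (['cand_y', 'cand_z'], ['cand_w'])
```

Because each island keeps its own buffers, a strong candidate found on one island does not immediately steer the others, which is one way to preserve diversity.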
Qualitative Results
Impact of Surrogate Predictors
Integrating surrogate models for property prediction improves LLEMA's overall performance.
Diversity of Candidates
LLEMA generates more diverse candidates than base LLMs.
Stronger Pareto Front
LLEMA achieves a stronger Pareto front than the other baselines.
Ablation Study
LLEMA achieves the highest stability and hit-rate and the lowest memorization rate by incorporating memory-based evolution and chemistry-informed design principles.
Lower Memorization
LLEMA achieves the lowest memorization rate, which leads to more novel candidates.
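For readers unfamiliar with Pareto fronts, the sketch below shows the standard non-dominated filter for objectives that are all maximized. The function names and the sample points are hypothetical, and this is a generic textbook construction rather than the evaluation code behind the results above.

```python
from typing import List, Sequence, Tuple

Point = Tuple[float, ...]


def dominates(a: Sequence[float], b: Sequence[float]) -> bool:
    """a dominates b if it is no worse in every objective and strictly better in at least one."""
    return all(x >= y for x, y in zip(a, b)) and any(x > y for x, y in zip(a, b))


def pareto_front(points: List[Point]) -> List[Point]:
    """Keep the points that no other point dominates (all objectives maximized)."""
    return [p for p in points if not any(dominates(q, p) for q in points)]


# Made-up two-objective scores for five candidates.
points = [(1, 5), (2, 4), (3, 3), (2, 2), (1, 1)]
print(pareto_front(points))  # [(1, 5), (2, 4), (3, 3)]
```

A front is "stronger" in the usual sense when its points sit further toward the desirable corner of objective space, so fewer of them are dominated by another method's candidates.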
Quantitative Results
BibTeX
@inproceedings{abhyankar2026llema,
  title={LLEMA: Accelerating Materials Design via {LLM}-Guided Evolutionary Search},
  author={Abhyankar, Nikhil and Kabra, Sanchit and Desai, Saaketh and Reddy, Chandan K},
  booktitle={The Fourteenth International Conference on Learning Representations (ICLR)},
  year={2026},
  url={https://openreview.net/forum?id=TIqzhBvCNB}
}
ICLR 2026