Korean Researcher Fellowship supports AI-driven alloy design

National Research Foundation of Korea awards postdoctoral researcher Jin-Young Lee

At a glance:

  • Award: Nurturing Next-generation Researchers Fellowship of the National Research Foundation of Korea
  • Awardee: Dr. Jin-Young Lee, postdoctoral researcher at the Max Planck Institute for Sustainable Materials
  • Research challenge: Heterogeneous alloy datasets combine fundamentally different material systems and processing routes, creating hidden biases and misleading trends that can hinder reliable AI-driven alloy design.
  • Approach: Using causal inference and large language models to identify hidden patterns in heterogeneous alloy datasets and extract structured data to improve AI-driven alloy design.

The National Research Foundation of Korea has awarded Dr. Jin-Young Lee, postdoctoral researcher at the Max Planck Institute for Sustainable Materials, a Nurturing Next-generation Researchers Fellowship. The fellowship is granted annually to only 50 researchers across all engineering and science fields in Korea and supports a research stay abroad. Lee’s project aims to develop an AI-driven methodology that transforms heterogeneous materials data into reliable, causally informed design principles for advanced materials development.

The challenge of heterogeneous materials data

Modern machine-learning models can analyse vast amounts of experimental data and rapidly screen millions of possible alloy compositions in search of materials with desirable properties. However, the success of these approaches depends heavily on the quality of the underlying data.

Many alloy datasets are compiled from hundreds of published studies. While this provides a wealth of information, it also creates a significant challenge: the data often originate from different alloy systems, processing routes, experimental methods and research objectives. As a result, these combined datasets are highly heterogeneous and may contain hidden subgroups with fundamentally different behaviours.

"When these aggregated datasets are analysed together, important relationships can become distorted or even completely hidden," explains Lee. "This can lead researchers to draw conclusions that do not actually apply to individual material systems."

Revealing hidden trends

Lee’s project focuses on addressing a statistical phenomenon known as Simpson's paradox.

This paradox occurs when a trend observed across an entire dataset disappears or reverses once the data are divided into meaningful subgroups. As a result, conventional data analysis can overlook valuable design principles or generate misleading predictions.
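
The effect is easy to reproduce with a toy example. The sketch below uses made-up numbers, not data from Lee's project: two hypothetical alloy classes in which a property falls as an alloying content rises, while the pooled data suggest the opposite. The group names, the values and the helper function ols_slope are all assumptions chosen for illustration.

```python
# Synthetic, made-up numbers purely for illustration; not real alloy data.

def ols_slope(xs, ys):
    """Return the least-squares slope of ys regressed on xs."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    return sxy / sxx

# Hypothetical subgroups: (alloying content, measured property)
groups = {
    "alloy_class_A": ([1.0, 2.0, 3.0], [10.0, 9.0, 8.0]),
    "alloy_class_B": ([6.0, 7.0, 8.0], [20.0, 19.0, 18.0]),
}

# Trend inside each subgroup
group_slopes = {name: ols_slope(xs, ys) for name, (xs, ys) in groups.items()}

# Trend when both subgroups are merged into one dataset
pooled_x = [x for xs, _ in groups.values() for x in xs]
pooled_y = [y for _, ys in groups.values() for y in ys]
pooled_slope = ols_slope(pooled_x, pooled_y)

for name, slope in group_slopes.items():
    print(f"{name}: slope = {slope:+.2f}")
print(f"pooled: slope = {pooled_slope:+.2f}")
```

Within each class the slope is negative, yet the pooled slope is positive, because the second class sits at both higher content and higher property values. A model trained only on the merged data would learn the wrong direction for both classes.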

Combining AI and causal inference

To overcome this challenge, Lee proposes a new methodology that combines large language models, causal inference and materials informatics.

The approach begins with large language models that automatically extract structured information from scientific publications. The resulting datasets are then analysed using causal inference methods, which help identify hidden subgroups and uncover the factors driving observed material behaviour.

Rather than searching for a single universal trend, the methodology aims to reveal system-specific relationships within different alloy classes and their distinct microstructural domains. These insights can then be used to develop tailored design strategies for each subgroup, thus providing more reliable guidance for alloy design.

Ultimately, the work aims to help researchers move beyond simply finding correlations and towards understanding the causal relationships that govern material performance.
