Metadata Schemes for Experimental Materials Science
To achieve FAIR data, i.e. data that is findable, accessible, interoperable, and reusable, the raw data must be extended with metadata. Good metadata includes information about how the data was created, e.g. the experimental setup, and provides key results in an easily accessible form. Such rich information allows us to search for data matching specific requirements. In this project we develop and test metadata schemes for experimental materials science.
FAIR handling of data, i.e. making data findable, accessible, interoperable, and reusable, is one of the key aspects of research data management (RDM). Consequently, funding agencies nowadays require an RDM plan to ensure that the generated data is FAIR (see e.g. DFG regulations) and thus reusable for future generations of scientists. Metadata, i.e. the data attached to the actual data, is very important for achieving this goal, as it provides information about the experiment performed and the origins of the data.
In this project we develop metadata schemes for materials science experiments to capture the most important information alongside the data files. This enables direct access to key settings and results of the experiments without the need to consult the authors, external documentation, or other experts in the respective field. With this information it is easier to reproduce and/or reuse the data directly.
An additional benefit of metadata schemes is that they provide well-defined keywords and corresponding values. These key-value pairs are required for search engines to find the data of interest efficiently. For example, retrieving all data for samples in which a specific element is present as a trace element requires that the contained elements and their concentrations are stored in the metadata.
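The trace-element query described above can be illustrated with a minimal Python sketch. It assumes that each data set carries a metadata record with a composition entry. The keys (`sample_id`, `technique`, `composition_at_percent`), the sample values, and the 0.5 at.% threshold are all made up for illustration and are not taken from the schemes developed in this project.

```python
# Hypothetical metadata records: all keys and values are illustrative only.
records = [
    {"sample_id": "S1", "technique": "nanoindentation",
     "composition_at_percent": {"Mg": 98.9, "Al": 1.0, "Ca": 0.1}},
    {"sample_id": "S2", "technique": "SEM",
     "composition_at_percent": {"Mg": 95.0, "Al": 5.0}},
    {"sample_id": "S3", "technique": "nanoindentation",
     "composition_at_percent": {"Mg": 94.0, "Ca": 6.0}},
]


def samples_with_trace_element(records, element, max_concentration=0.5):
    """Return the IDs of samples that contain `element` at a concentration
    above zero and at or below `max_concentration` (assumed threshold)."""
    hits = []
    for record in records:
        concentration = record["composition_at_percent"].get(element)
        if concentration is not None and 0 < concentration <= max_concentration:
            hits.append(record["sample_id"])
    return hits


# Which samples contain Ca only as a trace element?
trace_ca = samples_with_trace_element(records, "Ca")
```

Such a query only works because every record stores the composition under the same keyword and in the same unit, which is exactly what an agreed metadata scheme is meant to guarantee.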
The major challenge for metadata scheme development is to identify and agree on well-defined keywords to describe the actual data. This has to be accomplished among the experts of one particular technique as well as from a broader perspective. In this project, we collaborated with a wide range of our colleagues in SFB 1394 to iteratively define the relevant ontology for our research in that project.
Finally, the metadata schemes need to be implemented on an RDM platform (we currently use CoScInE), validated by the users, and benchmarked by running queries for relevant data.
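As a rough illustration of what checking a metadata record against a scheme can look like, the following Python sketch tests a record for missing keys, wrong value types, and unknown keys. It is a simplified stand-in under stated assumptions: the scheme, its keys (`sample_id`, `technique`, `operator`, `temperature_K`), and the `validate` helper are hypothetical, and the code does not use the API of CoScInE or of any other RDM platform.

```python
# Hypothetical scheme: required keys and the expected Python type of each value.
SCHEME = {
    "sample_id": str,
    "technique": str,
    "operator": str,
    "temperature_K": float,
}


def validate(record, scheme=SCHEME):
    """Return a list of human-readable problems; an empty list means valid."""
    problems = []
    # Every key defined in the scheme must be present with the expected type.
    for key, expected_type in scheme.items():
        if key not in record:
            problems.append(f"missing key: {key}")
        elif not isinstance(record[key], expected_type):
            problems.append(
                f"wrong type for {key}: expected {expected_type.__name__}"
            )
    # Keys outside the agreed vocabulary are reported as well.
    for key in record:
        if key not in scheme:
            problems.append(f"unknown key: {key}")
    return problems


good = {"sample_id": "S1", "technique": "SEM",
        "operator": "A. Example", "temperature_K": 293.0}
bad = {"sample_id": "S2", "technique": 42,
       "temperature_K": 293.0, "colour": "grey"}

good_problems = validate(good)
bad_problems = validate(bad)
```

Rejecting unknown keys is a deliberate choice in this sketch: it keeps the vocabulary closed, so that queries can rely on a fixed set of keywords.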