Towards an interoperable perovskite description, or how to keep track of 300 perovskite ions
Hybrid perovskites are interesting optoelectronic materials. The perovskite ABX3 structure offers a vast compositional space, and we have identified over 300 perovskite ions. This flexibility enables tuneable properties and has contributed significantly to the success of perovskite optoelectronics. However, the same diversity leads to confusion, ambiguity, and inconsistency in how compositions are reported, which creates challenges for data mining and machine learning applications. To address this issue, we propose guidelines and a JSON schema to standardize the reporting of perovskite compositions. The schema adheres to IUPAC recommendations and is designed to make data both human- and machine-readable. It captures key descriptors such as the perovskite composition and, for each ion, the molecular formula, SMILES representation, IUPAC name, and CAS number. To facilitate adoption, we have developed utilities that automatically generate comprehensive, standardized perovskite descriptions from standard ion abbreviations and stoichiometric coefficients. Additionally, we provide a curated database of all identified perovskite ions with associated descriptive data.
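As a rough illustration of the idea, the sketch below builds a machine-readable description from ion abbreviations and stoichiometric coefficients. It is not the published schema or the accompanying utilities: the function name `describe_perovskite`, the field names, and the six-entry ion table are hypothetical, and the IUPAC name and CAS number fields are omitted rather than guessed. The site-sum check assumes a simple ABX3 stoichiometry.

```python
import json
import math

# Hypothetical mini ion table. Field names are illustrative only and are
# not taken from the published schema or the curated database.
IONS = {
    "MA": {"site": "A", "molecular_formula": "CH6N+", "smiles": "C[NH3+]"},
    "FA": {"site": "A", "molecular_formula": "CH5N2+", "smiles": "NC=[NH2+]"},
    "Cs": {"site": "A", "molecular_formula": "Cs+", "smiles": "[Cs+]"},
    "Pb": {"site": "B", "molecular_formula": "Pb2+", "smiles": "[Pb+2]"},
    "I": {"site": "X", "molecular_formula": "I-", "smiles": "[I-]"},
    "Br": {"site": "X", "molecular_formula": "Br-", "smiles": "[Br-]"},
}

# Assumed site totals for an ABX3 perovskite.
SITE_TOTALS = {"A": 1.0, "B": 1.0, "X": 3.0}


def describe_perovskite(coefficients):
    """Build a description dict from a {abbreviation: coefficient} mapping."""
    entries = []
    for abbreviation, coefficient in coefficients.items():
        if abbreviation not in IONS:
            raise KeyError(f"unknown ion abbreviation: {abbreviation}")
        entries.append(
            {"abbreviation": abbreviation, "coefficient": coefficient, **IONS[abbreviation]}
        )

    # Check that the coefficients on each site add up to the ABX3 totals.
    for site, expected in SITE_TOTALS.items():
        total = sum(e["coefficient"] for e in entries if e["site"] == site)
        if not math.isclose(total, expected, abs_tol=1e-6):
            raise ValueError(f"site {site} sums to {total}, expected {expected}")

    # Order ions A, B, X (stable sort keeps the input order within a site).
    entries.sort(key=lambda e: "ABX".index(e["site"]))

    # Write the composition string, omitting coefficients equal to 1.
    composition = "".join(
        e["abbreviation"] if e["coefficient"] == 1 else f"{e['abbreviation']}{e['coefficient']:g}"
        for e in entries
    )
    return {"composition": composition, "ions": entries}


if __name__ == "__main__":
    mixed = {"Cs": 0.05, "FA": 0.8, "MA": 0.15, "Pb": 1, "I": 2.5, "Br": 0.5}
    print(json.dumps(describe_perovskite(mixed), indent=2))
```

Keeping the per-ion descriptors in one lookup table means that each composition only needs abbreviations and coefficients as input, while every generated record carries the same descriptive fields.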