Data-centric mixed-variable Bayesian optimization for materials design
Materials design can be cast as an optimization problem in which material composition, microstructure morphology, and processing conditions are varied to achieve desired properties. The coexistence of qualitative and quantitative design variables leads to disjoint regions in property space, which makes the search for optimal designs challenging. The limited availability of experimental data and the high cost of simulations compound this challenge. This situation calls for design methodologies that can extract useful information from existing data and guide the search for optimal designs efficiently. To this end, we present a data-centric, mixed-variable Bayesian optimization framework that integrates data from literature, experiments, and simulations for knowledge discovery and computational materials design. The framework is built around the Latent Variable Gaussian Process (LVGP), a novel Gaussian process technique that maps qualitative variables onto a continuous latent space for covariance formulation, and uses it as the surrogate model to quantify “lack of data” uncertainty. Expected improvement, an acquisition criterion that balances exploration and exploitation, guides the search through a complex, nonlinear design space toward the optimal design. The framework is tested in a case study that seeks to concurrently identify the optimal composition and morphology for insulating polymer nanocomposites. We also extend mixed-variable Bayesian optimization to multiple objectives, identifying the Pareto frontier within tens of iterations. These findings position Bayesian optimization as a powerful tool for the design of engineered material systems.
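To illustrate how expected improvement balances exploration and exploitation, the following minimal Python sketch evaluates the standard closed-form expression for a Gaussian prediction. It is an illustration under stated assumptions, not the implementation used in this work: the function name, its arguments, and the numerical values are hypothetical, and the surrogate's predictive mean and standard deviation (from an LVGP or any other Gaussian process) are taken as given.

```python
import math


def expected_improvement(mu, sigma, best, maximize=True):
    """Closed-form expected improvement for a prediction N(mu, sigma^2).

    mu, sigma: predictive mean and standard deviation at a candidate design
               (assumed to come from some Gaussian process surrogate).
    best:      best objective value observed so far.
    maximize:  True to seek larger values, False to seek smaller ones.
    """
    # Predicted gain over the incumbent, signed for the chosen direction.
    gain = (mu - best) if maximize else (best - mu)
    if sigma <= 0.0:
        # No predictive uncertainty: the improvement is deterministic.
        return max(gain, 0.0)
    z = gain / sigma
    pdf = math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)  # standard normal density
    cdf = 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))         # standard normal CDF
    # First term rewards a high predicted mean (exploitation),
    # second term rewards high predictive uncertainty (exploration).
    return gain * cdf + sigma * pdf


# Hypothetical candidates with the same predicted mean but different uncertainty.
ei_certain = expected_improvement(mu=0.0, sigma=1.0, best=0.0)
ei_uncertain = expected_improvement(mu=0.0, sigma=2.0, best=0.0)
```

With equal predicted means, the candidate with the larger predictive standard deviation receives the larger expected improvement, which is how the criterion steers sampling toward regions where data are scarce.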