the solver, two new plots are generated: (1) the basis
patterns that were found as solutions and (2) a composition map displaying the mixture proportions.
Connection to Solver
The solving feature of Phase-Mapper enables users to
interact with the AI solver behind the scenes. The
user can specify many solver parameters, such as how
strongly to enforce sparsity, how many phases the solution should have, and how much shift between basis
patterns the solver should allow. The user can also
specify initial or frozen values to use as basis patterns.
Incorporating user inputs improves the solver's
efficiency and accuracy. We provide tools for expert
users to start the solver off closer to a solution, or to
distort the solution space so the solver finds a more
accurate solution.
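As a minimal sketch of how the user-controlled settings listed above could be grouped into a single request, consider the following. The class name SolverRequest and all of its field names are hypothetical and chosen only for illustration; they are not Phase-Mapper's actual interface.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class SolverRequest:
    """Hypothetical container for user-controlled solver settings."""

    num_phases: int                 # how many phases the solution should have
    num_shifts: int = 10            # M: number of shifted versions of each basis pattern
    sparsity_weight: float = 0.0    # how strongly to enforce sparsity (0 disables it)
    enforce_gibbs: bool = False     # limit how many phases may coexist per sample
    # Optional user-supplied starting basis patterns, keyed by phase index.
    initial_patterns: Dict[int, List[float]] = field(default_factory=dict)
    # Phase indices whose basis patterns the solver must keep fixed.
    frozen_phases: List[int] = field(default_factory=list)

    def validate(self) -> None:
        """Raise ValueError if the settings are inconsistent."""
        if self.num_phases < 1:
            raise ValueError("num_phases must be at least 1")
        if self.num_shifts < 1:
            raise ValueError("num_shifts must be at least 1")
        if self.sparsity_weight < 0:
            raise ValueError("sparsity_weight must be non-negative")
        for k in list(self.initial_patterns) + self.frozen_phases:
            if not 0 <= k < self.num_phases:
                raise ValueError(f"phase index {k} is out of range")
        for k in self.frozen_phases:
            if k not in self.initial_patterns:
                raise ValueError(f"frozen phase {k} needs a supplied pattern")


# Example: ask for three phases and freeze a known pattern for phase 0.
request = SolverRequest(
    num_phases=3,
    sparsity_weight=0.1,
    initial_patterns={0: [0.0, 1.0, 0.0, 0.5]},
    frozen_phases=[0],
)
request.validate()
```

Separating initial values from frozen values mirrors the distinction drawn in the text: an initial pattern only starts the solver closer to a solution, whereas a frozen pattern is held fixed throughout.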
Algorithm Effectiveness
In this section, we evaluate several solvers that were
proposed in recent years against our Phase-Mapper
system. We tested NMF (implemented as AgileFD
with M = 1), AgileFD, AgileFD with sparsity regularization (AgileFD-Sp), AgileFD with sparsity and the
Gibbs phase rule enforced (AgileFD-Sp-Gibbs), and
CombiFD (Ermon et al. 2015). We generated synthetic ternary metallic systems using data provided
by the Materials Project (Jain et al. 2013), which provides the crystal structure and the energy of formation, computed using density functional theory, for each
phase. We applied a stylized model of solid solubility and used structure interpolation to simulate modified phase diagrams that include the additional
degrees of freedom from alloying. We calculated XRD
patterns for each modified constituent, including
their interpolated structures, using pymatgen (Ong et
al. 2013).
The quality of a solution is judged by how well
each sample’s reconstructed signal matches the corresponding measured signal. Because a solver may return the phases in any order, we find the permutation of the phases in the solution that best matches the
ground truth. In general, we observe that the solutions found by AgileFD (including AgileFD-Sp and
AgileFD-Sp-Gibbs) better match the ground truth
when compared with NMF and CombiFD. NMF
underperforms because it cannot model peak shifting. Although CombiFD also captures
some of the physical constraints, it does not scale
well because it formulates the physical constraints
using mixed-integer programming. The AgileFD
extensions (AgileFD-Sp and AgileFD-Sp-Gibbs) outperform vanilla AgileFD. They are able to find solutions that better match the physical constraints.
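The permutation-matching step used in the evaluation can be sketched as follows. This is an illustration under stated assumptions rather than the evaluation code used for the reported comparison: the function names are hypothetical, the Euclidean distance is an assumed similarity measure, and the brute-force search over all permutations is only practical for a small number of phases.

```python
from itertools import permutations
from math import sqrt


def l2_distance(a, b):
    """Euclidean distance between two patterns of equal length (assumed metric)."""
    return sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def best_phase_permutation(found, truth):
    """Match solver output to ground truth.

    Returns (perm, cost), where found[perm[k]] is the basis pattern assigned
    to ground-truth phase k and cost is the summed distance of that assignment.
    """
    if len(found) != len(truth):
        raise ValueError("found and truth must contain the same number of phases")
    best_perm, best_cost = None, float("inf")
    for perm in permutations(range(len(found))):
        cost = sum(l2_distance(found[p], truth[k]) for k, p in enumerate(perm))
        if cost < best_cost:
            best_perm, best_cost = perm, cost
    return best_perm, best_cost


# Toy example: the solver returned the three phases in a different order.
truth = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
found = [[0.0, 0.9, 0.0], [0.0, 0.0, 1.1], [1.0, 0.0, 0.0]]
perm, cost = best_phase_permutation(found, truth)
```

For larger numbers of phases, the same assignment problem can be solved in polynomial time, for example with scipy.optimize.linear_sum_assignment applied to the matrix of pairwise distances.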
Illustrative Example: Discovery of
Nb-V-Mn Oxide Light Absorbers
for Energy Applications
The integration of the rapid solver with visualization
tools enables materials scientists to interact with the
data in a variety of ways. The web-accessible visualization
tools enable rapid data exploration, which
empowers materials scientists to
inject their expert knowledge into the solution, for
example by specifying the number of phases, the
extent of alloying-based peak shifting, or the known
existence of a phase in a certain composition region.
In this way, Phase-Mapper can run in unsupervised
or semisupervised modes per the availability of prior
knowledge. To demonstrate the phase-mapping capabilities
and the importance of the Gibbs constraint,
figure 5 contains solutions for the phase map of 317
XRD patterns in the Nb-V-Mn oxide composition
space using M = 10 shifted versions, which corresponds
to approximately 2 percent alloying-based
peak shifting. Although the phase behavior of binary
subcompositions (for example, Nb-V oxides) has
been previously studied, the ternary compositions
are being explored for the first time to discover solar
light absorbers for energy applications. Materials
researchers were unable to obtain a meaningful phase
diagram using manual analysis of this data set, even
with advanced visualization tools, primarily because
there are a number of phases with somewhat similar
basis patterns, and most basis patterns contain
dozens of peaks, yielding a collection of XRD patterns
that are rich in information but that exceed
human conceptualization.
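To make the relation between M = 10 shifted versions and roughly 2 percent peak shifting concrete, the sketch below assumes that patterns are sampled on a geometrically spaced grid, so that multiplying every peak position by a constant factor becomes a translation by a whole number of grid points. The grid layout, the convention that M versions span M - 1 grid steps, and all function names are assumptions made for illustration; the text above does not specify them.

```python
def per_step_ratio(total_shift, num_shifts):
    """Grid ratio for which num_shifts - 1 index steps give the total relative shift."""
    if num_shifts < 2:
        raise ValueError("need at least two shifted versions")
    return (1.0 + total_shift) ** (1.0 / (num_shifts - 1))


def geometric_grid(start, ratio, n):
    """Grid points start * ratio**j for j = 0, ..., n - 1."""
    return [start * ratio ** j for j in range(n)]


def shift_pattern(pattern, m):
    """Translate a pattern by m grid points toward higher indices, zero-padded."""
    if not 0 <= m <= len(pattern):
        raise ValueError("shift must be between 0 and the pattern length")
    if m == 0:
        return list(pattern)
    return [0.0] * m + list(pattern[:-m])


M = 10                                   # number of shifted versions
ratio = per_step_ratio(0.02, M)          # about 1.0022 per grid step
grid = geometric_grid(1.0, ratio, 50)    # arbitrary starting position of 1.0

pattern = [0.0] * 50
pattern[20] = 1.0                        # a single peak at grid index 20

# The M shifted versions correspond to shifts of 0, 1, ..., M - 1 grid points.
versions = [shift_pattern(pattern, m) for m in range(M)]
largest = versions[-1]
peak_index = largest.index(1.0)
relative_shift = grid[peak_index] / grid[20] - 1.0   # close to 0.02
```

Under these assumptions, each grid step changes the peak position by about 0.22 percent, and the most strongly shifted version moves every peak by the full 2 percent.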
We show in the paper by Suram et al. (2016) that,
without accounting for alloying-based peak shifting,
solutions are not meaningful in a number of ways.
Most notably, the basis patterns do not correspond to
individual phases, because the intensity for a phase
whose patterns shift across the data set is spread out
over multiple basis patterns, creating phase-mixed
basis patterns that are as difficult to interpret as the
mixed-phase patterns in the raw data. Overlapping features in the basis patterns amplify this problem, which persists even when alloying-based peak shifting is taken into account. When two
phases have overlapping features in their basis patterns and the phases coexist in a range of compositions, approximately equal data reconstructions can
be obtained using basis patterns that each contain
one phase or that each contain a mixture of phases.
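A small numerical example illustrates this degeneracy. The two patterns, the sample activations, and the mixing weights below are invented for illustration only. Two phases share a feature at the third grid point and coexist in every sample, and a pair of phase-mixed basis patterns reconstructs the data exactly as well as the pure-phase pair while keeping all activations non-negative.

```python
def combine(weights, patterns):
    """Weighted sum of basis patterns: sum over k of weights[k] * patterns[k]."""
    n = len(patterns[0])
    return [sum(w * p[i] for w, p in zip(weights, patterns)) for i in range(n)]


# Two hypothetical pure-phase patterns with an overlapping feature at index 2.
phase_a = [4.0, 0.0, 2.0, 0.0]
phase_b = [0.0, 4.0, 2.0, 0.0]
pure_basis = [phase_a, phase_b]

# Activations (amount of phase A, amount of phase B) for three samples
# in which the two phases coexist.
pure_activations = [[1.0, 1.0], [2.0, 1.0], [1.0, 2.0]]

# Alternative basis in which each pattern contains a mixture of both phases.
mixed_basis = [
    combine([0.75, 0.25], pure_basis),   # mostly A with some B
    combine([0.25, 0.75], pure_basis),   # mostly B with some A
]

# Activations of the mixed basis that reproduce the same samples, obtained by
# solving a = 0.75 x + 0.25 y and b = 0.25 x + 0.75 y for x and y.
mixed_activations = [[(3 * a - b) / 2, (3 * b - a) / 2] for a, b in pure_activations]

pure_recon = [combine(h, pure_basis) for h in pure_activations]
mixed_recon = [combine(h, mixed_basis) for h in mixed_activations]

# Largest pointwise difference between the two reconstructions.
max_difference = max(
    abs(p - m)
    for pure_row, mixed_row in zip(pure_recon, mixed_recon)
    for p, m in zip(pure_row, mixed_row)
)
```

Because both factorizations are non-negative and reconstruct the data equally well, the reconstruction error alone cannot distinguish them.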
To empower the algorithm to overcome this degeneracy in phase map solutions, we additionally apply
the Gibbs constraint on the number of phases that
can coexist in each composition sample. Figure 5
shows the basis patterns without (top) and with
(bottom) application of this constraint, with one basis
pattern on top highlighted to show that it contains a
mixture of phases. Thus, although this constraint is
applied on the activations of the basis patterns, it
indirectly makes the basis patterns more physically
meaningful. The Phase-Mapper solution also exhibits
excellent composition space connectivity for each
phase concentration map, as expected for equilibrium phase behavior, and it exhibits systematic com-