identify basis patterns, and do not necessarily produce results consistent with physics. Constraint reasoning approaches, including satisfiability modulo
theories (SMT) methods (Ermon et al. 2012), can provide physically meaningful results, but depend heavily on effective preprocessing, such as peak identification, and are computationally intensive.
Approaches based on nonnegative matrix factorization (Long et al. 2009) are computationally efficient,
but generally perform poorly when peak-shifting
phenomena are present, failing to produce physically meaningful solutions. CombiFD (Ermon et al.
2015) is another factor decomposition approach that
uses combinatorial constraints to simultaneously
enforce some of the physical rules and accommodate
peak shifting, but requires solving a combinatorial
problem in each descent step, and is therefore computationally expensive and does not enforce all the physical rules.
Here, we describe Phase-Mapper, an AI platform
for rapidly solving the phase-mapping problem, integrating three key components: (1) cutting-edge AI
solvers, (2) human intelligence and feedback, and (3)
high-throughput physical experiments. These components form an integrated process (see figure 1).
Phase-Mapper features a novel solver, AgileFD, as a
key component of the platform. Motivated by convolutive NMF, AgileFD includes a set of lightweight
updating rules, and therefore a very fast gradient
descent process. AgileFD is flexible, allowing for the
incorporation of additional constraints, as well as
human feedback through refinement. AgileFD can
also run autonomously, producing physically meaningful solutions.
Phase-Mapper also provides tools for data exploration, visualization, and configuration that allow
human experts as well as laypeople to analyze and
Phase-Mapper’s solutions, obtained by the interaction between solvers and human users or
autonomously, can also shed light on the development of new physical experiments. For example, the
results can be incorporated into an active learning
system, specifying regions of composition space to
sample at higher resolution.
AgileFD: A Novel
The Phase-Mapper platform features the AgileFD
solver for the phase-mapping problem. AgileFD uses
iterative updates of candidate solutions that are sig-
nificantly faster than previously proposed methods.
This speed, which lets human experts interact with the algorithm in
real time, stems from an efficient problem representation. Let the XRD patterns for all samples
be represented by a matrix A, where each column
corresponds to one sample point and each row
corresponds to Aj(q) for a particular value of q. Under
the assumptions of no noise and no shifting, mean-
ing that λij = 1 for all i and j, describing A as a linear
combination of a few basis patterns Wi(q) is equivalent
to factoring A as a product of two matrices W
and H:
$$A \approx R = WH.$$
Here, R denotes the approximate reconstruction of A.
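The factorization of A into nonnegative factors W and H can be illustrated with a minimal sketch. It assumes the standard Lee-Seung multiplicative updates for the squared reconstruction error, which are not AgileFD's own update rules; the function name `nmf_multiplicative`, the synthetic Gaussian peaks, and all parameter values are hypothetical and chosen only for illustration.

```python
import numpy as np

def nmf_multiplicative(A, k, n_iter=500, eps=1e-9, seed=0):
    """Factor a nonnegative matrix A (Q x N) as W (Q x k) @ H (k x N).

    Assumption: plain Lee-Seung updates, used here only as an illustration.
    """
    rng = np.random.default_rng(seed)
    Q, N = A.shape
    W = rng.random((Q, k)) + eps
    H = rng.random((k, N)) + eps
    for _ in range(n_iter):
        # Multiplying by nonnegative ratios keeps W and H nonnegative.
        H *= (W.T @ A) / (W.T @ W @ H + eps)
        W *= (A @ H.T) / (W @ H @ H.T + eps)
    return W, H

# Synthetic data: two single-peak "basis patterns" mixed across 30 samples.
q = np.linspace(0.0, 1.0, 200)
W_true = np.stack([np.exp(-((q - 0.3) / 0.02) ** 2),
                   np.exp(-((q - 0.7) / 0.02) ** 2)], axis=1)
H_true = np.random.default_rng(1).random((2, 30))
A = W_true @ H_true          # each column is one sample's pattern

W, H = nmf_multiplicative(A, k=2)
R = W @ H                    # approximate reconstruction of A
rel_err = np.linalg.norm(A - R) / np.linalg.norm(A)
print(f"relative reconstruction error: {rel_err:.3e}")
```

Because no shifting is modeled in this sketch, it corresponds only to the case λij = 1.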
In this formulation, the columns of W form a set of
basis patterns Wi(q), and the columns of H corre-
spond to the values hij in equation 1. We enforce
nonnegativity for W and H, which is required for the
solutions to be physically meaningful. Previous
approaches to solve the phase-mapping problem
based on NMF have been unsuccessful in handling
peak shifting, where λij ≠ 1. The first contribution of
AgileFD is to circumvent the shifting problem by a
log-space resampling. Under the variable transformation
of q into log q, our signal becomes Wi(log q). More
importantly, the shifted phase Wi(log λq) becomes
Wi(log λ + log q), which transforms the multiplicative
shift in the q domain into a constant additive offset.
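A small numerical sketch may make this concrete. It assumes a hypothetical two-peak pattern and arbitrary grid parameters (`q0`, `r`, `m` are illustrative names, not values from the paper): a measurement on a uniform q grid is interpolated onto a geometric grid, and a multiplicative shift by λ = r^m then appears as an offset of m samples.

```python
import numpy as np

def pattern(q):
    """Hypothetical basis pattern with two Gaussian peaks (illustration only)."""
    return (np.exp(-((q - 20.0) / 0.3) ** 2)
            + 0.6 * np.exp(-((q - 35.0) / 0.3) ** 2))

# Geometric grid q_n = q0 * r**n, i.e. uniform spacing in log q.
q0, r, n_points = 10.0, 1.001, 1600
q_geo = q0 * r ** np.arange(n_points)

# A "measurement" on a uniform grid, interpolated onto the geometric grid.
q_uniform = np.linspace(10.0, 50.0, 4001)
resampled = np.interp(q_geo, q_uniform, pattern(q_uniform))

# Restrict the allowed shifts to the discrete set lambda = r**m.
m = 7
lam = r ** m

unshifted = pattern(q_geo)        # W(q) on the geometric grid
shifted = pattern(lam * q_geo)    # W(lambda * q) on the same grid

# Since lam * q_n = q_{n+m}, the multiplicative shift is an index offset of m.
max_diff = np.max(np.abs(shifted[:-m] - unshifted[m:]))
interp_err = np.max(np.abs(resampled - unshifted))
print(f"offset mismatch: {max_diff:.2e}, interpolation error: {interp_err:.2e}")
```

Whether a given λ moves the pattern toward higher or lower indices depends on the sign convention chosen for the shift; the sketch fixes one convention arbitrarily.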
This allows the problem to be formulated in terms of
convolutive nonnegative matrix factorization. After
this variable substitution, we discretize the allowed values of
λ and interpolate the signals at the corresponding
geometric series of q values. The problem
can then be written:
With the columns of W representing the basis pat-
Figure 3. An Illustration of the Phase-Mapping Problem.
Given a material system with XRD data read at discrete points, find a set of
basis phases, such that every point’s XRD data can be made by a linear combination of the basis phases. Here, the left image is the original data, the
right image is the found basis phases, and the middle image represents how
much of a particular phase (the “phase concentration”) is present at each
data point along with the composition-dependent shifting.