inal and anomalous behavior. Its major drawback is
that it can only identify occurrences of anomaly
types it already knows. This is a major limitation,
since the anomalies that generally affect operations the most are the ones happening for the first time. Being able to recognize first-time-ever anomalies quickly allows flight control
teams to take action immediately.
The unsupervised approach consists of having
unlabeled examples of data — no prior knowledge is
provided. The implicit assumption made by systems using an unsupervised approach is that anomalies happen far less often than nominal behaviors,
so they attempt to automatically distinguish what is
nominal and what is anomalous. The major drawback is the risk of missing anomalies: if an anomaly
happened several times in the past, an unsupervised
system may consider it normal behavior and not
report it in the future.
The semisupervised approach is a combination of
the supervised and unsupervised approaches. It consists of providing only examples of nominal behavior.
The advantages of this approach are that engineers
are in full control of specifying what should be considered nominal, repeated anomalies can be detected
since they are not in the nominal set, and, since no
assumptions are made about the possible behavior of
the anomalies, any anomalous behavior can be detected.
The proposed monitoring paradigm follows a
semisupervised approach to perform anomaly detection. We will use the term novelty detection instead of
anomaly detection since the only thing that can be
said is that a behavior is novel when compared to a
set of behaviors known to be nominal. The new
behavior might well be also nominal but so far not
present in the nominal set. The decision of classifying a new behavior as nominal or anomalous is left
to the flight control engineers.
Novel Behavior Detection
To characterize behaviors we compute four statistical
features (average, standard deviation, maximum, and
minimum) of fixed-length periods. The duration of
the time period is chosen so that it represents a natural time span (for example, orbit period or time covered by the short-term planning). The exact duration
is not critical; however, it should be long enough to
allow behaviors to develop and not so long that
many different behaviors happen in it.
While there are other ways to characterize behavior in a given time period (for example, Fourier transformations, wavelets, and so on) we used statistical
features because they are robust to sampling rate
changes and behavior order, and work even if very
few samples are available. In addition, they are compatible with the future European Space Operations
Centre (ESOC) infrastructure data archive (DARC).
DARC precomputes and makes these statistical features available.
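As an illustration, the per-period feature extraction described above can be sketched as follows. NumPy and the helper name `period_features` are our own assumptions for this sketch; the source does not prescribe an implementation.

```python
import numpy as np

def period_features(samples, period_length):
    """Split a 1-D telemetry array into fixed-length periods and
    compute the four statistical features for each period:
    average, standard deviation, maximum, and minimum."""
    n_periods = len(samples) // period_length
    feats = []
    for i in range(n_periods):
        chunk = samples[i * period_length:(i + 1) * period_length]
        feats.append([chunk.mean(), chunk.std(), chunk.max(), chunk.min()])
    return np.asarray(feats)  # shape: (n_periods, 4)

# Toy usage: two periods of five samples each.
feats = period_features(np.arange(10.0), period_length=5)
# feats[0] holds the first period's [average, std, max, min].
```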
Figure 2 shows a representation of how these fixed
time periods look in a side-by-side, two-dimensional
comparison. We show this representation here only as an example; in reality, the four
dimensions (average, standard deviation, maximum,
and minimum) are used simultaneously.
Once we have defined the representation of a time period for a given parameter, we need to be able to compare
time periods. We need a distance measure so that
we can say that, for a given parameter A, the period X
is closer to the period Y than to the period Z.
Mathematically: d(X, Y) < d(X, Z). We use the Euclidean distance as the distance measure (equation 1):
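For example, the Euclidean distance of equation 1 between two four-feature period vectors might be computed as follows; `feature_distance` and the sample vectors are hypothetical names for this sketch.

```python
import math

def feature_distance(x, y):
    """Euclidean distance between two period feature vectors
    ordered as (average, standard deviation, maximum, minimum)."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))

# Toy feature vectors: X resembles Y far more than Z.
X = (1.0, 0.2, 1.5, 0.5)
Y = (1.1, 0.2, 1.6, 0.5)
Z = (4.0, 1.0, 6.0, 2.0)
assert feature_distance(X, Y) < feature_distance(X, Z)  # d(X, Y) < d(X, Z)
```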
We make use of outlier detection techniques to find
which periods have anomalous behavior. The general assumption is that anomalous behaviors will lie at
greater distances from known nominal behaviors than
known nominal behaviors do among themselves. The question is how big the distance should be before a behavior can
be considered novel. If the distance is too
small, many false anomalies will be reported. If the
distance is too big, then some anomalies will be missed.
The solution to overcome the problem of having
to define an outlier distance is to use local density
outlier detection techniques. The most widely used is
called local outlier factor (LOF) (Breunig et al. 2000).
LOF computes a factor that gives an indication of the
degree of outlierness (novel behavior). It takes into
account the density of the k closest points. If they are
very dense, little distance is required to consider a
new behavior a novelty. If the neighbors are sparse a
bigger distance is required to consider a new behavior a novelty.
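As a concrete sketch of the density idea, LOF is available in scikit-learn; the library choice and the synthetic data below are our own, not part of the proposed system.

```python
import numpy as np
from sklearn.neighbors import LocalOutlierFactor

# Synthetic 4-feature period vectors: a dense "nominal" cluster
# plus one period that behaves very differently.
rng = np.random.default_rng(0)
nominal = rng.normal(0.0, 0.1, size=(50, 4))
novel = np.array([[3.0, 3.0, 3.0, 3.0]])
periods = np.vstack([nominal, novel])

lof = LocalOutlierFactor(n_neighbors=10)
lof.fit_predict(periods)
# scikit-learn stores the negated factor; higher = more outlying.
factors = -lof.negative_outlier_factor_
# The novel period (last row) receives the largest factor.
```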
The major disadvantage of LOF is that the resulting
factor values are quotient values and hard to interpret. A value of 1 or less indicates a clear inlier, but
there is no clear rule for when a point is an outlier. In
one data set, a value of 1.1 may already be an outlier;
in another data set and parameterization (with strong
local fluctuations) a value of 2 could still be an inlier.
These differences can also occur within a data set due
to the locality of the method.
To overcome the LOF limitations we will use local
outlier probabilities (LoOP) (Kriegel et al. 2009), a
more recent method derived from LOF. It uses
inexpensive local statistics to become less sensitive to
the choice of the parameter k. In addition, the resulting values are scaled to the range [0, 1] (Kriegel
et al. 2009) and can be directly interpreted as the
probability of a point being an outlier.
d(X, Y) = √( (avg_x − avg_y)² + (std_x − std_y)² + (max_x − max_y)² + (min_x − min_y)² )    (1)
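A minimal sketch of LoOP, following the formulas in Kriegel et al. (2009), might look like the following; the NumPy implementation, the name `loop_scores`, and the synthetic data are our own illustration, not the system's actual code.

```python
import math
import numpy as np

def loop_scores(X, k=10, lam=3.0):
    """Local Outlier Probabilities (LoOP) sketch after Kriegel et al. 2009.
    Returns one score in [0, 1] per point (row of X)."""
    n = len(X)
    # Pairwise Euclidean distances between all points.
    d = np.sqrt(((X[:, None, :] - X[None, :, :]) ** 2).sum(-1))
    # Indices of the k nearest neighbors of each point (self excluded).
    nn = np.argsort(d, axis=1)[:, 1:k + 1]
    # Probabilistic set distance: RMS distance to the k neighbors.
    sigma = np.sqrt((d[np.arange(n)[:, None], nn] ** 2).mean(axis=1))
    pdist = lam * sigma
    # Probabilistic local outlier factor: own pdist vs. neighbors' mean pdist.
    plof = pdist / pdist[nn].mean(axis=1) - 1.0
    # Normalization constant over the whole data set.
    nplof = lam * math.sqrt((plof ** 2).mean())
    # Gaussian error function maps the factor to a probability-like score.
    return np.maximum(0.0, np.array(
        [math.erf(p / (nplof * math.sqrt(2))) for p in plof]))

# Toy usage: dense nominal cluster plus one clearly novel period.
rng = np.random.default_rng(1)
X = np.vstack([rng.normal(0.0, 0.1, size=(50, 4)), [[3.0, 3.0, 3.0, 3.0]]])
scores = loop_scores(X, k=10)
# scores[-1] (the novel period) is the highest score.
```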