Dataset Identification:
Resource Abstract:
- We present a set of novel algorithms, which we call sequenceMiner, that detect and characterize anomalies in large sets of high-dimensional
symbol sequences that arise from recordings of switch sensors in the cockpits of commercial airliners. While the algorithms
we present are general and domain-independent, we focus on a specific problem that is critical to determining the system-wide
health of a fleet of aircraft. The approach taken uses unsupervised clustering of sequences using the normalized length of
the longest common subsequence (nLCS) as a similarity measure, followed by detailed outlier analysis to detect anomalies.
In this method, an outlier sequence is defined as a sequence that is far away from the cluster centre. We present new algorithms
for outlier analysis that provide comprehensible indicators as to why a particular sequence is deemed to be an outlier. The
algorithms provide a coherent description to an analyst of the anomalies in the sequence when compared to more normal sequences.
In the final section of the paper we demonstrate the effectiveness of sequenceMiner for anomaly detection on a real set of
discrete sequence data from a fleet of commercial airliners. We show that sequenceMiner discovers actionable and operationally
significant safety events. We also compare our innovations with standard Hidden Markov Models, and show that our methods are
superior.
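
The similarity measure named in the abstract can be illustrated with a minimal sketch. It assumes that nLCS divides the length of the longest common subsequence by the geometric mean of the two sequence lengths; the abstract does not state the exact normalization, so that choice is an assumption. The function names and the switch labels in the usage note are hypothetical and are not taken from sequenceMiner itself.

```python
from math import sqrt


def lcs_length(a, b):
    """Length of the longest common subsequence of two sequences."""
    # Classic dynamic programme, keeping only the previous row.
    prev = [0] * (len(b) + 1)
    for x in a:
        curr = [0]
        for j, y in enumerate(b, start=1):
            if x == y:
                curr.append(prev[j - 1] + 1)
            else:
                curr.append(max(prev[j], curr[j - 1]))
        prev = curr
    return prev[-1]


def nlcs(a, b):
    """Normalized LCS similarity in [0, 1].

    Assumption: normalize by sqrt(len(a) * len(b)).
    """
    if not a or not b:
        return 0.0
    return lcs_length(a, b) / sqrt(len(a) * len(b))


def most_anomalous(sequences, reference):
    """Return the sequence least similar to a reference sequence.

    Hypothetical helper: the reference stands in for a cluster centre.
    """
    return min(sequences, key=lambda s: nlcs(s, reference))
```

As a usage example with made-up switch labels, take `reference = ["gear_up", "flaps_5", "autopilot_on"]`. Calling `most_anomalous([reference, ["gear_up", "autopilot_on"], ["spoilers", "flaps_5"]], reference)` returns `["spoilers", "flaps_5"]`, because that sequence shares only one symbol with the reference and so has the lowest nLCS of the three.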
Citation
- Title Anomaly Detection and Diagnosis Algorithms for Discrete Symbols
-
- revision Date
2014-01-06T11:45:40
Resource language:
en-US
Constraints on resource usage:
-
- Constraints
-
- Use limitation statement:
- public
point of contact
-
publisher
- individual Name Ashok Srivastava (mailto:ashok.n.srivastava@gmail.com)
- organisation Name
Dashlink (sub-organization of the National Aeronautics and Space Administration, U.S. Government)
-
- Contact information
-
-
- Address
-
- electronic Mail Address
Metadata data stamp:
2014-01-06T11:45:40
Metadata contact
-
publisher
Metadata scope code
dataset
Metadata standard for this record:
ISO 19115:2003 - Geographic information - Metadata
standard version:
ISO 19115:2003
Metadata record identifier:
DASHLINK_147
Metadata record format is ISO19139 XML (MD_Metadata)