Friday, May 4, 2012



Computing pioneers Jeff Hawkins and Donna Dubinsky founded Numenta to develop a new approach to machine intelligence first described in Hawkins' book On Intelligence.

Numenta has created a cloud-based prediction engine for streaming data called Grok. The Grok engine automatically discovers patterns in data streams, enabling your applications to predict future values and detect anomalies. Powered by Numenta’s Cortical Learning Algorithms, Grok features:
  • Online learning models that update continuously
  • Automated model creation
  • Temporal and spatial pattern discovery
 
Temporal
A temporal pattern is a relationship between items in a sequence over time, analogous to the notes in a melody. A purely temporal pattern may be found if you stream a single field to Grok. On the left side of the diagram, the lines represent the value of a single field at different points in time. In this case, with one field, Grok can only find patterns in how this one field changes over time. For example, imagine we had a website with a hundred different links for news, sports, etc., and we wanted to predict which link someone is going to click on next. Users might have typical patterns they follow as they visit the site. Grok is able to learn many different sequences and, at any time, predict the links most likely to be clicked next. In this case Grok finds only temporal patterns because the data stream consists of a single field containing the ID of each link.
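To make the single-field case concrete, here is a minimal sketch of next-click prediction. It is not Grok's Cortical Learning Algorithm: it is a simple first-order transition counter used as a stand-in, and the class name, method names, and link IDs are all hypothetical.

```python
from collections import Counter, defaultdict


class NextClickPredictor:
    """Toy sequence memory: counts which link ID follows which."""

    def __init__(self):
        self.transitions = defaultdict(Counter)  # previous link -> next-link counts
        self.previous = None

    def observe(self, link_id):
        """Learn online from one record of the single-field stream."""
        if self.previous is not None:
            self.transitions[self.previous][link_id] += 1
        self.previous = link_id

    def predict(self, top_n=3):
        """Return the most likely next links, given the last link seen."""
        if self.previous is None:
            return []
        ranked = self.transitions[self.previous].most_common(top_n)
        return [link for link, _ in ranked]


predictor = NextClickPredictor()
stream = ["home", "news", "sports", "home", "news", "weather",
          "home", "news", "sports", "home"]
for link in stream:
    predictor.observe(link)

print(predictor.predict())  # prints ['news']
```

A first-order counter only looks one step back; the point is simply that with one field in the stream, the order of the records is the only thing there is to learn from.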

Spatial
A spatial pattern is a relationship between things that happen at the same time, analogous to the notes in a musical chord. A purely spatial pattern is shown in the middle of the diagram. Here we show a single record with four fields to suggest that, although Grok is receiving a stream of these records, it may not be able to find any patterns from record to record. The only patterns Grok has found are between the contemporaneous values of the four fields. We call these “spatial patterns.” For example, say each record represents a loan application with four fields: age, gender, income, and loan amount. Grok may find that age, gender, and income allow it to predict the loan amount, but the loan applications don’t exhibit any patterns from record to record. Knowing the sequence of previous loan applications doesn’t help predict anything about the next one.
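The loan example can be sketched as a lookup that uses only the values inside each record. This is an illustrative stand-in rather than Grok's method; the banding in `bucket`, the class name, and the sample records are assumptions made up for the example.

```python
from collections import defaultdict
from statistics import mean


def bucket(age, gender, income):
    """Coarsen the three input fields into a lookup key (hypothetical bands)."""
    return (age // 10, gender, income // 25_000)


class LoanAmountMemory:
    """Remembers loan amounts seen for each combination of field values."""

    def __init__(self):
        self.amounts = defaultdict(list)

    def observe(self, age, gender, income, loan_amount):
        self.amounts[bucket(age, gender, income)].append(loan_amount)

    def predict(self, age, gender, income):
        """Average loan amount seen for similar applicants, or None if unseen."""
        seen = self.amounts.get(bucket(age, gender, income))
        return mean(seen) if seen else None


records = [
    (34, "F", 62_000, 180_000),
    (36, "F", 70_000, 200_000),
    (52, "M", 40_000, 90_000),
    (58, "M", 45_000, 110_000),
]

memory = LoanAmountMemory()
for record in records:
    memory.observe(*record)

# Feeding the same records in reverse order gives the same predictions,
# because nothing here depends on the sequence of records.
reordered = LoanAmountMemory()
for record in reversed(records):
    reordered.observe(*record)

print(memory.predict(35, "F", 65_000) == reordered.predict(35, "F", 65_000))
```

Because the prediction depends only on fields within a record, shuffling the stream changes nothing, which is what makes the pattern purely spatial.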

Spatial-Temporal
The most common case is when Grok finds both spatial and temporal patterns, as shown on the right side of the figure. In this case Grok finds relationships between the four fields, and also finds temporal patterns in how the combinations of fields change over time. In the website click example, let’s say each record now has four fields: time of day, day of week, age, and ID of the link that is clicked. As before, Grok will learn typical sequences of clicks, but it also finds that knowing the information in the first three fields helps it make better predictions. For example, Grok may find that teenagers tend to click different links at different times than seniors do. Although this example is simple, in many cases it is difficult to see the patterns when there are many fields with rapidly changing data.
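A rough sketch of the four-field click example follows. It is a toy counter, not Grok's algorithm: it combines the previous link (the temporal part) with the other fields of the record (the spatial part), and falls back to the sequence alone when that combination has not been seen. The helper `age_group`, the six-hour time bands, and the sample clicks are assumptions, and the previous link is passed in explicitly for brevity.

```python
from collections import Counter, defaultdict


def age_group(age):
    """Hypothetical banding of the age field."""
    return "teen" if age < 20 else "senior" if age >= 65 else "adult"


class ContextualClickPredictor:
    def __init__(self):
        self.by_context = defaultdict(Counter)   # (context, previous link) -> next-link counts
        self.by_sequence = defaultdict(Counter)  # previous link -> next-link counts

    def observe(self, hour, weekday, age, previous_link, link_id):
        context = (hour // 6, weekday, age_group(age))
        self.by_context[(context, previous_link)][link_id] += 1
        self.by_sequence[previous_link][link_id] += 1

    def predict(self, hour, weekday, age, previous_link):
        """Prefer counts for the full context; fall back to the sequence alone."""
        context = (hour // 6, weekday, age_group(age))
        counts = (self.by_context.get((context, previous_link))
                  or self.by_sequence.get(previous_link))
        return counts.most_common(1)[0][0] if counts else None


clicks = [
    (16, "Sat", 15, "home", "games"),
    (16, "Sat", 16, "home", "games"),
    (9, "Mon", 70, "home", "news"),
    (9, "Mon", 72, "home", "news"),
    (9, "Mon", 75, "home", "news"),
    (9, "Mon", 68, "home", "weather"),
]

model = ContextualClickPredictor()
for click in clicks:
    model.observe(*click)

print(model.predict(16, "Sat", 17, "home"))  # teenager on Saturday afternoon
print(model.predict(9, "Mon", 71, "home"))   # senior on Monday morning
```

With these sample clicks, the same previous link leads to different predictions for the teenager and the senior, which is the extra information the first three fields contribute.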
Grok searches for all three types of patterns when generating predictions in a data stream. Grok’s method is not “all or nothing.” At any point in a data stream Grok may be relying more on temporal patterns or more on spatial patterns. If only one field is streamed to Grok, then Grok can learn only temporal patterns. If more than one field is streamed to Grok, it will try to find spatial and temporal patterns.

Anomaly detection
In addition to making predictions, Grok can detect anomalies. As data is streamed to Grok, it can tell if the current input or the current sequence of inputs is novel. The first time a previously unseen pattern is observed, Grok adds it to a list of anomalies. This list is ordered by how novel the pattern is. A Grok developer can use this list of anomalies to look for machines that might need servicing or to look for potentially fraudulent transactions. Rather than relying on static rules to detect rare events (rules that need to be periodically reprogrammed), Grok learns. It looks at each data stream, uses past history to know what is normal and what is novel, and adapts as the world changes. If an anomaly is important, you can ask Grok to notify you every time it sees that pattern or sequence again.

The following attributes differentiate Grok from standard techniques:
  • Grok is a memory-based system. Experts using techniques such as linear regression use formulae to model data and make predictions. Formulaic systems can learn fast (only two data points are sufficient to define a line), and they can predict values beyond the range of what has been observed. Memory-based systems like Grok may take more data to train, but they can learn any pattern, including those that don’t fit any kind of mathematical expression. The sequence of notes in a melody is an example of a pattern that doesn’t fit a mathematical expression.
  • Grok is an online learning system. Online systems learn continuously and thus are better suited for applications where the patterns in the data change over time.
  • Grok automatically determines which factors (data fields) to use and how to encode them. This is often the task that requires the most skill. Most machine learning tools do not handle this automatically.
  • Grok learns variable-order time-based patterns. Most machine learning techniques do not have the ability to learn time-based patterns. With these systems you can encode previous data points as separate fields and thus include historical data as part of a spatial pattern, but this rarely works as well as methods that are inherently capable of handling time-based patterns.
  • Grok uses sparse distributed representations (SDRs). SDRs allow Grok to handle almost any kind of data, whereas some machine learning techniques are restricted in the kinds of data that can be used or predicted. SDRs also give Grok the ability to generalize on the basis of semantic similarity.
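
To illustrate the last point, here is a small sketch of how a sparse representation can carry similarity. It is a toy scalar encoder written for this example, not Grok's encoder: the function names, the 128-bit width, and the 8 active bits are all assumptions. Each value is represented by a small set of active bit positions, and similar values share active bits.

```python
def encode_scalar(value, minimum=0.0, maximum=100.0, n_bits=128, active_bits=8):
    """Return the set of active bit positions for a value (toy encoder)."""
    clipped = min(max(value, minimum), maximum)
    fraction = (clipped - minimum) / (maximum - minimum)
    start = round(fraction * (n_bits - active_bits))
    return frozenset(range(start, start + active_bits))


def overlap(a, b):
    """Number of active bits two representations have in common."""
    return len(a & b)


fifty = encode_scalar(50)
print(overlap(fifty, encode_scalar(52)))  # nearby value: most bits shared
print(overlap(fifty, encode_scalar(90)))  # distant value: no bits shared
```

Counting shared bits gives a direct measure of similarity, so a system that has learned something about one value can apply it to values whose representations overlap.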

The following paper describes Numenta's algorithms for learning and prediction. The document is available in the following languages, thanks to the generosity of the translators listed below (Numenta has not verified these translations).
 
  Hierarchical Temporal Memory including HTM Cortical Learning Algorithms




