Let me start with my own definition of Cognitive Analytics (which is quite different from that used by IBM):
Cognitive Analytics refers to a class of automated, autonomous, self-learning algorithms that collect, analyze, and interpret data, discover patterns, and forecast events of interest — algorithms that evolve over time and mimic the way a human would perform those same tasks.
Natural language processing, special infrastructure, or anything else is a bonus but, in my opinion, isn't essential to this subject.
The above definition would describe a child left alone in a large house and allowed to move around, try different things, open doors to many rooms, go into the attic, and learn from his own experience without any parental supervision. The longer the child does this, the better he understands (and predicts) his environment: he figures out what is normal and what is not, what is dangerous and what is not, and so on.
To illustrate this, I will use a "cognitive" learning algorithm developed at Alchemy IoT to detect anomalies in IoT devices, so that device health issues and maintenance needs can be described and predicted efficiently. In this particular case, however, the algorithm was "asked" to learn about the anomalies generated by just two sensors (temperature and light) placed in the middle of an office room seating five software developers and data analysts.
The Arduino device collecting this data is shown in the picture below. The third sensor is a button one can push to transmit "error conditions"; it can be ignored here as irrelevant.
Data is collected from these two sensors every 5 seconds, 24/7, and sent to our Alchemy IoT cloud for real-time processing, storage, and learning. Our proprietary learning anomaly detector analyzes the incoming stream to pick out truly anomalous events in the noisy environment of a room that has things like:
- natural light from the windows (from dawn till dusk) + overhead light (from the morning till the evening)
- a temperature-controlled environment whose climate control kicks in every once in a while
- random heat and shadows from the people walking and standing around
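To make the pipeline concrete, here is a minimal sketch of the 5-second telemetry loop described above. The sensor reads are simulated and the payload shape, field names, and device ID are my assumptions — the actual Alchemy IoT ingestion format is not public. A real device would sample its analog inputs and POST each payload to the cloud endpoint.

```python
import json
import time

def read_sensors():
    """Stand-in for real temperature/light reads from the Arduino (simulated values)."""
    return {"temperature_c": 22.4, "light_lux": 310.0}

def build_payload(device_id, reading, ts=None):
    """Package one 5-second sample the way a cloud ingester might expect it.

    Field names and structure are hypothetical, for illustration only.
    """
    return {
        "device": device_id,
        "ts": ts if ts is not None else time.time(),
        "temperature_c": reading["temperature_c"],
        "light_lux": reading["light_lux"],
    }

if __name__ == "__main__":
    # One sample; in production this would run in a loop with time.sleep(5)
    payload = build_payload("office-arduino-01", read_sensors(), ts=0)
    print(json.dumps(payload))
```

The point is only that each sample is small and self-describing; everything interesting happens on the cloud side, after ingestion.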
Below is a trace comparing our "instantaneous health index" (lower portion of the chart, labeled "Unschooled") with the learning health index (upper portion, labeled "Learning"). Two reported states are possible: "yellow" for "anomaly" and "green" for "everything is normal". The two red health reports correspond to two clicks of the "error" button attached to the Arduino device ("red" means "errors"). The horizontal axis is time, and the time window always covers the last 24 hours.
Now, notice that the dominant state is "green", because most of what happens during 24 hours is considered normal by our anomaly detection engine. The "yellow" events shown correspond to "anomalies" (we identify them with complex proprietary ML-based anomaly-detection techniques). Notice also that both charts look identical: at this point the learning period was still too short, and nothing had been learned yet.
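The actual detection technique is proprietary, but the "instantaneous" (unschooled) behavior can be sketched with something much simpler — for example, a rolling z-score that flags a sample "yellow" when it deviates sharply from the recent window. The window size and threshold below are arbitrary illustrative choices, not Alchemy IoT's parameters.

```python
from collections import deque
import statistics

def instantaneous_state(history, value, z_threshold=3.0):
    """Return 'yellow' if value sits > z_threshold sigmas from the recent window,
    else 'green'. Stays 'green' while the window is still warming up."""
    if len(history) < 10:
        return "green"
    mean = statistics.fmean(history)
    stdev = statistics.pstdev(history) or 1e-9  # avoid divide-by-zero on flat data
    z = abs(value - mean) / stdev
    return "yellow" if z > z_threshold else "green"

# Simulated temperature stream: steady readings, one heat spike, then steady again
window = deque(maxlen=720)  # roughly the last hour at one sample per 5 seconds
states = []
for v in [22.0] * 50 + [35.0] + [22.1] * 10:
    states.append(instantaneous_state(window, v))
    window.append(v)
```

An unschooled detector like this flags the spike every time it occurs, no matter how often — which is exactly why both charts look identical before any learning has taken place.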
The next figure shows the same plot but 4 days later. Notice the difference between the “unschooled” and “learning” health index:
The "Unschooled" health index reports plenty of anomalies, but its learning colleague reports noticeably fewer: in fact, over the last 24 hours, the learning index understood and removed from reporting roughly half (7 out of 14) of all observed "anomalies", which are no longer considered "true anomalies" but rather "new normality".
The longer the algorithm learns, the better it separates true anomalies from events that merely look anomalous but are in fact common. In the end, what gets reported after a few weeks of learning is a small set of truly unusual, anomalous events that require attention. And nothing else.
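The learning layer itself is proprietary, but its observable effect — recurring anomalies being reclassified as "new normality" — can be illustrated with a toy recurrence filter. The signature (sensor name plus hour of day) and the promotion threshold below are my own simplifications, not the real algorithm.

```python
from collections import Counter

class LearningFilter:
    """Toy stand-in for the learning health index.

    A raw anomaly whose (sensor, hour-of-day) signature has already recurred
    `promote_after` times is re-labeled 'new-normal' and suppressed from
    reporting; only genuinely rare events keep the 'anomaly' label.
    """

    def __init__(self, promote_after=3):
        self.promote_after = promote_after
        self.seen = Counter()  # how often each signature has fired

    def report(self, sensor, hour_of_day):
        key = (sensor, hour_of_day)
        self.seen[key] += 1
        if self.seen[key] > self.promote_after:
            return "new-normal"  # learned: no longer reported as yellow
        return "anomaly"         # still reported

# The 7 a.m. light spike (sunrise through the windows) fires every day;
# after a few occurrences the filter stops treating it as an anomaly.
f = LearningFilter(promote_after=3)
labels = [f.report("light", 7) for _ in range(5)]
```

This captures the behavior in the charts: the unschooled index keeps firing on the daily sunrise, while the learning index, given enough days of experience, folds it into the baseline.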
I used this example to demonstrate what I think should be considered real "cognitive analytics": unsupervised, autonomous, automated, self-learning. Getting smarter with experience, more data, and more time — just like a human child does early in life.
P.S. Special thanks to Noel Lane and Nick Roseveare for setting up and conducting this experiment.