Big Data Analytics (BDA) and the Internet of Things (IoT) are two of the most exciting and rapidly changing fields in technology. The two are closely connected: IoT devices generate (and stream) large volumes of data, which typically require BDA to produce actionable insights and, in many cases, to control the devices themselves. Let’s take a look at the interesting things that happened in these two areas last week (September 12-18, 2016).
The largest IoT device?
Let’s start with one of the favorite subjects for data storage, distributed computing, and big data analytics professionals: CERN. This article reminds us how much data CERN generates and collects and how much storage and compute power is needed to deal with this data:
“Together, the LHC experiments produce more than 30 petabytes of data per year. CERN’s Data Centre (including its remote extension in Budapest, Hungary) provides 250 petabytes of disk storage space and around 200,000 computing cores. Analysis of the physics data is made possible by a global network of more than 170 computer centers known as the Worldwide LHC Computing Grid. Each day, more than 2 million jobs run on this network.”
The article also suggests that the CERN super-collider could be viewed as one large IoT device with “approximately 50,000 sensors and other metering devices the center uses to capture operational data from its accelerator complex. The data is then analyzed by a team of data scientists to ensure these accelerators are operating at their full potential—and if not, identify what resources are needed so that they do.” It looks like a nearly perfect example of the IoT-BDA synergy.
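To put the quoted figures in perspective, here is a back-of-the-envelope sketch that converts them into average rates. It assumes decimal units (1 PB = 10^15 bytes) and a 365-day year, and it treats the "more than" figures as exact, so the results are rough lower bounds. The function names are purely illustrative and are not taken from the article.

```python
# Rough conversion of the quoted CERN figures into average rates.
# Assumptions (not from the article): decimal units and a 365-day year.
PB = 10**15
TB = 10**12
GB = 10**9
SECONDS_PER_DAY = 24 * 60 * 60
DAYS_PER_YEAR = 365

DATA_PER_YEAR_BYTES = 30 * PB    # "more than 30 petabytes of data per year"
DISK_CAPACITY_BYTES = 250 * PB   # "250 petabytes of disk storage space"
JOBS_PER_DAY = 2_000_000         # "more than 2 million jobs" each day


def terabytes_per_day(bytes_per_year: float) -> float:
    """Average daily data volume in terabytes."""
    return bytes_per_year / DAYS_PER_YEAR / TB


def gigabytes_per_second(bytes_per_year: float) -> float:
    """Average sustained data rate in gigabytes per second."""
    return bytes_per_year / (DAYS_PER_YEAR * SECONDS_PER_DAY) / GB


def jobs_per_second(jobs_per_day: float) -> float:
    """Average number of grid jobs started per second."""
    return jobs_per_day / SECONDS_PER_DAY


def years_of_data(capacity_bytes: float, bytes_per_year: float) -> float:
    """How many years of output the disk capacity equals (a plain ratio)."""
    return capacity_bytes / bytes_per_year


if __name__ == "__main__":
    print(f"{terabytes_per_day(DATA_PER_YEAR_BYTES):.1f} TB per day")
    print(f"{gigabytes_per_second(DATA_PER_YEAR_BYTES):.2f} GB per second")
    print(f"{jobs_per_second(JOBS_PER_DAY):.1f} jobs per second")
    print(f"{years_of_data(DISK_CAPACITY_BYTES, DATA_PER_YEAR_BYTES):.1f} years")
```

Under these assumptions, the experiments average roughly 82 TB of new data per day (just under 1 GB per second), the grid starts about 23 jobs every second, and the disk capacity equals a little over eight years of output. The last number is only a ratio, not a statement about how CERN actually allocates its storage.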
IoT Security – dangers and actions
IoT security is probably the biggest concern for people working in the field, and even governments are taking notice. For example, on September 9, Reuters reported that the US Justice Department has formed a research group to analyze the national security vulnerabilities of self-driving cars, medical tools and other network-connected devices. As an example of already existing IoT issues, the article mentions that “in July 2015, Fiat Chrysler Automobiles NV recalled 1.4 million U.S. vehicles to install software after a magazine report raised concerns about hacking, the first action of its kind for the auto industry.”
According to this study, “100% of reported Internet of Things vulnerabilities in connected home and wearables products do not have to happen. If manufacturers and developers take security and privacy measures into account throughout the development process, breaches need not be fatal.” That is comforting: at least for some IoT devices, the hacking problem can be controlled. However, the authors of the study warn that the IoT industry cannot afford to be complacent:
“If businesses do not make a systemic change we risk seeing the weaponization of these devices and an erosion of consumer confidence impacting the IoT industry on a whole due to their security and privacy shortcomings.”
You can check out more of this study if you are interested in their analysis of the main “glaring failures” leading to IoT security issues.
Data has personality?
We can all agree that modern data is very diverse. A somewhat unusual article in Forbes states that “information management company Veritas suggests that this [data] diversity has created a world where we can think of different forms of data having different ‘personalities’ rather than just different values.” Modern data comes in different sizes and shapes, arrives at various speeds, and varies in importance (criticality), in how often it is accessed (hot/cold), in structure (structured/unstructured), and so on. According to Forbes, Veritas “has come forward with the ‘data personality’ analogy as part of series of comments issued in line with its latest product releases in the cloud data management segment.”
Merriam-Webster defines personality as “the set of emotional qualities, ways of behaving, etc., that makes a person different from other people.” Thus, IMHO, data cannot have personality. Algorithms do. They may respond differently to the same data. In many cases, these are personalities of their creators.
New speech recognition accuracy record
According to Microsoft, its speech scientists at Microsoft Research have achieved a word error rate (WER) of just 6.3% under an industry-standard evaluation, using techniques that will eventually enhance Cortana. The previous lowest error rate was 6.9%, achieved by IBM’s Watson team, which had beaten its own record of 8% set last year.
I wonder what the accuracy rating is for an average human? It cannot be 100%. Otherwise, how do we explain all the “What?” questions during our conversations?
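For readers unfamiliar with the metric, word error rate is conventionally defined as the number of word substitutions, deletions, and insertions needed to turn the recognizer's output into the reference transcript, divided by the number of words in the reference. The sketch below computes it with a standard word-level edit distance. It is only an illustration of the definition: the function name and the plain whitespace tokenization are assumptions, and it says nothing about how Microsoft or IBM ran their evaluations.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Return (substitutions + deletions + insertions) / reference word count.

    Both strings are split on whitespace; real evaluations usually apply
    additional text normalization, which is omitted here.
    """
    ref_words = reference.split()
    hyp_words = hypothesis.split()
    if not ref_words:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words seen so far
    # and the first j hypothesis words (classic Levenshtein recurrence).
    prev = list(range(len(hyp_words) + 1))
    for i, ref_word in enumerate(ref_words, start=1):
        curr = [i] + [0] * len(hyp_words)
        for j, hyp_word in enumerate(hyp_words, start=1):
            substitution_cost = 0 if ref_word == hyp_word else 1
            curr[j] = min(
                prev[j] + 1,                      # deletion (word missing from hypothesis)
                curr[j - 1] + 1,                  # insertion (extra word in hypothesis)
                prev[j - 1] + substitution_cost,  # match or substitution
            )
        prev = curr

    return prev[-1] / len(ref_words)


if __name__ == "__main__":
    reference = "the cat sat on the mat"
    hypothesis = "the cat sat on a mat"
    print(f"WER: {word_error_rate(reference, hypothesis):.1%}")
```

To get a feel for the scale, a WER of 6.3% corresponds to roughly one wrong word in every sixteen. Note that WER can exceed 100% when the hypothesis contains many inserted words, which is one reason it is not simply "100% minus accuracy".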
IoT and the future of banking
It is fair to say that ATMs are some of the most successful IoT devices out there.
They’ve been around for a while and have served both banks and their customers well for years. Millions of them are in use every day, and new technologies are coming to these IoT devices in the near future. If this subject is of interest to you, the following article provides lots of relevant numbers and discusses some future trends in both banking and IoT in general.
Big Data for Solar
This interesting article brings up the fact that solar energy products are “still being sold door-to-door just like vacuum cleaners in the 1950s.” It then proceeds to discuss a few new ideas and companies trying to accelerate the solar market using BDA to find the right customers, cheaper financing, the right engineering solution, a way to optimize customer heating and cooling, etc.
No politics, just technology
How important is BDA in today’s world? Could it change the outcome of the presidential election in the most influential country in the world? All you need is a team of 60 mathematicians and data analysts. The outcome is not guaranteed, but the probability of success goes up. Read more about it here, if you are into a mix of politics and technology.
Nothing but numbers
CERN experiments produce more than 30 petabytes of data per year.
The CERN supercollider has approximately 50,000 sensors and other metering devices.
26% of UK business leaders say that data analysis is the most important skill or competency a potential new employee can have. Overall, 60% said they “consider data and analytics skills one of the top two skills”, behind only industry experience, at approximately 69%.
In the US, the average annual salary of a data scientist is, according to Glassdoor.com, $119,000.
The Raspberry Pi foundation announced it has sold more than 10 million Raspberry Pi boards. A large fraction of them have been sold to people working on IoT projects ranging from home automation to industrial sensor networks.
The fifth-generation, or “5G”, telecom system is expected to provide the backbone for the Internet of Things, beginning as soon as 2018.
There were 2.7 million ATMs installed around the world in 2015, showing an increase from 2 million in 2010, according to estimates from BI Intelligence.
A new record has been set for the lowest speech recognition word error rate: it now stands at 6.3% (courtesy of Microsoft).