Today, connected devices generate more data than social networks. Each device can send data several times per second. With millions of connected devices, a typical data processing platform may have to handle billions of incoming events every day.
Processing this amount of data is a considerable technological challenge in itself. Yet the device data alone, even when stored in a preprocessed form, is not actionable. To gain actionable insights, the collected data must be analyzed.
One type of task that can be effectively tackled with data analysis in the IoT is anomaly detection. Its goal is to identify unusual behavior in connected devices that differs significantly from what has been observed before or from what is expected.
Is everything all right with my connected lawn mowers?
Let’s look at an example taken from one of our anomaly detection projects, in which we applied our algorithms to a fleet of autonomous lawn mowers (ALMs). Using one of our Bosch IoT Analytics services, anomalies can be calculated for this fleet of IoT-enabled lawn mowers over the mowing season. The input data consists of status and error messages that the mowers in use send to the backend in the cloud.
Let’s assume that every week, our service is configured to identify the top ten anomalies in this data. Lawn mowers that repeatedly pop up in the list of top anomalies can be automatically marked and organized in a list. Service personnel and/or quality managers can then inspect them manually. In addition, the results of the anomaly detection can be analyzed for significant patterns and grouped into categories of incidents.
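The weekly ranking and the list of repeat offenders described above can be sketched in a few lines of Python. This is a minimal illustration, not the Bosch IoT Analytics implementation: the function names, the device IDs, and the anomaly scores are all hypothetical, and the scores are assumed to have been computed already by some detection step.

```python
from collections import Counter


def top_anomalies(scores, n=10):
    """Return the n device IDs with the highest anomaly score for one week."""
    ranked = sorted(scores.items(), key=lambda item: item[1], reverse=True)
    return [device_id for device_id, _ in ranked[:n]]


def repeat_offenders(weekly_scores, n=10, min_weeks=2):
    """Return devices that appear in the weekly top-n list in at least min_weeks weeks."""
    counts = Counter()
    for scores in weekly_scores:
        counts.update(top_anomalies(scores, n))
    return sorted(device for device, weeks in counts.items() if weeks >= min_weeks)


# Hypothetical anomaly scores per device, one dictionary per week.
weekly_scores = [
    {"alm-001": 0.91, "alm-002": 0.12, "alm-003": 0.55},
    {"alm-001": 0.88, "alm-002": 0.10, "alm-003": 0.20},
    {"alm-001": 0.15, "alm-002": 0.11, "alm-003": 0.76},
]

# With a top-1 list per week, only alm-001 shows up in two or more weeks.
flagged = repeat_offenders(weekly_scores, n=1, min_weeks=2)
print(flagged)
```

In a real fleet, n would be ten as in the example above, and min_weeks would be chosen by the service team depending on how many devices they can inspect manually.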
For instance, particular patterns of state and error messages can be an indication that the firmware of individual mowers needs to be updated, or that the mowers have not been set up properly. By grouping the observed patterns into categories, solution strategies – i.e. specific actions – can be associated with them and triggered automatically whenever the patterns crop up in the event data. This can result in actively pushing the latest firmware on the affected mower, or proactively contacting the customer (provided that they have given their consent) to offer support from a service technician. These are ways of increasing customer satisfaction.
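As a rough sketch of how pattern categories could be tied to actions, the following Python snippet uses a small rule table. Everything in it is an assumption made for illustration: the error codes, the category names, the action names, and the matching rules are hypothetical and do not come from an actual mower or service.

```python
# Each rule: (category, predicate over a device's message list, action to trigger).
# Error codes such as "E_FW_MISMATCH" are made up for this example.
def looks_like_outdated_firmware(messages):
    return messages.count("E_FW_MISMATCH") >= 3


def looks_like_bad_setup(messages):
    return "E_NO_BOUNDARY_WIRE" in messages


RULES = [
    ("outdated_firmware", looks_like_outdated_firmware, "push_firmware_update"),
    ("bad_setup", looks_like_bad_setup, "offer_technician_support"),
]

# Actions that involve contacting the customer require prior consent.
NEEDS_CONSENT = {"offer_technician_support"}


def actions_for(messages, customer_consented):
    """Return (category, action) pairs for every rule matching the messages."""
    result = []
    for category, matches, action in RULES:
        if not matches(messages):
            continue
        if action in NEEDS_CONSENT and not customer_consented:
            continue
        result.append((category, action))
    return result


print(actions_for(["E_FW_MISMATCH"] * 3, customer_consented=False))
print(actions_for(["E_NO_BOUNDARY_WIRE"], customer_consented=True))
```

Keeping the rules in a table rather than in nested conditionals makes it easier for domain experts to add a new category once a new pattern has been identified.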
Data analysis – and anomaly detection in particular – is not one procedure but rather a generic name for a number of algorithms and transformations aimed at extracting implicit knowledge hidden in the data. There are many different types of anomalies and many different problem domains with their specific data and problem formulations.
The process of data analysis involves many steps and uses quite different technologies – from format transformations to sophisticated machine-learning algorithms and the construction of valuable visualizations. Typically, a data analysis process includes the following steps:
Step 1: Making device data available
Step 2: Pre-processing device data
In the overall analysis process, data preprocessing tasks can account for most of the difficulty. That is why it is important to choose or develop a technology that supports the efficient development and execution of preprocessing scripts. This step covers tasks such as data cleansing and the generation of domain-specific features. It is frequently referred to as data wrangling, which is defined as iterative data exploration and transformation to enable analysis.
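To make the two tasks named above more concrete, here is a minimal Python sketch of data cleansing followed by feature generation. The record layout (device_id, timestamp, message), the "ERROR" prefix convention, and the chosen features are assumptions for illustration only; real device data will look different.

```python
from collections import defaultdict


def clean(events):
    """Drop records with missing fields and exact duplicates, keeping the order."""
    seen = set()
    cleaned = []
    for event in events:
        key = (event.get("device_id"), event.get("timestamp"), event.get("message"))
        if None in key or key in seen:
            continue
        seen.add(key)
        cleaned.append({"device_id": key[0], "timestamp": key[1], "message": key[2]})
    return cleaned


def features(events):
    """Per device: number of messages and share of messages that are errors."""
    totals = defaultdict(int)
    errors = defaultdict(int)
    for event in events:
        totals[event["device_id"]] += 1
        if event["message"].startswith("ERROR"):
            errors[event["device_id"]] += 1
    return {
        device: {"n_messages": totals[device], "error_share": errors[device] / totals[device]}
        for device in totals
    }


# Hypothetical raw events, including one duplicate and one incomplete record.
raw = [
    {"device_id": "alm-001", "timestamp": 1, "message": "STATUS_MOWING"},
    {"device_id": "alm-001", "timestamp": 1, "message": "STATUS_MOWING"},
    {"device_id": "alm-001", "timestamp": 2, "message": "ERROR_LIFTED"},
    {"device_id": "alm-002", "timestamp": 1, "message": None},
    {"device_id": "alm-002", "timestamp": 3, "message": "STATUS_CHARGING"},
]

device_features = features(clean(raw))
print(device_features)
```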
Step 3: Analyzing device data
This process step focuses on finding anomalies in the input data, which involves choosing an appropriate data mining algorithm and fine-tuning its parameters.
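The article does not name a specific algorithm, so the following Python sketch simply picks one common option as an example: a robust score based on the median and the median absolute deviation (MAD). The feature values and the threshold of 3.5 are assumptions; the threshold is exactly the kind of parameter that would need fine-tuning for a given fleet.

```python
from statistics import median


def robust_scores(values):
    """Score each device by its distance from the median, scaled by the MAD."""
    med = median(values.values())
    mad = median(abs(v - med) for v in values.values())
    if mad == 0:
        # Simplification: with no spread in the data, report no anomalies.
        return {device: 0.0 for device in values}
    # 0.6745 makes the score comparable to a z-score for normally distributed data.
    return {device: 0.6745 * abs(v - med) / mad for device, v in values.items()}


def find_anomalies(values, threshold=3.5):
    """Return the devices whose robust score exceeds the threshold."""
    scores = robust_scores(values)
    return sorted(device for device, score in scores.items() if score > threshold)


# Hypothetical share of error messages per mower.
error_share = {
    "alm-001": 0.02,
    "alm-002": 0.03,
    "alm-003": 0.01,
    "alm-004": 0.02,
    "alm-005": 0.60,
}

print(find_anomalies(error_share))
```

A median-based score is used here because a single extreme device barely shifts the median, whereas it would inflate a mean and standard deviation and could mask itself.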
Step 4: Visualizing device data
Last but not least, the data must be visualized for the end user. In doing so, it is important to choose visual techniques that are appropriate for the task being solved and the respective problem domain.
Detecting anomalies is just a first step towards more complex IoT analytics use cases, such as predictive maintenance. Once devices that are behaving anomalously have been identified, they need to be explored by domain experts and classified into problem classes. If possible, information on how each problem was solved should also be annotated. Compiling maintenance information and merging it with this data and the analysis results allows the construction of a clean and rich data set. In turn, this data set can be used to build a prediction model of the type required for predictive maintenance solutions.
Moreover, not only do these anomaly detection results highlight problems, but they can also point domain experts to new (business) opportunities. If specific anomalies are showing up in different devices in a systematic way, it may be an indicator that a certain feature is missing. In the case of the autonomous lawn mowers, systematic anomalies showing up in a sub-group might be caused by a repeated pattern of special topography in the gardens being mowed. Hence, these might require an algorithmic add-on for the lawn mower that can be sold as an advanced functionality.
We recently published a white paper on “Anomaly detection with event data in the Internet of Things” that generated a lot of interest. It focuses on the challenges and best practices for the above-mentioned processing steps and includes observations I made in various data analytics projects.
Since anomaly detection has attracted so much interest, I am delighted to offer you a free webinar on January 31, 2017. The webinar is aimed in particular at IoT project leaders, data analytics experts, and software architects of IoT solutions.