Published online by Cambridge University Press: 05 December 2012
Introduction
With sensors becoming ubiquitous, there is an increasing interest in mining the data from these sensors as the data are being collected. This analysis of streaming data, or data streams, is presenting new challenges to analysis algorithms. The size of the data can be massive, especially when the sensors number in the thousands and the data are sampled at a high frequency. The data can be non-stationary, with statistics that vary over time. Real-time analysis is often required, either to avoid untoward incidents or to understand an interesting phenomenon better. These factors make the analysis of streaming data, whether from sensors or other sources, very data- and compute-intensive. One possible approach to making this analysis tractable is to identify the important data streams to focus on them. This chapter describes the different ways in which this can be done, given that what makes a stream important varies from problem to problem and can often change with time in a single problem. The following illustrate these techniques by applying them to data from a real problem and discuss the challenges faced in this emerging field of streaming data analysis.
This chapter is organized as follows: first, I define what is meant by streaming data and use examples from practical problems to discuss the challenges in the analysis of these data. Next, I describe the two main approaches used to handle the streaming nature of the data – the sliding window approach and the forgetting factor approach.
To save this book to your Kindle, first ensure [email protected] is added to your Approved Personal Document E-mail List under your Personal Document Settings on the Manage Your Content and Devices page of your Amazon account. Then enter the ‘name’ part of your Kindle email address below. Find out more about saving to your Kindle.
Note you can select to save to either the @free.kindle.com or @kindle.com variations. ‘@free.kindle.com’ emails are free but can only be saved to your device when it is connected to wi-fi. ‘@kindle.com’ emails can be delivered even when you are not connected to wi-fi, but note that service fees apply.
Find out more about the Kindle Personal Document Service.
To save content items to your account, please confirm that you agree to abide by our usage policies. If this is the first time you use this feature, you will be asked to authorise Cambridge Core to connect with your account. Find out more about saving content to Dropbox.
To save content items to your account, please confirm that you agree to abide by our usage policies. If this is the first time you use this feature, you will be asked to authorise Cambridge Core to connect with your account. Find out more about saving content to Google Drive.