I have been reading quite a bit lately about the coming wave of new data—how ubiquitous sensors, personal devices, and the whole IoT space will create a deluge of data that current analytical systems could not even begin to handle.
The fact of the matter is that our universe of data to be handled centrally should be shrinking rather than growing. Here’s why:
I recently worked in an oilfield application where active oil and gas production wells sitting in the middle of a Texas desert sent back hundreds of megabytes of data each day. I asked the folks in charge: what kind of data is being sent back? “Sensor readings”, they told me, “measurements of the casing pressure every 5 minutes or so and hundreds of other kinds of measurements that are the vital signs of the well.”
“OK”, I said, “this data value here”, as I pointed to a sample entry, “what action would you take based on that reading?”
“Nothing, because that’s a normal reading.”
“You mean you collected that data and there’s no action to perform on it?”
“Yes, but then you never know when you get an exception. It could come at any time. We have to collect it all and then our app back at the central server will figure out when an exception has occurred.”
Upon further investigation I found that less than one tenth of one percent of the data on this particular well was actionable. Less than one percent was even worth archiving. Yet that entire huge stream of data was expensively transported “back to corporate” for processing, where almost all of it was eventually thrown away. Sheer instinct around “let’s collect that data and report it” caused a huge overreach in what was actually needed to run the business.
A big part of the IoT revolution is not simply that we can acquire more sensor data. It's also about moving real computing power "to the edge," in other words, to the physical business end of company operations. I think we often forget that the little device attached to a pump or a vehicle or a machine or even a human is as powerful as desktop computers were less than a decade ago. Let's use that computing power to dramatically REDUCE the amount of data we extract and archive, using code running on the device to record averages and flag exceptions, condensing the noise and improving the signal-to-noise ratio of what does move back to central. Let's perform real computing at the edge, and offload the burden from central servers.
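To make the idea concrete, here is a minimal sketch in Python of the kind of edge-side condensing described above: keep a rolling summary, forward only the averages plus any readings that deviate sharply from their window. The function name, window size, and z-score threshold are my own illustrative assumptions, not details from the oilfield system.

```python
from statistics import mean, stdev

def summarize_readings(readings, window=12, z_threshold=3.0):
    """Condense a stream of sensor readings into per-window averages
    plus any exceptional values.

    readings: a list of floats, e.g. casing pressure sampled every
    5 minutes, so window=12 covers one hour.
    Returns (summaries, exceptions): summaries are per-window means;
    exceptions are (index, value) pairs that deviate from their
    window mean by more than z_threshold standard deviations.
    """
    summaries, exceptions = [], []
    for start in range(0, len(readings), window):
        chunk = readings[start:start + window]
        avg = mean(chunk)
        summaries.append(avg)
        if len(chunk) > 1:
            sd = stdev(chunk)
            for i, value in enumerate(chunk):
                if sd > 0 and abs(value - avg) > z_threshold * sd:
                    exceptions.append((start + i, value))
    return summaries, exceptions

# A day of 5-minute readings: 288 values, almost all "normal".
normal_day = [1500.0 + (i % 3) for i in range(288)]
normal_day[100] = 2400.0  # one genuine pressure spike

summaries, exceptions = summarize_readings(normal_day)
# 288 raw readings shrink to 24 hourly averages plus 1 exception,
# which is all the central server actually needs to see.
```

Running on the device itself, something like this turns hundreds of megabytes of raw telemetry per day into a handful of summary values and the rare exception worth acting on.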
Now, I have written extensively in the past, including in my recent book Profit from Science, about how measurement is nearly free, and therefore we should measure everything. That idea does not conflict with my assertions here. Yes, measure everything, but that does not mean everything needs to be stored at 10-second intervals. Be generous with sensory inputs but stingy when it comes to archival. I've observed many companies embracing IoT, and quite a few of them get this distinction wrong.
Moving beyond IoT to the whole universe of data we work with, I don't see us placing smart value judgments on the data we handle, casting aside the reams of information that are neither actionable now nor likely to be actionable in the future. Why don't we take the same level of intelligence we apply to the analysis of data and apply it to the question of which data is worth saving and handling further?
It’s a big (data) world out there. Let’s make it a little smaller.
About the Author
George Danner is president of Business Laboratory, LLC, an award-winning firm that uses scientific techniques and methods to improve organizational performance. With more than 30 years of experience in corporate strategy, George keeps his finger on the pulse of the latest trends in the global data realm. He recently authored a book on Big Data business strategies, Profit From Science, which debuted as the #1 Bestseller in Business Mathematics on Amazon. To learn more about George and Business Laboratory, visit www.business-laboratory.com.