The Trouble with Big Data
Be in no doubt. Data is big and getting bigger.
Hang on, scratch that. Purists will already have spotted the error. Data is, of course, the plural and so to describe data as big suggests that each individual, indivisible piece of information is itself large.
I'm not being pedantic. Language evolves, and the precise meanings of words shift to meet the times in which they are employed, so I have no problem with data being used to describe a singular corpus of information. But thinking about what part of "data" is big and getting bigger did lead me to a certain conclusion.
It's precisely because individual data do not increase in size as our total data set explodes in girth that we may have an issue with getting the most value out of it.
At any particular point it is possible to say that in the last year, quarter, month or whatever, more data has been added to the global corpus than in the previous millennium, century or whichever way you want to cut it.
The growth of information that is stored (and yet remains retrievable) outpaces Moore's Law by a considerable factor and thus the promise of trawling the data for insight, value and intelligence becomes not just an enticing possibility but a technical challenge.
One of the greatest challenges will be how we learn to contextualise and give weight to the data from different sources, be they operational, geographic or historical. Look at it this way: if you have one photograph of a great-grandparent, yet fifteen thousand of each of your children, it might be tough to compile a meaningful document fully describing the lives of five generations of your family in pictures alone.
In procurement, a lot of the intel that we need to support strategy and planning for the future can be derived from our corporate experience, and the further back (and further afield) we go, the more likely we are to be able to make accurate predictions of future trends. But with those more distant data sources vying for mind-space with today's bulkier feeds, we might find as much effort is put into sifting the wheat from the chaff as is dedicated to actual decision-making.
Maybe there is a case for a kind of pre-emptive positive weighting of data that is more limited in scale, giving it an equal footing in its influence on the overall pattern. But alongside that we'd perhaps need to evaluate those data which are at a greater remove with a higher degree of critical appraisal. After all, there's no way to tell what Great-Grandmother was really like from one awkwardly posed grainy print from 1922.
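For the curious, here is one way that "equal footing" idea might look in practice. It is only a sketch under my own assumptions: every record gets a weight of one divided by the number of records from its source, so a lone 1922 data point carries as much total influence as a crowded recent feed. The function names, source labels and figures are all made up for illustration.

```python
from collections import Counter


def equal_footing_weights(sources):
    """Weight each record by 1 / (number of records from the same source)."""
    counts = Counter(sources)
    return [1.0 / counts[s] for s in sources]


def weighted_mean(values, weights):
    """Average the values, letting each one count in proportion to its weight."""
    return sum(v * w for v, w in zip(values, weights)) / sum(weights)


# Hypothetical data: one old record and four recent ones.
sources = ["1922", "2015", "2015", "2015", "2015"]
values = [10.0, 20.0, 20.0, 20.0, 20.0]

weights = equal_footing_weights(sources)

plain = sum(values) / len(values)          # the bulky feed dominates: 18.0
balanced = weighted_mean(values, weights)  # each source counts once: 15.0
```

Note that this only tackles the first half of the problem. Giving the sparse source a louder voice does nothing to tell us how far to trust it, which is where the critical appraisal comes in.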
Image credit: Pixabay.com