This morning I was reflecting on the difference between Data and Information (specifically thinking about the DIKW Pyramid). While the distinction is inherently simple, I’ve always had a difficult time describing it, so here is what I came up with:
Data is Objective, Information is Subjective
Data is absolute; it is objective and, therefore, inert.
Information is different. Information is always made up of at least two data points, meaning two things:
1. A piece of Information is always relative – its meaning is dependent on which data points make it up
2. A piece of Information must be curated – Someone has to connect the data points together. The person creating the association (whether on purpose or by accident) is a determining factor in the meaning created.
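The two properties above can be sketched as a small data structure. This is only an illustration under stated assumptions: the names `DataPoint`, `Information`, and `curator`, and all of the figures, are hypothetical and chosen for this example. It shows one data point becoming two different pieces of information depending on who connects it to what.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class DataPoint:
    """A bare value with a label: objective and inert (hypothetical type)."""
    label: str
    value: float


@dataclass(frozen=True)
class Information:
    """Two or more data points connected by a curator (hypothetical type)."""
    points: Tuple[DataPoint, ...]
    curator: str

    def __post_init__(self) -> None:
        # Property: information is made up of at least two data points.
        if len(self.points) < 2:
            raise ValueError("information needs at least two data points")

    def describe(self) -> str:
        # The meaning is relative: it depends on which point the first
        # one was connected to, and on who made the connection.
        first, second = self.points[0], self.points[1]
        direction = "above" if first.value > second.value else "not above"
        return f"{self.curator}: {first.label} is {direction} {second.label}"


# Made-up figures, purely for illustration.
attendance = DataPoint("attendance", 950_000)
last_year = DataPoint("last year's attendance", 700_000)
target = DataPoint("the organisers' target", 1_000_000)

# The same data point, curated by two different people.
optimist = Information((attendance, last_year), "optimist")
pessimist = Information((attendance, target), "pessimist")

print(optimist.describe())
print(pessimist.describe())
```

The `attendance` value never changes; only the second data point and the curator do, and that is enough to change the meaning.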
Data does not exist
Well, for all intents and purposes it doesn’t, because humans cannot see data.
“Language” is a tool in the Information space. Because information must be curated, it can no longer be inert: a piece of data must always be framed, so it will always be biased. It is the simple difference between stating “almost a million people…” and “not even a million people…”. When we are on the receiving end of this information, the effect is very similar to the Anchoring and Adjustment Heuristic – the words “almost” or “not even” imply an anchor, so “a million” will be understood as a variation in one direction or another from a norm.
But Data isn’t always expressed to us in Language! What about times when we simply observe a data point in a vacuum?
Think for a second about the old story of blue blood. As the story goes, blue blood cells exist, and we can even define them: they are red blood cells without oxygen. It is the oxygen that makes them red, so the natural state of a red blood cell is a blue one. The catch is that, without the proper equipment, you can’t see a blue blood cell – every time you open a vein, oxygen rushes in and the blue cell you’re trying to see immediately turns red! (Physiologically, deoxygenated blood is really a darker red rather than blue, but the picture still works as a metaphor.)
Data is like a blue blood cell; think of our own past experience as the oxygen.
Humans are living, breathing pattern collectors. When an “informed*” person looks at a piece of “raw” Data, her brain – based on patterns she already knows – immediately infers the most likely context for it. By supplying the context, the observer elevates that data point to information before she even consciously considers it – informed persons cannot see Data, only Information.
Even if we feel uninformed about a particular piece of data, we have a host of strategies for understanding things to fall back on that help us construct a meaning.**
Like the blue blood in our veins, we can infer the existence of unbiased data (for instance, computers collect and store data without meaning), but as humans we are unable to experience it. (If you want to get close, choose an industry that you aren’t an expert in and find a detailed report – for instance, today’s trading price for a commodity you don’t use in a country you’ve never heard of.)
These have been raw and unstructured ramblings; I hope to take a more refined pass in the future. But I think that these definitions will be important to standardize as we prepare for all of the “Big Data” arguments that are already beginning to pop up.
*By extension, we usually think about “knowledgeable” people as those who have more extensive pattern libraries and can more quickly infer possible contexts for whatever situation is at hand.
**Let’s also remember that Bias is a good thing – as a strategy we have for focusing, it helps us fight cognitive overload by framing information incrementally and constructing the best immediate solution for real-world situations.