In this post, I am going to start to elaborate on why big
data makes sense. Now, this clearly doesn’t sound like a ground-breaking insight.
You can google “big data” and come up with literally hundreds of
articles that invariably say that the amount of data generated in the world
exceeds the available storage capacity. That customers are generating petabytes
of data through their interactions, feedback, etc. That the cost of computing and
storage is a fraction of what it was even ten years ago. That Google,
Amazon and Facebook have
invested in big data infrastructure by setting up commodity servers.
But what I have personally found missing in all of this
megatrend information is a clear articulation of why a
big company should embrace big data. There are a number of good reports and
industry studies on the subject, and the McKinsey report on big data
is an exceptional read (the graphic above is derived from the McKinsey Global
Institute’s study on big data). But all of them spend an extensive amount of
time making the case for big data technologies and not enough time, in my
opinion, on the business rationale that
makes it inevitable for an organization to invest in big data.
So in my understanding of the space, what are some of these elements of business rationale that support
investment in big data? (I have to qualify my statement: these
apply to a typical large organization that already has a well-established RDBMS
or traditional-data-based infrastructure. For a start-up, using big data
technologies for one’s data infrastructure is a no-brainer decision. The
question of rationale comes up when an organization has already invested
considerably in traditional data and where the adjustment to introduce big data
technologies into the overall ecosystem is not going to be trivial.)
There are six specific areas where I have been able to find a
sound business rationale for investing in big data. These are:
1. Reducing storage costs for historical data, so that data can
be retained for extended periods and still be readily accessible
2. Where significant batch processing is needed to create a
single summarized record (for different downstream business decisions)
3. Where different types of data need to be combined to
create business insight or, to get slightly more specific, to create a
single summarized customer-level record
4. Where there are significant parallel processing needs
5. Where capital expenditure on hardware needs to
scale with requirements
6. Where there are significant data capture and storage
needs
In subsequent posts, I will make these different elements of
business rationale tangible through specific business situations.