Wednesday, August 8, 2012

the death of BI

Business Intelligence (BI) processes have long been tied to Data Warehouses (DW).
  1. collect a bunch of data
  2. query it for insight
  3. hypothesize
  4. validate 
  5. test in the wild
  6. goto #1
Where the BI processes fail is in speed.  This six-step process takes months in most environments and requires lots of human intervention.  I'm not kidding.  Let's try to real-world this for a minute.

So say a business partner (BP) has an idea that students who read the syllabus do better in a class.  They come to the BI team, which checks whether the warehouse has any data on syllabus reading.  If not, they have to get it added, which is invariably a giant pain in the ass.  Next they need to map syllabus reading to outcomes (grades, completion, etc.).  More pain and suffering.  Models are made, opinions formed, etc.

Now we have to do something about it.  

Say we make it a requirement or slap the student in the face with syllabus usage stats.  First a deck is made with the findings supporting the thesis and the business case for acting.  The deck is sold and the work is prioritized in the development queue.  The change then needs to be pushed live (in some capacity), tested, and whatnot.

The problem is that the students who could have been helped by the process are gone: dropped out, failed, etc.  We might be able to help the next set of students, but not those who are already gone.

Big Data solutions give us a near-real-time feedback loop by opening up access to analysis.

If we are already collecting and saving everything, there is no need to "get stuff added to the DW".  Win #1.  Since the data is stored in an unstructured format, there is no need to do any transformations either.  Win #2.  I might need to bust out some SQL-fu, but that is all.  I have everything I need at my disposal.  Now here is the kicker: that giant bastion of knowledge is not locked up in some deep-storage DW with limited access (aka one way in, one way out).  It is in a live store that I can query, process, and automate against.  Win #3.

So now I can put the onus on the end user and have them do the work for me.  Simply query the store for statistics on syllabus reading (for a particular class, term, etc.) and present them directly to the student.

"Students who read this IT193 syllabus <LINK> earn, on average, a grade .56 higher."
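
To make that concrete, here is a rough sketch in Python of what the "query and present" step could look like.  Everything in it is an assumption for illustration: the record fields (course, read_syllabus, grade), the function names, and the sample numbers are made up, and the list of dicts just stands in for whatever the live store hands back from a query.

```python
from statistics import mean


def syllabus_grade_gap(records, course):
    """Mean grade of syllabus readers minus mean grade of non-readers.

    `records` is a hypothetical list of dicts standing in for the rows
    returned by a query against the live store. Returns None when one of
    the two groups is empty, since there is nothing to compare.
    """
    in_course = [r for r in records if r["course"] == course]
    readers = [r["grade"] for r in in_course if r["read_syllabus"]]
    others = [r["grade"] for r in in_course if not r["read_syllabus"]]
    if not readers or not others:
        return None
    return mean(readers) - mean(others)


def syllabus_message(records, course):
    """Build the student-facing nudge, or None if there is no positive gap."""
    gap = syllabus_grade_gap(records, course)
    if gap is None or gap <= 0:
        return None
    return (
        f"Students who read this {course} syllabus earn, "
        f"on average, a grade {gap:.2f} higher."
    )


# Made-up sample data, purely to exercise the functions above.
sample = [
    {"course": "IT193", "read_syllabus": True, "grade": 3.5},
    {"course": "IT193", "read_syllabus": True, "grade": 3.0},
    {"course": "IT193", "read_syllabus": False, "grade": 2.5},
    {"course": "IT193", "read_syllabus": False, "grade": 3.0},
    {"course": "IT101", "read_syllabus": True, "grade": 4.0},
]

if __name__ == "__main__":
    print(syllabus_message(sample, "IT193"))
```

The point is not the arithmetic, which is trivial.  The point is that once the data sits in a store I can query, the whole thing is a function call that can run every time a student opens the course page.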

The feats of strength required to get this sort of thing done in a traditional BI/DW environment are Herculean.  Those environments are not made for real-time processing and presentation.  They are made for spitting out reports and safeguarding data.

Now I'm not suggesting people go burn down the DWs and fire the BI folks.  This will be a gradual process in which Big Data solutions automate away the need for BI.

















