When I say big data I am not referring to the size of the data. Though the capital and operational costs of storing petabytes, exabytes, zettabytes, or yottabytes' worth of information keep dropping, storing information for information's sake isn't valuable. Kind of like the several terabytes of music I have (sitting on externals no longer plugged into a machine). Don't tell my wife.
Instead, big data should be thought of in terms of Volume, Velocity and Variability (discussion points at O'Reilly Strata or GigaOm conferences). In actuality, and rooted in common sense, is the idea that 'it isn't how much you store, but what you do with it.' And for that you need tools, resources and most of all logic. Many systems allow for the dynamic display of content (including transactional information) based on session information or even onsite behavior. But what will happen when the legacy systems of behemoth organizations are brought up to modern standards and take personal information into account when determining the content we should be seeing?
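To make that concrete, here is a minimal, entirely hypothetical sketch of the kind of session-driven logic those systems run today. The banner names and session fields are invented for illustration; real systems hang far more rules off far more signals.

```python
# Hypothetical sketch of session-driven content selection:
# pick a homepage banner from what this visit has already told us.

def choose_banner(session):
    """Return a banner id keyed off session signals; fall back to generic."""
    if session.get("cart_items"):
        # Shopper has something in the cart: nudge them to checkout.
        return "free-shipping-reminder"
    if "electronics" in session.get("viewed_categories", []):
        # Onsite behavior drives the content shown next.
        return "electronics-sale"
    return "generic-welcome"

print(choose_banner({"viewed_categories": ["electronics"]}))  # electronics-sale
print(choose_banner({"cart_items": ["sku-123"]}))             # free-shipping-reminder
print(choose_banner({}))                                      # generic-welcome
```

The point is how shallow the inputs are: a single visit's clicks, nothing more. The question above is what happens when years of personal and transactional history feed the same decision.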
Two things are holding back the onslaught of this trend of personal or transactional data driving user experience across all devices. First: it's creepy. Charles Duhigg, in his book "The Power of Habit," offers a great case study on Target® and how they analyze and trend shopping patterns, and can determine whether a shopper (or her family) is expecting a child, and even which trimester she is in. Ummmm. Yeah. Though useful, this information turns creepy if used the wrong way. This is the essence of Big Data. Target was using it for traditional communication, I imagine largely because the algorithms could be run offline and used to shape traditional advertising and marketing messages. Which leads me to the second thing holding this back. Cost.
Though the capital cost of storage is largely negligible compared to other large capital and operational expenses, to an earlier point, storing the data is only one of the three key elements of big data. Once you have volume, you need velocity and variability. To handle those last two 'v' words you need a whole new system, and likely new skill sets to manage it. The advantage, of course, is that once you have it and put it into play, it can largely run itself. For organizations so large that an upgrade from IE6 to IE7 is an operational expense that hits high six figures, the idea of "dynamicising" big data isn't just a budget add-on; it is a shift in culture, value and channel.
With 'creepy' and 'cost' joined in today's market, you can see why what is already possible with gold-standard technologies is taking so long to hit the market from a Fortune 500 perspective. In fact, the companies that are largely doing this well were doing it from their inception. Amazon and Overstock are very good at leveraging their big data against the consumer experience, but they knew nothing else! They had no roots on Madison Ave driving the market share of their business; the idea of not utilizing that data is foreign to them.
At some point the big brands of today will decide to pull the trigger and pony up the dollars required to purchase technologies that let them leverage the data they, or their vendors, have been storing for years. And when that happens they will be tempted to 'boil the ocean,' wanting to justify the sheer investment as quickly as possible, and on that I would push back whole-heartedly.
There are two reasons not to try to go from 0-100 when it comes to turning big data into Big Data. First is adoption. I often refer to video games when I talk to my clients about adoption. Virtually everyone remembers the original Nintendo with Mario Bros., Zelda and Mike Tyson's Punch-Out!! That was about the last time I played video games with regularity. And then I graduated college and somehow found myself interested in them again. But I was thwarted. The controller had gone from 'A, B, left, right, up, down' to something that looked more like a fighter jet control console. I was stunned, and to this day I haven't picked up a controller. Imagine taking your consumer from the online experiences they are having today to a place where every digital marker we give our devices enters an algorithm and impacts what we see next. Where a purchase at Best Buy triggers an alert telling me to go get an accessory for 20% off next door. And that is a conservative example. Think of the mall scene in Minority Report to get fully creeped out.
The second reason is that you don't have to. Right now companies are (if they are smart) leveraging session and onsite behavior to determine content and presentation. We call this personalization. Attempting to expand that limited consumer view with Big Data risks putting the wrong em-PHA-sis on the wrong syl-LAB-le. Once you plug Big Data into any system, the focus immediately turns to the algorithm (or rule) being used to run logic on all this information. That risk has real financial and speed-to-market impacts, as the relational structure of Big Data is complex and expansive. Instead, take a lesson from a competition that Anand Rajaraman set up for his data mining class at Stanford. Given a database of 18k movies and asked to determine the best algorithm to recommend movies to the consumer, the winner wasn't the author of the most complicated piece of logic; it was a less robust algorithm that took advantage of a second system. Rather than rely solely on the database of 18k films, this algorithm looked to IMDB to augment the data for additional value against the data stack.
The result being that a simple algorithm with more data beats a complicated algorithm with less. (Google validated this approach with their PageRank logic.) So when the Fortune 500 gets into a position to pull the trigger on Big Data and saddle up on the costs and effort, make Big Data your IMDB: augment and build, for adoption, for speed to market and for learnings.
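Here is a toy sketch of that "simple algorithm, more data" idea. The titles, ratings and genres are all invented, and the scorer is a deliberately naive blend, not the winning Stanford entry; the point is only that the same trivial logic, augmented with external metadata, changes its pick.

```python
# Toy sketch: the same naive scorer, with and without external metadata.
# All titles, ratings and genres here are invented for illustration.

# Internal catalog: average ratings only (a miniature stand-in for the 18k-movie DB).
internal = {"Movie A": 4.5, "Movie B": 4.2, "Movie C": 4.0}

# External metadata (the "IMDB" augmentation): genres per title.
external = {
    "Movie A": {"sci-fi", "thriller"},
    "Movie B": {"comedy"},
    "Movie C": {"sci-fi", "drama"},
}

def recommend(liked_title, catalog, metadata):
    """Score every other title: average rating plus genre overlap with the liked film."""
    liked_genres = metadata.get(liked_title, set())
    scores = {}
    for title, rating in catalog.items():
        if title == liked_title:
            continue
        overlap = len(liked_genres & metadata.get(title, set()))
        scores[title] = rating + overlap  # deliberately naive linear blend
    return max(scores, key=scores.get)

# Ratings alone favor Movie B; adding the external genre data flips the pick.
print(recommend("Movie A", internal, {}))        # Movie B (highest rating wins)
print(recommend("Movie A", internal, external))  # Movie C (rating + shared genre)
```

No cleverness was added to the algorithm between those two calls, only a second data source, which is exactly the lesson.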
And building on that even more is the idea that the results of the rules being put into place are new data sets in and of themselves. Companies shouldn't just be leveraging data against marketing or business rules blindly; the result of each rule, as well as the consumer's response to it, needs to be stored and actioned against as well. There is meaningful data there too! The behavior of, and response to, the chosen data becomes the contextual overlay from which additional relationships (data and brand) can be tracked and used to enhance user engagement.
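As a sketch of what that feedback loop looks like (the rule names and fields below are hypothetical): every rule firing gets logged as its own data point, and the log itself becomes a data set you can score your rules against.

```python
# Hypothetical sketch: treat each rule firing and its outcome as a new data point.
from collections import defaultdict

outcome_log = []  # the "result of the rule" stored as its own data set

def apply_rule(rule_id, user_id, content, clicked):
    """Record which rule fired, what it showed, and how the consumer responded."""
    outcome_log.append({"rule": rule_id, "user": user_id,
                        "content": content, "clicked": clicked})

def rule_performance(log):
    """Click-through rate per rule, computed from the stored outcomes."""
    shown, clicks = defaultdict(int), defaultdict(int)
    for event in log:
        shown[event["rule"]] += 1
        clicks[event["rule"]] += int(event["clicked"])
    return {rule: clicks[rule] / shown[rule] for rule in shown}

apply_rule("accessory-offer", "u1", "20% off case", clicked=True)
apply_rule("accessory-offer", "u2", "20% off case", clicked=False)
apply_rule("genre-match", "u3", "sci-fi picks", clicked=True)
print(rule_performance(outcome_log))  # {'accessory-offer': 0.5, 'genre-match': 1.0}
```

The per-rule numbers are the "meaningful data there too": they can feed the next round of rules instead of being thrown away.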
Scary? It’s coming.