Continuing with the big data meets big hype theme:
So you want to get into Business Analytics/Big Data/Predictive Analytics.

What areas, skills, tools and data should you focus on first?

There are three rather big questions that you need to ask yourself:

1. How well do I really understand the problem(s) that I want Analytics to solve, and the role(s) that Analytics would play?

2. How well do I understand my data?

3. What data do I actually have, or can get?

Each question explores a continuum. Together they represent a three-dimensional space of possibilities. There is no “magic quadrant” here: each part of the space is a legitimate place to be, with its own solutions, risks and benefits.

Let’s go through them.

1. The range of possibilities looks something like this:
A. Having built preliminary offline random forest models and created some prototypes, I want to extend our existing customer acquisition and retention models to our international markets, and operationalise them for real-time, event-based activity, provided this is seen to yield further significant returns. We will need an industrial-strength, scalable and reliable tool, probably a commercial vendor tool, and possibly a Hadoop-based MapReduce solution.

B. My CEO just attended a lavish conference where he saw a slide presentation mentioning the Davenport HBR article from 2006, and now he wants us to “get into analytics”.

Most people are somewhere in between. But you get the idea. And there are far too many initiatives that are precisely at B. The ideal vendor customer is precisely at A. Unfortunately, there are not enough As around (we call them “Educated Buyers”), so some vendors must sell to people who look more like Bs.

Naturally, Analyst First does not advise Bs to get into Big Data, to buy expensive vendor tools, or ever to believe anyone who claims that there is such a thing as “a solution for getting started in Analytics”, especially when said solution is no more than a bunch of software and maybe a few relatively junior technical consultants for a few months.

Indeed, we advise the Bs of this world to invest in learning, exploring and gaining experience, while managing their sponsors’ expectations, growing those sponsors’ personal investment and participation in the new Analytics enterprise (yep, it’s an Enterprise, with all the Lean Startup that entails), and eliciting from said sponsors their real, and realistically achievable, needs.
This is a crucial time to invest in smarts, experience, talent, learning and plenty of Lean Startup.
If this approach is not feasible, I do not have high hopes for the future of the function, which will, at best, become a showpiece trophy of high tech adding no value, and will more likely be shut down, “restructured” and restarted again, hopefully with a more sensible approach.

And what of the As?
I was speaking recently to an A, indeed one of the best As I know, and he noted that his team had kicked some great business goals, having implemented a very necessary, if expensive, vendor tool after trying R and seeing that it was not up to the big data / big crunch job they had to do. He noted that this was necessary even though he agreed with A1, and that it was not in line with A1’s preference for open source tools.

“Not at all,” I replied. “This is exactly A1. You were the quintessential Educated Buyer! A1 is not against vendor tools. We are against people spending money on what they do not understand in the hope of a magic solution. You don’t fall into that category.”

Hopefully, the anonymous A in question will write a post on this blog outlining his success story in more detail.

So, our advice to As is… you don’t really need our advice, until you want to do something new again. In which case, chances are you are following A1 principles already, explicitly or not – otherwise how did you get to A in the first place, anyway?

Most people are somewhere in between, and usually closer to B than to A.

Answering the “what the heck are we going to do?” question involves exploration on a number of axes, including stakeholders’ needs, your own capability, available resources (human and electronic), any impediments or constraints (Hello IT!) and data, the subject of questions 2 and 3. The actual hidden contents of the data, the “gold” of the data “mining” metaphor, is a huge exploratory subject in its own right, and must be considered in the context of the others.
This is not an easy target to hit, and it needs defining before that can happen!

So, to all the Bs and almost-Bs out there: invest in learning, both your own and your sponsors’. Invest in getting your sponsor invested, supporting and covering you, letting you explore and grow. Invest, above all, in exploration, and invest in managing expectations and delivering intermediate results to allow all this to happen. Buy your analytics function a chance to grow, learn, explore and breathe free of unreasonable pressures and constraints.

The other two questions will be covered in upcoming posts.

2 Responses to Big Data? Three Big Questions

  1. Terry Simmonds says:

    I agree you have to know what you are doing with big data analytics. There is much value, though, in pursuing ideas via proofs of concept which focus on dramatically reduced subsets of data, and in determining whether you can undertake relatively valuable analytics with the skills, tools, processes and data you have. I have worked this way over the last 10 years with great success.

  2. Terry,

    I agree with you completely on the prototyping / sub-sampling approach, and use it myself routinely. In the context of A1, prototyping is an aspect of exploration and iterative analysis.
    Not all analysis requires some kind of implementation, but almost all implementation benefits from one or more iterations of prototyping.
    Sub-sampling is sometimes an inescapable necessity, but almost always a major efficiency gain. As I am sure you realise, but some readers may not: sub-sampling must be done carefully, especially in such areas as predictive modelling with rare classes.
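
To make the rare-class caveat concrete, here is a minimal sketch in Python of one common precaution, stratified sub-sampling: sample within each class rather than across the whole data set, so that the rare class cannot vanish by chance. The function name, the `churned` field and the toy data are all made up for illustration; this is one way of doing it under those assumptions, not a description of any particular tool.

```python
import random
from collections import defaultdict


def stratified_subsample(rows, label_of, fraction, min_per_class=1, seed=0):
    """Draw `fraction` of the rows from each class separately.

    Every class keeps at least `min_per_class` rows (or all of its rows,
    if it has fewer), so a rare class is never sampled away entirely.
    """
    rng = random.Random(seed)  # fixed seed so the sample is repeatable

    # Group the rows by class label.
    by_class = defaultdict(list)
    for row in rows:
        by_class[label_of(row)].append(row)

    sample = []
    for members in by_class.values():
        k = max(min_per_class, round(len(members) * fraction))
        k = min(k, len(members))  # cannot draw more rows than exist
        sample.extend(rng.sample(members, k))
    return sample


# Hypothetical toy data: 10,000 customers, of whom only 50 churned.
rows = [{"id": i, "churned": i < 50} for i in range(10_000)]

sample = stratified_subsample(rows, lambda r: r["churned"], fraction=0.1)
rare_in_sample = sum(r["churned"] for r in sample)
print(len(sample), rare_in_sample)
```

A plain random 10% sample would contain about five churners on average, but could easily contain two or eight; sampling within each class fixes the count. If the rare class is then deliberately over-sampled for modelling, predicted probabilities need to be adjusted back to the true class proportions.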
