“Appropriate Empowerment” is the third and final element of the Holy Trinity, the three essential characteristics of sponsors of successful analytics practices, covered in the current series of posts. Appropriate Understanding and Appropriate Incentive were covered previously.
As before, this is an examination of the success mode and failure modes of the element in question. What does Appropriate Empowerment (or just “Empowerment” for short) look like when it succeeds, and what happens when it fails, or when other elements fail to support it? The success mode of Empowerment considers the situation where all elements of the Trinity are in place, but focuses on the role played by Empowerment.
The Failure Mode of Empowerment is the situation where the sponsor possesses Understanding and Incentive, but lacks Empowerment. We explore this situation, along with possible remedies, before concluding with the Isolation Mode, the situation where Empowerment is present, but alone, with neither Understanding nor Incentive in place beside it.
The success mode of Empowerment is simple, yet essential. Empowerment is the least visible element of the Trinity, more noticeable in its absence than in its presence. Where the Sponsor sees the need for something to be done to the benefit of the business through analytics, and has the right Incentive to make it happen, then Appropriate Empowerment simply means: it happens. There is no one who can overrule, block, derail or otherwise unhelpfully modify any analytics initiative that has been put into motion.
Understanding ensures that the sponsor identifies the right analytics initiative for the greatest benefit to the business, and takes into account all that is required to enable it. Incentive ensures that the Sponsor actually wants this to happen. Empowerment then is simple: the Sponsor is in a position to launch the initiative, and to ensure that it proceeds to the correct conclusion. He is able to support it with all the resources it requires, and to protect it from unhelpful stakeholders. He is also in a position to ensure that recipients of analytics recommendations act on them if the process requires them to do so. Tyrannical? Perhaps. Far-fetched? Certainly. But this is the ideal, however out of reach it may be for (current) real-world large organisations.
Empowerment is thus quite simple. It is the ability to make things happen.
It is also an absence of unhelpful constraints. A Sponsor with the Holy Trinity is sufficiently empowered not to worry about unreasonable or ill-defined expectations of value before the initiative or function is ready. Empowerment ensures that the function is not subject to IT-style management practices, or to deterministic waterfall and project management approaches. His analytics function is lean, agile and experimental: free to learn, and to fail repeatedly (for a time), as required to continually reach insights of massive value and exploit them.
The failure mode sets in when a sponsor has the best intentions in terms of Incentive, is well versed in Understanding what an analytics function can do and what it requires to achieve it, and even holds a budget and a mandate to create the analytics function. Unfortunately, he may well lack the power to act as Understanding and Incentive compel him to.
Any dilution of Empowerment invites unreasonable expectations born of poorer Understanding and Incentive. A sponsor beholden to other managers, stakeholders and the like is subject to constraints, expectations and pressures that may prevent an agile, experimental approach. The “Analytics in a Box” solutions promoted by some vendors stand in opposition to the agile approach, yet enjoy attention and support from far too many senior executives. The resulting analytics cargo cult, subscribed to by much of senior management, expects great value from analytics, but does not know how to define this value, or even how to measure it. This very lack of clarity may be what imposes inappropriate deterministic project management frameworks such as PRINCE2, along with business analysis and management oversight by people who have no idea what they are managing or why. In such situations, project managers may grab the first objective metric, however irrelevant or minor, and focus on it as a box-ticking exercise. The analytics function is then little more than an IT production line, creating something of indeterminate value to satisfy a management fad. A sponsor beholden to such powers cannot be said to be sufficiently empowered. Worse yet, ignorant or indifferent management may relegate the sponsor under the auspices of IT. Needless to say, this is not an ideal outcome.
One large pathology crippling Empowerment is the modern corporate stakeholder model. A committee of stakeholders is not a Sponsor, especially when enough members of that committee have far from perfect Incentive or Understanding, and perhaps far too much Empowerment. A committee can be, on the whole, more stupid, more poorly Incentivised and more disempowered than any one of its members. A Sponsor beholden to such a Committee is hardly empowered, and the Committee-as-Sponsor is a far from ideal scenario. That this situation is the reality in so many large organisations does not make it any less pathological.
In the ideal situation, the Sponsor is beholden to no one who holds excessive power while being inadequate in the other two key characteristics. The ideal Sponsor is therefore the CEO, and better yet a manager/owner. Again, this is perhaps unrealistic, but it still needs to be identified as the ideal, and any deviation from it analysed in terms of potential failure of Empowerment. It is also the reason that the most innovative, valuable and agile analytics exist in tech startups and not in large “Enterprises” (in quotes because they are usually the very opposite of that word).
Not all pathologies of Empowerment concern levels of power above the Sponsor. Other pathologies of Empowerment are lateral. The most immediate lateral power issue is one with IT: too many IT functions consider it their job to block analytics access to tools, especially open source tools that are otherwise readily available, free and powerful. They may prevent access to adequate, and otherwise cheap and readily available, hardware and to useful online services such as cloud computing. They are also known for starving the analytics function of data. Too many analytics functions are in a situation where the main expenditure of effort is building business cases for data, tools and hardware. A sponsor who knows this to be the case but cannot fix it is clearly not sufficiently empowered.
Lateral Empowerment is also an issue with “trigger pullers”, the people whose job it is to act on the recommendations of operational analytics. The most striking case of this is a pathology I have seen in a multitude of organisations making use of predictive operational risk analytics. Predictive models provide lists of targets (e.g. revenue leakage, non-compliance, suspicious behaviour, fraud risk indicators). In all cases a human being is provided a list of targets generated by the predictive model. Ideally, this human being proceeds to manually investigate the targeted cases. Unfortunately, in most situations, these individuals do not understand or trust the predictive models. In my experience, many such individuals cannot conceive of the very idea of inferring a model from data. It would appear that there are whole cultures of people who cannot imagine such a thing as statistical induction. They naturally voice their displeasure and challenge, stall and undermine the process. Much of a Sponsor’s job seems to be the thankless, draining and often never-ending task of “winning them over”. A sufficiently empowered Sponsor would, however, be in a different situation. When asked why these people should trust these models, he would be able to answer: “Because if you do not, I will fire you and perhaps hire someone who does what they are told. Or replace you with a smart pattern-matching algorithm.” Again, this is perhaps not realistic, and perhaps suggests something that certain Public Sector Unions would consider on par with a crime against humanity: asking that people do their jobs. The whole issue of uncooperative “trigger pullers” was raised only to make a point about Appropriate Empowerment: if a Sponsor is not able to ensure that the human components of an operational analytics value chain cooperate and act as part of that chain, there is a failure of Empowerment. Perhaps effective analytics sponsorship, as defined in this series, is impossible in most organisations where employee non-compliance and stakeholding is a given.
A lack of Empowerment is, however, far from the end of the world, and the relatively dystopian situation described above matches many existing analytics functions, particularly in government and quasi-government organisations. They still manage to survive and add some value, although arguably but a fraction of what would be possible if only their sponsors were more Empowered. These organisations have in fact found themselves innovating on a number of fronts, dealing with insufficient Empowerment, and in some cases developing methods of generating more of it.
One key solution to the problem of insufficient Empowerment is Separation from IT. As far as possible, as quickly as possible, it is important to establish a “sandpit environment”, separate from the main IT network, where new hardware may be added, and software loaded outside of IT governance. This is essential if appropriate computational power and open source tools are to be leveraged quickly and effectively.
Another part of the solution, and one that is even more fundamental, is Stealth Mode. It is imperative that a new analytics function has the ability to learn, experiment, and fail in its early stages. Expensive budget items such as vendor tools create massive, though ill-defined, expectations. Expectation management is yet another reason to avoid expensive vendor software early in the creation of an analytics function.
Ideally, the function has a small crew of capable, flexible people, a small budget, and access to data and open source tools. The function also has a main focus that is a well-defined, business-as-usual task such as reporting. Actual analytics can be done quietly on the side, and not announced until it yields results. These results can then be presented as wins to formalise and Empower the nascent analytics function. There may then be sufficient leverage to acquire more staff, create a sandpit environment and acquire data reliably.
As discussed previously, the most important element of the Trinity is Incentive. With Incentive alone, the Sponsor knows that their first task is to increase their Understanding. Some of this is reading/study, some of this is consultation with experts, and much of this is experience which can be obtained in stealth mode. Empowerment is important, but as we can see it comes third in importance.
Indeed, most capable analytics professionals find themselves working for under-empowered sponsors. This is not ideal, but it is not a career-ending situation. In fact, the struggle for further Empowerment of the Sponsor is the de facto KPI of most analytics functions, and many professionals find it as exhilarating as they may find it frustrating.
It remains to discuss the “Isolation Mode” of Empowerment. What happens when the Sponsor has all the power, but no Understanding, and lacks the right Incentive? Here ignorance conspires with either a lack of real enthusiasm for Analytics, or an entirely different agenda, and hands them a hefty cheque book. So, what can happen? A storm of cargo cults, management fads and buzzwords. “Analytics”, having something to do with “data” and software, must clearly be some kind of IT, best managed and bought by the CIO and best explained by people who sell software. And that’s how the wrong kinds of Vendors happen. Long sales lunches. Exciting pre-sales presentations. Use of the words “Enterprise”, “Innovation” and “Insight” by people who don’t have anything to do with any of them. “Case studies” of previous such exchanges in other high-profile corporations, presented as success. People who may not really care what they are selling sell to people who don’t really understand or care what they are buying. Consultants, of the “best practice”, “brand recognition” kind, jump in. More money gets spent. Everybody involved wins, except the (theoretical and distant) shareholders, citizens and other ultimate beneficiaries of the business. Almost always, none of the parties is an actual owner of the business in question. Most owners are far more sensible than that.
So what happens after that? Software gets installed. Systems get integrated. People get hired, maybe, as an afterthought, to mind the (far more important) Machines. These people are likely software developers, database managers and project managers. Maybe even a token statistician. Gantt charts get ticked. Bonuses get paid (at least on the vendor side). Conferences benefit from new “best practice” case studies. The Vendor-Consulting complex marches on in all its dinosauric grandeur.
So Incentive and Understanding matter, and Empowerment on its own is not a great idea, however common this situation may be.
This article highlights the communication challenge for accredited A1 professionals.
We all recognise Analytics is about using information better than competitors, so that we are: 1. doing things better, and 2. doing better things than competitors/relevant comparators. But like so much of the coverage of our sector, the article focusses solely on Operational Analytics (the former), not on Strategic Analytics (the latter).
Secondly, the article fails to recognise speed is only one part of the equation.
Taking the author’s example of the retail sector: sure, real-time analytics can detect an early decline in sales for a particular product, controlling for some extraneous factors. But a retailer’s promotional response (who they target and how) doesn’t necessarily require real-time analytics: they can apply, in real time, outputs from models created last week, with little risk of degradation.
The most important questions for shareholders of the retailer require Strategic Analytic capability: How should pricing across the entire product portfolio be optimised? What products should we be ordering now for next season (or the season after)? How should the physical network and supply chain be optimised? These strategic questions demand the right answer, not necessarily the fastest answer.
Any experienced industry professional gets that making sense of data is our primary role. But clearly, interpreting data to the best of our ability flies in the face of throwing away information (e.g. because inconsistencies in the available data make the task more cognitively complex). No one would advocate storing and processing data which possesses no incremental information value, but information value can be measured, so that shouldn’t be an issue.
Critically this article fails to recognise many of the barriers for Australian companies in effectively using their data relate to data quality, not their data storage and processing capacity.
Finally, there is no explicit recognition of the talent required to use data more effectively than your competitors.
From an A1 perspective we should welcome the growing focus on our sector, but we need to better articulate the more nuanced (and interesting) story of Analytics in an A1 Practice. It would be easy to criticise the journalist for being naive in swallowing the line of vendors and other vested interests, but the responsibility is ours to better explain the reality.
Eugene is totally right that we need to stand with a united voice. From today, NTF will publicly back A1 in all our proposals and marketing collateral. I regret not taking this action sooner.
Continuing with the big data meets big hype theme:
So you want to get into Business Analytics/Big Data/Predictive Analytics.
What areas, skills, tools and data should you focus on first?
There are three rather big questions that you need to ask yourself:
1. How well do I really understand the problem(s) that I want Analytics to solve, and the role(s) that Analytics would play?
2. How well do I understand my data?
3. What data do I actually have, or can get?
Each question explores a continuum. Together they represent a three dimensional space of possibilities. There is no “magic quadrant” here, each part of the space is a legitimate place to be, with its own solutions, risks and benefits.
Let’s go through them.
1. The range of possibilities looks something like this:
A. Having built preliminary offline random forest models and created some prototypes (of the kind sketched below), I want to extend the existing customer acquisition and retention models we have to our international markets, and operationalise them for real-time, event-based activity, provided this is seen to yield further significant returns. We will need an industrial-strength, scalable and reliable tool, probably a commercial vendor tool, and possibly a Hadoop-based MapReduce solution.
B. My CEO just attended a lavish conference where he saw a slide presentation mentioning the Davenport HBR article from 2006, and now he wants us to “get into analytics”.
Most people are somewhere in between. But you get the idea. There are far too many initiatives that are precisely at B, and the ideal vendor customer is precisely at A. Unfortunately, there are not enough As around (we call them “Educated Buyers”), so some vendors must sell to people who look more like Bs.
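For the sake of concreteness, here is roughly what A’s “preliminary offline random forest model” might look like. This is a minimal sketch under stated assumptions: the extract customers.csv, its churned flag and its feature columns are all hypothetical stand-ins, and scikit-learn is just one convenient way to build such a prototype.

```python
# A preliminary, offline churn/retention prototype of the kind an "A" has
# already built before talking to vendors. File and column names are
# hypothetical; feature columns are assumed to be numeric.
import pandas as pd
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split

df = pd.read_csv("customers.csv")
X = df.drop(columns=["customer_id", "churned"])
y = df["churned"]

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=42, stratify=y)

model = RandomForestClassifier(n_estimators=500, random_state=42)
model.fit(X_train, y_train)

# Held-out performance: the evidence an Educated Buyer brings to the
# "is this worth operationalising?" conversation.
print("AUC:", roc_auc_score(y_test, model.predict_proba(X_test)[:, 1]))
```

Only once a prototype like this shows real lift does A’s question about industrial-strength, real-time tooling even arise.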
Naturally, Analyst First does not advise Bs to get into Big Data, buy expensive vendor tools, or ever believe anyone who claims there is such a thing as “a solution for getting started in Analytics”, especially when said solution is no more than a bunch of software and maybe a few relatively junior technical consultants for a few months.
Indeed, we advise the Bs of this world to invest in learning, exploring and gaining experience, while managing their sponsors’ expectations and growing their personal investment and participation in the new Analytics enterprise (yep, it’s an Enterprise, with all the Lean Startup that entails), and eliciting from said sponsors their real, and realistically achievable needs.
This is a crucial time to invest in smarts, experience, talent, learning and plenty of Lean Startup.
If this approach is not feasible, I do not have high hopes for the future of the function, which will, at best, become a showpiece trophy of high tech adding no value, and will more likely be shut down, “restructured” and restarted again, hopefully with a more sensible approach.
And what of the As?
Speaking recently with an A, indeed one of the best As I know, I learned that his team had kicked some great business goals, having implemented a very necessary, expensive vendor tool after trying R and seeing that it was not up to the big data / big crunch job they had to do. He noted that this had been necessary even though he agreed with A1, and that it was not in line with A1’s preference for open source tools.
“Not at all,” I replied. “This is exactly A1: you were the quintessential Educated Buyer! A1 is not against vendor tools. We are against people spending money on what they do not understand in the hope of a magic solution. You don’t fall into that category.”
Hopefully, the anonymous A in question will write a more detailed post on this blog, outlining his success story in more detail.
So, our advice to As is… you don’t really need our advice, until you want to do something new again. In which case, chances are you are following A1 principles already, explicitly or not – otherwise how did you get to A in the first place, anyway?
Most people are somewhere in between, and usually closer to B than to A.
Answering the “what the heck are we going to do?” question involves exploration along a number of axes, including stakeholders’ needs, our own capability, available resources (human and electronic), any impediments or constraints (Hello IT!) and data, the subject of questions 2 and 3. The actual hidden contents of the data, the “gold” of the data “mining” metaphor, form a huge exploratory subject in their own right, and must be considered in the context of the others.
This is not a very easy target to hit, and it needs defining before that can happen!
So, to all the Bs and almost-Bs out there: invest in learning, your own and your sponsors’. Invest in getting your sponsor invested, supporting and covering you, letting you explore and grow. Invest, above all, in exploration, and invest in managing expectations and delivering intermediate results to allow all this to happen. Buy your analytics function a chance to grow, learn, explore and breathe, free of unreasonable pressures and constraints.
The other two questions will be covered in upcoming posts.
The recent IAPA discussion panel on ‘Aligning IT and Analytics to deliver sustainable innovation’, plus a later conversation with fellow panellist, EMC-Greenplum’s James Horton, prompted me to sketch some thoughts on what an Analytics Lab ought to do. The lab is the natural home for Analysts engaged in the narrower definition of Analytics:
The Analytics Lab is an innovation factory which constantly evaluates data, quantitative methods and tools looking for sources of competitive advantage.
- Data: structured and unstructured, sourced from both inside and outside the organisation, established and new.
- Methods: data transformation, and then data mining, machine learning, statistical, mathematical, and other analytical methods.
- Tools: as appropriate to method, from programming languages through to GUI applications, from commodity and open source through to commercial tools.
- Analysts: the lab enables the organisation to evaluate the technical abilities and innovative propensities of its analysts, as well as those on offer from external service providers, without many of the interfering factors present in operationally hardened IT environments.
Its outputs are:
- BI prototypes
- Instantiation candidates
- Identification of data and knowledge gaps: Analysing data and generating insights brings to light new data needs and exposes gaps in knowledge which may impact the business. Additional data may need to be sourced, gathered through survey, collected by tweaking an existing business process, or purchased from a third party. Additional analyses and subject matter expertise may be required to close knowledge gaps.
- Resolution of disharmonies: All businesses struggle with ‘different views of the truth’, and it’s often the crunching of data which brings these to light. Disharmonies might be within or between data sets, or between conventional wisdom and the drivers of a model. They could relate to anything from actual observations to tacit assumptions. Resolving such disharmonies—harmonisation—involves identifying, scoping, validating, and correcting them.
These last two are not the core business of Analytics, but they’re important activities, and doing Analytics naturally leads to them. Most organisations don’t explicitly provision for them, but arguably they should. The lab is as good a home for them as any other.
The Analytics Lab services all levels of business, but in different ways:
- Senior Management: through the provision of strategic insights.
- Middle Management and Knowledge Workers: through one-off and/or prototyped BI analyses.
- Frontline Workers: through the identification of instantiation candidates, i.e. deployable operational analytics.
Many analyses typically need to be tried before those which merit instantiation are discovered. Furthermore, “instantiation” doesn’t necessarily mean a repeatable process. It could simply mean the communication of a one-off insight, e.g. “revenue growth is unmistakeably slowing in all but one customer segment” or “the most reliable predictor of a customer’s propensity to churn is their social network membership.” Such insights are typically complex, valuable, but not “actionable” in any deterministic, automatable way.
Other findings are suited to more regularised delivery, for example as managerial decision support through business intelligence.
Some analytical results, in order to be fully leveraged, need to be integrated into frontline business processes. Predictive models which predict customer acquisition or churn, for example, might require integration in sales, marketing, call centre, channel management and customer support processes.
Conduct disciplined, exploratory analyses which repeatedly cycle through the following sorts of questions (one such pass is sketched in code after the list):
- Is there structure in the data (patterns, trends, relationships, networks, segments, clusters, indicators, drivers, outliers, anomalies)?
- Are there new insights in the data?
- Which models are viable?
- Which variables are important?
- Which variables do we control?
- What are the implications for revenue, cost, risk?
- What data do we want that we don’t have? How could we get it?
- What are the implications of this insight?
- Who is our internal customer for this insight?
- Would this analysis be valuable if provided on an ongoing basis? To whom?
- Into which existing or envisioned business processes should this insight be instantiated?
- Where are there disharmonies in tacit or explicit data and assumptions?
- Which projects, processes and decisions are affected by these disharmonies?
- How do we validate and resolve these disharmonies?
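As an illustration, a single pass through a few of these questions might look like the sketch below. The extract lab_extract.csv and its churned outcome column are hypothetical, and the particular methods (k-means for segments, an isolation forest for anomalies, a random forest for variable importance) are just one possible toolkit, not a prescription.

```python
# One pass of the lab's exploratory cycle: structure, anomalies, important
# variables. Assumes a hypothetical numeric extract with a 'churned' column.
import pandas as pd
from sklearn.cluster import KMeans
from sklearn.ensemble import IsolationForest, RandomForestClassifier
from sklearn.preprocessing import StandardScaler

df = pd.read_csv("lab_extract.csv")
X = StandardScaler().fit_transform(df.drop(columns=["churned"]))

# Is there structure in the data? Look for segments/clusters.
df["segment"] = KMeans(n_clusters=4, n_init=10, random_state=0).fit_predict(X)
print(df.groupby("segment")["churned"].mean())  # do segments behave differently?

# Are there outliers or anomalies worth a closer look?
df["anomaly"] = IsolationForest(random_state=0).fit_predict(X)  # -1 = outlier
print("anomalies flagged:", (df["anomaly"] == -1).sum())

# Which variables are important (and, separately, which do we control)?
rf = RandomForestClassifier(n_estimators=300, random_state=0)
rf.fit(df.drop(columns=["churned", "segment", "anomaly"]), df["churned"])
for name, imp in sorted(zip(rf.feature_names_in_, rf.feature_importances_),
                        key=lambda t: -t[1])[:5]:
    print(name, round(imp, 3))
```

Each answer feeds the next round of questions: a segment that behaves differently prompts the “who is our internal customer for this insight?” question, and an important but uncontrollable variable prompts the hunt for data we don’t yet have.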
Infrastructure can usefully be separated into the ‘electronic infrastructure’ of hardware and software and the ‘human infrastructure’ of people, relationships, management and incentives.
- Secure, off-network ‘sandpit area’
- Big storage, big memory, scalable to big data
- Eclectic analytical toolset: commodity, open source, commercial, experimental, in-house
- Snapshots, copies, feeds of all manner of available data sources: pre-ETL, pre-warehouse, post-warehouse, external, web, social media, unstructured. In the context of the lab, the data warehouse is just another source system.
- De-emphasis on repeatable technical processes and compliance with production IT architecture
- Insulated from IT Service Level Agreements and other production / core system / business-as-usual constraints
- Human Resources:
- Analysts: Data scientists
- Management: Validate analysis objectives, ensure that analysts remain focused, performance manage the innovation process.
- Sponsorship from Executive
- Cross-functional relationships with business units: both ‘push’ (business unit as customer) and ‘pull’ (business unit as subject matter expert)
- Close relationship with Strategy function
- ‘Caveat utilitor’ relationship with IT for data provision and tool support
- Various relationships with service providers: vendors, consultants, training and mentoring providers, industry expertise, academia if appropriate
- Performance Management:
- Innovation / Research metrics
- Risk metrics
- Sentiment metrics
- Dimensions of opportunity: Internal, Competitor, Market, Customer, Product, Channel
Related Analyst First posts:
- Aligning IT and Analytics to deliver sustainable innovation
- Needles, Haystacks, and Category Errors, or, Where Does Operational Analytics Fit?
- Systemising skepticism
- Assume bad data
- The Economics of Data – Analytics Is… Investing in Data
- Decision support versus decision automation
Last week I attended a very interesting IAPA panel discussion in Canberra, organised by Peter O’Hanlon, head of the IAPA ACT chapter. The panel discussion was lively, informative and controversial, exploring as it did the often difficult relationship between Analytics and IT. A1’s very own Stephen Samild was one of five panellists. Peter did a great job of facilitating, and all five panellists made some great points. People in the audience also pitched in with interesting questions and reflections on real-world experience.
The conversation continued to return to a central topic, one that lives in the murky grey area between the two functions and too often acts as a political football. I speak of the instantiation and deployment of Analytics outputs to IT systems. This essential activity, often referred to as “operational analytics”, is the source of much confusion, conflict and business failure. Much of the trouble arises from poor fundamental philosophical distinctions which have arisen historically. These lead to unhelpful naming conventions and political turf demarcations. To explore the issue is to re-examine some fundamental definitions and distinctions. The first task is to ask what we mean by Analytics. Two possible definitions might be:
- Any electronic manipulation of large amounts of data.
- Any exploratory analysis of data that results in information leading to innovation or insight.
Definition 1 covers both the quest for insight and its deployment and operation in an IT system. Definition 2 covers only the former. Which definition is preferable?
The operational step itself consists of two parts: the deployment of an insight (e.g. a predictive model) and the ongoing monitoring of its effectiveness.
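By way of illustration, the monitoring half can be as simple as periodically re-scoring recent, labelled cases and watching for degradation. A minimal sketch, assuming a hypothetical table in which model scores have been joined back to actual outcomes:

```python
# Minimal ongoing-effectiveness check for a deployed model: weekly AUC on
# recent labelled outcomes. Table and column names are hypothetical.
import pandas as pd
from sklearn.metrics import roc_auc_score

scored = pd.read_csv("scored_with_outcomes.csv", parse_dates=["scored_at"])
for week, grp in scored.groupby(pd.Grouper(key="scored_at", freq="W")):
    if grp["outcome"].nunique() == 2:  # AUC needs both classes present
        print(week.date(), round(roc_auc_score(grp["outcome"], grp["score"]), 3))
```

Nothing in that check requires the person running it to know how the model was built, which is precisely why ownership of this step is contested.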
Reasons for preferring definition 1, which places both steps within the Analytics realm, include the following:
- “Operational Analytics” has the word “Analytics” in it
- There is data crunching involved. Isn’t that what Analytics is?
- There is model evaluation/monitoring involved. That is stuff only Analytics people do, right?
- Historically, this has been stuff only the Analytics people cared about.
- The software that does all this stuff comes from Analytics providers.
There are however some solid counter-arguments to these:
- Could this just be an unhelpful and confusing historical accident?
- There is plenty of data crunching in payroll, accounts payable and other operational systems that few would think of as Analytics.
- Monitoring and evaluation should be applied to a lot more than just predictive models. In particular, it should be applied to any business process that Analytics would seek to improve. This is Performance Management and Business Intelligence, but hardly Advanced Analytics. While this kind of measurement is often seen as part and parcel of Analytics, there is no reason that the two need go hand in hand. The extent to which they do is an artefact of history, and a reflection of the poor penetration of empiricism and appropriate performance management across business generally.
- Historical accident is no reason to maintain a coupling of what are fundamentally different activities.
Naturally, there may be counter-counter arguments, and I invite readers to raise them in comments.
To argue for the narrower definition of Analytics is to demystify “models”, and to demonstrate that an operationalised predictive model is no different to an operational accounting system. The argument is simple:
- Both deal with potentially large data sets.
- Both apply a range of rules, consisting of if-then-else conditions and arithmetic.
- Both produce outputs to some workflow.
And that is it. The emperor has no clothes where actual models are concerned: a predictive model is little more than a bunch of if-then-else logic and arithmetic. These rules can be read and deployed by IT staff. Indeed, it is not important to know where the rules came from, be it a Support Vector Machine or human defined rules laid down by the CFO.
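To make that claim concrete, here is a sketch of how a fitted model reduces to deployable logic. The toy data and feature names are hypothetical, and a decision tree stands in for whatever learner produced the rules:

```python
# A fitted decision tree is just if-then-else rules that IT staff can read
# and deploy. Training data and feature names are hypothetical.
from sklearn.tree import DecisionTreeClassifier, export_text

X = [[12, 1], [3, 0], [45, 1], [2, 1], [30, 0], [5, 0]]  # [tenure_months, on_contract]
y = [0, 1, 0, 1, 0, 1]                                   # 1 = churned

tree = DecisionTreeClassifier(max_depth=2, random_state=0).fit(X, y)
print(export_text(tree, feature_names=["tenure_months", "on_contract"]))

# The printed rules can be re-implemented verbatim in any operational
# system, along these (illustrative) lines:
def churn_risk(tenure_months: float, on_contract: int) -> int:
    if tenure_months <= 8.5:  # threshold as learned by the tree (illustrative)
        return 1
    return 0
```

Once exported, nothing about these rules is statistical: they are ordinary conditional logic.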
The magic of Analytics lies in its ability to find the right set of rules. The rules themselves are not that complicated in comparison to the learning algorithms that find them. My favourite analogy is the needle in the haystack. A metal detector would be handy, and is arguably a very sophisticated tool compared to the humble needle. The detector makes sure you end up with the needle and not just hay. Once found, the needle turns out to be a rather simple yet valuable tool, and one that can be put to work sewing. So far, so good. You might also agree that looking for metal and sewing are somewhat different tasks, and that the metal detector guy can now go off and look for more needles in some other haystack, or for gold. Putting the needle to work sewing is a different skill, belonging to someone else.
The broader definition of Analytics creates commonality between sewing and metal detection. The narrower definition accepts that any such commonality is neither necessary nor natural. So historical baggage aside, there may be an argument that insight and innovation generation is the business of Analytics, while the operational deployment of business rules is the province of IT, as might be the ongoing monitoring of the effectiveness of such systems.
There are then counter-arguments to this distinction. These rely on specific definitions of the words “exploratory” and “deploy”. Both are to a large extent a misunderstanding of terms rather than a true disagreement, but they can naturally lead to a preference for the broader definition of Analytics. Political factors also come into play. Again, the counter-arguments are on good footing with respect to history, but may lead to unhelpful category errors.
First of all, the word “exploratory” raises the hackles of many an Analytics manager. This is because analysts are by nature explorers, and rightly so. Unfortunately this can be taken to extremes, and a small but conspicuous minority of analysts are always at the ready to head off into uncharted waters, performing analysis of questionable or nil business value, treating their job like an open-ended research project/video game, and perhaps violating a number of principles of science, reason and IT security in the process. While actually rare, this approach to Analytics is memorable enough to give exploration a bad name, especially among people in business not used to scientific inquiry. The good news is that pathological exploratory behaviour is a small and manageable problem. It can usually be turned around by more attentive supervision, incentives and leadership.
There is also a cultural issue clouding an appreciation of exploration. Managers accustomed to process, best practice, and clear objectives often have trouble distinguishing dysfunctional exploration from more productive kinds. Further, they may have trouble identifying the successful performance of Analytics in an exploratory context due to the unexpected and seemingly random nature of outputs, as well as the need to interpret, evaluate and implement them before value is realised.
Analytics management based on a conventional, deterministic IT project management model is perhaps more common. Traditional project managers may not perceive exploration as delivering any value, and may share their concerns with others in the business. In this way exploration may earn an undeservedly poor reputation. Again, this understanding is in the minority—a shrinking one—and is being steadily replaced by more appropriate agile and Lean Startup approaches. And, once again, it’s a problem easily rectified by acknowledging the uncertain, exploratory nature of Analytics, and ensuring that the sandpit function is not led by traditional project management approaches, nor incentivised according to deterministic KPIs.
The very rare combination of the two pathologies is a perfect storm and a recipe for failure, but even then not irredeemably so. The management issue is the first one to fix in this case, and the analyst issue will either fix itself, or benefit from new resources.
A related argument is a political one, mindful of the organisational status of a unit that “only” does “exploration” as opposed to something “real”. This is certainly a cultural issue affecting many organisations, but there is no reason to take it as a normative argument for how Analytics should be defined in an ideal organisation. At best, it is an argument for a temporary arrangement that may allow Analytics to prove its true worth to the organisation and hopefully rearrange to a more logical structure at a later stage.
A related issue is one of deployment: the argument that for an insight to be valuable it must be deployed. The usual implication is that only Operational Analytics is of value. This is not an argument against the narrow definition of Analytics. Rather, it suggests that the business of Exploratory Analytics is entirely the creation of business rules to deploy in IT systems. The counter-argument here is not so much disagreement as a broadening of the definitions of “deployment”, “data” and “IT”. If by “IT” we also mean the brains of senior executives, by “data” the unstructured, graphical or tacit (e.g. verbal) kind, and by “deployment” the sharing of insights by word of mouth or PowerPoint slides, then there is actually no argument.
Take a predictive model as an illustration. While the model is a valuable operational rule set when deployed on an IT system and let loose on giga/tera/petabytes of data, it is also a valuable summary of behaviour—indicating key drivers, leading indicators, and interactions from which behaviour can be inferred. Such insights are valuable to executives, but not as business rules. Their “deployment” is largely manual and one-off, often requiring additional explanation and visualisation provided by highly skilled statisticians.
Thus, Analytics is responsible for “deploying” valuable, complex, unrepeatable strategic insights, while the simple, repeatable ones are relegated to IT. Note also that both sets of “deployables”, strategic and operational, can come from the same predictive model.
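A brief sketch of one model yielding both kinds of deployable. A logistic regression stands in for the predictive model; the data and driver names are hypothetical:

```python
# One model, two deployables: operational scores for an IT workflow, and a
# strategic read-out of drivers for executives. Data and names are hypothetical.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
X = rng.normal(size=(500, 3))  # columns: recency, complaints, spend (assumed)
y = (X[:, 0] - 2 * X[:, 1] + rng.normal(size=500) > 0).astype(int)

model = LogisticRegression().fit(X, y)

# Operational deployable: row-level scores handed to a frontline process.
scores = model.predict_proba(X)[:, 1]
print("top-decile score cut-off:", round(float(np.quantile(scores, 0.9)), 3))

# Strategic deployable: the same model read as a summary of behaviour.
for name, coef in zip(["recency", "complaints", "spend"], model.coef_[0]):
    print(f"{name}: {'positive' if coef > 0 else 'negative'} driver ({coef:.2f})")
```

The scores belong in an operational workflow; the drivers belong in a conversation with executives, typically with further explanation and visualisation.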
This completes an outline of a case for a narrow definition of Analytics, demystifying deployment and leaving it to IT, along with model performance measurement, and leaving Analytics to act as an innovation, insight and strategic intelligence function.
We’re building new information delivery systems for a future that isn’t there. Our state-of-the-art environments are already becoming obsolete because our view is distorted by the lens of the past, showing us the future as it was years ago. That world of scarce computing resources and limited data is gone.
That’s Mark Madsen at TDWI, arguing that many of the key assumptions driving our construction of analytic systems—decision support systems, data warehouses, and business intelligence—are wrong. The first wrong assumption is of scarcity. Processor cycles, memory, and storage used to be expensive. They aren’t any longer, but we’re still batch processing our ETL, prematurely archiving, summarising and normalising our data, and limiting our storage of derived information.
His second target is the tabula rasa impulse:
Most data warehouse and BI methodologies assume that you start with no analysis systems in place. The methodologies were created at a time when information delivery meant reports from OLTP applications.
The reality today is that analytics projects don’t start with a clean slate. Reporting and BI applications are common in different parts of the organization.
Third is the assumption of stability. Build-from-scratch methodologies made sense the first time around, but:
By not focusing on evolution, the methodologies miss a key element about analytics: they often focus on decisions that change business processes. Process change means the business works differently and new data will be needed. When someone solves a problem, they move on to a new problem. The work is never done because an organization is constantly adapting to changing market conditions.
One of Analyst First’s key principles is that:
Analytics is not a linear process, like most engineering projects. Its end product is discovery: you cannot determine what will be discovered ahead of time. Thus the outcomes of analytics, and the decisions based on them, cannot be made before the analysis has been carried out.
Consequently we advocate the primacy and ongoing centrality of strategic analytics over operational analytics. The more analytics is conceived of as a set of activities which only adds value at existing operational margins, the more it is unnecessarily constrained and the less it is able to change the game. Echoing yesterday’s Analyst First post on analysis as a read-write activity, Madsen continues that:
Business intelligence methods and architecture assume that what’s being built is a single system to meet all data needs. We still think of analytics as giving reports to users. This ignores what they really want: information in the context of their work process and in support of their goals. Sometimes reports are sufficient; sometimes more is needed.
He goes on to confirm that big data is both challenging status quo electronic infrastructures and driving demand for higher value advanced analytics:
The interaction model for BI delivery is that a user asks a question and gets an answer. This only works if they know what they are looking for. Higher data volumes, more sophisticated business needs, and high-performance platforms require that BI be extended to include advanced analytics. These answer “why” questions that can’t be answered by the simple sums and sorts of BI.
As I’ve argued before, the assumption-heaviness and manual intensiveness of standard BI technologies such as OLAP can’t compete, at scale, with the automated exploration that machine learning methods make possible. Madsen concludes that the data warehouse should be conceived of as a platform rather than an application. His closing four paragraphs are worth quoting in full:
The data warehouse has evolved to the point where it needs to provide data infrastructure, and needs to support information delivery by other applications rather than trying to do both. Data infrastructure requires a focus on longer planning horizons, stability where it matters, and standardized services. Information delivery requires meeting specific needs and use cases.
Design methods today seldom address the need to separate data infrastructure from delivery applications. Designs focus on data management and fitting the database to the delivery tools. This leads to IT efforts to standardize on one set of user tools for everything, much like Henry Ford tried to limit the color of his cars to black.
The new needs and analysis concepts go against the idea that a data warehouse is a read-only repository with one point of entry. They do not fit with established ideas, tools, and methodologies.
Today, the tight coupling of data, models and tools via a single SQL-based access layer prevent us from delivering what both business users and application developers need. The data warehouse must be split into data management infrastructure that can meet high-performance storage, processing, and retrieval needs, and an application layer that is decoupled from this infrastructure. This separation of storage and retrieval from delivery and use is a key concept required by data warehouse architectures as business and technology move forward.
Week 3, Day 2 of the CORTEX MBAnalytics program includes TDWI’s best practices report, ‘Strategies for Managing Spreadmarts – Migrating to a Managed BI Environment’, by Wayne W. Eckerson and Richard P. Sherman, from 2008 and based in part on a survey conducted in 2007. It’s an excellent document—one of the best articulations of a problem with which Business Analytics practitioners and interested parties ought to be familiar: the nature and causes of ‘spreadmarts’, their strengths, weaknesses and limitations, what to do about them, and the risks involved. Given how well the report covers these topics, I commend it in full. But I’m also going to address where it falls short. It’s right in its reasoning and its conclusions, but potentially misleading in its emphasis and what it omits.
By way of definition:
A spreadmart is a reporting or analysis system running on a desktop database (e.g., spreadsheet, Access database, or dashboard) that is created and maintained by an individual or group that performs all the tasks normally done by a data mart or data warehouse, such as extracting, transforming, and formatting data as well as defining metrics, submitting queries, and formatting and publishing reports to others. Also known as data shadow systems, human data warehouses, or IT shadow systems.
Finance generates the most spreadmarts by a wide margin, followed by marketing, operations, and sales… Finance departments are particularly vulnerable to spreadmarts because they must create complex financial reports for internal and external reporting as well as develop detailed financial plans, budgets, and forecasts on an ad hoc basis. As a result, they are savvy users of spreadsheets, which excel at this kind of analysis.
Spreadmarts are categorised into three types:
- One-off reports. Business people use spreadsheets to filter and transform data, create graphs, and format them into reports that they present to their management, customers, suppliers, or partners. With this type of spreadmart, people are using data they already have and the power of Excel to present it. There’s no business justification—or even time—for IT to get involved.
- Ad hoc analysis. Business analysts create spreadmarts to perform exploratory, ad hoc analysis for which they don’t have a standard report. For instance, they may want to explore how new business conditions might affect product sales or perform what-if scenarios for potential business changes. They use the spreadmart to probe around, not even sure what they’re looking for, and they often bring in supplemental data that may not be available in the data warehouse. This exploration can also be time-sensitive and urgent.
- Business systems. Most spreadmarts start out as one-off reports or ad hoc analysis, then morph into full-fledged business systems to support ongoing processes like budgeting, planning, and forecasting. It’s usually not the goal to create such a system, but after a power user creates the first one, she’s asked by the business to keep producing the report until, eventually, it becomes an application itself. This type of spreadmart is called a “data shadow system.”
Of these, it’s ‘business systems’ which are the report’s focus. The ‘one-off report’ and ‘ad hoc analysis’ categories of spreadmart, the report fittingly concludes, are inappropriate for systemisation.
But defining spreadmarts in terms of their being ‘shadow systems performing all the tasks normally done by a data mart or data warehouse’ gets things somewhat backwards. Presupposing that all spreadmarts are appropriate for systemisation—in effect viewing them as ‘data marts in waiting’—is misleading when their one-off and ad hoc uses are recognised and their wide proliferation and coverage taken into consideration. One might more accurately define data marts and data warehouses as ‘scaled-up systems which perform some of the tasks normally done by a spreadmart’.
The report’s framing of spreadmarts is a product of its BI/DW view of the world. In that worldview, data lives in source systems and needs to be ETL-ed into a relational repository in order to be published en masse to business users whose analysis requirements consist of read-only slice and dice. These needs do of course exist, but analysis entails much more. This narrower BI/DW view is reflected in the TDWI survey’s design. For example, respondents are asked to rank the top five reasons why spreadmarts exist in their group. “Quick fix to integrating data” ranks second overall, “Inability of the BI team to move quickly” third, “This is the way it’s always been done” fifth, and “Desire to protect one’s turf” seventh. These options are hardly worded neutrally. They’re phrased so as to present systemisation as the norm, and to cast ad hoc and one-off uses of spreadmarts as the products of sloppiness, frustration, ignorance, and narrow self-interest. The section on the benefits of spreadsheets is similarly biased in its framing. “Ideal for one-time analysis” and “Good for prototyping a permanent system” are listed, but not ‘ideal for exploring data, creating scenarios, capturing assumptions, and enriching existing data’. Interpretation also suffers. For example:
Ironically, organizations with a “low” degree of [BI] standardization have the lowest median number of spreadmarts (17.5), and only 31% haven’t counted them. The proper conclusion here is that standardization increases awareness of spreadmarts.
It may be a convenient conclusion, but it certainly isn’t the only one. Perhaps the 70 to 80 percent failure rate of BI standardisation projects is driving business users back to spreadsheets. Excel integration with BI platforms is also filtered through a read-only lens:
BI vendors are starting to offer more robust integration between their platforms and Microsoft Office tools. Today, the best integration occurs between Excel and OLAP databases, where users get all the benefits of Excel without compromising data integrity or consistency, since data and logic are stored centrally.
This tends to be how BI vendors understand Excel integration. They recognise that users enjoy Excel as a query and reporting interface, but they understate its importance as a data and logic creation tool. The BI/DW worldview understands all data as the product of business processes which write it to ‘source systems’. (The one exception to this is the sub-discipline of enterprise budgeting and planning, which is in effect BI with people as the source systems.) What is underappreciated is that analysis itself is a business process—one which can’t help but create data.
These biases are understandable. The TDWI report is sponsored by a collection of BI software vendors, and 60% of those surveyed are IT professionals. What’s missing, then, is a more nuanced understanding of what analysis entails. Such an understanding needs to recognise:
- The centrality of data transformation and enrichment by individual analysts
- The value of tacit data
- The importance of presentation
Simply put, analysis is a read-write activity. Routine analytical tasks I find myself doing, for example, include the following (a few of them are sketched in code after the list):
- Entering hitherto tacit data
- Codifying business knowledge
- Finding and synthesising data from outside sources
- Creating dummy and randomised data
- Capturing novel assumptions
- Imposing new categories on existing categorical data
- Enriching existing data by deriving or devising on-the-fly metadata
- Building scenarios and constructing counterfactuals
- Drafting and adding commentary, interpretation, and notes
- Formalising and detailing new questions and follow-on analyses
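A minimal sketch of a few of these read-write steps, done in code rather than a spreadsheet. The extract, the categories and the assumptions are all hypothetical:

```python
# Analysis as a read-write activity: several of the tasks above create new data.
import pandas as pd

sales = pd.read_csv("sales_extract.csv")  # hypothetical warehouse extract

# Imposing new categories on existing categorical data.
strategic = {"GOV", "TELCO"}              # a tacit, analyst-defined grouping
sales["segment"] = sales["industry"].map(
    lambda i: "strategic" if i in strategic else "other")

# Enriching existing data by deriving on-the-fly metadata.
sales["margin_pct"] = (sales["revenue"] - sales["cost"]) / sales["revenue"]

# Capturing a novel assumption and building a scenario from it.
CHURN_ASSUMPTION = 0.12                   # elicited from a subject matter expert
sales["revenue_next_yr"] = sales["revenue"] * (1 - CHURN_ASSUMPTION)

# Drafting commentary and notes alongside the numbers.
sales.attrs["note"] = "Assumes FY churn of 12%; assumption untested."
```

None of the columns created here existed in any source system, and no requirements-gathering cycle could have specified them in advance.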
All of these activities involve me creating new data, and I would submit that neither I nor any BI/DW requirement gathering cycle would be able to anticipate that data ahead of time. These are creative, reflective, results-contingent activities. As the report puts it:
[M]ost BI vendors have recognized that a large portion of customers are using their BI tools simply to extract data from corporate servers and dump them into Excel, Word, or PowerPoint, where they perform their “real work.”
If we additionally overlay the question of data value, one of Analyst First’s key contentions is that the most valuable information in organisations lives in people’s heads. It’s tacit, and spreadsheets are one of the best tools for eliciting it and making it explicit:
Spreadsheets are the most pervasive and effective decision support tools. No organisation doesn’t use them, and it’s a safe bet that this will always be the case. No amount of data warehousing will ever be able to provide decision makers with all the information they need. To the extent that it can, those decisions can be automated. Decisions invariably require new data. That new data will be either unanticipatable, or tacit, or both. Spreadsheets are unbeatable for ad hoc data analysis and turning tacit data into explicit data.
But spreadsheets aren’t the only tools available for tacit data mining. Nor, for some types of data, are they the best tools. As the ‘The Economics of Data – Analytics is… Investing in Data‘ post argued:
The most interesting, readily available, strategically relevant and poorly understood form of data is tacit data: the information contained in the brains of staff, board, shareholders and anyone else who would see the organisation do well… How is tacit data mined? The most effective and powerful way is by use of collective intelligence and forecasting techniques, such as prediction markets.
Finally—analysis process, ad hoc, and one-off needs aside—decision support systems need to be more than portals for publishing structured data as tables, charts, and indicators. As the TDWI survey picks up:
While Excel is the most popular tool for building spreadmarts, business analysts also use Microsoft Access, PowerPoint, and SAS statistical tools.
SAS and PowerPoint are telling inclusions: SAS contains a great deal of statistical and modelling functionality that BI stacks don’t (or certainly didn’t in 2007); PowerPoint is able to flexibly integrate unstructured commentary with the more structured outputs of BI platforms—as is Word. The TDWI report itself is an example of this: most of it is unstructured text, then there are graphics and other design elements, and finally the charts and tables. Very few high value analyses don’t contain narrative, diagrams, and other unstructured presentation elements.
The TDWI report does in fact acknowledge all of the above:
[T]here is often no acceptable alternative to spreadmarts. For example, the data that people need to do their jobs might not exist in a data warehouse or data mart, so individuals need to source, enter, and combine the data themselves to get the information. The organization’s BI tools may not support the types of complex analysis, forecasting, or modeling that business analysts need to perform, or they may not display data in the format that executives desire.
The report’s biases are in its view of analysis as fundamentally amenable to structure, and of spreadmarts (as one of analysis’s enablers) as precursors of systems. Scalability, repeatability and automation certainly have their place, but a more realistic view would recognise that the analytical activities which a data warehouse can viably support are a subset. This has important implications for BI/DW practices now that the Business Analytics domain incorporates them alongside advanced analytics and the emerging field of big data.
Week 1, Day 5 of the CORTEX MBAnalytics program includes Tom Davenport’s ‘Rethinking Knowledge Work: A Strategic Approach’ from the McKinsey Quarterly of January 2011. In the essay, Davenport argues that productivity software hasn’t boosted the productivity of “knowledge workers” to the extent hoped for given the outlays of the last two decades. The primary method employed over this period has been what he calls ‘free-access’: providing knowledge workers with tools and information and leaving it to them to work out what to do with them:
In this model, knowledge workers define and integrate their own information environments. The free-access approach has been particularly common among autonomous knowledge workers with high expertise: attorneys, investment bankers, marketers, product designers, professors, scientists, and senior executives, for example. Their work activities are seen as too variable or even idiosyncratic to be modeled or structured with a defined process.
This approach suits when there is uncertainty, ambiguity, and contingency, each of which works against predictability. The upside is the ability of humans to adapt to these. The downside is that autonomy doesn’t come for free. Workers will execute variably, some poorly. The lack of standardisation leads to duplication and other kinds of inefficiency. Precise performance measurement and management is also a challenge. Typical productivity metrics in the free-access domain are rough and high level, if present at all, and there is a trade-off between additional measurement and ease of information access.
The alternative model Davenport terms ‘structured-provisioning’, in which tasks and deliverables are defined and knowledge workers slotted in. Typical examples are workflow or ‘case management’ systems, which integrate decision automation, content management, document management, business process management, and collaboration technologies:
Case management can create value whenever some degree of structure or process can be imposed upon information-intensive work. Until recently, structured-provision approaches have been applied mostly to lower-level information tasks that are repetitive, predictable, and thus easier to automate.
The upside is efficiency. The downsides are worker alienation and resistance, and detrimental business outcomes resulting from complexity and poor specification—bad mortgages, for example.
Davenport believes that businesses should increasingly “structure previously unstructured processes”: that is, the free-access domain should be progressively brought under structured-provisioning. He uses a 2 x 2 matrix to frame his argument. On the x-axis is ‘Complexity of work’, ranging from Routine across to Interpretation/judgement. On the y-axis is ‘Level of interdependence’, ranging from Individual actors up to Collaborative groups. The resulting knowledge work quadrants are:
- Transaction model (Routine x Individual actors)
- Expert model (Interpretation/judgement x Individual actors)
- Integration model (Routine x Collaborative groups)
- Collaboration model (Interpretation/judgement x Collaborative groups)
The Transaction model contains most existing structured-provisioning, and the Collaboration model—consisting of “Improvisational work”, “Highly reliant on deep expertise across multiple functions”, and “Dependent on fluid deployment of flexible teams”—is inherently free-access. Davenport sees the Expert and Integration models, however, as open to further structured-provisioning.
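Read as a data structure, the matrix is just a lookup from the two axis values to a quadrant. A trivial sketch, with the axis values abbreviated for illustration (the quadrant names are Davenport’s):

```python
# Davenport's 2 x 2 matrix as a simple lookup: (complexity, interdependence)
# pairs map to his four knowledge-work quadrants. Axis values are abbreviated
# here for illustration; only the quadrant names come from the essay.
QUADRANTS = {
    ("routine", "individual"): "Transaction model",
    ("judgement", "individual"): "Expert model",
    ("routine", "collaborative"): "Integration model",
    ("judgement", "collaborative"): "Collaboration model",
}

def classify_work(complexity: str, interdependence: str) -> str:
    """Place a kind of knowledge work in one of Davenport's quadrants."""
    return QUADRANTS[(complexity, interdependence)]

# Business Analytics, per the discussion below, sits in the Expert model:
print(classify_work("judgement", "individual"))  # -> "Expert model"
```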
Martin Ford’s book, The Lights in the Tunnel: Automation, Accelerating Technology and the Economy of the Future (available free as a PDF download), further illuminates these trends. Ford identifies three categories of job vulnerable to displacement by technology:
- Hardware jobs, such as assembly line jobs, which become displaced by robotics—a process which is already well underway.
- Software jobs, such as radiology, which are first displaced by outsourcing, then by AI.
- Interface jobs, such as loan officers, which become displaced by telecommunications, digitisation, and data standardisation.
‘Rethinking Knowledge Work’ is an interesting change of direction for Davenport. His seminal ‘Competing on Analytics’ essay, and the book that followed, profiled business effectiveness and adaptiveness powered by analytics. The arguments here, by contrast, are all about efficiencies.
[T]o date, high-end knowledge workers have largely remained free to use only the technology they personally find useful. It’s time to think about how to make them more productive by imposing a bit more structure. This combination of technology and structure, along with a bit of managerial discretion in applying them to knowledge work, may well produce a revolution in the jobs that cost and matter the most to contemporary organizations.
Given the vulnerability of so much knowledge work to displacement, it’s a good time to be an analyst. Business Analytics clearly lives in the “Expert model” quadrant. Further to that, Davenport sees it as playing a role in augmenting other expertise within that domain:
Expert jobs may also benefit from “guided” data-mining and decision analysis applications for work involving quantitative data: software leads the expert through the analysis and interpretation of data.
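A toy example makes the idea concrete: the software fits a small model and hands the expert its logic as plain rules to interpret. This is a minimal sketch only; scikit-learn and the synthetic churn data are my assumptions for illustration, as the essay names no particular tools.

```python
# "Guided" data mining, minimally: the software fits a model and surfaces
# its logic as readable rules, leading the expert through interpretation.
# The synthetic churn data and scikit-learn are illustrative assumptions.
import numpy as np
from sklearn.tree import DecisionTreeClassifier, export_text

rng = np.random.default_rng(0)
X = rng.random((500, 2))            # columns: tenure, monthly_spend (0-1 scaled)
y = (X[:, 0] < 0.3).astype(int)     # short-tenure customers labelled as churners

tree = DecisionTreeClassifier(max_depth=2).fit(X, y)

# The expert is led through the analysis as human-readable decision rules:
print(export_text(tree, feature_names=["tenure", "monthly_spend"]))
```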
This further validates Analyst First principles, namely our insistence on the importance of human over electronic infrastructure, our conception of Business Analytics as an intelligence rather than IT function, and our focus on strategic in preference to operational analytics.
The Analyst First view is that strategic analytics is in many respects easier than operational analytics. In part, operational analytics is hard because motivating and coordinating humans is hard. For typical operational analytics applications to consistently work end-to-end (e.g. driving up customer acquisition, retention and value via predictive modelling and campaign management) and to be able to prove and articulate their value-add, they require the coordination and cooperation of, at a minimum, people in each of the following organisational functions:
- Data Warehousing and BI
- Call Centre
- Product Management
This dependent set of business processes is difficult to execute. The processes are inherently brittle due to their many moving parts, but they are also difficult to coordinate because they are human-centric: the monitoring and performance management needs of humans are demanding and resistant to automation, and the maintenance of human capital is far more mercurial and challenging than the maintenance of physical or information capital. This has implications for competition. It decreases the attractiveness of competing on analytics—particularly operational analytics—relative to alternative competitive frontiers.
Google is a competing-on-analytics business through and through. But its recent purchase of Motorola Mobility, according to many analyses, was about arming it to attack its competitors in the courtroom using its lawyers rather than in the marketplace using its engineers. Motorola Mobility’s thousands of patents provide Google with new ammunition in its patent arms race with Apple, Microsoft, and others in the mobile telephony hardware sector. Google’s move into the political lobbying game was similarly explained five years ago.
What makes lawsuits and lobbying more attractive than analytics—to a company built on analytics? The law and the legislature are like analytics in many respects: complex domains, information based, and the province of highly qualified, experienced and intelligent specialists. However, far less of their complexity is contingent on the effective coordination and performance management of human activities inside an organisation. Operational analytics frequently fails because a business is divided against itself. Lawyers and lobbyists, on the other hand, can represent a large, complex, multinational business as a single, unified entity. The human coordination effort is far simpler and much less fragile.
Like other competitive processes, lawsuits and lobbying campaigns can be more effective if they’re informed by analytics. But the sort of analytics that can inform them will be of a more bespoke, tactical, or strategic nature, and less amenable to standard practices and operationalisation as IT processes.
Philip Russom at TDWI warns against the ‘analytic cul-de-sac’, “where the epiphanies of advanced analytics never get off a dead-end street to be fully leveraged elsewhere in the enterprise.” He is concerned about analysts who love to chase insights but “can’t be bothered with operationalizing their epiphanies”:
In other words, once you discover the new form of churn, analytic models, metrics, reports, warehouse data, and so on need to be updated, so the appropriate managers can easily spot the churn and do something about [it] quickly, if it returns. Likewise, hidden costs, once revealed, should be operationalized in analytics (and possibly reports and warehouses), so managers can better track and study costs over time, to keep them down.
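In code, the kind of operationalisation Russom describes might amount to a recurring scoring job like the one below. This is a hypothetical sketch: the model, file layout, and threshold are my assumptions, not Russom’s prescription.

```python
# A hypothetical sketch of operationalising a churn insight: a recurring job
# rescores customers with the refreshed model and writes an at-risk report
# for the appropriate managers. All names and formats are assumptions.
import csv
import pickle
from datetime import date

def score_and_flag(model_path: str, customers_path: str, report_path: str,
                   threshold: float = 0.7) -> None:
    """Score every customer with the updated churn model; report those at risk."""
    with open(model_path, "rb") as f:
        model = pickle.load(f)               # refreshed after the discovery

    with open(customers_path, newline="") as f:
        customers = list(csv.DictReader(f))  # e.g. columns: id, tenure, spend

    with open(report_path, "w", newline="") as f:
        writer = csv.writer(f)
        writer.writerow(["customer_id", "churn_probability", "as_of"])
        for c in customers:
            features = [[float(c["tenure"]), float(c["spend"])]]
            p_churn = model.predict_proba(features)[0][1]
            if p_churn >= threshold:
                writer.writerow([c["id"], round(p_churn, 3),
                                 date.today().isoformat()])
```

Scheduling such a job is exactly the kind of IT-style repeatability that suits some insights and, as argued below, eludes many others.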
I find Russom’s advice interesting primarily for its overarching assumption, which is that an insight suggests its own operationalisation. His examples contain insights that close the gap between data and decision, that are “actionable”. This has not always been my experience. To the contrary, insights more often than not turn the simple and understandable into the complex and uncertain. In response, the appropriate course of action may be unclear. It may be that nothing can be done. It may be that nothing should be done—that doing anything will make things worse. It certainly can’t be assumed that managers will know what to do, let alone quickly.
Insights can contribute to sensemaking and situational awareness without carrying implications for decision support and action. This makes them valuable, but not always welcome. Sometimes new information is disruptive.