The buzzword of the year seems to be “Big Data”. There is a massive wave of promoters of the term, and there are inevitable detractors. There is also the issue of exactly how to define it. What follows is the A1 view on Big Data.
It is real, it is a game changer, and it is here to stay. It is no one thing, and its definition, both quantitative and qualitative, is rather fluid. Nevertheless, some basic truths apply: Big Data is not a brand name. Neither is Big Data a tool, a business process or a solution. It isn’t even an idea as such. In fact, Big Data is best understood as a problem. Not a problem as in “trouble”, but a problem in the sense of a challenge or puzzle, or more precisely a growing family of problems that we are increasingly forced to grapple with. It’s a problem that does not come with an automatic solution, although there are a growing number of tools to help roll it around.
The A1 angle on this is: you cannot outsource your investment in Big Data any more than you can outsource your own education, your own exercise, or being the patient in an operating theatre. In this sense, what is true of Big Data is also true of Analytics.
Getting Big Data right means getting Small Data even righter. The sort of business that can get value out of Big Data will be one already getting value out of Small Data. Without the business fundamentals in place, Big Data will produce only Big Nonsense. Alternatively, if the logic is there, then Big Data will enhance an existing value-adding framework.
So: small data first, then big data. And before small data, tacit data, which you can always get your hands on, even if you have trouble wrangling the electronic stuff. And before all of those: logic, and human infrastructure. A well understood, well defined business model with well defined intelligence objectives. And incentives, with staff capable of navigating such an environment, managed by a sponsor possessed of the A1 “holy trinity” of adequate influence, appropriate motivation and sufficient understanding of the value, role, and needs of Analytics under their command. Is this too much to ask for?
I should also probably mention tools. Maybe. Last. Do they matter? Of course. So does oxygen. But it is ubiquitous, effectively free, and we take it for granted…
The recent IAPA discussion panel on ‘Aligning IT and Analytics to deliver sustainable innovation’, plus a later conversation with fellow panellist, EMC-Greenplum’s James Horton, prompted me to sketch some thoughts on what an Analytics Lab ought to do. The lab is the natural home for Analysts engaged in the narrower definition of Analytics:
The Analytics Lab is an innovation factory which constantly evaluates data, quantitative methods and tools looking for sources of competitive advantage.
- Data: structured and unstructured, sourced from both inside and outside the organisation, established and new.
- Methods: data transformation, and then data mining, machine learning, statistical, mathematical, and other analytical methods.
- Tools: as appropriate to method, from programming languages through to GUI applications, from commodity and open source through to commercial tools.
- Analysts: the lab enables the organisation to evaluate the technical abilities and innovative propensities of its analysts, as well as those on offer from external service providers, without many of the interfering factors present in operationally hardened IT environments.
Its outputs are:
- BI prototypes
- Instantiation candidates
- Identified data and knowledge gaps: Analysing data and generating insights brings to light new data needs and exposes gaps in knowledge which may impact the business. Additional data may need to be sourced, gathered through survey, collected by tweaking an existing business process, or purchased from a third party. Additional analyses and subject matter expertise may be required to close knowledge gaps.
- Resolved disharmonies: All businesses struggle with ‘different views of the truth’, and it’s often the crunching of data which brings these to light. Disharmonies might be within or between data sets, or between conventional wisdom and the drivers of a model. They could relate to anything from actual observations to tacit assumptions. Resolving such disharmonies—harmonisation—involves identifying, scoping, validating, and correcting them.
These last two are not the core business of Analytics, but they’re important activities, and doing Analytics naturally leads to them. Most organisations don’t explicitly provision for them, but arguably they should. The lab is as good a home for them as any other.
The Analytics Lab services all levels of business, but in different ways:
- Senior Management: through the provision of strategic insights.
- Middle Management and Knowledge Workers: through one-off and/or prototyped BI analyses.
- Frontline Workers: through the identification of instantiation candidates, i.e. deployable operational analytics.
Many analyses typically need to be tried before those which merit instantiation are discovered. Furthermore, “instantiation” doesn’t necessarily mean a repeatable process. It could simply mean the communication of a one-off insight, e.g. “revenue growth is unmistakably slowing in all but one customer segment” or “the most reliable predictor of a customer’s propensity to churn is their social network membership.” Such insights are typically complex and valuable, but not “actionable” in any deterministic, automatable way.
Other findings are suited to more regularised delivery, for example as managerial decision support through business intelligence.
Some analytical results, in order to be fully leveraged, need to be integrated into frontline business processes. Predictive models which predict customer acquisition or churn, for example, might require integration in sales, marketing, call centre, channel management and customer support processes.
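As a concrete sketch of what such a frontline instantiation might look like, consider a churn model’s scores driving a call-centre routing step. Everything below is illustrative: the feature names, weights, intercept and threshold are hypothetical, not drawn from any real model in the post.

```python
import math

# Hypothetical coefficients, as if produced by a churn model built in the lab.
# Feature names and weights are purely illustrative.
CHURN_WEIGHTS = {
    "months_since_last_purchase": 0.15,
    "support_tickets_90d": 0.30,
    "tenure_years": -0.20,
}
INTERCEPT = -1.0


def churn_score(customer: dict) -> float:
    """Logistic score in (0, 1): higher means a greater propensity to churn."""
    z = INTERCEPT + sum(w * customer.get(f, 0.0) for f, w in CHURN_WEIGHTS.items())
    return 1.0 / (1.0 + math.exp(-z))


def route(customer: dict, threshold: float = 0.5) -> str:
    """Frontline instantiation: flag high-risk customers for the retention team."""
    return "retention_queue" if churn_score(customer) >= threshold else "standard_queue"
```

The point is not the arithmetic but the hand-off: once the lab has proven the model, a deterministic rule like `route` can be embedded in sales, call-centre or customer-support systems while the analysts move on.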
Conduct disciplined, exploratory analyses which repeatedly cycle through the following sorts of questions:
- Is there structure in the data (patterns, trends, relationships, networks, segments, clusters, indicators, drivers, outliers, anomalies)?
- Are there new insights in the data?
- Which models are viable?
- Which variables are important?
- Which variables do we control?
- What are the implications for revenue, cost, risk?
- What data do we want that we don’t have? How could we get it?
- What are the implications of this insight?
- Who is our internal customer for this insight?
- Would this analysis be valuable if provided on an ongoing basis? To whom?
- Into which existing or envisioned business processes should this insight be instantiated?
- Where are there disharmonies in tacit or explicit data and assumptions?
- Which projects, processes and decisions are affected by these disharmonies?
- How do we validate and resolve these disharmonies?
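To make one of the questions above concrete, here is a minimal sketch of the “outliers, anomalies” check: a crude z-score rule over a toy series. The threshold and data are assumptions for illustration; real exploratory work would try many such probes.

```python
import statistics


def find_outliers(values, z_threshold=2.0):
    """Flag points more than z_threshold population standard deviations from
    the mean -- one crude answer to 'is there structure (outliers, anomalies)
    in the data?'"""
    mean = statistics.mean(values)
    stdev = statistics.pstdev(values)
    if stdev == 0:
        return []
    return [v for v in values if abs(v - mean) / stdev > z_threshold]


# A toy monthly revenue series with one anomalous month.
monthly_revenue = [100, 102, 98, 101, 99, 100, 250]
print(find_outliers(monthly_revenue))  # → [250]
```

In the lab, a hit from a probe like this is not an answer but a prompt for the next cycle of questions: is the anomaly an error, an insight, or a disharmony to be resolved?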
Infrastructure can usefully be separated into the ‘electronic infrastructure’ of hardware and software and the ‘human infrastructure’ of people, relationships, management and incentives.
- Secure, off-network ‘sandpit area’
- Big storage, big memory, scalable to big data
- Eclectic analytical toolset: commodity, open source, commercial, experimental, in-house
- Snapshots, copies, feeds of all manner of available data sources: pre-ETL, pre-warehouse, post-warehouse, external, web, social media, unstructured. In the context of the lab, the data warehouse is just another source system.
- De-emphasis on repeatable technical processes and compliance with production IT architecture
- Insulated from IT Service Level Agreements and other production / core system / business-as-usual constraints
- Human Resources:
- Analysts: Data scientists
- Management: Validate analysis objectives, ensure that analysts remain focused, performance manage the innovation process.
- Sponsorship from Executive
- Cross-functional relationships with business units: both ‘push’ (business unit as customer) and ‘pull’ (business unit as subject matter expert)
- Close relationship with Strategy function
- ‘Caveat utilitor’ relationship with IT for data provision and tool support
- Various relationships with service providers: vendors, consultants, training and mentoring providers, industry expertise, academia if appropriate
- Performance Management:
- Innovation / Research metrics
- Risk metrics
- Sentiment metrics
- Dimensions of opportunity: Internal, Competitor, Market, Customer, Product, Channel
Related Analyst First posts:
- Aligning IT and Analytics to deliver sustainable innovation
- Needles, Haystacks, and Category Errors, or, Where Does Operational Analytics Fit?
- Systemising skepticism
- Assume bad data
- The Economics of Data – Analytics Is… Investing in Data
- Decision support versus decision automation
IBM has released its latest biennial C-suite study of CIOs, The Essential CIO: Insights from the Global Chief Information Officer Study, summarising interviews with more than 3,000 CIOs. The study is available for download here (free, requires registration). It contains much of interest for Business Analytics practitioners and sponsors:
One of the most compelling findings in the study is that CIOs are now increasingly in step with CEOs’ top priorities. One priority they agree on is how critical it is for today’s public and private sector organizations to derive insight from the huge volumes of data being amassed across the enterprise, and turn those insights into competitive advantage with tangible business benefits.
CIOs increasingly help their public and private sector organizations cope with complexity by simplifying operations, business processes, products and services. To increase competitiveness, 83 percent of CIOs have visionary plans that include business intelligence and analytics.
Our research suggests that this new alignment [between CIOs and CEOs] comes as CEOs better understand the importance of technology. They increasingly rely on CIOs to turn data into usable information, information into intelligence and intelligence into better decisions.
Business intelligence and analytics ranked as the highest CIO priority across the board, ahead of Mobility solutions, Virtualization, Cloud computing, Business process management, Risk management and compliance, Self-service portals, and Collaboration and social networking. There was also “remarkable consensus” on how these priorities should be addressed:
CIOs identified the top three success factors for IT initiatives as putting in place the correct IT/ business talent, managing beyond line responsibilities and creating the right conditions before starting.
Read that as an argument for the importance of human infrastructure.
The study segments organisations into four groups based on ‘CIO Mandate’. Each mandate reflects how the IT function is viewed by the rest of the organisation:
- Leverage: “Provider of fundamental technology services”
- Expand: “Facilitator of organizational process efficiency” (the most common CIO mandate)
- Transform: “Provider of industry-specific solutions to support business”
- Pioneer: “Critical enabler of business/organization vision”
These are presented as cumulative, but not as a progression path per se – rather as a reflection of the nature of different businesses. They also read as a continuum from operational to strategic. Some organisations see IT’s job as keeping the machines running (Leverage), and perhaps facilitating marginal efficiencies (Expand). The more strategic IT functions are seen by their organisations as enablers of competitive advantage (Transform) or drivers of change (Pioneer). Nonetheless, Business Analytics is on most mandates’ radar:
A full 95 percent [of Expand mandate CIOs] said they would lead or support efforts to drive better real-time decisions and take advantage of analytics.
Analytics and data management hold the key to extracting greater value from data. Over the next three to five years, the majority of Transform mandate CIOs across our sample will focus on customer analytics, product/service profitability analysis and master data management.
This means moving beyond traditional relational database management systems into the next generation of integrated data warehouses and analytical tools. A Consumer Products CIO in Australia said, “A master data management initiative will cleanse corporate data, facilitating our ability to deliver rich customer analytics for the business.”
IBM’s recommendations to Transform CIOs include:
Harness more real-time data: Generate insights through feedback collection, sentiment analysis and connecting CRM to social networks. Use the data explosion to grow relationships with all key stakeholders.
Analyze! Dive deep into advanced analytics to develop insights into customer behavior, value chain relationships and competitive intelligence. Deploy text analysis to glean insights from structured and unstructured data, including blogs, customer service records and Web transactions.
On the Pioneers:
This group of CIOs ranked product/service profitability analysis and product/service utilization analysis as their top two priorities for turning data into usable intelligence.
[Pioneer] CIOs are in a unique position within an organization. They help generate and have access to customer preference data, supply chain patterns, emerging trends—both within their organizations and from competitors— Internet behavior and response patterns, and so much more. Combining this data with marketing analytics can reveal previously undiscovered and unmet needs. It can lead to product innovations, massive process changes, cross-industry value chain cooperation and other synergies across industries.
IBM’s recommendations to Pioneer CIOs include:
Develop a culture of analytics: Build predictive intelligence capabilities that can fundamentally change the business. Encourage widespread application of analytics to fully leverage business intelligence. Take an advanced look at what drives profitability.
Add dials to your dashboards: Offer dynamic dashboards using real-time data and use predictive analytics to provide situational metrics, including: formal business case monitoring; customer satisfaction; employee motivation; and social value and sustainability.
Pioneers are encouraged to ask themselves, among other questions:
How can you develop the talent to apply predictive intelligence to radically change your business model, products or industry?
How will you design dynamic dashboards that leverage real-time data and predictive analytics?
Among the concluding advice for CIOs across the board is:
Embrace the power of analytics: Educate yourself, your team and your organization about extracting meaning from unstructured data sources, predictive intelligence, social network analysis and sentiment mining.
The study is a useful companion piece to the recent McKinsey report on big data. Both align Business Analytics with recent technological trends: big data, RFIDs and sensors, real-time Web transactions, and so on. Both also focus on operational analytics (e.g. “dynamic dashboards using real-time data”) and decision automation. I see this as a shortcoming of the McKinsey report and as more understandable in IBM’s case given their market and worldview.
There is a notable recognition of complexity as a growing problem and a desire to simplify (internally, for clients, for partners).
Also, reading between the lines, evidence that Business Analytics basics are still causing trouble for organisations. Transform CIOs are either planning to or advised to build dashboards, for example. I’ve learned to interpret this as “we’re having trouble getting to our data”. There is also a tendency to conflate data warehousing with data analysis. The quote above from the Australian Consumer Products CIO is a good example of this.
Finally, the study focuses on CIO plans and priorities, not on their successes and failures. I’d be fascinated to know what CIOs have been surprised by in the past. What’s exceeded and disappointed expectations?
Related Analyst First posts:
- McKinsey on big data
- The Business Analytics market should be much bigger
- Vendor worldviews
- Vendor worldviews evolve
- Decision support versus decision automation
“We’ve got good data.”
It’s common to hear people make this claim in the context of a nascent or proposed Business Analytics initiative. Sometimes a rationale is offered: there’s a data warehouse, or a system migration project was recently completed. However if by “good” we mean data having at least all of the following properties…
- Fitness for purpose
…then a better starting assumption would be: “We don’t have all the data we want, and what we have is problematic.”
There are some basic reasons for this expectation:
- In BI you’re generally surfacing existing data that has been historically under-accessible. You should expect data that has not been routinely scrutinised to be bad because there’s been no incentive for it to be good.
- In Analytics you’re frequently uncovering new insights. You should expect these to raise hitherto unanticipated questions. Answering these is typically going to require at least some data you haven’t thought to collect, source, acquire, or derive before.
The prudent approach is to assume data problems. This shouldn’t derail your Business Analytics initiative. Far from it. Acquiring a comprehensive understanding of your data should be one of your initial, primary and explicit objectives – not an incidental series of in-project obstacles.
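One way to make that understanding an explicit objective is to profile the data before any modelling begins. The sketch below is a minimal first-pass column profile; the missing-value sentinels (`None`, `""`, `"NULL"`, `"N/A"`) are assumptions chosen for illustration, not a standard.

```python
def profile_column(values):
    """A first-pass data-quality profile for a single column: the sort of
    'assume bad data' check worth running before any analysis."""
    # Sentinel values treated as missing -- an illustrative assumption;
    # every source system has its own conventions.
    sentinels = (None, "", "NULL", "N/A")
    total = len(values)
    missing = sum(1 for v in values if v in sentinels)
    return {
        "total": total,
        "missing": missing,
        "missing_pct": round(100 * missing / total, 1) if total else 0.0,
        "distinct": len(set(values)),
    }


# A toy 'customer age' column, with the mess typical of under-scrutinised data.
ages = [34, None, 29, "N/A", 29, 41]
print(profile_column(ages))
```

Surfacing a 33% missing rate on day one, rather than mid-project, is exactly the difference between an explicit objective and an incidental series of in-project obstacles.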
This post examines data as an economic resource and a source of value, and provides a new functional definition of Analytics on that basis.
Data is a fascinating resource, with a number of characteristics that distinguish it from other things that we think of as “resources”.
The first point is that data is not a commodity. The inherent value of data is that every piece tells you something new, and the pieces vary dramatically in their meaning, importance, value and reliability. While it is intuitively appealing to think of data in aggregate, and to consider its volume in tera-, peta- or zettabytes, it pays to note that the value in such data is not a function of volume in any reliable sense. This is at odds with commodities such as iron ore, gold or crude oil, which can be measured directly in dollars per kilogram.
The value in data, while very real, is not easy to quantify. In this way, data is more like human resources and less like gold or oil. It is inherently heterogeneous in nature, and its value is not proportional to its mass.
Nor is the value extraction process homogeneous. While the way you extract value from one lump of iron ore is no different from that applied to another, the same does not hold true for data. Further, the method of value extraction may well be unknown, and require further exploration. Even if a value extraction method has been identified, and indeed proven to work, there may well be additional value in the data. Finally, the value of data may well only prove itself in concert with other data. This is a synergistic, almost alchemical effect for which there is no good metaphor in the world of commodities.
Where the commodity resource analogy does hold is that data holds value locked within it, and effort must be applied to locate and extract it.
Analytics, particularly in its “data mining” incarnation, is best described as investment in the extraction of value from this curious, heterogeneous, synergistic resource. The tools, skills, techniques and processes of Analytics are all in the service of this investment enterprise.
The act of investing in data is best considered by way of analogy. Investing can be a simple, hands-off activity, based on a simple transaction – money is put in, and value is generated by an external process. This is how stock investment works. The agent investing in the stock merely puts up the money.
Now consider investing in one’s own education. Or health and fitness program. In these cases, considerable effort by the individual is required: they cannot outsource their intimate involvement in the process invested in.
Analytics is like the latter process for any business leader investing in data. If they are not intimately involved in the process, then the process is most likely a failure, and almost certainly a waste.
The “mining” analogy is thus used at one’s peril. So is an over-reliance on specific tools, processes or preconceptions about what is in the data. Regular mining is not a highly specialised form of labour, and much mining activity is rapidly automated.
Data mining is very different, and the mining analogy fails spectacularly if it envisages a repetitive, well-defined activity, where large machines extract value in a predictable, reliable way, supported by interchangeable people performing repetitive tasks. Indeed, following the mining analogy, real data miners are less like miners and more like prospectors and geologists, performing difficult, ever-changing and highly skilled work to detect value when and how it may arise.
Where they do identify repeatably extractable value (e.g. credit scoring models, churn models, fraud detection algorithms) these nuggets of (temporary) value can indeed be automated and a production process applied. But this is no longer Analytics; rather it is the IT instantiation of Analytics results. The data miner has moved on to discover value somewhere else.
Thus data is a very unusual value store, with equally unusual, heterogeneous value extraction methods, and a reliance on adaptive, exploratory, highly skilled professionals to make them happen.
Data has another unusual economic property: it remains. As an economic asset, it can be preserved at negligible cost, while other assets – premises, machines, staff, funds and lines of credit – may diminish in tough economic times. The potential value of the data is hard to measure, as discussed above, but whatever it may be, it grows relative to other shrinking assets. Thus, in hard times, the relative value of data grows, and the business case for Analytics, or Investing in Data, grows with it.
Data itself may grow in volume, scope and complexity with relative ease and low cost, thus the business case for Analytics grows with it.
Data may take many forms, of which the structured, electronic enterprise data most familiar to analysts is only one of many. Unstructured data is anywhere from 4 to 20 times more voluminous than the structured kind. Survey data may be collected to answer those questions that enterprise data fails to answer.
A Postscript – Tacit Data Mining
The definition of Analytics above extends very broadly in its definition of data.
The most interesting, readily available, strategically relevant and poorly understood form of data is tacit data: the information contained in the brains of staff, board, shareholders and anyone else who would see the organisation do well. Tacit data is seen as the most difficult to identify, extract, value and leverage.
The definition of Analytics presented above includes investment in the extraction of value from all data, including tacit. How is tacit data mined? The most effective and powerful way is by use of collective intelligence and forecasting techniques, such as prediction markets. This is a topic for another day.
About us
Analyst First is a new approach to analytics, where tools take a far less important place than the people who perform, manage, request and envision analytics, while analytics is seen as a non-repetitive, exploratory and creative process where the outcome is not known at the start, and only a fraction of efforts are expected to result in success. This is in contrast with a common perception of analytics as IT and process.