Electronic data, that is. The most important decisions aren’t data-friendly, but they are the ones worth the most in dollars, nerves, careers and lives.
“Do we want to mail an offer to this particular person?” is a far less important question than “Do we want to acquire this company?”. The former supports a precise, very low-level action for which data exists, because essentially the same action has been carried out many times before and will be again. But how do we apply analytics directly to the second question?
This is where collective forecasting can help, by applying analytics rigour to get the benefit of the most important data in an organisation, the tacit data in the heads of its people.
Collective forecasting is a truly “Analyst First” technique: the analyst comes before software, and even before (electronic) data. Indeed, software is helpful, but not essential, and data may be scattered, in short supply or absent entirely.
Here is a presentation given last week at the Australian Institute of Professional Intelligence Officers (AIPIO) annual conference, explaining the benefits of the collective forecasting approach to organisational strategic decision making. These include a powerful KPI for strategic forecasting and decision making, and the flow-on effects of a truly meritocratic, depoliticised decision making culture, in which the Highest Paid Person’s Opinion (HiPPO) does not carry the same weight as a good predictive track record.
Improvement is gained through the use of the group, or collective forecast, which fuses the tacit knowledge of relevant knowledge holders to create a more reliable decision making mechanism.
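As a minimal sketch of the mechanics, with hypothetical numbers: each forecaster states a probability, the group forecast is a simple average, and track records are measured with the Brier score, the kind of forecasting KPI referred to above (lower is better).

```python
# Minimal sketch of a collective forecast (all numbers hypothetical).
# Each forecaster states a probability that an event will occur; the group
# forecast is the simple average of the individual forecasts.
forecasts = {
    "analyst_a": 0.80,
    "analyst_b": 0.65,
    "hippo":     0.20,  # a confident senior opinion, included for contrast
}
group_forecast = sum(forecasts.values()) / len(forecasts)

outcome = 1  # suppose the event occurred

def brier(p, outcome):
    """Brier score: squared error of a probability forecast (lower is better)."""
    return (p - outcome) ** 2

for name, p in forecasts.items():
    print(f"{name:10s} forecast={p:.2f} brier={brier(p, outcome):.3f}")
print(f"{'group':10s} forecast={group_forecast:.2f} brier={brier(group_forecast, outcome):.3f}")
```

Averaged over many questions, the fused forecast typically beats the average individual, and the per-forecaster scores surface the best predictive track records, HiPPO or not.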
The presentation also reports results from the first round of AIPIO’s collective forecasting competition, in which the group forecast performed very well, as expected.
Readers are invited to take part in the second round of the competition, which is currently running.
The buzzword of the year seems to be “Big Data”. There is a massive wave of promoters of the term, and there are inevitable detractors. There is also the issue of exactly how to define it. What follows is the A1 view on Big Data.
It is real, it is a game changer, and it is here to stay. It is no one thing, and its definition, both quantitative and qualitative, is rather fluid. Nevertheless, some basic truths apply: Big Data is not a brand name. Neither is Big Data a tool, a business process or a solution. It isn’t even an idea as such. In fact, Big Data is best understood as a problem. Not a problem as in “trouble”, but a problem in the sense of a challenge or puzzle, or more precisely a growing family of problems that we are increasingly forced to grapple with. It’s a problem that does not come with an automatic solution, although there are a growing number of tools to help roll it around.
The A1 angle on this is: you cannot outsource your investment in Big Data any more than you can outsource your own education, or exercise, or being a patient in a surgery theatre. In this sense, what is true of Big Data is also true of Analytics.
Getting Big Data right means getting Small Data even righter. The sort of business that can get value out of Big Data will be one already getting value out of Small Data. Without the business fundamentals in place, Big Data will produce only Big Nonsense. Alternatively, if the logic is there, then Big Data will enhance an existing value-adding framework.
So: small data first, then big data. And before small data, tacit data, which you can always get your hands on, even if you have trouble wrangling the electronic stuff. And before all of those: logic, and human infrastructure. A well understood, well defined business model with well defined intelligence objectives. And incentives, with staff capable of navigating such an environment, managed by a sponsor possessed of the A1 “holy trinity” of adequate influence, appropriate motivation and sufficient understanding of the value, role, and needs of Analytics under their command. Is this too much to ask for?
I should also probably mention tools. Maybe. Last. Do they matter? Of course. So does oxygen. But it is ubiquitous, effectively free, and we take it for granted…
A1 is a proud supporter of the AIPIO Collective Forecasting Competition, hosted on Presciient’s new collective forecasting platform System II.
A beginner’s guide may be found at the top of the page.
Collective forecasting and related methods such as prediction markets make up the area of analytics that we call Tacit Data Mining: the extraction, deployment and analysis of the most vital data in the organisation, the data that lives in people’s heads. It is also the ultimate data fusion platform, fusing all available data through human filters to provide powerful strategic decision support.
Collective forecasting allows accurate forecasting of future events, and can also condition those events on possible actions, thus providing powerful decision support. It identifies the consistently most effective forecasters, acting as a filter for the most insightful and prescient members of staff or the public.
It has application in any strategic decision support domain.
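A minimal sketch of the conditioning idea, with invented forecasters, actions and probabilities: the same question is put to the group once per candidate action, and the spread between the conditional group forecasts becomes the decision signal.

```python
# Invented example: the same question asked under each candidate action.
# "Will we hit the strategic target, given that we take this action?"
conditional_forecasts = {
    "acquire_company":   [0.55, 0.60, 0.48],  # one probability per forecaster
    "organic_expansion": [0.35, 0.40, 0.30],
}

for action, probs in conditional_forecasts.items():
    group = sum(probs) / len(probs)
    print(f"P(target met | {action}) = {group:.2f}")
# The spread between the conditional group forecasts is the decision signal.
```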
The competition at hand has three expiry dates for predicted events: April, July and October. Each has prizes for forecasts made one month ahead, one week ahead and one day ahead. The July and October expiries also carry three-months-ahead prizes, and the October expiry a six-months-ahead prize.
The one-month ahead April expiry deadline is tomorrow, so don’t delay, register and put in your predictions.
The recent IAPA discussion panel on ‘Aligning IT and Analytics to deliver sustainable innovation’, plus a later conversation with fellow panellist, EMC-Greenplum’s James Horton, prompted me to sketch some thoughts on what an Analytics Lab ought to do. The lab is the natural home for Analysts engaged in the narrower definition of Analytics:
The Analytics Lab is an innovation factory which constantly evaluates data, quantitative methods and tools, looking for sources of competitive advantage.
- Data: structured and unstructured, sourced from both inside and outside the organisation, established and new.
- Methods: data transformation, and then data mining, machine learning, statistical, mathematical, and other analytical methods.
- Tools: as appropriate to method, from programming languages through to GUI applications, from commodity and open source through to commercial tools.
- Analysts: the lab enables the organisation to evaluate the technical abilities and innovative propensities of its analysts, as well as those on offer from external service providers, without many of the interfering factors present in operationally hardened IT environments.
Its outputs are:
- BI prototypes
- Instantiation candidates
- Identified data and knowledge gaps: Analysing data and generating insights brings to light new data needs and exposes gaps in knowledge which may impact the business. Additional data may need to be sourced, gathered through survey, collected by tweaking an existing business process, or purchased from a third party. Additional analyses and subject matter expertise may be required to close knowledge gaps.
- Resolved disharmonies: All businesses struggle with ‘different views of the truth’, and it’s often the crunching of data which brings these to light. Disharmonies might be within or between data sets, or between conventional wisdom and the drivers of a model. They could relate to anything from actual observations to tacit assumptions. Resolving such disharmonies—harmonisation—involves identifying, scoping, validating, and correcting them (a small sketch follows below).
These last two are not the core business of Analytics, but they’re important activities, and doing Analytics naturally leads to them. Most organisations don’t explicitly provision for them, but arguably they should. The lab is as good a home for them as any other.
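To make ‘disharmony’ concrete, here is a small, invented illustration of the identification step: the same quantity reported by two systems, checked for mismatches worth investigating. All names, figures and the tolerance are hypothetical.

```python
# Hypothetical example: monthly revenue per segment reported by two systems.
warehouse = {"retail": 1_200_000, "wholesale": 840_000, "online": 310_000}
crm       = {"retail": 1_200_000, "wholesale": 790_000, "online": 355_000}

TOLERANCE = 0.02  # flag relative differences beyond 2% for investigation

for segment in warehouse:
    w, c = warehouse[segment], crm[segment]
    rel_diff = abs(w - c) / max(w, c)
    if rel_diff > TOLERANCE:
        print(f"disharmony in {segment}: warehouse={w} crm={c} ({rel_diff:.1%})")
```

Scoping and validating the flagged mismatches, and deciding which source (if either) is right, is where the real harmonisation work begins.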
The Analytics Lab services all levels of business, but in different ways:
- Senior Management: through the provision of strategic insights.
- Middle Management and Knowledge Workers: through one-off and/or prototyped BI analyses.
- Frontline Workers: through the identification of instantiation candidates, i.e. deployable operational analytics.
Many analyses typically need to be tried before those which merit instantiation are discovered. Furthermore, “instantiation” doesn’t necessarily mean a repeatable process. It could simply mean the communication of a one-off insight, e.g. “revenue growth is unmistakeably slowing in all but one customer segment” or “the most reliable predictor of a customer’s propensity to churn is their social network membership.” Such insights are typically complex, valuable, but not “actionable” in any deterministic, automatable way.
Other findings are suited to more regularised delivery, for example as managerial decision support through business intelligence.
Some analytical results, in order to be fully leveraged, need to be integrated into frontline business processes. Models which predict customer acquisition or churn, for example, might require integration into sales, marketing, call centre, channel management and customer support processes.
Conduct disciplined, exploratory analyses which repeatedly cycle through the following sorts of questions (a code sketch of one such pass follows the list):
- Is there structure in the data (patterns, trends, relationships, networks, segments, clusters, indicators, drivers, outliers, anomalies)?
- Are there new insights in the data?
- Which models are viable?
- Which variables are important?
- Which variables do we control?
- What are the implications for revenue, cost, risk?
- What data do we want that we don’t have? How could we get it?
- What are the implications of this insight?
- Who is our internal customer for this insight?
- Would this analysis be valuable if provided on an ongoing basis? To whom?
- Into which existing or envisioned business processes should this insight be instantiated?
- Where are there disharmonies in tacit or explicit data and assumptions?
- Which projects, processes and decisions are affected by these disharmonies?
- How do we validate and resolve these disharmonies?
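To make this cycle concrete, here is a minimal sketch of one exploratory pass, using synthetic data, hypothetical variable names and scikit-learn as one toolset among many:

```python
# One pass through the exploratory cycle on synthetic data: is there
# structure, are there anomalies, and which variables are important?
import numpy as np
from sklearn.cluster import KMeans
from sklearn.ensemble import IsolationForest, RandomForestRegressor

rng = np.random.default_rng(0)
X = rng.normal(size=(500, 4))                          # four candidate drivers
y = 3 * X[:, 0] - 2 * X[:, 2] + rng.normal(size=500)   # a revenue-like response

# Structure: are there segments/clusters in the data?
segments = KMeans(n_clusters=3, n_init=10, random_state=0).fit_predict(X)
print("segment sizes:", np.bincount(segments))

# Anomalies: which observations don't fit the bulk of the data?
flags = IsolationForest(random_state=0).fit_predict(X)  # -1 marks outliers
print("outliers flagged:", int((flags == -1).sum()))

# Drivers: which variables matter for the response?
model = RandomForestRegressor(random_state=0).fit(X, y)
for name, imp in zip(["var_a", "var_b", "var_c", "var_d"], model.feature_importances_):
    print(f"{name}: importance {imp:.2f}")
```

In a real pass the interesting work starts after these outputs: deciding which findings are insights, who their internal customer is, and whether they merit instantiation.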
Infrastructure can usefully be separated into the ‘electronic infrastructure’ of hardware and software and the ‘human infrastructure’ of people, relationships, management and incentives.
- Secure, off-network ‘sandpit area’
- Big storage, big memory, scalable to big data
- Eclectic analytical toolset: commodity, open source, commercial, experimental, in-house
- Snapshots, copies, feeds of all manner of available data sources: pre-ETL, pre-warehouse, post-warehouse, external, web, social media, unstructured. In the context of the lab, the data warehouse is just another source system.
- De-emphasis on repeatable technical processes and compliance with production IT architecture
- Insulated from IT Service Level Agreements and other production / core system / business-as-usual constraints
- Human Resources:
  - Analysts: Data scientists
  - Management: Validate analysis objectives, ensure that analysts remain focused, performance manage the innovation process.
- Sponsorship from Executive
- Cross-functional relationships with business units: both ‘push’ (business unit as customer) and ‘pull’ (business unit as subject matter expert)
- Close relationship with Strategy function
- ‘Caveat utilitor’ relationship with IT for data provision and tool support
- Various relationships with service providers: vendors, consultants, training and mentoring providers, industry expertise, academia if appropriate
- Performance Management:
  - Innovation / Research metrics
  - Risk metrics
  - Sentiment metrics
- Dimensions of opportunity: Internal, Competitor, Market, Customer, Product, Channel
Related Analyst First posts:
- Aligning IT and Analytics to deliver sustainable innovation
- Needles, Haystacks, and Category Errors, or, Where Does Operational Analytics Fit?
- Systemising skepticism
- Assume bad data
- The Economics of Data – Analytics Is… Investing in Data
- Decision support versus decision automation
The New Yorker recently ran a fascinating profile of Ray Dalio, the founder of Bridgewater Associates, the world’s richest hedge fund. From an Analyst First point of view the piece offers a window into the human infrastructure of an arms race environment. Bridgewater is a culture committed to making and learning from its mistakes:
“Our greatest power is that we know that we don’t know and we are open to being wrong and learning.”
In his Principles, Dalio declares that acknowledging errors, studying them, and learning from them is the key to success. He writes, “Pain + Reflection = Progress.” Bridgewater puts this equation into action by organizing lengthy assessment sessions, in which employees must discuss their mistakes.
“What we’re trying to have is a place where there are no ego barriers, no emotional reactions to mistakes. . . . If we could eliminate all those reactions, we’d learn so much faster.”
Part of Bridgewater’s human infrastructure is a commitment to radical transparency. Some of its key items of electronic infrastructure are therefore video and tape recorders:
Like virtually all meetings at Bridgewater, this one was taped. Dalio says that the tapes—some audio, some video—provide an objective record of what has been said; they can be used for training purposes, and they allow Bridgewater’s employees to keep up with what is going on at the firm, including his discussions with senior colleagues. “They get to see all of my mistakes,” Dalio told me.
One rule of radical transparency is that Bridgewater employees refrain from saying behind a person’s back anything that they wouldn’t say to his face.
This means that management’s misgivings about a particular employee’s suitability for promotion are discussed openly with him, and recorded. (He doesn’t get the promotion.)
James Comey, the firm’s top lawyer… [took] a while to get used to dealing with Dalio. “When Ray sent me an e-mail saying, ‘I think what you said today doesn’t make sense,’ I tended to think, What does he really mean? Where’s he coming from? And what is my play? Who are my allies? All of the things you think about in the outside world. It took me three months to realize that when Ray says, ‘I think you are wrong,’ he really means ‘I think you are wrong.’ He’s not trying to provoke you, or anything else.”
“What is a typical organization?” [Dalio] asked me one day. “A typical organization is one where people are walking around saying, ‘This is stupid, this doesn’t make sense,’ behind each other’s backs.”
The article is also illuminating in its discussion of Bridgewater’s analysis and trading philosophies, which reflect its acceptance of uncertainty:
[T]he Pure Alpha fund typically has in place about thirty or forty different trades. “I’m always trying to figure out my probability of knowing,” Dalio said. “Given that I’m never sure, I don’t want to have any concentrated bets.” Such thinking runs counter to the conventional wisdom in the hedge-fund industry, which is that the only way to score big is to bet the house.
Many economists start at the top and work down. They look at aggregate statistics—inflation, unemployment, the money supply—and figure out what the numbers mean for particular industries, such as autos or tech. Dalio does things the other way around. In any market that interests him, he identifies the buyers and sellers, estimates how much they are likely to demand and supply, and then looks at whether his findings are already reflected in the market price. If not, there may be money to be made.
Bridgewater is more a qualitative than a quantitative trading fund. In this context, its decision support systems are interesting:
To guide its investments, Bridgewater has put together hundreds of “decision rules.” These are the financial analogue of Dalio’s Principles. He used to write them down and keep them in a ring binder. Today, they are encoded in Bridgewater’s computers. Some of these indicators are very general. One of them says that if inflation-adjusted interest rates decline in a given country, its currency is likely to decline. Others are more specific. One says that, over the long run, the price of gold approximates the total amount of money in circulation divided by the size of the gold stock. If the market price of gold moves a long way from this level, it may indicate a buying or selling opportunity.
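The gold rule quoted above is concrete enough to sketch in code. What follows is an illustration only, not Bridgewater’s implementation: every figure and threshold is invented.

```python
# Illustrative encoding of the quoted gold rule (every figure is invented;
# this is not Bridgewater's implementation).
money_in_circulation = 2.3e13  # hypothetical money stock, dollars
gold_stock_ounces    = 6.0e9   # hypothetical above-ground gold, ounces
market_price         = 1900.0  # hypothetical market price per ounce

fair_value = money_in_circulation / gold_stock_ounces  # the rule's long-run level
deviation = (market_price - fair_value) / fair_value

THRESHOLD = 0.25  # invented trigger level
if deviation < -THRESHOLD:
    signal = "possible buying opportunity"
elif deviation > THRESHOLD:
    signal = "possible selling opportunity"
else:
    signal = "no signal"

print(f"fair value ~{fair_value:,.0f}; deviation {deviation:+.0%}; {signal}")
```

As the next excerpt makes clear, at Bridgewater such a signal is an input to human deliberation, not an automatic trade.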
In any given market, Bridgewater may have a dozen or more different indicators. However, even when most or all of the indicators are pointing in a certain direction, Dalio doesn’t rely solely on software. Unless he and Jensen and Prince agree that a certain trade makes sense, the firm doesn’t make it. While this inevitably introduces an element of human judgment to the investment process, Dalio insists it is still driven by the rules-based framework he has built up over thirty years. “When I’m thinking, ‘What is going on today?,’ I also need to make the connection to ‘How does what is happening today fit into our framework for making this decision?’” he said. Ultimately, he says, it is the commitment to systematic analysis and systematic investment that distinguishes Bridgewater from other hedge funds.
In other words, Bridgewater runs on human judgement augmented by decision support, not decision automation. It recognises that decision support leads to higher value decisions but in practice makes decision making harder, not easier. As a recent Analyst First post argued, new information is not always “actionable”:
Comey was initially struck by how long it took Bridgewater to make decisions, because of the ceaseless internal debates. “I said, ‘Lordy, we have to put tops on bottoms. Let’s get something done,’ ” Comey recalled. But he added, laughing, “The mind control is working. I’ve come to believe that all the probing actually reduces inefficiencies over the long run, because it prevents bad decisions from being made.”
Dalio on ownership:
“I don’t want Bridgewater to go public or have it controlled by anybody outside the firm,” he said. “I think people who do that tend to mess up the firm.”
On the nature of competition in financial markets:
[Dalio] regards it as self-evident that all social systems obey nature’s laws, and that individual participants get rewarded or punished according to how far they operate in harmony with those laws. He views the financial markets as simply another social system, which determines payoffs and punishments in a like manner. “You have to be accurate,” he says. “Otherwise, you are going to pay. Alpha is zero sum. In order to earn more than the market return, you have to take money from somebody else.”
And finally, on the global economy:
Dalio believes that some heavily indebted countries, including the United States, will eventually opt for printing money as a way to deal with their debts, which will lead to a collapse in their currency and in their bond markets. “There hasn’t been a case in history where they haven’t eventually printed money and devalued their currency,” he said. Other developed countries, particularly those tied to the euro and thus to the European Central Bank, don’t have the option of printing money and are destined to undergo “classic depressions”.
In my experience working for software vendors, the answer to this has always been ‘yes and no’, but the one sure thing is that everyone uses Excel. Spreadsheets are the most pervasive and effective decision support tools: every organisation uses them, and it’s a safe bet that this will always be the case. No amount of data warehousing will ever provide decision makers with all the information they need; to the extent that it can, those decisions can be automated. Decisions invariably require new data, and that new data will be impossible to anticipate, or tacit, or both. Spreadsheets are unbeatable for ad hoc data analysis and for turning tacit data into explicit data.
Evelson poses his questions in the context of (presumably Forrester) research into BI Pricing, which he says is:
[S]howing a broad range of transparency (or non transparency) from BI vendors themselves. Some vendors welcomed our research RFI and are happily providing all the info we requested. Some are less transparent and are insisting that we only publish price ranges or comparative analysis (who’s more/less expensive) without showing their exact quotes. Yet, some others have declined to participate.
That doesn’t surprise me. Wide price ranges are both inevitable and understandable. Software businesses, particularly in growth markets like BI, concentrate more on increasing revenue than on managing to the bottom line. Costs just don’t matter as much. They’re also indirect – software being an information product. Part of the software sales process is working out what the prospect is willing to pay, which is basically what they’ll end up paying, which will vary from customer to customer.
That is the title of a highly recommended discussion being hosted this month at Cato Unbound:
- The editors’ introduction is here.
- Dan Gardner and Philip Tetlock’s lead essay, ‘Overcoming Our Aversion to Acknowledging Our Ignorance’, is here.
- Robin Hanson’s reaction essay, ‘Who Cares About Forecast Accuracy?’, is here.
- John Cochrane’s reaction essay, ‘In Defense of the Hedgehogs’, is here.
- Bruce Bueno de Mesquita’s reaction essay, ‘Fox-Hedging or Knowing: One Big Way to Know Many Things’, is here.
Every year, corporations and governments spend staggering amounts of money on forecasting and one might think they would be keenly interested in determining the worth of their purchases and ensuring they are the very best available. But most aren’t. They spend little or nothing analyzing the accuracy of forecasts and not much more on research to develop and compare forecasting methods. Some even persist in using forecasts that are manifestly unreliable, an attitude encountered by the future Nobel laureate Kenneth Arrow when he was a young statistician during the Second World War. When Arrow discovered that month-long weather forecasts used by the army were worthless, he warned his superiors against using them. He was rebuffed. “The Commanding General is well aware the forecasts are no good,” he was told. “However, he needs them for planning purposes.”
-Gardner and Tetlock
Even in business, champions need to assemble supporting political coalitions to create and sustain large projects. As such coalitions are not lightly disbanded, they are reluctant to allow last minute forecast changes to threaten project support. It is often more important to assemble crowds of supporting “yes-men” to signal sufficient support, than it is to get accurate feedback and updates on project success. Also, since project failures are often followed by a search for scapegoats, project managers are reluctant to allow the creation of records showing that respected sources seriously questioned their project.
-Hanson
Now “forecasting”, as Gardner and Tetlock characterize it, is an attempt to figure out which event really will happen, whether the coin will land on heads or tails, and then make a plan based on that knowledge. It’s a fool’s game.
Once we recognize that uncertainty will always remain, risk management rather than forecasting is much wiser…The good use of “forecasting” is to get a better handle on probabilities, so we focus our risk management resources on the most important events. But we must still pay attention to events, and buy insurance against them, based as much on the painfulness of the event as on its probability. (Note to economics techies: what matters is the risk-neutral probability, probability weighted by marginal utility.)
-Cochrane
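The ‘economics techies’ parenthetical can be unpacked in one line. On my gloss (my notation, not Cochrane’s): if $p_i$ is the raw probability of state $i$ and $u'(c_i)$ is the marginal utility of consumption in that state, then the weight that should drive risk management is

```latex
p_i^{\mathrm{RN}} = \frac{p_i \, u'(c_i)}{\sum_j p_j \, u'(c_j)}
```

Painful states, where consumption is low and marginal utility is high, are upweighted relative to their raw probability, which is exactly the point about insuring against events according to their painfulness as much as their probability.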
Good prediction—and this is my belief—comes from dependence on logic and evidence to draw inferences about the causal path from facts to outcomes. Unfortunately, government, business, and the media assume that expertise—knowing the history, culture, mores, and language of a place, for instance—is sufficient to anticipate the unfolding of events. Indeed, too often many of us dismiss approaches to prediction that require knowledge of statistical methods, mathematics, and systematic research design. We seem to prefer “wisdom” over science, even though the evidence shows that the application of the scientific method, with all of its demands, outperforms experts.
-Bueno de Mesquita
This post examines data as an economic resource and a source of value, and provides a new functional definition of Analytics on that basis.
Data is a fascinating resource, with a number of characteristics that distinguish it from other things that we think of as “resources”.
The first point is that data is not a commodity. The inherent value of data is that every piece tells you something new, and the pieces vary dramatically in their meaning, importance, value and reliability. While it is intuitively appealing to think of data in aggregate, and to consider its volume in tera-, peta- or zettabytes, it pays to note that the value in such data is not a function of volume in any reliable sense. This is at odds with commodities such as iron ore, gold or crude oil, which can be measured directly in dollars per kilogram.
The value in data, while very real, is not easy to quantify. In this way, data is more like human resources and less like gold or oil. It is inherently heterogeneous in nature, and its value is not proportional to its mass.
Nor is the value extraction process homogeneous. While the way you extract value from one lump of iron ore is no different from that applied to another, the same does not hold true for data. Further, the method of value extraction may well be unknown, and require further exploration. Even if a value extraction method has been identified, and indeed proven to work, there may well be additional value in the data. Finally, the value of data may well only prove itself in concert with other data. This is a synergistic, almost alchemical effect for which there is no good metaphor in the world of commodities.
Where the commodity resource analogy does hold is that data holds value locked within it, and effort must be applied to locate and extract it.
Analytics, particularly in its “data mining” incarnation, is best described as investment in the extraction of value from this curious, heterogeneous, synergistic resource. The tools, skills, techniques and processes of Analytics are all in the service of this investment enterprise.
The act of investing in data is best considered by way of analogy. Investing can be a simple, hands-off activity, based on a simple transaction – money is put in, and value is generated by an external process. This is how stock investment works. The agent investing in the stock merely puts up the money.
Now consider investing in one’s own education, or a health and fitness program. In these cases, considerable effort by the individual is required: they cannot outsource their intimate involvement in the process they are investing in.
Analytics is like the latter process for any business leader investing in data. If they are not intimately involved in the process, then the process is most likely a failure, and almost certainly a waste.
The “mining” analogy is thus used at one’s peril. So is an over-reliance on specific tools, processes or preconceptions about what is in the data. Regular mining is not a highly specialised form of labour, and much mining activity is rapidly automated.
Data mining is very different, and the mining analogy fails spectacularly if it envisages a repetitive, well-defined activity, where large machines extract value in a predictable, reliable way, supported by interchangeable people performing repetitive tasks. Indeed, following the mining analogy, real data miners are less like miners and more like prospectors and geologists, performing difficult, ever-changing and highly skilled work to detect value when and how it may arise.
Where they do identify repeatably extractable value (e.g. credit scoring models, churn models, fraud detection algorithms), these nuggets of (temporary) value can indeed be automated and production processes applied. But this is no longer Analytics; rather, it is the IT instantiation of Analytics results. The data miner has moved on to discover value somewhere else.
Thus data is a very unusual value store, with equally unusual, heterogeneous value extraction methods, and a reliance on adaptive, exploratory, highly skilled professionals to make them happen.
Data has another unusual economic property: it remains. As an economic asset, it can be preserved at negligible cost, while other assets – premises, machines, staff, funds and lines of credit – may diminish in tough economic times. The potential value of the data is hard to measure, as discussed above, but whatever it may be, it grows relative to other shrinking assets. Thus, in hard times, the relative value of data grows, and the business case for Analytics, or Investing in Data, grows.
Data itself may grow in volume, scope and complexity with relative ease and at low cost; the business case for Analytics grows with it.
Data may take many forms; the structured, electronic enterprise data that analysts know best is only one of many. Unstructured data is anywhere from 4 to 20 times more voluminous than the structured kind, and survey data may be collected to answer those questions that enterprise data fails to answer.
A Postscript – Tacit Data Mining
The definition of Analytics above construes ‘data’ very broadly.
The most interesting, readily available, strategically relevant and poorly understood form of data is tacit data: the information contained in the brains of staff, board, shareholders and anyone else who would see the organisation do well. Tacit data is seen as the most difficult to identify, extract, value and leverage.
The definition of Analytics presented above includes investment in the extraction of value from all data, including tacit. How is tacit data mined? The most effective and powerful way is by use of collective intelligence and forecasting techniques, such as prediction markets. This is a topic for another day.
About us

Analyst First is a new approach to analytics, in which tools take a far less important place than the people who perform, manage, request and envision analytics, and in which analytics is seen as a non-repetitive, exploratory and creative process whose outcome is not known at the start, and where only a fraction of efforts are expected to result in success. This is in contrast with the common perception of analytics as IT and process.