For those in Sydney or able to get there in early July, I will be presenting on the results, findings and workings of the AIPIO Collective Forecasting Competition at the Sydney Users of R Forum on Tuesday, July 10.
The AIPIO is, of course, the Australian Institute of Professional Intelligence Officers. They have hosted a number of A1-related presentations in the past, and are a natural friend of A1, given our principle of The Intelligence Model of Analytics.
Collective forecasting and related methods such as prediction markets, working as they do on the aggregated tacit data of human beings, are truly “Analyst First” analytics techniques, with human beings adopting the traditional roles of algorithms and electronic data. Collective forecasting is also the only tool consistently appropriate for the most important decisions made in businesses. These tend to be one-off, data-poor and often based largely on human-held tacit knowledge.
Big claims? Come along and argue if you happen to be around.
The AIPIO Collective Forecasting Competition also begins its new round.
All are invited to participate by registering and entering predictions, or by suggesting additional events to forecast.
The months of July and August will involve a number of other Analyst First related events, including:
- The long-anticipated launch of regular public Analyst First events in Canberra, with thanks to BAE Systems for providing the venue, as they kindly do for monthly A1 ACT chapter Leadership Group meetings.
- A presentation of the Analyst First vision in Wellington, New Zealand, at the annual conference of the New Zealand Institute of Intelligence Professionals (NZIIP).
- Regular Analyst First Leadership Group meetings in Melbourne, Canberra and Sydney. Those interested in being involved in chapter leadership groups should contact me or the other Chapter heads: Graham Williams in Canberra, Yuval Marom and Tony Laing in Melbourne, and Kevin Gray in Tokyo.
Also forthcoming are a number of expansions of the website and the creation of new content, including a charter of A1 principles and a list of subscribing A1 practitioners. A number of other initiatives are also in the works, thanks to good work by the Canberra Chapter Leadership Group.
A1 is a proud supporter of the AIPIO Collective Forecasting Competition, hosted on Presciient’s new collective forecasting platform System II.
A beginner’s guide may be found at the top of the page.
Collective forecasting and related methods such as prediction markets represent the area of analytics that we call Tacit Data Mining, and allow the extraction, deployment and analysis of the most vital data in the organisation, which lives in people’s heads. They also provide the ultimate data fusion platform, fusing all available data through human filters to provide powerful strategic decision support.
Collective forecasting allows accurate forecasting of future events, and can also condition those events on possible actions, thus providing powerful decision support. It identifies the consistently most effective forecasters, acting as a filter for the most insightful and prescient members of staff or the public.
It has application in any strategic decision support domain.
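Collective forecasting platforms differ in how they elicit and combine estimates, but the core aggregation step is simple. Below is a minimal, hypothetical sketch in Python (not the System II mechanism; the estimates and event are illustrative), assuming each participant submits a probability for a single yes/no event:

```python
import math

def logit(p):
    return math.log(p / (1 - p))

def inverse_logit(x):
    return 1 / (1 + math.exp(-x))

def pool_forecasts(probabilities):
    """Combine individual probability estimates into a single collective forecast.

    Returns both a simple average and a log-odds (logit) average; the latter
    tends to sit further from 0.5 when all forecasters lean the same way.
    """
    mean = sum(probabilities) / len(probabilities)
    logit_mean = inverse_logit(sum(logit(p) for p in probabilities) / len(probabilities))
    return mean, logit_mean

# Illustrative estimates for one event, e.g. "policy X is announced by 31 October".
estimates = [0.60, 0.72, 0.55, 0.80, 0.65]
simple, log_odds = pool_forecasts(estimates)
print(f"Simple average:   {simple:.2f}")
print(f"Log-odds average: {log_odds:.2f}")
```

Averaging in log-odds space rather than directly in probabilities is one common design choice; either way, the pooled figure is only as good as the participants’ knowledge and the incentives they face.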
The competition at hand has three expiry dates for predicted events (April, July and October), each with prizes for one-month-ahead, one-week-ahead and one-day-ahead predictions. The July and October expiries also carry three-month-ahead prizes, and the October expiry a six-month-ahead prize.
The one-month-ahead April expiry deadline is tomorrow, so don’t delay: register and put in your predictions.
Week 3, Day 2 of the CORTEX MBAnalytics program includes TDWI’s best practices report, ‘Strategies for Managing Spreadmarts – Migrating to a Managed BI Environment’, by Wayne W. Eckerson and Richard P. Sherman, from 2008 and based in part on a survey conducted in 2007. It’s an excellent document—one of the best articulations of a problem with which Business Analytics practitioners and interested parties ought to be familiar: the nature and causes of ‘spreadmarts’, their strengths, weaknesses and limitations, what to do about them, and the risks involved. Given how well the report covers these topics, I commend it in full. But I’m also going to address where it falls short. It’s right in its reasoning and its conclusions, but potentially misleading in its emphasis and what it omits.
By way of definition:
A spreadmart is a reporting or analysis system running on a desktop database (e.g., spreadsheet, Access database, or dashboard) that is created and maintained by an individual or group that performs all the tasks normally done by a data mart or data warehouse, such as extracting, transforming, and formatting data as well as defining metrics, submitting queries, and formatting and publishing reports to others. Also known as data shadow systems, human data warehouses, or IT shadow systems.
Finance generates the most spreadmarts by a wide margin, followed by marketing, operations, and sales… Finance departments are particularly vulnerable to spreadmarts because they must create complex financial reports for internal and external reporting as well as develop detailed financial plans, budgets, and forecasts on an ad hoc basis. As a result, they are savvy users of spreadsheets, which excel at this kind of analysis.
Spreadmarts are categorised into three types:
- One-off reports. Business people use spreadsheets to filter and transform data, create graphs, and format them into reports that they present to their management, customers, suppliers, or partners. With this type of spreadmart, people are using data they already have and the power of Excel to present it. There’s no business justification—or even time—for IT to get involved.
- Ad hoc analysis. Business analysts create spreadmarts to perform exploratory, ad hoc analysis for which they don’t have a standard report. For instance, they may want to explore how new business conditions might affect product sales or perform what-if scenarios for potential business changes. They use the spreadmart to probe around, not even sure what they’re looking for, and they often bring in supplemental data that may not be available in the data warehouse. This exploration can also be time-sensitive and urgent.
- Business systems. Most spreadmarts start out as one-off reports or ad hoc analysis, then morph into full-fledged business systems to support ongoing processes like budgeting, planning, and forecasting. It’s usually not the goal to create such a system, but after a power user creates the first one, she’s asked by the business to keep producing the report until, eventually, it becomes an application itself. This type of spreadmart is called a “data shadow system.”
Of these, it’s ‘business systems’ which are the report’s focus. The ‘one-off report’ and ‘ad hoc analysis’ categories of spreadmart, the report fittingly concludes, are inappropriate for systemisation.
But defining spreadmarts in terms of them being ‘shadow systems performing all the tasks normally done by a data mart or data warehouse’ gets things somewhat backwards. Presupposing that all spreadmarts are appropriate for systemisation—in effect viewing them as ‘data marts in waiting’—is misleading when their one-off and ad hoc uses are recognised and their wide proliferation and coverage taken into consideration. One might more accurately define data marts and data warehouses as ‘scaled-up systems which perform some of the tasks normally done by a spreadmart’.
The report’s framing of spreadmarts is a product of its BI/DW view of the world. In that worldview, data lives in source systems and needs to be ETL-ed into a relational repository in order to be published en masse to business users whose analysis requirements consist of read-only slice and dice. These needs do of course exist, but analysis entails much more. This narrower BI/DW view is reflected in the TDWI survey’s design. For example, respondents are asked to rank the top five reasons why spreadmarts exist in their group. “Quick fix to integrating data” ranks second overall, “Inability of the BI team to move quickly” third, “This is the way it’s always been done” fifth, and “Desire to protect one’s turf” seventh. These options are hardly worded neutrally. They’re phrased so as to norm systemisation and to cast ad hoc and one-off uses of spreadmarts as the products of sloppiness, frustration, ignorance, and narrow self-interest. The section on the benefits of spreadsheets is similarly biased in its framing. “Ideal for one-time analysis” and “Good for prototyping a permanent system” are listed, but not ‘ideal for exploring data, creating scenarios, capturing assumptions, and enriching existing data’. Interpretation also suffers. For example:
Ironically, organizations with a “low” degree of [BI] standardization have the lowest median number of spreadmarts (17.5), and only 31% haven’t counted them. The proper conclusion here is that standardization increases awareness of spreadmarts.
It may be a convenient conclusion, but it certainly isn’t the only one. Perhaps the 70 to 80 percent failure rate of BI standardisation projects is driving business users back to spreadsheets. Excel integration with BI platforms is also filtered through a read-only lens:
BI vendors are starting to offer more robust integration between their platforms and Microsoft Office tools. Today, the best integration occurs between Excel and OLAP databases, where users get all the benefits of Excel without compromising data integrity or consistency, since data and logic are stored centrally.
This tends to be how BI vendors understand Excel integration. They recognise that users enjoy Excel as a query and reporting interface, but they understate its importance as a data and logic creation tool. The BI/DW worldview understands all data as the product of business processes which write it to ‘source systems’. (The one exception to this is the sub-discipline of enterprise budgeting and planning, which is in effect BI with people as the source systems.) What is underappreciated is that analysis itself is a business process—one which can’t help but create data.
These biases are understandable. The TDWI report is sponsored by a collection of BI software vendors, and 60% of those surveyed are IT professionals. What’s missing, then, is a more nuanced understanding of what analysis entails. Such an understanding needs to recognise:
- The centrality of data transformation and enrichment by individual analysts
- The value of tacit data
- The importance of presentation
Simply put, analysis is a read-write activity. Routine analytical tasks I find myself doing, for example, include:
- Entering hitherto tacit data
- Codifying business knowledge
- Finding and synthesising data from outside sources
- Creating dummy and randomised data
- Capturing novel assumptions
- Imposing new categories on existing categorical data
- Enriching existing data by deriving or devising on-the-fly metadata
- Building scenarios and constructing counterfactuals
- Drafting and adding commentary, interpretation, and notes
- Formalising and detailing new questions and follow-on analyses
All of these activities involve me creating new data, and I would submit that neither I nor any BI/DW requirement gathering cycle would be able to anticipate that data ahead of time. These are creative, reflective, results-contingent activities. As the report puts it:
[M]ost BI vendors have recognized that a large portion of customers are using their BI tools simply to extract data from corporate servers and dump them into Excel, Word, or PowerPoint, where they perform their “real work.”
If we additionally overlay the question of data value, one of Analyst First’s key contentions is that the most valuable information in organisations lives in people’s heads. It’s tacit, and spreadsheets are one of the best tools for eliciting it and making it explicit:
Spreadsheets are the most pervasive and effective decision support tools. No organisation doesn’t use them, and it’s a safe bet that this will always be the case. No amount of data warehousing will ever be able to provide decision makers with all the information they need. To the extent that it can, those decisions can be automated. Decisions invariably require new data. That new data will be either unanticipatable, or tacit, or both. Spreadsheets are unbeatable for ad hoc data analysis and turning tacit data into explicit data.
But spreadsheets aren’t the only tools available for tacit data mining. Nor, for some types of data, are they the best tools. As the post ‘The Economics of Data – Analytics is… Investing in Data’ argued:
The most interesting, readily available, strategically relevant and poorly understood form of data is tacit data: the information contained in the brains of staff, board, shareholders and anyone else who would see the organisation do well… How is tacit data mined? The most effective and powerful way is by use of collective intelligence and forecasting techniques, such as prediction markets.
Finally—analysis process, ad hoc, and one-off needs aside—decision support systems need to be more than portals for publishing structured data as tables, charts, and indicators. As the TDWI survey picks up:
While Excel is the most popular tool for building spreadmarts, business analysts also use Microsoft Access, PowerPoint, and SAS statistical tools.
SAS and PowerPoint are telling inclusions: SAS contains a great deal of statistical and modelling functionality that BI stacks don’t (or certainly didn’t in 2007); PowerPoint is able to flexibly integrate unstructured commentary with the more structured outputs of BI platforms—as is Word. The TDWI report itself is an example of this: most of it is unstructured text, then there are graphics and other design elements, and finally the charts and tables. Very few high value analyses don’t contain narrative, diagrams, and other unstructured presentation elements.
The TDWI report does in fact acknowledge all of the above:
[T]here is often no acceptable alternative to spreadmarts. For example, the data that people need to do their jobs might not exist in a data warehouse or data mart, so individuals need to source, enter, and combine the data themselves to get the information. The organization’s BI tools may not support the types of complex analysis, forecasting, or modeling that business analysts need to perform, or they may not display data in the format that executives desire.
The report’s biases are in its view of analysis as fundamentally amenable to structure, and of spreadmarts (as one of its enablers) as precursors of systems. Scalability, repeatability and automation certainly have their place, but a more realistic view would recognise that the analytical activities which a data warehouse can viably support are a subset. This has important implications for BI/DW practices now that the Business Analytics domain incorporates them alongside advanced analytics and the emerging field of big data.
The idea is that there are a number of subjects, such as statistics, accounting, and economics, that lawyers cannot expect to be competent in but should be familiar with. We spend a week or two on each.
Other methods covered in the course include decision theory, game theory, and ‘back of the envelope’ calculation. In essence, the classes (which are available for download here as recordings and whiteboard snapshots) provide ‘literacy primers’ for business professionals on each of these forms of reasoning. Analyst First maintains that analytics—over and above being a discipline, a set of techniques, and a profession (all of which it is)—is a literacy. As such, does analytical literacy draw on, live in parallel with, or subsume these?
All of the above. Business Analytics at a minimum fuses statistics (probabilistic reasoning) with accounting (the language of business). Some analytical techniques additionally integrate game theory (e.g. agent-based modelling). Others such as prediction markets bring in price theory from economics. All involve the scientific method, and therefore require empirical literacy.
David Friedman is a brilliant communicator and his lecture audio files are highly recommended. There is something new for everyone in the Analytic Methods course.
Enough of Cato Unbound’s What’s Wrong With Expert Predictions debate has now unfolded that it makes sense for me to offer some commentary. The discussion encompasses many predictive and decision-making subject areas and institutions – politics, economics, business, media punditry – but for the purposes of Analyst First I’m primarily interested in prediction in the context of organisations.
All the discussants agree that expert predictive track records are terrible, but they diverge in the degree to which they see this as problematic and in their recommendations as to what to do about it. The debate so far:
In their Lead Essay, Dan Gardner and Philip Tetlock present a puzzle:
Every year, corporations and governments spend staggering amounts of money on forecasting and one might think they would be keenly interested in determining the worth of their purchases and ensuring they are the very best available. But most aren’t. They spend little or nothing analyzing the accuracy of forecasts and not much more on research to develop and compare forecasting methods.
They go on to provide an overview of Tetlock’s longitudinal study of experts, encompassing 28,000 predictions over a fifteen year period, which found that eclectic foxes outperform dogmatic hedgehogs, but that both are outperformed by extrapolation algorithms. They argue that we need to get better at accepting our limitations and to “give greater consideration to living with failure, uncertainty, and surprise”. They accordingly call for “decentralized decision-making and a proliferation of small-scale experimentation”.
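The ‘extrapolation algorithms’ used as benchmarks in studies of this kind are typically very simple. As a minimal, hypothetical sketch (the series, the expert’s call and the figures are illustrative, not Tetlock’s actual procedure), a no-change forecast and a linear-trend forecast can be produced and scored in a few lines of Python:

```python
# Illustrative series only, e.g. an annual indicator an expert is asked to predict.
history = [4.1, 4.3, 4.0, 4.6, 4.8, 5.1]
actual_next = 5.0

# No-change benchmark: next year looks like this year.
no_change = history[-1]

# Linear-trend benchmark: extend the average step between observations.
steps = [b - a for a, b in zip(history, history[1:])]
linear_trend = history[-1] + sum(steps) / len(steps)

expert_forecast = 5.8  # a hypothetical 'bold' expert call

for name, forecast in [("no change", no_change),
                       ("linear trend", linear_trend),
                       ("expert", expert_forecast)]:
    print(f"{name:12s} forecast {forecast:.2f}, absolute error {abs(actual_next - forecast):.2f}")
```

The point is not that any one benchmark is clever, but that a benchmark this crude is the bar the experts in the study failed to clear.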
In the Reaction Essay section, Robin Hanson addresses the puzzle of why forecasting remains so immune to accountability via – presumably easy to assemble – track records: “[s]urprising disinterest [he means uninterest] in forecasting accuracy could be explained either by its costs being higher, or its benefits being lower, than we expect.” His conclusion is that, even in profit and loss settings such as organisations, the signalling value of forecasting must compete with its information value:
Even in business, champions need to assemble supporting political coalitions to create and sustain large projects. As such coalitions are not lightly disbanded, they are reluctant to allow last minute forecast changes to threaten project support. It is often more important to assemble crowds of supporting “yes-men” to signal sufficient support, than it is to get accurate feedback and updates on project success. Also, since project failures are often followed by a search for scapegoats, project managers are reluctant to allow the creation of records showing that respected sources seriously questioned their project.
He points out that, while prediction markets are best able to incentivise information holders to provide accurate forecasts, institutional respect for accuracy is a necessary and thus far absent precondition to their widespread uptake.
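The incentive mechanics Hanson refers to can be made concrete with his logarithmic market scoring rule (LMSR), the automated market maker behind many prediction markets. The sketch below is a minimal, hypothetical illustration (the liquidity parameter, trade size and event are mine, and this is not any particular platform’s implementation): a trader who believes the current price understates the probability of an event profits, in expectation, by buying and thereby moving the price toward their belief.

```python
import math

B = 100.0  # liquidity parameter: larger = prices move less per share traded

def cost(q_yes, q_no):
    """LMSR cost function; a trader pays cost(after) - cost(before)."""
    return B * math.log(math.exp(q_yes / B) + math.exp(q_no / B))

def price_yes(q_yes, q_no):
    """Current market probability of the YES outcome."""
    e_yes, e_no = math.exp(q_yes / B), math.exp(q_no / B)
    return e_yes / (e_yes + e_no)

# Market opens at 50/50; a trader who believes YES is underpriced buys 40 YES shares.
q_yes, q_no = 0.0, 0.0
print(f"price before trade: {price_yes(q_yes, q_no):.2f}")

shares = 40.0
paid = cost(q_yes + shares, q_no) - cost(q_yes, q_no)
q_yes += shares
print(f"price after trade:  {price_yes(q_yes, q_no):.2f}")
print(f"trader paid {paid:.2f}; receives {shares:.0f} if YES occurs, 0 otherwise")
```

In this toy example the trader pays roughly 22 for 40 shares paying 1 each if YES occurs, so the trade is profitable in expectation whenever their true belief exceeds the average price paid (about 0.55), and the market price moves toward that belief. Accurate information is thus rewarded directly, which is the incentive property Hanson highlights.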
John H. Cochrane turns the tables by arguing that unforecastability is a good sign as seen through the lens of economics:
In fact, many economic events should be unforecastable, and their unforecastability is a sign that the markets and our theories about them are working well.
This statement is clearest in the case of financial markets. If anyone could tell you with any sort of certainty that “the market will go up tomorrow,” you could use that information to buy today and make a fortune. So could everyone else. As we all try to buy, the market would go up today, right to the point that nobody can tell whether tomorrow’s value will be higher or lower.
An “efficient” market should be unpredictable. If markets went steadily up and delivered return without risk, then markets would not be working as they should.
Forecasting, in the sense of accurately trying to predict the future, is a “fool’s game”. But it does work as an input into risk management:
The good use of “forecasting” is to get a better handle on probabilities, so we focus our risk management resources on the most important events. But we must still pay attention to events, and buy insurance against them, based as much on the painfulness of the event as on its probability. (Note to economics techies: what matters is the risk-neutral probability, probability weighted by marginal utility.)
So it’s not really the forecast that’s wrong, it’s what people do with it. If we all understood the essential unpredictability of the world, especially of rare and very costly events, if we got rid of the habit of mind that asks for a forecast and then makes “plans” as if that were the only state of the world that could occur; if we instead focused on laying out all the bad things that could happen and made sure we had insurance or contingency plans, both personal and public policies might be a lot better.
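Cochrane’s parenthetical note can be written out explicitly. In standard notation (mine, not his), the risk-neutral probability of a state is its objective probability reweighted by marginal utility:

```latex
% Risk-neutral probability of state i: objective probability weighted by marginal utility
\tilde{p}_i \;=\; \frac{p_i \, u'(c_i)}{\sum_j p_j \, u'(c_j)}
```

where p_i is the objective probability of state i, c_i is consumption or wealth in that state, and u′ is marginal utility. Painful states (low c_i, hence high u′(c_i)) are weighted up, which is why insurance and contingency decisions should lean on the painfulness of an event as much as on its raw probability.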
Cochrane defends a hedgehog-like reversion to principles – basic economic principles like supply and demand – in order to build effective conditional forecasts which inform plans and provide decision support.
Bruce Bueno de Mesquita argues that expert prediction is, properly contextualised, a sideshow. Statistical methods are widely used, so much so that we’ve ceased to notice (e.g. in insurance pricing and political polling). Game theory is better still, and continues to make incremental progress:
Are these methods perfect or omniscient? Certainly not! Are the marginal returns to knowledge over naïve methods (expert opinion; predicting that tomorrow will be just like today) substantial? I believe the evidence warrants an enthusiastic “Yes!” Nevertheless, despite the numerous successes in designing predictive methods, we appropriately focus on failures. After all, by studying failure methodically we are likely to make progress in eliminating some errors in the future.
So why do we continue to focus on the poorly performing experts? De Mesquita’s view is that:
Unfortunately, government, business, and the media assume that expertise—knowing the history, culture, mores, and language of a place, for instance—is sufficient to anticipate the unfolding of events. Indeed, too often many of us dismiss approaches to prediction that require knowledge of statistical methods, mathematics, and systematic research design. We seem to prefer “wisdom” over science, even though the evidence shows that the application of the scientific method, with all of its demands, outperforms experts.
De Mesquita goes on to explain and advocate his own game theoretic (Expected Utility Model) approach:
Acting like a fox, I gather information from a wide variety of experts. They are asked only for specific current information (Who wants to influence a decision? What outcome do they currently advocate? How focused are they on the issue compared to other questions on their plate? How flexible are they about getting the outcome they advocate? And how much clout could they exert?). They are not asked to make judgments about what will happen. Then, acting as a hedgehog, I use that information as data with which to seed a dynamic applied game theory model. The model’s logic then produces not only specific predictions about the issues in question, but also a probability distribution around the predictions. The predictions are detailed and nuanced. They address not only what outcome is likely to arise, but also how each “player” will act, how they are likely to relate to other players over time, what they believe about each other, and much more.
In the Conversation section, Robin Hanson challenges Cochrane and de Mesquita to produce conditional forecasts and submit them to systematic public measurement and verification. He is doubtful, however, that they will assent:
The sad fact is that the many research patrons eager to fund hedgehoggy research by folks like Cochrane and De Mesquita show little interest in funding forecasting competitions at the scale required to get public participation by such prestigious folks.
Forecasting, he contends, is a domain in which the rewards to affiliation with prominent expertise trump accuracy.
Bruce Bueno de Mesquita replies that the acceptance of his methods in journals, via peer review, is evidence of their having been sufficiently scrutinised; furthermore, that no one has been willing to publicly compete with him; additionally, that he has successfully beaten alternative approaches; and finally, that he has made his methods available online.
Robin Hanson responds that more comprehensive standards of proof are required to settle the matter.
Gardner and Tetlock then provide an insightful running summary. In response to Hanson they speculate that admitting to poor forecasting performance would be too costly for those currently enjoying public and organisational reputations that their performance does not justify:
Open prediction contests will reveal how hard it is [for them] to outperform their junior assistants and secretaries. Insofar as technologies such as prediction markets make it easier to figure out who has better or worse performance over long stretches, prediction markets create exactly the sort of transparency that destabilizes status hierarchies… If these hypotheses are correct the prognosis for prediction markets—and transparent competitions of relative forecasting performance—is grim. Epistemic elites are smart enough to recognize a serious threat to their dominance.
In response to Cochrane they speak up for the value of hedgehogs – more compelling, more visionary, better at envisioning extreme events – but note that the cost of this is that they are more wrong, more often.
They close by welcoming de Mesquita’s willingness to be publicly scrutinised, note that the jury is still out in terms of systematic and decomposed measurement of his methods, and caution that:
For many categories of forecasting problems, we are likely to bump into the optimal forecasting frontier quite quickly. There is an irreducible indeterminacy to history and no amount of ingenuity will allow us to predict beyond a certain point.
De Mesquita responds that he welcomes being assessed.
Although Cochrane comes close, none of the discussants explicitly recognises and makes central the difference between forecasting and other activities which organisations call forecasting (i.e. planning and goal setting). I explained this distinction in a previous post, namely:
- Forecasting means objectively estimating the most likely future outcome: “what’s going to happen?”
- Goal setting means putting a target in place, generally for motivational purposes: “what would we like to happen?”
- Planning means establishing an intended course of action, usually to direct the allocation of resources: “what are we going to do?”
This distinction is key because, while all three activities are based on prediction, only in the case of forecasting is predictive accuracy the primary purpose. Organisations can improve all of these, but to do so they need to address three tiers of potential failure:
All the Cato discussants take it as read that, in assessing predictions, they’re operating in an empirical paradigm. In organisations, however, this can’t be taken for granted. Many organisations place prediction either in the wrong paradigm, or no paradigm at all. It’s common for predictive activities and processes to be ritualised and adhered to, but without any systematic error measurement or validation. Gardner and Tetlock acknowledge the “widespread lack of curiosity—lack of interest in thinking about how we think about possible futures” as “a phenomenon worthy of investigation in its own right,” pointing out the wastefulness of remaining ignorant given the resources involved.
Systematic error measurement and validation can’t happen without the right categories being first recognised and agreed upon. Disambiguating forecasting from goal setting from planning is critical. Organisations don’t do this well. Loose language doesn’t help. The same Finance department will update a budget (a plan) and call it a “forecast”, oversee the revision of sales “forecasts” (goals), and publish revenue estimates for the scrutiny of stock market analysts (true forecasts). As an earlier Analyst First post pointed out, these activities, while all reliant on objective estimation, do not share the same benchmarks when it comes to assessing error and value. Forecast error makes sense for forecasting; execution error makes more sense for goal setting and planning.
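The distinction can be made concrete. Below is a minimal, hypothetical sketch (the function names and figures are illustrative, not drawn from any earlier post): forecast error compares the objective estimate with what actually happened, while execution error compares what actually happened with what was committed to.

```python
def forecast_error(forecast, actual):
    """Forecast error: how far the objective estimate was from what happened."""
    return abs(actual - forecast) / abs(actual)

def execution_error(target, actual):
    """Execution error: how far what happened was from what was committed to."""
    return abs(actual - target) / abs(target)

# Illustrative figures only: revenue in $m.
forecast, target, actual = 102.0, 110.0, 100.0

print(f"Forecast error:  {forecast_error(forecast, actual):.1%}")   # judge the forecaster on this
print(f"Execution error: {execution_error(target, actual):.1%}")    # judge the plan or goal on this
```

Scoring a budget or a target with forecast error, or a true forecast with execution error, conflates exactly the categories distinguished above.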
The Cato discussants all tacitly acknowledge these distinctions, but none recognises its implications when it comes to understanding the way organisations do prediction.
Tetlock’s experiment required that pundits’ anonymity be protected. Participants knew to distance themselves from their projections when they were accountable for accuracy. The implication here is either that pundits are dishonest, or that they recognise that their projections serve a purpose other than informing people about the likelihood of future events. Gardner and Tetlock, and Hanson, acknowledge that punditry is a form of entertainment, has signalling value, and by virtue of this trades off accuracy for clarity and narrative value. As Hanson puts it:
Media consumers can be educated and entertained by clever, witty, but accessible commentary, and can coordinate to signal that they are smart and well-read by quoting and discussing the words of the same few focal pundits. Also, impressive pundits with prestigious credentials and clear “philosophical” positions can let readers and viewers gain by affiliation with such impressiveness, credentials, and positions. Being easier to understand and classify helps “hedgehogs” to serve many of these functions.
Hanson recognises that affiliation with sophistication has signalling value within organisations too. He notes the multiple roles played by managers, including the requirement that they appear impressive enough to attract affiliation and inspire their subordinates:
[C]onsider next the many functions and roles of managers, both public and private. By being personally impressive, and by being identified with attractive philosophical positions, leaders can inspire people to work for and affiliate with their organizations. Such support can be threatened by clear tracking of leader forecasts, if that questions leader impressiveness.
He goes on to describe the motivational impact of managerial ‘overconfidence’:
Often, managers can increase project effort by getting participants to see an intermediate chance of the project making important deadlines—the project is both likely to succeed, and to fail. Accurate estimates of the chances of making deadlines can undermine this impression management. Similarly, overconfident managers who promise more than they can deliver are often preferred, as they push teams harder when they fall behind and deliver more overall.
Incentivising workers to “deliver more overall” is precisely the purpose of goal setting. Consistently producing overshooting projections in this context isn’t necessarily “forecast hypocrisy,” as Hanson characterises it. It may be effective stretch targeting.
Many of the discussants also acknowledge that planning is a different activity from forecasting (and goal setting), but don’t pursue the full implications of this in terms of error and value measurement. The Kenneth Arrow anecdote relayed by Gardner and Tetlock, for example, illustrates that plans are reliant on, but different from, forecasts:
Some [corporations and governments] even persist in using forecasts that are manifestly unreliable, an attitude encountered by the future Nobel laureate Kenneth Arrow when he was a young statistician during the Second World War. When Arrow discovered that month-long weather forecasts used by the army were worthless, he warned his superiors against using them. He was rebuffed. “The Commanding General is well aware the forecasts are no good,” he was told. “However, he needs them for planning purposes.”
Gardner and Tetlock also look at the role in preparedness planning of prediction that is aware of its own limitations, comparing the effectiveness of the recent New Zealand and Haiti earthquake responses:
Designing for resiliency is essential, as New Zealanders discovered in February when a major earthquake struck Christchurch. 181 people were killed. When a somewhat larger earthquake struck Haiti in 2010, it killed hundreds of thousands. The difference? New Zealand’s infrastructure was designed and constructed to withstand an earthquake, whenever it might come. Haiti’s wasn’t.
Cochrane seconds this, adding that predictions have scenario generation utility regardless of their accuracy:
Once we recognize that uncertainty will always remain, risk management rather than forecasting is much wiser. Just the step of naming the events that could happen is useful.
In these and other ways, the discussants acknowledge that accuracy isn’t the only purpose of prediction. It should therefore follow that forecast error might not be the only relevant measure.
Much of the discussion contrasts different predictive tools, techniques and approaches: expert judgement, statistical algorithms, prediction markets, game theory. Methodologies and expectations both need to be appropriately calibrated: simple statistical extrapolation works well in some settings, but in complex systems environments the best we can hope for may be a better feel for the probabilities involved.
There are a range of insights here for organisations. Individual human judgement on its own, it is unanimously acknowledged, performs poorly. Statistical algorithms consistently beat the experts. There is general agreement among the discussants that eclecticism is desirable. The clear implication is that organisations should adopt collective intelligence methods.
Tetlock’s wider work on expert political judgement has implications for optimal forecasting team composition (use hedgehogs to generate possibilities and foxes to synthesise and calibrate probabilities). Gardner and Tetlock also call for what we term Decision Performance Management:
Imagine a system for recording and judging forecasts. Imagine running tallies of forecasters’ accuracy rates. Imagine advocates on either side of a policy debate specifying in advance precisely what outcomes their desired approach is expected to produce, the evidence that will settle whether it has done so, and the conditions under which participants would agree to say “I was wrong.” Imagine pundits being held to account.
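The record-and-tally part of such a system is technically trivial. As a minimal, hypothetical sketch (forecasters, events and probabilities are invented), a running accuracy tally needs only a ledger of resolved probabilistic forecasts and a proper scoring rule such as the Brier score:

```python
from collections import defaultdict

# Illustrative ledger only: (forecaster, event, predicted probability, outcome 0/1).
ledger = [
    ("alice", "rate_rise_q3",    0.70, 1),
    ("bob",   "rate_rise_q3",    0.40, 1),
    ("alice", "merger_approved", 0.20, 0),
    ("bob",   "merger_approved", 0.55, 0),
]

def brier(p, outcome):
    """Brier score for a single binary forecast: 0 is perfect, 1 is worst."""
    return (p - outcome) ** 2

tallies = defaultdict(list)
for forecaster, _event, p, outcome in ledger:
    tallies[forecaster].append(brier(p, outcome))

# Running accuracy tally: lower mean Brier score = better track record.
for forecaster, scores in sorted(tallies.items(), key=lambda kv: sum(kv[1]) / len(kv[1])):
    print(f"{forecaster}: mean Brier score {sum(scores) / len(scores):.3f} over {len(scores)} forecasts")
```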
It’s also worth imagining what sort of environment supports this, as Hanson does in his discussion of a different “social equilibrium”:
A track record tech must be combined with a social equilibrium that punishes those with poor records, and thus encourages rivals and victims to collect and report records. The lesson I take for forecast accuracy is that it isn’t enough to devise ways to record forecast accuracy—we also need a new matching social respect for such records.
He’s right. New ways to record accuracy aren’t enough. We also need to know whether accuracy is the real goal. On the subject of goals, when it comes to organisational planning and goal setting, it may well be that these are best understood in a game theoretic context.
Whatever the case, once the three activities are disambiguated, and because they remain related, empiricism means doing forecasting and goal setting and planning better.
Related Analyst First posts:
- Hedgehogs are foxy when they’re right
- *What’s Wrong with Expert Predictions*
- *The Folly of Prediction*
- Forecast error versus execution error
- Forecasting, goal setting, planning
- Robin Hanson on Information Accounting
- Paying for software is buying insurance
That is the title of a highly recommended discussion being hosted this month at Cato Unbound:
- The editors’ introduction is here.
- Dan Gardner and Philip Tetlock’s lead essay, ‘Overcoming Our Aversion to Acknowledging Our Ignorance’, is here.
- Robin Hanson’s reaction essay, ‘Who Cares About Forecast Accuracy?’, is here.
- John Cochrane’s reaction essay, ‘In Defense of the Hedgehogs’, is here.
- Bruce Bueno de Mesquita’s reaction essay, ‘Fox-Hedging or Knowing: One Big Way to Know Many Things’, is here.
Every year, corporations and governments spend staggering amounts of money on forecasting and one might think they would be keenly interested in determining the worth of their purchases and ensuring they are the very best available. But most aren’t. They spend little or nothing analyzing the accuracy of forecasts and not much more on research to develop and compare forecasting methods. Some even persist in using forecasts that are manifestly unreliable, an attitude encountered by the future Nobel laureate Kenneth Arrow when he was a young statistician during the Second World War. When Arrow discovered that month-long weather forecasts used by the army were worthless, he warned his superiors against using them. He was rebuffed. “The Commanding General is well aware the forecasts are no good,” he was told. “However, he needs them for planning purposes.”
Even in business, champions need to assemble supporting political coalitions to create and sustain large projects. As such coalitions are not lightly disbanded, they are reluctant to allow last minute forecast changes to threaten project support. It is often more important to assemble crowds of supporting “yes-men” to signal sufficient support, than it is to get accurate feedback and updates on project success. Also, since project failures are often followed by a search for scapegoats, project managers are reluctant to allow the creation of records showing that respected sources seriously questioned their project.
Now “forecasting” as Gardner and Tetlock characterize it, is an attempt to figure out which event really will happen, whether the coin will land on heads or tails, and then make a plan based on that knowledge. It’s a fool’s game.
Once we recognize that uncertainty will always remain, risk management rather than forecasting is much wiser…The good use of “forecasting” is to get a better handle on probabilities, so we focus our risk management resources on the most important events. But we must still pay attention to events, and buy insurance against them, based as much on the painfulness of the event as on its probability. (Note to economics techies: what matters is the risk-neutral probability, probability weighted by marginal utility.)
Good prediction—and this is my belief—comes from dependence on logic and evidence to draw inferences about the causal path from facts to outcomes. Unfortunately, government, business, and the media assume that expertise—knowing the history, culture, mores, and language of a place, for instance—is sufficient to anticipate the unfolding of events. Indeed, too often many of us dismiss approaches to prediction that require knowledge of statistical methods, mathematics, and systematic research design. We seem to prefer “wisdom” over science, even though the evidence shows that the application of the scientific method, with all of its demands, outperforms experts.
– Bueno de Mesquita
Related Analyst First posts:
- The centrality of prediction to all decision making
- Real forecasting versus punditry
- The ‘fake supply’ of bogus and unverifiable forecasts
- Foxes and hedgehogs
- The ‘no change’ benchmark
- Cognitive styles and biases
- The importance of forecast track records
- The case for prediction markets
- Why organisations often don’t want accurate forecasts
- DARPA’s ill-fated Policy Analysis Market
Anyone interested in forecasting and prediction who doesn’t know the work of Tetlock, Taleb, or Hanson should chase up the above links.