“Was Nate Silver Wrong?” Bram Wessel’s talk at the 2017 IA Summit

Here is Bram Wessel’s talk from the 2017 Information Architecture Summit in Vancouver, BC.


Transcript:

What the hell happened in the 2016 US Election?

Let me clarify, because I’m sure there are diverse political viewpoints in this audience. What I mean is not how did Donald Trump get elected, but how did everyone get this so wrong? This is a screenshot of 538’s Polls Only Forecast on November 8, 2016 — which if you are a US citizen is a date you might recognize. For the rest of you, that was the day we elected Trump. As I’m sure many of you know, fivethirtyeight.com is the brainchild of 2011 IA Summit Keynote Speaker Nate Silver.

So was Nate Silver Wrong? For anyone who is familiar with the very recent practice of forecasting elections by feeding polling data into computer models, that was THE BIG question of this election season.

There are plenty of people on the Internet for whom the answer would be yes, Nate Silver was wrong! The thing is, many of them have a reason why they want you to think this. Because right around election day, this became a very popular question, and thus it became very profitable to have the words “Nate,” “Silver,” and “wrong” in a headline.

This spiking search of course has very little to do with the actual wrongness of Nate Silver or 538. Nor does it have much to do with how accurate the 538 forecast was compared to other forecasts. For the record, among all the major election forecasts the 538 forecast gave Donald Trump the highest odds of victory at 28.6%.

And many are asking the question, were the polls themselves wrong? Was the media wrong? And if so many entities, the entities we traditionally trust to interpret the world, who had access to such a vast amount of data, could be wrong about something so fundamental, something so visible, something so seemingly obvious, then, can we ever trust them again? How do we establish meaning in an environment like this?

One of the things I’m going to focus on today is the nature of what Neil Postman and others have called The Information Environment. We live in multiple environments. We live in a natural environment, a built environment, and an information environment. And since we’re all Information Architects here to talk about Information Architecture, it seems like a great time to ask a few questions about what our role in these respective environments might be.

So what do we commonly understand about this information environment? At a base level, we understand that there are places and things made of information. Just like the natural and built environments, in the information environment there are vast forests of data, and there are neighborhoods and communities where inhabitants are bound by shared values or a common purpose. We also understand that we are creatures set in time with finite amounts of attention, so this means that there is an attention economy that powers the information environment.

But are there hidden costs to the incursion of the information environment into the environments we traditionally inhabit?

One way to read the current situation is that we’ve reached a point of consequence provoked by the ways and extent to which our information environment has (to borrow a phrase from Ted Nelson and Peter Morville) become intertwingled with these other environments.

For example, though literacy is at an all-time statistical high, some believe the public is not as well-informed as it was in the past. Some believe a search and social-media driven tsunami of quote unquote “Fake” (I’ll call it “content”) is to blame for this.

So how about this. For the purposes of this discussion in this room, let’s (informally and unscientifically) test this hypothesis —do a little user research on ourselves— by taking what in the polling business they call a straw poll:

Ever done a Google search and seen a good number of results that are obviously crap? OK, now, I want you to raise your hand if that’s NEVER happened to you.

Second question (we’ll collect two data points). Ever clicked on what you thought was a news link in your Twitter or Facebook feed and ended up with something deceptive or misleading? OK, hand up if that has NEVER happened to you.

OK, if you had your hand up either time you lead a charmed life, because this stuff is pervasive. But why is this so pervasive, despite Google’s well-documented attempts to combat it? Well my question is, should it even surprise us? Because much of this process has been given over to machines that communicate with each other with minimal human intervention.

Let’s take content farming for example. Everyone knows what content farming is, right? Content farms are intentionally designed to automate the generation of raw ad revenue with no regard for value exchange. It’s not even that content farms intentionally want to feed us false or deceptive information (although in some cases, perhaps they do). They simply don’t care about content, because human curation costs money. Their goal is only to exploit the market for our attention with as little human effort as possible.

This is the impact of the attention economy on the information environment. Content is not actually the product. We are. A huge volume of microscopic units of our attention is the product Google offers for sale in the open market. Naturally, inevitably, this is going to lead to a race to the bottom in value. Because that’s the way the business model of the ecosystem itself is structured.

Human interpretation doesn’t come into play in this scenario until you and I are forced to wade through oceans of crap in a futile attempt to find what we need. So the perverse result of automation is… wait, what? the burden is shifted to us!? You’re telling me that automating information retrieval generates more human effort? That doesn’t really align with the brand promise of the Alphabet corporation, does it?

And what about social media? The original promise of social media was to create neighborhoods made of information. To facilitate more direct interaction between humans than just asking a machine, like The Google, a question. This is why Facebook goes to some length (even making questionable decisions about what declarations of identity it will allow) to verify that there’s a human behind each profile, and why Twitter goes to equally great lengths to present “verified” accounts.

But here again is an ecosystem ripe for exploitation. And in this case, we’re doing even more work for the machines. Very early in the history of social media, link referral became the most popular activity. And it makes sense that it would. It gives each of us the sensation of harnessing the power of all this automation, all of these supposedly benign algorithms to influence people we’re connected to in this community made of information. It creates the illusion of a network effect at our disposal.

Of course that behavior itself has now been gamed. This is what the publishers of fake news and clickbait have figured out — by creating content that reinforces our convictions, and using the social graph that the social networks offer for sale on the open market, these publishers can generate lots and lots of revenue.

It also creates a feedback loop. We read something that reinforces our convictions, confirms our suspicions, foments our outrage, so we feel a sense of urgency — that it’s critical to spread the word, so that everyone in our network can see and feel what we feel. And the people who already agree with us do. The people who don’t troll us in the comments. (A behavior which also gets monetized.)

And this is a hell of a drug. All this activity doesn’t ultimately gain us much beyond a temporary dopamine rush, but it gains the publishers of this content plenty of units of our valuable attention. The owners of the ecosystem who are taxing it have little economic incentive to curtail it. So the feedback loop continues, and suddenly most people, regardless of level of education, get what they consider their “news” from this massive referral engine, which has a huge artificial motive driving it. The entities who publish this content have no interest in quality of information. In that, they represent a very efficient marketplace for content. But what is getting lost in this market efficiency?

What all this amounts to, even though there’s no one malign entity in direct control of any of this, is a denial of service attack on meaning. The volume of low-quality information out there has gotten so high that the signal to noise ratio is out of proportion for the majority of humans. Never mind who has the critical thinking skills to separate the signal from the noise. Most ordinary people simply don’t have time to devote the effort required to figure out what’s legitimate, or accurate, or authentic.

How did all of this lead to the result we got in this election, and the inability of even the most rigorous and traditionally trustworthy sources to see that this was a likely result? That brings us right back to the question that’s the title of this talk.

Was Nate Silver wrong?

To answer that, I need to provide some background. 538 is a leading purveyor of what is now called “Data Journalism.” What Data Journalism boils down to is taking a more analytically rigorous, and less access-driven approach to reporting the news than journalists traditionally have. When you’re reporting a story like Watergate, where the truth is out there, but it’s hidden, and you need to uncover facts intentionally obscured from public view, you need access. Just like talking to actual users, there’s no substitute for the diligent gumshoe research of finding people to give you the evidence you need to put something in print.

The research in Data Journalism is quite different. Your evidence is data, and your sources are the researchers that generate that data. It’s kind of like the difference between quant and qual in our IA world. Data journalists use tools to get at facts or the truth in what they consider an objective, data-driven way. They might not have the access to find out truths or facts nobody is supposed to know, but they can draw broader and deeper and often more definitive conclusions.

But just like access journalists, data journalists rely on their sources, and things have gotten harder for them. It’s an enormous challenge to forecast based on polling data when there’s such a wide variance in polling methods themselves. For example, online polls differ greatly in results from live interview polls and polling mobile phones gets different results than polling only land-lines.

Once you really start to dig into it, you realize there’s a bewildering array of modeling that is happening in elections forecasting (and other data journalism), from the source data all the way up to the meta-aggregators of all the forecasts. Everyone is trying to get at the truth, but encounters second and third order modeling effects that can distort conclusions.

Which brings us back to Nate. Was he wrong? A better question might be, how can anyone get this right?

Well, one place to start is with something Nate himself said in his 2011 IA Summit Keynote. It might seem obvious, but there’s actually quite a few nuances that create these huge downstream effects: “So number one, and this kind of goes back to the supply chain, but if you’re working with actually trying to manufacture something out of data, to come to some analysis or some conclusion, you have to make sure the data is good to begin with.” -Nate Silver, 2011 IA Summit Keynote

In this quote Nate danced around what I think is a very important concept for us as Information Architects: There’s a supply chain of information. So, let’s take a walk in 538’s shoes and explore their information supply chain, to hopefully shed some light on the information supply chains we interact with as IAs.

So the first principle is that your source data is probably wrong. To put a much finer point on it, the nature of opinion polling is that no poll is ever precisely correct except the one you take on election day. If we know our source data will be ambiguous, the question then becomes how and to what degree? As my brother, who is one of the best scientists I know, is fond of saying: it’s not about being 100% right. The goal is being the least amount wrong.

This is just like what we face as information architects. When we’re trying to model an information domain, there is no instrument that allows us to read the minds of our users. We can’t know how they form their mental models unless we talk to them, and even then we can’t model it exactly. As Nate also said in his IAS Keynote, we have to embrace ambiguity, and consider it a feature, not a bug.

So the 538 model, and other forecasting models, introduce all manner of corrections for the inherent ambiguity, the inherent wrongness, if you will, of polling data. In 538’s case, they’re very inclusive. They’ve made a choice that more data is better. So they aggregate all data except from pollsters that have been shown to be sloppy or unethical. So right away there’s a methodological choice a forecaster faces when they decide what polls they do or do not include.

This is very similar to methodological choices we make when we decide what discovery inputs we’re going to trust and prioritize. Are we going to be driven more by analytics or qualitative user research? By Big Data or by Thick Data? What balance are we going to strike?

One of the very most basic ways election forecasters do this is by choosing which polls to prioritize. One thing they do is prioritize state polls over national polls. Turns out that’s a wise thing to do because in the US we have this thing maybe you’ve heard of, it’s called the electoral college? It’s actually electors, Five Hundred and Thirty Eight individual humans chosen by states, that elect the president, not the popular vote. So if you wanted to create a model that just tracked the polling of the national popular vote, like the Real Clear Politics National Polling Average, some of the time you’d get the winner wrong. Just ask Hillary Clinton. Or Al Gore.
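To make the state-versus-national point concrete, here is a minimal sketch using three made-up states. The state names, vote counts, and elector counts are all hypothetical, and the sketch assumes every state awards its electors winner-take-all, which is a simplification. It shows how a candidate can win the popular vote and still lose the electoral vote.

```python
# Toy illustration with hypothetical states, not real election data.
# Each entry: (electors, votes for candidate A, votes for candidate B).
STATES = {
    "State X": (10, 900_000, 100_000),  # A wins in a landslide
    "State Y": (10, 490_000, 510_000),  # B wins narrowly
    "State Z": (10, 495_000, 505_000),  # B wins narrowly
}

def tally(states):
    """Return (popular totals, electoral totals), assuming winner-take-all."""
    popular = {"A": 0, "B": 0}
    electoral = {"A": 0, "B": 0}
    for electors, votes_a, votes_b in states.values():
        popular["A"] += votes_a
        popular["B"] += votes_b
        # Exact ties are not handled in this toy example.
        winner = "A" if votes_a > votes_b else "B"
        electoral[winner] += electors
    return popular, electoral

popular, electoral = tally(STATES)
print(popular)    # A leads the popular vote
print(electoral)  # B leads the electoral vote
```

A model that only tracked the combined vote would call this race for candidate A, while the state-by-state tally goes to candidate B.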

Then a forecaster has to decide which version of each poll to use. If you’re surveying all adults, you stand a good chance of surveying a lot of people who aren’t going to vote because they’re not registered to vote. Almost no serious polling firm refuses to screen at least for registered voters, and even then it’s a well-known fact that only some fraction of registered voters end up voting.

So if a polling firm publishes multiple versions of a poll, forecasters tend to want to use the “likely voter” version, because it’s considered more accurate. This is important in the US because we have a problem with turnout. In fact, if you multiply the percentage of eligible adults who actually voted, by the percentage of the vote Trump got, it turns out that less than a quarter of eligible adults in the US elected Trump for everyone else.

If you’ve ever done user research (and that should be what pollsters would call a plurality of the people in this room), you’ve probably used a recruiter, and given them a screener to help them find the target audience. As IAs, we can’t afford to talk to people who aren’t in the target audience. Not only is it a waste of time and money, but it’s very risky. It could lead us to the wrong conclusions, or creating the wrong design.

So naturally, pollsters that limit their results to likely voters get different results than firms that publish results from all registered voters. So how do pollsters decide how likely someone they’re surveying is to vote? They create a likely voter model. They do this by asking qualifying questions before they ask who people intend to vote for. (Gallup has a famous set of seven questions.) So just like us, the choices pollsters make about how they screen affect the precision of their results.

Forecasters have to decide how to interpret the choices the pollsters make. So just like we do with discovery inputs, forecasters give their inputs different weights. In 538’s case, they weight polling firms based on a firm’s historical track record for accuracy, but they also weight based on methodology. Part of how forecasters model the data they receive is by modeling their own value judgements about the different methodologies of their sources. And they have to keep this rating current, because the performance of pollsters changes over time. (Sometimes pollsters will even change their methodologies midstream during an election season!)

Another weighting factor is time. Each poll is a snapshot in time. People’s opinions change over time. And some elections have higher volatility than others. 2016 was much more volatile than 2012, for instance. So it stands to reason that a poll taken two months ago is probably not going to be as predictive as one taken yesterday. You can see here how quickly opinions change by holding the slider over October 28, the day of FBI director James Comey’s letter to Congress. Look at how much the race tightened in the shaded area after the Comey letter.

So forecasters weight for recency. This is especially valuable if the same firms are taking multiple polls of the same population with the same methodology, because it’s instructive to see how a population’s opinion shifts over time in an important state like, say Pennsylvania. Or Michigan. Or Wisconsin.
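The two weighting ideas above, pollster track record and recency, can be sketched as a single weighted average. This is only an illustration, not 538’s actual formula: the `Poll` fields, the `rating_weight` scale, and the 14-day half-life are all assumptions made up for the example.

```python
from dataclasses import dataclass

@dataclass
class Poll:
    margin: float         # candidate A minus candidate B, in points
    days_old: int         # days since the poll was taken
    rating_weight: float  # hypothetical pollster-quality weight, 0 to 1

def recency_weight(days_old, half_life_days=14.0):
    """Halve a poll's influence every `half_life_days` (an assumed value)."""
    return 0.5 ** (days_old / half_life_days)

def weighted_margin(polls):
    """Average the margins, weighting each poll by quality times recency."""
    weights = [p.rating_weight * recency_weight(p.days_old) for p in polls]
    total = sum(weights)
    return sum(w * p.margin for w, p in zip(weights, polls)) / total

polls = [
    Poll(margin=6.0, days_old=28, rating_weight=1.0),  # good firm, old poll
    Poll(margin=2.0, days_old=0, rating_weight=1.0),   # good firm, fresh poll
    Poll(margin=8.0, days_old=0, rating_weight=0.5),   # weaker firm, fresh poll
]
print(round(weighted_margin(polls), 2))  # 4.29, versus a simple mean of 5.33
```

The fresh poll from the well-rated firm pulls the estimate toward its own number, which is the whole point of weighting.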

Next come adjustments. Remember, forecasters assume the polling has entropy, and try to reduce it. So they change the data. This, obviously, can be fraught with peril. A very basic choice most forecasters make is to adjust results to compare apples to apples. So they essentially re-model likely voters. They adjust polls of all adults and all registered voters to be functionally equivalent. But remember what we talked about before; most pollsters use their own likely voter model. So what forecasters are adjusting has already been adjusted.

Another adjustment arises because some firms don’t ask their respondents about third party candidates. So forecasters adjust so that every result is as if the firm had asked about all candidates. The challenge with this adjustment is this data doesn’t exist, so a forecaster has to model how they think third party support might distribute. Because Libertarian or Green party voters will not divide equally among candidates for the Democratic or Republican parties.
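As a rough sketch of that adjustment: take a head-to-head result and carve out an assumed third-party share, drawn unequally from the two major candidates. The function name, the 6-point third-party share, and the 40/60 split are hypothetical inputs chosen for illustration, not figures from any forecaster.

```python
def add_third_party(two_way, third_share, drawn_from):
    """Convert a head-to-head poll into a three-way result.

    two_way:     e.g. {"Dem": 52.0, "Rep": 48.0}, in percentage points
    third_share: assumed third-party support, in points
    drawn_from:  assumed fraction of that support taken from each candidate
                 (must sum to 1, and need not be equal)
    """
    if abs(sum(drawn_from.values()) - 1.0) > 1e-9:
        raise ValueError("drawn_from fractions must sum to 1")
    adjusted = {
        name: share - third_share * drawn_from[name]
        for name, share in two_way.items()
    }
    adjusted["Third"] = third_share
    return adjusted

result = add_third_party(
    {"Dem": 52.0, "Rep": 48.0},
    third_share=6.0,
    drawn_from={"Dem": 0.4, "Rep": 0.6},
)
print(result)
```

Every number passed in here is a modeling choice, which is exactly why this kind of adjustment is risky.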

Since opinion evolves over time, some forecasters do a trend-line adjustment. Which is like saying well, since everything else has been trending toward Clinton, and we get an outlier showing support for Trump, we’re going to pull that back toward the middle. Then if a forecaster has observed that a polling firm has what’s called a house effect, meaning their polls tend to lean one way relative to the average, they also get adjusted. So there’s a smoothing that happens to try to keep the forecast less volatile. But that can be a slippery slope because every adjustment you make has downstream effects because every time you adjust a data point to an average, it changes the average.
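Here is a small sketch of a house-effect correction, using hypothetical firms and margins. It measures each firm’s lean against the average of the firm averages, which is one of several possible baselines and an assumption of this example. It also shows the downstream effect described above: once the polls are adjusted, the overall poll average itself moves.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical polls: (firm, margin for candidate A in points).
POLLS = [("Firm 1", 5.0), ("Firm 1", 7.0), ("Firm 2", 2.0), ("Firm 3", 4.0)]

def house_effects(polls):
    """Each firm's average lean relative to the average of firm averages."""
    by_firm = defaultdict(list)
    for firm, margin in polls:
        by_firm[firm].append(margin)
    firm_means = {firm: mean(margins) for firm, margins in by_firm.items()}
    baseline = mean(list(firm_means.values()))
    return {firm: m - baseline for firm, m in firm_means.items()}

def remove_house_effects(polls):
    """Subtract each firm's lean from every one of its polls."""
    effects = house_effects(polls)
    return [(firm, margin - effects[firm]) for firm, margin in polls]

before = mean(margin for _, margin in POLLS)
adjusted = remove_house_effects(POLLS)
after = mean(margin for _, margin in adjusted)
print(before, after)  # the poll average moves from 4.5 to 4.0
```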

Then a forecaster has to figure out what to do with undecideds. They have to model all the folks who weren’t able to say who they were going to vote for when polled. More missing data they have to model. And I’m not even going to get into things like the Convention Bounce adjustment, Partisan Voter Index, Demographic Regressions, blending Polls with Economic Fundamentals, etc.
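To see why modeling undecideds matters, consider this minimal sketch. The poll numbers and the split fractions are hypothetical, and proportional allocation is just one default a forecaster might assume.

```python
def allocate_undecided(poll, split=None):
    """Distribute the 'Undecided' share among the named candidates.

    split: assumed fraction of undecideds going to each candidate.
           If omitted, undecideds are split in proportion to current support.
    """
    undecided = poll.get("Undecided", 0.0)
    decided = {k: v for k, v in poll.items() if k != "Undecided"}
    if split is None:
        total = sum(decided.values())
        split = {k: v / total for k, v in decided.items()}
    return {k: v + undecided * split[k] for k, v in decided.items()}

poll = {"A": 45.0, "B": 43.0, "Undecided": 12.0}

proportional = allocate_undecided(poll)
even = allocate_undecided(poll, {"A": 0.5, "B": 0.5})
late_break = allocate_undecided(poll, {"A": 0.3, "B": 0.7})

print(proportional)  # A stays ahead
print(even)          # A stays ahead
print(late_break)    # B moves ahead
```

The same poll produces a different leader depending on how the forecaster assumes the undecideds will break.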

The reason that I’ve taken us on this long and winding path through an example of some of the most complex, high stakes modeling that exists anywhere is to demonstrate all of the complexities that emerge in an information supply chain. And what is striking about all this modeling is just how human it all is. Humans are responsible for all the modeling decisions. In fact, creating an election forecast IS INFORMATION ARCHITECTURE!

And none of it is new. Election forecasting is grounded in classic information theory straight out of the work of Claude Shannon. Forecasters are trying to reduce the entropy of poll results by aggregation and comparison. If you take one poll, it provides lots of new information. But since taking a poll is a predictive act, a second poll should provide relatively less new info; it should in part be predicted by the first.

“Information is the resolution of uncertainty.”

— Claude Shannon
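For readers who want the arithmetic behind “entropy,” here is a minimal sketch of Shannon’s measure in bits. It treats the race as having only two outcomes, so the 71.4% figure is simply what remains after 538’s 28.6% for Trump. That is a simplifying assumption of this example.

```python
import math

def entropy_bits(probabilities):
    """Shannon entropy, in bits, of a discrete probability distribution."""
    return -sum(p * math.log2(p) for p in probabilities if p > 0)

coin_flip = entropy_bits([0.5, 0.5])     # maximum uncertainty for two outcomes
forecast = entropy_bits([0.714, 0.286])  # a 71.4% / 28.6% forecast

print(coin_flip)           # 1.0
print(round(forecast, 2))  # 0.86
```

By this measure, a 71.4/28.6 forecast has resolved only a small part of the uncertainty of a coin flip.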

So if you gather and compare all of the information people say they believe about a given subject and aggregate it, the theory is you can eventually reduce the entropy in the channel to a more predictive level. At this point, I want to share one of my favorite quotes from Peter Morville. He was talking about Findability, but I think this applies generally to modeling information:

“Findability is at the center of a quiet revolution in how we define authority, allocate trust, and make decisions.”

— Peter Morville, Ambient Findability

What does this mean for Information Architects? How is the broader information environment impacting the jobs we do? What are some survival tactics to make meaning in this increasingly chaotic environment? What are our information supply chains?

As Nate Silver pointed out to us six years ago, how do we know that our source data is good? We have to continually question our methodologies by thinking about the information supply chain at every level of the information environment we inhabit.

Just like it’s getting harder for data journalists, the demands of managing our information supply chains are making our jobs harder. Way back when, in the early days of IA, we used to think our job was creating self-contained navigation systems with internal coherence and consistency. People would hit www – dot – our domain name – dot com and we had to make sure they could find what they might have been looking for. It wasn’t that difficult to figure out what everything was, because the information environment —and our little slice of it— was much smaller.

How quaint that all sounds compared to today, when everybody is coming in through a side door and virtually everything we work on is dynamic. It’s extremely likely users just had to wade through the sea of crap in whatever search or social experience led them to wherever they land within our site, so they’re likely to be very disoriented when they get here. Our job at this point is to help them shake off the clutter and get reoriented as quickly as possible. So now instead of building a self-contained navigation system, we need to structure information in a permeable way that helps people establish context when they parachute into the middle of our world.

So, we have to create the structure of what’s here. We have to create the definitions. We have to establish the rules of understanding. What are all the things? What are they called? What can you do with them? How are they related to each other?

Fortunately this is in alignment with the way Google ranks things these days. In reaction to content farming, Google is trying to teach PageRank how to distinguish authentic content from crap. So how great is this? The best way to optimize content for search is to make sure it fails to suck!

And this is social-media proof too! It doesn’t matter what context or state of mind someone is in when they arrive if you give them the tools to quickly re-orient themselves. Even if someone is super pissed off about the flame fest they just had with their racist former friend from high school, when they arrive in our world, we have to afford them the opportunity to make sense of their surroundings and get what they need from the experience.

So what can we do as information architects in the trenches as practitioners? What can we do when we have limited control over the vastness of the information environment? How do we figure out what impact we can have in our limited sphere?

We need to understand how the broader information environment is affected by everything we put into it. It is on us to help users get on top of the signal to noise ratio. Because no matter how much we try to create highly effective self-contained systems of understanding, the broader information environment is going to increasingly dictate how these systems are used and how authoritative they are. Whatever we’re working on, it’s our responsibility to give information users the ability to determine what is quality and to sift through the volumes of information we provide to find what’s valuable to them.

Let’s zoom out a little. Most of us work in or with some kind of organization. So what are the implications of the emerging information environment for organizations?

Just like the rest of us, organizations have to wade through the same sea of crap if they want customers or constituents to be able to satisfy their goals. In this attention economy, with the heavily mediated experiences everyone is immersed in and the continuous partial attention demanded of all of us, it’s difficult to get a message across, because there’s so much entropy in the channel.

A lot of the work we do in my company, Factor, is literally just helping organizations understand what their own information supply chains look like. Without doing this, they have no way of gaining control over their information assets to construct systems of meaning their constituents can engage with.

This is important for every organization. No matter what your organizational charter is, your organization is inescapably about something. There is an aboutness to how your information assets are organized and instrumented that communicates the goals, values, and priorities of your organization. To elevate your signal above the noise, you need to reduce the entropy in your channels. You need to model the information you’re putting out into the world, so you’re not obscuring this aboutness. You need to give the people and the machines who are interpreting the information you are providing the semantic handles to be able to use it effectively.

And what we’ve seen is that organizations should resist the impulse to run to the perceived safety of things like referral engines, best bets, and other automated means of creating an experience. People who bought… also bought… might work, but how meaningful is it? Of course, find smart ways to integrate automated processes as inputs or augmentations to a core semantic experience, but don’t let it become a substitute for the fundamental act of modeling and exposing what an organization is about.

How do you do this? It starts all the way at the very beginning by asking the right questions to begin research. We need to validate and revalidate our inputs, because they are likely to have second and third order effects on how we model information and experience downstream.

In modeling information, we have to prioritize like election forecasters do. Each decision we make when we’re modeling is a commitment to what we value. And we’re going to have to consider the environment our models will inhabit. Not just the technical environment, or the user context, but the broader information environment.

Finally, what is our responsibility as a community? How can we call attention to this challenge of an information environment burdened by so much entropy? How can we build durable, sustainable ecosystems of understanding that allow people to get on top of their signal to noise ratio?

Perhaps we need to broaden the question to what are our goals as a society? Since about 1996, it seems like we’ve been trying to create a digital ecosystem where attention can be optimally monetized. When there is finite human attention, is that sustainable? It’s not possible for a natural ecosystem to thrive in a polluted natural environment. Is it possible for a civil and just society to sustain itself in a polluted information environment?

Define the damn thing (#DTDT) has been a useful, valuable debate for our community. But what would happen if we directly engaged the folks who own and control the ecosystems that power the information environment, like Google, Twitter, Facebook etc, in a larger debate about defining our damn society? As Alan Cooper pointed out, they’re being nice now, because it’s in their interest to, so isn’t now the time to engage?

We’re already conscious of the imperative to avoid dark patterns. We already have codes of conduct that govern our community gatherings. Do we also need a Hippocratic oath for makers of information spaces?

On Friday, Alan Cooper reminded us of the power and responsibility we have. “We are the information alchemists,” he said. This amounts to an ecological responsibility for information environments. We are the information ecologists. If we want to live in a just and civil society, I submit we must also become information environmentalists.

How can we take back control of the signal to noise ratio? We’ve all been in that meeting with our developer friends where we have to decide what we want to compromise based on what the code can and can’t do. When we do this, do we think about the consequences of these decisions for the broader information environment and its inhabitants?

Each time we decide to let an algorithm do something for us we’re intentionally ceding a measure of human curatorial control because we’ve decided that something is too burdensome for humans to manage. And that’s fair, some things are. So we need to make sure when we do that, we understand the ambient entropy an automated process is going to create. We need to ask the same tradeoff questions we ask about how hard it is to develop something when we think about how disruptive something may be once it’s unleashed.

Fortunately we have experience doing this. We know how to assess, how to analyze, and how to measure the user impacts of our tradeoffs. We know the eternal product development equation: do you want Good, Cheap, or Fast? You can only pick two. There are limits, laws of nature, that we must abide by.

I’m not nearly the first person to be asking these questions. The leaders in our community are engaging this head on. Andrea Resmini talked about this at last year’s IA Summit. Andrew Hinton talked about this at World IA Day in Atlanta. It came up during the Reframing IA Roundtable on Wednesday. Alan Cooper rallied us all this Friday. Marsha Haverty spun yet another lyric poem about our dynamic information environments just an hour ago.

And encouragingly, we’ve been hearing all week about how people are doing great work at Google, and Etsy, and REI, and IBM, and Autodesk, just to name a few of the talks I’ve attended. Embracing the humanity of modeling, and taking responsibility for the power of automation. I hope the founders of our discipline among us, like Marcia Bates and Alan Cooper, are as encouraged as I am about what we’ve seen here this week.

We need more of this, at the organizational and societal level. None of this is new. There are models (!) that we can adapt from the sustainability movement. If we think about our impact on the information environment what’s the triple bottom line of our decisions? What is our entropy footprint? How do we break through the noise to elevate the signal for all of us?

So this means that just like 538, we have to model on every level. We will increasingly be forced to make choices about what facets of the experiences we create are going to be semantically driven and then model those systems of meaning.

We have to model from the outside in. Like an access journalist, our modeling choices must be led by user research and an understanding of information behavior, because that’s the only way users are going to be able to discern the quality and authenticity of the information we introduce into the environment.

And we have to model from the inside out. We have to understand the information assets our organizations possess, and we have to give ourselves similar, if not even more precise handles to manipulate this information. Because we’re not going to be able to set it and forget it, we’re going to have to curate it over time. We’re going to have to continue elevating the signal over the noise in perpetuity. We’re going to have to take responsibility.

So it’s up to us. As information architects, as users, as humans it’s up to us to decide what kind of information environment we want to live in, and, to echo what Andrew Hinton said at World IA Day this year, those of us who are lucky enough to have jobs curating this information environment will be the ones responsible for it. On a local and a global level, the responsibility is ours. Just like we care about what goes on in our back yard, or on our block, or in our neighborhood, city, province, nation and world, this curatorial imperative exists at every level of all the environments we inhabit.

So have I answered that question I set out to? I hope you have seen that there are many, many answers, but Nate Silver got his start creating predictive models for baseball players, so I’ll just leave you with this:

Chance 538 gave Trump of winning the presidency in 2016: 28.6%

Chance a major league baseball player would get a hit during any at-bat in 2016: 25.5%

That’s right, the chance Nate Silver and 538 gave Trump of winning the presidency in 2016 was higher than the 2016 Major League batting average. And yet, players get hits, runs are scored, games are won many times every single day. Probabilities in the high 20’s come through all the time.
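For anyone who wants to check the intuition, here is a small sketch. It simulates how often a 28.6% event actually happens, and computes the chance a .255 hitter gets at least one hit in four at-bats. Treating at-bats as independent is a simplification, and the trial count and random seed are arbitrary choices for this example.

```python
import random

def simulate_frequency(p, trials=100_000, seed=538):
    """Fraction of `trials` in which an event of probability p occurs."""
    rng = random.Random(seed)
    hits = sum(1 for _ in range(trials) if rng.random() < p)
    return hits / trials

def at_least_once(p, attempts):
    """Chance an event of probability p occurs at least once,
    assuming the attempts are independent."""
    return 1 - (1 - p) ** attempts

print(simulate_frequency(0.286))          # close to 0.286, roughly 2 times in 7
print(round(at_least_once(0.255, 4), 3))  # 0.692
```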

So out of everyone who tried to forecast the 2016 election Nate Silver turned out to be the least wrong. And as all good information scientists know, least wrong is as close as anyone can get.

Factor
