Without going into too much of the back story (you should all know it by now), the discussion of the self-labeled "advanced" stats around these parts is not something we like to get involved in. It led to a game thread discussion that left many feeling the community lost its way, it has led to a few frayed professional relationships, and has led to the mockery of one of our fellow SBNation communities.
That last part may be my favorite part, but that's just me.
After several very lengthy discussions on the topic, posts hijacked and turned into lectures about how great stats are, and several requests to simply be left alone, we followed some sage advice and went silent on the issue. Sure, there are a few sarcastic remarks every now and again, but for the most part, we went our way and hoped the stats world would go theirs.
When the Wild went into their annual crash and burn phase, we ignored (for the most part) the "I told you so" posts that followed. Everything seemed to be hunky-dory, and no one seemed the wiser that Wild fans simply opted out of the discussion.
Until yesterday. Why it came back, and why I don't really get it... after the jump. (Warning: 2700 words ahead. Bring a donut.)
Before going too far, let me say that I have a great deal of respect for Kent Wilson. I understand completely why Puck Daddy chose him to come on board as their stats guy. Stats are a big deal right now, and posts about stats likely get a good deal of attention. I get it. I'm glad it's Kent. He deserves the stage he is on, and does superb work.
I disagree with nearly everything he writes, but I respect the work he does, and the man behind the work.
This is one time I am a bit confused by the timing of the article, and the message behind it. If you haven't read it, please take a moment to head over and read it. It is titled 'Stats Guys' vs. Cinderella, or how Minnesota Wild Regressed to the Mean.
Let's take a look at a few passages:
It starts with a rundown of the background: the rise of the Wild to the top of the league, and the fiery crash that followed. It includes the words you have all read 1000 times:
Their total shots for/against (or "CORSI ratio") to that point was just .419, one of the worst in the league. Nevertheless, the underdog Wild were "finding ways to win," to borrow a cliché, so any skepticism was dismissed out of hand.
After all, pointing to the standings could readily silence any unbeliever.
Wilson goes on to say that there were posts around the net predicting the fall. The one he fails to link to is the one that set off yours truly (and the bulk of Wild fans). The post by Derek "I Need Proof" Zona decrying the Wild as the worst team in the league while sitting atop the NHL.
This distinction is important, since before that post, the stats were something Wild fans didn't rightly care about, weren't discussing, and likely wouldn't have discussed. Instead, Zona took a pot shot at the team, and at the fan base (read the comments if you decide to find the article), which pissed quite a few people off. If anyone is wondering where the "fight" started, that's it.
Wild fans didn't go out looking for a fight. Wild fans didn't act like fans from Dallas or Colorado and go to Behind the Net and attack Gabe. No, Zona took a shot, and Wild fans responded. Somehow, that gets lost in the mix of all of this.
We're a bit off track, but it is an important piece of information.
"Regression to the mean" became a punch line in Wild fan circles.
Still is. Always will be. Count on it. With the way the "stats guys" presented their case, and the voices they not only allowed to speak for them but also hold up as beacons of the stat world, the "stats guys" are losing fans much faster than they are gaining them. At least here in Minnesota.
Of course, with Minny currently sitting 12th in the Western Conference heading into Thursday night, the next chapter of this story is an obvious one.
It's only obvious because the "I told you so" dance doesn't only happen on Scrubs. The obvious reaction was... obvious. The Wild started losing, and the "stat guys" went on their own little parade of self-congratulatory masturbation. The article Wilson links to is one of the prime examples of it. In that article, it is discussed why the Wild have the same issues the Preds are showing, but no one says the Preds are going to fail. In other words, the numbers say they should fail, they aren't, and the "stat guys" aren't all over them. So if the numbers never lie, why not say the Preds will fail? Where is the objectivity?
They create an entirely new twist, saying there has to be a perfect storm in order to say the things they said about the Wild.
The purpose of this article isn't to dance on the grave of the Wild's short-lived elite status. Nor is it to point and laugh at Wild fans. The episode is an object lesson in how percentages can vary wildly around a mean in small samples and why that is so counter-intuitive to the fan experience.
It isn't to dance on the grave of the Wild's short lived elite status? Huh. Could have fooled me. Couldn't have made this a generic post about stats? It had to include the term "Wild" 18 times? Yep. Not about the Wild, though.
In his recent book "Thinking, Fast and Slow," psychologist Daniel Kahneman notes how poorly people tend to grasp statistical truths like regression or the influence of sample size on results. In his chapter "Regression to the Mean," Kahneman details how apparently foreign the concept is to us:
Whether undetected or wrongly explained, the phenomenon of regression is strange to the human mind. So strange, indeed, that it was first identified and understood two hundred years after the theory of gravitation and differential calculus.
This passage, right here, highlights the fundamental chasm between the "stat guys" and the people here at Hockey Wilderness. The "stat guys" assume Wild fans don't grasp the concept of "regression to the mean." That, somehow, we aren't as smart as them. After all, if we just could grasp the concept, we would be magically converted and there would be harmony amongst the tribes.
If the "zealots" could just grasp the concept of Islamist living, there would be peace and prosperity. Not to suggest the "stats guys" are terrorists, but to show that a line of thinking that suggests one group is wrong, and if they would just believe what we do we would all get along is flawed in a fundamental way.
I, and everyone else I have had the pleasure of discussing the numbers with, grasp the concept of regression just fine. We all took algebra, and many of us were even wily enough to somehow con our way through advanced statistical analysis classes in college. Damn if business and math degrees don't require you understand statistical concepts, including regression. Those silly millionaires tend to want the people protecting their business interests to understand how to analyse numbers. Bastards.
Silly me. Having learned statistical analysis at a college level, it would be absurd to assume I understand a basic concept like regression. I'm so glad the "stats guys" are here to save me from myself.
One thing I really enjoy, as a student of communication, is how Wilson uses the word "us." It's a great way to defuse the "us against them" line of thinking, to subtly hint that we're all in this together. Too bad the entire article is an us against them line of thinking, or it may just have worked.
Humans prefer patterns and causal chains to abstract notions of variance or probability. In fact, people tend to identify apparent patterns in randomness without effort and to fit noisy, complex events with tidy narratives that make them easier to understand and more coherent.
This may very well be true. However, this particular human prefers to follow the rules of science when attempting to prove a hypothesis. The prediction was that the Wild would fall apart. They did. Congrats on being right. However, the numbers didn't change. The Corsi numbers didn't change much, if at all. In science, that is called a control. It didn't change. So if the situation changed you look for the variable to explain it.
The stats are a control, so why not look for the variable? Well, because that doesn't fit with the... wait for it... tidy narratives that made it easier to understand and more coherent.
Wilson goes on to intimate that this line of thinking is wrong by bringing up the system, "playing for each other," etc., as a lead-in to how it is actually the numbers and a decent length of time (allowing for regression) that brought the team to earth. The problem is that's where the variables lie - in the intangibles, the systems, the "playing for each other." The numbers didn't change. The variables did, but they don't fit the narrative woven by the "stats guys," so dismiss them out of hand and throw more numbers around until people finally just tell you that you're right so you will shut up and leave them alone.
Perhaps the greatest and most frequent charge against the spreadsheet analysts was that they were assessing the Wild's abilities without "watching the games."
This is true. I contend they don't watch the Wild enough to understand the team. However, to dismiss the "stats guys" out of hand like that would be no better than what they are doing. "Objective" analysis of numbers can be welcomed as a counterpoint to what someone with admitted biases can gather. That said, it wasn't the Wild's abilities they need to watch the game to see, it is the changes, those pesky variables that they don't get to see. The change from a team playing a system, to one made up of individual parts all trying to play a different game. It's like an orchestra, with everyone playing a different song.
But that can't be quantified, and thus, should be ignored, right? Right.
Direct observation is indeed the most data rich method of analyzing hockey teams, but it is also potentially the most deceptive; particularly through the rose-colored glasses of a fan.
Another way to dismiss an argument is to say that the people making it are biased. I admit that bias. When the "stats guys" claim they have no bias, they are lying to you, and to themselves. They are trying to protect their religion, and will fight anyone to do it. They all use the same links, the same posts as their basis of fact, but no one has ever bothered to consider the basics might just be fallible.
By the way, SBNation was founded on the very idea that there is no such thing as "objective" in sports. The "stats guys" are, themselves, fans. Fans of other teams, and fans of the stats themselves. Everyone in the conversation is a fan.
The next portion of the article uses the example of a ball thrown up in the air, which will always return to earth no matter how strong the arm of the person throwing it. This is to draw an analogy to gravity, and show how regression is a guarantee, just like the ball coming back to earth.
The problem here? The stats are not a certainty. Gravity is. No matter what type of Star Trekian spaceship you build, gravity can never be escaped. Ever. Scientifically proven. There is a reason they are called "laws" of physics and not "suggestions" of physics.
Stats can be overcome. The Bad News Bears can win. The Mighty Casey can strike out. People beat cancer when the numbers say they don't have a chance. Stats are overcome every single day. Matt Kassian can score two goals in a single NHL game.
Fights between a Cinderella club's fan base and "the stats guys" aren't new. Last year the Dallas Stars had a similar flight up the standings and then a subsequent fall from grace in the second half. The Colorado Avalanches incredibly unlikely run to a playoff appearance in 2009-10 was also a bone of contention between the bean counters and Avs fans. In fact, a cult of personality sprung up around Joe Sacco similar to Mike Yeo, although he was later targeted as a scapegoat when the team failed to replicate the feat one short year later.
Here's where the history lesson I gave you above is important. The fight was not started by this side of the backyard. Even when Gabe and others (including Wilson) predicted the Wild would be failures, not one Wild fan went to their site and blew up at them. No one called them nerds, geeks, weirdos, calculator loving morons. Not until Zona pushed one too many buttons, acted like a pompous ass one too many times, and went for the headline that would get him the most clicks, did this turn into a fight.
All Wild fans were doing was sitting around, watching games, amazed by the team's performance, and hoping it would continue. No one predicted the Wild were going to win the Stanley Cup, no one thought the Wild were sure fire winners in any given game. They were quietly enjoying the fun that is watching your favorite team outperform its expectations.
Out of the blue, the "stat guys'" pit bull broke his leash and went on his little bender. Then, and only then, was there any fight at all. Wild fans, and me, just wanted to be left alone. The "stats guys" couldn't resist the urge to not only rain on the parade, but to piss on the people at the parade and rub dog shit in the eyes of all the kids at the parade for thinking there might be candy, all for the grand offense of enjoying the way their favorite team was playing.
Wild fans really are a bunch of bastard coated bastards with bastard filling, aren't they? How dare you enjoy hockey? Don't you know you aren't allowed to like hockey unless you are a proud Canadian?
Screw Dallas, screw Colorado. The Wild are not those teams. The fact that these two examples are bandied about is more proof that the "stats guys" just want their tidy narrative, and will not accept there are other possibilities. After all, if Wild fans didn't start the fight (which they didn't), the story line doesn't work. If Wild fans didn't go out and look for a fight like the fans in Colorado and Dallas, there is no way for the "stats guys" to make the narrative work.
Hockey is a complex game and all sorts of things can happen that makes predicting the future nearly impossible. However, regression is as persistent as gravity and a team dependent on unusually high percentages for success will inevitably fall back down to earth - regardless of how many of their games you watch.
Regression is not as persistent as gravity. Regression is not a given. The analogy is poor at best. It's actually an indicator of just how far over the edge the "stats guys" have gone that they believe their numbers are so infallible that they should be likened to gravity.
The Bottom Line
Wilson claims this isn't about the Minnesota Wild, yet the entire article is about the Minnesota Wild. The background is laid out to make his narrative work, without giving all the details, and without respect for the fact that there are a few things you need to know to fully grasp the issues in play.
- Wild fans didn't want the fight. The fight was brought to them.
- Wild fans fully grasp the concept of regression.
- The "stats guys" continue to go against fundamental scientific method and ignore the variables in favor of using controls to claim their hypotheses are correct.
- There has been no one that can prove causation. In other words, no one can show, beyond a doubt, that Corsi (and the like) predict anything at all. If there is any doubt this is true, they haven't convinced Wild fans, and they haven't convinced me.
- No one on this side of the "fight" gives a rat's ass about the numbers. We understand them, we can grasp their concepts, and we still flat out don't care about them. As offensive as that might be to the "stats guys," it is a truth they should quit ignoring. The "stats guys" can be as right as they can, and the people on this side still won't give a shit.
- Everyone I have spoken to about stats from the Wild fan perspective just wants to be left alone. To be allowed to enjoy the games, to watch the team play, and to cheer or boo when needed. That can't be done while people like Zona continually fling shit into the campground and then dance a little dance every time something they say was predicted comes to be.
- As long as the "stats guys" continue to write condescending pieces about how right they were, and continue to retread the same story line three months after it died, they aren't going to win many converts.