Rec’ing on…The Ranking Project (4)
Figuring out the quality of opponents and wins has definitely been a struggle. Our goal is to keep things impartial and quantifiable, while encouraging good scheduling choices for teams and guarding against systemic abuses (i.e., cheating).
I have to tell ya, this is sort of a tough nut to crack, but I think I’ve finally started getting a handle on how to do this. We start with the win-quality calculations I wrote about in part 2 of this series (link). It gives a better picture than pure win percentage. Unfortunately, it doesn’t take into account the quality of opponents.
See, here’s the deal: let’s say a team ranked #217 loses to the #2 team by ten points. How does that compare to team #3 beating team #23 by one point? Shouldn’t team #2’s victory have been by a greater margin, so that even though they won, they didn’t win by as much as they should have? In comparison, isn’t team #3’s victory closer to what we’d expect? I think the answer is yes, but the problem lies in how to represent that on paper.
I’ve written code that uses the difference between two teams’ wq scores to estimate what the win margin of a particular game should be. Within that win margin, there’s no change to a team’s wq. Outside of that margin, the winner’s wq can be bumped up or down depending on whether they exceeded the margin or underperformed (respectively). For the loser, their wq can be bumped up if they keep the score under the estimated margin, and will be lowered if they lose by too much.
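For anyone who wants to see the shape of that logic, here is a rough Python sketch. It is not my actual code: the function names and all three constants (`POINTS_PER_WQ`, `FUDGE`, `STEP`) are made up for illustration, and it assumes wq is a number where higher is better and that the estimated margin scales linearly with the wq gap.

```python
# All constants are illustrative assumptions, not values from the real project.
POINTS_PER_WQ = 40.0  # hypothetical: points of expected margin per unit of wq gap
FUDGE = 3.0           # hypothetical: tolerance band around the estimated margin
STEP = 0.01           # hypothetical: size of a single wq bump


def expected_margin(winner_wq: float, loser_wq: float) -> float:
    """Estimate how much the winner 'should' have won by, from the wq gap."""
    return (winner_wq - loser_wq) * POINTS_PER_WQ


def adjust_wq(winner_wq: float, loser_wq: float, actual_margin: float):
    """Return (new_winner_wq, new_loser_wq) after one game."""
    expected = expected_margin(winner_wq, loser_wq)

    # Inside the band around the estimate: nobody's wq changes.
    if abs(actual_margin - expected) <= FUDGE:
        return winner_wq, loser_wq

    # Winner beat the estimate: winner goes up, loser goes down.
    if actual_margin > expected:
        return winner_wq + STEP, loser_wq - STEP

    # Winner underperformed: winner goes down, loser gets credit
    # for keeping the score under the estimated margin.
    return winner_wq - STEP, loser_wq + STEP
```

With these made-up constants, a 0.75-wq team playing a 0.25-wq team is estimated to win by 20. Win by 5 and `adjust_wq(0.75, 0.25, 5)` nudges the winner down and the loser up; win by 20 and nothing moves.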
When examining the data produced by this procedure, I was amazed that most of the winning margins, over eighty percent, actually fell within the estimate. That isn’t too surprising for teams that are ranked close together, where the estimated winning margin is set at six points (plus a fudge factor). But when you consistently hit when the estimated winning margin sits at twenty-four points…well, that shows that the rankings are holding true.
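Checking that kind of hit rate is simple enough to sketch. The Python below assumes each game is stored as an (estimated margin, actual margin) pair and that a "hit" means the actual margin landed within a fudge factor of the estimate. The function name, the default fudge value, and the sample rows are all made up for illustration, not real results.

```python
def margin_hit_rate(games, fudge=3.0):
    """Fraction of games whose actual margin fell within `fudge` points
    of the estimated margin. `games` holds (estimated, actual) pairs."""
    games = list(games)
    if not games:
        return 0.0
    hits = sum(1 for estimated, actual in games if abs(actual - estimated) <= fudge)
    return hits / len(games)


# Made-up rows, purely to show the call.
sample = [(6, 4), (6, 8), (24, 10), (24, 25)]
print(margin_hit_rate(sample))  # 3 of the 4 rows are hits
```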
The refined numbers result in some shuffling: nothing quite as dramatic as the difference between won/loss and wq, but enough that tournament selections could be affected.
A further examination reveals a 67% hit rate between my top sixty-four teams and those that made the NCAA tournament. It’s interesting to see the number of teams that are outside of the wq top twenty. It makes me wonder if perhaps the selections are a little too protective of the "power" conferences and should maybe dip into the well of the mid-tier conferences a little more.
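That comparison boils down to a set overlap. Here is a minimal Python sketch, assuming the hit rate is the share of my top-N teams that also show up in the tournament field (dividing by the field size instead would be an equally reasonable reading). The function name and the placeholder team names are hypothetical.

```python
def selection_hit_rate(ranked_teams, tournament_field, top_n=64):
    """Share of the top `top_n` ranked teams that also made the tournament.

    `ranked_teams` is ordered best-first; `tournament_field` is any
    collection of team names.
    """
    top = set(ranked_teams[:top_n])
    if not top:
        return 0.0
    return len(top & set(tournament_field)) / len(top)


# Placeholder names, not real teams or real results.
ranking = ["A", "B", "C", "D", "E"]
field = ["A", "C", "E", "Z"]
print(selection_hit_rate(ranking, field, top_n=4))  # A and C out of A, B, C, D
```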
There’s still more work to do, however. While I now have a foundation that’s based on a tweaked form of won/loss, it still doesn’t take into account the quality of opponents. In the next installment of this project, I’ll be exploring my options when it comes to R.P.I., intra- and inter-conference play, and other factors. At the moment, I’m thinking that using the generalized form of the NCAA’s R.P.I. combined with my adjusted wq scores might be the right compromise. But only testing will tell. We’ll see what happens.