A score was incorrectly added to a database that is used to rank football teams in the BCS system, which led to two rank reversals in the rankings (between the #10 and #11 teams and between the #17 and #18 teams). Many sports journalists are talking about all of the implications of the error (What if the error affected the top two ranked teams?!?). Only one of the six data sources is publicly available, so the extent of any data inaccuracies, or of any errors in the actual BCS formula, is unclear.
What is interesting is that this is being reported as a math error, but from what I have read, it is a data entry error. I was actually a little disappointed when I read about what happened, because I initially thought that the BCS algorithm was incorrectly used (it wasn’t, as far as anyone seems to know, but there is room for improvement there).
I have three thoughts:
- Sports journalists apparently do not understand the difference between mathematical formulas and the data used to populate those formulas. Throwing out the BCS formula for ranking teams would not eliminate this type of error, since data from experts (votes, points awarded, etc.) could be incorrectly tabulated and entered into a database. Such errors would be more difficult to spot than a missing score, since they would not be obvious.
- Having incorrect, missing, or inaccurate data is a part of life for many of us who analyze data. In every other type of industry or sector, people make big decisions with inaccurate and incomplete information, and life apparently goes on. When is the BCS data “good enough?” In NCAA football, accurate and complete data should be available for the BCS formula, and as a football fan, I do hope that every effort is made to collect good data. One missing data point isn’t evidence that there is a systematic problem. On the contrary, it sounds like the data set is pretty accurate and complete.
- How can we come to reasonable conclusions about the BCS system when the journalists who are supposed to inform us get the important stuff wrong? This story has become a non-story for me, as I have been reading articles written by people who are not really responding to the issue at hand (one inaccurate data point). There are surely other issues with the BCS system (such as not enough oversight), but replacing the BCS formula with another system would not necessarily bring more oversight.
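To make the first point above concrete, here is a minimal Python sketch of the difference between a formula and the data fed into it. Everything in it is made up for illustration: the team names, the scores, and the `rank_teams` function, which ranks by win fraction. It is not the BCS formula or any of its components. The formula is identical in both runs; only the data changes.

```python
from collections import defaultdict

def rank_teams(games):
    """Rank teams by fraction of games won (a toy rule, NOT the BCS formula)."""
    wins = defaultdict(int)
    played = defaultdict(int)
    for team_a, pts_a, team_b, pts_b in games:
        played[team_a] += 1
        played[team_b] += 1
        winner = team_a if pts_a > pts_b else team_b
        wins[winner] += 1
    # Highest win fraction first; ties broken alphabetically for determinism.
    return sorted(played, key=lambda t: (-wins[t] / played[t], t))

# Hypothetical season: (team_a, points_a, team_b, points_b)
complete = [
    ("Aggies", 28, "Cougars", 14),
    ("Aggies", 31, "Dukes", 10),
    ("Bears", 21, "Cougars", 17),
    ("Bears", 24, "Aggies", 20),   # the score that goes missing below
    ("Bears", 35, "Eagles", 7),
    ("Dukes", 27, "Bears", 13),
]

# Same data with one score never entered into the database.
missing_one = [g for i, g in enumerate(complete) if i != 3]

print(rank_teams(complete))
print(rank_teams(missing_one))
```

With the complete data, the Bears (3 of 4) rank ahead of the Aggies (2 of 3). Drop the one score and the Aggies (2 of 2) jump ahead of the Bears (2 of 3), even though the ranking rule never changed. Swapping in a different rule would not protect against that.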
The secrecy of the BCS formula seems to be one of the reasons for its unpopularity, yet it produces rankings that are virtually identical to other ranking systems. Is the bias against the BCS a result of most people not understanding math well? If the BCS formula were more transparent, would that even be a good thing? Do you consider a single data error a big deal?
But I could be wrong, of course.