With the National Championship game now set, I am certain that the results of the tournament to this point are not what the majority of us expected, which is no doubt reflected in bracket pools across the country.
If it is any consolation to those of us who had Cornell going out in the first round and Kansas winning it all, few brackets, including those of the best-known sports writers, fared much better. In fact, USA Today points out that only 12 ESPN.com brackets, out of 4.78 million entered, correctly picked the first eight teams to make the Sweet 16!
With nearly everyone’s bracket taking serious hits from ‘bracket busters’ this year, the question must be asked: what would it take to pick the perfect bracket?
Having left behind cricket and rugby two years ago to move from the UK to the US, I am probably not the best person with whom to strike up a casual conversation about basketball. However, one thing I am quite comfortable discussing is data. So when recently considering the data and analysis required to create the perfect bracket, I noted many similarities to issues I am dealing with professionally. For example, how do two major corporations integrate their databases after a merger? How can federal and international governmental agencies share information to prevent terrorist attacks or the spread of deadly pathogens?
What makes these questions similar is the immense volume of data that is needed to solve the problem. The data required to pick a perfect bracket is on a similar scale to the volume of data necessary to make certain aspects of our business, financial, health and public sectors work. Another similarity is that once the amount of data required to understand something (or, as is the case with our brackets, predict something) explodes, data integration, or connecting the data dots, so to speak, becomes a very important, if not essential, tool.
Consider each team’s regular season wins and losses, quality wins, individual player statistics, respective tournament histories, prior match-ups, injuries, possible home-court advantages, seeding/ranking, player-to-player match-ups… and so on. The data points generated from these variables alone would require the sort of fully integrated view of data that governments and some of the biggest corporations in the world regularly tap data integration experts to provide.
The truth is, without a unified view of the billions of individual data points for each of the 64 teams in the NCAA tournament, this mountain of information will not set you on the path to a perfect bracket, the odds of which are one in 9 million trillion. This is one reason why the experts, and possibly the Villanova alumnus in your office, have been so wrong. Without an integrated view of all the data, or the ability to connect all of the dots, it is nearly impossible to make sense of the mass of data.
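For the curious, the "one in 9 million trillion" figure can be checked with a few lines of Python. This is only a back-of-the-envelope sketch under my own assumptions: a 64-team single-elimination field, with every game treated as a 50/50 coin flip (which real match-ups certainly are not). The variable names are mine, purely for illustration.

```python
# Assumption: 64 teams, single elimination, every game a 50/50 coin flip.
TEAMS = 64

# Each game eliminates exactly one team, so crowning a champion takes TEAMS - 1 games.
games = TEAMS - 1

# Every game has two possible winners, so the number of distinct brackets is 2^games.
possible_brackets = 2 ** games

print(f"{games} games, {possible_brackets:,} possible brackets")
# 63 games, 9,223,372,036,854,775,808 possible brackets

# A "million trillion" is 10^6 * 10^12 = 10^18.
print(f"about {possible_brackets / 1e18:.1f} million trillion")
# about 9.2 million trillion
```

Anyone who knows even a little about the teams will do better than a coin flip on many games, so the true odds for an informed picker are shorter than this, though still astronomically long.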
But what happens when we start to use data integration to pick our brackets in the future?
For now, it is safe to say that the excitement and surprises of the tourney will be a spring fixture for years to come. On the other hand, it is far less certain whether UCSF, UCSC or Sonoma State making it to the Final Four in 2025 will be such a shocker.
(Also published on Silicon Valley Watcher: A Windsor Brit in the NCAA’s Court? The Science of Picking the Perfect Bracket)