For the last six months, Boris Chen has been a data scientist for The New York Times. He spends most of his time sifting through data to help the world-renowned newspaper better target its audience.
The information he gleans could help The Times retain a subscriber, or even add new ones. He does this with a slew of complex mathematical formulas, one of which is called a Gaussian mixture model, named after the 18th- and 19th-century German mathematician and physicist Carl Friedrich Gauss.
The model is essentially a clustering algorithm that finds natural breaks among data points. In his work, Chen employs what is known as a machine learning technique, which allows him to compare and contrast segments of readers. Machine learning is concerned with making predictions for the future, unlike statistics, which largely measures the past.
Chen, 26, has now applied this method to fantasy football, and the results could be the newest breakthrough owners look to in the growing world of fantasy sports.
Typically, Chen would work from a user’s browsing history, but for fantasy football he uses the comprehensive rankings from FantasyPros, a site that combines every major expert’s recommendations into one simple average. Chen takes those rankings and plots his findings on graphs. He then uses "R," a programming language used for statistical computing, to break down each skill position (no defenses or kickers) into tiers.
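For readers curious how a mixture model can turn a list of average ranks into tiers, here is a minimal sketch in Python. It is not Chen's code (he works in R), and everything in it is an assumption made for illustration: the player labels and average ranks are invented, the number of tiers is fixed at three, and the function names `fit_gmm_1d` and `assign_tiers` are hypothetical. It fits a one-dimensional Gaussian mixture with the standard expectation-maximization procedure, then numbers the resulting clusters from best average rank to worst.

```python
import math


def normal_pdf(x, mean, var):
    """Density of a normal distribution with the given mean and variance."""
    return math.exp(-((x - mean) ** 2) / (2 * var)) / math.sqrt(2 * math.pi * var)


def responsibilities(x, weights, means, variances):
    """Probability that x belongs to each mixture component."""
    p = [w * normal_pdf(x, m, v) for w, m, v in zip(weights, means, variances)]
    total = sum(p) or 1e-300  # guard against division by zero
    return [pi / total for pi in p]


def fit_gmm_1d(xs, k, iters=100, min_var=0.05):
    """Fit a k-component 1-D Gaussian mixture with EM.

    Returns (weights, means, variances). min_var is a floor that keeps
    a component from collapsing onto a single point.
    """
    n = len(xs)
    ordered = sorted(xs)
    # Start each mean at an evenly spaced quantile of the data.
    means = [ordered[int((i + 0.5) * n / k)] for i in range(k)]
    overall_mean = sum(xs) / n
    overall_var = sum((x - overall_mean) ** 2 for x in xs) / n
    # Heuristic starting spread: a fraction of the overall variance.
    variances = [max(overall_var / k**2, min_var)] * k
    weights = [1.0 / k] * k
    for _ in range(iters):
        # E-step: soft-assign every point to every component.
        resp = [responsibilities(x, weights, means, variances) for x in xs]
        # M-step: re-estimate each component from its weighted points.
        for j in range(k):
            nj = sum(r[j] for r in resp)
            if nj < 1e-12:
                continue  # leave an empty component unchanged
            weights[j] = nj / n
            means[j] = sum(r[j] * x for r, x in zip(resp, xs)) / nj
            var = sum(r[j] * (x - means[j]) ** 2 for r, x in zip(resp, xs)) / nj
            variances[j] = max(var, min_var)
    return weights, means, variances


def assign_tiers(avg_ranks, k):
    """Return a tier number (1 = best) for each average rank."""
    weights, means, variances = fit_gmm_1d(avg_ranks, k)
    # Components ordered by mean rank: lowest mean becomes tier 1.
    order = sorted(range(k), key=lambda j: means[j])
    tier_of = {component: tier + 1 for tier, component in enumerate(order)}
    tiers = []
    for x in avg_ranks:
        r = responsibilities(x, weights, means, variances)
        tiers.append(tier_of[max(range(k), key=lambda j: r[j])])
    return tiers


# Invented example data: (label, average expert rank). Not real rankings.
players = [
    ("RB_A", 1.3), ("RB_B", 2.1), ("RB_C", 2.8),
    ("RB_D", 7.9), ("RB_E", 8.6), ("RB_F", 9.4),
    ("RB_G", 15.2), ("RB_H", 16.0), ("RB_I", 16.9),
]
tiers = assign_tiers([rank for _, rank in players], k=3)
for (name, rank), tier in zip(players, tiers):
    print(f"{name}: average rank {rank}, tier {tier}")
```

In this toy data the gaps between groups are deliberately wide, so the tiers fall where the eye would put them. With real rankings the breaks are subtler, and choosing the number of tiers becomes a modeling decision in its own right.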
For instance, the consensus calls for Minnesota Vikings running back Adrian Peterson to be the No. 1 pick in every fantasy draft. And Chen agrees, since his model shows Peterson is in the first tier of running backs all by himself. However, all is not lost if an owner manages to snag a rusher in the fifth tier, like Matt Forte, of the Chicago Bears, or Steven Jackson, of the Atlanta Falcons. According to Chen, an owner wants at least two running backs above New York Giant David Wilson, who has an average rank of 21st overall and slots into the seventh tier of Chen’s graph.
In terms of receivers, there is an obvious drop-off after Detroit's Calvin Johnson, and another between the second and third tiers. That means an owner should not feel disappointed if they can't draft Dallas's Dez Bryant or Cincinnati's A.J. Green, but should at least aim for Denver’s Demaryius Thomas (second tier) or Tampa Bay’s Vincent Jackson or New York Giant Victor Cruz (the last two receivers in the third tier).
Chen has degrees in applied mathematics from Princeton and engineering from Caltech. He believes his method is unlike anything available to owners. Strangely, Chen didn’t even like football three years ago.
“I thought football rules were arbitrary,” Chen said in an interview with IB Times. “The thought of football as a sport just didn’t appeal to me.”
That all changed when Chen started playing fantasy football. First he joined one league, then the following year two, and this year he’s in three.
“Now it’s the sport I watch the most. Something about the scarcity of the games," Chen said. "It’s a common complaint about baseball how there’s 160 or 170 games a year, where football there’s 16 games a year.”
Chen won in his first year, and has been hooked ever since.
Keeping with the idea of “sharing is caring,” a common belief in the data science, technology, and hacker fields, Chen started a blog last month. He said fantasy sports are more fun when they’re competitive, and didn’t mind sharing his new method and strategy with owners. Chen felt it would have been a “crime” to not share his novel idea.
He has received a huge amount of positive feedback on Reddit. For the rest of the season, Chen plans on continuing his rankings on whom to start and sit, providing a helpful visual aid to novice and more experienced fantasy players.
Chen’s latest post involves the flex spot, where owners have the option to start another running back, wide receiver, or tight end. As an example of just how unpredictable fantasy football can be, Chen found that Tennessee Titans running back Chris Johnson, average draft position 19th overall, was ranked as the best starter for Week One.
Adrian Pereira is the owner of eDraft and Fantasy Football Café, two sites aimed at helping the 36 million fantasy football owners in North America, and he’s never seen anything like Chen’s analysis.
“Most people have seen players ranked side by side,” Pereira said, “but in this case you can actually see there are wider margins than most people would have anticipated.”
Pereira did point out that Chen’s breakdowns had Denver Broncos tight end Julius Thomas near the bottom. Thomas had five catches for 161 yards and two touchdowns in Denver’s 49-27 rout of defending champion Baltimore on Thursday night. But Pereira said that’s just one of the many variables out of the control of owners, a point Chen agreed with when it came to Broncos quarterback Peyton Manning being ranked as the third-best passer. Manning tossed a record-tying seven touchdown passes against the Ravens.
In terms of his own drafts, Chen doesn’t ever go with gut or instinct. He is a strict believer in being unbiased when it comes to the draft.
“My main draft belief is: what others pick, dictates who you pick,” he said. “It’s better to take advantage of other people’s reaching.”
Chen lucked out in one of his leagues, and had the No. 1 overall pick. He took Peterson. That helped him avoid picking in the No. 2 to No. 5 range, a spot Chen said was the most difficult this year, with lots of doubt surrounding every rusher after Peterson.
One player Chen believes in is New England Patriots tight end Rob Gronkowski. Chen thinks concerns over Gronkowski’s forearm and back injuries aren’t enough to pass on the possibility of his voluminous production on the field. Chen also likes Cleveland Browns receiver Josh Gordon, but is a little skeptical of Atlanta’s Steven Jackson, due to his advanced age.
But beyond the usual pitfalls owners face, such as injuries, over-reaching, or just pure bad luck, Chen has another worry: he found that many of the owners in his leagues were using his methods and blog for research.