Novak Djokovic of Serbia returns a shot to Roberto Bautista Agut of Spain during their fourth-round match at the U.S. Open Championships tennis tournament in New York, on Sept. 6, 2015. Reuters

In the summer of 2014, teenage tennis prodigy Noah Rubin was preparing for his appearance in the Wimbledon boys’ singles tournament, and Lawrence Kleger, his coach of more than a decade, was searching for an edge. Using data-driven analysis of match footage, Kleger noticed Rubin was struggling with his “serve+1” -- tennis lingo for the server's first shot after the serve: serving the ball, quickly readying for the opponent's return and then hitting that ball into the open court to win the point.

Strong serve+1 technique is crucial to effective play at high-level tournaments, and Rubin was finding it difficult to execute with any consistency. Analytics helped Rubin’s team identify the issue. By the time Rubin arrived at the All England Club in London, his onetime weakness had turned into a strength. He won the boys’ singles tournament in 2014 and joined the likes of Roger Federer and Björn Borg as a junior champion.

“At Wimbledon, he must have done that successfully, had to be like 90 percent of the time,” said Kleger, who also serves as director of tennis at the John McEnroe Tennis Academy in New York. “That became his go-to play. He started serving out wide a lot better, because he was a lot more confident in his second shot.”

Rubin is one of several tennis players who have joined pro athletes in baseball, football and basketball in using data analysis to improve play. But overall, the tennis world has been slow to adopt advanced analytics. As a sponsor of this week’s U.S. Open and other Grand Slam events, technology firm IBM is leading the charge with SlamTracker, an application that draws on 41 million data points to provide fans and players with real-time statistical match analysis. But SlamTracker is limited to IBM-sponsored tournaments, so players at all levels are increasingly turning to expensive private firms for their analytics needs. Inaccurate data and prohibitive costs are limiting the benefits these programs can offer.

The Rise Of Big Data

In just a few decades, advanced analytics rose from a sideshow-like novelty to a crucial element in how professional sports franchises build their rosters, offer contracts and prepare for games. This transformation is especially evident in professional baseball, thanks to the efforts of sabermetrics pioneer Bill James, as well as author Michael Lewis’ 2003 book “Moneyball,” which chronicled Oakland Athletics General Manager Billy Beane’s use of advanced statistics to objectively rate baseball performance. Once-experimental statistics like wins above replacement (WAR) or batting average on balls in play (BABIP) are now as important to baseball executives as home runs.

Pro basketball and football have made similar strides. Any fantasy football player or viewer of HBO’s “Hard Knocks” series has an idea about the NFL community’s obsession with statistics and game film. In the NBA, more and more teams are turning to advanced data to assess player efficiency and dictate game plans. Teams are attempting more three-pointers and fewer two-point jump shots because statistical analysis suggests that shot mix wins more games. Players are rewarded with new contracts for statistical efficiency, not just points per game.

The tennis world has been far slower on the uptake, held back by notoriously spotty record keeping and the lack of a team dynamic. Unlike Major League Baseball (MLB) or the NFL, where play is scheduled and strictly regulated by a central body, tennis organizations like the Association of Tennis Professionals (ATP) allow individual tournaments a large degree of freedom to manage their own affairs. Tennis scorecards provide little in the way of context, and they aren’t compiled in a way that’s centralized or easily accessible to the public.

“For years, tennis so trailed behind some of the other sports,” Kleger said. “I mean, football, the quarterback goes off the field and they hand them this book [of data] that looks like an encyclopedia.”

SlamTracker and Tennis Data Analysis

Tennis professionals are growing more vocal about the sport’s previous apathy toward analytics, but progress has been slow. Starting in 2006, some tournaments, including the U.S. Open, implemented “Hawk-Eye” technology, a multicamera system that records player and ball movement during matches. But data gleaned from that system remains private, to be doled out at the discretion of officials from individual tournaments.

IBM’s efforts at the U.S. Open are a step forward, in terms of both breadth and accessibility. Through a partnership with the United States Tennis Association, IBM offers real-time statistics to tennis fans in attendance at U.S. Open matches and across six digital platforms, including the U.S. Open’s official smartphone app and website. A total of 15 million fans interact with IBM-driven platforms during the event, according to the company.

By checking out IBM’s service, U.S. Open fans can access eight years of data from Grand Slam events, as well as real-time match statistics. SlamTracker also boasts a Keys to the Match feature, which uses the system’s purported bank of 41 million data points to predict which aspects of a player’s game will determine his or her performance in a given match.

Players and coaches who participate in the tournament also have access to IBM’s data. Within 30 minutes of a match’s conclusion, players can access the company’s statistical analysis of their performance, which is linked to actual video of the contest. For example, a coach can ask to see all of his player’s forehand winners, in order to determine strategy for future matches.

SlamTracker provides an exhaustive level of data for fans and players alike, but it also comes with some restrictions. IBM only tracks data from Grand Slams, not all tournaments. Players can only access their own matches, which limits the program’s effectiveness as a scouting tool. And IBM doesn’t make the raw data it amasses available to the public.

“If you think of the USTA as a client of ours -- it’s really a partnership -- our mission here is to help the USTA grow the game of tennis by using technology. In so doing, we create, basically, a case study in front of the world on what IBM technology can do, where we provide the infrastructure, the security and the analytics,” said Elizabeth O’Brien, IBM’s strategy and marketing leader for sports and entertainment sponsorships. “It’s all built on the same software, hardware services that we would bring to a client in any industry around the world.”

O’Brien declined to comment on possibly extending the SlamTracker system to other tournaments and venues, citing the limits of IBM’s sponsorship deal with the USTA. IBM sponsors the four Grand Slam tournaments and the China Open.

A Work In Progress

IBM’s SlamTracker technology is one of the most expansive data-related undertakings in tennis history, but questions about its real value remain. Most of the criticism has centered on its Keys to the Match feature, which IBM touts as an example of its “predictive analytics.” A 2013 Wall Street Journal analysis by Carl Bialik, who now writes for ESPN’s data-driven FiveThirtyEight blog, found losing players achieved as many or more statistical “keys” to victory as winning players in about one-third of matches. That’s not a great success rate for IBM.
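The check described above boils down to a simple count. The sketch below is illustrative only: the `MatchKeys` record, its field names and the sample numbers are all hypothetical, and it does not reproduce the Journal's actual method or IBM's data format.

```python
from dataclasses import dataclass


@dataclass
class MatchKeys:
    """Hypothetical record: how many 'keys' each player met in one match."""
    winner_keys_met: int
    loser_keys_met: int


def share_loser_matched_winner(matches):
    """Fraction of matches in which the loser met as many or more keys as the winner."""
    if not matches:
        raise ValueError("no matches to evaluate")
    flagged = sum(1 for m in matches if m.loser_keys_met >= m.winner_keys_met)
    return flagged / len(matches)


# Made-up sample data, for illustration only.
sample = [
    MatchKeys(3, 1),
    MatchKeys(2, 2),  # loser tied the winner
    MatchKeys(1, 2),  # loser met more keys than the winner
    MatchKeys(3, 0),
    MatchKeys(2, 1),
    MatchKeys(3, 2),
]

share = share_loser_matched_winner(sample)
print(f"Loser matched or beat the winner on keys in {share:.0%} of matches")
```

A high share here would suggest the keys do little to separate winners from losers.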

Jeff Sackmann, a noted baseball and tennis analytics expert, founded Tennis Abstract five years ago after he noticed a void in publicly accessible tennis data. On their website, Sackmann and his team of volunteers have compiled years of raw data and statistics for use in identifying player trends and debunking tennis myths.

While Sackmann praised IBM’s comprehensive effort at gathering match data at Grand Slam events, he, like Bialik, has concerns about the program’s reliability. He’s also skeptical about just how valuable SlamTracker is to the growth of data analysis in tennis.

“Ultimately, they aren’t doing much with the data they have, and they aren’t releasing any data to the public,” said Sackmann. “So their efforts are mostly geared toward marketing themselves and creating shining graphics for TV broadcasts. I’m not in a position to judge the success of the latter, but they aren’t doing much for tennis analytics in general.”

An Expensive (But Valuable) Service

As more tennis players begin to recognize the value of big data, some professionals, including longtime instructor Warren Pretorius, have succeeded in monetizing analytics services. Pretorius played college tennis at Weber State University in Utah and later became a tennis consultant for Dartfish, a Swiss video-analysis company that once worked with Billy Beane of “Moneyball” fame, among many other clients.

These days, Pretorius runs Tennis Analytics, a service that, like IBM, combines statistical analysis with actual match film. Pretorius’ past list of professional clients includes Novak Djokovic, Maria Sharapova and Grigor Dimitrov. He also works with several NCAA tennis programs, and with Kleger and Rubin at the John McEnroe Tennis Academy.

Pretorius works with a staff of 12, including three full-time employees, to log countless hours of tennis footage into databases for his clients. Rather than sitting through an entire three-hour match, coaches can simply view footage sorted by whatever mind-bogglingly specific parameters they wish.

“They can say, ‘I want to take a look at all the 30-0 points where my player served a second serve wide and the return was cross-court and short. I just want to see those points.’ And they can view just that,” Pretorius said.
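A query like the one Pretorius describes amounts to filtering tagged point records. The following is a minimal sketch of that idea, not Tennis Analytics' actual software; the `Point` fields, the tag values and the sample rows are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Point:
    """Hypothetical tags logged for a single point, plus its place in the video."""
    score: str             # score when the point started, e.g. "30-0"
    serve_number: int      # 1 = first serve, 2 = second serve
    serve_direction: str   # e.g. "wide", "body", "T"
    return_direction: str  # e.g. "cross-court", "down-the-line"
    return_depth: str      # e.g. "short", "deep"
    video_start_s: float   # offset into the match video, in seconds


def find_points(points, **criteria):
    """Return the points whose tags equal every field=value pair given."""
    return [
        p for p in points
        if all(getattr(p, field) == value for field, value in criteria.items())
    ]


# Made-up sample rows, for illustration only.
points = [
    Point("30-0", 2, "wide", "cross-court", "short", 512.0),
    Point("30-0", 1, "wide", "cross-court", "short", 840.5),
    Point("15-30", 2, "T", "down-the-line", "deep", 1203.0),
]

clips = find_points(
    points,
    score="30-0",
    serve_number=2,
    serve_direction="wide",
    return_direction="cross-court",
    return_depth="short",
)
for p in clips:
    print(f"Jump to {p.video_start_s:.1f}s in the match video")
```

Storing a video offset with each tagged point is what lets a coach skip straight to the matching clips instead of watching the full match.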

Tennis Analytics also implements expansive quality-control checks to ensure the accuracy of the information it provides to clients, Pretorius said. Would-be employees of the company have to achieve 97 percent accuracy on a collection of “control matches,” correctly identifying instances such as forehand winners and unforced errors, before they are allowed to assess client data.
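One way to picture that screening step is as a comparison between a trainee's tags and a trusted set of tags for the same points. The sketch below assumes accuracy means the share of points tagged identically to the reference, which the company does not spell out; the function names and sample labels are hypothetical.

```python
REQUIRED_ACCURACY = 0.97  # threshold cited by Pretorius


def tagging_accuracy(reference, candidate):
    """Share of points where the candidate's tag equals the reference tag."""
    if not reference or len(reference) != len(candidate):
        raise ValueError("tag lists must be non-empty and the same length")
    agreed = sum(1 for ref, cand in zip(reference, candidate) if ref == cand)
    return agreed / len(reference)


def passes_control(reference, candidate, threshold=REQUIRED_ACCURACY):
    """True if the candidate meets the accuracy threshold on a control match."""
    return tagging_accuracy(reference, candidate) >= threshold


# Made-up control match of 100 tagged points, for illustration only.
reference = ["forehand winner", "unforced error", "ace", "backhand winner"] * 25

trainee = list(reference)
trainee[10] = "unforced error"   # reference tag here is "ace"
trainee[11] = "forehand winner"  # reference tag here is "backhand winner"

print(f"Accuracy: {tagging_accuracy(reference, trainee):.0%}")
print("Cleared for client work:", passes_control(reference, trainee))
```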

Analysis of this depth can be expensive, even though Pretorius argues Tennis Analytics is actually one of the cheaper services on the market. Most NCAA schools operate on razor-thin athletic budgets, with many reliant on public subsidies to survive. In that environment, it can be difficult to justify paying thousands of dollars for video software.

College teams can buy 200 hours of video analysis for $11,650. Individual player rates run $555 for 10 hours. Services for professional players or tennis academies, which often ask for data on an even deeper level, are more expensive, but Pretorius would not discuss specifics.
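For comparison, the quoted packages can be reduced to a per-hour rate. This is a rough sketch that assumes the figures are flat package prices with no additional fees, which the company did not confirm; `hourly_rate` is a hypothetical helper.

```python
def hourly_rate(total_price_usd, hours):
    """Price per hour of video analysis for a flat-priced package."""
    if hours <= 0:
        raise ValueError("hours must be positive")
    return total_price_usd / hours


college = hourly_rate(11_650, 200)  # college team package
individual = hourly_rate(555, 10)   # individual player package

print(f"College package: ${college:.2f} per hour")
print(f"Individual package: ${individual:.2f} per hour")
```

On those assumptions, the college package works out to $58.25 per hour and the individual package to $55.50 per hour.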

At the John McEnroe Tennis Academy, Kleger said video analysis is simply too expensive to purchase for every player. The academy is still trying to figure out how best to use its resources, weighing the costs of programs like Tennis Analytics against their demonstrable on-court benefits. Right now, top players like Rubin, who turned pro earlier this summer, are given priority.

“As the game progresses, the margins between players have become so minute that every little edge that you can get is going to be important,” Kleger said. “I think analytics are going to be a big part of that.”