From PitchF/X to Now: The Changing World of Sports Tracking

The way things change and progress these days, seven years can seem like a lifetime. When viewed through certain filters, 2007 looks like ancient history. Our laptops were huge, our smartphones were pretty dumb, the Bears were good at football. A lot has changed. Our interactions with sports, the ways we observe and learn about our games, have also taken enormous leaps. The level of analysis available to us is at an all-time high. And as recent news suggests, it’s only going to get better. What we take for granted now may have seemed like a pipe dream seven years ago, but it was a technology that arrived in 2007 that let us peek into the future.

Baseball has always had statistics. From Henry Chadwick’s box scores to the early Bill James annuals, baseball latched onto numbers more than any other sport. But with those numbers came a thirst for more: more ways to dissect and prod and massage statistics in search of new insight. Though the work of so-called sabermetricians was helping push our understanding of the numbers we had, there was still room for more. And in 2007, we got it.

MLB, in partnership with Sportvision, began installing special cameras in ballparks to track pitches. Dubbed Pitchf/x, these two-camera systems began collecting data on things like trajectory, release point, and speed of pitches. Based on certain criteria, pitch types could be inferred from that data. The system was tested during the 2006 playoffs and turned on league-wide in 2007. It was meant, in part, to power MLB’s Gameday feeds, allowing the data to be shown on screen to fans following a game online. In either a stroke of genius or a happy accident, MLB published this data in a publicly available XML feed. It didn’t take long for people to figure out a way to collect, collate, and relay the data to any fan who had interest. Sabermetrics was about to get a huge shot in the arm.

Smash-cut to present day, and pitch data can be found on a myriad of baseball sites. Analysts are using it to prove or disprove theories, looking at everything from speeds and pitch counts to pitch usage and time between pitches. We can easily look up the rate at which Miguel Cabrera whiffs on a slider, and just how effective Clayton Kershaw’s curveball is. Sites like FanGraphs and Brooks Baseball offer in-depth charts and graphs, while newcomers like Baseball Savant allow users to run amazingly granular searches in a matter of seconds. Want to know what pitches Andrew McCutchen likes to blast for homers? Done. How good is Corey Kluber’s command? Here’s a heat map. Pitchf/x has allowed analysts to do groundbreaking research on subjects like pitch framing: how a catcher can receive a ball to make it look like a strike to the umpire. We can know so much now, and it’s only going to get better.

In the spring of 2014, Major League Baseball Advanced Media (MLBAM) announced an all-new system for collecting data from the field. And by “from the field” I mean THE WHOLE FIELD. The product we now know as Statcast will reportedly be looking at everything at once: the pitcher, batter, fielders, and baserunners. These systems have been slowly rolling out, but the plan is to have all 30 ballparks up and running come 2015. This will be a big boon to the world of fielding metrics, where there has been a good deal of contention over current methods. A standardized way to observe and quantify fielding should help centralize the corresponding stats. That is, of course, if we can get to the data.

The problem with gathering all that data? There’s so darn much of it. Some reports suggest that about seven terabytes per game isn’t out of the question. For big corporations like MLBAM, this isn’t too much of a problem. For the average fan at home or the freelance analyst, it’s a big problem. Over one season, as much as 17,000 TB can be in play. Cost and space concerns abound with those kinds of numbers. This means we’ll have to rely on MLB to boil down the information and present it to us in a compilable and digestible way. What happens in 2015 remains to be seen, but no matter what we get our grubby fingers on, it’s hard not to get excited.
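The season-long storage figure is easy to sanity-check. Here is a back-of-the-envelope sketch, assuming a 30-team league in which every club plays a 162-game regular season and every game produces the reported seven terabytes. The function name and the decision to ignore the postseason are my own simplifications, not anything MLBAM has published.

```python
# Rough storage estimate for one regular season of full-field tracking data.
# Assumptions (not from MLBAM): 30 teams, 162 games each, no postseason,
# and a flat 7 TB per game as suggested by the reports cited above.

TEAMS = 30
GAMES_PER_TEAM = 162


def season_terabytes(tb_per_game: float) -> float:
    """Return the total terabytes generated over one regular season."""
    # Each game involves two teams, so divide the team-game count by two.
    games_in_season = TEAMS * GAMES_PER_TEAM // 2
    return games_in_season * tb_per_game


if __name__ == "__main__":
    print(f"Estimated season total: {season_terabytes(7):,.0f} TB")
```

Under those assumptions the schedule works out to 2,430 games, which puts the total right around the 17,000 TB mark.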

But the “mo’ data, mo’ problems” issue isn’t limited to baseball. Basketball is dealing with it as well. The NBA installed its own tracking system in 2014, provided by SportVU. It uses cameras to create a constant stream of game data, tracking every movement of every player on the floor.

“Each arena has cameras (4-6) that capture each player, referee and the ball in XYZ coordinates and tracks them something like 25 times per second,” says Blake Murphy, a writer and NBA editor for theScore. “From there, the basic info is running distances and speed, player interaction, dribbles and passes.”
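To see how “running distances and speed” fall out of raw coordinates, here is a minimal sketch. It assumes a player’s track arrives as a list of (x, y, z) tuples sampled at exactly 25 times per second, with coordinates in feet. The function names, the units, and the data layout are all hypothetical; the actual feed format is not described here.

```python
import math

# Sampling rate quoted above ("something like 25 times per second");
# treated here as exact for illustration.
SAMPLES_PER_SECOND = 25


def running_distance(positions):
    """Sum the straight-line distance between consecutive samples."""
    return sum(math.dist(a, b) for a, b in zip(positions, positions[1:]))


def average_speed(positions, hz=SAMPLES_PER_SECOND):
    """Average speed in coordinate units per second over the whole track."""
    if len(positions) < 2:
        return 0.0  # not enough samples to measure any movement
    elapsed_seconds = (len(positions) - 1) / hz
    return running_distance(positions) / elapsed_seconds


if __name__ == "__main__":
    # Hypothetical track: three samples for one player, coordinates in feet.
    track = [(0.0, 0.0, 0.0), (3.0, 4.0, 0.0), (3.0, 4.0, 0.0)]
    print(running_distance(track), average_speed(track))
```

Real tracking data is noisy, so a production version would likely smooth the positions before summing, or the jitter would inflate the distance totals.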

And while teams are busy digging through this data and keeping their findings to themselves, some advancements, like expected possession value (EPV), have been made public. Murphy sees a whole lot more coming.

“My guess would be that there’s more utility on defense currently. Things like close-outs, shot contests and defensive assignments can be measured with relative ease and can do a better job explaining how a defense did than just results.”

While basketball is just getting off the ground with its data revolution, a perhaps unexpected sport has already been building tracking into its game: the PGA. Its partnership with CDW and the ShotLink system has brought on a plethora of stats for fans and players alike to digest. A quick scroll through the PGA’s stats page shows just how much data is being collected and processed. ShotLink has been around since 2001, but it’s clear that the craving for stats that is showing up in other sports is pushing the PGA to offer more online.

Not to be outdone by golfers, the NHL is also introducing tracking tech into their game. Details are sketchy as to which company will actually do the tracking, and nothing has really gone past the “we’re planning on planning on testing it” phase, but sports writer Neil Paine sees some big things coming from a fully-implemented system.

“I think the most long-awaited aspect of this system is that it can tell us precise times of possession for each team,” says Paine. “Over the past half-decade or so, statisticians have realized that possession is incredibly important in hockey, yet we’ve had to resort to using imperfect measurements like Corsi and Fenwick, which estimate possession using shots and offensive events.

“In addition, this technology will allow precise tracking of things like expected goals by shot location, dump-and-chase situations and carry-ins, defensive positioning in each zone, and a bunch of other things that, heretofore, devoted fans were having to track by hand.”
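For readers unfamiliar with the proxies Paine mentions, here is a small sketch of how they are usually computed. Corsi counts all shot attempts (shots on goal, missed shots, and blocked shots), while Fenwick leaves out the blocked ones. The class and function names are my own, and the numbers in the example are made up.

```python
from dataclasses import dataclass


@dataclass
class ShotAttempts:
    """One team's shot attempts in a game (hypothetical container)."""
    on_goal: int   # shots on goal, including goals
    missed: int    # attempts that missed the net
    blocked: int   # attempts blocked by the opposing team


def corsi(attempts: ShotAttempts) -> int:
    """All shot attempts: on goal + missed + blocked."""
    return attempts.on_goal + attempts.missed + attempts.blocked


def fenwick(attempts: ShotAttempts) -> int:
    """Unblocked shot attempts: on goal + missed."""
    return attempts.on_goal + attempts.missed


def share(events_for: int, events_against: int):
    """Fraction of events belonging to one team, or None if there were none."""
    total = events_for + events_against
    return events_for / total if total else None


if __name__ == "__main__":
    # Made-up totals for illustration only.
    home = ShotAttempts(on_goal=30, missed=12, blocked=18)
    away = ShotAttempts(on_goal=25, missed=10, blocked=5)
    print("Corsi share:", share(corsi(home), corsi(away)))
    print("Fenwick share:", share(fenwick(home), fenwick(away)))
```

The share is what stands in for possession: a team taking most of the shot attempts is presumed to have had the puck most of the time, which is exactly the estimate that direct tracking would replace.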

Up until even a few months ago, this story would have ended right about here, but wouldn’t you know it, the NFL is getting in the mix too. At the end of July, the league put out a press release announcing its new partnership with Zebra Technologies. The goal is to use quarter-sized RFID chips in players’ pads, on officials, and in yardage sticks to follow every movement on the field. This has great potential for enhancing the game footage sessions the NFL is so famous for, as well as providing the fans with even more info. I can only imagine the boom of fantasy football analysis that will come if and when this information is made public.

Nearly 150 years passed between the advent of the box score and the introduction of Pitchf/x data. It took over a century to make that pivotal leap. The way things are progressing now, a comparable leap might be achievable within 10 years. We are continually at the forefront of a revolution and on the precipice of a sort of sports singularity. Soon, we’re all going to be kids again, figuring out how to read a box score and being amazed.

David G. Temple is the Managing Editor of TechGraphs and a contributor to FanGraphs, NotGraphs and The Hardball Times. He hosts the award-eligible podcast Stealing Home. Dayn Perry once called him a "Bible Made of Lasers." Follow him on Twitter @davidgtemple.

2 Comments
Daniel Rodríguez
9 years ago

Hey David, congratulations for the new site. Very interesting.
I’m a frequent listener of stealing home.
I run also a just created blog, called Graficos y Metricas, in Spanish, about analysis of baseball presented in Graphs and Metrics.
If you want, you can follow me at @gmbeisbol and visit my site at http://graficosymetricas.wordpress.com

Best Regards,

Daniel

Jonathan
9 years ago

how about something like this https://bitcasa.com/platform/how-it-works