Data analytics often proves a tough concept for people to grasp. However, one sport has helped many people understand the power behind interpreting large sets of information: Major League Baseball.
Baseball is the oldest professional sport in the United States. The game provides a treasure trove of data for both sports historians looking at trends and current teams looking for a competitive edge.
Part of the allure is the sheer size of available data. Major League Baseball has been in existence since 1903. That’s 114 years of baseball and more than 200,000 games.
That’s a great data set for looking at overall trends. But baseball teams also now use data to make decisions within the games themselves. To do so, they use many different tools and analytical approaches to track important in-game information.
This approach has helped teams such as the Chicago Cubs and Boston Red Sox win championships. It has also helped smaller-market clubs such as the Tampa Bay Rays, Oakland Athletics and Pittsburgh Pirates remain competitive.
And it has opened up a whole new career field for those who want to earn a data analytics degree.
Data Analytics Tools
Scour the Internet for hours. Call all 30 Major League clubs. Read a book or two about data analytics in baseball. Even with all that, the secrets of baseball analytics will remain secret. With every club now committed at some level to analytics, no one wants to talk about exactly what they are doing.
Still, some things are known, and they point to a level of analysis unrivaled in sports.
Collecting Data
Teams use high-resolution cameras to capture dozens of data points during a game. These include:
- A player’s base-to-base running speed
- Pitching velocity
- Exit velocity of home runs
- The spot where every batted ball lands
- The position from which a fielder moves to catch a ball traveling to another point (and whether the attempt is a success or a failure)
- The spin of each pitched ball
- The location of each pitched ball
- The pitcher’s arm angle
- The position of every defensive player on every pitch
Teams get reports every morning from Statcast, the MLB-owned data service that provides teams with play-by-play statistics. Most teams also seek outside information to get an edge on other franchises.
These analytical tools include Trackman, which offers finely grained information on pitchers. That includes the spin rate of the ball and the exact height and angle of the pitch release point.
This has spilled over to the fan experience as well. Anyone watching a game on TV has seen PITCHf/x, which provides a grid showing where a pitch landed in or out of the strike zone. It also provides velocity and movement information for each pitch.
But all of that is just the start.
Interpreting Data
As any data analytics student knows, collection is just the first step. While it’s important to have strict guidelines for collection, it’s equally important to have a strategy for approaching data analysis.
Give a class of analytics students data from an entire baseball season and they will all come up with different interpretations and strategies. The same is true of the 30 Major League clubs.
In baseball, analysis has led to defensive shifts that match where data says a batter usually hits the ball. It also has led to batters changing their swings to get more lift on the ball. Data shows a ball hit in the air is typically more productive than one hit on the ground.
It also has led statisticians to develop a whole range of new barometers to assess a player’s performance. Gone are the days of judging pitchers on ERA and batters on batting average. The new measurements include:
ERA+
Earned run average is calculated by multiplying the total number of earned runs allowed by nine and dividing by innings pitched. ERA+ goes further: it adjusts ERA for factors such as ballpark dimensions and scales the result so that 100 represents the league average.
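The ERA arithmetic is easy to sketch in code. The Python below is a minimal illustration, not an official formula: the `era` function follows the calculation described above, while `era_plus` is a simplified version that assumes a league-average ERA and a single park-factor multiplier are available. The function names, the `league_era` and `park_factor` parameters, and the sample numbers are all hypothetical.

```python
def era(earned_runs: int, innings_pitched: float) -> float:
    """Earned runs allowed per nine innings.

    innings_pitched must be a true fraction (6 and 1/3 innings is 6.333...),
    not the box-score shorthand of 6.1.
    """
    if innings_pitched <= 0:
        raise ValueError("innings_pitched must be positive")
    return earned_runs * 9 / innings_pitched


def era_plus(pitcher_era: float, league_era: float, park_factor: float = 1.0) -> float:
    """Simplified ERA+ (an assumption, not the official calculation).

    Scales the league ERA by a park factor (above 1.0 for a hitter-friendly
    park) and compares it with the pitcher's ERA. 100 is league average;
    higher is better.
    """
    if pitcher_era <= 0:
        raise ValueError("pitcher_era must be positive")
    return 100 * league_era * park_factor / pitcher_era


# Made-up season line: 60 earned runs over 180 innings.
season_era = era(earned_runs=60, innings_pitched=180.0)
print(f"ERA:  {season_era:.2f}")
print(f"ERA+: {era_plus(season_era, league_era=4.50):.0f}")
```

With these invented numbers, the pitcher allows three earned runs per nine innings, and in a league that averages 4.50 that works out to a simplified ERA+ of 150.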
Fielding Independent Pitching
Fielding Independent Pitching is expressed on the same scale as ERA, but it factors in only the events pitchers can control. Those include home runs allowed, walks and strikeouts. This prevents pitchers from getting credit for runs prevented by the defense behind them.
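For readers who want to see the idea in numbers, here is a rough Python sketch of the commonly published FIP formula. The weights (13 for home runs, 3 for walks and hit batters, 2 for strikeouts) follow the widely cited public version; the article itself does not give them. The `constant` value of 3.10 is only an assumed placeholder, since the real constant is recalculated each season to put FIP on the ERA scale. The function name and the sample stat line are hypothetical.

```python
def fip(
    home_runs: int,
    walks: int,
    strikeouts: int,
    innings_pitched: float,
    hit_by_pitch: int = 0,
    constant: float = 3.10,  # assumed placeholder; varies by season
) -> float:
    """Fielding Independent Pitching, using the commonly published weights.

    Only outcomes that do not involve the fielders are counted:
    home runs, walks, hit batters and strikeouts.
    """
    if innings_pitched <= 0:
        raise ValueError("innings_pitched must be positive")
    weighted_events = (
        13 * home_runs
        + 3 * (walks + hit_by_pitch)
        - 2 * strikeouts
    )
    return weighted_events / innings_pitched + constant


# Made-up season line for illustration only.
print(f"FIP: {fip(home_runs=20, walks=50, strikeouts=200, innings_pitched=200.0, hit_by_pitch=5):.2f}")
```

Strikeouts pull the number down and home runs push it up sharply, which is why a high-strikeout pitcher can post a FIP well below his ERA when the defense behind him is poor.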
On-Base Plus Slugging
OPS adds a batter’s on-base percentage, which measures how often he reaches base, to his slugging percentage, which is the total number of bases he reaches on hits divided by at-bats. Elite players have an OPS of .900 and above.
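The OPS calculation can be sketched in a few lines of Python. This is a minimal illustration using the standard definitions of on-base percentage and slugging percentage; the function names and the sample stat line are made up for the example.

```python
def on_base_pct(hits: int, walks: int, hit_by_pitch: int,
                at_bats: int, sac_flies: int) -> float:
    """Times on base divided by the plate appearances that count toward OBP."""
    denominator = at_bats + walks + hit_by_pitch + sac_flies
    if denominator <= 0:
        raise ValueError("no qualifying plate appearances")
    return (hits + walks + hit_by_pitch) / denominator


def slugging_pct(hits: int, doubles: int, triples: int,
                 home_runs: int, at_bats: int) -> float:
    """Total bases on hits divided by at-bats."""
    if at_bats <= 0:
        raise ValueError("at_bats must be positive")
    singles = hits - doubles - triples - home_runs
    total_bases = singles + 2 * doubles + 3 * triples + 4 * home_runs
    return total_bases / at_bats


def ops(hits: int, doubles: int, triples: int, home_runs: int,
        walks: int, hit_by_pitch: int, at_bats: int, sac_flies: int) -> float:
    """On-base percentage plus slugging percentage."""
    return (
        on_base_pct(hits, walks, hit_by_pitch, at_bats, sac_flies)
        + slugging_pct(hits, doubles, triples, home_runs, at_bats)
    )


# Made-up season line for illustration only.
season_ops = ops(hits=180, doubles=40, triples=5, home_runs=30,
                 walks=70, hit_by_pitch=5, at_bats=550, sac_flies=5)
print(f"OPS: {season_ops:.3f}")
```

The invented stat line above lands comfortably past the .900 mark, so this hypothetical hitter would fall in the elite range.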
Wins Above Replacement
This measurement involves a complex formula used to judge how many more wins a player contributes than a replacement-level player would, meaning the kind of readily available substitute a team could call up from the minor leagues.
All of the above explains why many teams now have large and growing staffs of data analysts. For those who want to earn a degree in data analysis, baseball has opened up a whole new area of employment opportunity.