Kirk Harrington, SAEG partner
Kirk, SAEG: How long have you been an analyst with the Cleveland Indians? I understand that you are the Lead analyst, correct? How many analysts report to you?
KW: I've been with the Indians for a little over 6 years. My title is Director of Baseball Analytics. When I started, I was a one-person department, but I now have a couple of analysts working directly for me.
Kirk, SAEG: How did you come to be an analyst for baseball? What do you find most rewarding about the work you do?
KW: I started my professional career after college as a software developer in Silicon Valley. I pursued a master's degree while working, and studied baseball statistics as a hobby. In the late 1990s, I got involved with a group of people on the Internet who were publishing a baseball research book along the lines of the old Bill James Baseball Abstracts. That group, Baseball Prospectus, grew to become a well-known presence in the baseball industry, and I ended up co-authoring about 10 books, developing statistical reports for the web site, and pursuing new baseball research. That body of work eventually came to the attention of the Indians, and I was able to change my hobby into my career.
Kirk, SAEG: What type of models do you/your group run regularly? How often and what would trigger the creation of a new model?
KW: Linear and logistic regression are still the stalwarts of most of the analyses we do, but we've also tackled hierarchical models, local regression and other nonparametric statistics, as well as more machine learning approaches such as neural nets, SVMs, etc. My group works exclusively for the baseball operations group, so all of our work is directed towards decisions our front office and coaching staff have to make about the team on the field. A lot of effort is spent trying to improve our forecasts of player performance in future years, but we also research in-game strategies (e.g. the best time to try to steal a base, or bring in a reliever), evaluate college and high school players for the amateur draft, assess the impact of changes in the Collective Bargaining Agreement, and so on.
Kirk, SAEG: What and how are you collecting the information used for your models?
KW: We get data from a variety of sources. From Major League Baseball and some other vendors, we receive a data feed containing detailed play-by-play information about every game played in the majors and minors every night. That includes the names of the pitcher, batter, and every fielder, baserunner, and umpire on the field, the inning, number of outs, score, and the result of each play. In many cases we also get information about the full pitch sequence leading up to the play, the exact location on the field where a batted ball was hit, the speed of every fastball, the location of every pitch in the zone, and so on. We also get information from MLB on the contract status, service time, and transaction history of every professional player. Within our own organization, we have a database of scouting reports on players (major leaguers, minor leaguers, college and high school players) that goes back many years, plus medical information from our training staff, and reports from our coaches and instructors.
Kirk, SAEG: This question comes from Greg, one of our SAEG members: How much information are you using about a player's attitude, lifestyle, etc. for player evaluation?
KW: A player’s personality, habits, work ethic, attitudes, and competitiveness are all aspects of what commonly is called a player’s “makeup”. Our scouts and player development staff spend a lot of time evaluating a player’s makeup – his ability to adapt to the schedules and rhythm of a baseball life, to receive instruction, to cope with failure, and to work hard at maximizing his skill set. It’s very hard to quantify, but too important to ignore. We consider a player’s makeup to be a significant part of who he is, and what he can become.
Kirk, SAEG: Keith... I am amazed by all you're able to learn about baseball! It's amazing how many data points can be gathered from a single game. I especially enjoyed knowing that you take into account a player's makeup. It makes sense to me that this dimension of a player would affect their performance. This, combined with offensive and defensive performance, would make for an interesting prediction modeling exercise for sure.
Exactly what decisions are being made using data analysis? I know you talked about this a bit in your last set of questions... could you elaborate? Perhaps it would help to categorize the answer into micro and macro level decisions... just a thought.
KW: Data analysis is just one of several inputs being used in decision making. Chris Antonetti, our GM, takes multiple perspectives into account before making a decision, including input from our field manager Terry Francona and his coaches, our scouting department and player development staff. I don’t think there’s any decision that is made solely on the basis of data analysis. But the kinds of things we are asked to analyze would include questions like: How many runs would we save over the course of a season by playing a better defensive player at a certain position? What kind of offensive production can we expect from player X five years from now? Which prospects in team Y’s farm system might be worth targeting in a trade?
Kirk, SAEG: Here are some additional questions from some of our SAEG members...
From Mike...
"I would like to know some examples of the kind of revenue generating or cost cutting analytics you do. Sports as a business is something I'm unfamiliar with."
KW: I don’t have much direct involvement with our business analytics, but we do have people working on it. A couple of articles related to a presentation one of my colleagues gave last spring might give you some insight: http://www.baseballprospectus.com/article.php?articleid=19854 http://www.fangraphs.com/blogs/sabr-analytics-teams-going-deep-to-attract-new-fans/
From Sam...
"How do you get to be an analyst for a professional baseball team?"
KW: That’s probably the most common question I get asked. The bottom line is that it’s very hard to get paid to do baseball analysis, and there are a lot of people interested in doing it (cheaply or even for free). There are only 30 teams, and not all of them hire even one analyst.
There wasn’t really a defined career path to get into baseball analysis, although as sports analysis becomes more mainstream, that’s changing somewhat. A strong quantitative background, with coursework in probability, statistics, and computer science is certainly helpful. Becoming proficient with at least one statistical software package such as R, SPSS, SAS, etc. is a must. I also recommend that people become comfortable with databases and writing SQL queries so they can be self-sufficient with data extraction and preparation.
"Analysis often involves observing the trends of groups like a team or a franchise. How does the baseball analyst make decisions based on individual players when most statistics involve averages for a team (or teams)?"
KW: Actually, we collect a great deal of information at the individual player level. Although we win or lose as a team, baseball is perhaps more separable into individual efforts than most other team sports. When a batter is at the plate, the outcome is largely determined by his own ability and his opponents’, rather than that of his teammates. Most fielding opportunities can only be handled by a single player, and so on. We know which players were involved on every play that occurs during a game. We have data on where and how hard each batter tends to hit balls. We know what pitches a pitcher throws. We can count how often a runner takes an extra base on a hit, or attempts a steal. We have an idea (through modeling) how these individual performances come together to produce team-level results, so we can estimate the effect of a single player on the overall team-level outcomes.
What do you think of the movie Moneyball? Has the concept actually changed baseball or was it just hype?
KW: The Moneyball movie was enjoyable, albeit exaggerated in places for dramatic effect. I think they captured the feel of the book pretty well, but wouldn't pretend it's a documentary on how front offices work (then or now).
I think Moneyball (the book) shed light on something that was already starting to happen in baseball. It made it more prominent, and turned analytics into a catchphrase, but didn't create the change itself. The industry is clearly very different than it was 15-20 years ago. Back then, few, if any, teams had full-time analysts on staff. There wouldn't have been an opportunity for someone like me. Now the majority of teams have at least one, and some have several analysts on board.
Kirk, SAEG: What has been your greatest success as a baseball analyst for the Indians? What motivates you to do what you do?
KW: I think the greatest success has simply been analytics becoming an integrated part of the Indians’ decision making -- knowing that Chris Antonetti or Mark Shapiro relied in part on my work to make potentially multimillion-dollar baseball decisions. The fact that they continue to ask for more information and invest in the infrastructure and analysts speaks to the value the organization places in what we contribute. And seeing the kind of work that we do spread from baseball operations to other parts of the company (marketing, ticketing, customer service) emphasizes that they buy into data-driven decision making in a lot of different ways.
Kirk, SAEG: Tell us of a model that you really enjoyed working on (that perhaps you found to be innovative for your field). Think about what made it special for you.
KW: One of the most exciting developments in recent years has been the availability of PITCHf/x data. There are multiple cameras installed at every major league park that track a pitched ball in flight between the pitcher’s mound and home plate. We can measure how much each curveball breaks, how much a sinker sinks, how consistent a pitcher’s release point is, the speed of every pitch thrown, and the location of where every pitch crosses the plate. From that data, we can create models that automatically classify the type of pitch thrown, track whether a pitcher starts to lose velocity the deeper he goes into games, even how well the catcher frames pitches to get a few more called strikes from the umpire. Rather than just measuring outcomes of plays, we’re gathering data that relates to the actual physics of the game of baseball.
Kirk, SAEG: Do you get the chance to meet and talk to any of the players as part of your analysis work? If so, what kind of data gathering do you do that involved direct contact?
KW: For the most part, we interact with other members of the front office, or with Terry Francona and the coaching staff. The information we create isn’t typically of a form that can be turned into operational knowledge by a player: p-values, inference, and cross-validation don’t help Carlos Santana recognize pitches, develop an approach at the plate, or call a game. We need to communicate what our work implies for game strategy, roster management, and impact on the field, and then let the staff decide how best to convey what players need to know to try to make it happen.
Kirk, SAEG: Are you disappointed that Choo left the team? How do you handle the loss of major players in the lineup?
KW: Anyone who works in a front office is, first and foremost, a baseball fan. And, as fans, we all have favorite players, and guys we like to root for. But when it comes to making decisions, it’s the team, both present and future, that comes first. Sometimes that means parting ways with a guy you really like to help in other areas of the team. Roster turnover is a fact of life in baseball, often for reasons out of your control. You handle it by being adaptable, creative, and focused on the larger goal.
Kirk, SAEG: Thank you, Keith, for your time. I know this interview will be a home run for our group (pun intended).
A picture I took at Progressive Field, Cleveland, Ohio at a game I attended, Summer 2011
