
Data Scientist, Baseball Research and Development

Primary Purpose 
The Cleveland Guardians Baseball Research & Development group is seeking to hire a data scientist to use statistical and machine learning techniques to enhance our ability to quantify the game of baseball. In this role, you will work with video, player tracking, and biomechanics data as well as traditional baseball data sources (e.g. box scores). You will work alongside the rest of the R&D department to use these data sources to build creative and impactful statistical and machine learning models that will help the team acquire the best possible players (through trades, free agency, and the draft) and develop them into members of a championship-caliber team. We are open to a remote role for the right candidate, but relocation to Cleveland, OH is preferred. We can also be flexible on start dates.

If you meet some of the qualifications below, we encourage you to apply or to reach out for more information. We know that historically marginalized groups (including people of color, women, people from working-class backgrounds, and people who identify as LGBTQ) and groups that may not have direct experience in the sports industry are less likely to apply unless and until they meet every requirement for a job. Therefore, we encourage you to reach out if you have questions about the role or your qualifications. We are happy to help you feel ready to apply!

 Responsibilities 
  • Design, build, test, and deploy statistical and/or machine learning models to support all facets of baseball operations, including scouting, player development, and the major league team. 
  • Effectively communicate actionable insights to key stakeholders across the organization. 
  • Use data to visualize model outputs and important baseball concepts.

 Qualifications 
  • Demonstrated experience or advanced degree in a quantitative field such as Statistics, Computer Science, Economics, Machine Learning, or Operations Research. 
  • Programming skills in a language such as R or Python to work efficiently at scale with large data sets. 
  • Desire to continue learning about data science applications in baseball. 

 Preferred Experience 
We are looking for a variety of skill sets. If you have demonstrated experience with one or more of the following, you may be who we are looking for. 
  • Demonstrated research experience in a sports context (baseball is a plus). 
  • Experience with a database language such as SQL. 
  • Experience with computer vision. 
  • Experience working with spatiotemporal data. 
  • Experience working with high-dimensional time series data. 
  • Experience with deep learning frameworks such as TensorFlow or Torch. 
  • Experience with Bayesian statistics and languages such as Stan. 

 Standard Requirements 
  • Represent the Cleveland Guardians in a positive fashion to all business partners and the general public. 
  • Ability to develop and maintain successful working relationships with members of the Front Office.
  • Ability to act according to the organizational values and service excellence at all times. 
  • Ability to work with multicultural populations and have a commitment to fairness and equality. 
  • Ability to work in a diverse and changing environment. 

About Us 
Baseball Operations and Baseball Research & Development are committed to our mission of winning the World Series while creating a compelling fan experience. Our shared goal is to identify and develop diverse players and front office teammates who contribute to our mission. By working together effectively and collaboratively, we create a family atmosphere that supports learning and striving for excellence in everything we do. We believe that we will achieve our goals by making evidence- and model-based decisions and creating environments that support our people and empower them to continuously learn. This role might be for you if you are looking to join a team that works together to learn new ways to make model-based decisions based on proprietary data that lead to excellent outcomes.