Multimodal Machine Learning for Major League Baseball Playoff Prediction
DOI:
https://doi.org/10.31449/inf.v46i6.3864
Abstract
The introduction of sabermetrics changed the way Major League Baseball (MLB) teams value their players. Since then, new baseball statistics have been developed to make various predictions for MLB teams. Drawing on the immense amount of data on baseball players, teams, and scores, we use supervised machine learning algorithms to see how accurately we can predict which teams will make the playoffs in 2019. For this research, we gathered data from the last 20 years. The features we use are Runs, Batting Average, Home Runs, Strikeouts, Innings Pitched, Earned Runs, and Earned Run Average. We chose a Logistic Regression model and a Support Vector Classifier (SVC) as our two machine learning algorithms. In testing, our trained models predicted only 77% of the teams correctly, and of those, 59% were recalled correctly. As a result, our overall projected model was only 60% accurate: it correctly predicted 6 of the 10 teams that made the 2019 playoffs. We believe these findings could be improved by using other machine learning algorithms or by including more features, thereby increasing the overall accuracy of our trained model.
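To make the reported figures easier to interpret, the following is a minimal sketch of how precision, recall, and accuracy can be computed for a binary playoff / non-playoff classifier. It assumes that the 77% and 59% figures correspond to precision and recall, which the abstract does not state explicitly. The function name `precision_recall_accuracy` and the toy labels are hypothetical and are not taken from the paper's data.

```python
def precision_recall_accuracy(y_true, y_pred):
    """Return (precision, recall, accuracy) for binary labels (1 = playoff team)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # predicted playoff, made playoff
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # predicted playoff, missed playoff
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # predicted miss, made playoff
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # predicted miss, missed playoff
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    accuracy = (tp + tn) / len(pairs) if pairs else 0.0
    return precision, recall, accuracy


# Hypothetical toy labels for 8 teams (illustrative only, NOT data from the paper).
y_true = [1, 1, 1, 0, 0, 0, 0, 1]
y_pred = [1, 1, 0, 0, 0, 1, 0, 0]
precision, recall, accuracy = precision_recall_accuracy(y_true, y_pred)

# The abstract's headline figure: 6 of the 10 actual 2019 playoff teams were predicted.
playoff_hit_rate = 6 / 10
```

Under this reading, precision is the share of predicted playoff teams that actually qualified, while recall is the share of actual playoff teams that the model identified. The 6-of-10 figure is a ratio of the second kind.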
License
I assign to Informatica, An International Journal of Computing and Informatics ("Journal") the copyright in the manuscript identified above and any additional material (figures, tables, illustrations, software or other information intended for publication) submitted as part of or as a supplement to the manuscript ("Paper") in all forms and media throughout the world, in all languages, for the full term of copyright, effective when and if the article is accepted for publication. This transfer includes the right to reproduce and/or to distribute the Paper to other journals or digital libraries in electronic and online forms and systems.
I understand that I retain the rights to use the pre-prints, off-prints, accepted manuscript and published journal Paper for personal use, scholarly purposes and internal institutional use.
In certain cases, I can ask for retaining the publishing rights of the Paper. The Journal can permit or deny the request for publishing rights, to which I fully agree.
I declare that the submitted Paper is original, has been written by the stated authors and has not been published elsewhere nor is currently being considered for publication by any other journal and will not be submitted for such review while under review by this Journal. The Paper contains no material that violates proprietary rights of any other person or entity. I have obtained written permission from copyright owners for any excerpts from copyrighted works that are included and have credited the sources in my article. I have informed the co-author(s) of the terms of this publishing agreement.
Copyright © Slovenian Society Informatika