Prediction of Marathon Time
Introduction
The main goal of this project is to predict marathon race times for male and female runners.
Data Source
The data comes from the websites of the six World Marathon Majors: Tokyo, Boston, London, Berlin, Chicago, and NYC. It was scraped from these websites using web scraping tools such as Beautiful Soup and Selenium.
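As a rough illustration of the scraping step, the sketch below parses a results table with Beautiful Soup. The HTML snippet, the table id "results", the column order, and the helper names are all hypothetical; each marathon website has its own markup, and this is not the project's actual scraper.

```python
from bs4 import BeautifulSoup

# Hypothetical results markup; real marathon sites use different structures.
SAMPLE_HTML = """
<table id="results">
  <tr><th>Name</th><th>Gender</th><th>Age</th><th>Finish</th></tr>
  <tr><td>Runner A</td><td>M</td><td>34</td><td>2:45:30</td></tr>
  <tr><td>Runner B</td><td>F</td><td>29</td><td>3:10:00</td></tr>
</table>
"""


def finish_to_minutes(text):
    """Convert an 'H:MM:SS' finish time to minutes."""
    hours, minutes, seconds = (int(part) for part in text.strip().split(":"))
    return hours * 60 + minutes + seconds / 60


def parse_results(html):
    """Return one dict per runner from the (assumed) results table."""
    soup = BeautifulSoup(html, "html.parser")
    table = soup.find("table", id="results")
    rows = []
    for tr in table.find_all("tr")[1:]:  # skip the header row
        name, gender, age, finish = [
            td.get_text(strip=True) for td in tr.find_all("td")
        ]
        rows.append({
            "name": name,
            "gender": gender,
            "age": int(age),
            "finish_min": finish_to_minutes(finish),
        })
    return rows


results = parse_results(SAMPLE_HTML)
print(results)
```

Pages that build their tables with JavaScript would need to be rendered first, for example with Selenium, before the HTML is handed to a parser like this.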
Feature Selection
The most relevant features selected for predicting marathon times are the athlete’s age, body fat percentage, average running speed, weather, terrain, and running footwear.
Modeling
I fitted two Ordinary Least Squares (OLS) models, one for male and one for female athletes. I also tried Lasso and Ridge regression with cross-validation and hyperparameter tuning. Lasso regression with alpha=0.0001 was the best model. The adjusted R-squared was 0.44 for the male model and 0.71 for the female model. The final models for both genders are:
Male race time (min) = 285.4 − 2.316 × Body Fat (%) − 10.446 × Running Speed (kph) + 1.38 × Age − 1.02 × Running Shoe
Female race time (min) = 337.4 − 2.154 × Body Fat (%) − 15.112 × Running Speed (kph) + 1.27 × Age − 1.04 × Running Shoe
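The modeling steps described in this section can be sketched as follows. The coefficients are copied from the two equations above. Everything else is an assumption made for illustration: the function names, the numeric encoding of Running Shoe (the README does not say how it is coded), the synthetic data, and the alpha grid. This is not the project's training code.

```python
import numpy as np
from sklearn.linear_model import Lasso
from sklearn.model_selection import GridSearchCV

# Coefficients copied from the final equations in this README.
COEFS = {
    "male": {"intercept": 285.4, "body_fat": -2.316, "speed": -10.446,
             "age": 1.38, "shoe": -1.02},
    "female": {"intercept": 337.4, "body_fat": -2.154, "speed": -15.112,
               "age": 1.27, "shoe": -1.04},
}


def predict_minutes(gender, body_fat_pct, speed_kph, age, shoe):
    """Evaluate the published linear equation (shoe encoding is assumed)."""
    c = COEFS[gender]
    return (c["intercept"] + c["body_fat"] * body_fat_pct
            + c["speed"] * speed_kph + c["age"] * age + c["shoe"] * shoe)


def adjusted_r2(r2, n_samples, n_features):
    """Adjusted R-squared: penalizes R-squared for the number of features."""
    return 1 - (1 - r2) * (n_samples - 1) / (n_samples - n_features - 1)


# Synthetic stand-in for the scraped data (assumed value ranges).
rng = np.random.default_rng(0)
n = 200
X = np.column_stack([
    rng.uniform(8, 30, n),    # body fat (%)
    rng.uniform(8, 16, n),    # running speed (kph)
    rng.uniform(20, 60, n),   # age (years)
    rng.integers(0, 2, n),    # running shoe, assumed 0/1 code
])
y = predict_minutes("male", X[:, 0], X[:, 1], X[:, 2], X[:, 3])
y = y + rng.normal(0, 5, n)   # add noise so the fit is not perfect

# Tune alpha with 5-fold cross-validation over an assumed grid.
search = GridSearchCV(
    Lasso(max_iter=10000),
    {"alpha": [0.0001, 0.001, 0.01, 0.1, 1.0]},
    cv=5,
    scoring="r2",
)
search.fit(X, y)

r2 = search.best_estimator_.score(X, y)
adj_r2 = adjusted_r2(r2, n, X.shape[1])
print("best alpha:", search.best_params_["alpha"])
print("adjusted R-squared:", round(adj_r2, 3))

# Example: 30-year-old male, 15% body fat, 12 kph, shoe code 1.
print(round(predict_minutes("male", 15, 12, 30, 1), 1))  # about 165.7
```

Because the synthetic data is generated from the male equation itself, the scores here say nothing about the real models; the sketch only shows how cross-validated tuning and the adjusted R-squared calculation fit together.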
Contributing
Please feel free to submit pull requests for development. The following instructions will get you a copy of the project up and running on your local machine for development and testing purposes.
Prerequisites
You need the following software for testing and development:
- Python (version 3)
- Command line interface
- Web browser (Microsoft Edge, Firefox, Chrome and Safari supported)
- Source code editor like Atom or Sublime Text
- Git source control manager
Cloning Repository
Start by opening the command line and cloning the repository as follows:
git clone https://github.com/BabakShah/...
Change directory to the project folder, then to the folder holding the source files you want to work on (Python, HTML, CSS, JS):
cd ./DS-Saf
In the command line, install all the Python library dependencies:
pip install -r /path/to/requirements.txt
For further development, open the source files in a source code editor. For Python scripts (on macOS):
open -a "Sublime Text" file-name
For IPython notebooks:
jupyter notebook file-name
Built With
- Python
- Git
- Command Line Interface
Libraries used
- Scikit-learn
- Matplotlib
- NumPy
- Pandas
- Jupyter Notebook
Contact me
Babak - email