During DSCI 525 in the UBC MDS 2021-22 cohort, we were tasked with deploying an Application Programming Interface (API) for an ensemble machine learning model on Amazon Web Services (AWS). The dataset was publicly available and large enough to require the big-data handling capabilities of AWS services, including EC2, S3 and EMR. The deployed model predicts daily rainfall in Sydney, NSW, Australia. The deployment was successful, and along the way we learned the challenges involved and how to take advantage of the scalability of a cloud platform like AWS when handling a large dataset.
4 weeks with 4 milestones
... `feather`) for better performance ... `.parquet` file format).