Welcome to my GitHub pages!
Technical Projects
Detection and Mitigation of Data Drift and Model Decay (May-June 2022): a capstone project with partner Goldspot Discoveries Corp.
- This project designed and implemented a framework, packaged as a Python library and generalized to work with any dataset and model, for monitoring and detecting data drift and model decay, i.e. for signalling when a model needs retraining.
- The findings were presented to the geologists and data scientists of the capstone partner and received positive feedback.
- Tools/Techniques: PyTorch, scikit-learn, autoencoder, PCA, confidence distribution, image properties, statistical tests, etc.
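To illustrate the "statistical tests" item above, here is a minimal sketch of one common drift check: a two-sample Kolmogorov-Smirnov test comparing a feature's reference distribution with newly observed values. It is not the capstone package's code. The function names (`ks_statistic`, `ks_p_value`, `has_drifted`), the 0.05 threshold and the synthetic data are all assumptions made for illustration, and the p-value uses the plain asymptotic approximation.

```python
import bisect
import math
import random


def ks_statistic(reference, current):
    """Largest gap between the empirical CDFs of the two samples."""
    ref, cur = sorted(reference), sorted(current)
    gap = 0.0
    for x in ref + cur:
        cdf_ref = bisect.bisect_right(ref, x) / len(ref)
        cdf_cur = bisect.bisect_right(cur, x) / len(cur)
        gap = max(gap, abs(cdf_ref - cdf_cur))
    return gap


def ks_p_value(d, n, m, terms=100):
    """Asymptotic two-sided p-value from the Kolmogorov distribution."""
    lam = d * math.sqrt(n * m / (n + m))
    if lam < 0.2:
        # The series converges slowly here and the true value is ~1.
        return 1.0
    series = sum(
        (-1) ** (k - 1) * math.exp(-2.0 * (k * lam) ** 2)
        for k in range(1, terms + 1)
    )
    return max(0.0, min(1.0, 2.0 * series))


def has_drifted(reference, current, alpha=0.05):
    """Flag drift when the KS test rejects 'same distribution' at level alpha."""
    d = ks_statistic(reference, current)
    return ks_p_value(d, len(reference), len(current)) < alpha


# Synthetic data (hypothetical): the "new" feature values are shifted by +1.
rng = random.Random(0)
reference = [rng.gauss(0.0, 1.0) for _ in range(500)]
shifted = [rng.gauss(1.0, 1.0) for _ in range(500)]
print(has_drifted(reference, shifted))
```

In practice a check like this would run per feature (or on a low-dimensional embedding such as PCA components), with the significance level adjusted for the number of tests.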
Project duration
- 2 months
Developers
Christopher Alexander, Lianna Hovhannisyan, Joshua Sia and Steven Leung
UBC Supervisor
Dr. Simon Goring
Advisor from Partner
Mr. Shervin Manzuri Shalmani of Goldspot Discoveries Corp.
Project - Cloud Deployment of Machine Learning Model
- This project used a large, publicly available dataset to train an ensemble machine learning model, then deployed an API on Amazon Web Services (AWS) that serves the model to predict daily rainfall in Sydney, Australia.
Project duration
- 4 weeks with 4 milestones
AWS Services used
- EC2
- S3
- EMR (with Apache Spark)
DoggoDash
- DoggoDash is an interactive web dashboard which provides visualizations for users to explore the breeds of dogs that best match their preferences.
- Please be patient while the page loads, since I am hosting it on the free tier of Render.com.
SQL
I learned SQL at UBC and have been honing my skills with exercises on platforms such as LeetCode and HackerRank. I challenge myself by solving the same problem with more than one technique (e.g. CTEs, subqueries and window functions).
Here are some examples:
- LeetCode 1174 - Immediate Food Delivery II
- LeetCode 1811 - Find Interview Candidates
- LeetCode 1097 - Game Play Analysis V
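As a rough sketch of what solving one problem with more than one technique can look like, the snippet below finds each customer's first order twice: once with a correlated subquery and once with a CTE plus a window function. It is not a solution to any of the LeetCode problems listed above. The `orders` table, its columns and its rows are made up for illustration, and the window-function query assumes SQLite 3.25 or newer.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE orders (
    order_id    INTEGER PRIMARY KEY,
    customer_id INTEGER,
    order_date  TEXT
);
-- Hypothetical rows, for illustration only.
INSERT INTO orders VALUES
    (1, 1, '2019-08-01'),
    (2, 2, '2019-08-02'),
    (3, 1, '2019-08-11'),
    (4, 3, '2019-08-24'),
    (5, 3, '2019-08-21'),
    (6, 2, '2019-08-11');
""")

# Technique 1: correlated subquery picking the earliest date per customer.
SUBQUERY_SQL = """
SELECT o.customer_id, o.order_id, o.order_date
FROM orders AS o
WHERE o.order_date = (
    SELECT MIN(o2.order_date)
    FROM orders AS o2
    WHERE o2.customer_id = o.customer_id
)
ORDER BY o.customer_id;
"""

# Technique 2: CTE that ranks each customer's orders with a window function.
WINDOW_SQL = """
WITH ranked AS (
    SELECT customer_id, order_id, order_date,
           ROW_NUMBER() OVER (
               PARTITION BY customer_id ORDER BY order_date
           ) AS rn
    FROM orders
)
SELECT customer_id, order_id, order_date
FROM ranked
WHERE rn = 1
ORDER BY customer_id;
"""

first_by_subquery = conn.execute(SUBQUERY_SQL).fetchall()
first_by_window = conn.execute(WINDOW_SQL).fetchall()
print(first_by_subquery == first_by_window)
```

The two queries differ when a customer has several orders on the same earliest date: the subquery returns all of them, while `ROW_NUMBER()` keeps exactly one.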
EDAhelper
- Python and R packages that make Exploratory Data Analysis (EDA) easier by reducing 4 common EDA tasks to a single line of code each.
- The presentation deck in PDF format is here.
- Documentation of the Python package can be seen here and here.
- The GitHub repo for the Python package is here.
- Documentation of the R package can be seen here.
- The GitHub repo for the R package is here.
Is age associated with success at the Olympics?
- This is a hypothesis test of whether athletes under age 25 have a significantly higher chance of winning an Olympic medal than those aged 25 or above. It is structured as a reproducible data science project.
- Non-technical report
- Technical report
- GitHub repo
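For readers unfamiliar with this kind of comparison, here is a minimal sketch of a one-sided two-proportion z-test, one common way to test whether one group's medal rate is higher than another's. The summary above does not say which test the project used, so this is an assumed stand-in rather than the project's method. The function name and the counts are hypothetical and are not taken from the Olympic data.

```python
import math


def two_proportion_z_test(success_a, n_a, success_b, n_b):
    """One-sided test of H1: rate in group A > rate in group B.

    Returns the z statistic and the upper-tail p-value, using the
    pooled proportion under the null hypothesis of equal rates.
    """
    p_a = success_a / n_a
    p_b = success_b / n_b
    pooled = (success_a + success_b) / (n_a + n_b)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (p_a - p_b) / se
    # Upper-tail probability of the standard normal: P(Z >= z).
    p_value = 0.5 * math.erfc(z / math.sqrt(2))
    return z, p_value


# Hypothetical counts, for illustration only:
# group A = athletes under 25, group B = athletes aged 25 or above.
z, p_value = two_proportion_z_test(success_a=150, n_a=1000,
                                   success_b=120, n_b=1000)
print(f"z = {z:.3f}, one-sided p = {p_value:.4f}")
```

The z-test relies on a normal approximation, so it suits large samples; a permutation test is a reasonable alternative when the groups are small.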