Touching
Data Science
Eugene Shevchuk
While working on the projects for MIT's "Machine Learning with Python" course, I applied key ML methods and am happy to share the code. Due to MIT policy, I cannot make it publicly accessible, but I can share it directly with you, as a potential employer, upon request.
Want to know a bit more? Scroll!
I know
  • Principal component analysis
  • Convolutional neural networks
  • Gaussian mixture model (EM kernelization)
  • Collaborative filtering
  • Reinforcement learning (MDP + NN)
My projects
MIT projects I enjoyed most
Digit recognition
MNIST data set. Single and double overlapping digits
Movie score prediction
Netflix data set. Per-user 1-5 ✩ score prediction
AI playing computer game
Model learns a text game's rules and strategy by playing it
Projects 1-2
Digit recognition
Goal: Recognize handwritten digits

Result:
Minimum error rate of 2% when recognizing two overlapping digits

Input data:
MNIST Dataset - handwritten single & double overlapping digits
- 60K - training, 10K - testing

→ View the code

Single digit recognition
(manually coded)
Linear regression
(closed form solution)
Test error: 0.7702
Yep. It will be better.
View the code →
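For illustration only (this is not the course code), a closed-form least-squares fit can be sketched in a few lines of NumPy. The function name `closed_form`, the optional ridge term `lam`, and the toy data are assumptions made for this example. Because the model regresses directly on the digit label as a real number, a high test error on a 10-class problem is not surprising.

```python
import numpy as np

def closed_form(X, y, lam=0.0):
    """Least squares with optional ridge term: theta = (X^T X + lam*I)^-1 X^T y."""
    d = X.shape[1]
    # Solve the normal equations instead of forming an explicit inverse.
    return np.linalg.solve(X.T @ X + lam * np.eye(d), X.T @ y)

# Toy data (hypothetical): targets follow y = 2*x0 - x1 exactly.
X = np.array([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [2.0, 1.0]])
y = X @ np.array([2.0, -1.0])
theta = closed_form(X, y)
```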
Multi class SVM
Test error: 0.0819
View the code →
Multinomial (softmax) regression
with gradient descent
Test error: 0.1005
View the code →
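A minimal sketch of multinomial (softmax) regression trained with batch gradient descent is shown below. It is not the course code: the function names, learning rate, step count, and toy data are assumptions, and it omits extras such as a temperature parameter or regularization.

```python
import numpy as np

def softmax(scores):
    # Subtract the row-wise max for numerical stability before exponentiating.
    z = scores - scores.max(axis=1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

def fit_softmax(X, y, k, lr=0.5, steps=500):
    """Batch gradient descent on the mean cross-entropy loss."""
    n, d = X.shape
    theta = np.zeros((d, k))
    Y = np.eye(k)[y]                   # one-hot labels, shape (n, k)
    for _ in range(steps):
        P = softmax(X @ theta)         # predicted class probabilities
        grad = X.T @ (P - Y) / n       # gradient of the mean cross-entropy
        theta -= lr * grad
    return theta

# Toy, linearly separable data (hypothetical); the last column is a bias feature.
X = np.array([[1.0, 0.0, 1.0], [0.9, 0.1, 1.0], [0.0, 1.0, 1.0], [0.1, 0.9, 1.0]])
y = np.array([0, 0, 1, 1])
theta = fit_softmax(X, y, k=2)
pred = softmax(X @ theta).argmax(axis=1)
```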
Multinomial (softmax) regression with PCA & cubic features
PCA dimensionality reduction: 784 ⤑ 18
Kernel: $\phi(x)^T \phi(x') = (x^T x' + 1)^3$

Test error:
0.08520
View the code →
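As a rough sketch of the two ingredients named above (not the course code), the snippet below projects data onto its leading principal components via SVD and evaluates the cubic kernel on the projected points. The helper names and the random stand-in data are assumptions. The one-dimensional map `phi_1d` is included only to show how an explicit cubic feature vector reproduces the kernel value.

```python
import numpy as np

def pca_project(X, n_components):
    """Center X and project it onto its top principal directions."""
    Xc = X - X.mean(axis=0)
    # Rows of Vt are principal directions, ordered by decreasing singular value.
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    return Xc @ Vt[:n_components].T

def cubic_kernel(A, B):
    """K[i, j] = (a_i . b_j + 1)^3"""
    return (A @ B.T + 1.0) ** 3

def phi_1d(x):
    """Explicit cubic feature map for a scalar x: phi(x).phi(y) = (x*y + 1)^3."""
    s = np.sqrt(3.0)
    return np.array([1.0, s * x, s * x**2, x**3])

rng = np.random.default_rng(0)
X = rng.normal(size=(50, 784))   # random stand-in for flattened 28x28 images
Z = pca_project(X, 18)           # 784 -> 18 dimensions
K = cubic_kernel(Z, Z)
```

The explicit feature vector grows quickly with the input dimension, which is one reason to reduce 784 dimensions to 18 before expanding to cubic features.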
Convolutional Neural Network
Optimized over:
- baseline (no modifications)
- batch size
- learning rate
- momentum
- activation (ReLU / LeakyReLU)

Test accuracy:
0.9902
Double digit recognition
(PyTorch)
Convolutional Neural Network
(PyTorch)

Test error:
< 0.0200
→ View the code
Project 3
Netflix ratings prediction
Goal: Predict ratings for unrated movies

Result: $\Delta \geqslant 2$ error rate 0.17

Input data:

1200 users × 1200 movies
Ratings values $\in \{1, \dots, 5\}$
Value $= 0$ for unrated ones



Description:


I was given a data matrix of movie ratings made by users, extracted from the Netflix database. Each user had rated only a small fraction of the movies, so the matrix was only partially filled. The goal was to predict all the remaining entries. I approached it by building a Gaussian mixture model (GMM) for collaborative filtering and training it with the Expectation-Maximization (EM) algorithm.
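The idea can be sketched as follows. This is an illustrative simplification rather than the project code: it assumes spherical Gaussians (one variance per cluster), treats 0 as "unrated", and uses a hypothetical variance floor of 0.25. All function names and the toy matrix are made up for the example.

```python
import numpy as np

def em_step(X, mu, var, pi):
    """One EM iteration for a spherical GMM; entries with X == 0 are missing."""
    mask = (X != 0).astype(float)                 # 1 where a rating is observed
    n_obs = mask.sum(axis=1)                      # observed ratings per user
    # E-step: log-likelihood of each user's observed ratings under each cluster.
    diff = X[:, None, :] - mu[None, :, :]
    sq = ((diff ** 2) * mask[:, None, :]).sum(axis=2)
    log_p = (np.log(pi)[None, :]
             - 0.5 * n_obs[:, None] * np.log(2 * np.pi * var)[None, :]
             - sq / (2 * var)[None, :])
    log_norm = np.logaddexp.reduce(log_p, axis=1, keepdims=True)
    post = np.exp(log_p - log_norm)               # responsibilities, shape (n, K)
    # M-step: update each parameter using observed entries only.
    pi_new = post.mean(axis=0)
    weight = post.T @ mask                        # responsibility mass per (cluster, movie)
    mu_new = np.where(weight > 0,
                      (post.T @ (X * mask)) / np.maximum(weight, 1e-12), mu)
    diff = X[:, None, :] - mu_new[None, :, :]
    sq = ((diff ** 2) * mask[:, None, :]).sum(axis=2)
    denom = (post * n_obs[:, None]).sum(axis=0)
    var_new = np.maximum((post * sq).sum(axis=0) / np.maximum(denom, 1e-12),
                         0.25)                    # variance floor (assumption)
    return mu_new, var_new, pi_new, post, float(log_norm.sum())

def fill_missing(X, mu, post):
    """Replace unrated entries with the posterior-weighted cluster means."""
    return np.where(X != 0, X, post @ mu)

# Toy ratings (hypothetical): three "high" raters and three "low" raters.
X = np.array([[5, 4, 0, 5], [4, 5, 5, 0], [5, 0, 4, 5],
              [1, 2, 0, 1], [2, 1, 1, 0], [0, 1, 2, 1]], dtype=float)
mu = np.array([[4.0] * 4, [2.0] * 4])
var = np.ones(2)
pi = np.full(2, 0.5)
for _ in range(10):
    mu, var, pi, post, log_lik = em_step(X, mu, var, pi)
filled = fill_missing(X, mu, post)
```

Working in the log domain with `np.logaddexp` avoids underflow when a user has many observed ratings and the per-cluster likelihoods become very small.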
    Copyright © 2020 Eugene Shevchuk