Gaussian Naive Bayes Classifier
-
Gaussian Naive Bayes Classifier: Iris data set
Nice write-up on applying scikit-learn's Gaussian Naive Bayes classifier (GaussianNB) to the Iris data set.
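For orientation, here is a minimal sketch of that workflow. It is not taken from the linked write-up; the 70/30 split, the random seed, and the variable names are assumptions made for illustration.

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import GaussianNB

# Load the Iris data set bundled with scikit-learn (150 samples, 4 features, 3 classes).
X, y = load_iris(return_X_y=True)

# Hold out 30% of the samples for evaluation.
# The split ratio and seed are arbitrary choices for this sketch.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=0, stratify=y
)

# GaussianNB fits one Gaussian per (class, feature) pair and treats features
# as conditionally independent given the class.
model = GaussianNB()
model.fit(X_train, y_train)

accuracy = model.score(X_test, y_test)
print(f"Held-out accuracy: {accuracy:.3f}")
```

After fitting, model.theta_ holds the per-class feature means, which is a quick way to see what the classifier has learned.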
Linear Methods
-
PyData talk: Winning with Simple, even Linear, Models
Great talk about how to get a lot of mileage with tricks using logistic and linear regression.
-
Solving least squares by Matrix inversion
Many ML libraries fit models such as linear or logistic regression with iterative methods like gradient descent. For linear least squares, however, you can find the minimum exactly by solving what linear algebra calls the normal equation. (Logistic regression has no such closed-form solution.)
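As a rough sketch of the idea, the normal equation for least squares is (X^T X) w = X^T y. The example below uses synthetic data invented for illustration, and it solves the linear system with np.linalg.solve rather than forming the explicit inverse, which is usually the more numerically stable choice.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic data (hypothetical): y = 2 + 3*x1 - 1*x2 + small noise.
n_samples = 200
features = rng.normal(size=(n_samples, 2))
true_weights = np.array([2.0, 3.0, -1.0])  # intercept, then two slopes

# Prepend a column of ones so the intercept is learned as an ordinary weight.
X = np.column_stack([np.ones(n_samples), features])
y = X @ true_weights + 0.01 * rng.normal(size=n_samples)

# Normal equation: (X^T X) w = X^T y.
# Solving the system avoids computing inv(X^T X) explicitly.
w_normal = np.linalg.solve(X.T @ X, X.T @ y)

# Cross-check against NumPy's dedicated least-squares solver.
w_lstsq, *_ = np.linalg.lstsq(X, y, rcond=None)

print("normal equation:", w_normal)
print("lstsq:          ", w_lstsq)
```

When X^T X is ill-conditioned (for example with strongly correlated features), np.linalg.lstsq is generally the safer of the two.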
Regularization
-
A nice plot giving intuition for why Ridge (L2) regularization shrinks all parameters toward small values, whereas LASSO (L1) regularization tends to drive some parameters exactly to zero while others can stay large.
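The same contrast can be seen numerically. The sketch below is not from the linked plot; the synthetic data, the alpha values, and the variable names are assumptions chosen so the effect is easy to see.

```python
import numpy as np
from sklearn.linear_model import Lasso, Ridge

rng = np.random.default_rng(0)

# Synthetic data (hypothetical): 10 features, only the first 3 influence y.
X = rng.normal(size=(200, 10))
true_coef = np.array([5.0, -4.0, 3.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0])
y = X @ true_coef + 0.5 * rng.normal(size=200)

# The alpha values are arbitrary choices for this sketch.
ridge = Ridge(alpha=1.0).fit(X, y)
lasso = Lasso(alpha=0.5).fit(X, y)

# Count coefficients that are exactly zero under each penalty.
ridge_zeros = int(np.sum(ridge.coef_ == 0.0))
lasso_zeros = int(np.sum(lasso.coef_ == 0.0))

print("ridge coefficients:", np.round(ridge.coef_, 3))
print("lasso coefficients:", np.round(lasso.coef_, 3))
print("exact zeros (ridge, lasso):", ridge_zeros, lasso_zeros)
```

Ridge typically leaves every coefficient small but nonzero, while LASSO sets the coefficients of uninformative features to exactly zero and keeps the informative ones.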
Encoding Data for ML Models
-
Smarter Ways to Encode Categorical Data for Machine Learning
Better encoding of categorical data can mean better model performance. In this article I’ll introduce you to a wide range of encoding options from the Category Encoders package for use with scikit-learn machine learning in Python.
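To make the idea of an encoding concrete, here is a small sketch of the two most basic options. Note that it uses scikit-learn's built-in OrdinalEncoder and OneHotEncoder rather than the Category Encoders package the article covers, and the toy column of colors is invented for illustration.

```python
import numpy as np
from sklearn.preprocessing import OneHotEncoder, OrdinalEncoder

# Hypothetical single categorical column with three distinct values.
colors = np.array([["red"], ["green"], ["blue"], ["green"]])

# Ordinal encoding: one integer code per category.
# By default the categories are sorted, so blue=0, green=1, red=2.
ordinal = OrdinalEncoder().fit_transform(colors)

# One-hot encoding: one binary column per category.
# The encoder returns a sparse matrix, so convert it to a dense array for display.
one_hot = OneHotEncoder().fit_transform(colors).toarray()

print(ordinal.ravel())
print(one_hot)
```

Ordinal codes imply an ordering that may not exist in the data, while one-hot columns grow with the number of categories; that trade-off is what motivates the wider range of encoders discussed in the article.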
Dimensionality Reduction
-
Good, clear video from a Stanford AI course explaining SVD. (13 minutes)
-
StatQuest: Principal Component Analysis (PCA), Step-by-Step
YouTube video showing how to use SVD to perform PCA. (21 minutes)
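The connection between SVD and PCA can be sketched in a few lines of NumPy. This is an illustration under assumed synthetic data, not a transcription of either video; the variable names are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic 2-D data (hypothetical), stretched so one direction dominates.
data = rng.normal(size=(500, 2)) @ np.array([[3.0, 0.0], [1.0, 0.5]])

# Step 1: center each feature at zero.
centered = data - data.mean(axis=0)

# Step 2: thin SVD of the centered data, centered = U @ diag(S) @ Vt.
U, S, Vt = np.linalg.svd(centered, full_matrices=False)

# The rows of Vt are the principal directions. The squared singular values,
# divided by (n - 1), are the variances along those directions.
explained_variance = S**2 / (len(data) - 1)

# Step 3: project the centered data onto the principal directions.
scores = centered @ Vt.T

print("principal directions:\n", Vt)
print("explained variance:", explained_variance)
```

The explained variances equal the eigenvalues of the sample covariance matrix, which is why SVD of the centered data and eigendecomposition of the covariance give the same principal components (up to sign).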