

Online Courses

  • Introduction to Probability and Statistics, MIT OpenCourseWare

    This course provides an elementary introduction to probability and statistics with applications. Topics include: basic combinatorics, random variables, probability distributions, Bayesian inference, hypothesis testing, confidence intervals, and linear regression.

Books covering general topics


  • StatQuest

    A popular YouTube channel that covers many statistical concepts.

Odds & Ends

  • Common Probability Distributions: The Data Scientist’s Crib Sheet

    Data scientists have hundreds of probability distributions from which to choose. Where to start? Includes descriptions of the following distributions: Bernoulli and Uniform; Binomial and Hypergeometric; Poisson; Geometric and Negative Binomial; Exponential and Weibull; Normal, Log-Normal, Student’s t, and Chi-squared; Gamma and Beta.

  • Kolmogorov–Smirnov test

    A convenient way to test whether samples come from a given probability distribution. It can compare one sample against a known distribution (one-sample test) or two samples against each other (two-sample test). Upsides: it is non-parametric, so two samples can be compared without knowing their underlying distributions. Downsides: it applies only to one-dimensional distributions, and multidimensional extensions are hard to implement.

  • Heteroscedasticity vs Homoscedasticity

    If the data are heteroscedastic (the error variance changes across observations), methods that assume constant-variance, uncorrelated errors can give misleading results, for example goodness-of-fit measures in regression problems. For more info, check out this episode of the Data Skeptic Podcast.
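
To make the Kolmogorov–Smirnov entry above concrete, here is a minimal sketch of the test statistic in plain Python. It computes only the statistic (the largest gap between cumulative distribution functions), not a p-value. The helper names `ks_two_sample`, `ks_one_sample`, and `normal_cdf`, along with the sample sizes and seed, are illustrative assumptions and are not taken from any of the linked resources.

```python
import bisect
import math
import random


def ks_two_sample(a, b):
    """Two-sample KS statistic: largest gap between the two empirical CDFs."""
    a_sorted, b_sorted = sorted(a), sorted(b)
    d = 0.0
    for x in a_sorted + b_sorted:
        # Empirical CDF at x = fraction of samples <= x.
        cdf_a = bisect.bisect_right(a_sorted, x) / len(a_sorted)
        cdf_b = bisect.bisect_right(b_sorted, x) / len(b_sorted)
        d = max(d, abs(cdf_a - cdf_b))
    return d


def ks_one_sample(samples, cdf):
    """One-sample KS statistic against a reference CDF (a callable)."""
    xs = sorted(samples)
    n = len(xs)
    d = 0.0
    for i, x in enumerate(xs):
        f = cdf(x)
        # The empirical CDF jumps from i/n to (i+1)/n at x; check both sides.
        d = max(d, (i + 1) / n - f, f - i / n)
    return d


def normal_cdf(x, mu=0.0, sigma=1.0):
    """CDF of a normal distribution, via the error function."""
    return 0.5 * (1.0 + math.erf((x - mu) / (sigma * math.sqrt(2.0))))


rng = random.Random(0)  # fixed seed so the run is repeatable
normal_a = [rng.gauss(0, 1) for _ in range(500)]
normal_b = [rng.gauss(0, 1) for _ in range(500)]
shifted = [rng.gauss(1, 1) for _ in range(500)]

d_same = ks_two_sample(normal_a, normal_b)   # same distribution: small gap
d_diff = ks_two_sample(normal_a, shifted)    # shifted mean: larger gap
d_ref = ks_one_sample(normal_a, normal_cdf)  # sample vs. known distribution
print(f"same: {d_same:.3f}  shifted: {d_diff:.3f}  vs N(0,1): {d_ref:.3f}")
```

In practice a library routine is usually preferable, since it also reports a p-value; SciPy provides `scipy.stats.kstest` (one-sample) and `scipy.stats.ks_2samp` (two-sample).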
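
For the heteroscedasticity entry above, the following sketch shows one informal way to spot non-constant variance: fit a straight line, then compare the spread of the residuals at low and high values of x. This is a rough diagnostic under assumed synthetic data, not a formal statistical test. The function names `fit_line` and `residual_variance_ratio`, the noise model, and the seed are all hypothetical choices made for illustration.

```python
import random
import statistics


def fit_line(xs, ys):
    """Ordinary least squares for y = a + b*x (closed form). Returns (a, b)."""
    mx, my = statistics.fmean(xs), statistics.fmean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    b = sxy / sxx
    return my - b * mx, b


def residual_variance_ratio(xs, ys):
    """Residual variance in the upper half of x divided by the lower half."""
    a, b = fit_line(xs, ys)
    residuals = [y - (a + b * x) for x, y in zip(xs, ys)]
    pairs = sorted(zip(xs, residuals))  # order residuals by x
    half = len(pairs) // 2
    low = [r for _, r in pairs[:half]]
    high = [r for _, r in pairs[half:]]
    return statistics.pvariance(high) / statistics.pvariance(low)


rng = random.Random(0)  # fixed seed so the run is repeatable
xs = [rng.uniform(1, 10) for _ in range(1000)]
# Homoscedastic: noise has the same spread everywhere.
ys_homo = [2 * x + rng.gauss(0, 1) for x in xs]
# Heteroscedastic: noise spread grows with x (assumed model for illustration).
ys_hetero = [2 * x + rng.gauss(0, 0.5 * x) for x in xs]

ratio_homo = residual_variance_ratio(xs, ys_homo)
ratio_hetero = residual_variance_ratio(xs, ys_hetero)
print(f"homoscedastic ratio: {ratio_homo:.2f}")
print(f"heteroscedastic ratio: {ratio_hetero:.2f}")
```

A ratio near 1 is consistent with constant variance, while a ratio well above or below 1 suggests the variance depends on x, in which case standard errors and goodness-of-fit measures from an ordinary least-squares fit should be treated with caution.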
