Meetup summary
2025-07-11 - Intro to probability - part 2
Recommended reading:
None.
Agenda:
The meetup date and agenda are both tentative, but the idea is to pick up the intro-to-probability material from last time.
- Expectation
- Formalization of a “mean”
- Definition for discrete RV
- Definition for continuous RV
- Single RV linearity (multiply/add constants)
- LOTUS
- Multiple RV linearity (requires joint distribution; invoke LOTUS)
- Conditional expectation (should be thought of as a function of the conditioned RV)
- Law of total expectation
- Of independent RVs
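A minimal Python sketch of the discrete definition, linearity with constants, and LOTUS, using a fair die as an assumed example RV:

```python
# Hypothetical discrete RV: a fair six-sided die.
pmf = {x: 1 / 6 for x in range(1, 7)}

# Definition for a discrete RV: E[X] = sum over x of x * p(x)
ex = sum(x * p for x, p in pmf.items())  # 3.5

# Single-RV linearity: E[aX + b] = a * E[X] + b
a, b = 2, 1
ex_lin = sum((a * x + b) * p for x, p in pmf.items())
assert abs(ex_lin - (a * ex + b)) < 1e-12

# LOTUS: E[g(X)] = sum g(x) * p(x), without finding the distribution of g(X)
eg = sum(x ** 2 * p for x, p in pmf.items())  # E[X^2] = 91/6
```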
- Variance
- Definition
- Expansion due to linearity of expectation
- Law of total variance
- Of independent RVs (derived via the general case of covariance)
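A quick check (continuing the assumed fair-die example) that the expansion via linearity of expectation agrees with the definition:

```python
# Variance via the expansion Var X = E[X^2] - (E[X])^2 (linearity of expectation)
pmf = {x: 1 / 6 for x in range(1, 7)}  # fair die, assumed example
ex = sum(x * p for x, p in pmf.items())
ex2 = sum(x * x * p for x, p in pmf.items())
var = ex2 - ex ** 2  # 35/12

# Definition form Var X = E[(X - E[X])^2] agrees
var_def = sum((x - ex) ** 2 * p for x, p in pmf.items())
assert abs(var - var_def) < 1e-12
```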
- Covariance
- Definition
- Linear expansion
- Rules for sums/scalings of covariance. (These give the variance of a sum of RVs, whether or not they are independent.)
- Covariance inequality (relies on Cauchy-Schwarz)
- Correlation (just standardize and take covariance)
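A sketch computing covariance from a joint pmf and then correlation by standardizing; the joint distribution below is an assumed toy example of two correlated Bernoulli-like RVs:

```python
import math

# Assumed joint pmf of a pair (X, Y): {(x, y): p}
joint = {(0, 0): 0.4, (0, 1): 0.1, (1, 0): 0.1, (1, 1): 0.4}

ex = sum(x * p for (x, y), p in joint.items())
ey = sum(y * p for (x, y), p in joint.items())
exy = sum(x * y * p for (x, y), p in joint.items())
cov = exy - ex * ey                      # linear expansion of the definition

vx = sum((x - ex) ** 2 * p for (x, y), p in joint.items())
vy = sum((y - ey) ** 2 * p for (x, y), p in joint.items())
corr = cov / math.sqrt(vx * vy)          # standardize, then take covariance
```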
- Derived RVs
- Single-variable method of transformations
- Multivariate MOT (analogous but uses Jacobian)
- Special case: sum of random variables (convolution). Works for both discrete and continuous RVs.
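The discrete convolution special case can be sketched directly; here the assumed example is the sum of two independent fair dice:

```python
from collections import defaultdict

# Sum of two independent discrete RVs via convolution:
# P(X + Y = s) = sum over x of P(X = x) * P(Y = s - x)
die = {x: 1 / 6 for x in range(1, 7)}

total = defaultdict(float)
for x, px in die.items():
    for y, py in die.items():
        total[x + y] += px * py

# The mode of the sum is 7, with probability 6/36.
```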
- Foundational processes/distributions
- Bernoulli process/RV, binomial RV, geometric RV, negative binomial RV, etc.
- Multinomial process/RV; categorical (“multinoulli”) RV.
- Poisson process (limiting case of binomial), Poisson distribution, exponential distribution, Erlang/Gamma distribution
- Gaussian distribution (different limiting case of binomial, but the derivation is long and we won’t get into it today; also arises from the CLT)
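A numerical illustration of the Poisson distribution as a limiting case of the binomial, with assumed example parameters (rate 3, count 2):

```python
import math

# Poisson(lam) as the limit of Binomial(n, lam/n) for large n.
lam, k, n = 3.0, 2, 10_000
binom = math.comb(n, k) * (lam / n) ** k * (1 - lam / n) ** (n - k)
poisson = math.exp(-lam) * lam ** k / math.factorial(k)
# For n this large the two probabilities agree to several decimal places.
```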
- Moment generating functions
- Equivalent to a two-sided Laplace transform, so it does not always exist; in particular, it cannot exist near 0 unless the RV has all finite moments.
- Characteristic functions
- Equivalent to a Fourier transform, so it always exists, but moments cannot be recovered as easily from its series expansion.
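A sketch of recovering a moment from the MGF, using the assumed fair-die example; the derivative at 0 is approximated by a central difference rather than a symbolic expansion:

```python
import math

# MGF of a fair die (assumed example): M(t) = E[e^{tX}] = sum e^{tx} p(x).
pmf = {x: 1 / 6 for x in range(1, 7)}

def mgf(t):
    return sum(math.exp(t * x) * p for x, p in pmf.items())

# First moment from the expansion: M'(0) = E[X],
# approximated here by a central difference.
h = 1e-5
m1 = (mgf(h) - mgf(-h)) / (2 * h)   # should be close to 3.5
```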
- Basic estimators
- Definition
- Estimator bias
- Estimator variance
- Estimator MSE
- Sample mean
- “Naive” sample variance
- Unbiased sample variance
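An empirical sketch of estimator bias: the "naive" sample variance (divide by n) underestimates the true variance, while dividing by n − 1 removes the bias. The Uniform(0, 1) population and simulation sizes are assumed example choices; its true variance is 1/12:

```python
import random

random.seed(0)  # fixed seed so the simulation is reproducible

n, trials = 5, 20_000
naive, unbiased = 0.0, 0.0
for _ in range(trials):
    xs = [random.random() for _ in range(n)]
    m = sum(xs) / n
    ss = sum((x - m) ** 2 for x in xs)
    naive += ss / n          # "naive" sample variance
    unbiased += ss / (n - 1)  # unbiased sample variance
naive /= trials
unbiased /= trials
# naive averages to (n-1)/n * 1/12 ~ 0.067; unbiased averages to 1/12 ~ 0.083
```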
- Foundational inequalities
- Union bound
- Markov inequality
- Chebyshev inequality
- Cauchy-Schwarz inequality (for expectations, by analogy with the coordinate-free vector version)
- Jensen’s inequality (I don’t know a general proof—this will just be an intuitive argument)
- Gibbs’ inequality (preview for information theory—won’t drill into entropy yet)
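The Markov and Chebyshev bounds can be checked exactly on a small discrete RV; the fair die and the thresholds below are assumed example values:

```python
import math

# Markov: P(X >= a) <= E[X] / a for nonnegative X.
# Chebyshev: P(|X - mu| >= k * sigma) <= 1 / k^2.
pmf = {x: 1 / 6 for x in range(1, 7)}  # fair die, assumed example
ex = sum(x * p for x, p in pmf.items())
var = sum((x - ex) ** 2 * p for x, p in pmf.items())
sigma = math.sqrt(var)

a = 5
p_tail = sum(p for x, p in pmf.items() if x >= a)      # 2/6
assert p_tail <= ex / a                                 # Markov: 1/3 <= 0.7

k = 1.2
p_dev = sum(p for x, p in pmf.items() if abs(x - ex) >= k * sigma)
assert p_dev <= 1 / k ** 2                              # Chebyshev holds
```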
- Inference preview (not planning to go deep here; will need to dedicate future sessions to particular areas)
- Classical vs Bayesian perspective in a nutshell (raw MLE vs explicit priors, underlying parameter is an (unknown) constant vs RV)
- Conjugate priors (e.g., beta-binomial)
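A minimal sketch of the beta-binomial conjugate update, contrasting the Bayesian posterior mean with the raw MLE; the prior and data below are assumed example values:

```python
# Beta-binomial conjugacy: with a Beta(a, b) prior on a coin's heads
# probability, observing h heads and t tails gives posterior Beta(a+h, b+t).
a, b = 2.0, 2.0          # assumed prior pseudo-counts
h, t = 7, 3              # assumed observed data
post_a, post_b = a + h, b + t

post_mean = post_a / (post_a + post_b)   # 9/14, shrunk toward the prior's 0.5
mle = h / (h + t)                        # 0.7, the classical point estimate
```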
tags: