Model:
$Y = \theta_0^* + \theta_1^* X_1 + \cdots + \theta_m^* X_m + \text{noise}$
We obtain an estimate $\hat{\theta}$.
A new $X$ comes in and we want to predict $Y$:
$\hat{Y} = \hat{\theta}_0 + \hat{\theta}_1 X_1 + \cdots + \hat{\theta}_m X_m$
Sources of Uncertainty:
- unavoidable, due to unknowable noise
- due to estimation errors, $\hat{\theta} - \theta^*$
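Both sources can be read off the prediction error directly. A short sketch, assuming $X = (1, X_1, \ldots, X_m)$ carries a constant first entry and writing $W$ for the noise on the new observation (a symbol introduced here, not used elsewhere in these notes):

$$Y - \hat{Y} = \left(\theta^{*T} X + W\right) - \hat{\theta}^T X = W + (\theta^* - \hat{\theta})^T X$$

The first term is the noise, which no estimator can remove. The second term is driven by the estimation error and shrinks as $\hat{\theta}$ approaches $\theta^*$.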
Since we know the variance of the estimation errors, we can construct a confidence band: a range of values within which the regression line is likely to fall.
Linear regression is a gateway to linear models: a versatile and effective way to analyse data.
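The confidence band has varying width: it is narrowest near the centre of the observed $X$ values and widens away from it.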
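The varying width can be checked numerically. The sketch below is illustrative only: the data are synthetic, the single-regressor setup and the name `band_half_width` are assumptions made for the example, and the noise variance is estimated from residuals rather than taken as known.

```python
import numpy as np

# Synthetic data (hypothetical): one regressor plus an intercept.
rng = np.random.default_rng(2)
n = 100
x = rng.uniform(0, 10, size=n)
X = np.column_stack([np.ones(n), x])          # design matrix with a column of ones
Y = 1.0 + 0.5 * x + rng.normal(scale=1.0, size=n)

XtX_inv = np.linalg.inv(X.T @ X)
theta_hat = XtX_inv @ X.T @ Y                 # least-squares estimate
resid = Y - X @ theta_hat
sigma2_hat = resid @ resid / (n - 2)          # residual-based noise variance estimate

def band_half_width(x0):
    """Approximate 95% half-width of the band for the regression line at x0."""
    v = np.array([1.0, x0])
    return 2.0 * np.sqrt(sigma2_hat * (v @ XtX_inv @ v))

w_center = band_half_width(x.mean())          # at the centre of the data
w_edge = band_half_width(10.0)                # at the edge of the observed range
w_far = band_half_width(20.0)                 # extrapolating well beyond the data
print(w_center, w_edge, w_far)
```

The half-width grows with the distance of `x0` from the sample mean of `x`, which is why extrapolated predictions are less certain.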
Key Formulas:
- estimator: $\hat{\theta} = (X^T X)^{-1} X^T Y$
(mean, variance, $\hat{\theta}$)
- predictor: $\hat{Y} = \hat{\theta}^T X$
(value $\hat{Y}$)
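As a minimal sketch of the two formulas above, assuming synthetic data and hypothetical true coefficients `theta_star` (neither comes from these notes), the estimator and predictor can be computed with NumPy:

```python
import numpy as np

# Synthetic data (hypothetical): n samples, m = 2 regressors.
rng = np.random.default_rng(0)
n, m = 200, 2
theta_star = np.array([1.0, 2.0, -0.5])       # assumed "true" parameters
features = rng.normal(size=(n, m))
X = np.column_stack([np.ones(n), features])   # column of ones handles theta_0
Y = X @ theta_star + rng.normal(scale=0.3, size=n)

# Estimator: theta_hat = (X^T X)^{-1} X^T Y.
# Solving the normal equations avoids forming the inverse explicitly.
theta_hat = np.linalg.solve(X.T @ X, X.T @ Y)

# Predictor: Y_hat = theta_hat^T x_new, with a leading 1 for the intercept.
x_new = np.array([1.0, 0.5, -1.0])
y_hat = theta_hat @ x_new
print(theta_hat, y_hat)
```

For ill-conditioned design matrices, `np.linalg.lstsq` is usually a more stable choice than the normal equations; the form above is kept because it mirrors the formula.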
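Standard error (SE): the standard deviation of $\hat{\theta}_j$.
The SE measures how far the estimate typically falls from the true value $\theta_j^*$ across repeated samples, just as the SE of a sample mean measures its typical distance from the population mean.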
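95% CI: $\text{CI} = [\hat{\theta}_j - 2\sigma_j, \hat{\theta}_j + 2\sigma_j]$, where $\sigma_j$ is the SE of $\hat{\theta}_j$ and the factor 2 approximates the normal quantile 1.96.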
Wald test: reject "$\theta_j^* = 0$" if $0 \notin \text{CI}$
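The standard errors, confidence intervals, and Wald test can be sketched together. This is an illustration under stated assumptions: the data are synthetic, the third true coefficient is set to zero on purpose, and $\sigma_j$ is approximated by plugging a residual-based noise variance estimate into $\hat{\sigma}^2 (X^T X)^{-1}$.

```python
import numpy as np

# Synthetic data (hypothetical): the last true coefficient is exactly zero.
rng = np.random.default_rng(1)
n = 500
X = np.column_stack([np.ones(n), rng.normal(size=(n, 2))])
theta_star = np.array([1.0, 2.0, 0.0])
Y = X @ theta_star + rng.normal(scale=0.5, size=n)

XtX_inv = np.linalg.inv(X.T @ X)
theta_hat = XtX_inv @ X.T @ Y

# Estimate the noise variance from residuals (n - p degrees of freedom).
residuals = Y - X @ theta_hat
sigma2_hat = residuals @ residuals / (n - X.shape[1])

# Standard error of each theta_hat_j: square root of the j-th diagonal entry.
se = np.sqrt(sigma2_hat * np.diag(XtX_inv))

# Approximate 95% confidence intervals: theta_hat_j +/- 2 * se_j.
ci_low = theta_hat - 2 * se
ci_high = theta_hat + 2 * se

# Wald test: reject "theta_j^* = 0" when 0 lies outside the interval.
reject = (ci_low > 0) | (ci_high < 0)
for j in range(len(theta_hat)):
    print(j, theta_hat[j], (ci_low[j], ci_high[j]), reject[j])
```

Because the test has roughly a 5% false-rejection rate, the zero coefficient will still be flagged occasionally on other random draws.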