Prediction Interval and Confidence Interval
Since these concepts can be explained from several angles, I'll walk through a few complementary explanations.
High-level explanation
Confidence intervals tell you how well you have determined the mean. Assume that the data really are randomly sampled from a Gaussian distribution. If you repeat the sampling many times and calculate a confidence interval of the mean from each sample, you'd expect about 95% of those intervals to include the true value of the population mean. The key point is that the confidence interval tells you about the likely location of the true population parameter.
Prediction intervals tell you where you can expect to see the next data point sampled. Assume that the data really are randomly sampled from a Gaussian distribution. Collect a sample of data and calculate a prediction interval. Then sample one more value from the population. If you do this many times, you'd expect that next value to lie within the prediction interval in 95% of the samples. The key point is that the prediction interval tells you about the distribution of values, not the uncertainty in determining the population mean.
Prediction intervals must account for both the uncertainty in estimating the population mean and the scatter (variance) of the data around it. So a prediction interval is always wider than the corresponding confidence interval.
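The repeated-sampling description above is easy to check numerically. Below is a minimal simulation sketch (my own toy setup, not part of the original explanation): it draws many samples from a Gaussian population, builds a 95% confidence interval for the mean and a 95% prediction interval for the next observation from each sample, and counts how often each interval covers its target. The population parameters, sample size, and number of repetitions are arbitrary choices.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
mu, sigma, n, reps = 10.0, 2.0, 30, 10_000
t_crit = stats.t.ppf(0.975, df=n - 1)

ci_hits = pi_hits = 0
for _ in range(reps):
    sample = rng.normal(mu, sigma, size=n)
    xbar, s = sample.mean(), sample.std(ddof=1)

    # 95% confidence interval for the population mean: xbar +/- t * s / sqrt(n)
    ci_half = t_crit * s / np.sqrt(n)
    ci_hits += (xbar - ci_half) <= mu <= (xbar + ci_half)

    # 95% prediction interval for one future value: xbar +/- t * s * sqrt(1 + 1/n)
    pi_half = t_crit * s * np.sqrt(1 + 1 / n)
    next_value = rng.normal(mu, sigma)
    pi_hits += (xbar - pi_half) <= next_value <= (xbar + pi_half)

print(f"CI coverage of the true mean:  {ci_hits / reps:.3f}")  # approx. 0.95
print(f"PI coverage of the next value: {pi_hits / reps:.3f}")  # approx. 0.95
```

Both coverage rates come out near 0.95, but for any given sample the prediction interval is noticeably wider than the confidence interval.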
Standard-error explanation
The difference between a prediction interval and a confidence interval lies in the standard error used to construct each one.
The standard error for a confidence interval on the mean accounts for the uncertainty due to sampling: the line you computed from your sample will differ from the line you would have computed with the entire population, and the standard error takes this uncertainty into account.
The standard error for a prediction interval on an individual observation accounts for that same sampling uncertainty, but also for the variability of individual observations around the predicted mean. The standard error for the prediction interval is therefore larger, and hence the prediction interval is wider than the confidence interval.
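In regression software the two intervals are usually reported side by side. Here is a sketch using statsmodels on synthetic data of my own (the true line and noise level are arbitrary, and the column names assume a reasonably recent statsmodels release): the mean_ci_* columns hold the confidence interval for the mean response, the obs_ci_* columns hold the prediction interval for a new observation, and the latter pair is always wider.

```python
# Sketch: confidence vs. prediction intervals from a fitted OLS model (statsmodels).
# Synthetic data: y = 1 + 0.5*x plus Gaussian noise.
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(1)
x = np.linspace(0, 10, 50)
y = 1.0 + 0.5 * x + rng.normal(0, 1.0, size=x.size)

X = sm.add_constant(x)            # design matrix: intercept column + x
results = sm.OLS(y, X).fit()

# Evaluate both intervals at a few new x values.
x_new = sm.add_constant(np.array([2.0, 5.0, 8.0]))
frame = results.get_prediction(x_new).summary_frame(alpha=0.05)
print(frame[["mean",
             "mean_ci_lower", "mean_ci_upper",   # CI for the mean response
             "obs_ci_lower", "obs_ci_upper"]])   # PI for a new observation
```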
Mathematical Explanation
Let’s consider a simple linear regression model:
\[y = \alpha + \beta x + \epsilon\]
where y is the response variable, x is the explanatory variable, α and β are the model parameters, and ε is the random error term. This error term is assumed to be normally distributed with a mean of 0 and a variance of σ².
Since we typically don't know the true error variance σ², we must estimate it from our sample data. We use the Mean Squared Error (MSE) for this, which represents the average of the squared differences between the observed and predicted values.
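For a simple linear regression fitted to n observations, with \(\hat{y}_i\) denoting the fitted values, this estimate is
\[MSE = \frac{\sum_{i=1}^{n}(y_i - \hat{y}_i)^2}{n - 2}\]
where the n − 2 in the denominator reflects the two estimated parameters (α and β); this is also why the critical values in the intervals below come from a t distribution with n − 2 degrees of freedom.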
Confidence Interval for the Mean Response
A confidence interval provides a range for the average value of y at a given x. The formula is:
\[\hat{y} \pm t_{n-2, \alpha/2} \cdot \sqrt{MSE \left( \frac{1}{n} + \frac{(x - \bar{x})^2}{\sum(x_i - \bar{x})^2} \right)}\]
The term inside the square root represents the variance of our prediction for the mean response. It’s composed of two parts:
- Uncertainty in the intercept (α): the 1/n term reflects the variability in estimating the model's intercept.
- Uncertainty in the slope (β): the second term, (x - x̄)² / Σ(xᵢ - x̄)², accounts for the variability in estimating the slope. This term increases as x moves further from the data's center (x̄), making predictions less certain at the extremes.
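To make that second point concrete, here is a small sketch on toy data of my own: it fits the regression by hand and then evaluates the confidence-interval half-width at a few x values, showing how the interval widens as x moves away from x̄.

```python
# Sketch: the CI for the mean response widens as x0 moves away from x_bar.
# Toy data; the x0 values at which the interval is evaluated are arbitrary.
import numpy as np
from scipy import stats

x = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0])
y = np.array([2.1, 2.9, 4.2, 4.8, 6.1, 6.9])
n = len(x)

sxx = np.sum((x - x.mean()) ** 2)
beta = np.sum((x - x.mean()) * (y - y.mean())) / sxx   # slope estimate
alpha_hat = y.mean() - beta * x.mean()                 # intercept estimate
mse = np.sum((y - (alpha_hat + beta * x)) ** 2) / (n - 2)
t_crit = stats.t.ppf(0.975, df=n - 2)

for x0 in [x.mean(), 5.0, 8.0]:      # at the center of the data, then further out
    half_width = t_crit * np.sqrt(mse * (1 / n + (x0 - x.mean()) ** 2 / sxx))
    print(f"x0 = {x0:4.1f} -> 95% CI half-width for the mean response: {half_width:.3f}")
```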
Prediction Interval for a New Observation
A prediction interval provides a range for a single future observation. It must account for the same uncertainty as the confidence interval, plus the inherent variability of an individual data point. The formula is:
\[\hat{y} \pm t_{n-2, \alpha/2} \cdot \sqrt{MSE \left( 1 + \frac{1}{n} + \frac{(x - \bar{x})^2}{\sum(x_i - \bar{x})^2} \right)}\]
Notice the extra 1 inside the square root. It accounts for the irreducible error (σ², estimated by MSE) of a single observation. This additional source of variance is why the prediction interval is always wider than the confidence interval.
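Continuing the same toy setup as the sketch above (repeated here so the snippet runs on its own), the only change for the prediction interval is the extra 1 under the square root, which makes its half-width strictly larger at every x.

```python
# Sketch: CI vs. PI half-widths at the same point x0; the PI adds "1 +" under the root.
import numpy as np
from scipy import stats

x = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0])
y = np.array([2.1, 2.9, 4.2, 4.8, 6.1, 6.9])
n = len(x)

sxx = np.sum((x - x.mean()) ** 2)
beta = np.sum((x - x.mean()) * (y - y.mean())) / sxx
alpha_hat = y.mean() - beta * x.mean()
mse = np.sum((y - (alpha_hat + beta * x)) ** 2) / (n - 2)
t_crit = stats.t.ppf(0.975, df=n - 2)

x0 = 4.5                                   # arbitrary point at which to compare
leverage = 1 / n + (x0 - x.mean()) ** 2 / sxx
ci_half = t_crit * np.sqrt(mse * leverage)        # confidence interval half-width
pi_half = t_crit * np.sqrt(mse * (1 + leverage))  # prediction interval half-width
print(f"CI half-width: {ci_half:.3f}")
print(f"PI half-width: {pi_half:.3f}")            # always larger than ci_half
```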