# Confidence Interval

Suppose we want to estimate the mean age of students in a college. Our population comprises all the students in the college. Since noting down the age of every student is very time consuming, we take a sample of some students (say 10) and estimate the mean age of all students using the sample mean of those 10 students.

There are two types of estimates:

1) The mean age of students is 20 years -------- Point Estimate

2) The mean age lies between 18 and 22 years -------- Interval Estimate

Now let’s see the formal definitions:

__Point Estimate__: Point estimation uses sample data to calculate a single value (known as a point estimate, since it identifies a point in some parameter space) that serves as a "best guess" or "best estimate" of an unknown population parameter (for example, the population mean).

__Interval Estimate__: Interval estimation is the use of sample data to calculate an interval of possible values of an unknown population parameter; this is in contrast to point estimation, which gives a single value.

Suppose we take 100 samples, each of size 10. For each of these samples we calculate the sample mean, and together these means form a sampling distribution. Provided the population is not heavily skewed, this sampling distribution turns out to be approximately Normal.
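The repeated-sampling idea can be tried out with a small simulation. The sketch below is only an illustration: the population (5,000 students with ages spread uniformly between 17 and 25) is an assumption made up for the example, not data from a real college.

```python
import random
import statistics

random.seed(0)  # fixed seed so the run is repeatable

# Hypothetical population: 5,000 students with ages drawn uniformly from 17 to 25.
population = [random.randint(17, 25) for _ in range(5000)]
population_mean = statistics.mean(population)

# Take 100 samples of size 10 and record each sample mean.
sample_means = [
    statistics.mean(random.sample(population, 10))
    for _ in range(100)
]

# The sample means cluster around the population mean, with a spread close to
# (population standard deviation) / sqrt(sample size).
print("population mean:      ", round(population_mean, 2))
print("mean of sample means: ", round(statistics.mean(sample_means), 2))
print("std of sample means:  ", round(statistics.stdev(sample_means), 2))
print("sigma / sqrt(10):     ", round(statistics.pstdev(population) / 10 ** 0.5, 2))
```

Plotting a histogram of `sample_means` should show a roughly bell-shaped curve centered near the population mean, even though the individual ages here are not Normally distributed.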

*Now let’s see what confidence intervals are.*

A confidence interval is an estimated range of values that seems reasonable based on what we have observed. Its center is still the sample mean, but we leave some room on either side for uncertainty. So if we say the mean age of students is 18-22 years, there is uncertainty attached to that statement.

A 95% confidence interval means that if we calculate a confidence interval from each of 100 different samples, about 95 of those intervals would contain the true population mean.
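This "about 95 out of 100" reading can be checked by simulation. The sketch below makes several assumptions that are not in the text above: ages are Normally distributed with a true mean of 20 and a standard deviation of 2, each interval is a t-interval built from a sample of size 10, and 1,000 repetitions are used instead of 100 so that the percentage is more stable. The value 2.262 is the standard t critical value for 95% confidence with 9 degrees of freedom.

```python
import random
import statistics

random.seed(1)  # fixed seed so the run is repeatable

TRUE_MEAN = 20.0  # hypothetical true mean age
TRUE_SD = 2.0     # hypothetical population standard deviation
N = 10            # sample size
T_CRIT = 2.262    # t critical value, 95% confidence, N - 1 = 9 degrees of freedom
TRIALS = 1000     # number of repeated samples


def confidence_interval(sample):
    """Return the (lower, upper) 95% t-interval for the mean of a size-10 sample."""
    mean = statistics.mean(sample)
    standard_error = statistics.stdev(sample) / len(sample) ** 0.5
    margin = T_CRIT * standard_error
    return mean - margin, mean + margin


# Count how many of the intervals actually contain the true mean.
hits = 0
for _ in range(TRIALS):
    sample = [random.gauss(TRUE_MEAN, TRUE_SD) for _ in range(N)]
    lower, upper = confidence_interval(sample)
    if lower <= TRUE_MEAN <= upper:
        hits += 1

coverage = hits / TRIALS
print(f"{coverage:.1%} of the intervals contain the true mean")
```

The printed percentage should land close to 95%. Note that the 95% describes the procedure over many samples; any single interval either contains the true mean or it does not.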

*So if we estimate the mean age of students this way, we will be correct about 95% of the time. (But we will be wrong about 5% of the time too!)*

*So why don’t we take 100% confidence intervals?*

To be 100% confident in a finite interval, we would need to examine the entire population. We can always take larger samples so that our estimate is very close to the true population parameter, but that requires a lot of time and money.

A 100% confidence interval computed from a sample spans the values from -∞ to +∞, i.e. the entire real line, and therefore it always contains the true population parameter. So it is not of much use. We should find a confidence interval that is narrow enough to be useful and wide enough to be likely to contain the population parameter. We need to balance accuracy and precision: by sacrificing a little accuracy we gain a lot of precision. A 95% confidence interval gives us a more useful range that is not infinitely long.
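To see this trade-off in numbers, the sketch below computes the half-width of the interval at several confidence levels. It assumes, purely for illustration, a known population standard deviation of 2 years, a sample of size 10, and a sample mean of 20, and it uses Normal (z) critical values; `margin_for` is a hypothetical helper name.

```python
import math
from statistics import NormalDist

SIGMA = 2.0  # hypothetical known population standard deviation (years)
N = 10       # sample size


def margin_for(confidence):
    """Half-width of a z-interval for the mean at the given confidence level."""
    if confidence >= 1.0:
        return math.inf  # a 100% interval must cover the whole real line
    z = NormalDist().inv_cdf(0.5 + confidence / 2)
    return z * SIGMA / math.sqrt(N)


levels = [0.80, 0.90, 0.95, 0.99, 0.999, 1.0]
margins = [margin_for(level) for level in levels]
for level, margin in zip(levels, margins):
    print(f"{level:>6.1%} confidence -> 20 ± {margin:.2f} years")
```

The interval widens steadily as the confidence level rises and becomes infinite at 100%, which is why a level such as 95% is a common compromise.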

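__Margin of error__: The margin of error reflects the uncertainty that surrounds sample estimates of population parameters. For an interval centered on the sample mean, it is half the width of the confidence interval.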

In our example of estimating the age of students in a college, our sample mean is 20 years.

We have taken a confidence interval of 18-22 years, i.e. 20 ± 2 years. So here our margin of error is 2 years.
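As a rough illustration of where a margin of error comes from, the sketch below computes one for a single sample. The ten ages are made up for the example (chosen so that the sample mean is 20), and the interval is a t-interval, with 2.262 being the standard t critical value for 95% confidence with 9 degrees of freedom.

```python
import math
import statistics

# Hypothetical sample of 10 student ages (illustrative, not real data).
ages = [16, 17, 18, 19, 20, 20, 21, 22, 23, 24]

T_CRIT = 2.262  # t critical value, 95% confidence, 9 degrees of freedom

sample_mean = statistics.mean(ages)
standard_error = statistics.stdev(ages) / math.sqrt(len(ages))
margin_of_error = T_CRIT * standard_error

lower = sample_mean - margin_of_error
upper = sample_mean + margin_of_error
print(f"sample mean     = {sample_mean}")
print(f"margin of error = {margin_of_error:.2f}")
print(f"95% interval    = ({lower:.2f}, {upper:.2f})")
```

For this made-up sample the margin of error comes out a little under 2 years, so the interval is close to the 18-22 range used in the example. A larger sample or less variable ages would shrink the margin.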

Author: Tanya Gupta

https://www.linkedin.com/in/tanya-gupta-805407160