Subject 5. Estimators

Very often, there are a number of different estimators that can be used to estimate unknown population parameters. When faced with such a choice, it is desirable to know that the estimator chosen is the "best" under the circumstances, that is, it has more desirable properties than any of the other options available to us. There are three desirable properties of estimators:

  • unbiasedness

    An estimator is unbiased if its expected value (the mean of its sampling distribution) equals the parameter it is intended to estimate. For example, the sample mean is an unbiased estimator of the population mean because the expected value of the sample mean is equal to the population mean.

  • efficiency

    An unbiased estimator is efficient if no other unbiased estimator of the same parameter has a sampling distribution with smaller variance. That is, in repeated samples, analysts expect the estimates from an efficient estimator to be more tightly grouped around the parameter than estimates from other unbiased estimators. For example, the sample mean is an efficient estimator of the population mean, and the sample variance is an efficient estimator of the population variance.

  • consistency

    A consistent estimator is one for which the probability of accurate estimates (estimates close to the value of the population parameter) increases as sample size increases. In other words, a consistent estimator's sampling distribution becomes concentrated on the value of the parameter it is intended to estimate as the sample size approaches infinity. As the sample size increases to infinity, the standard error of the sample mean declines to 0 and the sampling distribution concentrates around the population mean. Therefore, the sample mean is a consistent estimator of the population mean.
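
The three properties above can be illustrated with a small simulation. This is only a sketch under stated assumptions: the population is taken to be normal with mean 10 and standard deviation 2, and the helper name simulate_estimator, the sample sizes, and the number of trials are all hypothetical choices made for illustration. The sample median is used as the comparison estimator for efficiency, which is valid for a symmetric population such as this one.

```python
import random
import statistics

def simulate_estimator(estimator, sample_size, trials=2000, mu=10.0, sigma=2.0, seed=0):
    """Draw `trials` samples from an assumed Normal(mu, sigma) population and
    apply `estimator` to each one.

    Returns (average of the estimates, variance of the estimates), which
    approximate the mean and variance of the estimator's sampling distribution.
    """
    rng = random.Random(seed)
    estimates = [
        estimator([rng.gauss(mu, sigma) for _ in range(sample_size)])
        for _ in range(trials)
    ]
    return statistics.fmean(estimates), statistics.pvariance(estimates)

# Unbiasedness: the average of many sample means should sit close to mu = 10.
mean_avg, mean_var = simulate_estimator(statistics.fmean, sample_size=25)

# Efficiency: for normal data the sample median is also centered on mu,
# but its sampling distribution is more spread out than the sample mean's.
median_avg, median_var = simulate_estimator(statistics.median, sample_size=25)

# Consistency: the sampling variance of the sample mean shrinks as n grows.
_, var_small_n = simulate_estimator(statistics.fmean, sample_size=10)
_, var_large_n = simulate_estimator(statistics.fmean, sample_size=400)

print(f"sample mean:   average={mean_avg:.3f}, variance={mean_var:.4f}")
print(f"sample median: average={median_avg:.3f}, variance={median_var:.4f}")
print(f"variance of sample mean at n=10:  {var_small_n:.4f}")
print(f"variance of sample mean at n=400: {var_large_n:.4f}")
```

In a run of this sketch, both estimators should average out near 10, the median should show the larger variance, and the variance of the sample mean should be much smaller at n = 400 than at n = 10.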

A single number calculated from a sample to estimate an unknown population parameter is called a point estimate. The formula used to compute the point estimate is called an estimator. The specific value calculated from sample observations using an estimator is called an estimate. For example, the sample mean is an estimator of the population mean. Suppose two samples are taken from a population and the sample means are 16 and 21, respectively. Then 16 and 21 are two estimates of the population mean. Note that an estimator will yield different estimates as repeated samples are taken from the same population.
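
The distinction between an estimator and an estimate can be made concrete in a few lines. In this sketch the function sample_mean plays the role of the estimator, and the two lists of observations are hypothetical values chosen only so that the resulting estimates match the 16 and 21 mentioned above.

```python
def sample_mean(observations):
    """Estimator: the rule (formula) that maps a sample to a single number."""
    return sum(observations) / len(observations)

# Two hypothetical samples drawn from the same population.
sample_a = [14, 15, 16, 17, 18]
sample_b = [19, 20, 21, 22, 23]

# Estimates: the specific values the estimator produces for each sample.
estimate_a = sample_mean(sample_a)
estimate_b = sample_mean(sample_b)

print(estimate_a, estimate_b)
```

The same estimator is applied both times; only the sample, and therefore the estimate, changes.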

A confidence interval is an interval for which one can assert with a given probability 1 - α, called the degree of confidence, that it will contain the parameter it is intended to estimate. This interval is often referred to as the 100(1 - α)% confidence interval for the parameter, where α is referred to as the level of significance. The end points of a confidence interval are called the lower and upper confidence limits.

For example, suppose that a 95% confidence interval for the population mean is 20 to 40. This means that:

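  • If the sampling procedure were repeated many times, about 95% of the intervals constructed this way would contain the population mean. This is often stated loosely as a 95% probability that the population mean lies in the range of 20 to 40.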
  • "95%" is the degree of confidence.
  • "5%" is the level of significance.
  • 20 and 40 are the lower and upper confidence limits, respectively.
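
The repeated-sampling reading of a confidence level can be checked by simulation. The sketch below builds a two-sided interval for the mean under the simplifying assumption that the population is normal with a known standard deviation, then counts how often the interval contains the true mean. The function names z_confidence_interval and coverage, and the parameter values (true mean 30, standard deviation 15, samples of 30), are hypothetical choices for illustration and are not taken from the example above.

```python
import random
from statistics import NormalDist, fmean

def z_confidence_interval(sample, sigma, alpha=0.05):
    """Two-sided 100(1 - alpha)% interval for the population mean,
    assuming a normal population with known standard deviation `sigma`."""
    z = NormalDist().inv_cdf(1 - alpha / 2)  # about 1.96 when alpha = 0.05
    half_width = z * sigma / len(sample) ** 0.5
    center = fmean(sample)
    return center - half_width, center + half_width

def coverage(trials=5000, n=30, mu=30.0, sigma=15.0, alpha=0.05, seed=1):
    """Fraction of simulated intervals that contain the true mean `mu`."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        sample = [rng.gauss(mu, sigma) for _ in range(n)]
        lower, upper = z_confidence_interval(sample, sigma, alpha)
        hits += lower <= mu <= upper  # True counts as 1
    return hits / trials

rate = coverage()
print(f"share of intervals containing the true mean: {rate:.3f}")
```

The true mean is fixed throughout; what varies from trial to trial is the interval, and the share of intervals that cover the mean should come out near the degree of confidence.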

User Contributed Comments 6

danlan: level of significance = 1-degree of confidence
achu: Note: strictly speaking we really can't say there's a "95% probability" of the mean being between 20-40. See wikipedia.org/Confidence_intervals for a detailed description. But for the exam, I guess it's probably not a big deal.
vsimco: The above is correct, a confidence interval does not imply a probability statement of the estimated parameter being inside it (this is a given -- it is) nor does it give a probability statement of the true mean. You cannot technically say the mean has a 95% probability of being inside a confidence interval. This is WRONG. The mean is either inside or outside the interval, there is no middle ground. THE TRUE MEAN IS NOT A RANDOM VARIABLE. What is being said is that 95% of all CONFIDENCE INTERVALS (note: the interval(S****)) contain the true mean. It's very subtle.
sahilb7: UnEfCo: Unbiased, Efficiency, Consistency
sahilb7: Unbiased: Mean = Intended Parameter
Efficiency: Least variance among all unbiased estimators
Consistency: Converges towards the actual value as the sample size increases
yannick85: you are the best Sahilb7