A standard error of measurement, often denoted SEm, estimates the variation around a “true” score for an individual when repeated measures are taken.
It is calculated as:
SEm = s × √(1 − R)
where:
- s: The standard deviation of measurements
- R: The reliability coefficient of a test
Note that a reliability coefficient ranges from 0 to 1. One common way to estimate it (test-retest reliability) is to administer a test to many individuals twice and calculate the correlation between their two sets of scores.
The higher the reliability coefficient, the more often a test produces consistent scores.
Example: Calculating a Standard Error of Measurement
Suppose an individual takes a test that aims to measure overall intelligence on a scale of 0 to 100, and takes it 10 times over the course of a week. They receive the following scores:
Scores: 88, 90, 91, 94, 86, 88, 84, 90, 90, 94
The sample mean is 89.5 and the sample standard deviation is 3.17.
If the test is known to have a reliability coefficient of 0.88, then we would calculate the standard error of measurement as:
SEm = s × √(1 − R) = 3.17 × √(1 − 0.88) = 1.098
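The calculation above can be reproduced with a short Python sketch. It assumes the sample standard deviation (n − 1 denominator) is the intended s, which matches the 3.17 quoted in the example; the variable names are illustrative only.

```python
import math
import statistics

scores = [88, 90, 91, 94, 86, 88, 84, 90, 90, 94]
reliability = 0.88  # reliability coefficient R from the example

s = statistics.stdev(scores)          # sample standard deviation (n - 1 denominator)
sem = s * math.sqrt(1 - reliability)  # SEm = s * sqrt(1 - R)

print(round(statistics.mean(scores), 1), round(s, 2), round(sem, 3))
```

The three printed values are the sample mean, the sample standard deviation, and the SEm.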
How to Use SEm to Create Confidence Intervals
Using the standard error of measurement, we can create a confidence interval that is likely to contain the “true” score of an individual on a certain test with a certain degree of confidence.
If an individual receives a score of x on a test, we can use the following formulas to calculate approximate confidence intervals for this score, assuming the measurement errors are roughly normally distributed:
- 68% Confidence Interval = [x – SEm, x + SEm]
- 95% Confidence Interval = [x – 2*SEm, x + 2*SEm]
- 99% Confidence Interval = [x – 3*SEm, x + 3*SEm]
For example, suppose an individual scores 92 on a certain test that is known to have an SEm of 2.5. We could calculate a 95% confidence interval as:
- 95% Confidence Interval = [92 – 2*2.5, 92 + 2*2.5] = [87, 97]
This means we are 95% confident that an individual’s “true” score on this test is between 87 and 97.
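As a minimal sketch, the interval formulas can be wrapped in one small function. The name `sem_interval` and its `multiplier` parameter are hypothetical, introduced here only for illustration; the multipliers 1, 2 and 3 correspond to the 68%, 95% and 99% intervals listed above.

```python
def sem_interval(score, sem, multiplier=2):
    """Return (lower, upper) = score -/+ multiplier * SEm."""
    margin = multiplier * sem
    return (score - margin, score + margin)

# Score of 92 on a test with SEm = 2.5, using the 95% rule (multiplier 2).
lower, upper = sem_interval(92, 2.5, multiplier=2)
print(lower, upper)  # 87.0 97.0
```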
Reliability & Standard Error of Measurement
There exists a simple relationship between the reliability coefficient of a test and the standard error of measurement:
- The higher the reliability coefficient, the lower the standard error of measurement.
- The lower the reliability coefficient, the higher the standard error of measurement.
To illustrate this, consider an individual who takes a test 10 times and has a standard deviation of scores of 2.
If the test has a reliability coefficient of 0.9, then the standard error of measurement would be calculated as:
- SEm = s × √(1 − R) = 2 × √(1 − 0.9) = 0.632
However, if the test has a reliability coefficient of 0.5, then the standard error of measurement would be calculated as:
- SEm = s × √(1 − R) = 2 × √(1 − 0.5) = 1.414
This should make sense intuitively: If the scores of a test are less reliable, then the error in the measurement of the “true” score will be higher.
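To see the relationship across a range of values, the following sketch evaluates the same formula for several reliability coefficients with s fixed at 2. The helper name `sem_from_reliability` and the extra reliability values 0.7 and 0.99 are assumptions added for illustration.

```python
import math

def sem_from_reliability(s, reliability):
    """SEm = s * sqrt(1 - R) for standard deviation s and reliability R."""
    return s * math.sqrt(1 - reliability)

s = 2  # standard deviation of scores from the example
for reliability in (0.5, 0.7, 0.9, 0.99):
    sem = sem_from_reliability(s, reliability)
    print(f"R = {reliability:.2f} -> SEm = {sem:.3f}")
```

The printed SEm shrinks as R approaches 1, and it would equal s itself for R = 0.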
[Figure: for a value that is sampled with an unbiased, normally distributed error, the proportion of samples that would fall between 0, 1, 2, and 3 standard deviations above and below the actual value.]
The standard error (SE)[1] of a statistic (usually an estimate of a parameter) is the standard deviation of its sampling distribution[2] or an estimate of that standard deviation. If the statistic is the sample mean, it is called the standard error of the mean (SEM).[1]
The sampling distribution of a mean is generated by repeated sampling from the same population and recording of the sample means obtained. This forms a distribution of different means, and this distribution has its own mean and variance. Mathematically, the variance of the sampling mean distribution obtained is equal to the variance of the population divided by the sample size. This is because as the sample size increases, sample means cluster more closely around the population mean.
Therefore, the relationship between the standard error of the mean and the standard deviation is such that, for a given sample size, the standard error of the mean equals the standard deviation divided by the square root of the sample size.[1] In other words, the standard error of the mean is a measure of the dispersion of sample means around the population mean.
In regression analysis, the term "standard error" refers either to the square root of the reduced chi-squared statistic or the standard error for a particular regression coefficient (as used in, say, confidence intervals).
Standard error of the sample mean
Exact value
Suppose a statistically independent sample of $n$ observations $x_1, x_2, \ldots, x_n$ is taken from a statistical population with a standard deviation of $\sigma$. The mean value calculated from the sample, $\bar{x}$, will have an associated standard error on the mean, $\sigma_{\bar{x}}$, given by:[1]
$$\sigma_{\bar{x}} = \frac{\sigma}{\sqrt{n}}.$$
Practically this tells us that when trying to estimate the value of a population mean, due to the factor $1/\sqrt{n}$, reducing the error on the estimate by a factor of two requires acquiring four times as many observations in the sample; reducing it by a factor of ten requires a hundred times as many observations.
Estimate
The standard deviation $\sigma$ of the population being sampled is seldom known. Therefore, the standard error of the mean is usually estimated by replacing $\sigma$ with the sample standard deviation $s$ instead:
$$\sigma_{\bar{x}} \approx \frac{s}{\sqrt{n}}.$$
As this is only an estimator for the true "standard error", it is common to see other notations here such as:
$$\widehat{\sigma}_{\bar{x}} = \frac{s}{\sqrt{n}} \quad \text{or alternately} \quad s_{\bar{x}} = \frac{s}{\sqrt{n}}.$$
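A minimal Python sketch of this estimate, dividing the sample standard deviation by the square root of the sample size. The function name `standard_error_of_mean` and the data values are illustrative assumptions.

```python
import math
import statistics

def standard_error_of_mean(sample):
    """Estimate the standard error of the mean as s / sqrt(n)."""
    n = len(sample)
    s = statistics.stdev(sample)  # sample standard deviation (n - 1 denominator)
    return s / math.sqrt(n)

sample = [88, 90, 91, 94, 86, 88, 84, 90, 90, 94]  # illustrative data
print(round(standard_error_of_mean(sample), 3))
```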
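A common source of confusion occurs when failing to distinguish clearly between:
- the standard deviation of the population ($\sigma$),
- the standard deviation of the sample ($s$),
- the standard deviation of the mean itself ($\sigma_{\bar{x}}$, which is the standard error), and
- the estimator of the standard deviation of the mean ($\widehat{\sigma}_{\bar{x}}$, which is the quantity most often calculated and is also often colloquially called the standard error).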
Accuracy of the estimator
When the sample size is small, using the standard deviation of the sample instead of the true standard deviation of the population will tend to systematically underestimate the population standard deviation, and therefore also the standard error. With n = 2, the underestimate is about 25%, but for n = 6, the underestimate is only 5%. Gurland and Tripathi (1971) provide a correction and equation for this effect.[3] Sokal and Rohlf (1981) give an equation of the correction factor for small samples of n < 20.[4] See unbiased estimation of standard deviation for further discussion.
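Derivation
The standard error on the mean may be derived from the variance of a sum of independent random variables,[5] given the definition of variance and some simple properties thereof. If $x_1, x_2, \ldots, x_n$ are $n$ independent samples from a population with mean $\mu$ and standard deviation $\sigma$, then we can define the total
$$T = x_1 + x_2 + \cdots + x_n,$$
which due to the Bienaymé formula, will have variance
$$\operatorname{Var}(T) = \operatorname{Var}(x_1) + \operatorname{Var}(x_2) + \cdots + \operatorname{Var}(x_n) = n\sigma^2,$$
where we’ve approximated the standard deviations, i.e., the uncertainties, of the measurements themselves with the best value for the standard deviation of the population. The mean of these measurements, $\bar{x}$, is simply given by
$$\bar{x} = \frac{T}{n}.$$
The variance of the mean is then
$$\operatorname{Var}(\bar{x}) = \operatorname{Var}\!\left(\frac{T}{n}\right) = \frac{1}{n^2}\operatorname{Var}(T) = \frac{1}{n^2}\,n\sigma^2 = \frac{\sigma^2}{n}.$$
The standard error is, by definition, the standard deviation of $\bar{x}$, which is simply the square root of the variance:
$$\sigma_{\bar{x}} = \sqrt{\frac{\sigma^2}{n}} = \frac{\sigma}{\sqrt{n}}.$$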
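The result can be checked numerically. The sketch below is a simulation under assumed values (normal population with mean 50 and σ = 10, samples of size n = 25), none of which come from the text: it draws many samples, records each sample mean, and compares the spread of those means with σ/√n.

```python
import math
import random
import statistics

random.seed(0)  # fixed seed so the run is repeatable

sigma, n, trials = 10.0, 25, 20000  # assumed population SD, sample size, repetitions

# Draw many samples of size n and record each sample mean.
means = [
    statistics.fmean(random.gauss(50.0, sigma) for _ in range(n))
    for _ in range(trials)
]

empirical = statistics.pstdev(means)  # spread of the simulated sample means
theoretical = sigma / math.sqrt(n)    # sigma / sqrt(n) = 2.0
print(round(empirical, 2), theoretical)
```

The empirical value should land close to 2.0, with small run-to-run differences if the seed is changed.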
For correlated random variables the sample variance needs to be computed according to the Markov chain central limit theorem.
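Independent and identically distributed random variables with random sample size
There are cases when a sample is taken without knowing, in advance, how many observations will be acceptable according to some criterion. In such cases, the sample size $N$ is a random variable whose variation adds to the variation of $X$ such that,
$$\operatorname{Var}(T) = \operatorname{E}(N)\operatorname{Var}(X) + \operatorname{Var}(N)\bigl(\operatorname{E}(X)\bigr)^2$$[6]
If $N$ has a Poisson distribution, then $\operatorname{E}(N) = \operatorname{Var}(N)$ with estimator $n = N$. Hence the estimator of $\operatorname{Var}(T)$ becomes $nS_X^2 + n\bar{X}^2$, leading to the following formula for standard error:
$$\operatorname{Standard\ Error}(\bar{X}) = \sqrt{\frac{S_X^2 + \bar{X}^2}{n}}$$
(since the standard deviation is the square root of the variance)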
Student approximation when σ value is unknown
In many practical applications, the true value of σ is unknown. As a result, we need to use a distribution that takes into account that spread of possible σ’s.
When the true underlying distribution is known to be Gaussian, although with unknown σ, then the resulting estimated distribution follows the Student t-distribution. The standard error is the standard deviation of the Student t-distribution. T-distributions are slightly different from Gaussian, and vary depending on the size of the sample. Small samples are somewhat more likely to underestimate the population standard deviation and have a mean that differs from the true population mean, and the Student t-distribution accounts for the probability of these events with somewhat heavier tails compared to a Gaussian. To estimate the standard error of a Student t-distribution it is sufficient to use the sample standard deviation "s" instead of σ, and we could use this value to calculate confidence intervals.
Note: The Student’s probability distribution is approximated well by the Gaussian distribution when the sample size is over 100. For such samples one can use the latter distribution, which is much simpler.
Assumptions and usage
An example of how $\operatorname{SE}$ is used is to make confidence intervals of the unknown population mean. If the sampling distribution is normally distributed, the sample mean, the standard error, and the quantiles of the normal distribution can be used to calculate confidence intervals for the true population mean. The following expressions can be used to calculate the upper and lower 95% confidence limits, where $\bar{x}$ is equal to the sample mean, $\operatorname{SE}$ is equal to the standard error for the sample mean, and 1.96 is the approximate value of the 97.5 percentile point of the normal distribution:
- Upper 95% limit $= \bar{x} + (\operatorname{SE} \times 1.96)$, and
- Lower 95% limit $= \bar{x} - (\operatorname{SE} \times 1.96)$.
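The two limits can be sketched in Python as follows. The function name `mean_confidence_interval` and the data are illustrative assumptions, and the standard error is estimated from the sample as s/√n. For a sample this small, a Student t quantile would usually replace 1.96; the normal quantile is kept here only to mirror the expressions above.

```python
import math
import statistics

def mean_confidence_interval(sample, z=1.96):
    """Return (lower, upper) limits: mean -/+ z * (s / sqrt(n))."""
    n = len(sample)
    mean = statistics.fmean(sample)
    se = statistics.stdev(sample) / math.sqrt(n)  # estimated standard error of the mean
    return (mean - z * se, mean + z * se)

sample = [88, 90, 91, 94, 86, 88, 84, 90, 90, 94]  # illustrative data
lower, upper = mean_confidence_interval(sample)
print(round(lower, 2), round(upper, 2))
```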
In particular, the standard error of a sample statistic (such as sample mean) is the actual or estimated standard deviation of the sample mean in the process by which it was generated. In other words, it is the actual or estimated standard deviation of the sampling distribution of the sample statistic. The notation for standard error can be any one of SE, SEM (for standard error of measurement or mean), or $S_E$.
Standard errors provide simple measures of uncertainty in a value and are often used because:
- in many cases, if the standard error of several individual quantities is known then the standard error of some function of the quantities can be easily calculated;
- when the probability distribution of the value is known, it can be used to calculate an exact confidence interval;
- when the probability distribution is unknown, Chebyshev’s or the Vysochanskiï–Petunin inequalities can be used to calculate a conservative confidence interval; and
- as the sample size tends to infinity the central limit theorem guarantees that the sampling distribution of the mean is asymptotically normal.
Standard error of mean versus standard deviation
In scientific and technical literature, experimental data are often summarized either using the mean and standard deviation of the sample data or the mean with the standard error. This often leads to confusion about their interchangeability. However, the mean and standard deviation are descriptive statistics, whereas the standard error of the mean is descriptive of the random sampling process. The standard deviation of the sample data is a description of the variation in measurements, while the standard error of the mean is a probabilistic statement about how the sample size will provide a better bound on estimates of the population mean, in light of the central limit theorem.[7]
Put simply, the standard error of the sample mean is an estimate of how far the sample mean is likely to be from the population mean, whereas the standard deviation of the sample is the degree to which individuals within the sample differ from the sample mean.[8] If the population standard deviation is finite, the standard error of the mean of the sample will tend to zero with increasing sample size, because the estimate of the population mean will improve, while the standard deviation of the sample will tend to approximate the population standard deviation as the sample size increases.
Extensions
Finite population correction (FPC)
The formula given above for the standard error assumes that the population is infinite. Nonetheless, it is often used for finite populations when people are interested in measuring the process that created the existing finite population (this is called an analytic study). Though the above formula is not exactly correct when the population is finite, the difference between the finite- and infinite-population versions will be small when the sampling fraction is small (e.g. a small proportion of a finite population is studied). In this case people often do not correct for the finite population, essentially treating it as an "approximately infinite" population.
If one is interested in measuring an existing finite population that will not change over time, then it is necessary to adjust for the population size (called an enumerative study). When the sampling fraction (often termed f) is large (approximately at 5% or more) in an enumerative study, the estimate of the standard error must be corrected by multiplying by a "finite population correction" (a.k.a.: FPC):[9][10]
$$\operatorname{FPC} = \sqrt{\frac{N-n}{N-1}}$$
which, for large N:
$$\operatorname{FPC} \approx \sqrt{1 - \frac{n}{N}} = \sqrt{1 - f}$$
to account for the added precision gained by sampling close to a larger percentage of the population. The effect of the FPC is that the error becomes zero when the sample size n is equal to the population size N.
This happens in survey methodology when sampling without replacement. If sampling with replacement, then FPC does not come into play.
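A minimal sketch of the correction, assuming the commonly used form FPC = √((N − n)/(N − 1)) for a sample of size n drawn without replacement from a population of size N. The function names and the numbers in the usage line are hypothetical.

```python
import math

def finite_population_correction(n, N):
    """FPC = sqrt((N - n) / (N - 1)) for sample size n and population size N."""
    return math.sqrt((N - n) / (N - 1))

def corrected_standard_error(s, n, N):
    """Standard error s / sqrt(n), multiplied by the finite population correction."""
    return (s / math.sqrt(n)) * finite_population_correction(n, N)

# Sampling 100 units out of 1000 (sampling fraction f = 10%).
print(round(finite_population_correction(100, 1000), 4))
```

When n equals N the factor is exactly zero, which matches the statement that the error vanishes once the whole population has been sampled.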
Correction for correlation in the sample
[Figure: expected error in the mean of A for a sample of n data points with sample bias coefficient ρ. The unbiased standard error plots as the ρ = 0 diagonal line with log-log slope −½.]
If values of the measured quantity A are not statistically independent but have been obtained from known locations in parameter space x, an unbiased estimate of the true standard error of the mean (actually a correction on the standard deviation part) may be obtained by multiplying the calculated standard error of the sample by the factor f:
$$f = \sqrt{\frac{1+\rho}{1-\rho}},$$
where the sample bias coefficient ρ is the widely used Prais–Winsten estimate of the autocorrelation-coefficient (a quantity between −1 and +1) for all sample point pairs. This approximate formula is for moderate to large sample sizes; the reference gives the exact formulas for any sample size, and can be applied to heavily autocorrelated time series like Wall Street stock quotes. Moreover, this formula works for positive and negative ρ alike.[11] See also unbiased estimation of standard deviation for more discussion.
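A small sketch of the approximate correction factor, assuming the form f = √((1 + ρ)/(1 − ρ)) and taking ρ as already estimated. Estimating ρ itself (for example with a Prais–Winsten procedure) is outside this sketch, and the function name is hypothetical.

```python
import math

def correlation_correction_factor(rho):
    """Approximate factor f = sqrt((1 + rho) / (1 - rho)) for autocorrelation rho."""
    if not -1 < rho < 1:
        raise ValueError("rho must lie strictly between -1 and 1")
    return math.sqrt((1 + rho) / (1 - rho))

for rho in (-0.5, 0.0, 0.5):
    print(rho, round(correlation_correction_factor(rho), 3))
```

Positive ρ inflates the standard error (f > 1), negative ρ shrinks it (f < 1), and ρ = 0 leaves it unchanged.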
See also
- Illustration of the central limit theorem
- Margin of error
- Probable error
- Standard error of the weighted mean
- Sample mean and sample covariance
- Standard error of the median
- Variance
References
- [1] Altman, Douglas G; Bland, J Martin (2005-10-15). "Standard deviations and standard errors". BMJ: British Medical Journal. 331 (7521): 903. doi:10.1136/bmj.331.7521.903. ISSN 0959-8138. PMC 1255808. PMID 16223828.
- [2] Everitt, B. S. (2003). The Cambridge Dictionary of Statistics. CUP. ISBN 978-0-521-81099-9.
- [3] Gurland, J; Tripathi RC (1971). "A simple approximation for unbiased estimation of the standard deviation". American Statistician. 25 (4): 30–32. doi:10.2307/2682923. JSTOR 2682923.
- [4] Sokal; Rohlf (1981). Biometry: Principles and Practice of Statistics in Biological Research (2nd ed.). p. 53. ISBN 978-0-7167-1254-1.
- [5] Hutchinson, T. P. (1993). Essentials of Statistical Methods, in 41 pages. Adelaide: Rumsby. ISBN 978-0-646-12621-0.
- [6] Cornell, J R, and Benjamin, C A, Probability, Statistics, and Decisions for Civil Engineers, McGraw-Hill, NY, 1970, ISBN 0486796094, pp. 178–9.
- [7] Barde, M. (2012). "What to use to express the variability of data: Standard deviation or standard error of mean?". Perspect. Clin. Res. 3 (3): 113–116. doi:10.4103/2229-3485.100662. PMC 3487226. PMID 23125963.
- [8] Wassertheil-Smoller, Sylvia (1995). Biostatistics and Epidemiology: A Primer for Health Professionals (Second ed.). New York: Springer. pp. 40–43. ISBN 0-387-94388-9.
- [9] Isserlis, L. (1918). "On the value of a mean as calculated from a sample". Journal of the Royal Statistical Society. 81 (1): 75–81. doi:10.2307/2340569. JSTOR 2340569. (Equation 1)
- [10] Bondy, Warren; Zlot, William (1976). "The Standard Error of the Mean and the Difference Between Means for Finite Populations". The American Statistician. 30 (2): 96–97. doi:10.1080/00031305.1976.10479149. JSTOR 2683803. (Equation 2)
- [11] Bence, James R. (1995). "Analysis of Short Time Series: Correcting for Autocorrelation". Ecology. 76 (2): 628–639. doi:10.2307/1941218. JSTOR 1941218.
Standard Error of Measurement: Definition and Example
Aug 17, 2022
The standard error of measurement, often denoted SEm, estimates the variation around an individual's “true” score when repeated measurements are taken.
It is calculated as:
SEm = s × √(1 − R)
where:
- s: the standard deviation of the measurements
- R: the reliability coefficient of the test
Note that the reliability coefficient ranges from 0 to 1 and is calculated by administering a test to many individuals twice and calculating the correlation between their test scores.
The higher the reliability coefficient, the more often a test produces consistent results.
Example: Calculating the Standard Error of Measurement
Suppose an individual takes a certain test, which aims to measure overall intelligence on a scale of 0 to 100, 10 times over the course of a week. They receive the following scores:
Scores: 88, 90, 91, 94, 86, 88, 84, 90, 90, 94
The sample mean is 89.5 and the sample standard deviation is 3.17.
If the test is known to have a reliability coefficient of 0.88, then we calculate the standard error of measurement as:
SEm = s × √(1 − R) = 3.17 × √(1 − 0.88) = 1.098
How to Use SEm to Create Confidence Intervals
Using the standard error of measurement, we can create a confidence interval that is likely to contain an individual's “true” score on a certain test with a certain degree of confidence.
If an individual receives a score of x on a test, we can use the following formulas to calculate various confidence intervals for this score:
- 68% confidence interval = [x – SEm, x + SEm]
- 95% confidence interval = [x – 2*SEm, x + 2*SEm]
- 99% confidence interval = [x – 3*SEm, x + 3*SEm]
For example, suppose an individual scores 92 on a certain test that is known to have an SEm of 2.5. We could calculate a 95% confidence interval as:
- 95% confidence interval = [92 – 2*2.5, 92 + 2*2.5] = [87, 97]
This means we are 95% confident that the individual's “true” score on this test is between 87 and 97.
Reliability and Standard Error of Measurement
There is a simple relationship between the reliability coefficient of a test and the standard error of measurement:
- The higher the reliability coefficient, the lower the standard error of measurement.
- The lower the reliability coefficient, the higher the standard error of measurement.
To illustrate this, consider an individual who takes a test 10 times and has a standard deviation of scores of 2.
If the test has a reliability coefficient of 0.9, then the standard error of measurement would be calculated as:
- SEm = s × √(1 − R) = 2 × √(1 − 0.9) = 0.632
However, if the test has a reliability coefficient of 0.5, then the standard error of measurement would be calculated as:
- SEm = s × √(1 − R) = 2 × √(1 − 0.5) = 1.414
This should make sense intuitively: if the scores of a test are less reliable, then the error in the measurement of the “true” score will be higher.
Measurement errors, also called observational errors, are defined as the difference between the true value of a quantity and its measured value. The true value is taken to be the average of an infinite number of measurements, while the measured value is the value actually recorded.
Measurement is the quantification of attributes of an object or event, which can be used to compare with other objects or events. The scope and application of measurement are dependent on the context and discipline. In natural sciences and engineering, measurements do not apply to nominal properties of objects or events, which is consistent with the guidelines of the International Vocabulary of Metrology published by the International Bureau of Weights and Measures. However, in other fields such as statistics as well as the social and behavioural sciences, measurements can have multiple levels, which would include nominal, ordinal, interval and ratio scales.
Measurement is a cornerstone of trade, science, technology and quantitative research in many disciplines. Historically, many measurement systems existed for the varied fields of human existence to facilitate comparisons in these fields. Often these were achieved by local agreements between trading partners or collaborators. Since the 18th century, developments progressed towards unifying, widely accepted standards that resulted in the modern International System of Units (SI). This system reduces all physical measurements to a mathematical combination of seven base units. The science of measurement is pursued in the field of metrology.
Measurement is defined as the process of comparison of an unknown quantity with a known or standard quantity.
Classification of Measurement Errors
Measurement errors can be classified into three different kinds:
- Random errors
- Systematic errors
  - Environmental
  - Instrumental
  - Observational
- Gross errors
Random Errors: When repeated measurements of a quantity are taken, the inconsistencies between the values are called random errors. They are always present in a measurement and show up as fluctuations in the value from one reading to the next.
Systematic Errors: These are not determined by chance but occur due to inaccuracies that are inherent in the system. They are sometimes referred to as statistical bias. In general, they are constant or predictable with respect to the true value.
Systematic errors arise from improper calibration of the instruments, from imperfect methods of observation, or from interference of the environment with the measurement process. Imperfect zeroing of the instrument under study is an example of these errors.
Reasons for Errors
(a) Inherent Shortcomings of Instruments – Such errors are built into instruments because of their mechanical design. They may be due to the manufacture, calibration or operation of the device. These errors may cause the instrument to read too low or too high.
For instance – If the instrument uses a weak spring, it gives a high value of the measured quantity. Errors also occur in the instrument because of friction or hysteresis loss.
(b) Misuse of Instrument – The error occurs in the instrument because of the fault of the operator. A good instrument used in an unintelligent way may give grossly erroneous results.
For instance – Misuse of the instrument may include failing to adjust the zero of the instrument, poor initial adjustment, or using leads of too high a resistance. These improper practices may not cause permanent damage to the instrument, but all the same, they cause errors.
(c) Loading Effect – This is the most common type of error caused by the instrument in measurement work. For instance, when a voltmeter is connected to a high-resistance circuit it gives a misleading reading, and when it is connected to a low-resistance circuit it gives a dependable reading. This means the voltmeter has a loading effect on the circuit.
Systematic Errors
Systematic errors are errors that have a clear cause and can be eliminated in future experiments. There are four different types of systematic errors:
- Instrumental: The instrument being used does not function properly, causing error in the experiment.
- Environmental: The surrounding environment, such as the lab, causes errors in the experiment.
- Observational: The scientist reads a measurement incorrectly, for example by not standing straight-on when reading the volume of a flask, so that the volume is measured incorrectly.
- Theoretical: The model system being used causes the results to be inaccurate.
Systematic measurement errors are also classified as sampling errors and non-sampling errors.
Sampling Errors: Non-representative samples fall under this category.
Non-Sampling Errors: These include:
- Paradigm Error: Error arising from the scientific method used to study the measurable phenomenon.
- Researcher Bias: A researcher who is keen to confirm a particular theory that he or she has devised may let this influence decisions.
- Participant Bias: Participants are influenced by social desirability, by supporting or opposing a particular opinion, etc.
- Reliability and validity of measurement tools.
Gross Error: Gross errors arise mainly from human mistakes. For example, the person using the instrument may take a wrong reading or record incorrect data. Gross errors can be avoided by taking each reading carefully and making sure that it is correct.
For instance – The experimenter reads 31.5 °C while the actual reading is 21.5 °C. The experimenter takes the wrong reading through oversight, and an error occurs in the measurement as a result.
Such errors are very common in measurement, and eliminating them completely is not possible. Some gross errors are easily detected by the experimenter, but others are hard to find.
Two strategies can help remove gross errors:
- Take each reading carefully.
- Take at least two readings of the measured quantity, made by different experimenters and at different points.
Type A and Type B Evaluation of Uncertainty
A Type A evaluation of uncertainty draws its knowledge of an input quantity from repeated measured values. For repeated values of this kind, a Gaussian distribution is considered.
A Type B evaluation of uncertainty, on the other hand, takes into account scientific judgment or other information concerning the possible values of the quantity. Here, we use the concept of a rectangular probability distribution with limits
Statistical Methods of Assessing Measurement Error
Several statistical methods are used to assess measurement error:
- Standard Error of Measurement (SEM): Indicates how far the values produced by an instrument used multiple times deviate from the true value.
- Coefficient of Variation (CV): Describes how much the values vary across repeated measurements. A low CV means the repeated results are closely clustered.
- Limits of Agreement (LOA): Gives an estimate of the interval within which a given proportion of the differences between measurements lie.
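The text does not give formulas for the last two measures, so the sketch below relies on their usual textbook definitions, which are an assumption here: CV as the standard deviation divided by the mean (as a percentage), and Bland-Altman style limits of agreement as the mean difference ± 1.96 standard deviations of the differences. Function names and data are illustrative.

```python
import statistics

def coefficient_of_variation(values):
    """CV as a percentage: 100 * s / mean."""
    return 100 * statistics.stdev(values) / statistics.fmean(values)

def limits_of_agreement(first, second, z=1.96):
    """Mean difference -/+ z * SD of the paired differences."""
    diffs = [a - b for a, b in zip(first, second)]
    centre = statistics.fmean(diffs)
    spread = z * statistics.stdev(diffs)
    return (centre - spread, centre + spread)

repeated = [88, 90, 91, 94, 86, 88, 84, 90, 90, 94]  # repeated readings (illustrative)
print(round(coefficient_of_variation(repeated), 2))

method_a = [10, 12, 11, 13]  # paired readings from two methods (illustrative)
method_b = [9, 12, 12, 12]
print(limits_of_agreement(method_a, method_b))
```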
Ways To Minimize Errors
- Use instruments of higher precision.
- Improve the experimental techniques.
- Adjust the zero of the instruments properly.
- Take readings while standing directly in front of the instrument, not from the side, to avoid parallax errors.
- Repeat the experiment several times and take the arithmetic mean for a closer result.
- Control the environment where possible.
- Take the measurements carefully in order to avoid gross errors.
Other Types of Errors
Various other types of error can occur in everyday measurement. Some of these are:
- Absolute Error: The amount of error in a measurement.
- Greatest Possible Error: Defined as one half of the measuring unit.
- Instrument Error: The error associated with the instrument; it describes the inaccuracy of the instrument.
- Operator Error: An error caused by the operator. For example, in an experiment conducted in the lab, a person notes the voltmeter reading as 5 V when it was actually 4 V. Such errors are also called personal errors.
- Measurement Location Error: Caused by an instrument that is kept in a location where it was not supposed to be kept, for example a thermometer that should be kept away from the sun.
- Parallax Error: Occurs when a reading is taken from the wrong angle. Always take readings standing directly in front of the instrument, not from its sides.
- External Error: Caused by external factors such as wind and the environment.
- Percentage Error: The ratio of the difference between the actual value and the measured value to the actual value, expressed as a percentage.
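As a rough sketch, absolute and percentage error can be computed as below. It assumes the usual conventions of taking the magnitude of the difference and multiplying the ratio by 100; the function names are hypothetical, and the numbers reuse the voltmeter example (actual 4 V, recorded 5 V).

```python
def absolute_error(actual, measured):
    """Magnitude of the difference between the actual and measured values."""
    return abs(actual - measured)

def percentage_error(actual, measured):
    """Absolute error relative to the actual value, as a percentage."""
    return 100 * absolute_error(actual, measured) / abs(actual)

# Voltmeter example: the actual value is 4 V but 5 V was recorded.
print(absolute_error(4, 5), percentage_error(4, 5))  # 1 25.0
```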
Fun Facts
- A measurement can be reduced to two components: a number and a unit.
- The unit depends on whether the property being measured is mass, length or some other property.
- The process of measuring something involves assigning a number to some property of the object.