In: Wenclawiak, Koch, Hadjicostas (eds.) Quality Assurance in Analytical Chemistry – Training and Teaching

Koch, M.: Basic Statistics © Springer-Verlag Berlin Heidelberg 2003

Basic Statistics

Michael Koch

Statistics

Are a tool to answer questions:

How accurate is the analysis?

How many analyses do I have to make to overcome problems of inhomogeneity or imprecision?

Does the product meet the specification?

With what confidence is the limit exceeded?

Analytical results vary

Because of unavoidable deviation during measurement

Because of inhomogeneity between the subsamples

It is very important to differentiate between these two cases

Because the measures to be taken can be different

Variations

Are always present

If there seems to be none, the resolution is not high enough

The same 20 values at three resolutions:

322 322 322 322 322 322 322 322 322 322 322 322 322 322 322 322 322 322 322 322

322.3 322.5 322.3 322.2 321.7 321.8 321.7 322.2 321.6 322.4 322.4 321.5 322.4 322.1 322.2 322.5 321.7 321.8 321.9 322.4

322.29 322.49 322.28 322.17 321.67 321.76 321.75 322.17 321.58 322.36 322.40 321.53 322.42 322.05 322.16 322.47 321.73 321.76 321.85 322.41

Histogram

For the description of the variability

Divide the range of possible measurements into a number of groups

Count the observations in each group

Draw a bar chart of the frequencies
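The grouping and counting steps of a histogram can be sketched in a few lines of Python. This is only an illustration: the function name `histogram_counts`, the group width and the measurement values are assumptions made for the example and do not come from the slides.

```python
def histogram_counts(values, lower, width, n_groups):
    """Count how many values fall into each of n_groups equal-width groups.

    Group k covers the half-open range [lower + k*width, lower + (k+1)*width).
    Values outside the overall range are ignored.
    """
    counts = [0] * n_groups
    for v in values:
        k = int((v - lower) // width)  # index of the group that contains v
        if 0 <= k < n_groups:
            counts[k] += 1
    return counts


# Hypothetical measurements in mg/g (illustrative only)
data = [47.1, 47.6, 48.2, 48.3, 48.4, 48.9, 49.0, 49.2, 49.7, 50.3]
counts = histogram_counts(data, lower=47.0, width=1.0, n_groups=4)

# Text version of the bar chart: one row of '#' per group
for k, c in enumerate(counts):
    low = 47.0 + k * 1.0
    print(f"{low:.1f} to {low + 1.0:.1f} mg/g: {'#' * c} ({c})")
```

Half-open groups are used here so that a value on a boundary is counted exactly once; other conventions are possible.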

Histogram

[Figure: histogram; x-axis: mg/g, ticks from 45.75 to 50.75 in steps of 0.5; y-axis: frequency, from 0 to 20]

Distribution

If the number of values rises and the groups have a smaller range, the histogram changes into a distribution curve

Cumulative Distribution

The cumulative distribution is obtained by summing the frequencies of all groups up to and including the current one

It can be used to find the proportion of values that fall above or below a certain value

For distributions (or histograms) with a maximum, the cumulative distribution curve shows the typical S-shape
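As a small sketch of the summation described above, the running total of the group frequencies can be formed with `itertools.accumulate` and expressed in percent. The frequencies below are hypothetical and serve only to illustrate the step.

```python
from itertools import accumulate

# Hypothetical group frequencies of a histogram (illustrative only)
frequencies = [2, 4, 3, 1]

# Running total of the frequencies, group by group
cumulative = list(accumulate(frequencies))

# The same running totals as a percentage of all observations
total = cumulative[-1]
cumulative_percent = [100.0 * c / total for c in cumulative]

print(cumulative)
print(cumulative_percent)
```

Reading the percentage at the upper boundary of a group gives the proportion of values below that boundary; the proportion above it is the remainder to 100 %.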

Cumulative Distribution

[Figure: cumulative distribution curve; x-axis: mg/g, from 45.75 to 50.75; y-axis: cumulative frequency [%], from 0 to 100]

Population – Sample

It is important to know whether all data (the population) or only a subset (a sample) are known.

Population                   Sample
All inhabitants of a town    A selection of 1000 inhabitants of a town
Not possible                 Any number of measurements of copper in a soil

Distribution Descriptives

The data distribution can be described with four characteristics:

Measures of location

Measures of dispersion

Skewness

Kurtosis

Location – Arithmetic Mean

Best estimate of the population mean µ for random samples from a population

$$\bar{x} = \frac{\sum_{i=1}^{n} x_i}{n}$$

Location – Median

Sort the data by value

The median is the central value of this series

The median is robust, i.e. it is hardly affected by extreme values or outliers
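The robustness of the median can be illustrated with a short Python sketch using the standard `statistics` module. The two data sets are hypothetical; they differ only in the last value, which is replaced by an outlier.

```python
import statistics

# Hypothetical results (illustrative only)
clean = [10.1, 10.2, 10.3, 10.4, 10.5]
with_outlier = [10.1, 10.2, 10.3, 10.4, 15.5]  # last value is an outlier

mean_clean = statistics.mean(clean)
mean_outlier = statistics.mean(with_outlier)
median_clean = statistics.median(clean)
median_outlier = statistics.median(with_outlier)

# The mean is pulled towards the outlier, the median stays where it was
print(f"mean:   {mean_clean:.2f} -> {mean_outlier:.2f}")
print(f"median: {median_clean:.2f} -> {median_outlier:.2f}")
```

For an even number of values `statistics.median` returns the mean of the two central values.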

Location – Mode

The most frequent value

Large number of values necessary

It is possible to have two or more modes

Location – Other Means

Geometric mean

Harmonic mean

Robust mean (e.g. Huber mean)

...

Dispersion – Variance of a population

The mean of the squares of the deviations of the individual values from the population mean.

$$\sigma^2 = \frac{\sum_{i=1}^{n} (x_i - \mu)^2}{n}$$

Dispersion – Variance of a sample

Variance of the population with (n-1) replacing n and the arithmetic mean of the data (estimated mean) replacing the population mean µ

$$s^2 = \frac{\sum_{i=1}^{n} (x_i - \bar{x})^2}{n-1}$$

Dispersion – standard deviation

The positive square root of the variance

Population

$$\sigma = \sqrt{\frac{\sum_{i=1}^{n} (x_i - \mu)^2}{n}}$$

Sample

$$s = \sqrt{\frac{\sum_{i=1}^{n} (x_i - \bar{x})^2}{n-1}}$$
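The difference between dividing by n and by (n-1) can be checked numerically. The following sketch computes both variances and standard deviations by hand and compares them with the functions of Python's `statistics` module. The data are hypothetical, and treating them once as a complete population and once as a sample is done only for illustration.

```python
import math
import statistics

# Hypothetical data (illustrative only)
data = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]
n = len(data)
mean = sum(data) / n

# Sum of the squared deviations from the mean
sum_sq = sum((x - mean) ** 2 for x in data)

pop_var = sum_sq / n            # data treated as the whole population
sample_var = sum_sq / (n - 1)   # data treated as a sample

pop_sd = math.sqrt(pop_var)
sample_sd = math.sqrt(sample_var)

# The statistics module offers the same quantities directly
print(pop_var, statistics.pvariance(data))
print(sample_var, statistics.variance(data))
print(pop_sd, statistics.pstdev(data))
print(sample_sd, statistics.stdev(data))
```

The sample values are always somewhat larger than the population values; the difference shrinks as n grows.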

Dispersion – standard deviation of the mean

The means of repeated sets of measurements are more closely clustered than the individual data

$$sdm = \frac{\sigma}{\sqrt{n}}$$

where n is the number of measurements for the mean

Dispersion – relative standard deviation

Measure of the spread of data compared with the mean

$$RSD = \frac{s}{\bar{x}}$$
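A minimal sketch of the standard deviation of the mean and the relative standard deviation, again with hypothetical results. Note the assumption made here: the population standard deviation σ is not known, so the sample standard deviation s is used in its place when calculating sdm.

```python
import math
import statistics

# Hypothetical results (illustrative only)
results = [10.1, 10.2, 10.3, 10.4, 10.5]
n = len(results)

mean = statistics.mean(results)
s = statistics.stdev(results)   # sample standard deviation, (n-1) in the denominator

sdm = s / math.sqrt(n)          # standard deviation of the mean, with s estimating sigma
rsd = s / mean                  # relative standard deviation as a fraction

print(f"s   = {s:.4f}")
print(f"sdm = {sdm:.4f}")
print(f"RSD = {100 * rsd:.2f} %")
```

The RSD is often reported in percent, which is why the last line multiplies by 100.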

Skewness

The skewness of a distribution describes its degree of asymmetry

Analytical data close to the detection limit are often skewed because negative concentrations are not possible

Kurtosis

The peakedness of a distribution.

Flat: platykurtic

Very peaked: leptokurtic

In between: mesokurtic

Normal distribution - I

Studied by Carl Friedrich Gauss in the early 19th century

He found that the distribution of errors could be closely approximated by a curve called the "normal curve of errors"

Normal distribution - II

bell shaped

completely determined by µ and σ

$$y = \frac{1}{\sigma\sqrt{2\pi}} \, e^{-\frac{(x-\mu)^2}{2\sigma^2}}$$

[Figure: bell-shaped curve centred on µ]

Normal distribution – important properties

the curve is symmetrical about µ

the greater the value of σ, the greater the spread of the curve

approximately 68% (68.27%) of the data lies within µ±1σ

approximately 95% (95.45%) of the data lies within µ±2σ

approximately 99.7% (99.73%) of the data lies within µ±3σ
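The three percentages follow directly from the normal curve. For a normal distribution the fraction of values within µ ± kσ equals erf(k/√2), which is available in Python's `math` module. The helper name `fraction_within` is chosen for this sketch only.

```python
import math


def fraction_within(k):
    """Fraction of a normal distribution that lies within mu ± k*sigma."""
    return math.erf(k / math.sqrt(2))


for k in (1, 2, 3):
    print(f"within mu ± {k} sigma: {100 * fraction_within(k):.2f} %")
```

The result does not depend on the particular values of µ and σ, because the range is expressed in multiples of σ around µ.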

Normal distribution – a useful model

With the mean and the standard deviation, areas under the curve can be defined

These areas can be interpreted as the proportions of observations falling within the ranges defined by µ and σ

One-tailed / two-tailed questions

One-tailed

e.g. a limit for the specification of a product

It is only of interest whether a certain limit is exceeded or not

Two-tailed

e.g. the analysis of a reference material

We want to know whether a measured value lies within or outside the certified value ± a certain range

One-sided probabilities

[Figure: normal curve with µ − kσ, µ and µ + kσ marked on the axis]

P(<): probability of a value less than µ + kσ

P(>): probability of a value greater than µ + kσ

k       P(<)     P(>)
1       0.841    0.159
1.64    0.950    0.050
1.96    0.975    0.025
2       0.977    0.023
3       0.999    0.001
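The one-sided probabilities can be recomputed with `statistics.NormalDist` from the Python standard library (available from Python 3.8). This is a sketch for checking tabulated values; the function name `one_sided` is an assumption of the example.

```python
from statistics import NormalDist

# Standard normal distribution: mu = 0, sigma = 1, so x equals k directly
standard_normal = NormalDist(mu=0.0, sigma=1.0)


def one_sided(k):
    """Return (P(<), P(>)) for the limit mu + k*sigma."""
    p_less = standard_normal.cdf(k)
    return p_less, 1.0 - p_less


for k in (1, 1.64, 1.96, 2, 3):
    p_less, p_greater = one_sided(k)
    print(f"k = {k}: P(<) = {p_less:.3f}, P(>) = {p_greater:.3f}")
```

Because the curve is symmetrical, the area outside µ ± kσ in a two-sided question is twice the one-sided P(>). The tabulated k = 1.64 is itself a rounded value, so the computed probability may differ from the table in the last digit.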

Two-sided probabilities

[Figure: normal curve with µ − kσ, µ and µ + kσ marked on the axis]

P(in): area within the range µ ± kσ

P(out): area outside the range µ ± kσ

k       P(in)    P(out)
1       0.683    0.317
1.64    0.900    0.100
1.96    0.950    0.050
2       0.954    0.046
3       0.997    0.003

Population / sample

Small random samples of results from a normal distribution may not look normal at first sight

e.g. samples from a population of 996 data

[Figure: seven histograms of frequency against value (value axis from 42 to 154) for samples of 10, 20, 50, 100, 249 and 498 values and for all 996 values]

Population: µ = 100, σ = 19.75

Sample size    mean      s
10             99.6      20.48
20             101.8     18.14
50             97.76     19.13
100            96.52     20.14
249            98.63     20.10
498            97.81     20.49

Confidence limits / intervals

The confidence limits describe the range within which we expect, with a given confidence, the true value to lie.

Confidence limits for the population mean

If the population standard deviation σ and the estimated mean are known:

CL for the population mean, estimated from n measurements at a confidence level of 95%

$$CL = \bar{x} \pm 1.96 \, \frac{\sigma}{\sqrt{n}}$$
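A worked example may help here. The sketch below evaluates the 95% confidence limits for a known σ; the function name and all numbers are hypothetical and chosen so that the arithmetic is easy to follow.

```python
import math


def confidence_limits_known_sigma(mean, sigma, n, k=1.96):
    """Confidence limits mean ± k*sigma/sqrt(n); k = 1.96 corresponds to 95 %."""
    half_width = k * sigma / math.sqrt(n)
    return mean - half_width, mean + half_width


# Hypothetical case: mean of n = 4 measurements is 50.0, sigma is known to be 2.0
lower, upper = confidence_limits_known_sigma(mean=50.0, sigma=2.0, n=4)
print(f"95 % CL: {lower:.2f} to {upper:.2f}")
```

With n = 4 the half-width is 1.96 · 2.0 / 2 = 1.96, so the limits are 48.04 and 51.96. Quadrupling the number of measurements halves the half-width.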

Confidence limits for the population mean

If the standard deviation has to be estimated from a sample standard deviation s (this is normally the case):

Additional uncertainty: use of the Student t distribution

$$CL = \bar{x} \pm t \, \frac{s}{\sqrt{n}}$$

where t is the two-tailed tabulated value for (n-1) degrees of freedom and the requested confidence level
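The same calculation with an estimated standard deviation could look as follows. The results are hypothetical, the function name is an assumption of this sketch, and the t-value 2.776 is the standard two-tailed 95% value for 4 degrees of freedom, entered by hand rather than computed.

```python
import math
import statistics


def confidence_limits_t(results, t):
    """Confidence limits mean ± t*s/sqrt(n) for a list of results.

    t must be the two-tailed tabulated value for len(results) - 1
    degrees of freedom at the requested confidence level.
    """
    n = len(results)
    mean = statistics.mean(results)
    s = statistics.stdev(results)
    half_width = t * s / math.sqrt(n)
    return mean - half_width, mean + half_width


# Hypothetical results (illustrative only): n = 5, so 4 degrees of freedom
results = [10.1, 10.2, 10.3, 10.4, 10.5]
T_95_TWO_TAILED_4_DOF = 2.776

lower, upper = confidence_limits_t(results, T_95_TWO_TAILED_4_DOF)
print(f"95 % CL: {lower:.3f} to {upper:.3f}")
```

Using 1.96 instead of 2.776 here would give limits that are too narrow, because with only five results s itself is uncertain.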

t-values

Critical values for the Student t-test (1T: one-tailed confidence level, 2T: two-tailed confidence level)

d.o.f. \ 1T   60%      75%      80%      85%      90%      95%      97.5%    99%      99.5%    99.9%    99.95%
d.o.f. \ 2T   20%      50%      60%      70%      80%      90%      95%      98%      99%      99.8%    99.9%
1             0.3249   1.000    1.376    1.963    3.078    6.314    12.71    31.82    63.66    318.3    636.6
2             0.2887   0.8165   1.061    1.386    1.886    2.920    4.303    6.965    9.925    22.33    31.60
3             0.2767   0.7649   0.9785   1.250    1.638    2.353    3.182    4.541    5.841    10.21    12.94
4             0.2707   0.7407   0.9410   1.190    1.533    2.132    2.776    3.747    4.604    7.173    8.610
5             0.2672   0.7267   0.9195   1.156    1.476    2.015    2.571    3.365    4.032    5.893    6.869
6             0.2648   0.7176   0.9057   1.134    1.440    1.943    2.447    3.143    3.707    5.208    5.959
7             0.2632   0.7111   0.8960   1.119    1.415    1.895    2.365    2.998    3.499    4.785    5.408
8             0.2619   0.7064   0.8889   1.108    1.397    1.860    2.306    2.896    3.355    4.501    5.041
9             0.2610   0.7027   0.8834   1.100    1.383    1.833    2.262    2.821    3.250    4.297    4.781
10            0.2602   0.6998   0.8791   1.093    1.372    1.812    2.228    2.764    3.169    4.144    4.587
11            0.2596   0.6974   0.8755   1.088    1.363    1.796    2.201    2.718    3.106    4.025    4.437
12            0.2590   0.6955   0.8726   1.083    1.356    1.782    2.179    2.681    3.055    3.930    4.318
13            0.2586   0.6938   0.8702   1.079    1.350    1.771    2.160    2.650    3.012    3.852    4.221
14            0.2582   0.6924   0.8681   1.076    1.345    1.761    2.145    2.624    2.977    3.787    4.140
15            0.2579   0.6912   0.8662   1.074    1.341    1.753    2.131    2.602    2.947    3.733    4.073
16            0.2576   0.6901   0.8647   1.071    1.337    1.746    2.120    2.583    2.921    3.686    4.015
17            0.2573   0.6892   0.8633   1.069    1.333    1.740    2.110    2.567    2.898    3.646    3.965
18            0.2571   0.6884   0.8620   1.067    1.330    1.734    2.101    2.552    2.878    3.610    3.922
19            0.2569   0.6876   0.8610   1.066    1.328    1.729    2.093    2.539    2.861    3.579    3.883
20            0.2567   0.6870   0.8600   1.064    1.325    1.725    2.086    2.528    2.845    3.552    3.850

Confidence limits – depending on n

Confidence limits are strongly dependent on the number of measurements

[Figure: confidence limits (CL) around µ for n = 1, n = 3 and n = 10]

Trueness – Precision – Accuracy – Error

Trueness: the closeness of agreement between the true value and the sample mean

Precision: the degree of agreement between the results of repeated measurements

Accuracy: the sum of both, trueness and precision

Error: the difference between one measurement and the true value

Linkage

[Figure: linkage between the terms; labels: Improving trueness, Improving precision, Improving accuracy, Error]

from: LGC, vamstat II

Significance Testing - ?

Typical situation:

We have analyzed a reference material and we want to know whether the bias between the mean of the measurements and the certified value of the reference material is "significant" or just due to random errors.

This is a typical task for significance testing

Significance testing

A decision is made about a population

based on observations made from a sample of the population

at a certain level of confidence

There can never be any guarantee that the decision is absolutely correct

At the 95% confidence level there is a probability of 5% of finding a significant difference where none exists

Before significance testing

Inspect a graphical display of the data before using any statistics

Are there any data that are not suitable?

Is a significance test really necessary?

Compare the statistical result with a commonsense visual appraisal

A difference may be statistically significant, but not of any importance, because it is small

The most important thing in statistics

No blind use

The stages of significance testing - I

Null hypothesis H0

a statement about the population, e.g.: there is no bias (µ = x_true)

Alternative hypothesis H1

the opposite of the null hypothesis, e.g.: there is a bias (µ ≠ x_true)

The stages of significance testing - II

Critical value

the tabulated value of the test statistic, generally at the 95% confidence level

If the calculated value is greater than the tabulated value, then the test result indicates a significant difference

Evaluation of the test statistic

Decisions and conclusions

Significance test example – the question

Is the bias between the mean of measurements and the certified value of a reference material significant or just due to random errors?

Significance test example – graphical explanation

[Figure, two panels: position of the reference value x_ref relative to the 95% confidence interval for the population mean µ around the sample mean; one panel labelled "significant", the other "not significant"]

Significant:

the reference value is outside the 95% confidence limits of the data set

the bias is unlikely to result from random variations in the results

the bias is significant

Not significant:

the reference value is inside the 95% confidence limits of the data set

the bias is likely to result from random variations in the results

the bias is not significant

Significance test example - calculation

The Null hypothesis: we assume there is no bias: µ = x_ref

Calculation of confidence limits:

$$\bar{x} - \frac{t \cdot s}{\sqrt{n}} < \mu < \bar{x} + \frac{t \cdot s}{\sqrt{n}}$$

With the Null hypothesis:

$$\bar{x} - \frac{t \cdot s}{\sqrt{n}} < x_{ref} < \bar{x} + \frac{t \cdot s}{\sqrt{n}}$$

or

$$t_{critical} > t_{observed} = \frac{|\bar{x} - x_{ref}|}{s / \sqrt{n}}$$
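The comparison of the observed and the critical t-value can be written as a short Python sketch. The results and the certified value are hypothetical, the function name is an assumption of the example, and the critical value 2.776 (two-tailed, 95%, 4 degrees of freedom) is a standard tabulated value entered by hand.

```python
import math
import statistics


def t_observed(results, x_ref):
    """Observed t-value: |mean - x_ref| divided by s/sqrt(n)."""
    n = len(results)
    mean = statistics.mean(results)
    s = statistics.stdev(results)
    return abs(mean - x_ref) / (s / math.sqrt(n))


# Hypothetical measurements of a reference material (illustrative only)
results = [10.1, 10.2, 10.3, 10.4, 10.5]
x_ref = 10.0            # hypothetical certified value
t_critical = 2.776      # two-tailed, 95 %, n - 1 = 4 degrees of freedom

t_obs = t_observed(results, x_ref)
bias_significant = t_obs > t_critical

print(f"t_observed = {t_obs:.3f}, t_critical = {t_critical}")
print("significant bias at the 95 % level" if bias_significant
      else "no significant bias at the 95 % level")
```

With these numbers the mean is 10.3 and s/√n is about 0.0707, so t_observed is about 4.24 and exceeds the critical value. A certified value of 10.25 would give t_observed of about 0.71 and no significant bias.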


Significance test example – conclusions I

If we find t_critical > t_observed (conforming to our Null hypothesis)

we can say:

we cannot reject the Null hypothesis

we accept the Null hypothesis

we find no significant bias at the 95% level

we find no measurable bias (under the given experimental conditions)

but not:

there is no bias

Significance test example – conclusions II

If we find t_critical < t_observed (not conforming to our Null hypothesis)

we can say:

we reject the Null hypothesis

we find significant bias at the 95% level

but not:

there is a bias, because there is roughly a 5% chance of rejecting the Null hypothesis when it is true.

One-sample t-test

This test compares the mean of analytical results with a stated value

The question may be two-tailed, as in the previous example,

or one-tailed, e.g.: is the copper content of an alloy below the specification?

For one-tailed questions other t-values are used

t-values

Critical values for the Student t-test (1T: one-tailed, 2T: two-tailed confidence level)

d.o.f. \ 1T  60%      75%      80%      85%      90%      95%      97,50%   99%      99,50%   99,90%   99,95%
d.o.f. \ 2T  20%      50%      60%      70%      80%      90%      95%      98%      99%      99,80%   99,90%
1            0,3249   1,000    1,376    1,963    3,078    6,314    12,71    31,82    63,66    318,3    636,6
2            0,2887   0,8165   1,061    1,386    1,886    2,920    4,303    6,965    9,925    22,33    31,60
3            0,2767   0,7649   0,9785   1,250    1,638    2,353    3,182    4,541    5,841    10,21    12,94
4            0,2707   0,7407   0,9410   1,190    1,533    2,132    2,776    3,747    4,604    7,173    8,610
5            0,2672   0,7267   0,9195   1,156    1,476    2,015    2,571    3,365    4,032    5,893    6,869
6            0,2648   0,7176   0,9057   1,134    1,440    1,943    2,447    3,143    3,707    5,208    5,959
7            0,2632   0,7111   0,8960   1,119    1,415    1,895    2,365    2,998    3,499    4,785    5,408
8            0,2619   0,7064   0,8889   1,108    1,397    1,860    2,306    2,896    3,355    4,501    5,041
9            0,2610   0,7027   0,8834   1,100    1,383    1,833    2,262    2,821    3,250    4,297    4,781
10           0,2602   0,6998   0,8791   1,093    1,372    1,812    2,228    2,764    3,169    4,144    4,587
11           0,2596   0,6974   0,8755   1,088    1,363    1,796    2,201    2,718    3,106    4,025    4,437
12           0,2590   0,6955   0,8726   1,083    1,356    1,782    2,179    2,681    3,055    3,930    4,318
13           0,2586   0,6938   0,8702   1,079    1,350    1,771    2,160    2,650    3,012    3,852    4,221
14           0,2582   0,6924   0,8681   1,076    1,345    1,761    2,145    2,624    2,977    3,787    4,140
15           0,2579   0,6912   0,8662   1,074    1,341    1,753    2,131    2,602    2,947    3,733    4,073
16           0,2576   0,6901   0,8647   1,071    1,337    1,746    2,120    2,583    2,921    3,686    4,015
17           0,2573   0,6892   0,8633   1,069    1,333    1,740    2,110    2,567    2,898    3,646    3,965
18           0,2571   0,6884   0,8620   1,067    1,330    1,734    2,101    2,552    2,878    3,610    3,922
19           0,2569   0,6876   0,8610   1,066    1,328    1,729    2,093    2,539    2,861    3,579    3,883
20           0,2567   0,6870   0,8600   1,064    1,325    1,725    2,086    2,528    2,845    3,552    3,850

Two-sample t-test

This test compares two independent sets of data (analytical results)

The question may be two-tailed or one-tailed as well

two-tailed: are the results of two methods significantly different?
one-tailed: are the results of method A significantly lower than the results of method B?

two-sample t-test – t-value

The formula for the calculation of the t-value is somewhat more complex:

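$$t_{observed} = \frac{\bar{x}_1 - \bar{x}_2}{\sqrt{\dfrac{(n_1-1)\,s_1^2 + (n_2-1)\,s_2^2}{n_1+n_2-2}\left(\dfrac{1}{n_1}+\dfrac{1}{n_2}\right)}}$$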

The number of degrees of freedom for the tabulated value t_critical is

d.o.f. = n1 + n2 - 2
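
The calculation can be sketched in Python as follows, assuming the pooled-standard-deviation form of the two-sample statistic. The function name `pooled_t` and both data sets are hypothetical and serve only as an illustration. The critical value is the standard two-tailed 95% Student t value for 8 degrees of freedom.

```python
import math
import statistics


def pooled_t(sample_1, sample_2):
    """Two-sample t statistic with pooled variance, plus its degrees of freedom."""
    n1, n2 = len(sample_1), len(sample_2)
    var1 = statistics.variance(sample_1)  # s1 squared
    var2 = statistics.variance(sample_2)  # s2 squared
    dof = n1 + n2 - 2
    pooled_var = ((n1 - 1) * var1 + (n2 - 1) * var2) / dof
    mean_diff = statistics.mean(sample_1) - statistics.mean(sample_2)
    t = mean_diff / math.sqrt(pooled_var * (1 / n1 + 1 / n2))
    return t, dof


# Hypothetical results of two methods for the same homogeneous material.
method_a = [10.2, 10.4, 10.1, 10.3, 10.5]
method_b = [10.6, 10.8, 10.5, 10.9, 10.7]

t_obs, dof = pooled_t(method_a, method_b)

T_CRITICAL = 2.306  # two-tailed 95% value for 8 degrees of freedom
significant = abs(t_obs) > T_CRITICAL

print(f"t_observed = {t_obs:.2f}, d.o.f. = {dof}, significant: {significant}")
```

The sign of t_observed only reflects which method was entered first, so the two-tailed decision compares its absolute value with t_critical.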

two-sample t-test - validity

The two-sample t-test is valid only if there is no significant difference between the two standard deviations s1 and s2.

This can (and should) be tested with the F-test

F-test

The F-test is a method of testing whether two independent estimates of variance are significantly different.

The F-value is calculated according to F_observed = s_A^2 / s_B^2, with s_A^2 > s_B^2

The critical value of F depends on the degrees of freedom dofA=nA-1 and dofB=nB-1 and can be taken from a table

F-values

F statistics for P=0.05

Columns: degrees of freedom for numerator (A); rows: degrees of freedom for denominator (B)

B \ A   1        2        3        4        5        6        7        8        9        10
1       647,79   799,50   864,16   899,58   921,85   937,11   948,22   956,66   963,29   968,63
2       38,51    39,00    39,17    39,25    39,30    39,33    39,36    39,37    39,39    39,40
3       17,44    16,04    15,44    15,10    14,89    14,74    14,62    14,54    14,47    14,42
4       12,22    10,65    9,98     9,61     9,36     9,20     9,07     8,98     8,91     8,84
5       10,01    8,43     7,76     7,39     7,15     6,98     6,85     6,76     6,68     6,62
6       8,81     7,26     6,60     6,23     5,99     5,82     5,70     5,60     5,52     5,46
7       8,07     6,54     5,89     5,52     5,29     5,12     5,00     4,90     4,82     4,76
8       7,57     6,06     5,42     5,05     4,82     4,65     4,53     4,43     4,36     4,30
9       7,21     5,72     5,08     4,72     4,48     4,32     4,20     4,10     4,03     3,96
10      6,94     5,46     4,83     4,47     4,24     4,07     3,95     3,86     3,78     3,72

Example: nA=10 and nB=10 give 9 degrees of freedom each, so Fcritical = 4,03

F-test - decision

If F_observed < F_critical, then the variances s_A^2 and s_B^2 are not significantly different at the chosen confidence level
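
A minimal Python sketch of the F-test decision, assuming two hypothetical data sets with ten results each (9 degrees of freedom for numerator and denominator). The function name `f_observed` is made up for illustration. The critical value 4.03 is the entry for 9 and 9 degrees of freedom in the F-table for P=0.05.

```python
import statistics


def f_observed(sample_a, sample_b):
    """Ratio of the larger to the smaller variance, with the matching d.o.f."""
    var_a = statistics.variance(sample_a)
    var_b = statistics.variance(sample_b)
    dof_a, dof_b = len(sample_a) - 1, len(sample_b) - 1
    if var_a < var_b:
        # Swap so that the larger variance is always in the numerator.
        var_a, var_b = var_b, var_a
        dof_a, dof_b = dof_b, dof_a
    return var_a / var_b, dof_a, dof_b


# Hypothetical results, ten replicates per method.
sample_a = [5.1, 4.9, 5.3, 4.8, 5.2, 5.0, 4.7, 5.4, 5.0, 4.6]
sample_b = [5.0, 5.1, 4.9, 5.2, 4.8, 5.0, 5.1, 4.9, 5.1, 4.9]

f_obs, dof_num, dof_den = f_observed(sample_a, sample_b)

F_CRITICAL = 4.03  # table entry for 9 and 9 degrees of freedom, P=0.05
variances_comparable = f_obs < F_CRITICAL

print(f"F_observed = {f_obs:.2f}, variances comparable: {variances_comparable}")
```

For these made-up numbers F_observed exceeds F_critical, so the variances differ significantly and the precondition of the two-sample t-test would not be met.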

Paired t-test

Two different methods are applied for measuring different samples

[Figure: results of Method A and Method B for sample 1 to sample 6, value axis from 0 to 3]

A two-sample t-test is inappropriate, because the differences between samples are much greater than a bias between the two methods. Use a paired t-test instead.

Paired t-test - principle

The differences between the two methods are calculated for each sample

A one-sample t-test is applied to the differences to decide whether the mean difference is significantly different from zero.

Paired t-test – test statistics

The t-value is calculated according to:

$$t_{observed} = \frac{\bar{x}_{diff}\,\sqrt{n}}{s_{diff}}$$

The critical t-value is taken from the t-table for the desired confidence level
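
As an illustration, the paired test can be sketched in Python. The results for the six samples are hypothetical, and the function name `paired_t` is made up. The degrees of freedom are taken as n - 1, where n is the number of pairs, as in a one-sample t-test on the differences. The critical value is the standard two-tailed 95% Student t value for 5 degrees of freedom.

```python
import math
import statistics


def paired_t(results_a, results_b):
    """t statistic of the per-sample differences: mean_diff * sqrt(n) / s_diff."""
    diffs = [a - b for a, b in zip(results_a, results_b)]
    n = len(diffs)
    t = statistics.mean(diffs) * math.sqrt(n) / statistics.stdev(diffs)
    return t, n - 1


# Hypothetical results of both methods for six different samples.
method_a = [0.52, 1.10, 1.58, 2.05, 2.47, 2.96]
method_b = [0.48, 1.04, 1.55, 1.98, 2.44, 2.90]

t_obs, dof = paired_t(method_a, method_b)

T_CRITICAL = 2.571  # two-tailed 95% value for 5 degrees of freedom
significant = abs(t_obs) > T_CRITICAL

print(f"t_observed = {t_obs:.2f}, d.o.f. = {dof}, significant: {significant}")
```

Because only the differences enter the calculation, the large spread between the samples does not mask the small, consistent offset between the methods.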

Statistics

There are many statistical tools for the analyst to interpret analytical data

Statistics are not a "cure-all" that can answer all questions

If the analyst uses statistics with common sense, they are a powerful tool for answering the customers' or the analyst's own questions.

Recommended literature and training software

LGC, VAMSTAT II, Statistics Training for Valid Analytical Measurements. CD-ROM, Laboratory of the Government Chemist, Teddington 2000.

H. Lohninger: Teach/Me - Data Analysis. Multimedia teachware. Springer-Verlag 1999.

J.C. Miller and J.N. Miller: Statistics for Analytical Chemistry. Ellis Horwood Ltd., Chichester 1988.

T.J. Farrant: Practical Statistics for the Analytical Scientist – a Bench Guide. Royal Society of Chemistry for the Laboratory of the Government Chemist (LGC), Teddington 1997

