When the population is too large to measure in full, we have to estimate its properties from a sample. Suppose we draw $n$ samples from a population of size $N$, where $N \gg n$, and the sampled values are $x_1, x_2, \ldots, x_n$.
The sample mean is:
$$\bar{x}=\frac{\sum_{i=1}^{n}{x_i}}{n}$$
The sample variance (with divisor $n$) is:
$$S^2=\frac{\sum_{i=1}^{n}(x_i-\bar{x})^2}{n}$$
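The two definitions above can be sketched in a few lines of Python. The function names `sample_mean` and `sample_variance_biased` and the data in `xs` are illustrative assumptions, not part of the original text.

```python
def sample_mean(xs):
    """Sample mean: sum of the values divided by n."""
    return sum(xs) / len(xs)


def sample_variance_biased(xs):
    """Sample variance with divisor n (the S^2 defined above)."""
    m = sample_mean(xs)
    return sum((x - m) ** 2 for x in xs) / len(xs)


# Hypothetical sample, chosen so the arithmetic is easy to follow by hand.
xs = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]

print(sample_mean(xs))             # 5.0
print(sample_variance_biased(xs))  # 4.0
```

The squared deviations from the mean are 9, 1, 1, 1, 0, 0, 4, 16, which sum to 32, and 32 / 8 = 4.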
However, the sample variance differs from the population variance. To see by how much, we compute the expected value of $S^2$ and compare it with the true variance $\sigma^2$ (here $\mu$ denotes the population mean):
$$
\begin{aligned}
E[S^2] &= E\left[\frac{\sum_{i=1}^{n}(x_i-\bar{x})^2}{n}\right] \\
&= E\left[\frac{1}{n}\sum_{i=1}^{n}{\left((x_i-\mu)-(\bar{x}-\mu)\right)^2}\right] \\
&= E\left[\frac{1}{n}\sum_{i=1}^{n}{\left((x_i-\mu)^2-2(x_i-\mu)(\bar{x}-\mu)+(\bar{x}-\mu)^2\right)}\right] \\
&= E\left[\frac{1}{n}\sum_{i=1}^{n}{(x_i-\mu)^2}-\frac{2}{n}(\bar{x}-\mu)\sum_{i=1}^{n}{(x_i-\mu)}+(\bar{x}-\mu)^2\right]
\end{aligned}
$$
where
$$
\begin{aligned}
\sum_{i=1}^{n}{(x_i-\mu)} &= \sum_{i=1}^{n}{x_i}-\sum_{i=1}^{n}{\mu} \\
&= n\bar{x}-n\mu \\
&= n(\bar{x}-\mu)
\end{aligned}
$$
Therefore:
$$
\begin{aligned}
E[S^2] &= E\left[\frac{1}{n}\sum_{i=1}^{n}{(x_i-\mu)^2}-\frac{2}{n}(\bar{x}-\mu)\sum_{i=1}^{n}{(x_i-\mu)}+(\bar{x}-\mu)^2\right] \\
&= E\left[\frac{1}{n}\sum_{i=1}^{n}{(x_i-\mu)^2}-2(\bar{x}-\mu)^2+(\bar{x}-\mu)^2\right] \\
&= \sigma^2-E\left[(\bar{x}-\mu)^2\right]
\end{aligned}
$$
(The first term equals $\sigma^2$ because $E[(x_i-\mu)^2]=\sigma^2$ for every $i$ by the definition of the population variance, so the average of $n$ such terms also has expectation $\sigma^2$.)
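Before taking expectations, the expression inside $E[\cdot]$ is a plain algebraic identity: $\frac{1}{n}\sum(x_i-\bar{x})^2 = \frac{1}{n}\sum(x_i-\mu)^2-(\bar{x}-\mu)^2$. A minimal numeric check, assuming a made-up value of `mu` and randomly generated data (both are illustrative, not from the original), might look like this:

```python
import random

rng = random.Random(0)  # fixed seed so the run is repeatable
mu = 3.0                # hypothetical population mean
xs = [rng.gauss(mu, 2.0) for _ in range(10)]

n = len(xs)
x_bar = sum(xs) / n

# Left side: variance around the sample mean (divisor n).
lhs = sum((x - x_bar) ** 2 for x in xs) / n
# Right side: mean squared deviation around mu, minus (x_bar - mu)^2.
rhs = sum((x - mu) ** 2 for x in xs) / n - (x_bar - mu) ** 2

print(abs(lhs - rhs) < 1e-9)
```

The identity holds for any value of `mu`, which is why the deviation around the sample mean is never larger than the deviation around the true mean.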
For the remaining term, note that $E[\bar{x}]=\mu$, and treat the samples as independent with common variance $\sigma^2$ (a reasonable approximation when $N \gg n$), so that the variance of the sum is the sum of the variances:
$$
\begin{aligned}
E\left[(\bar{x}-\mu)^2\right] &= E\left[(\bar{x}-E[\bar{x}])^2\right] \\
&= \mathrm{var}(\bar{x}) \\
&= \frac{1}{n^2}\mathrm{var}\left(\sum_{i=1}^{n}{x_i}\right) \\
&= \frac{1}{n^2}\sum_{i=1}^{n}{\mathrm{var}(x_i)} \\
&= \frac{n\sigma^2}{n^2} \\
&= \frac{\sigma^2}{n}
\end{aligned}
$$
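The result $\mathrm{var}(\bar{x})=\sigma^2/n$ can be checked with a small simulation. This is only a sketch: it assumes independent draws from a normal distribution, and the function name `simulate_var_of_mean` and all parameter values are chosen for illustration.

```python
import random
import statistics


def simulate_var_of_mean(n, sigma, trials, seed=0):
    """Estimate var(x_bar) by drawing many independent samples of size n."""
    rng = random.Random(seed)
    means = []
    for _ in range(trials):
        xs = [rng.gauss(0.0, sigma) for _ in range(n)]
        means.append(sum(xs) / n)
    # Spread of the simulated sample means around their own average.
    return statistics.pvariance(means)


n, sigma = 5, 2.0
estimate = simulate_var_of_mean(n, sigma, trials=20000)
theory = sigma ** 2 / n  # sigma^2 / n = 0.8

print(estimate, theory)
```

With these settings the simulated value should land close to 0.8, with the remaining gap due to sampling noise.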
Substituting this into the expression derived above gives:
$$
\begin{aligned}
E[S^2] &= \sigma^2-E\left[(\bar{x}-\mu)^2\right] \\
&= \sigma^2-\frac{\sigma^2}{n} \\
&= \frac{n-1}{n}\sigma^2
\end{aligned}
$$
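The factor $\frac{n-1}{n}$ also shows up empirically. The sketch below averages the divisor-$n$ variance over many simulated samples; the normal distribution, the function name `mean_biased_variance`, and the parameter values are assumptions made for illustration.

```python
import random
import statistics


def mean_biased_variance(n, sigma, trials, seed=1):
    """Average the divisor-n sample variance over many independent samples."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(trials):
        xs = [rng.gauss(0.0, sigma) for _ in range(n)]
        total += statistics.pvariance(xs)  # pvariance divides by n
    return total / trials


n, sigma = 5, 2.0
average_s2 = mean_biased_variance(n, sigma, trials=20000)
expected = (n - 1) / n * sigma ** 2  # (4/5) * 4 = 3.2

print(average_s2, expected, sigma ** 2)
```

The average should sit near 3.2 and noticeably below the true variance of 4, which is the bias the derivation predicts.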
In other words, the expected value of the sample variance $S^2$ is $\frac{n-1}{n}$ times the population variance, which is why $S^2$ is called a biased estimator. To turn it into an unbiased estimator, multiply by the factor $\frac{n}{n-1}$:
$$\frac{n}{n-1}S^2=\frac{n}{n-1}\cdot\frac{\sum_{i=1}^{n}(x_i-\bar{x})^2}{n}=\frac{\sum_{i=1}^{n}(x_i-\bar{x})^2}{n-1}$$
This is the unbiased estimator of the population variance.
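Python's standard `statistics` module exposes both divisors: `statistics.pvariance` divides by $n$ and `statistics.variance` divides by $n-1$. A short sketch of the correction factor, using a made-up sample `xs`:

```python
import statistics

# Hypothetical sample used only for illustration.
xs = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]
n = len(xs)

biased = statistics.pvariance(xs)   # divisor n
unbiased = statistics.variance(xs)  # divisor n - 1

# Rescaling the divisor-n value by n / (n - 1) gives the divisor-(n - 1) value.
corrected = n / (n - 1) * biased

print(biased, unbiased, corrected)
```

The correction matters most for small $n$; as $n$ grows, $\frac{n}{n-1}$ approaches 1 and the two estimates converge.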
Published by: 全栈程序员-站长. Please credit the source when reposting: https://javaforall.net/217699.html. Original link: https://javaforall.net
