Decomposition of the Squared Error

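Depending on how the expression is rearranged, the squared error admits two different explanations:

- the MSE can be decomposed into the squared bias and the variance;
- the optimal solution under the squared error (the Bayes rule) is the conditional expectation.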

Bias-Variance Decomposition of the Squared Error

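This is the case where we subtract and add the expected prediction $E [\hat{y}]$. The expectation is taken over training sets $D$, so the prediction $\hat{y}$ is random while the target $y$ is treated as a fixed value.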

$$
\begin{aligned}
E_{D} [\ell (\hat{y}, y)]
&= E [(y - \hat{y})^{2}] \\
&= E [(y - E [\hat{y}] + E [\hat{y}] - \hat{y})^{2}] \quad \text{(subtract and add } E [\hat{y}] \text{)} \\
&= E [(y - E [\hat{y}])^{2} + 2 (y - E [\hat{y}]) (E [\hat{y}] - \hat{y}) + (E [\hat{y}] - \hat{y})^{2}] \\
&= E [(y - E [\hat{y}])^{2}] + \underbrace{2 E [(y - E [\hat{y}]) (E [\hat{y}] - \hat{y})]}_{= 0} + E [(E [\hat{y}] - \hat{y})^{2}] \\
&= (y - E [\hat{y}])^{2} + \mathrm{Var} [\hat{y}] \\
&= \mathrm{Bias}^{2} + \mathrm{Variance}
\end{aligned}
$$

The cross term vanishes because

$$
\begin{aligned}
2 E [(y - E [\hat{y}]) (E [\hat{y}] - \hat{y})]
&= 2 E [y \cdot E [\hat{y}] - y \cdot \hat{y} - E [\hat{y}]^{2} + E [\hat{y}] \cdot \hat{y}] \\
&= 2 (y \cdot E [\hat{y}] - y \cdot E [\hat{y}] - E [\hat{y}]^{2} + E [\hat{y}]^{2}) \\
&= 0
\end{aligned}
$$

Checking Empirically That the Identity Holds

- bias: 0.045
- variance: 0.975

Does the expected squared error match Bias^2 + Variance?
- Expected squared error (EPE): 0.976
- Bias^2 + Variance: 0.977
- Difference (EPE - (Bias^2 + Variance)): -0.001
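
The code behind the figures above is not shown on this page. The following is a minimal sketch of how such a check could be run; it is not the original experiment. The setup (a fixed target `y`, a predictor equal to the sample mean of a small noisy training set, and all numeric constants) is an assumption made for illustration, so the printed values will differ from those listed above.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical setup (not from the original page): a fixed target y and a
# predictor whose output varies from one training set D to the next.
y = 1.0                # fixed target value
n_datasets = 100_000   # number of simulated training sets
n_samples = 5          # observations per training set

# Each training set holds noisy observations centred slightly off y;
# the predictor is simply the sample mean of its training set.
data = rng.normal(loc=y + 0.05, scale=2.0, size=(n_datasets, n_samples))
y_hat = data.mean(axis=1)

epe = np.mean((y - y_hat) ** 2)   # E[(y - y_hat)^2]
bias = y - y_hat.mean()           # y - E[y_hat]
variance = y_hat.var()            # E[(E[y_hat] - y_hat)^2] (ddof=0)

print(f"bias: {bias:.3f}")
print(f"variance: {variance:.3f}")
print(f"EPE: {epe:.3f}")
print(f"Bias^2 + Variance: {bias**2 + variance:.3f}")
print(f"difference: {epe - (bias**2 + variance):.3e}")
```

When the bias and variance are computed from the same simulated predictions with the population variance (`ddof=0`), the identity holds up to floating-point rounding, because it is an algebraic identity for sample moments as well.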

Bayes Rule for the Squared Error

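This is the case where we subtract and add the expectation of the observed value, $E [Y]$.

When the squared loss $\ell (\hat{y}, y) = (\hat{y} - y)^{2}$ is used as the loss function, with the prediction $\hat{y}$ treated as a constant and the target $Y$ as a random variable,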

$$
\begin{aligned}
E [\ell (\hat{y}, Y)]
&= E [(\hat{y} - Y)^{2}] \\
&= E [(\hat{y} - E [Y] + E [Y] - Y)^{2}] \quad \text{(subtract and add } E [Y] \text{)} \\
&= E [(\hat{y} - E [Y])^{2} + 2 (\hat{y} - E [Y]) (E [Y] - Y) + (E [Y] - Y)^{2}] \\
&= \underbrace{E [(\hat{y} - E [Y])^{2}]}_{= (\hat{y} - E [Y])^{2}} + \underbrace{2 E [(\hat{y} - E [Y]) (E [Y] - Y)]}_{= 0} + E [(E [Y] - Y)^{2}] \\
&= (\hat{y} - E [Y])^{2} + \mathrm{Var} [Y]
\end{aligned}
$$

where the cross term vanishes because

$$
\begin{aligned}
2 E [(\hat{y} - E [Y]) (E [Y] - Y)]
&= 2 E [\hat{y} E [Y] - \hat{y} Y - E [Y]^{2} + E [Y] Y] \\
&= 2 \hat{y} E [Y] - 2 \hat{y} E [Y] - 2 E [Y]^{2} + 2 E [Y]^{2} \\
&= 0
\end{aligned}
$$

Since $\mathrm{Var} [Y]$ does not depend on $\hat{y}$, the expected loss is minimized by choosing $\hat{y} = E [Y]$.
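
As a numerical illustration of this result, the sketch below searches over constant predictions and compares the minimizer of the empirical squared error with the sample mean. The distribution of `Y` and the search grid are arbitrary choices made for this example and do not come from the original text.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical distribution for Y, chosen only for illustration.
Y = rng.exponential(scale=2.0, size=50_000)

# Candidate constant predictions y_hat on a grid with step 0.01.
candidates = np.linspace(0.0, 5.0, 501)

# Empirical risk E[(y_hat - Y)^2] for each candidate.
risk = np.array([np.mean((c - Y) ** 2) for c in candidates])
best = candidates[np.argmin(risk)]

print(f"sample mean of Y:    {Y.mean():.3f}")
print(f"best constant y_hat: {best:.3f}")
print(f"minimum risk:        {risk.min():.3f}")
print(f"Var[Y]:              {Y.var():.3f}")
```

The best grid point lies within half a grid step of the sample mean, and the risk at each candidate equals `(c - Y.mean())**2 + Y.var()`, which matches the decomposition derived above.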

Replacing the expectation $E [Y]$ with the conditional expectation given $X$, the Bayes rule $h_{0} (X)$ is given by

$$
h_{0} (X) = E [Y | X]
$$
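
The step to the conditional version can be sketched as follows, assuming the same subtract-and-add argument is applied for each fixed value $X = x$ and the result is then averaged over $X$. The symbol $h$ for an arbitrary prediction rule is introduced here only for illustration.

$$
\begin{aligned}
E [(h (X) - Y)^{2} \mid X = x] &= (h (x) - E [Y \mid X = x])^{2} + \mathrm{Var} [Y \mid X = x] \\
E [(h (X) - Y)^{2}] &= E [(h (X) - E [Y \mid X])^{2}] + E [\mathrm{Var} [Y \mid X]]
\end{aligned}
$$

The second term does not depend on $h$, and the first term is non-negative and equals zero when $h (X) = E [Y \mid X]$, so the conditional expectation minimizes the expected squared error.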
