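Infinite Improbability Drive

Inspired by the book 『ニューラルネットワーク自作入門』, I built a neural network in Excel VBA, but then realized I really did need to learn the math. I have been working through it bit by bit, and this blog is partly a way of organizing what I have learned.

[Math edition] (Backpropagation) Updating the weights and biases of the first hidden layer, 3-(2)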


Update rules for the remaining weights and bias of unit h11

Now let's look at the update rules for the remaining  w_{12}^2, w_{13}^2, w_{14}^2, b_1^2 .

・First, the update rule for  w_{12}^2


 \displaystyle w_{12}^2 ← w_{12}^2 - \eta \nabla E


 \displaystyle \begin{align}
\nabla E = \frac{\partial E}{\partial w_{12}^2} &= \frac{\partial E}{\partial u_1^2} \frac{\partial u_1^2}{\partial w_{12}^2} \\
\\
&= \left( \frac{\partial E}{\partial u_1^3} \frac{\partial u_1^3}{\partial z_1^2} \frac{\partial z_1^2}{\partial u_1^2} + \frac{\partial E}{\partial u_2^3} \frac{\partial u_2^3}{\partial z_1^2} \frac{\partial z_1^2}{\partial u_1^2} + \frac{\partial E}{\partial u_3^3} \frac{\partial u_3^3}{\partial z_1^2} \frac{\partial z_1^2}{\partial u_1^2} + \frac{\partial E}{\partial u_4^3} \frac{\partial u_4^3}{\partial z_1^2} \frac{\partial z_1^2}{\partial u_1^2} \right) 
\times \frac{\partial u_1^2}{\partial w_{12}^2}
\end{align}


Comparing this with the update rule for  w_{11}^2 derived last time, you can see that the  \displaystyle \frac{\partial E}{\partial u_1^2} part is shared.


(Repeated from last time)
 \displaystyle \begin{align}
\nabla E = \frac{\partial E}{\partial w_{11}^2} &= \frac{\partial E}{\partial u_1^2} \frac{\partial u_1^2}{\partial w_{11}^2} \\
\\
&= \left( \frac{\partial E}{\partial u_1^3} \frac{\partial u_1^3}{\partial z_1^2} \frac{\partial z_1^2}{\partial u_1^2} + \frac{\partial E}{\partial u_2^3} \frac{\partial u_2^3}{\partial z_1^2} \frac{\partial z_1^2}{\partial u_1^2} + \frac{\partial E}{\partial u_3^3} \frac{\partial u_3^3}{\partial z_1^2} \frac{\partial z_1^2}{\partial u_1^2} + \frac{\partial E}{\partial u_4^3} \frac{\partial u_4^3}{\partial z_1^2} \frac{\partial z_1^2}{\partial u_1^2} \right) 
\times \frac{\partial u_1^2}{\partial w_{11}^2}
\end{align}


So we only need to compute the  \displaystyle \frac{\partial u_1^2}{\partial w_{12}^2} part.


\displaystyle u_1^2 = w_{11}^2 x_1 + w_{12}^2 x_2 + w_{13}^2 x_3 + w_{14}^2 x_4 + b_1^2


so


 \displaystyle \frac{\partial u_1^2}{\partial w_{12}^2} = \frac{\partial}{\partial w_{12}^2} (w_{11}^2 x_1 + w_{12}^2 x_2 + w_{13}^2 x_3 + w_{14}^2 x_4 + b_1^2) = x_2


Therefore,


 \displaystyle w_{12}^2 ← w_{12}^2 - \eta \nabla E


 \displaystyle \begin{align}
\nabla E = \frac{\partial E}{\partial w_{12}^2} &= \frac{\partial E}{\partial u_1^2} \frac{\partial u_1^2}{\partial w_{12}^2} \\
\\
&= \left( \frac{\partial E}{\partial u_1^3} \frac{\partial u_1^3}{\partial z_1^2} \frac{\partial z_1^2}{\partial u_1^2} + \frac{\partial E}{\partial u_2^3} \frac{\partial u_2^3}{\partial z_1^2} \frac{\partial z_1^2}{\partial u_1^2} + \frac{\partial E}{\partial u_3^3} \frac{\partial u_3^3}{\partial z_1^2} \frac{\partial z_1^2}{\partial u_1^2} + \frac{\partial E}{\partial u_4^3} \frac{\partial u_4^3}{\partial z_1^2} \frac{\partial z_1^2}{\partial u_1^2} \right) 
\times \frac{\partial u_1^2}{\partial w_{12}^2} \\
\\
\\
&= 
\begin{pmatrix}
 \delta_1^3 \times w_{11}^3 \times ReLU'(u_1^2) \\
\\ + \delta_2^3 \times w_{21}^3 \times ReLU'(u_1^2) \\
\\ + \delta_3^3 \times w_{31}^3 \times ReLU'(u_1^2) \\
\\ + \delta_4^3 \times w_{41}^3 \times ReLU'(u_1^2)
\end{pmatrix}
\times x_2
\end{align}
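As a rough numerical illustration of the final expression above, here is a minimal Python sketch. Every number in it is a hypothetical value chosen for this example (none come from the article), and the variable names `delta3`, `w3_col1`, and `eta` are just labels I picked for  \delta_k^3 ,  w_{k1}^3 , and  \eta . It assumes the  \delta_k^3 values have already been computed in an earlier step.

```python
# Hypothetical values for illustration only; none of these come from the article.
x = [0.5, -1.0, 2.0, 0.25]        # inputs x_1 .. x_4
w2_row1 = [0.1, 0.2, 0.3, 0.4]    # w_{11}^2 .. w_{14}^2
b2_1 = 0.05                       # b_1^2
delta3 = [0.3, -0.1, 0.2, 0.05]   # delta_k^3 = dE/du_k^3, assumed already known
w3_col1 = [0.7, -0.5, 0.2, 0.9]   # w_{k1}^3 for k = 1 .. 4
eta = 0.1                         # learning rate


def relu_prime(u):
    """Derivative of ReLU (taken as 0 at u = 0, a common convention)."""
    return 1.0 if u > 0 else 0.0


# u_1^2 = w_{11}^2 x_1 + ... + w_{14}^2 x_4 + b_1^2
u2_1 = sum(w * xi for w, xi in zip(w2_row1, x)) + b2_1

# The bracketed sum: sum_k delta_k^3 * w_{k1}^3 * ReLU'(u_1^2)
common = sum(d * w for d, w in zip(delta3, w3_col1)) * relu_prime(u2_1)

# dE/dw_{12}^2 = (bracketed sum) * x_2, followed by the update step
grad_w12 = common * x[1]
new_w12 = w2_row1[1] - eta * grad_w12

print(f"{grad_w12:.4f}")  # -0.3450
print(f"{new_w12:.4f}")   # 0.2345
```

With these made-up numbers  u_1^2 is positive, so  ReLU'(u_1^2) = 1 . If  u_1^2 were zero or negative, the gradient would be 0 and  w_{12}^2 would not change.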


 Update rule for  w_{13}^2

This is almost the same as for  w_{12}^2 .


\displaystyle u_1^2 = w_{11}^2 x_1 + w_{12}^2 x_2 + w_{13}^2 x_3 + w_{14}^2 x_4 + b_1^2


so


 \displaystyle \frac{\partial u_1^2}{\partial w_{13}^2} = \frac{\partial}{\partial w_{13}^2} (w_{11}^2 x_1 + w_{12}^2 x_2 + w_{13}^2 x_3 + w_{14}^2 x_4 + b_1^2) = x_3


Therefore,


 \displaystyle w_{13}^2 ← w_{13}^2 - \eta \nabla E


 \displaystyle \begin{align}
\nabla E = \frac{\partial E}{\partial w_{13}^2} &= \frac{\partial E}{\partial u_1^2} \frac{\partial u_1^2}{\partial w_{13}^2} \\
\\
&= \left( \frac{\partial E}{\partial u_1^3} \frac{\partial u_1^3}{\partial z_1^2} \frac{\partial z_1^2}{\partial u_1^2} + \frac{\partial E}{\partial u_2^3} \frac{\partial u_2^3}{\partial z_1^2} \frac{\partial z_1^2}{\partial u_1^2} + \frac{\partial E}{\partial u_3^3} \frac{\partial u_3^3}{\partial z_1^2} \frac{\partial z_1^2}{\partial u_1^2} + \frac{\partial E}{\partial u_4^3} \frac{\partial u_4^3}{\partial z_1^2} \frac{\partial z_1^2}{\partial u_1^2} \right) 
\times \frac{\partial u_1^2}{\partial w_{13}^2} \\
\\
\\
&= 
\begin{pmatrix}
 \delta_1^3 \times w_{11}^3 \times ReLU'(u_1^2) \\
\\ + \delta_2^3 \times w_{21}^3 \times ReLU'(u_1^2) \\
\\ + \delta_3^3 \times w_{31}^3 \times ReLU'(u_1^2) \\
\\ + \delta_4^3 \times w_{41}^3 \times ReLU'(u_1^2)
\end{pmatrix}
\times x_3
\end{align}


 Update rule for  w_{14}^2

Similarly,


 \displaystyle \frac{\partial u_1^2}{\partial w_{14}^2} = \frac{\partial}{\partial w_{14}^2} (w_{11}^2 x_1 + w_{12}^2 x_2 + w_{13}^2 x_3 + w_{14}^2 x_4 + b_1^2) = x_4


Therefore,


 \displaystyle w_{14}^2 ← w_{14}^2 - \eta \nabla E


 \displaystyle \begin{align}
\nabla E = \frac{\partial E}{\partial w_{14}^2} &= \frac{\partial E}{\partial u_1^2} \frac{\partial u_1^2}{\partial w_{14}^2} \\
\\
&= \left( \frac{\partial E}{\partial u_1^3} \frac{\partial u_1^3}{\partial z_1^2} \frac{\partial z_1^2}{\partial u_1^2} + \frac{\partial E}{\partial u_2^3} \frac{\partial u_2^3}{\partial z_1^2} \frac{\partial z_1^2}{\partial u_1^2} + \frac{\partial E}{\partial u_3^3} \frac{\partial u_3^3}{\partial z_1^2} \frac{\partial z_1^2}{\partial u_1^2} + \frac{\partial E}{\partial u_4^3} \frac{\partial u_4^3}{\partial z_1^2} \frac{\partial z_1^2}{\partial u_1^2} \right) 
\times \frac{\partial u_1^2}{\partial w_{14}^2} \\
\\
\\
&= 
\begin{pmatrix}
 \delta_1^3 \times w_{11}^3 \times ReLU'(u_1^2) \\
\\ + \delta_2^3 \times w_{21}^3 \times ReLU'(u_1^2) \\
\\ + \delta_3^3 \times w_{31}^3 \times ReLU'(u_1^2) \\
\\ + \delta_4^3 \times w_{41}^3 \times ReLU'(u_1^2)
\end{pmatrix}
\times x_4
\end{align}
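The weight derivations so far all follow a single pattern. As a compact restatement, using an index  j that the article itself does not introduce (an assumption made here purely for brevity), for  j = 1, 2, 3, 4 :

 \displaystyle \begin{align}
\frac{\partial u_1^2}{\partial w_{1j}^2} &= x_j \\
\\
\frac{\partial E}{\partial w_{1j}^2} &= \left( \sum_{k=1}^{4} \delta_k^3 \times w_{k1}^3 \times ReLU'(u_1^2) \right) \times x_j
\end{align}

Only the final factor  x_j changes from one weight to the next.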


 Update rule for  b_1^2


 \displaystyle \frac{\partial u_1^2}{\partial b_1^2} = \frac{\partial}{\partial b_1^2} (w_{11}^2 x_1 + w_{12}^2 x_2 + w_{13}^2 x_3 + w_{14}^2 x_4 + b_1^2) = 1


Therefore,


 \displaystyle b_1^2 ← b_1^2 - \eta \nabla E


 \displaystyle \begin{align}
\nabla E = \frac{\partial E}{\partial b_1^2} &= \frac{\partial E}{\partial u_1^2} \frac{\partial u_1^2}{\partial b_1^2} \\
\\
&= \left( \frac{\partial E}{\partial u_1^3} \frac{\partial u_1^3}{\partial z_1^2} \frac{\partial z_1^2}{\partial u_1^2} + \frac{\partial E}{\partial u_2^3} \frac{\partial u_2^3}{\partial z_1^2} \frac{\partial z_1^2}{\partial u_1^2} + \frac{\partial E}{\partial u_3^3} \frac{\partial u_3^3}{\partial z_1^2} \frac{\partial z_1^2}{\partial u_1^2} + \frac{\partial E}{\partial u_4^3} \frac{\partial u_4^3}{\partial z_1^2} \frac{\partial z_1^2}{\partial u_1^2} \right) 
\times \frac{\partial u_1^2}{\partial b_1^2} \\
\\
\\
&= 
\begin{pmatrix}
 \delta_1^3 \times w_{11}^3 \times ReLU'(u_1^2) \\
\\ + \delta_2^3 \times w_{21}^3 \times ReLU'(u_1^2) \\
\\ + \delta_3^3 \times w_{31}^3 \times ReLU'(u_1^2) \\
\\ + \delta_4^3 \times w_{41}^3 \times ReLU'(u_1^2)
\end{pmatrix}
\times 1
\end{align}
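One way to gain some confidence in these formulas is to compare them against finite differences on a toy network. The sketch below does that under several assumptions that are mine, not the article's: the loss is a squared error, layer 3 uses the identity function (so  \delta_k^3 = u_k^3 - t_k ), the contributions of all other hidden units and biases to  u_k^3 are folded into constants `c3`, and every number is made up.

```python
def relu(u):
    return max(0.0, u)


def relu_prime(u):
    return 1.0 if u > 0 else 0.0


# Hypothetical values for illustration only.
x = [0.5, -1.0, 2.0, 0.25]        # inputs x_1 .. x_4
w3_col1 = [0.7, -0.5, 0.2, 0.9]   # w_{k1}^3 for k = 1 .. 4
c3 = [0.1, 0.0, -0.2, 0.3]        # stand-in for everything else feeding u_k^3
t = [0.0, 1.0, 0.0, 0.0]          # made-up targets


def loss(params):
    """params = [w11, w12, w13, w14, b1]; returns the toy squared-error loss."""
    u2 = sum(w * xi for w, xi in zip(params[:4], x)) + params[4]
    z2 = relu(u2)
    u3 = [w * z2 + c for w, c in zip(w3_col1, c3)]
    return 0.5 * sum((u - tk) ** 2 for u, tk in zip(u3, t))


def analytic_grad(params):
    """Gradients from the formulas: (common factor) * x_j, and * 1 for the bias."""
    u2 = sum(w * xi for w, xi in zip(params[:4], x)) + params[4]
    z2 = relu(u2)
    delta3 = [w * z2 + c - tk for w, c, tk in zip(w3_col1, c3, t)]
    common = sum(d * w for d, w in zip(delta3, w3_col1)) * relu_prime(u2)
    return [common * xi for xi in x] + [common * 1.0]


def numeric_grad(params, h=1e-6):
    """Central finite differences, one parameter at a time."""
    grads = []
    for i in range(len(params)):
        up = list(params)
        up[i] += h
        down = list(params)
        down[i] -= h
        grads.append((loss(up) - loss(down)) / (2 * h))
    return grads


params = [0.1, 0.2, 0.3, 0.4, 0.05]
for a, n in zip(analytic_grad(params), numeric_grad(params)):
    print(f"analytic {a:+.6f}   numeric {n:+.6f}")
```

The two columns should agree closely for all five parameters. The check is only meaningful while  u_1^2 stays away from 0, where ReLU is not differentiable.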


Summary

Let's collect the gradient expressions used to update each weight and the bias.
As before, we write the common part as  \displaystyle \delta_1^2 .


 \displaystyle \begin{align}
\frac{\partial E}{\partial w_{11}^2} &= \frac{\partial E}{\partial u_1^2} \frac{\partial u_1^2}{\partial w_{11}^2}
= 
\begin{pmatrix}
 \delta_1^3 \times w_{11}^3 \times ReLU'(u_1^2) \\
\\ + \delta_2^3 \times w_{21}^3 \times ReLU'(u_1^2) \\
\\ + \delta_3^3 \times w_{31}^3 \times ReLU'(u_1^2) \\
\\ + \delta_4^3 \times w_{41}^3 \times ReLU'(u_1^2)
\end{pmatrix}
\times x_1
= \delta_1^2 \times x_1
\end{align}


 \displaystyle \begin{align}
\frac{\partial E}{\partial w_{12}^2} &= \frac{\partial E}{\partial u_1^2} \frac{\partial u_1^2}{\partial w_{12}^2}
= 
\begin{pmatrix}
 \delta_1^3 \times w_{11}^3 \times ReLU'(u_1^2) \\
\\ + \delta_2^3 \times w_{21}^3 \times ReLU'(u_1^2) \\
\\ + \delta_3^3 \times w_{31}^3 \times ReLU'(u_1^2) \\
\\ + \delta_4^3 \times w_{41}^3 \times ReLU'(u_1^2)
\end{pmatrix}
\times x_2
= \delta_1^2 \times x_2
\end{align}


 \displaystyle \begin{align}
\frac{\partial E}{\partial w_{13}^2} &= \frac{\partial E}{\partial u_1^2} \frac{\partial u_1^2}{\partial w_{13}^2}
= 
\begin{pmatrix}
 \delta_1^3 \times w_{11}^3 \times ReLU'(u_1^2) \\
\\ + \delta_2^3 \times w_{21}^3 \times ReLU'(u_1^2) \\
\\ + \delta_3^3 \times w_{31}^3 \times ReLU'(u_1^2) \\
\\ + \delta_4^3 \times w_{41}^3 \times ReLU'(u_1^2)
\end{pmatrix}
\times x_3
= \delta_1^2 \times x_3
\end{align}


 \displaystyle \begin{align}
\frac{\partial E}{\partial w_{14}^2} &= \frac{\partial E}{\partial u_1^2} \frac{\partial u_1^2}{\partial w_{14}^2}
= 
\begin{pmatrix}
 \delta_1^3 \times w_{11}^3 \times ReLU'(u_1^2) \\
\\ + \delta_2^3 \times w_{21}^3 \times ReLU'(u_1^2) \\
\\ + \delta_3^3 \times w_{31}^3 \times ReLU'(u_1^2) \\
\\ + \delta_4^3 \times w_{41}^3 \times ReLU'(u_1^2)
\end{pmatrix}
\times x_4
= \delta_1^2 \times x_4
\end{align}


 \displaystyle \begin{align}
\frac{\partial E}{\partial b_1^2} &= \frac{\partial E}{\partial u_1^2} \frac{\partial u_1^2}{\partial b_1^2}
= 
\begin{pmatrix}
 \delta_1^3 \times w_{11}^3 \times ReLU'(u_1^2) \\
\\ + \delta_2^3 \times w_{21}^3 \times ReLU'(u_1^2) \\
\\ + \delta_3^3 \times w_{31}^3 \times ReLU'(u_1^2) \\
\\ + \delta_4^3 \times w_{41}^3 \times ReLU'(u_1^2)
\end{pmatrix}
\times 1
= \delta_1^2 \times 1
\end{align}
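The practical payoff of the summary above is that  \delta_1^2 only has to be computed once and can then be reused for all five updates. Here is a minimal sketch of that idea. The function name `update_unit` and its parameters are hypothetical names of my own, the numbers are made up, and it assumes the  \delta_k^3 values are already available.

```python
def relu_prime(u):
    """Derivative of ReLU (taken as 0 at u = 0, a common convention)."""
    return 1.0 if u > 0 else 0.0


def update_unit(weights, bias, x, delta_next, w_next_col, eta):
    """Update one hidden unit's weights and bias with a single gradient step.

    weights    : [w_{11}^2, ..., w_{14}^2]
    bias       : b_1^2
    x          : [x_1, ..., x_4]
    delta_next : [delta_1^3, ..., delta_4^3]
    w_next_col : [w_{11}^3, ..., w_{41}^3]
    eta        : learning rate
    """
    u = sum(w * xi for w, xi in zip(weights, x)) + bias
    # delta_1^2: the common part, computed once
    delta = sum(d * w for d, w in zip(delta_next, w_next_col)) * relu_prime(u)
    # dE/dw_{1j}^2 = delta_1^2 * x_j, and dE/db_1^2 = delta_1^2 * 1
    new_weights = [w - eta * delta * xi for w, xi in zip(weights, x)]
    new_bias = bias - eta * delta
    return new_weights, new_bias, delta


# Hypothetical values for illustration only.
new_w, new_b, delta2_1 = update_unit(
    weights=[0.1, 0.2, 0.3, 0.4],
    bias=0.05,
    x=[0.5, -1.0, 2.0, 0.25],
    delta_next=[0.3, -0.1, 0.2, 0.05],
    w_next_col=[0.7, -0.5, 0.2, 0.9],
    eta=0.1,
)
print(new_w, new_b, delta2_1)
```

Returning `delta` alongside the new parameters is just a convenience for inspection. It is not something the derivation requires.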


With this, we have obtained the update rules for the weights and bias of unit h11.