
Convolution 2D

$$
z^l_{i,j} = \sum_m \sum_n a^{l-1}_{i+m,j+n} (\omega')^l_{m,n} + b^l_{i,j}
$$

$$
A^l = \sigma (Z^l) = \sigma ( A^{l-1} * W^l + B^l )
$$
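
As a minimal sketch of the forward pass above, the following plain-Python code computes $z^l$ by sliding $\omega'$ over $a^{l-1}$ (no padding, stride 1) and then applies $\sigma$. The function names `cross_correlate2d` and `forward`, the scalar bias, and the choice of a sigmoid for $\sigma$ are assumptions made for illustration; they are not taken from the original notes.

```python
import math

def cross_correlate2d(a, w):
    """Valid cross-correlation: out[i][j] = sum_m sum_n a[i+m][j+n] * w[m][n]."""
    H, W = len(a), len(a[0])
    M, N = len(w), len(w[0])
    return [
        [
            sum(a[i + m][j + n] * w[m][n] for m in range(M) for n in range(N))
            for j in range(W - N + 1)
        ]
        for i in range(H - M + 1)
    ]

def forward(a_prev, w, b=0.0):
    """Return (z, a) for one layer. b is a scalar bias (an assumption)."""
    z = [[v + b for v in row] for row in cross_correlate2d(a_prev, w)]
    # Sigmoid stands in for the unspecified activation sigma.
    a = [[1.0 / (1.0 + math.exp(-v)) for v in row] for row in z]
    return z, a

a_prev = [[1, 2, 3],
          [4, 5, 6],
          [7, 8, 9]]
w = [[1, 0],
     [0, -1]]
z, a = forward(a_prev, w, b=0.5)
```

Each entry of `z` here is `a_prev[i][j] - a_prev[i+1][j+1] + 0.5`, which makes the index pattern $a^{l-1}_{i+m,j+n}$ easy to verify by hand.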

Convolution

$$
K, K' \isin \R^{M \times N} \quad K(m, n) = K'(M-1-m, N-1-n)
$$

$$
\begin{aligned}
( I * K )_{ij} &= \sum_m \sum_n {I(i+m, j+n)K(M-1-m, N-1-n)} \\
&= \sum_m \sum_n {I(i+m, j+n)K'(m, n)} \\
&= (I \otimes K')_{ij}
\end{aligned}
$$
Info

In explanations of CNNs, most figures appear to show the cross-correlation kernel $K'$ rather than the flipped convolution kernel $K$.

Kernel example

$$
K' = \begin{bmatrix}9 & 8 & 7 \\ 6 & 5 & 4 \\ 3 & 2 & 1\end{bmatrix} \quad K = \begin{bmatrix}1 & 2 & 3 \\ 4 & 5 & 6 \\ 7 & 8 & 9\end{bmatrix}
$$
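
The relation between the two kernels can be checked numerically. The sketch below flips $K$ by 180 degrees to obtain $K'$, then confirms that convolving with $K$ (using the definition given in this document) matches cross-correlating with $K'$. The helper names and the 4x4 test image are hypothetical.

```python
def flip180(k):
    """Reverse rows and columns: out[m][n] = k[M-1-m][N-1-n]."""
    return [row[::-1] for row in k[::-1]]

def cross_correlate2d(image, kernel):
    """(I (x) K')_ij = sum_m sum_n I(i+m, j+n) * K'(m, n), no padding."""
    H, W = len(image), len(image[0])
    M, N = len(kernel), len(kernel[0])
    return [
        [
            sum(image[i + m][j + n] * kernel[m][n]
                for m in range(M) for n in range(N))
            for j in range(W - N + 1)
        ]
        for i in range(H - M + 1)
    ]

def convolve2d(image, kernel):
    """(I * K)_ij = sum_m sum_n I(i+m, j+n) * K(M-1-m, N-1-n), no padding."""
    H, W = len(image), len(image[0])
    M, N = len(kernel), len(kernel[0])
    return [
        [
            sum(image[i + m][j + n] * kernel[M - 1 - m][N - 1 - n]
                for m in range(M) for n in range(N))
            for j in range(W - N + 1)
        ]
        for i in range(H - M + 1)
    ]

K = [[1, 2, 3],
     [4, 5, 6],
     [7, 8, 9]]
K_prime = flip180(K)

image = [[1, 2, 3, 4],
         [5, 6, 7, 8],
         [9, 10, 11, 12],
         [13, 14, 15, 16]]
conv = convolve2d(image, K)
xcorr = cross_correlate2d(image, K_prime)
```

Because a learned kernel can absorb the flip, frameworks can implement the cheaper cross-correlation and still call the layer a convolution.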

Forward-propagation

in_channels=1,
out_channels(filters)=1,
kernel_size=3,
stride=1,
padding=1,
dilation=1,
bias=False

in_channels=1,
out_channels(filters)=1,
kernel_size=3,
stride=2,
padding=0,
dilation=1,
bias=False

in_channels=1,
out_channels(filters)=1,
kernel_size=3,
stride=1,
padding=0,
dilation=2,
bias=False

in_channels=2,
out_channels(filters)=3,
kernel_size=3,
stride=1,
padding=0,
dilation=1,
bias=True
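
For the four configurations above, the output size along one spatial axis follows the formula documented for PyTorch's `Conv2d`: `floor((size + 2*padding - dilation*(kernel_size - 1) - 1) / stride) + 1`. The sketch below applies it to each configuration. The 7x7 input size is an assumption, since the input sizes used in the original figures are not recoverable, and `param_count` assumes one bias per output channel (the PyTorch convention) rather than the per-position $b^l_{i,j}$ used in the formulas of this page.

```python
def conv_output_size(size, kernel_size, stride=1, padding=0, dilation=1):
    """Output length along one spatial axis of a 2D convolution."""
    effective_kernel = dilation * (kernel_size - 1) + 1
    return (size + 2 * padding - effective_kernel) // stride + 1

def param_count(in_channels, out_channels, kernel_size, bias):
    """Learnable parameters, assuming a square kernel and per-channel bias."""
    weights = out_channels * in_channels * kernel_size * kernel_size
    return weights + (out_channels if bias else 0)

# The four configurations listed above, in order.
configs = [
    dict(kernel_size=3, stride=1, padding=1, dilation=1),
    dict(kernel_size=3, stride=2, padding=0, dilation=1),
    dict(kernel_size=3, stride=1, padding=0, dilation=2),
    dict(kernel_size=3, stride=1, padding=0, dilation=1),
]

input_size = 7  # assumed for illustration
sizes = [conv_output_size(input_size, **c) for c in configs]

# Fourth configuration: in_channels=2, out_channels=3, bias=True.
params_last = param_count(in_channels=2, out_channels=3, kernel_size=3, bias=True)
```

With a 7x7 input, `padding=1` preserves the size, while `stride=2` and `dilation=2` both shrink it to 3x3 for different reasons: the first skips positions, the second widens the kernel's footprint to 5x5.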

Back-propagation

$$
\delta^l_{i,j} \equiv \frac{\partial Loss}{\partial z^l_{i,j}}
$$

$$
\begin{aligned}
\delta^l_{i,j} = \frac{\partial Loss}{\partial z^l_{i,j}}
&= \sum_x \sum_y { \frac { \partial Loss } { \partial z^{l+1}_{x,y} } \frac { \partial z^{l+1}_{x,y} } { \partial z^l_{i,j} } } \\
&= \sum_x \sum_y { \delta^{l+1}_{x,y} (\omega')^{l+1}_{i-x,j-y} \sigma' (z^l_{i,j}) } \\
&= \sum_m \sum_n { \delta^{l+1}_{i-m,j-n} (\omega')^{l+1}_{m,n} \sigma' (z^l_{i,j}) }
\end{aligned}
$$

$$
\begin{aligned}
\frac {\partial Loss} {\partial (\omega')^l_{m,n}}
&= \sum_i \sum_j { \frac { \partial Loss } { \partial z^{l}_{i,j} } \frac { \partial z^{l}_{i,j} } { \partial ( \omega' )^l_{m,n} } } \\
&= \sum_i \sum_j { \delta^l_{i,j} a^{l-1}_{i+m, j+n} }
\end{aligned}
$$

$$
\begin{aligned}
\frac {\partial Loss} {\partial b^l_{i,j}}
&= \sum_x \sum_y { \frac { \partial Loss } { \partial z^{l}_{x,y} } \frac { \partial z^{l}_{x,y} } { \partial b^l_{i,j} } } \\
&= \delta^l_{i,j}
\end{aligned}
$$
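
The weight-gradient formula above can be sanity-checked against finite differences. The sketch below assumes a toy loss, $Loss = \frac{1}{2}\sum_{i,j} (z_{i,j})^2$ with no bias, so that $\delta_{i,j} = z_{i,j}$; the loss, the helper names, and the test values are all assumptions made for illustration.

```python
def cross_correlate2d(a, w):
    """z[i][j] = sum_m sum_n a[i+m][j+n] * w[m][n], no padding, stride 1."""
    H, W = len(a), len(a[0])
    M, N = len(w), len(w[0])
    return [
        [
            sum(a[i + m][j + n] * w[m][n] for m in range(M) for n in range(N))
            for j in range(W - N + 1)
        ]
        for i in range(H - M + 1)
    ]

def loss(a, w):
    """Toy loss (an assumption): half the sum of squared pre-activations."""
    z = cross_correlate2d(a, w)
    return 0.5 * sum(v * v for row in z for v in row)

def weight_grad(a, w):
    """dLoss/dw[m][n] = sum_i sum_j delta[i][j] * a[i+m][j+n]."""
    delta = cross_correlate2d(a, w)  # for the toy loss, delta = dLoss/dz = z
    rows, cols = len(delta), len(delta[0])
    return [
        [
            sum(delta[i][j] * a[i + m][j + n]
                for i in range(rows) for j in range(cols))
            for n in range(len(w[0]))
        ]
        for m in range(len(w))
    ]

def numeric_grad(a, w, eps=1e-5):
    """Central finite differences, one kernel entry at a time."""
    grad = [[0.0] * len(w[0]) for _ in w]
    for m in range(len(w)):
        for n in range(len(w[0])):
            w_plus = [row[:] for row in w]
            w_minus = [row[:] for row in w]
            w_plus[m][n] += eps
            w_minus[m][n] -= eps
            grad[m][n] = (loss(a, w_plus) - loss(a, w_minus)) / (2 * eps)
    return grad

a = [[0.1, 0.2, 0.3, 0.4],
     [0.5, 0.6, 0.7, 0.8],
     [0.9, 1.0, 1.1, 1.2],
     [1.3, 1.4, 1.5, 1.6]]
w = [[0.2, -0.1],
     [0.4, 0.3]]

analytic = weight_grad(a, w)
numeric = numeric_grad(a, w)
max_err = max(abs(analytic[m][n] - numeric[m][n])
              for m in range(2) for n in range(2))
```

Note that `weight_grad` is itself a cross-correlation of the input with $\delta$: `cross_correlate2d(a, delta)` returns the same values, which is how the weight gradient is usually described for convolutional layers.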

Reference