
Convolution 2D

$$
z^l_{i,j} = \sum_m \sum_n a^{l-1}_{i+m,\,j+n} \, (\omega')^l_{m,n} + b^l_{i,j}
$$

$$
A^l = \sigma(Z^l) = \sigma(A^{l-1} * W^l + B^l)
$$
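To make the element-wise formula concrete, here is a minimal single-channel NumPy sketch of the forward pass (stride 1, no padding, with `np.tanh` standing in for σ; the name `conv2d_forward` and the tanh choice are just placeholders for illustration):

```python
import numpy as np

def conv2d_forward(a_prev, w_flipped, b, sigma=np.tanh):
    """z[i, j] = sum_{m,n} a_prev[i+m, j+n] * w_flipped[m, n] + b[i, j];  a = sigma(z)."""
    H, W = a_prev.shape
    M, N = w_flipped.shape
    z = np.empty((H - M + 1, W - N + 1))            # "valid" output size
    for i in range(z.shape[0]):
        for j in range(z.shape[1]):
            # cross-correlation: the kernel slides without being flipped
            z[i, j] = np.sum(a_prev[i:i + M, j:j + N] * w_flipped)
    z += b                                           # bias has the same shape as z
    return sigma(z)

a = conv2d_forward(np.random.rand(5, 5), np.random.rand(3, 3), np.zeros((3, 3)))
print(a.shape)  # (3, 3)
```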

Convolution

$$
K, K' \in \mathbb{R}^{M \times N}, \qquad K(m, n) = K'(M-1-m,\, N-1-n)
$$

$$
\begin{aligned}
(I * K)_{ij} &= \sum_m \sum_n I(i+m,\, j+n)\, K(M-1-m,\, N-1-n) \\
&= \sum_m \sum_n I(i+m,\, j+n)\, K'(m, n) \\
&= (I \otimes K')_{ij}
\end{aligned}
$$
info

In most CNN descriptions, the pictures labelled "convolution" actually depict cross-correlation with the flipped kernel K'.
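A quick numerical check of this identity (a sketch using SciPy with a random image and kernel, not code from this page): `convolve2d` flips the kernel internally, so it matches `correlate2d` applied to the pre-flipped kernel K'.

```python
import numpy as np
from scipy.signal import convolve2d, correlate2d

I = np.random.rand(6, 6)
K = np.random.rand(3, 3)
K_prime = K[::-1, ::-1]   # K'(m, n) = K(M-1-m, N-1-n)

# Convolution with K equals cross-correlation with the flipped kernel K'.
print(np.allclose(convolve2d(I, K, mode="valid"),
                  correlate2d(I, K_prime, mode="valid")))  # True
```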

Kernel example

$$
K' = \begin{bmatrix} 9 & 8 & 7 \\ 6 & 5 & 4 \\ 3 & 2 & 1 \end{bmatrix}
\qquad
K = \begin{bmatrix} 1 & 2 & 3 \\ 4 & 5 & 6 \\ 7 & 8 & 9 \end{bmatrix}
$$
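The flip relation between these two matrices can be reproduced with `np.flip` (a trivial check added for illustration, not part of the original page):

```python
import numpy as np

K = np.arange(1, 10, dtype=float).reshape(3, 3)  # [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
K_prime = np.flip(K)                             # flip both axes: [[9, 8, 7], [6, 5, 4], [3, 2, 1]]
print(K_prime)
```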

Forward-propagation

- `in_channels=1, out_channels(filters)=1, kernel_size=3, stride=1, padding=1, dilation=1, bias=False`
- `in_channels=1, out_channels(filters)=1, kernel_size=3, stride=2, padding=0, dilation=1, bias=False`
- `in_channels=1, out_channels(filters)=1, kernel_size=3, stride=1, padding=0, dilation=2, bias=False`
- `in_channels=2, out_channels(filters)=3, kernel_size=3, stride=1, padding=0, dilation=1, bias=True`
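These four settings correspond one-to-one to `torch.nn.Conv2d` arguments. As a sketch (the 5×5 input size is an arbitrary choice for illustration), the resulting output shapes can be checked like this:

```python
import torch
import torch.nn as nn

x1 = torch.randn(1, 1, 5, 5)   # (batch, channels, height, width)
x2 = torch.randn(1, 2, 5, 5)   # 2-channel input for the last configuration

convs = [
    nn.Conv2d(1, 1, kernel_size=3, stride=1, padding=1, dilation=1, bias=False),  # same padding
    nn.Conv2d(1, 1, kernel_size=3, stride=2, padding=0, dilation=1, bias=False),  # strided
    nn.Conv2d(1, 1, kernel_size=3, stride=1, padding=0, dilation=2, bias=False),  # dilated
    nn.Conv2d(2, 3, kernel_size=3, stride=1, padding=0, dilation=1, bias=True),   # multi-channel
]

for conv, x in zip(convs, [x1, x1, x1, x2]):
    print(tuple(conv(x).shape))
# (1, 1, 5, 5)  (1, 1, 2, 2)  (1, 1, 1, 1)  (1, 3, 3, 3)
```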

Back-propagation

$$
\delta^l_{i,j} \equiv \frac{\partial Loss}{\partial z^l_{i,j}}
$$

$$
\begin{aligned}
\delta^l_{i,j} = \frac{\partial Loss}{\partial z^l_{i,j}}
&= \sum_x \sum_y \frac{\partial Loss}{\partial z^{l+1}_{x,y}} \frac{\partial z^{l+1}_{x,y}}{\partial z^l_{i,j}} \\
&= \sum_x \sum_y \delta^{l+1}_{x,y} \, (\omega')^{l+1}_{i-x,\, j-y} \, \sigma'(z^l_{i,j}) \\
&= \sum_m \sum_n \delta^{l+1}_{i-m,\, j-n} \, (\omega')^{l+1}_{m,n} \, \sigma'(z^l_{i,j})
\end{aligned}
$$

$$
\begin{aligned}
\frac{\partial Loss}{\partial (\omega')^l_{m,n}}
&= \sum_i \sum_j \frac{\partial Loss}{\partial z^l_{i,j}} \frac{\partial z^l_{i,j}}{\partial (\omega')^l_{m,n}} \\
&= \sum_i \sum_j \delta^l_{i,j} \, a^{l-1}_{i+m,\, j+n}
\end{aligned}
$$

$$
\begin{aligned}
\frac{\partial Loss}{\partial b^l_{i,j}}
&= \sum_x \sum_y \frac{\partial Loss}{\partial z^l_{x,y}} \frac{\partial z^l_{x,y}}{\partial b^l_{i,j}} \\
&= \delta^l_{i,j}
\end{aligned}
$$
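The weight and input gradients can be sanity-checked against autograd. The sketch below uses `torch.nn.functional.conv2d` (which implements cross-correlation, i.e. the ω' form above) on a single-channel example; the σ' factor is omitted because the check covers only the linear part of the layer, and the random tensor `g` plays the role of δ.

```python
import torch
import torch.nn.functional as F

torch.manual_seed(0)
a = torch.randn(1, 1, 5, 5, requires_grad=True)   # a^{l-1}
w = torch.randn(1, 1, 3, 3, requires_grad=True)   # (omega')^l

z = F.conv2d(a, w)              # z^l, shape (1, 1, 3, 3)
g = torch.randn_like(z)         # stand-in for delta^l = dLoss/dz^l
z.backward(g)

# dLoss/d(omega')_{m,n} = sum_{i,j} delta_{i,j} * a_{i+m, j+n}
# i.e. cross-correlation of the input with delta.
dw_manual = F.conv2d(a.detach(), g)
print(torch.allclose(w.grad, dw_manual, atol=1e-5))   # True

# dLoss/da^{l-1} is the "full" correlation of delta with omega',
# which is exactly a transposed convolution with the same kernel.
da_manual = F.conv_transpose2d(g, w.detach())
print(torch.allclose(a.grad, da_manual, atol=1e-5))   # True
```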

