Depthwise Separable Convolution
Introducing depthwise separable convolutions greatly reduces the number of parameters while sacrificing only a little accuracy, which makes them a good tool for shrinking a model.
Step 1, the depthwise convolution: an ordinary convolution sums over the input channels (say, three) after convolving, whereas the depthwise convolution does not sum; each channel is convolved independently with its own filter:
Step 2, the pointwise convolution: a 1×1 kernel is applied to sum across the channels:
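As a minimal sketch of the two steps using plain nn.Conv2d (the shapes here are illustrative, not from the original):

```python
import torch
import torch.nn as nn

x = torch.randn(1, 3, 32, 32)  # a batch of one 3-channel image

# Step 1 (depthwise): groups=3 gives every channel its own 3x3 filter,
# so nothing is summed across channels
depthwise = nn.Conv2d(3, 3, kernel_size=3, padding=1, groups=3, bias=False)

# Step 2 (pointwise): a 1x1 convolution sums across the 3 channels
# and produces 8 output channels
pointwise = nn.Conv2d(3, 8, kernel_size=1, bias=False)

print(depthwise(x).shape)             # torch.Size([1, 3, 32, 32])
print(pointwise(depthwise(x)).shape)  # torch.Size([1, 8, 32, 32])
```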
Comparing computational cost
(figure: model structure)
◆ Cost of a standard convolution
D_K × D_K × M × N × D_F × D_F (all terms multiplied together)
Addition operations contribute little to the complexity and are ignored.
Here D_K × D_K is the multiplication between the kernel and the input patch it covers, and D_F × D_F is the number of sliding positions (the size of the output feature map). Why multiply by M? M is the number of channels and N is the number of kernels (you can read M as the number of input channels and N as the number of output channels).
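For concreteness, with illustrative values (my own, not from the original) of D_K = 3, M = 32, N = 64 and a 28 × 28 output map (D_F = 28), the cost is 3 × 3 × 32 × 64 × 28 × 28 ≈ 1.45 × 10⁷ multiplications.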
◆ Cost of a depthwise separable convolution
◆ Depthwise step
D_K × D_K × M × D_F × D_F (depthwise cost)
◆ 1×1 convolution
M × N × D_F × D_F (pointwise cost)
The ratio of the separable cost to the standard cost is 1/N + 1/D_K², i.e. one over the number of kernels plus one over the square of the kernel size.
https://www.cnblogs.com/hellcat/p/9726528.html
Reduction ratio: (D_K × D_K × M + M × N) / (D_K × D_K × M × N) = 1/N + 1/D_K²
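A quick numeric check of this ratio, with illustrative values that are not from the original:

```python
# Cost of a standard vs. a depthwise separable convolution
D_K, M, N, D_F = 3, 32, 64, 28  # kernel size, in/out channels, output map size

standard = D_K * D_K * M * N * D_F * D_F
separable = D_K * D_K * M * D_F * D_F + M * N * D_F * D_F

print(separable / standard)  # 0.12673..., about an 8x saving
print(1 / N + 1 / D_K ** 2)  # 0.12673..., matches 1/N + 1/D_K^2
```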
Code:
```python
import torch.nn as nn


class DepthWiseConv2d(nn.Module):
    def __init__(self, in_channels, out_channels, kernel_size,
                 stride=1, padding=0, bias=True):
        super(DepthWiseConv2d, self).__init__()
        # groups=in_channels: each input channel is convolved separately
        # with its own filter, with no summing across channels
        self.depthwise_conv = nn.Conv2d(in_channels, in_channels, kernel_size,
                                        stride, padding,
                                        groups=in_channels, bias=False)
        # 1x1 convolution that sums (mixes) across the channels
        self.pointwise_conv = nn.Conv2d(in_channels, out_channels, 1, 1, 0,
                                        bias=bias)

    def forward(self, x):
        x = self.depthwise_conv(x)
        x = self.pointwise_conv(x)
        return x
```
The groups argument is what makes each channel of the input get its own filter; without it, nn.Conv2d would convolve and then sum across all channels directly.
pointwise_conv uses a 1×1 kernel.
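As a sanity check on the parameter saving (a sketch, assuming the DepthWiseConv2d class defined above):

```python
dw = DepthWiseConv2d(32, 64, kernel_size=3, padding=1, bias=False)
std = nn.Conv2d(32, 64, kernel_size=3, padding=1, bias=False)

count = lambda m: sum(p.numel() for p in m.parameters())
print(count(dw))   # 3*3*32 + 32*64 = 2336
print(count(std))  # 3*3*32*64 = 18432, roughly 8x as many
```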
```python
import numpy as np
import torch.nn as nn
import torch.nn.functional as F


class CNN(nn.Module):
    def __init__(self, activation="relu"):
        super(CNN, self).__init__()
        self.activation = F.relu if activation == "relu" else F.selu
        # Only the first layer is a standard convolution;
        # all the following ones are depthwise separable
        self.conv1 = nn.Conv2d(in_channels=1, out_channels=32,
                               kernel_size=3, padding="same")
        self.conv2 = DepthWiseConv2d(in_channels=32, out_channels=32,
                                     kernel_size=3, padding="same")
        self.pool = nn.MaxPool2d(2, 2)
        self.conv3 = DepthWiseConv2d(in_channels=32, out_channels=64,
                                     kernel_size=3, padding="same")
        self.conv4 = DepthWiseConv2d(in_channels=64, out_channels=64,
                                     kernel_size=3, padding="same")
        self.conv5 = DepthWiseConv2d(in_channels=64, out_channels=128,
                                     kernel_size=3, padding="same")
        self.conv6 = DepthWiseConv2d(in_channels=128, out_channels=128,
                                     kernel_size=3, padding="same")
        self.flatten = nn.Flatten()
        # 128 * 3 * 3 assumes 28x28 inputs (e.g. MNIST):
        # three 2x2 poolings give 28 -> 14 -> 7 -> 3
        self.fc1 = nn.Linear(128 * 3 * 3, 128)
        self.fc2 = nn.Linear(128, 10)
        self.init_weights()

    def init_weights(self):
        """Initialize the weights W of the fully connected and
        convolutional layers with the Xavier uniform distribution."""
        for m in self.modules():
            if isinstance(m, (nn.Linear, nn.Conv2d)):
                nn.init.xavier_uniform_(m.weight)
                if m.bias is not None:
                    nn.init.zeros_(m.bias)

    def forward(self, x):
        act = self.activation
        x = self.pool(act(self.conv2(act(self.conv1(x)))))
        x = self.pool(act(self.conv4(act(self.conv3(x)))))
        x = self.pool(act(self.conv6(act(self.conv5(x)))))
        x = self.flatten(x)
        x = act(self.fc1(x))
        x = self.fc2(x)
        return x


for idx, (key, value) in enumerate(CNN().named_parameters()):
    print(f"{key}\tparameters num: {np.prod(value.shape)}")
```
Note that this network converges more slowly than one built from ordinary convolutions.