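Author: Xinyu OU (欧新宇)
Current version: Release v1.0
Development platform: Paddle 2.3.2
Runtime environment: Intel Core i7-7700K CPU 4.2GHz, NVIDIA GeForce GTX 1080 Ti
The datasets used in this lesson plan are for teaching and academic exchange only; please do not use them commercially.
Last updated: October 24, 2021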
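1. Complete the network parameter configuration tables (Q1, Q4) (20 points per model).
2. Based on the network topology diagram and the parameter configuration table, complete the definitions of the MNIST network class, Class MNIST() (Q2), and the CIFAR network class, Class Cifar10 (Q5) (20 points per model).
3. Complete the network structure test code and the forward-pass test code (Q3, Q6) (10 points per model).

The original LeNet-5 model takes 32×32 inputs, but MNIST samples are 28×28, so LeNet-5 has to be adjusted slightly. In this exercise the adjusted model is provisionally named MNIST.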
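The need for the adjustment can be illustrated by tracing the feature-map size through the two convolution and two pooling stages for both input sizes. The sketch below assumes the usual output-size formula, out = (in + 2·padding − kernel) / stride + 1, and uses hypothetical helper names (`out_size`, `trace`) that are not part of the assignment.

```python
def out_size(in_size, kernel, stride=1, padding=0):
    """Spatial output size of a convolution or pooling layer."""
    return (in_size + 2 * padding - kernel) // stride + 1


def trace(in_size):
    """Feature-map sizes after Conv1 (5x5), Pool1 (2x2), Conv2 (5x5), Pool2 (2x2)."""
    sizes = []
    size = in_size
    for kernel, stride in [(5, 1), (2, 2), (5, 1), (2, 2)]:
        size = out_size(size, kernel, stride)
        sizes.append(size)
    return sizes


lenet_sizes = trace(32)  # original LeNet-5 input size
mnist_sizes = trace(28)  # MNIST input size
print("32x32 input:", lenet_sizes)
print("28x28 input:", mnist_sizes)
```

With a 32×32 input the map entering the third convolution is 5×5, while with a 28×28 input it is 4×4, which is why the adjusted model uses a 4×4 kernel in Conv3 to reach a 1×1 output.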
Layer | Input | Kernels_num | Kernels_size | Stride | Padding | PoolingType | Output | Parameters |
---|---|---|---|---|---|---|---|---|
Input | 1×28×28 | | | | | | 1×28×28 | |
Conv1 | 1×28×28 | 6 | 1×5×5 | 1 | 0 | | 6×24×24 | (1×5×5+1)×6=156 |
Pool1 | 6×24×24 | 6 | 6×2×2 | 2 | 0 | Max | 6×12×12 | 0 |
Conv2 | 6×12×12 | 16 | 6×5×5 | 1 | 0 | | 16×8×8 | (6×5×5+1)×16=2416 |
Pool2 | 16×8×8 | 16 | 16×2×2 | 2 | 0 | Max | 16×4×4 | 0 |
Conv3 | 16×4×4 | 120 | 16×4×4 | 1 | 0 | | 120×1×1 | (16×4×4+1)×120=30840 |
FC1 | (120×1×1)×1 | | | | | | 84×1 | (120+1)×84=10164 |
FC2 | 84×1 | | | | | | 10×1 | (84+1)×10=850 |
Output | 10×1 | | | | | | 10×1 | |
Total | | | | | | | | 44426 |
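The Parameters column can be checked with a few lines of arithmetic. This is a minimal sketch, assuming square kernels and one bias per output channel or unit, as in the formulas in the table; `conv_params` and `fc_params` are hypothetical helper names.

```python
def conv_params(in_channels, out_channels, kernel):
    """Weights plus one bias per output channel, for a square kernel."""
    return (in_channels * kernel * kernel + 1) * out_channels


def fc_params(in_features, out_features):
    """Weights plus one bias per output unit."""
    return (in_features + 1) * out_features


mnist_params = {
    "Conv1": conv_params(1, 6, 5),
    "Conv2": conv_params(6, 16, 5),
    "Conv3": conv_params(16, 120, 4),
    "FC1": fc_params(120, 84),
    "FC2": fc_params(84, 10),
}
total = sum(mnist_params.values())

for name, count in mnist_params.items():
    print(f"{name}: {count}")
print("Total:", total)
```

Pooling layers have no learnable weights, so they contribute nothing to the total.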
    # Load the base libraries
    import numpy as np
    import paddle
    import paddle.fluid as fluid  # the fluid-based Paddle API
    from paddle.fluid.dygraph import Linear, Conv2D, Pool2D

    use_cuda = True  # True or False; if the device has a GPU, enable it for faster training
    PLACE = fluid.CUDAPlace(0) if use_cuda else fluid.CPUPlace()

    # Define the convolutional neural network MNIST
    class MNIST(fluid.dygraph.Layer):
        name_scope = 'MNIST'

        def __init__(self, num_classes=10):
            # Initialize the MNIST class and attach its layers to self
            super(MNIST, self).__init__()
            # Q2-1: Complete the MNIST class definition from the LeNet-5 topology diagram
            # and the parameter configuration table completed in Q1
            # Layer hyperparameters: [Your codes 1]
            self.conv1 = Conv2D(num_channels=1, num_filters=6, filter_size=5, stride=1, act='relu')
            self.pool1 = Pool2D(pool_size=2, pool_stride=2, pool_type='max')
            self.conv2 = Conv2D(num_channels=6, num_filters=16, filter_size=5, stride=1, act='relu')
            self.pool2 = Pool2D(pool_size=2, pool_stride=2, pool_type='max')
            # conv3 has 120 output channels
            self.conv3 = Conv2D(num_channels=16, num_filters=120, filter_size=4, stride=1, act='relu')
            self.fc1 = Linear(input_dim=120, output_dim=84, act='relu')
            # If the last fully connected layer does not set act='softmax',
            # the activation can be applied at the output stage instead
            self.fc2 = Linear(input_dim=84, output_dim=num_classes)

        def forward(self, input):
            # Q2-2: Complete the forward pass from the LeNet-5 topology diagram
            # and the parameter configuration table completed in Q1
            # Forward pass: [Your codes 2]
            x = self.conv1(input)
            x = self.pool1(x)
            x = self.conv2(x)
            x = self.pool2(x)
            x = self.conv3(x)
            # Flatten [N, 120, 1, 1] to [N, 120] before the fully connected layers
            x = fluid.layers.reshape(x, [x.shape[0], -1])
            x = self.fc1(x)
            y = self.fc2(x)
            return y
    # Q3: Complete the network structure test code below [Your codes 3]
    model = MNIST()
    paddle.summary(model, (10, 1, 28, 28))
Output:

    ---------------------------------------------------------------------------
     Layer (type)       Input Shape          Output Shape         Param #
    ===========================================================================
       Conv2D-13     [[10, 1, 28, 28]]     [10, 6, 24, 24]          156
       Pool2D-5      [[10, 6, 24, 24]]     [10, 6, 12, 12]           0
       Conv2D-14     [[10, 6, 12, 12]]     [10, 16, 8, 8]          2,416
       Pool2D-6      [[10, 16, 8, 8]]      [10, 16, 4, 4]            0
       Conv2D-15     [[10, 16, 4, 4]]      [10, 120, 1, 1]        30,840
       Linear-9         [[10, 120]]           [10, 84]            10,164
       Linear-10        [[10, 84]]            [10, 10]              850
    ===========================================================================
    Total params: 44,426
    Trainable params: 44,426
    Non-trainable params: 0
    ---------------------------------------------------------------------------
    Input size (MB): 0.03
    Forward/backward pass size (MB): 0.44
    Params size (MB): 0.17
    Estimated Total Size (MB): 0.64
    ---------------------------------------------------------------------------
    {'total_params': 44426, 'trainable_params': 44426}
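The size lines in the summary can be related to the counts above with simple arithmetic. The sketch below assumes 4-byte float32 values and 1 MB = 1024² bytes; both are assumptions about how the report is computed rather than something stated in this handout, and `size_mb` is a hypothetical helper.

```python
BYTES_PER_FLOAT32 = 4      # assumption: parameters and inputs are stored as float32
BYTES_PER_MB = 1024 ** 2   # assumption: the report uses binary megabytes


def size_mb(num_values):
    """Memory taken by num_values float32 numbers, in MB."""
    return num_values * BYTES_PER_FLOAT32 / BYTES_PER_MB


params_mb = size_mb(44426)            # total parameters of the MNIST model
input_mb = size_mb(10 * 1 * 28 * 28)  # one batch of shape (10, 1, 28, 28)

print(f"Params size (MB): {params_mb:.2f}")
print(f"Input size (MB): {input_mb:.2f}")
```

Under these assumptions both values round to the figures shown in the summary.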
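CIFAR-10 is a color image dataset that is closer to general real-world objects. It is a small dataset for general object recognition collected by Alex Krizhevsky, Vinod Nair, and Geoffrey Hinton, and all of its samples are drawn from the 80 million tiny images dataset. It contains RGB color images in 10 classes: airplane, automobile, bird, cat, deer, dog, frog, horse, ship, and truck. Each image is 32×32 pixels and each class has 6000 images, giving 50000 training images and 10000 test images in total. There is also a similar dataset, CIFAR-100, collected by the same authors. It contains 100 classes with 600 images each, of which 500 are used for training and 100 for testing.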
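The dataset figures above are easy to sanity-check. The following sketch only restates the numbers from the description; the index order of `CLASSES` follows the order in which the classes are listed in the text and is otherwise an assumption.

```python
CLASSES = ["airplane", "automobile", "bird", "cat", "deer",
           "dog", "frog", "horse", "ship", "truck"]
IMAGES_PER_CLASS = 6000
TRAIN_IMAGES, TEST_IMAGES = 50000, 10000

# Total number of images, and the number of values in one RGB 32x32 sample
total_images = len(CLASSES) * IMAGES_PER_CLASS
values_per_image = 3 * 32 * 32

print("Total images:", total_images)
print("Train + test:", TRAIN_IMAGES + TEST_IMAGES)
print("Values per image:", values_per_image)
```

Each sample therefore enters the network as a 3×32×32 tensor, which is the input shape used in the parameter configuration table for the CIFAR-10 model.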
The figure below shows a classic convolutional neural network architecture for CIFAR-10 recognition.
Layer | Input | Kernels_num | Kernels_size | Stride | Padding | PoolingType | Output | Parameters |
---|---|---|---|---|---|---|---|---|
Input | 3×32×32 | | | | | | 3×32×32 | |
Conv1 | 3×32×32 | 32 | 3×5×5 | 1 | 0 | | 32×28×28 | (3×5×5+1)×32=2432 |
Pool1 | 32×28×28 | 32 | 32×2×2 | 2 | 0 | Max | 32×14×14 | 0 |
Conv2 | 32×14×14 | 32 | 32×5×5 | 1 | 0 | | 32×10×10 | (32×5×5+1)×32=25632 |
Pool2 | 32×10×10 | 32 | 32×2×2 | 2 | 0 | Avg | 32×5×5 | 0 |
Conv3 | 32×5×5 | 64 | 32×4×4 | 1 | 0 | | 64×2×2 | (32×4×4+1)×64=32832 |
Pool3 | 64×2×2 | 64 | 64×2×2 | 2 | 0 | Avg | 64×1×1 | 0 |
FC1 | (64×1×1)×1 | | | | | | 64×1 | (64+1)×64=4160 |
FC2 | 64×1 | | | | | | 10×1 | (64+1)×10=650 |
Output | 10×1 | | | | | | 10×1 | |
Total | | | | | | | | 65706 |
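The whole table can be reproduced by walking a layer list and tracking channels, spatial size, and parameter counts together. This is a rough sketch under the same assumptions as the table (square kernels, no padding, one bias per output channel or unit); `LAYERS`, `out_size`, and `trace` are hypothetical names introduced only for illustration.

```python
def out_size(in_size, kernel, stride=1, padding=0):
    """Spatial output size of a convolution or pooling layer."""
    return (in_size + 2 * padding - kernel) // stride + 1


# (name, kind, out_channels, kernel, stride); pooling keeps the channel count
LAYERS = [
    ("Conv1", "conv", 32, 5, 1),
    ("Pool1", "pool", None, 2, 2),
    ("Conv2", "conv", 32, 5, 1),
    ("Pool2", "pool", None, 2, 2),
    ("Conv3", "conv", 64, 4, 1),
    ("Pool3", "pool", None, 2, 2),
]


def trace(channels, size):
    """Return per-layer (name, output shape, params), the flattened size, and conv params."""
    rows, total = [], 0
    for name, kind, out_channels, kernel, stride in LAYERS:
        params = 0
        if kind == "conv":
            params = (channels * kernel * kernel + 1) * out_channels
            channels = out_channels
        size = out_size(size, kernel, stride)
        rows.append((name, (channels, size, size), params))
        total += params
    return rows, channels * size * size, total


rows, flat, conv_total = trace(3, 32)
fc_total = (flat + 1) * 64 + (64 + 1) * 10  # FC1 and FC2

for name, shape, params in rows:
    print(name, shape, params)
print("Flattened features:", flat)
print("Total parameters:", conv_total + fc_total)
```

The flattened feature count feeding FC1 comes out of the trace rather than being hard-coded, so changing a kernel size in `LAYERS` immediately shows its effect on the fully connected layers.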
    # Load the base libraries
    import paddle
    from paddle.nn import Sequential, Conv2D, MaxPool2D, AvgPool2D, Linear, Dropout, ReLU

    class Cifar10(paddle.nn.Layer):
        def __init__(self, num_classes=10):
            super(Cifar10, self).__init__()
            self.num_classes = num_classes
            # Q5-1: Complete the Cifar10 class definition from the Cifar10 topology diagram
            # and the parameter configuration table completed in Q4
            # Layer hyperparameters: [Your codes 4]
            self.features = Sequential(
                Conv2D(in_channels=3, out_channels=32, kernel_size=5, stride=1),
                ReLU(),
                MaxPool2D(kernel_size=2, stride=2),
                Conv2D(in_channels=32, out_channels=32, kernel_size=5, stride=1),
                ReLU(),
                AvgPool2D(kernel_size=2, stride=2),
                Conv2D(in_channels=32, out_channels=64, kernel_size=4, stride=1),
                ReLU(),
                AvgPool2D(kernel_size=2, stride=2),
            )
            self.fc = Sequential(
                Linear(in_features=64*1*1, out_features=64),
                Linear(in_features=64, out_features=num_classes),
            )

        def forward(self, inputs):
            # Q5-2: Complete the forward pass from the Cifar10 topology diagram
            # and the parameter configuration table completed in Q4
            # Forward pass: [Your codes 5]
            x = self.features(inputs)
            x = paddle.flatten(x, 1)  # [N, 64, 1, 1] -> [N, 64]
            x = self.fc(x)
            return x
    #### Network test
    if __name__ == '__main__':
        # 1. Print the network structure
        # Q6-1: Complete the network structure test code below: [Your codes 6]
        model = Cifar10()
        paddle.summary(model, (1, 3, 32, 32))

        # 2. Test the forward pass
        # Q6-2: Complete the forward-pass test code: [Your codes 7]
        print('Forward-pass test:')
        img = paddle.rand([2, 3, 32, 32])
        model = Cifar10()
        outs = model(img).numpy()
        print(outs)
        print('Output tensor shape: {}'.format(outs.shape))

Output:

    ---------------------------------------------------------------------------
     Layer (type)       Input Shape          Output Shape         Param #
    ===========================================================================
       Conv2D-4      [[1, 3, 32, 32]]      [1, 32, 28, 28]         2,432
        ReLU-1       [[1, 32, 28, 28]]     [1, 32, 28, 28]           0
      MaxPool2D-1    [[1, 32, 28, 28]]     [1, 32, 14, 14]           0
       Conv2D-5      [[1, 32, 14, 14]]     [1, 32, 10, 10]        25,632
        ReLU-2       [[1, 32, 10, 10]]     [1, 32, 10, 10]           0
      AvgPool2D-1    [[1, 32, 10, 10]]      [1, 32, 5, 5]            0
       Conv2D-6       [[1, 32, 5, 5]]       [1, 64, 2, 2]         32,832
        ReLU-3        [[1, 64, 2, 2]]       [1, 64, 2, 2]            0
      AvgPool2D-2     [[1, 64, 2, 2]]       [1, 64, 1, 1]            0
       Linear-3          [[1, 64]]             [1, 64]             4,160
       Linear-4          [[1, 64]]             [1, 10]              650
    ===========================================================================
    Total params: 65,706
    Trainable params: 65,706
    Non-trainable params: 0
    ---------------------------------------------------------------------------
    Input size (MB): 0.01
    Forward/backward pass size (MB): 0.49
    Params size (MB): 0.25
    Estimated Total Size (MB): 0.75
    ---------------------------------------------------------------------------
    Forward-pass test:
    [[ 0.2849031  -0.05835417 -0.8993984   0.24468683  0.73669934 -0.23037165
       1.0173364   0.14233571  0.15384644 -0.4779564 ]
     [ 0.20973384 -0.06269945 -0.86370426  0.2379338   0.79307395 -0.05458799
       0.9996517   0.09347898  0.11147128 -0.26353168]]
    Output tensor shape: (2, 10)
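The values printed by the forward-pass test are raw scores (logits), because the last Linear layer has no softmax attached. As an illustration, the sketch below converts the first printed row into class probabilities with a numerically stable softmax written in plain Python. The class order in `CLASSES` is assumed to follow the listing in the dataset description, and since the model is untrained the resulting "prediction" carries no meaning beyond showing the mechanics.

```python
import math

CLASSES = ["airplane", "automobile", "bird", "cat", "deer",
           "dog", "frog", "horse", "ship", "truck"]

# First row of the forward-pass output shown above
logits = [0.2849031, -0.05835417, -0.8993984, 0.24468683, 0.73669934,
          -0.23037165, 1.0173364, 0.14233571, 0.15384644, -0.4779564]


def softmax(values):
    """Softmax with the maximum subtracted first for numerical stability."""
    shift = max(values)
    exps = [math.exp(v - shift) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


probs = softmax(logits)
best = max(range(len(probs)), key=probs.__getitem__)

print("Probabilities sum to:", round(sum(probs), 6))
print("Highest-scoring class:", CLASSES[best])
```

Softmax preserves the ranking of the logits, so the highest-scoring class is the same before and after the conversion.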