Handwritten Digit Recognition in Python

Neural Network

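Handwritten digit recognition: the MNIST dataset

Neural network initialization

Number of nodes

We use a simple three-layer neural network: an input layer, a hidden layer, and an output layer.

Initial weight matrices

Training: the initial weight matrix between each pair of adjacent layers is drawn from a random normal distribution.

Inference: the trained weight matrices are loaded from file.

Weight matrix: \(W = \{w_{mn}\}\), where \(w_{mn}\) is the weight between node n of the previous layer and node m of the next layer. Its shape is \(m \times n\), written \(W \in F^{m \times n}\): m nodes in the next layer and n nodes in the previous layer.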
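As a small illustration of this shape convention, the sketch below builds a weight matrix for a made-up pair of layers (3 nodes feeding 2 nodes; the sizes and values are assumptions chosen for the example) and applies it to a column vector of signals.

```python
import numpy

# Hypothetical layer sizes, chosen only for illustration.
n_prev = 3  # n: nodes in the previous layer
m_next = 2  # m: nodes in the next layer

# W has shape (m, n): row m, column n holds the weight from
# previous-layer node n to next-layer node m.
W = numpy.arange(6, dtype=float).reshape(m_next, n_prev)

# Signals from the previous layer as a column vector of shape (n, 1).
x = numpy.array([1.0, 2.0, 3.0], ndmin=2).T

# The product has shape (m, 1): one combined signal per next-layer node.
y = numpy.dot(W, x)
print(W.shape, x.shape, y.shape)
```

Storing the matrix as (next layer, previous layer) is what lets each layer be computed with a single matrix product.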

Learning rate

The learning rate scales how far the weights are adjusted in response to the error at each update.
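To make the role of the learning rate concrete, here is a toy sketch that is not taken from the network itself: one gradient-descent step on the error (w - 3)**2, repeated with three different learning rates. The function `step` and the target value 3 are assumptions made up for the example.

```python
def step(w, learning_rate):
    """One gradient-descent step on the toy error (w - 3)**2."""
    gradient = 2.0 * (w - 3.0)
    return w - learning_rate * gradient

w = 0.0
# A small rate moves a little toward 3, a larger one moves further,
# and a rate that is too large jumps past the minimum.
for learning_rate in (0.1, 0.5, 1.1):
    print(learning_rate, step(w, learning_rate))
```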

Activation function

The activation function is usually the sigmoid function \(\frac{1}{1+e^{-x}}\), which models how a neuron receives a stimulus and passes it on.
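The formula can be checked against `scipy.special.expit`, the library function the code uses for the sigmoid. A minimal sketch; the helper `sigmoid` is written here only for comparison.

```python
import numpy
import scipy.special

def sigmoid(x):
    """Plain-NumPy sigmoid, written out for comparison with expit."""
    return 1.0 / (1.0 + numpy.exp(-x))

x = numpy.array([-2.0, 0.0, 2.0])
print(sigmoid(x))
print(scipy.special.expit(x))
```

Both calls should print the same values, each strictly between 0 and 1.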

import numpy
import scipy.special


class neuralNetwork:
    # initialize the neural network
    def __init__(self, inputnodes, hiddennodes, outputnodes, learningrate):
        # set number of nodes in each input, hidden, output layer
        self.inNodes = inputnodes
        self.hidNodes = hiddennodes
        self.outNodes = outputnodes

        # wih is the link weight matrix between hidden nodes and input nodes;
        # who is the link weight matrix between output nodes and hidden nodes;
        # w11 w21
        # w12 w22 etc
        self.wih = numpy.random.normal(0.0, pow(self.hidNodes, -0.5), (self.hidNodes, self.inNodes))
        self.who = numpy.random.normal(0.0, pow(self.outNodes, -0.5), (self.outNodes, self.hidNodes))

        # learning rate
        self.lr = learningrate

        # activation function is the sigmoid function
        self.activation_function = lambda x: scipy.special.expit(x)

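Training function

1. Obtain the input data and the target output data;

2. Compute the hidden layer's output (weights + activation function);

3. Compute the output layer's output (weights + activation function);

4. Compute the error at the output layer, then propagate it backward through the weights to obtain the error at the hidden layer;

5. Correct the weight matrices to reduce the error (gradient descent).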

    # method of neuralNetwork
    def train(self, inputs_list, targets_list):
        # convert inputs list to 2d array
        inputs = numpy.array(inputs_list, ndmin=2).T
        targets = numpy.array(targets_list, ndmin=2).T

        # calculate signals into hidden layer
        hidden_inputs = numpy.dot(self.wih, inputs)
        # calculate the signals emerging from hidden layer
        hidden_outputs = self.activation_function(hidden_inputs)

        # calculate signals into final output layer
        final_inputs = numpy.dot(self.who, hidden_outputs)
        # calculate the signals emerging from final output layer
        final_outputs = self.activation_function(final_inputs)

        # output layer error is the (target - actual)
        output_errors = targets - final_outputs
        # hidden layer error is the output_errors, split by weights, recombined at hidden nodes
        hidden_errors = numpy.dot(self.who.T, output_errors)

        # update the weights for the links between the hidden and output layers
        self.who += self.lr * numpy.dot((output_errors * final_outputs * (1.0 - final_outputs)), numpy.transpose(hidden_outputs))

        # update the weights for the links between the input and hidden layers
        self.wih += self.lr * numpy.dot((hidden_errors * hidden_outputs * (1.0 - hidden_outputs)), numpy.transpose(inputs))
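Written as equations, the error and update lines of `train` read as follows. The symbols are notation introduced here for illustration, not taken from the original: \(I\) is the input column vector, \(T\) the target, \(O_h\) and \(O_o\) the hidden and final outputs, \(E_o\) and \(E_h\) the two error vectors, \(\alpha\) the learning rate, and \(\odot\) the element-wise product.

```latex
\[
E_o = T - O_o, \qquad E_h = W_{ho}^{T} E_o
\]
\[
W_{ho} \leftarrow W_{ho} + \alpha \left( E_o \odot O_o \odot (1 - O_o) \right) O_h^{T}
\]
\[
W_{ih} \leftarrow W_{ih} + \alpha \left( E_h \odot O_h \odot (1 - O_h) \right) I^{T}
\]
```

The factor \(O \odot (1 - O)\) is the derivative of the sigmoid expressed through its own output. Note that, as in the code, \(E_h\) is formed from the raw output error split by the weights, which is a simplification of the exact chain-rule term.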

Query function

    # method of neuralNetwork
    def query(self, inputs_list):
        # convert inputs list to 2d array
        inputs = numpy.array(inputs_list, ndmin=2).T

        # calculate signals into hidden layer
        hidden_inputs = numpy.dot(self.wih, inputs)
        # calculate the signals emerging from hidden layer
        hidden_outputs = self.activation_function(hidden_inputs)

        # calculate signals into final output layer
        final_inputs = numpy.dot(self.who, hidden_outputs)
        # calculate the signals emerging from final output layer
        final_outputs = self.activation_function(final_inputs)

        return final_outputs
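The query steps can be tried in isolation on a tiny network. The sketch below is a standalone restatement for illustration only: the function `forward`, the layer sizes (4 inputs, 5 hidden nodes, 3 outputs), and the random weights are all assumptions, not part of the original program.

```python
import numpy
import scipy.special

def forward(wih, who, inputs_list):
    """Standalone version of the query steps, for illustration only."""
    inputs = numpy.array(inputs_list, ndmin=2).T  # shape (n_in, 1)
    hidden_outputs = scipy.special.expit(numpy.dot(wih, inputs))
    return scipy.special.expit(numpy.dot(who, hidden_outputs))

rng = numpy.random.default_rng(0)   # fixed seed so the run is repeatable
wih = rng.normal(0.0, 0.5, (5, 4))  # hypothetical: 4 inputs -> 5 hidden nodes
who = rng.normal(0.0, 0.5, (3, 5))  # hypothetical: 5 hidden -> 3 outputs

outputs = forward(wih, who, [0.1, 0.5, 0.9, 0.3])
print(outputs.shape)          # one column, one row per output node
print(numpy.argmax(outputs))  # index of the strongest output node
```

In the digit recognizer the same `argmax` step turns the ten output signals into a predicted label.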

Training process

Training dataset

Hidden layer

Learning rate

Epochs

if __name__ == "__main__":
    # train
    input_nodes = 784
    # number of hidden-layer nodes (tunable)
    hidden_nodes = 100
    output_nodes = 10

    # learning rate (tunable)
    learning_rate = 0.3
    # number of training epochs (tunable)
    epochs = 5

    n = neuralNetwork(input_nodes, hidden_nodes, output_nodes, learning_rate)

    # training_data_file = open("./mnist_train_100.csv", 'r')
    training_data_file = open("./mnist_train.csv", 'r')
    training_data_list = training_data_file.readlines()
    training_data_file.close()

    for e in range(epochs):
        for record in training_data_list:
            all_values = record.split(',')
            # scale pixel values from 0..255 to 0.01..1.0
            # (numpy.asfarray was removed in NumPy 2.0; asarray with dtype=float is equivalent)
            inputs = (numpy.asarray(all_values[1:], dtype=float) / 255.0 * 0.99) + 0.01
            # target is 0.99 for the correct digit and 0.01 elsewhere
            targets = numpy.zeros(output_nodes) + 0.01
            targets[int(all_values[0])] = 0.99
            n.train(inputs, targets)

    # test
    # test_data_file = open("mnist_test_10.csv", 'r')
    test_data_file = open("mnist_test.csv", 'r')
    test_data_list = test_data_file.readlines()
    test_data_file.close()

    scorecard = []

    for record in test_data_list:
        all_values = record.split(',')
        correct_label = int(all_values[0])
        print(correct_label, "is correct label")
        inputs = (numpy.asarray(all_values[1:], dtype=float) / 255.0 * 0.99) + 0.01
        outputs = n.query(inputs)
        label = numpy.argmax(outputs)
        print(label, "is network's answer\n")
        if label == correct_label:
            scorecard.append(1)
        else:
            scorecard.append(0)
    scorecard_array = numpy.asarray(scorecard)
    print("performance = {}".format(scorecard_array.sum() / scorecard_array.size))
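The per-record preprocessing in the loops above can be examined on its own. A minimal sketch, assuming a made-up record with only four pixel values (real MNIST rows carry 784):

```python
import numpy

# A made-up record in the CSV layout used above: label first, then pixels.
record = "7,0,128,255,64\n"
output_nodes = 10

all_values = record.strip().split(',')

# Map pixel values 0..255 onto 0.01..1.0 so that no input is exactly zero.
inputs = (numpy.asarray(all_values[1:], dtype=float) / 255.0 * 0.99) + 0.01

# Target vector: 0.01 everywhere except 0.99 at the index of the label.
targets = numpy.zeros(output_nodes) + 0.01
targets[int(all_values[0])] = 0.99

print(inputs)
print(targets)
```

The targets stay inside 0.01..0.99 because the sigmoid never outputs exactly 0 or 1, so those extremes could not be reached by training.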

easy1

# 100 training records, 10 test records
training_data_file = open("./mnist_train_100.csv", 'r')
# 100 nodes in the hidden layer
hidden_nodes = 100
# learning rate 0.3
learning_rate = 0.3
# 1 training epoch
epochs = 1

# resulting accuracy
performance = 0.6

hard1

# 60000 training records, 10000 test records
training_data_file = open("./mnist_train.csv", 'r')
# 100 nodes in the hidden layer
hidden_nodes = 100
# learning rate 0.3
learning_rate = 0.3
# 1 training epoch
epochs = 1

# resulting accuracy
performance = 0.9463

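With the larger training set, accuracy rises sharply (from 0.6 to 0.9463), and the larger test set also gives a more precise measurement of that accuracy.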

hard2

# 60000 training records, 10000 test records
training_data_file = open("./mnist_train.csv", 'r')
# 100 nodes in the hidden layer
hidden_nodes = 100
# learning rate 0.2
learning_rate = 0.2
# 1 training epoch
epochs = 1

# resulting accuracy
performance = 0.9507

Lowering the learning rate to 0.2 raises accuracy slightly; raising it to 0.4, however, lowers accuracy to performance = 0.9302.

Learning rate vs. accuracy

hard3

# 60000 training records, 10000 test records
training_data_file = open("./mnist_train.csv", 'r')
# 100 nodes in the hidden layer
hidden_nodes = 100
# learning rate 0.3
learning_rate = 0.3
# 5 training epochs
epochs = 5

# resulting accuracy
performance = 0.9517

Raising the number of epochs to 5 improves accuracy; too many epochs, however, may lower accuracy because of overfitting.

hard4

# 60000 training records, 10000 test records
training_data_file = open("./mnist_train.csv", 'r')
# 300 nodes in the hidden layer
hidden_nodes = 300
# learning rate 0.3
learning_rate = 0.3
# 1 training epoch
epochs = 1

# resulting accuracy
performance = 0.9532

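Increasing the hidden layer to 300 nodes improves accuracy somewhat. Note that the larger the hidden layer already is, the smaller the gain from enlarging it further.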

MAX

Based on the trends above, the following settings gave the best accuracy.

# 60000 training records, 10000 test records
training_data_file = open("./mnist_train.csv", 'r')
# 600 nodes in the hidden layer
hidden_nodes = 600
# learning rate 0.1
learning_rate = 0.1
# 5 training epochs
epochs = 5

# resulting accuracy
performance = 0.9753

The best accuracy reached is 97.53%.

Recognition after training

First, after training, write the settings and the trained weight matrices to files.

if __name__ == "__main__":
    # train
    ...

    # test
    ...

    # write the settings and the weights
    # (needs `import csv`; `performance` is the accuracy measured in the test step)
    with open('setup', 'w') as fw:
        fw.write(str(n.inNodes) + '\n')
        fw.write(str(n.hidNodes) + '\n')
        fw.write(str(n.outNodes) + '\n')
        fw.write(str(n.lr) + '\n')
        fw.write(str(epochs) + '\n')
        fw.write(str(performance) + '\n')
    # rows of wih first, then rows of who, in one CSV file
    with open('weight.csv', 'w', newline='') as fwc:
        fwc_csv = csv.writer(fwc)
        fwc_csv.writerows(n.wih)
        fwc_csv.writerows(n.who)

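Next, at recognition time, read the settings to initialize the neural network, then read the trained weight matrices and overwrite the random initial ones.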

# imports
import csv
import numpy
import scipy.special

import NeuralNetwork as nn


# initialization function
def initialize():
    # read the settings from the setup file
    with open("setup", 'r') as fo:
        list_temp = fo.readlines()
    input_nodes = int(list_temp[0].strip('\n'))
    hidden_nodes = int(list_temp[1].strip('\n'))
    output_nodes = int(list_temp[2].strip('\n'))
    learning_rate = float(list_temp[3].strip('\n'))
    epochs = int(list_temp[4].strip('\n'))
    theoretical_performance = float(list_temp[5].strip('\n'))

    # read the weight matrices wih and who from the weight file:
    # the first hidden_nodes rows belong to wih, the rest to who
    wih = [[] for i in range(hidden_nodes)]
    who = [[] for i in range(output_nodes)]
    with open('weight.csv', 'r') as fr:
        fr_csv = csv.reader(fr)
        k = 0
        for row in fr_csv:
            if k < hidden_nodes:
                wih[k] = list(map(float, row))
            else:
                who[k - hidden_nodes] = list(map(float, row))
            k += 1

    # instantiate the network and overwrite its random initial weights
    neural_network = nn.neuralNetwork(input_nodes, hidden_nodes, output_nodes, learning_rate)
    neural_network.wih = numpy.array(wih)
    neural_network.who = numpy.array(who)

    # print the accuracy recorded at training time
    print("theoretical_performance = {}".format(theoretical_performance))
    return neural_network


# test function
def test(neural_network):
    # test
    # test_data_file = open("mnist_test_10.csv", 'r')
    test_data_file = open("mnist_test.csv", 'r')
    test_data_list = test_data_file.readlines()
    test_data_file.close()

    scorecard = []

    for record in test_data_list:
        all_values = record.split(',')
        correct_label = int(all_values[0])
        print(correct_label, "is correct label")
        # numpy.asfarray was removed in NumPy 2.0; asarray with dtype=float is equivalent
        inputs = (numpy.asarray(all_values[1:], dtype=float) / 255.0 * 0.99) + 0.01
        outputs = neural_network.query(inputs)
        label = numpy.argmax(outputs)
        print(label, "is network's answer\n")
        if label == correct_label:
            scorecard.append(1)
        else:
            scorecard.append(0)
    scorecard_array = numpy.asarray(scorecard)
    performance = scorecard_array.sum() / scorecard_array.size
    print("performance = {}".format(performance))
    return performance
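The save and load steps rely on one file layout: the rows of wih come first in weight.csv, followed by the rows of who. The sketch below checks that round trip on a small example. The layer sizes, the random weights, and the temporary file location are assumptions made for illustration.

```python
import csv
import os
import tempfile

import numpy

# Hypothetical small sizes so the file is easy to inspect.
input_nodes, hidden_nodes, output_nodes = 4, 3, 2
rng = numpy.random.default_rng(0)
wih = rng.normal(size=(hidden_nodes, input_nodes))
who = rng.normal(size=(output_nodes, hidden_nodes))

path = os.path.join(tempfile.mkdtemp(), "weight.csv")

# Save: the rows of wih first, then the rows of who.
with open(path, "w", newline="") as f:
    writer = csv.writer(f)
    writer.writerows(wih)
    writer.writerows(who)

# Load: the first hidden_nodes rows belong to wih, the rest to who.
with open(path, "r", newline="") as f:
    rows = [list(map(float, row)) for row in csv.reader(f)]
wih_loaded = numpy.array(rows[:hidden_nodes])
who_loaded = numpy.array(rows[hidden_nodes:])

print(wih_loaded.shape, who_loaded.shape)
print(numpy.allclose(wih, wih_loaded), numpy.allclose(who, who_loaded))
```

Because the split point is hidden_nodes, the setup file has to be read before the weight file, which is the order `initialize` follows.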

The recognition program is:

if __name__ == "__main__":
    neural_network = initialize()
    test(neural_network)

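Image conversion

Sometimes the image we want to recognize does not have the standard size or resolution. The image first has to be cropped to a square with image-editing software; the resolution can then be converted by a program.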

# imports
import glob
import csv
import imageio
import numpy
from PIL import Image

import NeuralNetwork as nn
import Use


# resolution adjustment function
def produceImage(file_in, width, height, file_out):
    image = Image.open(file_in)
    # Image.ANTIALIAS was removed in Pillow 10; LANCZOS is the same filter
    resized_image = image.resize((width, height), Image.LANCZOS)
    resized_image.save(file_out)
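The square crop can also be scripted instead of being done by hand in an image editor. A minimal sketch, assuming a centered crop is acceptable for the digit images; `crop_to_square` is a hypothetical helper, not part of the original code, and the 40x30 test image is synthetic.

```python
from PIL import Image

def crop_to_square(image):
    """Return the largest centered square region of a PIL image."""
    width, height = image.size
    side = min(width, height)
    left = (width - side) // 2
    top = (height - side) // 2
    return image.crop((left, top, left + side, top + side))

# Synthetic 40x30 grayscale image standing in for a real photo.
image = Image.new("L", (40, 30), color=255)
square = crop_to_square(image)
small = square.resize((28, 28), Image.LANCZOS)
print(square.size, small.size)
```

A centered crop only works when the digit sits near the middle of the picture; otherwise the crop box would need to be chosen per image.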

First, resize the images:

glob is used to match the file names.

# resize every single-character .png file to 28x28, overwriting it in place
for image_file_name in glob.glob('?.png'):
    produceImage(image_file_name, 28, 28, image_file_name)

Next, preprocess the data of the resized images:

# numeric preprocessing
# our_own_dataset collects the preprocessed data of all images
# each element of our_own_dataset is one record: the correct label followed by the image data
our_own_dataset = []
for image_file_name in glob.glob('?.png'):
    print("loading ... ", image_file_name)
    # use the filename to set the correct label
    label = int(image_file_name[-5:-4])
    # load image data from png files into an array
    img_array = imageio.imread(image_file_name, as_gray=True)
    # reshape from 28x28 to list of 784 values, invert values
    img_data = 255.0 - img_array.reshape(784)
    # then scale data to range from 0.01 to 1.0
    img_data = (img_data / 255.0 * 0.99) + 0.01
    print(numpy.min(img_data))
    print(numpy.max(img_data))
    # append label and image data to test data set
    record = numpy.append(label, img_data)
    print(record)
    our_own_dataset.append(record)
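The inversion and scaling steps can be checked without any image file. The sketch below uses a synthetic array as a stand-in for a loaded picture; the array contents are an assumption made for illustration.

```python
import numpy

# Synthetic 28x28 grayscale image: white background (255) with one
# black pixel (0), standing in for a scanned digit.
img_array = numpy.full((28, 28), 255.0)
img_array[10, 10] = 0.0

# Invert so that ink is bright and the background dark, as in MNIST.
img_data = 255.0 - img_array.reshape(784)
# Scale to the 0.01..1.0 range used for the training inputs.
img_data = (img_data / 255.0 * 0.99) + 0.01

print(img_data.min(), img_data.max())
```

The inversion matters because MNIST stores digits as bright strokes on a dark background, while a typical scan or drawing is dark ink on white.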

Finally, run the recognition:

# initialize
neural_network = Use.initialize()

# test
scorecard = []
for record in our_own_dataset:
    all_values = record
    correct_label = int(all_values[0])
    print(correct_label, "is correct label")
    # the image data is already scaled; numpy.asfarray was removed in NumPy 2.0
    inputs = numpy.asarray(all_values[1:], dtype=float)
    outputs = neural_network.query(inputs)
    label = numpy.argmax(outputs)
    print(label, "is network's answer\n")
    if label == correct_label:
        scorecard.append(1)
    else:
        scorecard.append(0)
scorecard_array = numpy.asarray(scorecard)
print("performance = {}".format(scorecard_array.sum() / scorecard_array.size))