AI Training: Handwritten Digit Image Recognition with AlexNet

Updated: 2022-12-07 Source: 傳智教育

Image Classification

Image classification is the task of assigning an image a label from a given set of categories. In other words, the job is to analyze an input image and return a label for its category.

Suppose the category set is categories = {dog, cat, panda}, and we give the classification model an image such as the one in the figure below:

[Figure: image classification]

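The classification model assigns the image a probability for each label, for example dog: 95%, cat: 4%, panda: 1%. The image is classified as the label with the highest probability, here dog, which completes the image classification task. The rest of this article uses AlexNet to walk through the image classification process.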
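The selection step described above can be sketched in a few lines. This is a minimal illustration only: the probabilities are the example values from the text, and `predict_label` is a hypothetical helper name, not part of any library.

```python
# Example class probabilities, copied from the text above (illustrative values)
probabilities = {"dog": 0.95, "cat": 0.04, "panda": 0.01}


def predict_label(probs):
    """Return the label whose probability is largest (hypothetical helper)."""
    return max(probs, key=probs.get)


predicted = predict_label(probabilities)
print(predicted)
```

In a real model the probabilities would typically come from a softmax output layer, and the same "pick the largest" step is applied to that output.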

Handwritten Digit Recognition with AlexNet

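AlexNet appeared in 2012; the model is named after the paper's first author, Alex Krizhevsky. AlexNet uses an 8-layer convolutional neural network and won the ImageNet 2012 image recognition challenge by a wide margin. It showed for the first time that learned features can outperform hand-designed features, which upended the prevailing direction of computer vision research.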
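To make the layer structure more concrete, the sketch below traces the spatial size of the feature maps through the convolution and pooling stages. The kernel sizes, strides, and paddings are the values commonly quoted for AlexNet with a 227x227 input; they are assumptions here and are not stated in this article. `out_size` and `trace_sizes` are hypothetical helper names.

```python
def out_size(size, kernel, stride=1, padding=0):
    """Spatial output size of a convolution or pooling layer."""
    return (size + 2 * padding - kernel) // stride + 1


# (name, kernel, stride, padding): commonly quoted AlexNet values (assumed)
LAYERS = [
    ("conv1", 11, 4, 0),
    ("pool1", 3, 2, 0),
    ("conv2", 5, 1, 2),
    ("pool2", 3, 2, 0),
    ("conv3", 3, 1, 1),
    ("conv4", 3, 1, 1),
    ("conv5", 3, 1, 1),
    ("pool5", 3, 2, 0),
]


def trace_sizes(input_size=227):
    """Return the feature-map height/width after each layer."""
    sizes = {}
    size = input_size
    for name, kernel, stride, padding in LAYERS:
        size = out_size(size, kernel, stride, padding)
        sizes[name] = size
    return sizes


sizes = trace_sizes(227)
print(sizes)
```

Under these assumptions, the five convolutional layers are followed by three fully connected layers, which gives the eight learned layers mentioned above.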

AlexNet was trained on the ImageNet dataset, but because ImageNet is large and training on it takes a long time, we continue to use the MNIST dataset from earlier to demonstrate AlexNet. When reading the data, the image height and width are enlarged to 227, the height and width AlexNet uses. This is done with tf.image.resize_with_pad.
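tf.image.resize_with_pad resizes an image while keeping its aspect ratio and pads the remainder with zeros to reach the target size. The sketch below illustrates only that geometry in plain Python; it is not TensorFlow's implementation, its rounding may differ from the library's, and `fit_with_padding` is a hypothetical helper name.

```python
def fit_with_padding(height, width, target_height, target_width):
    """Return (new_height, new_width, pad_top, pad_left) for an
    aspect-preserving resize followed by zero padding.

    Illustrative sketch only; rounding may differ from TensorFlow.
    """
    # Largest scale at which both dimensions still fit inside the target
    scale = min(target_height / height, target_width / width)
    new_height = min(target_height, round(height * scale))
    new_width = min(target_width, round(width * scale))
    # Remaining space is split evenly; this is the top/left share
    pad_top = (target_height - new_height) // 2
    pad_left = (target_width - new_width) // 2
    return new_height, new_width, pad_top, pad_left


# A square 28x28 MNIST image fills a square 227x227 target with no padding
print(fit_with_padding(28, 28, 227, 227))
# A non-square image is padded along its shorter side
print(fit_with_padding(20, 40, 100, 100))
```

Because MNIST images and the 227x227 target are both square, no padding is needed in this tutorial and the call amounts to a plain resize.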

Reading the Data

First, load the data and adjust its dimensions:

import numpy as np
import tensorflow as tf
from tensorflow.keras.datasets import mnist

# Load the handwritten digit dataset
(train_images, train_labels), (test_images, test_labels) = mnist.load_data()
# Reshape the training set to N H W C (add a channel axis of size 1)
train_images = np.reshape(train_images,(train_images.shape[0],train_images.shape[1],train_images.shape[2],1))
# Reshape the test set to N H W C (add a channel axis of size 1)
test_images = np.reshape(test_images,(test_images.shape[0],test_images.shape[1],test_images.shape[2],1))
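The reshape above only appends a channel axis of size 1 so the grayscale images match the N H W C layout. As a small check, the sketch below uses a random stand-in array (not real MNIST data) and compares the reshape with the equivalent `np.newaxis` indexing.

```python
import numpy as np

# Stand-in for MNIST: 4 grayscale images of 28x28 pixels (random, not real data)
images = np.random.randint(0, 256, size=(4, 28, 28), dtype=np.uint8)

# Same reshape as in the tutorial: (N, H, W) -> (N, H, W, 1)
reshaped = np.reshape(
    images, (images.shape[0], images.shape[1], images.shape[2], 1)
)

# Equivalent, shorter form: append a new trailing axis
expanded = images[..., np.newaxis]

print(reshaped.shape)
print(np.array_equal(reshaped, expanded))
```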

Because training on the full dataset takes a long time, we define two functions that draw a random subset of the data and resize the images to 227*227 for model training:

# Define two functions that randomly sample part of the data for the demo

# Get training data
def get_train(size):
    # Randomly generate the indices of the samples to draw
    index = np.random.randint(0, np.shape(train_images)[0], size)
    # Resize these images to 227*227
    resized_images = tf.image.resize_with_pad(train_images[index], 227, 227)
    # Return the sampled training images and labels
    return resized_images.numpy(), train_labels[index]

# Get test data
def get_test(size):
    # Randomly generate the indices of the samples to draw
    index = np.random.randint(0, np.shape(test_images)[0], size)
    # Resize these images to 227*227
    resized_images = tf.image.resize_with_pad(test_images[index], 227, 227)
    # Return the sampled test images and labels
    return resized_images.numpy(), test_labels[index]
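Note that np.random.randint draws indices with replacement, so the same image can appear more than once in a sample. If distinct samples are wanted, one possible alternative is `Generator.choice` with `replace=False`. The sketch below compares the two on a hypothetical dataset size of 1000; the seed and sizes are arbitrary choices for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)  # arbitrary seed, for reproducibility only
n_samples = 1000                # hypothetical dataset size
size = 256                      # number of indices to draw

# With replacement: duplicates are possible (same behavior as np.random.randint)
with_replacement = rng.integers(0, n_samples, size)

# Without replacement: every index is distinct
without_replacement = rng.choice(n_samples, size=size, replace=False)

print(len(np.unique(with_replacement)), "distinct indices with replacement")
print(len(np.unique(without_replacement)), "distinct indices without replacement")
```

For a quick demo either approach works; sampling without replacement simply avoids spending training steps on repeated images.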

Call the two functions above to obtain the datasets used to train and test the model:

# Get the training and test samples
train_images,train_labels = get_train(256)
test_images,test_labels = get_test(128)

To make this easier to follow, we display the data:

import matplotlib.pyplot as plt

# Display the first nine samples of the dataset
for i in range(9):
    plt.subplot(3,3,i+1)
    # Show as a grayscale image without interpolation
    # (uint8 keeps pixel values in 0-255; int8 would overflow above 127)
    plt.imshow(train_images[i].astype(np.uint8).squeeze(), cmap='gray', interpolation='none')
    # Set the image title to the corresponding class
    plt.title("Digit {}".format(train_labels[i]))
plt.show()

The result is: