
Implementing a Convolutional Neural Network

MNIST ๋ฐ์ดํ„ฐ์˜ ์†๊ธ€์”จ๋กœ ์ ํžŒ ์ˆซ์ž ์ด๋ฏธ์ง€๋ฅผ ๋ถ„๋ฅ˜ํ•˜๋Š” ๋‹ค์ค‘ ๋ถ„๋ฅ˜(Multiclass classification) ๋ฌธ์ œ๋ฅผ ๋‹ค๋ฃฐ ๊ฒƒ์ด๋‹ค. ์•ž์˜ ํฌ์ŠคํŒ… "[Model] 01. ์ธ๊ณต์‹ ๊ฒฝ๋ง(Artificial Neural Network) ๊ตฌํ˜„"๊ณผ ์ค‘๋ณต๋˜๋Š” ๋‚ด์šฉ์— ๋Œ€ํ•ด ๋‹ค๋ฃจ์ง€ ์•Š์„ ๊ฒƒ์ด๋‹ค. ํ•„์š”ํ•˜๋ฉด ๋‹ค์Œ ๋งํฌ๋ฅผ ํ†ตํ•ด ์ฐธ๊ณ ํ•˜๋„๋ก ํ•˜์ž.

 

 

[Model] 01. Implementing an Artificial Neural Network (rooney-song.tistory.com)

 

ํŒŒ์ดํ† ์น˜๋ฅผ ์ด์šฉํ•˜์—ฌ ํ•ฉ์„ฑ๊ณฑ ์‹ ๊ฒฝ๋ง(Convolutional Neural Network)๋ฅผ ๊ตฌํ˜„ํ•  ๊ฒƒ์ด๋‹ค. ๊ตฌํ˜„ ๊ณผ์ •์€ ๋‹ค์Œ๊ณผ ๊ฐ™๋‹ค.

 

1. ๋ฐ์ดํ„ฐ ์ž…๋ ฅ ๋ฐ ํ™•์ธ

2. ๋ฐ์ดํ„ฐ ์ „์ฒ˜๋ฆฌ

3. ๋ชจ๋ธ ์„ค์ •

4. ๋ฐ์ดํ„ฐ ํ•™์Šต ๋ฐ ๊ฒ€์ฆ

 

1. ๋ฐ์ดํ„ฐ ์ž…๋ ฅ ๋ฐ ํ™•์ธ

 

In:

import numpy as np
import torch
import torch.nn as nn
import torch.optim as optim
import torch.nn.init as init
import torchvision.datasets as dset
import torchvision.transforms as transforms
from torch.utils.data import DataLoader
from torchsummary import summary
import matplotlib.pyplot as plt

train = dset.MNIST('./', train = True, transform = transforms.ToTensor(), target_transform = None, download = True)
test = dset.MNIST('./', train = False, transform = transforms.ToTensor(), target_transform = None, download = True)

 

▷ We loaded the MNIST data using dset.MNIST(). "./" means the directory of the current code is used as the root path, and passing transforms.ToTensor() to the transform argument converts the image data to PyTorch tensors. No transform is applied to the labels. The download argument makes it download the MNIST data when it is not already present at the given path.

 

In:

print(train)
print(test)

 

Out:

Dataset MNIST
    Number of datapoints: 60000
    Root location: ./
    Split: Train
    StandardTransform
Transform: ToTensor()
Dataset MNIST
    Number of datapoints: 10000
    Root location: ./
    Split: Test
    StandardTransform
Transform: ToTensor()

 

โ–ท ํ›ˆ๋ จ ๋ฐ์ดํ„ฐ์˜ ์ˆ˜๋Š” 60,000๊ฐœ ํ…Œ์ŠคํŠธ ๋ฐ์ดํ„ฐ์˜ ์ˆ˜๋Š” 10,000๊ฐœ์ด๊ณ , ๋ฐ์ดํ„ฐ์˜ ํ˜•ํƒœ๊ฐ€ ํ…์„œ๋กœ ๋ฐ”๋€ ๊ฒƒ์„ ํ™•์ธํ•  ์ˆ˜ ์žˆ๋‹ค.

 

In:

train[1][0].shape

 

Out:

torch.Size([1, 28, 28])

 

โ–ท ๊ฐ ๋ฐ์ดํ„ฐ์˜ ํ˜•ํƒœ๋ฅผ ํ™•์ธํ•  ๊ฒฐ๊ณผ, ์ฑ„๋„์˜ ์ˆ˜๋Š” 1๊ฐœ, ๊ฐ€๋กœ์™€ ์„ธ๋กœ์˜ ํ”ฝ์…€์˜ ์ˆ˜๋Š” 28๊ฐœ์ธ ๊ฒƒ์„ ํ™•์ธํ•  ์ˆ˜ ์žˆ๋‹ค.

 

2. ๋ฐ์ดํ„ฐ ์ „์ฒ˜๋ฆฌ

 

In:

batch_size = 256

ldr_train = DataLoader(train, batch_size = batch_size, shuffle = True, num_workers = 2, drop_last = True)
ldr_test = DataLoader(test, batch_size = batch_size, shuffle = True, num_workers = 2, drop_last = True)

 

โ–ท ๋ฐฐ์น˜ ํฌ๊ธฐ๋ฅผ 256์œผ๋กœ ์„ค์ •ํ•˜๊ณ , ํ›ˆ๋ จ ๋ฐ ํ…Œ์ŠคํŠธ ๋ฐ์ดํ„ฐ ์„ธํŠธ์˜ ํƒ€์ž…์„ ๋ฐ์ดํ„ฐ ๋กœ๋”(Data loader)๋กœ ๋ณ€ํ™˜ํ•˜์˜€๋‹ค.

 

▷ The num_workers argument sets the number of worker processes used for loading, and drop_last decides whether to drop the final batch when it is smaller than batch_size. (Note that with drop_last = True on the test loader, the last 10000 % 256 = 16 test samples are excluded from evaluation.)
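The effect of drop_last can be seen on a small synthetic dataset (random tensors in a TensorDataset stand in for MNIST so the snippet runs on its own):

```python
import torch
from torch.utils.data import DataLoader, TensorDataset

# 10 fake samples shaped like MNIST images, with random labels.
data = TensorDataset(torch.randn(10, 1, 28, 28), torch.randint(0, 10, (10,)))

kept = DataLoader(data, batch_size=4, drop_last=False)    # batches of 4, 4, 2
dropped = DataLoader(data, batch_size=4, drop_last=True)  # incomplete batch of 2 dropped

print(len(kept), len(dropped))  # 3 2
```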

 

3. ๋ชจ๋ธ ์„ค์ •

 

In:

class CNN(nn.Module):
    def __init__(self):
        super(CNN, self).__init__()
        
        self.layer = nn.Sequential(
            nn.Conv2d(1, 16, 5), 
            nn.ReLU(), 
            nn.Conv2d(16, 32, 5), 
            nn.ReLU(), 
            nn.MaxPool2d(2, 2), 
            nn.Conv2d(32, 64, 5), 
            nn.ReLU(), 
            nn.MaxPool2d(2, 2)
        )
        
        self.fc_layer = nn.Sequential(
            nn.Linear(64*3*3, 100), 
            nn.ReLU(), 
            nn.Linear(100, 10)
        )
        
    def forward(self, x):
        out = self.layer(x)
        out = out.view(out.size(0), -1)  # flatten; out.size(0) avoids relying on the global batch_size
        out = self.fc_layer(out)
        
        return out

 

▷ nn.Conv2d() specifies the number of input and output channels and the filter size. So in the first layer, the number of input channels is 1, the number of output channels is 16, and the filter size is 5×5.
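The parameter counts reported by summary() later in the post can be reproduced by hand: a Conv2d layer has in_channels × k × k weights per output channel, plus one bias per output channel. A small arithmetic sanity check:

```python
def conv2d_params(c_in, c_out, k):
    # weights: c_out filters of size c_in * k * k, plus one bias per filter
    return c_out * (c_in * k * k + 1)

p1 = conv2d_params(1, 16, 5)   # Conv2d(1, 16, 5)
p2 = conv2d_params(16, 32, 5)  # Conv2d(16, 32, 5)
p3 = conv2d_params(32, 64, 5)  # Conv2d(32, 64, 5)
print(p1, p2, p3, p1 + p2 + p3)  # 416 12832 51264 64512
```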

 

▷ nn.MaxPool2d() sets the pooling window size and the stride for max pooling. If the stride argument is omitted, it defaults to the window size (here we pass 2 explicitly).

 

โ–ท ์œ„์˜ CNN ๋ชจ๋ธ์€ ํ™œ์„ฑํ™” ํ•จ์ˆ˜(Activation function)๋กœ ReLU๋ฅผ ์‚ฌ์šฉํ•˜๊ณ  ์žˆ๋‹ค. ์„ธ ํ•ฉ์„ฑ๊ณฑ ๋ ˆ์ด์–ด(Convolutional layer)๋ฅผ ๊ฑฐ์นœ ๋‹ค์Œ, ๋‘ ์„ ํ˜• ๋ ˆ์ด์–ด๋ฅผ ํ†ตํ•ด ๊ฒฐ๊ณผ๊ฐ’์„ ์ถœ๋ ฅํ•œ๋‹ค. ์ด๋•Œ, ์ค‘๊ฐ„์— ์˜ค๋ฒ„ํ”ผํŒ…(Overfitting)์„ ๋ฐฉ์ง€ํ•˜๊ธฐ ์œ„ํ•ด ๋งฅ์Šค ํ’€๋ง์„ ์‚ฌ์šฉํ•˜์˜€๋‹ค.

 

In:

summary(nn.Sequential(
            nn.Conv2d(1, 16, 5), 
            nn.ReLU(), 
            nn.Conv2d(16, 32, 5), 
            nn.ReLU(), 
            nn.MaxPool2d(2, 2), 
            nn.Conv2d(32, 64, 5), 
            nn.ReLU(), 
            nn.MaxPool2d(2, 2)
        ), (1, 28, 28))

summary(nn.Sequential(
            nn.Linear(64*3*3, 100), 
            nn.ReLU(), 
            nn.Linear(100, 10)
        ), (1, 64*3*3))

 

Out:

----------------------------------------------------------------
        Layer (type)               Output Shape         Param #
================================================================
            Conv2d-1           [-1, 16, 24, 24]             416
              ReLU-2           [-1, 16, 24, 24]               0
            Conv2d-3           [-1, 32, 20, 20]          12,832
              ReLU-4           [-1, 32, 20, 20]               0
         MaxPool2d-5           [-1, 32, 10, 10]               0
            Conv2d-6             [-1, 64, 6, 6]          51,264
              ReLU-7             [-1, 64, 6, 6]               0
         MaxPool2d-8             [-1, 64, 3, 3]               0
================================================================
Total params: 64,512
Trainable params: 64,512
Non-trainable params: 0
----------------------------------------------------------------
Input size (MB): 0.00
Forward/backward pass size (MB): 0.40
Params size (MB): 0.25
Estimated Total Size (MB): 0.65
----------------------------------------------------------------
----------------------------------------------------------------
        Layer (type)               Output Shape         Param #
================================================================
            Linear-1               [-1, 1, 100]          57,700
              ReLU-2               [-1, 1, 100]               0
            Linear-3                [-1, 1, 10]           1,010
================================================================
Total params: 58,710
Trainable params: 58,710
Non-trainable params: 0
----------------------------------------------------------------
Input size (MB): 0.00
Forward/backward pass size (MB): 0.00
Params size (MB): 0.22
Estimated Total Size (MB): 0.23
----------------------------------------------------------------

 

▷ summary() shows the output shape and the number of parameters of each layer.

 

โ–ท ํ•ฉ์„ฑ๊ณฑ ๋ ˆ์ด์–ด์˜ ํŒŒ๋ผ๋ฏธํ„ฐ์˜ ์ˆ˜๋Š” 64,512๊ฐœ, ์™„์ „์—ฐ๊ฒฐ ์ธต(Fully connnected layer)์˜ ํŒŒ๋ผ๋ฏธํ„ฐ์˜ ์ˆ˜๋Š” 58,710๊ฐœ์ธ ๊ฒƒ์„ ํ™•์ธํ•  ์ˆ˜ ์žˆ๋‹ค.

 

4. ๋ฐ์ดํ„ฐ ํ•™์Šต ๋ฐ ๊ฒ€์ฆ

 

Out:

Epoch: 1/10..  Training Loss: 0.771..  Test Loss: 0.212..  Test Accuracy: 0.937
Epoch: 2/10..  Training Loss: 0.168..  Test Loss: 0.125..  Test Accuracy: 0.962
Epoch: 3/10..  Training Loss: 0.112..  Test Loss: 0.079..  Test Accuracy: 0.976
Epoch: 4/10..  Training Loss: 0.087..  Test Loss: 0.073..  Test Accuracy: 0.977
Epoch: 5/10..  Training Loss: 0.072..  Test Loss: 0.054..  Test Accuracy: 0.983
Epoch: 6/10..  Training Loss: 0.061..  Test Loss: 0.053..  Test Accuracy: 0.984
Epoch: 7/10..  Training Loss: 0.054..  Test Loss: 0.046..  Test Accuracy: 0.985
Epoch: 8/10..  Training Loss: 0.048..  Test Loss: 0.044..  Test Accuracy: 0.987
Epoch: 9/10..  Training Loss: 0.044..  Test Loss: 0.039..  Test Accuracy: 0.987
Epoch: 10/10..  Training Loss: 0.041..  Test Loss: 0.036..  Test Accuracy: 0.988

 

In:

plt.plot(train_loss, label = 'Training loss')
plt.plot(test_loss, label = 'Test loss')
plt.xlabel('Epoch')
plt.ylabel('Cross Entropy Loss')
plt.legend(frameon = True)

 

Out: [line plot of training and test cross-entropy loss per epoch, both decreasing]

 

โ–ท ๋งˆ์ง€๋ง‰ ํ•™์Šต์„ ์ง„ํ–‰ํ•œ ๋’ค, ํ…Œ์ŠคํŠธ ๋ฐ์ดํ„ฐ์—์„œ์˜ ์ •ํ™•๋„๊ฐ€ 98.8%๋กœ ์ƒ๋‹นํžˆ ํก์กฑํ•  ๋งŒํ•œ ๊ฒฐ๊ณผ๋ฅผ ์–ป์„ ์ˆ˜ ์žˆ์—ˆ๋‹ค.

 

โ–ท ๊ทธ๋ž˜ํ”„์˜ ํ•™์Šต ์ง„ํ–‰๊ณผ์ •์ด ์ƒ๋‹นํžˆ ๋ฐ”๋žŒ์งํ•˜๊ฒŒ ๋‚˜ํƒ€๋‚œ ๊ฒƒ์„ ํ™•์ธํ•  ์ˆ˜ ์žˆ๋‹ค.

 


Reference:

Choi Geon-ho, 「파이토치 첫걸음」 (First Steps with PyTorch), Hanbit Media (2019)