Implementing a Convolutional Neural Network (CNN)

This post tackles a multiclass classification problem: classifying images of handwritten digits from the MNIST dataset. Material already covered in the previous post, "[Model] 01. Implementing an Artificial Neural Network", is not repeated here. Refer to the link below if needed.

[Model] 01. Implementing an Artificial Neural Network (rooney-song.tistory.com)

We will implement a convolutional neural network (CNN) in PyTorch. The implementation proceeds as follows.

1. Loading and inspecting the data

2. Preprocessing the data

3. Defining the model

4. Training and validation

1. Loading and inspecting the data

In:

import numpy as np
import torch
import torch.nn as nn
import torch.optim as optim
import torch.nn.init as init
import torchvision.datasets as dset
import torchvision.transforms as transforms
from torch.utils.data import DataLoader
from torchsummary import summary
import matplotlib.pyplot as plt

# Download MNIST into ./ if it is not already there; ToTensor() converts each image to a tensor
train = dset.MNIST('./', train = True, transform = transforms.ToTensor(), target_transform = None, download = True)
test = dset.MNIST('./', train = False, transform = transforms.ToTensor(), target_transform = None, download = True)

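▷ dset.MNIST() loads the MNIST data. "./" means the current working directory is used as the data root, and passing transforms.ToTensor() as the transform argument converts each image to a PyTorch tensor. No transform is applied to the labels (target_transform = None). With download = True, the data is downloaded if it is not already present under that path.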
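
To make the effect of transforms.ToTensor() concrete, here is a minimal sketch of what it does to one grayscale image: add a channel axis and rescale the 0..255 pixel values to 0.0..1.0. The helper name to_tensor_like and the random stand-in image are assumptions for illustration only; the real transform is applied by torchvision when a sample is read.

```python
import torch


def to_tensor_like(img_hw_uint8: torch.Tensor) -> torch.Tensor:
    """Hypothetical helper that mimics ToTensor() for one grayscale uint8 image."""
    # Add a leading channel axis: (H, W) -> (1, H, W)
    chw = img_hw_uint8.unsqueeze(0)
    # Convert to float32 and rescale pixel values from 0..255 to 0.0..1.0
    return chw.to(torch.float32).div(255)


# Stand-in for one 28x28 MNIST image (random pixels, not real data)
fake_img = torch.randint(0, 256, (28, 28), dtype=torch.uint8)
x = to_tensor_like(fake_img)
print(x.shape, x.dtype)
```

The resulting shape matches the [1, 28, 28] layout that the convolutional layers later expect as (channels, height, width).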

In:

print(train)
print(test)

Out:

Dataset MNIST
    Number of datapoints: 60000
    Root location: ./
    Split: Train
    StandardTransform
Transform: ToTensor()
Dataset MNIST
    Number of datapoints: 10000
    Root location: ./
    Split: Test
    StandardTransform
Transform: ToTensor()

▷ There are 60,000 training examples and 10,000 test examples, and the ToTensor() transform is registered on both datasets, so each image is returned as a tensor.

In:

train[1][0].shape

Out:

torch.Size([1, 28, 28])

▷ Checking the shape of a single example shows that it has 1 channel and is 28 pixels in both height and width.

2. Preprocessing the data

In:

batch_size = 256

ldr_train = torch.utils.data.DataLoader(train, batch_size = batch_size, shuffle = True, num_workers = 2, drop_last = True)
ldr_test = torch.utils.data.DataLoader(test, batch_size = batch_size, shuffle = True, num_workers = 2, drop_last = True)

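▷ The batch size is set to 256, and the training and test datasets are each wrapped in a data loader (DataLoader).

▷ num_workers sets the number of worker subprocesses used to load data, and drop_last determines whether the last, incomplete batch is dropped. With drop_last = True and a batch size of 256, 96 training examples (60,000 mod 256) and 16 test examples (10,000 mod 256) are left out of each pass.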
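
The effect of drop_last on the number of batches can be worked out with plain arithmetic. The sketch below uses a hypothetical helper, num_batches, that follows the usual rule: floor division when the incomplete batch is dropped, ceiling division otherwise.

```python
import math


def num_batches(n_samples: int, batch_size: int, drop_last: bool) -> int:
    """Hypothetical helper: batches per pass over a dataset of n_samples items."""
    if drop_last:
        # The incomplete final batch is discarded
        return n_samples // batch_size
    # The incomplete final batch is kept as a smaller batch
    return math.ceil(n_samples / batch_size)


batch_size = 256
report = {}
for name, n in [("train", 60000), ("test", 10000)]:
    kept = num_batches(n, batch_size, drop_last=True)
    dropped = n - kept * batch_size
    report[name] = (kept, dropped)
    print(f"{name}: {kept} full batches, {dropped} samples dropped per pass")
```

Because the loaders shuffle, the samples that end up dropped differ from one pass to the next.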

3. Defining the model

In:

class CNN(nn.Module):
    def __init__(self):
        super(CNN, self).__init__()
        
        self.layer = nn.Sequential(
            nn.Conv2d(1, 16, 5), 
            nn.ReLU(), 
            nn.Conv2d(16, 32, 5), 
            nn.ReLU(), 
            nn.MaxPool2d(2, 2), 
            nn.Conv2d(32, 64, 5), 
            nn.ReLU(), 
            nn.MaxPool2d(2, 2)
        )
        
        self.fc_layer = nn.Sequential(
            nn.Linear(64*3*3, 100), 
            nn.ReLU(), 
            nn.Linear(100, 10)
        )
        
    def forward(self, x):
        out = self.layer(x)
        out = out.view(out.size(0), -1)  # flatten each sample; use the actual batch dimension, not the global batch_size
        out = self.fc_layer(out)
        
        return out

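▷ nn.Conv2d() takes the number of input channels, the number of output channels, and the filter (kernel) size. In the first layer, the number of input channels is 1, the number of output channels is 16, and the filter size is 5×5.

▷ nn.MaxPool2d() takes the pooling window size and the stride. If the stride argument is omitted, it defaults to the window size (kernel_size).

▷ The CNN above uses ReLU as its activation function. The input passes through three convolutional layers and then two linear layers to produce the output. Max pooling is applied along the way to downsample the feature maps, which shrinks the input to the linear layers to 64×3×3 = 576 features and can help limit overfitting.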
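
The spatial size after each convolution or pooling layer follows floor((size + 2·padding − kernel) / stride) + 1 when dilation is 1. The sketch below applies that formula to the layer sequence of the model above; out_size is a hypothetical helper name.

```python
def out_size(size: int, kernel: int, stride: int = 1, padding: int = 0) -> int:
    """Hypothetical helper: output width/height of a conv or pooling layer (dilation 1)."""
    return (size + 2 * padding - kernel) // stride + 1


# (kernel, stride) for each layer that changes the spatial size, in model order
layers = [
    (5, 1),  # Conv2d(1, 16, 5)
    (5, 1),  # Conv2d(16, 32, 5)
    (2, 2),  # MaxPool2d(2, 2)
    (5, 1),  # Conv2d(32, 64, 5)
    (2, 2),  # MaxPool2d(2, 2)
]

trace = [28]  # MNIST images are 28x28
for kernel, stride in layers:
    trace.append(out_size(trace[-1], kernel, stride))

print(" -> ".join(str(s) for s in trace))
```

The final size of 3 is where the 64*3*3 input dimension of the first linear layer comes from.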
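
The 64*3*3 input size of the first linear layer can also be checked by pushing a dummy batch through the convolutional stack. This is a sketch, not part of the original model code: it rebuilds the same layer sequence and flattens with out.size(0), a choice made here so that the same code works for any batch size, including a smaller final batch.

```python
import torch
import torch.nn as nn

# Same convolutional stack as in the CNN class above
features = nn.Sequential(
    nn.Conv2d(1, 16, 5), nn.ReLU(),
    nn.Conv2d(16, 32, 5), nn.ReLU(), nn.MaxPool2d(2, 2),
    nn.Conv2d(32, 64, 5), nn.ReLU(), nn.MaxPool2d(2, 2),
)

flat_shapes = {}
with torch.no_grad():
    for n in (256, 16):                       # a full batch and a smaller one
        x = torch.zeros(n, 1, 28, 28)         # dummy input, not real data
        out = features(x)                     # shape (n, 64, 3, 3)
        flat = out.view(out.size(0), -1)      # flatten everything except the batch axis
        flat_shapes[n] = tuple(flat.shape)
        print(n, tuple(out.shape), "->", tuple(flat.shape))
```

Flattening with a fixed constant instead of out.size(0) would only work when every batch has exactly that many samples.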

In:

summary(nn.Sequential(
            nn.Conv2d(1, 16, 5), 
            nn.ReLU(), 
            nn.Conv2d(16, 32, 5), 
            nn.ReLU(), 
            nn.MaxPool2d(2, 2), 
            nn.Conv2d(32, 64, 5), 
            nn.ReLU(), 
            nn.MaxPool2d(2, 2)
        ), (1, 28, 28))

summary(nn.Sequential(
            nn.Linear(64*3*3, 100), 
            nn.ReLU(), 
            nn.Linear(100, 10)
        ), (1, 64*3*3))

Out:

----------------------------------------------------------------
        Layer (type)               Output Shape         Param #
================================================================
            Conv2d-1           [-1, 16, 24, 24]             416
              ReLU-2           [-1, 16, 24, 24]               0
            Conv2d-3           [-1, 32, 20, 20]          12,832
              ReLU-4           [-1, 32, 20, 20]               0
         MaxPool2d-5           [-1, 32, 10, 10]               0
            Conv2d-6             [-1, 64, 6, 6]          51,264
              ReLU-7             [-1, 64, 6, 6]               0
         MaxPool2d-8             [-1, 64, 3, 3]               0
================================================================
Total params: 64,512
Trainable params: 64,512
Non-trainable params: 0
----------------------------------------------------------------
Input size (MB): 0.00
Forward/backward pass size (MB): 0.40
Params size (MB): 0.25
Estimated Total Size (MB): 0.65
----------------------------------------------------------------
----------------------------------------------------------------
        Layer (type)               Output Shape         Param #
================================================================
            Linear-1               [-1, 1, 100]          57,700
              ReLU-2               [-1, 1, 100]               0
            Linear-3                [-1, 1, 10]           1,010
================================================================
Total params: 58,710
Trainable params: 58,710
Non-trainable params: 0
----------------------------------------------------------------
Input size (MB): 0.00
Forward/backward pass size (MB): 0.00
Params size (MB): 0.22
Estimated Total Size (MB): 0.23
----------------------------------------------------------------

▷ summary() shows the output shape and the number of parameters of each layer.

▷ The convolutional layers have 64,512 parameters and the fully connected layers have 58,710 parameters.
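
The parameter counts reported by summary() can be reproduced by hand: a convolutional layer has (in_channels × k × k + 1) × out_channels parameters, and a linear layer has (in_features + 1) × out_features, where the +1 is the bias. The helper names below are hypothetical.

```python
def conv2d_params(c_in: int, c_out: int, k: int) -> int:
    """Weights (c_in * k * k per filter) plus one bias per output channel."""
    return (c_in * k * k + 1) * c_out


def linear_params(n_in: int, n_out: int) -> int:
    """Weights (n_in per output unit) plus one bias per output unit."""
    return (n_in + 1) * n_out


conv_counts = [conv2d_params(1, 16, 5), conv2d_params(16, 32, 5), conv2d_params(32, 64, 5)]
fc_counts = [linear_params(64 * 3 * 3, 100), linear_params(100, 10)]

print("conv layers:", conv_counts, "total", sum(conv_counts))
print("fc layers:  ", fc_counts, "total", sum(fc_counts))
```

ReLU and max pooling have no learnable parameters, which is why they show 0 in the summary table.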

4. Training and validation
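
The training cell itself does not appear in this post, only its output. The following is a minimal sketch of a loop that would produce output in the same format and the train_loss and test_loss lists used in the plot further down. The function name fit, the Adam optimizer, the learning rate, and averaging the loss over batches are all assumptions, not the original code; cross-entropy loss is inferred from the y-axis label of the plot.

```python
import torch
import torch.nn as nn
import torch.optim as optim


def fit(model, ldr_train, ldr_test, epochs=10, lr=1e-3, device="cpu"):
    """Hypothetical training loop: returns per-epoch training and test losses."""
    model = model.to(device)
    criterion = nn.CrossEntropyLoss()                  # assumed loss function
    optimizer = optim.Adam(model.parameters(), lr=lr)  # assumed optimizer and learning rate
    train_loss, test_loss = [], []

    for epoch in range(epochs):
        # Training pass: update the weights batch by batch
        model.train()
        running = 0.0
        for x, y in ldr_train:
            x, y = x.to(device), y.to(device)
            optimizer.zero_grad()
            loss = criterion(model(x), y)
            loss.backward()
            optimizer.step()
            running += loss.item()
        train_loss.append(running / len(ldr_train))    # mean loss over batches

        # Evaluation pass: no gradient tracking, no weight updates
        model.eval()
        running, correct, seen = 0.0, 0, 0
        with torch.no_grad():
            for x, y in ldr_test:
                x, y = x.to(device), y.to(device)
                logits = model(x)
                running += criterion(logits, y).item()
                correct += (logits.argmax(dim=1) == y).sum().item()
                seen += y.size(0)
        test_loss.append(running / len(ldr_test))

        print(f"Epoch: {epoch + 1}/{epochs}..  "
              f"Training Loss: {train_loss[-1]:.3f}..  "
              f"Test Loss: {test_loss[-1]:.3f}..  "
              f"Test Accuracy: {correct / seen:.3f}")

    return train_loss, test_loss
```

With the objects defined earlier, a call such as train_loss, test_loss = fit(CNN(), ldr_train, ldr_test) would run ten epochs. The exact numbers depend on the optimizer settings and random initialization, so they need not match the output shown below.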

Out:

Epoch: 1/10..  Training Loss: 0.771..  Test Loss: 0.212..  Test Accuracy: 0.937
Epoch: 2/10..  Training Loss: 0.168..  Test Loss: 0.125..  Test Accuracy: 0.962
Epoch: 3/10..  Training Loss: 0.112..  Test Loss: 0.079..  Test Accuracy: 0.976
Epoch: 4/10..  Training Loss: 0.087..  Test Loss: 0.073..  Test Accuracy: 0.977
Epoch: 5/10..  Training Loss: 0.072..  Test Loss: 0.054..  Test Accuracy: 0.983
Epoch: 6/10..  Training Loss: 0.061..  Test Loss: 0.053..  Test Accuracy: 0.984
Epoch: 7/10..  Training Loss: 0.054..  Test Loss: 0.046..  Test Accuracy: 0.985
Epoch: 8/10..  Training Loss: 0.048..  Test Loss: 0.044..  Test Accuracy: 0.987
Epoch: 9/10..  Training Loss: 0.044..  Test Loss: 0.039..  Test Accuracy: 0.987
Epoch: 10/10..  Training Loss: 0.041..  Test Loss: 0.036..  Test Accuracy: 0.988

In:

plt.plot(train_loss, label = 'Training loss')
plt.plot(test_loss, label = 'Test loss')
plt.xlabel('Epoch')
plt.ylabel('Cross Entropy Loss')
plt.legend(frameon = True)

Out:

[Figure: training and test cross-entropy loss per epoch]

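▷ After the final epoch, accuracy on the test data reached 98.8%, a quite satisfactory result.

▷ Training went well: both the training loss and the test loss fall steadily over the 10 epochs, and the test loss stays below the training loss throughout, so there is no sign of overfitting.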

Reference:

최건호, 「파이토치 첫걸음」, 한빛미디어(2019)
