
Implementing an LSTM (Long Short-Term Memory) Model

ํŒŒ์ดํ† ์น˜๋ฅผ ์ด์šฉํ•˜์—ฌ LSTM(Long Short-Term Memories model)์„ ๊ตฌํ˜„ํ•  ๊ฒƒ์ด๋‹ค. ์‚ฌ์šฉ๋  ๋ฐ์ดํ„ฐ๋Š” ์•„๋งˆ์กด์˜ ์ฃผ๊ฐ€๋กœ ์ข…๊ฐ€์™€ ๊ฑฐ๋ž˜๋Ÿ‰์„ ์ด์šฉํ•˜์—ฌ, ์ผ์ฃผ์ผ ๋’ค์˜ ์ข…๊ฐ€๋ฅผ ์˜ˆ์ธกํ•˜๋Š” ๊ฒƒ์ด ๋ชฉ์ ์ด๋‹ค. ๋ฐ์ดํ„ฐ๋Š” ์—ฌ๊ธฐ(www.kaggle.com/camnugent/sandp500)์—์„œ ์–ป์„ ์ˆ˜ ์žˆ๋‹ค.

 

๋ฐ์ดํ„ฐ๊ตฌํ˜„ ๊ณผ์ •์€ ๋‹ค์Œ๊ณผ ๊ฐ™๋‹ค.

 

1. ๋ฐ์ดํ„ฐ ์ „์ฒ˜๋ฆฌ

2. ๋ชจ๋ธ ์„ค์ •

3. ๋ชจ๋ธ ํ•™์Šต

4. ํ•™์Šต ๊ฒฐ๊ณผ

 

1. ๋ฐ์ดํ„ฐ ์ „์ฒ˜๋ฆฌ

 

In:

import pandas as pd
from numpy import array
from numpy import hstack
import matplotlib.pyplot as plt
import torch

def min_max_scaler(arr):
    # Rescale arr to [0, 1]; also return the original min and max
    # so the transform can be inverted later.
    min_arr = min(arr)
    max_arr = max(arr)
    
    return (arr-min_arr)/(max_arr-min_arr), min_arr, max_arr

def split_seq(seq, n_step):
    # Slide a window of n_step rows over seq: the features are every
    # column but the last, and the target is the last column at the
    # window's final row.
    X, y = list(), list()
    
    for i in range(len(seq)):
        end_idx = i+n_step
        
        # Stop once the window would run past the end of the sequence.
        if end_idx > len(seq):
            break
            
        seq_x, seq_y = seq[i:end_idx, :-1], seq[end_idx-1, -1]
        
        X.append(seq_x)
        y.append(seq_y)
        
    return array(X), array(y)

 

โ–ท The min_max_scaler function rescales a variable so that its largest value maps to 1 and its smallest to 0, and also returns the original minimum and maximum so the scaling can be inverted later.
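
For a quick illustration (toy values):

prices = array([10.0, 15.0, 20.0])
scaled, lo, hi = min_max_scaler(prices)

print(scaled)   # [0.  0.5 1. ]
print(lo, hi)   # 10.0 20.0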

 

โ–ท The split_seq function slides a window of n_step rows over the data, collecting the feature columns of each window as one input sample and taking the target column at the window's last row as the corresponding output.
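
A toy example (hypothetical numbers) makes the windowing concrete: with two feature columns plus a target column and n_step = 3, each sample is a 3×2 feature window whose target is the last column of the window's final row.

toy = array([[1, 10, 100],
             [2, 20, 200],
             [3, 30, 300],
             [4, 40, 400],
             [5, 50, 500]])

X_toy, y_toy = split_seq(toy, 3)

print(X_toy.shape)   # (3, 3, 2): three windows, three steps, two features
print(y_toy)         # [300 400 500]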

 

In:

data_stock = pd.read_csv('../input/all_stocks_5yr.csv')

# Keep only Amazon's closing price and trading volume.
in_seq_p = array(data_stock.loc[data_stock['Name'] == 'AMZN', 'close'])
in_seq_v = array(data_stock.loc[data_stock['Name'] == 'AMZN', 'volume'])

pred_day = 7

# Target: the closing price pred_day trading days after each input day.
out_seq = array([in_seq_p[i] for i in range(pred_day, len(in_seq_p))])
in_seq_p = in_seq_p[:-pred_day]
in_seq_v = in_seq_v[:-pred_day]

# Scale each series to [0, 1], keeping the min/max for inverse scaling.
out_seq, min_out, max_out = min_max_scaler(out_seq)
in_seq_p, min_p, max_p = min_max_scaler(in_seq_p)
in_seq_v, min_v, max_v = min_max_scaler(in_seq_v)

# Reshape into column vectors so they can be stacked side by side.
in_seq_p = in_seq_p.reshape(len(in_seq_p), 1)
in_seq_v = in_seq_v.reshape(len(in_seq_v), 1)
out_seq = out_seq.reshape(len(out_seq), 1)

dataset = hstack((in_seq_p, in_seq_v, out_seq))

seq_len = 7

X, y = split_seq(dataset, seq_len)

 

โ–ท S&P 500์˜ ์ฃผ์‹ ๋ฐ์ดํ„ฐ๋ฅผ ๋ถˆ๋Ÿฌ์˜จ ํ›„, ์•„๋งˆ์กด ์ฃผ์‹์˜ ์ข…๊ฐ€์™€ ๊ฑฐ๊ฐœ๋Ÿ‰๋งŒ ๋ฝ‘์•„์„œ in_seq_p, in_seq_v ๋ณ€์ˆ˜์— ํ• ๋‹นํ•œ๋‹ค.

 

โ–ท The prediction horizon is assigned to the pred_day variable, and the closing prices from pred_day trading days later are collected into out_seq; the input series are then truncated to the same length.
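
As a sanity check on this alignment (run right after out_seq is built and before the scaling step, since scaling overwrites the values), every target should equal the raw closing price pred_day rows later; raw_close below is an unscaled copy introduced just for this check:

raw_close = array(data_stock.loc[data_stock['Name'] == 'AMZN', 'close'])

assert out_seq[0] == raw_close[pred_day]   # day 0's features -> day 7's close
assert out_seq[-1] == raw_close[-1]        # last input day -> final close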

 

โ–ท The out_seq, in_seq_p, and in_seq_v variables are rescaled to the range 0 to 1 using min_max_scaler.

 

โ–ท The three column vectors are combined into one array with the hstack function and assigned to the dataset variable.

 

โ–ท Finally, split_seq divides dataset into inputs and outputs, cutting them into windows of the chosen sequence length.
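
A quick shape check; the comments show the expected forms rather than exact counts:

print(dataset.shape)     # (n_days, 3): scaled close, scaled volume, scaled target
print(X.shape, y.shape)  # (n_windows, 7, 2) and (n_windows,)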

 

2. ๋ชจ๋ธ ์„ค์ •

 

In:

class MV_LSTM(torch.nn.Module):
    def __init__(self, n_feat, seq_len):
        super(MV_LSTM, self).__init__()
        
        self.n_feat = n_feat
        self.seq_len = seq_len
        self.n_hidden = 50   # hidden units per LSTM layer
        self.n_layer = 5     # number of stacked LSTM layers
        
        # batch_first=True: inputs arrive as (batch, seq_len, n_feat).
        self.l_lstm = torch.nn.LSTM(input_size = n_feat, 
                                   hidden_size = self.n_hidden, 
                                   num_layers = self.n_layer, 
                                   batch_first = True)
        
        # Map the outputs of every time step to one predicted value.
        self.l_linear = torch.nn.Linear(self.n_hidden*self.seq_len, 1)
        
    def init_hidden(self, batch_size):
        # Zeroed hidden and cell states, shaped (n_layer, batch, n_hidden).
        hidden_state = torch.zeros(self.n_layer, batch_size, self.n_hidden)
        cell_state = torch.zeros(self.n_layer, batch_size, self.n_hidden)
        self.hidden = (hidden_state, cell_state)
        
    def forward(self, X):
        batch_size, seq_len, _ = X.size()
        
        lstm_out, self.hidden = self.l_lstm(X, self.hidden)
        
        # Flatten (batch, seq_len, n_hidden) into (batch, seq_len*n_hidden).
        X = lstm_out.contiguous().view(batch_size, -1)
        
        return self.l_linear(X)

 

โ–ท The LSTM structure is set up with torch.nn.LSTM. Setting the batch_first argument to True makes the module expect inputs of shape (batch, seq_len, features) instead of the default (seq_len, batch, features).
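
This shape convention can be checked in isolation with dummy tensors (sizes chosen to match the model above):

lstm = torch.nn.LSTM(input_size = 2, hidden_size = 50, num_layers = 5, batch_first = True)

dummy = torch.zeros(4, 7, 2)   # (batch, seq_len, features)
out, (h_n, c_n) = lstm(dummy)

print(out.shape)   # torch.Size([4, 7, 50]): per-step outputs, batch first
print(h_n.shape)   # torch.Size([5, 4, 50]): states stay (num_layers, batch, hidden)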

 

โ–ท torch.nn.Linear maps the flattened LSTM outputs (n_hidden × seq_len values per sample) down to a single predicted value.
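
The flattening that feeds this layer can also be seen on its own (hypothetical shapes matching the model above):

lstm_out = torch.zeros(4, 7, 50)           # (batch, seq_len, n_hidden)
flat = lstm_out.contiguous().view(4, -1)   # (4, 350): all time steps side by side

linear = torch.nn.Linear(50*7, 1)
print(linear(flat).shape)                  # torch.Size([4, 1])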

 

โ–ท The init_hidden method creates zeroed hidden and cell states, shaped (n_layer, batch, n_hidden), before each batch is processed.

 

โ–ท The forward method defines the forward pass of the network.
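
Putting the pieces together, a minimal smoke test of the class with a dummy batch (sizes are hypothetical):

model = MV_LSTM(n_feat = 2, seq_len = 7)

batch = torch.zeros(4, 7, 2)       # 4 windows of 7 days with 2 features each
model.init_hidden(batch.size(0))   # fresh states before every forward pass

print(model(batch).shape)          # torch.Size([4, 1]): one prediction per window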

 

3. ๋ชจ๋ธ ํ•™์Šต

 

In:

n_feat = 2
seq_len = 7

mv_lstm = MV_LSTM(n_feat, seq_len)
loss_fun = torch.nn.MSELoss()
optim = torch.optim.Adam(mv_lstm.parameters(), lr = 1e-3)

epoch = 500
batch_size = int(len(dataset)/5)   # five mini-batches per epoch

mv_lstm.train()

for t in range(epoch):
    batch_loss = 0
    
    # Walk over the data in consecutive mini-batches.
    for b in range(0, len(X), batch_size):
        input_ = X[b:b+batch_size, :, :]
        target = y[b:b+batch_size]
        
        X_batch = torch.tensor(input_, dtype = torch.float32)
        y_batch = torch.tensor(target, dtype = torch.float32)
        
        # Reset the recurrent states for this batch, then run the model.
        mv_lstm.init_hidden(X_batch.size(0))
        output = mv_lstm(X_batch)
        loss = loss_fun(output.view(-1), y_batch)
        
        loss.backward()
        optim.step()
        optim.zero_grad()
        
        batch_loss += loss.item()
    
    if t%10 == 0:
        print('Step:', t, 'Loss:', batch_loss)

 

Out:

Step: 0 Loss: 0.49312681483570486
Step: 10 Loss: 0.276709976606071
Step: 20 Loss: 0.03958324994891882
Step: 30 Loss: 0.011060289776651189
Step: 40 Loss: 0.008034420054173097
Step: 50 Loss: 0.010796168557135388
Step: 60 Loss: 0.00563392968615517
Step: 70 Loss: 0.006591864308575168
Step: 80 Loss: 0.008238963724579662
Step: 90 Loss: 0.004256958898622543
Step: 100 Loss: 0.0048738375917309895
Step: 110 Loss: 0.0047946782724466175
Step: 120 Loss: 0.003956006024964154
Step: 130 Loss: 0.006989476649323478
Step: 140 Loss: 0.00544810012797825
Step: 150 Loss: 0.0037749825860373676
Step: 160 Loss: 0.004278417211025953
Step: 170 Loss: 0.010671858908608556
Step: 180 Loss: 0.0039105534087866545
Step: 190 Loss: 0.0036512921506073326

 

โ–ท For training, a batch size is chosen and every batch is processed within each epoch; the same loop could also be written with a DataLoader, as sketched below.
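
A minimal sketch of an equivalent loop using torch.utils.data, reusing the objects defined above (shuffle = False keeps the consecutive-window order of the manual slicing):

from torch.utils.data import TensorDataset, DataLoader

X_t = torch.tensor(X, dtype = torch.float32)
y_t = torch.tensor(y, dtype = torch.float32)
loader = DataLoader(TensorDataset(X_t, y_t), batch_size = batch_size, shuffle = False)

for X_batch, y_batch in loader:
    mv_lstm.init_hidden(X_batch.size(0))
    loss = loss_fun(mv_lstm(X_batch).view(-1), y_batch)
    
    loss.backward()
    optim.step()
    optim.zero_grad()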

 

โ–ท The training log shows the loss shrinking as the epochs progress.

 

4. Training Results

 

In:

pred_y = list()

mv_lstm.eval()   # switch to evaluation mode for inference

with torch.no_grad():
    for b in range(0, len(X), batch_size):
        input_ = X[b:b+batch_size, :, :]
        
        X_batch = torch.tensor(input_, dtype = torch.float32)
        
        mv_lstm.init_hidden(X_batch.size(0))
        output = mv_lstm(X_batch)
        
        # Collect the scalar prediction out of each (batch, 1) output.
        for i in output.tolist():
            pred_y.append(i[0])

pred_y = array(pred_y)

plt.figure()

plt.plot(pred_y, label = 'predicted')
plt.plot(y, label = 'actual')
plt.legend()

plt.show()

 

Out:

[Figure: predicted vs. actual scaled closing price over the series]

โ–ท With the plot labels above, the blue line is the closing price predicted one week in advance and the orange line is Amazon's actual closing price. As the graph shows, the model fits the series closely, though note that these predictions are made over the training data itself.
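
Since min_max_scaler returned the original minimum and maximum, the scaled predictions can also be mapped back to dollar prices before plotting; a short sketch:

pred_price = pred_y*(max_out - min_out) + min_out
true_price = y*(max_out - min_out) + min_out

plt.figure()
plt.plot(pred_price, label = 'predicted close')
plt.plot(true_price, label = 'actual close')
plt.legend()
plt.show()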