All posts

(80)
Implementing an Artificial Neural Network
This post tackles a multiclass classification problem: classifying the handwritten digit images in the MNIST data. The data is available here (https://www.kaggle.com/c/digit-recognizer). The artificial neural network is implemented with PyTorch. The implementation proceeds as follows.
1. Loading and inspecting the data
2. Preprocessing the data
3. Setting up the model
4. Training and validating on the data
1. Loading and inspecting the data
In:
import numpy as np
import pandas as pd
from sklearn.model_selection import train_test_split
import torch
import torch.nn ..
Prior and Posterior Predictive Distributions
This post looks at the prior predictive distribution and the posterior predictive distribution. The topics are as follows.
1. Definition of the prior and posterior predictive distributions
2. Examples of the prior and posterior predictive distributions
1. Definition of the prior and posterior predictive distributions
▷ Working from Bayes' theorem, the prior predictive distribution is defined as the integral of the product of the prior distribution and the likelihood function. In other words, it is the average of the likelihood function over theta, weighted by the prior.
▷ The posterior predictive distribution can also be obtained from Bayes' theorem. Because the observed data x and the random variable x tilde are generally assumed to be independent given theta, it is defined as the integral of the product of the posterior distribution of theta and the likelihood function. theta's..
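The two definitions described in words above can be written out as follows. This is a sketch in standard notation, assuming p(theta) denotes the prior, p(theta | x) the posterior, and that x tilde is conditionally independent of x given theta; the symbols are chosen here and may differ from those used in the full post.

```latex
\begin{align}
p(\tilde{x}) &= \int p(\tilde{x} \mid \theta)\, p(\theta)\, d\theta \\
p(\tilde{x} \mid x) &= \int p(\tilde{x} \mid \theta, x)\, p(\theta \mid x)\, d\theta
  = \int p(\tilde{x} \mid \theta)\, p(\theta \mid x)\, d\theta
\end{align}
```

The last equality is where the independence assumption is used: once theta is known, the past observation x adds nothing to the prediction of x tilde.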
Credible Interval
This post looks at the credible interval. The topics are as follows.
1. Definition of the credible interval
2. Example of a credible interval
1. Definition of the credible interval
The credible interval is defined as follows.
▷ From the frequentist point of view the parameter is fixed, so the interpretation of a confidence interval does not match our intuition. A credible interval is based on a posterior distribution for the parameter, so its interpretation agrees with our intuition. That is, it can be read as the probability that the parameter lies in the given credible interval.
2. Example of a credible interval
Problem) The probability that a coin lands heads follows a uniform distribution, and the likelihood function follows a Bernoulli distribution. The coin was tossed once and landed heads. Using this result..
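The problem statement is cut off in this excerpt, so the following is only a sketch under stated assumptions. A uniform prior, i.e. Beta(1, 1), combined with one observed head gives the posterior Beta(2, 1), whose CDF is theta squared. The 95% level and the equal-tailed construction are choices made here for illustration and are not taken from the post; the function names are hypothetical.

```python
import math


def posterior_quantile(p: float) -> float:
    """Quantile of Beta(2, 1): the CDF is F(theta) = theta**2, so F^-1(p) = sqrt(p)."""
    return math.sqrt(p)


def equal_tailed_interval(level: float = 0.95) -> tuple[float, float]:
    """Equal-tailed credible interval: cut (1 - level) / 2 from each tail."""
    tail = (1.0 - level) / 2.0
    return posterior_quantile(tail), posterior_quantile(1.0 - tail)


# Assumed level of 95% (not stated in the excerpt).
lower, upper = equal_tailed_interval(0.95)
print(f"{lower:.4f} {upper:.4f}")  # prints 0.1581 0.9874
```

Under these assumptions the posterior probability that theta lies between the two printed bounds is 0.95, which is exactly the direct probability statement that a confidence interval does not permit.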
Class Arguments and Methods
This post looks at the arguments and methods of Python classes. The topics are as follows.
1. The self argument
2. The __init__() method
3. The super() function
1. The self argument
In:
class test_class:
    def test_fun_1():
        print('Function 1')
    def test_fun_2(self):
        print('Function 2')
t_c = test_class()
t_c.test_fun_1()
Out:
---------------------------------------------------------------------------
TypeError Traceback (most recent call last) in 1 t_c =..
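The TypeError above arises because calling a method on an instance passes the instance itself as the first positional argument. The sketch below reproduces that behaviour with hypothetical names (TestClass, without_self, with_self) and catches the error so that the script runs to completion; the exact wording of the error message varies between Python versions, so it is printed rather than asserted here.

```python
class TestClass:
    def without_self():
        # No parameter to receive the instance, so instance calls fail.
        return "Function 1"

    def with_self(self):
        # `self` receives the instance that the method is called on.
        return "Function 2"


obj = TestClass()

try:
    obj.without_self()  # Python passes `obj` implicitly as the first argument
    error_message = None
except TypeError as exc:
    error_message = str(exc)

print(error_message)             # message says 1 positional argument was given
print(obj.with_self())           # works: `obj` is bound to `self`
print(TestClass.without_self())  # works: called on the class, nothing is passed
```

Calling the function through the class rather than through an instance succeeds, which shows that the extra argument comes from the instance binding and not from the function itself.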
Frequentist Inference
This post looks at inference from the frequentist point of view. The topics are as follows.
1. Likelihood and MLE (Maximum Likelihood Estimation)
2. Confidence interval
1. Likelihood and MLE
Let us derive the likelihood function of the Bernoulli distribution.
▷ P(X tilde | theta) and the likelihood function L(theta | X tilde) give the same value, but they differ in that the likelihood function is a function of theta rather than of the observation X tilde. That is, the likelihood is a function of the parameter that gives the probability assigned to the observed values for a given parameter value.
MLE is the representative method for estimating a parameter from the frequentist point of view. Using MLE to estimate the parameter of the Bernoulli distribution..
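To make the idea of treating the likelihood as a function of theta concrete, here is a minimal sketch that evaluates the Bernoulli log-likelihood on a grid of theta values and compares the maximizer with the closed-form estimate, the sample mean. The data are hypothetical and the function name is chosen here; the derivation in the full post is not visible in this excerpt and may proceed differently.

```python
import math


def bernoulli_log_likelihood(theta: float, data: list[int]) -> float:
    """log L(theta | data) = (#ones) * log(theta) + (#zeros) * log(1 - theta)."""
    ones = sum(data)
    zeros = len(data) - ones
    return ones * math.log(theta) + zeros * math.log(1.0 - theta)


# Hypothetical observations: 7 ones out of 10 trials.
data = [1, 0, 1, 1, 0, 1, 1, 0, 1, 1]

# The data stay fixed; only theta varies (grid over the open interval (0, 1)).
grid = [i / 1000 for i in range(1, 1000)]
theta_grid = max(grid, key=lambda t: bernoulli_log_likelihood(t, data))

# Closed-form MLE for the Bernoulli parameter: the sample mean.
theta_closed_form = sum(data) / len(data)

print(theta_grid, theta_closed_form)  # prints 0.7 0.7
```

The grid search is only for illustration; setting the derivative of the log-likelihood to zero gives the sample mean directly.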
Using Automatic Differentiation
This post shows how to compute gradients with PyTorch's automatic differentiation. The topics are as follows.
1. Preparing for automatic differentiation
2. Computing the gradient
1. Preparing for automatic differentiation
In:
import torch
x = torch.ones(2, 2, requires_grad = True)
print(x)
Out:
tensor([[1., 1.], [1., 1.]], requires_grad=True)
▷ A tensor was created by passing torch.ones() the tensor size and the requires_grad argument. The output shows requires_grad=True, which means that the gradient of this tensor can be obtained once backpropagation has been performed later on.
In:
y = ..
Bayes' Theorem
This post looks at Bayes' theorem, the core of Bayesian statistics. The topics are as follows.
1. Meaning of Bayes' theorem
2. Example of Bayes' theorem
1. Meaning of Bayes' theorem
The formula for Bayes' theorem is as follows.
▷ In Bayes' theorem, P(H) is called the prior probability. The prior probability is the probability of event H before event E has occurred.
▷ Once event E has occurred and this information is taken into account, the probability of event H changes to P(H|E), which is called the posterior probability.
▷ P(E|H) is called the likelihood. It is the probability that event E occurs given event H as the condition.
▷ P(E) ..
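The formula itself did not survive in this excerpt. In the notation used above (H for the hypothesis, E for the evidence), the standard statement of the theorem is sketched below. The second line expands P(E) with the law of total probability for the two-case split H and not-H; that expansion is a standard identity added here and is not taken from the truncated text.

```latex
\begin{align}
P(H \mid E) &= \frac{P(E \mid H)\, P(H)}{P(E)} \\
P(E) &= P(E \mid H)\, P(H) + P(E \mid \neg H)\, P(\neg H)
\end{align}
```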
Using Tensors
Let us look at how to use PyTorch tensors. The topics are as follows.
1. Creating tensors
2. Tensor operations
3. Converting tensors
1. Creating tensors
In:
import torch
x = torch.rand(5, 3)
print(x)
Out:
tensor([[0.1501, 0.8814, 0.4848], [0.0723, 0.9468, 0.1327], [0.8581, 0.8050, 0.4441], [0.4888, 0.0157, 0.6959], [0.9666, 0.4729, 0.1983]])
▷ torch.rand() was used to create a 5×3 matrix whose elements are random numbers between 0 and 1. The two arguments of the function give the number of rows and columns.
In:
x = torch.rand(5, 3, 3)
print(x)
Out:..