๋ณธ๋ฌธ ๋ฐ”๋กœ๊ฐ€๊ธฐ

Statistics/Probabilistic Graphical Model

๋ฒ ์ด์ง€์•ˆ ๋„คํŠธ์›Œํฌ(Bayesian network) (1)

๋ณธ ํฌ์ŠคํŒ…์€ ์นด์ด์ŠคํŠธ ๋ฌธ์ผ์ฒ  ๊ต์ˆ˜๋‹˜์˜ ์ธ๊ณต์ง€๋Šฅ ๋ฐ ๊ธฐ๊ณ„ํ•™์Šต ๊ฐœ๋ก  2์˜ ๋ฒ ์ด์ง€์•ˆ ๋„คํŠธ์›Œํฌ(Bayesian network) ๊ฐ•์˜ ๋‚ด์šฉ์„ ์ •๋ฆฌํ•œ ๊ฒƒ์ด๋‹ค.

 

๋‹ค๋ฃฐ ๋‚ด์šฉ์€ ๋‹ค์Œ๊ณผ ๊ฐ™๋‹ค.

 

1. Conditional vs. Marginal independence

2. Bayesian network

3. Interpretation of Bayesian network

4. Typical local structures

5. Bayes ball algorithm

 

1. Conditional vs. Marginal independence

 

 

โ–ท Conditional independence์™€ Marginal independence๋Š” ๋‘˜ ๋‹ค ๋…๋ฆฝ์„ ์˜๋ฏธํ•˜์ง€๋งŒ, ์•ฝ๊ฐ„์˜ ์ฐจ์ด๊ฐ€ ์žˆ๋‹ค. Marginal independence๋Š” P(A|B) = P(A)์ธ ๊ฒฝ์šฐ์ด๊ณ , Conditional independence๋Š” P(A|B,C) = P(A|B)์ธ ๊ฒฝ์šฐ์ด๋‹ค.

 

โ–ถ Conditional independence๋Š” Full joint probability๋ฅผ ๊ตฌํ•  ๋•Œ, Factorization์— ์‚ฌ์šฉ๋˜๊ธฐ ๋•Œ๋ฌธ์— ์ƒ๋‹นํžˆ ์ค‘์š”ํ•˜๋‹ค.

 

โ–ท Marginal independence๊ฐ€ ์ ์šฉ๋˜๋Š” ์˜ˆ๋ฅผ ์•Œ์•„๋ณด์ž. ์ค‘๋ น์ด ๋‘ ์žฅ๊ต A, B์—๊ฒŒ ๋ช…๋ น์„ ๋‚ด๋ฆฌ๋Š” ๊ฒฝ์šฐ์— ๋Œ€ํ•ด ์ƒ๊ฐํ•ด๋ณด์ž. ์ค‘๋ น์˜ ๋ช…๋ น์„ ์•Œ์ง€ ๋ชปํ•  ๋•Œ, ์žฅ๊ต A์™€ ์žฅ๊ต B๋Š” ์„œ๋กœ ๋…๋ฆฝ์ ์ด์ง€ ์•Š๋‹ค. ์™œ๋ƒํ•˜๋ฉด ํ•œ ์žฅ๊ต์˜ ํ–‰๋™์ด ๋‹ค๋ฅธ ์žฅ๊ต์—๊ฒŒ ์ •๋ณด๋ฅผ ์ค„ ์ˆ˜ ์žˆ๊ธฐ ๋•Œ๋ฌธ์ด๋‹ค. ํ•˜์ง€๋งŒ ์ค‘๋ น์˜ ๋ช…๋ น์„ ์•„๋Š” ๊ฒฝ์šฐ์—๋Š” ์–˜๊ธฐ๊ฐ€ ๋‹ฌ๋ผ์ง„๋‹ค. ํ•œ ์žฅ๊ต๊ฐ€ ์–ด๋–ค ํ–‰๋™์„ ์ทจํ•˜์—ฌ๋„ ์ค‘๋ น์˜ ๋ช…๋ น์„ ์ด๋ฏธ ์•Œ๊ณ  ์žˆ๊ธฐ ๋•Œ๋ฌธ์— ๋‹ค๋ฅธ ์žฅ๊ต์—๊ฒŒ ์–ด๋– ํ•œ ์ •๋ณด๋„ ์ฃผ์ง€ ๋ชปํ•œ๋‹ค. ์ด ๊ด€๊ณ„๋ฅผ ํ™•๋ฅ ๋กœ ๋‚˜ํƒ€๋‚ด๋ฉด ๋‹ค์Œ๊ณผ ๊ฐ™๋‹ค.

 

P(์žฅ๊ต A์˜ ํ–‰๋™|์žฅ๊ต B์˜ ํ–‰๋™, ์ค‘๋ น์˜ ๋ช…๋ น) = P(์žฅ๊ต A์˜ ํ–‰๋™|์ค‘๋ น์˜ ๋ช…๋ น)

 

2. Bayesian network

 

 

โ–ท ๋ฒ ์ด์ง€์•ˆ ๋„คํŠธ์›Œํฌ๋Š” ๊ทธ๋ž˜ํ”„ ํ‘œ๊ธฐ(Graphical notation)๋ฅผ ํ†ตํ•ด ๋‚˜ํƒ€๋‚ผ ์ˆ˜ ์žˆ๊ณ , ๊ทธ๋ž˜ํ”„ ํ‘œ๊ธฐ๋ฅผ ํ†ตํ•ด ํ™•๋ฅ  ๋ณ€์ˆ˜์™€ ๋ณ€์ˆ˜๊ฐ„์˜ ๊ด€๊ณ„์— ๋”ฐ๋ฅธ Conditional independence๋ฅผ ๋‚˜ํƒ€๋‚ธ๋‹ค. ์ด ํ‘œ๊ธฐ๋Š” Full joint distribution์„ ์–ป์„ ์ˆ˜ ์žˆ๋„๋ก ๊ฐ„๊ฒฐํ•˜๊ฒŒ ๋‚˜ํƒ€๋‚ธ ๊ฒƒ์ด๋‹ค.

 

โ–ท ๋ฒ ์ด์ง€์•ˆ ๋„คํŠธ์›Œํฌ๋Š” ํ•œ ๋ฐฉํ–ฅ์œผ๋กœ ํ™•๋ฅ  ๋ณ€์ˆ˜๊ฐ„์˜ ๊ด€๊ณ„๋ฅผ ํ•œ ๋ฐฉํ–ฅ์œผ๋กœ ๋‚˜ํƒ€๋‚ด๊ณ , ๋น„์ˆœํ™˜(Acyclic) ๊ตฌ์กฐ๋ฅผ ๊ฐ€์ง€๊ณ  ์žˆ๋‹ค.

 

โ–ท ๊ฐ ๋…ธ๋“œ์˜ ์ง์ ‘ ์˜ํ–ฅ์„ ์ฃผ๋Š” ๊ฒƒ์€ ๋ถ€๋ชจ ๋…ธ๋“œ๋ฐ–์— ์—†๋‹ค. ๊ฐ„์ ‘ ์˜ํ–ฅ์„ ์ฃผ๋Š” ๊ฒฝ์šฐ๋„ ์žˆ๋‹ค. ์˜ˆ๋ฅผ ๋“ค์–ด, Y์˜ ๋ถ€๋ชจ ๋…ธ๋“œ๋ฅผ Z๋ผ ํ•˜๊ณ , Z์˜ ๋งํฌ(Link)๊ฐ€ Y๋ฅผ ํ†ตํ•ด X1 ๋˜๋Š” X2์— ๊ฐ„์ ‘ ์˜ํ–ฅ์„ ์ฃผ๊ฒŒ ๋œ๋‹ค.

 

3. ๋ฒ ์ด์ง€์•ˆ ๋„คํŠธ์›Œํฌ์˜ ํ•ด์„

 

 

โ–ท ๋„คํŠธ์›Œํฌ์˜ ํ† ํด๋กœ์ง€(Topology)๋Š” ์ผ๋ฐ˜ ์ƒ์‹์„ ์ด์šฉํ•˜๊ฑฐ๋‚˜ ํ•ด๋‹น ๋ถ„์•ผ์˜ ์ „๋ฌธ๊ฐ€์˜ ์˜๊ฒฌ์„ ํ†ตํ•˜์—ฌ ๊ตฌ์„ฑํ•œ๋‹ค.

 

โ–ท ์œ„์˜ ๋„คํŠธ์›Œํฌ ํ† ํด๋กœ์ง€๋ฅผ ํ•ด์„ํ•˜๋ฉด ๋‹ค์Œ๊ณผ ๊ฐ™๋‹ค.

 

1. Weather์€ ๋‹ค๋ฅธ ๋ณ€์ˆ˜์™€ ์—ฐ๊ฒฐ๋˜์–ด ์žˆ์ง€ ์•Š๊ธฐ ๋•Œ๋ฌธ์— ๋‹ค๋ฅธ ๋ณ€์ˆ˜์—๊ฒŒ ์•„๋ฌด๋Ÿฐ ์˜ํ–ฅ์„ ์ฃผ์ง€ ๋ชปํ•œ๋‹ค. ์ฆ‰, ๋‹ค๋ฅธ ๋ณ€์ˆ˜์™€ ๋…๋ฆฝ์ด๋‹ค.

2. Toothache์™€ Stench๋Š” Cavity๊ฐ€ ์ฃผ์–ด์งˆ ๋•Œ, Conditionally independentํ•˜๋‹ค. ์ฆ‰, Cavity์— ๋Œ€ํ•œ ์ •๋ณด๋ฅผ ์–ป์—ˆ์„ ๋•Œ, Tootache์™€ Stench๋Š” ์„œ๋กœ ๋…๋ฆฝ์ด๋‹ค.

3. Cavity๋Š” Toothache์™€ Strench์— ์ง์ ‘ ์˜ํ–ฅ์„ ์ค€๋‹ค.

 

 

โ–ท ์œ„์˜ ์Šฌ๋ผ์ด๋“œ์— ๋‚˜ํƒ€๋‚œ ๋ฒ ์ด์ง€์•ˆ ํ† ํด๋กœ์ง€๋กœ๋ถ€ํ„ฐ ์•„๋ž˜์˜ ํ‘œ์™€ ๊ฐ™์ด CPD(Conditional Probability Distribution)๋ฅผ ๋งŒ๋“ค ์ˆ˜ ์žˆ๋‹ค. CPD๋ฅผ ์ด์šฉํ•˜์—ฌ ๋‹ค์–‘ํ•œ ์ƒํ™ฉ์— ๋Œ€ํ•œ ํ™•๋ฅ ์„ ๊ตฌํ•  ์ˆ˜ ์žˆ๋‹ค. ์œ„์˜ ๋ฒ ์ด์ง€์•ˆ ํ† ํด๋กœ์ง€๋ฅผ ์ด์šฉํ•œ ์˜ˆ๋ฅผ ๋“ค๋ฉด, ์•Œ๋žŒ(Alarm)์ด ์šธ๋ ธ์„ ๋•Œ, ์ง‘์— ๋„๋‘‘(Burglary)์ด ๋“ค์—ˆ์„ ํ™•๋ฅ ์„ ๊ตฌํ•  ์ˆ˜ ์žˆ๋‹ค.

 

โ–ท ๋ฒ ์—์ง€์•ˆ ๋„คํŠธ์›Œํฌ์˜ ๊ตฌ์„ฑ์š”์†Œ๋Š” ๊ตฌ์กฐ์ ์ธ ๊ตฌ์„ฑ์š”์†Œ์™€ ์ˆ˜์น˜์ ์ธ ๊ตฌ์„ฑ์š”์†Œ๋กœ ๋‚˜๋ˆŒ ์ˆ˜ ์žˆ๋‹ค. ๊ตฌ์กฐ์ ์ธ ๊ตฌ์„ฑ์š”์†Œ๋Š” ๋ฒ ์ด์ง€์•ˆ ํ† ํด๋กœ์ง€์˜ ํ˜•ํƒœ๋กœ ๊ฐ ๋ณ€์ˆ˜๊ฐ„์˜ ๊ด€๊ณ„์— ๋Œ€ํ•œ ์‚ฌ์ „์ •๋ณด ๋˜๋Š” ๋ฐ์ดํ„ฐ๋ฅผ ํ†ตํ•œ ํ•™์Šตํ•˜๋Š” ๋ฐฉ๋ฒ•์„ ์ด์šฉํ•˜์—ฌ ๊ตฌ์„ฑํ•œ๋‹ค. ์ˆ˜์น˜์ ์ธ ๊ตฌ์„ฑ์š”์†Œ๋Š” ๊ฐ ๋…ธ๋“œ์—์„œ์˜ CPD๋ฅผ ์˜๋ฏธํ•œ๋‹ค.

 

โ–ท ๋‹ค์Œ๊ณผ ๊ฐ™์€ ์˜ˆ๋ฅผ ์ƒ๊ฐํ•ด ๋ณด์ž. ๋ฉ”๋ฆฌ์˜ ์ „ํ™”(MarryCalls)๊ฐ€ ์™”์„ ๋•Œ, ์ง‘์— ๋„๋‘‘์ด ๋“ค์—ˆ์„ ํ™•๋ฅ ์„ ์–ด๋–ป๊ฒŒ ๊ตฌํ• ๊นŒ? ๋ฉ”๋ฆฌ์˜ ์ „ํ™”์— ๊ด€ํ•œ CPD๋งŒ์„ ์ด์šฉํ•ด์„œ๋Š” ๊ตฌํ•  ์ˆ˜ ์—†๋‹ค. ์ด๋Ÿฐ ๊ฒฝ์šฐ์—๋Š” ๋ฒ ์ด์ง€์•ˆ ๋„คํŠธ์›Œํฌ์˜ ๊ตฌ์กฐ์ ์ธ ์š”์†Œ์™€ ์ˆ˜์น˜์ ์ธ ์š”์†Œ ๋‘˜ ๋‹ค ์ด์šฉํ•˜์—ฌ ๊ตฌํ•  ์ˆ˜ ์žˆ๋‹ค.

 

4. Typical local structures

 

๋ฒ ์ด์ง€์•ˆ ๋„คํŠธ์›Œํฌ๋ฅผ ๊ตฌ์„ฑํ•˜๋Š” 3๊ฐ€์ง€์˜ ๋Œ€ํ‘œ์ ์ธ ๊ตฌ์กฐ์— ๋Œ€ํ•ด ์•Œ์•„๋ณด์ž.

 

 

 

โ–ท Common parent๋Š” ๋‘ ๊ฐœ์˜ ์ž์‹ ๋…ธ๋“œ๊ฐ€ ํ•˜๋‚˜์˜ ๋ถ€๋ชจ ๋…ธ๋“œ๋ฅผ ๊ฐ€์ง€๋Š” ๊ฒฝ์šฐ์ด๋‹ค. ์ด๋•Œ, ๋ถ€๋ชจ ๋…ธ๋“œ์˜ ์ •๋ณด๋ฅผ ์•Œ๊ฒŒ ๋˜๋ฉด ๋‘ ์ž์‹ ๋…ธ๋“œ๋Š” ์„œ๋กœ ๋…๋ฆฝ์ด๋‹ค.

 

โ–ท Cascading์€ ์„ธ ๊ฐœ์˜ ๋…ธ๋“œ๊ฐ€ ์ˆœ์ฐจ์ ์œผ๋กœ ์—ฐ๊ฒฐ๋œ ๊ฒฝ์šฐ์ด๋‹ค. ์ด๋•Œ, ๊ฐ€์šด๋ฐ์— ์œ„์น˜ํ•œ ๋…ธ๋“œ์˜ ์ •๋ณด๋ฅผ ์•Œ๊ฒŒ ๋˜๋ฉด ์–‘ ๋์˜ ๋…ธ๋“œ๋Š” ์„œ๋กœ ๋…๋ฆฝ์ด๋‹ค. ์œ„์˜ ์Šฌ๋ผ์ด๋“œ์˜ ์˜ˆ์— ์ ์šฉํ•˜๋ฉด, Alarm์— ๋Œ€ํ•œ ์ •๋ณด๊ฐ€ ์ฃผ์–ด์งˆ ๋•Œ, Buglary์˜ ์ •๋ณด๋Š” MarryCalls์— ์•„๋ฌด๋Ÿฐ ์ •๋ณด๋ฅผ ์ฃผ์ง€ ์•Š๋Š”๋‹ค. ์ฆ‰, Buglary์™€ MarryCalss๋Š” ๋…๋ฆฝ์ด๊ฒŒ ๋œ๋‹ค.

 

โ–ท V-Structure์€ ํ•˜๋‚˜์˜ ๋…ธ๋“œ๊ฐ€ ๋‘ ๊ฐœ์˜ ๋ถ€๋ชจ ๋…ธ๋“œ๋ฅผ ๊ฐ€์ง€๋Š” ๊ฒฝ์šฐ์ด๋‹ค. ์ด ๊ฒฝ์šฐ๋Š” ์•ž์˜ ๋‘ ๊ตฌ์กฐ์™€๋Š” ๋‹ค๋ฅด๊ฒŒ ์ž์‹ ๋…ธ๋“œ์˜ ์ •๋ณด๋ฅผ ์•Œ๋ฉด ๋‘ ๊ฐœ์˜ ๋ถ€๋ชจ ๋…ธ๋“œ๊ฐ€ ์ข…์†์ด๊ฒŒ ๋œ๋‹ค. ์œ„์˜ ์Šฌ๋ผ์ด๋“œ์˜ ์˜ˆ๋ฅผ ๋“ค์–ด ์„ค๋ช…ํ•˜๋ฉด, Alarm์ด ์šธ๋ ธ์„ ๋•Œ, Earthquake๊ฐ€ ๋ฐœ์ƒํ•˜์ง€ ์•Š์•˜๋‹ค๋ฉด, ์ด๋Š” Alarm์ด ๋ฐœ์ƒํ•œ ์›์ธ์ด Earthquake์— ์žˆ์ง€ ์•Š๋‹ค๋Š” ์ถ”๊ฐ€ ์ •๋ณด๋ฅผ Buglary์— ์ œ๊ณตํ•˜๊ฒŒ ๋œ๋‹ค. ๋”ฐ๋ผ์„œ ๋”์ด์ƒ Buglary์™€ Earthquake๊ฐ€ ๋…๋ฆฝ์ด์ง€ ์•Š๊ฒŒ ๋œ๋‹ค.

 

5. Bayes ball algorithm

 

 

โ–ท Bayes ball algorithm์€ ๊ฐ€์ƒ์˜ ๊ณต์„ ๊ตด๋ ค ๋ฒ ์ด์ฆˆ ๋„คํŠธ์›Œํฌ์—์„œ ๊ตด๋Ÿฌ๊ฐ€๋Š”์ง€ ์•ˆ ๊ตด๋Ÿฌ๊ฐ€๋Š”์ง€์— ๋”ฐ๋ผ ๋…ธ๋“œ๊ฐ„์˜ ๋…๋ฆฝ์„ ํŒ๋‹จํ•˜๊ธฐ ์œ„ํ•œ ์•Œ๊ณ ๋ฆฌ์ฆ˜์ด๋‹ค.

 

โ–ท ๋„คํŠธ์›Œํฌ์˜ ๊ตฌ์กฐ๊ฐ€ Common parent ๋˜๋Š” Casacading์—์„œ๋Š” ๊ฐ€์šด๋ฐ ๋…ธ๋“œ์—์„œ ์ •๋ณด๊ฐ€ ์ฃผ์–ด์ง€์ง€ ์•Š์œผ๋ฉด ๊ฐ€์ƒ์˜ ๊ณต์ด ๊ตด๋Ÿฌ๊ฐ€๊ณ , V-Structure์—์„œ๋Š” ๋ฐ˜๋Œ€๋กœ ๊ฐ€์šด๋ฐ ๋…ธ๋“œ์˜ ์ •๋ณด๊ฐ€ ์ฃผ์–ด์ ธ์•ผ ๊ณต์ด ๊ตด๋Ÿฌ๊ฐ€๊ฒŒ ๋œ๋‹ค. ๊ณต์ด ๊ตด๋Ÿฌ๊ฐ€๋Š” ๊ฒฝ๋กœ์— ์žˆ๋Š” ๋…ธ๋“œ์˜ ๊ด€๊ณ„๋Š” ์ข…์†์œผ๋กœ ํŒ๋‹จํ•  ์ˆ˜ ์žˆ๋‹ค.

 

 

โ–ท ์œ„ ์Šฌ๋ผ์ด๋“œ์˜ 4๊ฐœ์˜ ๋ฌธ์ œ๋ฅผ Bayes ball algorithm์„ ์ด์šฉํ•˜์—ฌ ํ’€๋ฉด ๋‹ต์€ ์ฐธ, ์ฐธ, ์ฐธ, ๊ฑฐ์ง“์ด๋‹ค.

 

โ–ท ํŠน์ • ๋…ธ๋“œ์—์„œ Blanket์— ํ•ด๋‹นํ•˜๋Š” ๋…ธ๋“œ์˜ ์ •๋ณด๊ฐ€ ์ œ๊ณต๋˜๋Š” ๊ฒฝ์šฐ, ํŠน์ • ๋…ธ๋“œ์™€ Blanket์— ํฌํ•จ๋˜์ง€ ์•Š๋Š” ๋…ธ๋“œ์˜ ๊ด€๊ณ„๋Š” ๋…๋ฆฝ์ด๋‹ค. ์ด Blanket์„ Markov blanket์ด๋ผ ํ•œ๋‹ค.

 

โ–ท Blanket์€ ํŠน์ • ๋…ธ๋“œ์˜ ๋ถ€๋ชจ ๋…ธ๋“œ์™€ ์ž์‹ ๋…ธ๋“œ ๊ทธ๋ฆฌ๊ณ  ์ž์‹ ๋…ธ๋“œ์˜ ๋ถ€๋ชจ ๋…ธ๋“œ๋กœ ๊ตฌ์„ฑ๋œ๋‹ค. ์ด๋•Œ, ์ž์‹ ๋…ธ๋“œ์˜ ๋ถ€๋ชจ ๋…ธ๋“œ๊ฐ€ ํ•„์š”ํ•œ ์ด์œ ๋Š” V-Structure์— ๋”ฐ๋ฅธ ์ •๋ณด์˜ ์ „๋‹ฌ์„ ๋ง‰๊ธฐ ์œ„ํ•จ์ด๋‹ค.

 

โ–ท D-Seperation์€ ํŠน์ • ๋…ธ๋“œ์— ๋Œ€ํ•œ ์ •๋ณด๊ฐ€ ์ฃผ์–ด์กŒ์„ ๋•Œ, ์‹œ์ž‘ ๋…ธ๋“œ์—์„œ Bayes ball algorithm์„ ์ด์šฉํ•˜์—ฌ ๊ณต์„ ๊ตด๋ ค์„œ ๋ชฉ์  ๋…ธ๋“œ์— ๋„๋‹ฌํ•˜์ง€ ๋ชปํ•˜๋Š” ๊ฒฝ์šฐ๋ฅผ ์˜๋ฏธํ•œ๋‹ค.

 


Reference:

"(๊ธฐ๊ณ„ ํ•™์Šต, Machine Learning) Week 7 Bayesian Network | Lecture 2 Probability Theorems," AAILab Kaist, www.youtube.com/watch?v=mnUcZbT5E28.

"(๊ธฐ๊ณ„ ํ•™์Šต, Machine Learning) Week 7 Bayesian Network | Lecture 3 Interpretation of Bayesian Network," AAILab Kaist, www.youtube.com/watch?v=N2rx_OmDeCk.

"(๊ธฐ๊ณ„ ํ•™์Šต, Machine Learning) Week 7 Bayesian Network | Lecture 4 Bayes Ball Algorithm," AAILab Kaist, www.youtube.com/watch?v=ZMG_LxhTzzk.