How to Handle Time-Series Data

This post shows how to handle time-series (date and time) data in R using the built-in functions and the lubridate package.

 

□ Handling Time-Series Data with Base Functions

 

In:

date_form = c('%Y%m%d', '%Y.%m.%d', '%Y~%m~%d', '%Y-%m-%d')

date_1 = as.Date('20201019', tryFormats = date_form)
date_2 = as.Date('2020.10.19', tryFormats = date_form)
date_3 = as.Date('2020~10~19', tryFormats = date_form)

print(date_1)
print(date_2)
print(date_3)

print(class(date_1))
print(class(date_2))
print(class(date_3))

 

Out:

[1] "2020-10-19"

[1] "2020-10-19"

[1] "2020-10-19"

[1] "Date"

[1] "Date"

[1] "Date"

 

▷ A string can be converted into a date object. The date_form vector is passed as the tryFormats argument of as.Date, so R tries each format in turn; if the string given as the first argument matches one of the formats in date_form, it is converted to the Date type.

 

▷ In all three cases the class of the converted value is Date.
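
▷ Two edge cases are worth trying alongside the code above. The sketch below is only an illustration: the names d_explicit, d_bad, and mixed are made up for this example. It assumes that as.Date raises an error when none of the tryFormats match, and that the format is chosen from the first non-missing element only, so a vector with mixed layouts is not parsed element by element.

```r
date_form <- c('%Y%m%d', '%Y.%m.%d', '%Y~%m~%d', '%Y-%m-%d')

# When the layout is known in advance, a single explicit format avoids guessing.
d_explicit <- as.Date('2020.10.19', format = '%Y.%m.%d')

# None of the formats in date_form match day/month/year with slashes,
# so as.Date stops with an error; tryCatch turns that into NA here.
d_bad <- tryCatch(
  as.Date('19/10/2020', tryFormats = date_form),
  error = function(e) as.Date(NA)
)

# The format is picked from the first element ('%Y%m%d') and then applied
# to the whole vector, so the second element does not parse.
mixed <- as.Date(c('20201019', '2020.10.19'), tryFormats = date_form)

print(d_explicit)
print(d_bad)
print(mixed)
```

If a vector may contain several layouts, each element has to be parsed on its own (for example with sapply or a loop) or with a parser that handles heterogeneous input.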

 

In:

time_form = c('%Y%m%d %H:%M:%OS', '%Y.%m.%d %H:%M:%OS', '%Y~%m~%d %H:%M:%OS', '%Y-%m-%d %H:%M:%OS')

time_1 = as.POSIXct('2020-10-19 10:24:45', tryFormats = time_form)
time_2 = as.POSIXct('2020.10.19 10:24:45', tryFormats = time_form)
time_3 = as.POSIXct('2020~10~19 10:24:45', tryFormats = time_form)

print(time_1)
print(time_2)
print(time_3)

print(class(time_1))
print(class(time_2))
print(class(time_3))

 

Out:

[1] "2020-10-19 10:24:45 KST"

[1] "2020-10-19 10:24:45 KST"

[1] "2020-10-19 10:24:45 KST"

[1] "POSIXct" "POSIXt" 

[1] "POSIXct" "POSIXt" 

[1] "POSIXct" "POSIXt" 

 

▷ as.POSIXct creates a date-time object from a string in much the same way as as.Date. The difference is that as.Date can represent only the year, month, and day, whereas as.POSIXct also represents the hour, minute, and second. When the formats given to tryFormats include the time part, the string is recognized and converted. Because no tz argument is given, the result uses the session's time zone, which is why the output shows KST.
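
▷ The difference between the two classes is easier to see from their internal values. The following is a minimal sketch, with the time zone fixed to UTC (an assumption made here so the numbers do not depend on the machine) and with illustrative variable names: a Date is stored as a count of days since 1970-01-01, and a POSIXct as a count of seconds since the same origin.

```r
d  <- as.Date('2020-10-19')
ts <- as.POSIXct('2020-10-19 10:24:45', tz = 'UTC')

# Underlying numbers: days since 1970-01-01 vs. seconds since 1970-01-01 UTC
print(unclass(d))
print(as.numeric(ts))

# Seconds elapsed since midnight of that day: 10*3600 + 24*60 + 45
secs_into_day <- as.numeric(ts) - as.numeric(d) * 86400
print(secs_into_day)

# Converting back to Date drops the time of day
print(as.Date(ts))
```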

 

□ Handling Time-Series Data with the lubridate Package

 

In:

library(lubridate)  # provides ymd(), ymd_hms(), today(), now() and the accessors used below

date_3 = ymd('20201019')
date_4 = ymd('2020.10.19')
date_5 = ymd('2020~10~19')

print(date_3)
print(date_4)
print(date_5)

print(class(date_3))
print(class(date_4))
print(class(date_5))

 

Out:

[1] "2020-10-19"

[1] "2020-10-19"

[1] "2020-10-19"

[1] "Date"

[1] "Date"

[1] "Date"

 

▷ For convenience, the lubridate package provides the ymd function for converting strings into dates. Unlike as.Date, it does not need a format pattern as an argument; the components only have to appear in year, month, day order. As the output shows, the pattern of each string is recognized automatically and the string is converted.
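
▷ As a small sketch of how far this automatic recognition goes (variable names are illustrative only): ymd accepts a vector whose elements use different separators, its sibling functions dmy and mdy cover other component orders, and input that cannot be parsed becomes NA with a warning rather than an error.

```r
library(lubridate)

# Different separators (or none) in one vector are parsed element by element
mixed <- ymd(c('20201019', '2020.10.19', '2020~10~19'))
print(mixed)

# Other component orders have their own functions
d_dmy <- dmy('19/10/2020')
d_mdy <- mdy('10-19-2020')
print(d_dmy)
print(d_mdy)

# Unparseable input yields NA plus a warning (suppressed here)
bad <- suppressWarnings(ymd('not a date'))
print(bad)
```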

 

In:

time_4 = ymd_hms('2020-10-19 10:24:45')
time_5 = ymd_hms('2020.10.19 10:24:45')
time_6 = ymd_hms('2020~10~19 10:24:45')

print(time_4)
print(time_5)
print(time_6)

print(class(time_4))
print(class(time_5))
print(class(time_6))

 

Out:

[1] "2020-10-19 10:24:45 UTC"

[1] "2020-10-19 10:24:45 UTC"

[1] "2020-10-19 10:24:45 UTC"

[1] "POSIXct" "POSIXt" 

[1] "POSIXct" "POSIXt" 

[1] "POSIXct" "POSIXt" 

 

▷ ymd_hms likewise recognizes the pattern and converts the string automatically, without a separate format argument. Note that it returns the time in UTC by default, whereas as.POSIXct above used the local time zone (KST).
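
▷ If the strings are meant as local time, the time zone can be set while parsing or adjusted afterwards. The sketch below assumes the Asia/Seoul zone (chosen here to match the KST output earlier in this post); the variable names are illustrative.

```r
library(lubridate)

t_utc <- ymd_hms('2020-10-19 10:24:45')                     # parsed as UTC (default)
t_kst <- ymd_hms('2020-10-19 10:24:45', tz = 'Asia/Seoul')  # parsed as Seoul time

# with_tz keeps the instant and changes only how it is displayed
shown_in_seoul <- with_tz(t_utc, 'Asia/Seoul')

# force_tz keeps the clock reading and changes the instant
relabelled <- force_tz(t_utc, 'Asia/Seoul')

print(t_utc)
print(t_kst)
print(shown_in_seoul)
print(relabelled)
```

with_tz is the right choice when the parsed instant is already correct, and force_tz when the string was read with the wrong zone.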

 

In:

today()
now()

 

Out:

[1] "2020-10-20"

[1] "2020-10-20 02:24:07 KST"

 

▷ The today function returns the date (year, month, day) at the moment it is called, and now also returns the hour, minute, and second.
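
▷ Both functions take a tzone argument, so the current date and time can be requested for a specific zone instead of the session's zone. A short sketch (UTC is just an example choice, and the variable names are made up):

```r
library(lubridate)

today_utc <- today(tzone = 'UTC')  # a Date
now_utc   <- now(tzone = 'UTC')    # a POSIXct

print(class(today_utc))
print(class(now_utc))
print(tz(now_utc))
```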

 

In:

year(date_1)
month(date_1)
day(date_1)
weekdays(date_1)
yday(date_1)

hour(time_1)
minute(time_1)
second(time_1)

 

Out:

[1] 2020

[1] 10

[1] 19

[1] "월요일"

[1] 293

[1] 10

[1] 24

[1] 45

 

▷ The lubridate package also provides functions that extract a specific component from a date-time value. The output above shows the component extracted by each call. Note that weekdays comes from base R rather than lubridate, and its result follows the session locale, so a Korean locale prints 월요일 (Monday).
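
▷ lubridate has its own weekday accessor, wday, along with a few other extractors. The following sketch uses an illustrative variable d; week_start is passed explicitly because the default can be changed through an option.

```r
library(lubridate)

d <- ymd('2020-10-19')  # a Monday

wday_sun_first <- wday(d, week_start = 7)  # Sunday counted as day 1
wday_mon_first <- wday(d, week_start = 1)  # Monday counted as day 1
print(wday_sun_first)
print(wday_mon_first)

# label = TRUE returns the weekday name as an ordered factor (locale dependent)
print(wday(d, label = TRUE))

print(quarter(d))  # quarter of the year
print(week(d))     # number of complete 7-day periods since January 1, plus one
```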

 

In:

hour(time_1) = 11
minute(time_1) = 11
second(time_1) = 11
print(time_1)

ymd('2000-01-01') + days(2)
ymd('2000-01-01') + months(2)
ymd('2000-01-01') + years(2)

 

Out:

[1] "2020-10-19 11:11:11 KST"

[1] "2000-01-03"

[1] "2000-03-01"

[1] "2002-01-01"

 

▷ The lubridate package also provides functions for modifying and doing arithmetic on date-time values. When a new value is assigned to an accessor such as hour, minute, or second, the corresponding component of the date-time object given as its argument is changed.
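
▷ Assignment through an accessor modifies the variable itself. When the original value should be kept, lubridate's update function returns a changed copy instead. A minimal sketch with made-up variable names:

```r
library(lubridate)

time_a <- ymd_hms('2020-10-19 10:24:45')

# update() returns a new object; time_a itself is left untouched
time_b <- update(time_a, hour = 11, minute = 11, second = 11)

# The same change made in place on a copy, using the accessor assignments
time_c <- time_a
hour(time_c)   <- 11
minute(time_c) <- 11
second(time_c) <- 11

print(time_a)
print(time_b)
print(time_c)
```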

 

▷ Date arithmetic is done by adding or subtracting the result of a function whose name is an accessor name with an s appended (days, months, years, and so on).
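
▷ One case to watch for is adding months to a day that does not exist in the target month. The sketch below (variable names are illustrative) shows that plain + gives NA in that situation, while lubridate's %m+% operator rolls back to the last valid day of the month.

```r
library(lubridate)

jan31 <- ymd('2020-01-31')

# February 31 does not exist, so the plain sum is NA
plain_sum <- jan31 + months(1)

# %m+% rolls back to the last day of the target month
rolled <- jan31 %m+% months(1)

# Subtraction works the same way as addition
day_before <- ymd('2000-01-01') - days(1)

print(plain_sum)
print(rolled)
print(day_before)
```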