Crawling with BeautifulSoup

□ Crawling steps

 

(1) Fetch the HTML of the target page

(2) Turn the HTML into a form that can be analyzed (parsing)

(3) Extract the data you want
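The three steps above can be sketched as three small functions. This is only an illustration: the function names fetch, parse and extract_title are assumptions, and a hard-coded HTML string stands in for a real download so that the sketch runs without a network connection.

```python
import urllib.request

from bs4 import BeautifulSoup


def fetch(url: str) -> str:
    """Step 1: download the page and return its HTML as text."""
    with urllib.request.urlopen(url) as res:
        return res.read().decode("utf-8")


def parse(html: str) -> BeautifulSoup:
    """Step 2: turn the HTML text into a searchable tree."""
    return BeautifulSoup(html, "html.parser")


def extract_title(soup: BeautifulSoup) -> str:
    """Step 3: pull out one piece of data, here the page title."""
    return soup.title.get_text() if soup.title else ""


# A hard-coded page stands in for fetch(url) so the sketch runs offline
sample = "<html><head><title>NAVER</title></head><body></body></html>"
print(extract_title(parse(sample)))
```

With a live connection, extract_title(parse(fetch(url))) would chain all three steps for a given url.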

 

In:

import urllib.request  # standard-library module for fetching web pages

req = urllib.request.Request('https://naver.com')  # build a Request object for the web server
res = urllib.request.urlopen(req)  # send the request and store the response

html_byte = res.read()  # read() returns the response body as bytes
html_str = html_byte.decode('UTF8')  # decode as UTF-8 to get a string
html_splt = html_str.split('\n')  # split the string on line breaks

for line in html_splt[:10]:  # print the first 10 lines, one at a time
    print(line)
    print('\n')

 

Out:



<!doctype html>                <html lang="ko" data-dark="false"> <head> <meta charset="utf-8"> <title>NAVER</title> <meta http-equiv="X-UA-Compatible" content="IE=edge"> <meta name="viewport" content="width=1190"> <meta name="apple-mobile-web-app-title" content="NAVER"/> <meta name="robots" content="index,nofollow"/> <meta name="description" content="네이버 메인에서 다양한 정보와 유용한 컨텐츠를 만나 보세요"/> <meta property="og:title" content="네이버"> <meta property="og:url" content="https://www.naver.com/"> <meta property="og:image" content="https://s.pstatic.net/static/www/mobile/edit/2016/0705/mobile_212852414260.png"> <meta property="og:description" content="네이버 메인에서 다양한 정보와 유용한 컨텐츠를 만나 보세요"/> <meta name="twitter:card" content="summary"> <meta name="twitter:title" content=""> <meta name="twitter:url" content="https://www.naver.com/"> <meta name="twitter:image" content="https://s.pstatic.net/static/www/mobile/edit/2016/0705/mobile_212852414260.png"> <meta name="twitter:description" content="네이버 메인에서 다양한 정보와 유용한 컨텐츠를 만나 보세요"/>  <link rel="stylesheet" href="https://pm.pstatic.net/dist/css/nmain.20200917.css"> <link rel="stylesheet" href="https://ssl.pstatic.net/sstatic/search/pc/css/api_atcmp_200709.css"> <link rel="shortcut icon" type="image/x-icon" href="/favicon.ico?1"/>   <script>document.domain="naver.com",window.nmain=window.nmain||{},window.nmain.supportFlicking=!1;var nsc="navertop.v4",ua=navigator.userAgent;window.nmain.isIE=navigator.appName&&navigator.appName.indexOf("Explorer")>0&&ua.toLocaleLowerCase().indexOf("msie 10.0")<0,document.getElementsByTagName("html")[0].setAttribute("data-useragent",ua),window.nmain.isIE&&(Object.create=function(n){function a(){}return a.prototype=n,new a})</script> <script>var darkmode= 
false;window.naver_corp_da=window.naver_corp_da||{main:{}},window.naver_corp_da.main=window.naver_corp_da.main||{},window.naver_corp_da.main.darkmode=darkmode</script> <script> window.nmain.gv = {  isLogin: false,


useId: null,   daInfo: {"ANIMAL":{"menu":"ANIMAL","childMenu":"","adType":"singleDom","multiDomAdUrl":"","multiDomUnit":"","infoList":[{"adposId":"1000161","singleDomAdUrl":"https://nv.veta.naver.com/fxshow","param":{"da_dom_id":"p_main_animal_1","tb":"ANIMAL_1","unit":"SU10567","calp":"-"},"type":{"position":"abs","positionIndex":4,"subject":"contents"},"dom":null},{"adposId":"1000162","singleDomAdUrl":"https://nv.veta.naver.com/fxshow","param":{"da_dom_id":"p_main_animal_2","tb":"ANIMAL_1","unit":"SU10568","calp":"-"},"type":{"position":"abs","positionIndex":8,"subject":"contents"},"dom":null}]},"BEAUTY":{"menu":"BEAUTY","childMenu":"","adType":"singleDom","multiDomAdUrl":"","multiDomUnit":"","infoList":[{"adposId":"1000163","singleDomAdUrl":"https://nv.veta.naver.com/fxshow","param":{"da_dom_id":"p_main_beauty_1","tb":"BEAUTY_1","unit":"SU10595","calp":"-"},"type":{"position":"abs","positionIndex":4,"subject":"contents"},"dom":null},{"adposId":"1000164","singleDomAdUrl":"https://nv.veta.naver.com/fxshow","param":{"da_dom_id":"p_main_beauty_2","tb":"BEAUTY_1","unit":"SU10596","calp":"-"},"type":{"position":"abs","positionIndex":8,"subject":"contents"},"dom":null}]},"BUSINESS":{"menu":"BUSINESS","childMenu":"","adType":"singleDom","multiDomAdUrl":"","multiDomUnit":"","infoList":[{"adposId":"1000165","singleDomAdUrl":"https://nv.veta.naver.com/fxshow","param":{"da_dom_id":"p_main_business_1","tb":"BUSINESS_1","unit":"SU10577","calp":"-"},"type":{"position":"abs","positionIndex":4,"subject":"contents"},"dom":null},{"adposId":"1000166","singleDomAdUrl":"https://nv.veta.naver.com/fxshow","param":{"da_dom_id":"p_main_business_2","tb":"BUSINESS_1","unit":"SU10578","calp":"-"},"type":{"position":"abs","positionIndex":8,"subject":"contents"},"dom":null}]},"CARGAME":{"menu":"CARGAME","childMenu":"","adType":"singleDom","multiDomAdUrl":"","multiDomUnit":"","infoList":[{"adposId":"1000167","singleDomAdUrl":"https://nv.veta.naver.com/fxshow","param":{"da_dom_id":"p_main_cargam
e_1","tb":"CARGAME_1","unit":"SU10587","calp":"-"},"type":{"position":"abs","positionIndex":4,"subject":"contents"},"dom":null},{"adposId":"1000168","singleDomAdUrl":"https://nv.veta.naver.com/fxshow","param":{"da_dom_id":"p_main_cargame_2","tb":"CARGAME_1","unit":"SU10588","calp":"-"},"type":{"position":"abs","positionIndex":8,"subject":"contents"},"dom":null}]},"CHINA":{"menu":"CHINA","childMenu":"","adType":"singleDom","multiDomAdUrl":"","multiDomUnit":"","infoList":[{"adposId":"1000169","singleDomAdUrl":"https://nv.veta.naver.com/fxshow","param":{"da_dom_id":"p_main_china_1","tb":"CHINA_1","unit":"SU10591","calp":"-"},"type":{"position":"abs","positionIndex":4,"subject":"contents"},"dom":null},{"adposId":"1000170","singleDomAdUrl":"https://nv.veta.naver.com/fxshow","param":{"da_dom_id":"p_main_china_2","tb":"CHINA_1","unit":"SU10592","calp":"-"},"type":{"position":"abs","positionIndex":8,"subject":"contents"},"dom":null}]},"DESIGN":{"menu":"DESIGN","childMenu":"","adType":"singleDom","multiDomAdUrl":"","multiDomUnit":"","infoList":[{"adposId":"1000171","singleDomAdUrl":"https://nv.veta.naver.com/fxshow","param":{"da_dom_id":"p_main_design_1","tb":"DESIGN_1","unit":"SU10569","calp":"-"},"type":{"position":"abs","positionIndex":4,"subject":"contents"},"dom":null},{"adposId":"1000172","singleDomAdUrl":"https://nv.veta.naver.com/fxshow","param":{"da_dom_id":"p_main_design_2","tb":"DESIGN_1","unit":"SU10570","calp":"-"},"type":{"position":"abs","positionIndex":8,"subject":"contents"},"dom":null}]},"FARM":{"menu":"FARM","childMenu":"","adType":"singleDom","multiDomAdUrl":"","multiDomUnit":"","infoList":[{"adposId":"1000173","singleDomAdUrl":"https://nv.veta.naver.com/fxshow","param":{"da_dom_id":"p_main_farm_1","tb":"FARM_1","unit":"SU10561","calp":"-"},"type":{"position":"abs","positionIndex":4,"subject":"contents"},"dom":null},{"adposId":"1000174","singleDomAdUrl":"https://nv.veta.naver.com/fxshow","param":{"da_dom_id":"p_main_farm_2","tb":"FARM_1","unit":"SU10562",
"calp":"-"},"type":{"position":"abs","positionIndex":8,"subject":"contents"},"dom":null}]},"FINANCE":{"menu":"FINANCE","childMenu":"","adType":"singleDom","multiDomAdUrl":"","multiDomUnit":"","infoList":[{"adposId":"1000175","singleDomAdUrl":"https://nv.veta.naver.com/fxshow","param":{"da_dom_id":"p_main_finance_1","tb":"FINANCE_1","unit":"SU10563","calp":"-"},"type":{"position":"abs","positionIndex":4,"subject":"contents"},"dom":null},{"adposId":"1000176","singleDomAdUrl":"https://nv.veta.naver.com/fxshow","param":{"da_dom_id":"p_main_finance_2","tb":"FINANCE_1","unit":"SU10564","calp":"-"},"type":{"position":"abs","positionIndex":8,"subject":"contents"},"dom":null}]},"ITTECH":{"menu":"ITTECH","childMenu":"","adType":"singleDom","multiDomAdUrl":"","multiDomUnit":"","infoList":[{"adposId":"1000177","singleDomAdUrl":"https://nv.veta.naver.com/fxshow","param":{"da_dom_id":"p_main_ittech_1","tb":"ITTECH_1","unit":"SU10593","calp":"-"},"type":{"position":"abs","positionIndex":4,"subject":"contents"},"dom":null},{"adposId":"1000178","singleDomAdUrl":"https://nv.veta.naver.com/fxshow","param":{"da_dom_id":"p_main_ittech_2","tb":"ITTECH_1","unit":"SU10594","calp":"-"},"type":{"position":"abs","positionIndex":8,"subject":"contents"},"dom":null}]},"JOB":{"menu":"JOB","childMenu":"","adType":"singleDom","multiDomAdUrl":"","multiDomUnit":"","infoList":[{"adposId":"1000179","singleDomAdUrl":"https://nv.veta.naver.com/fxshow","param":{"da_dom_id":"p_main_job_1","tb":"JOB_1","unit":"SU10589","calp":"-"},"type":{"position":"abs","positionIndex":4,"subject":"contents"},"dom":null},{"adposId":"1000180","singleDomAdUrl":"https://nv.veta.naver.com/fxshow","param":{"da_dom_id":"p_main_job_2","tb":"JOB_1","unit":"SU10590","calp":"-"},"type":{"position":"abs","positionIndex":8,"subject":"contents"},"dom":null}]},"LAW":{"menu":"LAW","childMenu":"","adType":"singleDom","multiDomAdUrl":"","multiDomUnit":"","infoList":[{"adposId":"1000181","singleDomAdUrl":"https://nv.veta.naver.com/fxshow",
"param":{"da_dom_id":"p_main_law_1","tb":"LAW_1","unit":"SU10573","calp":"-"},"type":{"position":"abs","positionIndex":4,"subject":"contents"},"dom":null},{"adposId":"1000182","singleDomAdUrl":"https://nv.veta.naver.com/fxshow","param":{"da_dom_id":"p_main_law_2","tb":"LAW_1","unit":"SU10574","calp":"-"},"type":{"position":"abs","positionIndex":8,"subject":"contents"},"dom":null}]},"LIVING":{"menu":"LIVING","childMenu":"","adType":"singleDom","multiDomAdUrl":"","multiDomUnit":"","infoList":[{"adposId":"1000183","singleDomAdUrl":"https://nv.veta.naver.com/fxshow","param":{"da_dom_id":"p_main_living_1","tb":"LIVING_1","unit":"SU10597","calp":"-"},"type":{"position":"abs","positionIndex":4,"subject":"contents"},"dom":null},{"adposId":"1000184","singleDomAdUrl":"https://nv.veta.naver.com/fxshow","param":{"da_dom_id":"p_main_living_2","tb":"LIVING_1","unit":"SU10606","calp":"-"},"type":{"position":"abs","positionIndex":8,"subject":"contents"},"dom":null}]},"LIVINGHOME":{"menu":"LIVINGHOME","childMenu":"","adType":"singleDom","multiDomAdUrl":"","multiDomUnit":"","infoList":[{"adposId":"1000185","singleDomAdUrl":"https://nv.veta.naver.com/fxshow","param":{"da_dom_id":"p_main_livinghome_1","tb":"LIVINGHOME_1","unit":"SU10571","calp":"-"},"type":{"position":"abs","positionIndex":4,"subject":"contents"},"dom":null},{"adposId":"1000186","singleDomAdUrl":"https://nv.veta.naver.com/fxshow","param":{"da_dom_id":"p_main_livinghome_2","tb":"LIVINGHOME_1","unit":"SU10572","calp":"-"},"type":{"position":"abs","positionIndex":8,"subject":"contents"},"dom":null}]},"MOMKIDS":{"menu":"MOMKIDS","childMenu":"","adType":"singleDom","multiDomAdUrl":"","multiDomUnit":"","infoList":[{"adposId":"1000187","singleDomAdUrl":"https://nv.veta.naver.com/fxshow","param":{"da_dom_id":"p_main_momkids_1","tb":"MOMKIDS_1","unit":"SU10575","calp":"-"},"type":{"position":"abs","positionIndex":4,"subject":"contents"},"dom":null},{"adposId":"1000188","singleDomAdUrl":"https://nv.veta.naver.com/fxshow","param"
:{"da_dom_id":"p_main_momkids_2","tb":"MOMKIDS_1","unit":"SU10576","calp":"-"},"type":{"position":"abs","positionIndex":8,"subject":"contents"},"dom":null}]},"MOVIE":{"menu":"MOVIE","childMenu":"","adType":"singleDom","multiDomAdUrl":"","multiDomUnit":"","infoList":[{"adposId":"1000189","singleDomAdUrl":"https://nv.veta.naver.com/fxshow","param":{"da_dom_id":"p_main_movie_1","tb":"MOVIE_1","unit":"SU10585","calp":"-"},"type":{"position":"abs","positionIndex":4,"subject":"contents"},"dom":null},{"adposId":"1000190","singleDomAdUrl":"https://nv.veta.naver.com/fxshow","param":{"da_dom_id":"p_main_movie_2","tb":"MOVIE_1","unit":"SU10586","calp":"-"},"type":{"position":"abs","positionIndex":8,"subject":"contents"},"dom":null}]},"SCHOOL":{"menu":"SCHOOL","childMenu":"","adType":"singleDom","multiDomAdUrl":"","multiDomUnit":"","infoList":[{"adposId":"1000191","singleDomAdUrl":"https://nv.veta.naver.com/fxshow","param":{"da_dom_id":"p_main_school_1","tb":"SCHOOL_1","unit":"SU10579","calp":"-"},"type":{"position":"abs","positionIndex":4,"subject":"contents"},"dom":null},{"adposId":"1000192","singleDomAdUrl":"https://nv.veta.naver.com/fxshow","param":{"da_dom_id":"p_main_school_2","tb":"SCHOOL_1","unit":"SU10580","calp":"-"},"type":{"position":"abs","positionIndex":8,"subject":"contents"},"dom":null}]},"SHOW":{"menu":"SHOW","childMenu":"","adType":"singleDom","multiDomAdUrl":"","multiDomUnit":"","infoList":[{"adposId":"1000193","singleDomAdUrl":"https://nv.veta.naver.com/fxshow","param":{"da_dom_id":"p_main_show_1","tb":"SHOW_1","unit":"SU10565","calp":"-"},"type":{"position":"abs","positionIndex":4,"subject":"contents"},"dom":null},{"adposId":"1000194","singleDomAdUrl":"https://nv.veta.naver.com/fxshow","param":{"da_dom_id":"p_main_show_2","tb":"SHOW_1","unit":"SU10566","calp":"-"},"type":{"position":"abs","positionIndex":8,"subject":"contents"},"dom":null}]},"TRAVEL":{"menu":"TRAVEL","childMenu":"","adType":"singleDom","multiDomAdUrl":"","multiDomUnit":"","infoList":[{"adpo
sId":"1000195","singleDomAdUrl":"https://nv.veta.naver.com/fxshow","param":{"da_dom_id":"p_main_travel_1","tb":"TRAVEL_1","unit":"SU10581","calp":"-"},"type":{"position":"abs","positionIndex":4,"subject":"contents"},"dom":null},{"adposId":"1000196","singleDomAdUrl":"https://nv.veta.naver.com/fxshow","param":{"da_dom_id":"p_main_travel_2","tb":"TRAVEL_1","unit":"SU10582","calp":"-"},"type":{"position":"abs","positionIndex":8,"subject":"contents"},"dom":null}]},"WEDDING":{"menu":"WEDDING","childMenu":"","adType":"singleDom","multiDomAdUrl":"","multiDomUnit":"","infoList":[{"adposId":"1000197","singleDomAdUrl":"https://nv.veta.naver.com/fxshow","param":{"da_dom_id":"p_main_wedding_1","tb":"WEDDING_1","unit":"SU10583","calp":"-"},"type":{"position":"abs","positionIndex":4,"subject":"contents"},"dom":null},{"adposId":"1000198","singleDomAdUrl":"https://nv.veta.naver.com/fxshow","param":{"da_dom_id":"p_main_wedding_2","tb":"WEDDING_1","unit":"SU10584","calp":"-"},"type":{"position":"abs","positionIndex":8,"subject":"contents"},"dom":null}]}},


svt: 20201010181134,


}; </script> <script> window.nmain.newsstand = {


rcode: '08320102',


newsCastSubsInfo: '',


newsStandSubsInfo: ''


};


window.etc = {  };


 

 

□ Requests

 

- An external library that connects to a web server and retrieves HTML

- More convenient than the standard-library urllib, and quicker to work with

- Included with Anaconda

- Otherwise, install it with pip install requests

 

In:

import requests

res = requests.get('https://www.naver.com')  # does the work of the urllib code above in one line

print(res.status_code)

 

Out:

200

 

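▷ The number held in the status_code attribute has the following meaning.

     200: OK (the request succeeded)

     404: Not Found (the requested URL does not exist)

     502: Bad Gateway (a gateway or proxy received an invalid response from the upstream server)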
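The standard name of any status code can be looked up with the standard library's http.HTTPStatus. The sketch below is only an illustration; the helper name describe_status is an assumption and is not part of Requests.

```python
from http import HTTPStatus


def describe_status(code: int) -> str:
    """Return 'code phrase' for an HTTP status code (hypothetical helper)."""
    try:
        status = HTTPStatus(code)
    except ValueError:
        # HTTPStatus raises ValueError for codes it does not define
        return f"{code} (unknown status code)"
    return f"{status.value} {status.phrase}"


for code in (200, 404, 502):
    print(describe_status(code))
```

In a crawler it is usually enough to call res.raise_for_status() on the Requests response, which raises an exception for 4xx and 5xx codes instead of letting the script continue with an error page.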

 

In:

print(res.headers['content-type'])

 

Out:

text/html; charset=UTF-8

 

▷ The header shows that the server declares the body as UTF-8, which is the encoding Requests uses when it decodes the body into res.text.
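To make the role of the charset parameter concrete, the following sketch parses a Content-Type value with the standard library and uses the result to decode bytes. It is roughly what happens behind res.text, not the actual Requests implementation; the helper name charset_from_content_type and the utf-8 fallback are assumptions.

```python
from email.message import Message


def charset_from_content_type(content_type: str, default: str = "utf-8") -> str:
    """Extract the charset parameter from a Content-Type header value."""
    msg = Message()
    msg["Content-Type"] = content_type
    # get_content_charset() returns the charset in lower case, or None if absent
    return msg.get_content_charset() or default


header = "text/html; charset=UTF-8"  # the value printed above
charset = charset_from_content_type(header)
print(charset)

# Decoding raw bytes with that charset turns them back into a string
raw = "네이버".encode(charset)
print(raw.decode(charset))
```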

 

In:

print(res.text)

 

▷ This gives the same kind of result as the urllib code earlier.

 

 

□ BeautifulSoup

 

- An external library that turns HTML into a form that can be analyzed and extracts data from it

- Always specify a parser when using it (if it is omitted, BeautifulSoup guesses one and issues a warning)

- Examples of parsers include html.parser and lxml

- Install it with pip install beautifulsoup4

 

In:

from bs4 import BeautifulSoup

bs = BeautifulSoup(res.text, 'html.parser')

print(bs)

 

▷ With BeautifulSoup, the content of the URL can be loaded just as in the earlier examples.

 

In:

print(bs.select('a')[0])  # the first a tag (index 0) among the matches

Out:

<a href="#newsstand"><span>뉴스스탠드 바로가기</span></a>

 

▷ The select function retrieves the elements whose tag is a. The element above is the first a tag in the page's actual HTML (shown as a screenshot in the original post).
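select accepts any CSS selector, not only a bare tag name. The sketch below runs on a small made-up HTML string (an assumption for illustration, not Naver's real markup) so that it works without a network connection.

```python
from bs4 import BeautifulSoup

# Made-up HTML standing in for a downloaded page (not Naver's real markup)
html = """
<div id="skip"><a href="#newsstand"><span>Newsstand</span></a></div>
<ul class="menu">
  <li class="nav_item"><a href="/mail">Mail</a></li>
  <li class="nav_item"><a href="/cafe">Cafe</a></li>
</ul>
"""

bs = BeautifulSoup(html, "html.parser")

links = bs.select("a")              # every a tag, in document order
first = bs.select_one("a")          # the first match, or None if there is none
nav = bs.select("li.nav_item > a")  # a tags that are direct children of li.nav_item

print(len(links))
print(first["href"])
print([a.get_text() for a in nav])
```

select_one avoids the IndexError that select(...)[0] raises when nothing matches, at the cost of having to check for None.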

 

 

In:

url = 'https://www.naver.com'
req = requests.get(url)
bs = BeautifulSoup(req.text, 'html.parser')

lst = bs.find_all('li', {'class': 'nav_item'})  # find li elements whose class is "nav_item"

for l in lst:
    print(l.find('a').text)  # take only the text inside the a tag

 

Out:

๋ฉ”์ผ
์นดํŽ˜
๋ธ”๋กœ๊ทธ
์ง€์‹iN
์‡ผํ•‘
Pay
TV
์‚ฌ์ „
๋‰ด์Šค
์ฆ๊ถŒ
๋ถ€๋™์‚ฐ
์ง€๋„
์˜ํ™”
VIBE
์ฑ…
์›นํˆฐ

 

▷ As the page's HTML shows (a screenshot in the original post), the li tags have the class "nav_item", and the output is the text inside each of their a tags.

 

 

In:

url = 'https://www.daum.net'
req = requests.get(url)
bs = BeautifulSoup(req.text, 'html.parser')

lst_ul = bs.find_all('ul', {'class':'list_mainsvc'})

for u in lst_ul:
    lst_li = u.find_all('li')
    
    for l in lst_li:
        print(l.find_all('span', {'class':'ir_wa'})[0].text)

 

Out:

์นดํŽ˜
๋ฉ”์ผ
๋‰ด์Šค
์ง€๋„
์ฆ๊ถŒ
์‡ผํ•‘
์นด์นด์˜คTV
์›นํˆฐ
๋ธ”๋กœ๊ทธ
๋ธŒ๋Ÿฐ์น˜
์‚ฌ์ „
๊ฒŒ์ž„
๊ฐ™์ด๊ฐ€์น˜

 

▷ In this case, to pull out the content of the span tags whose class is "ir_wa", the code walks down the hierarchy in order: ul tag, then li tag, then span tag.

 

▷ The screenshot in the original post shows this hierarchical tag structure.
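The same ul, li, span traversal can also be written as a single CSS descendant selector. This is a sketch on made-up HTML that only mimics the nesting (an assumption, not Daum's real markup); it swaps the nested find_all loops for select.

```python
from bs4 import BeautifulSoup

# Made-up HTML that mimics the ul > li > span nesting (not Daum's real markup)
html = """
<ul class="list_mainsvc">
  <li><a href="/cafe"><span class="ir_wa">Cafe</span></a></li>
  <li><a href="/mail"><span class="ir_wa">Mail</span></a></li>
  <li><a href="/etc">no label here</a></li>
</ul>
"""

bs = BeautifulSoup(html, "html.parser")

# One descendant selector replaces the nested find_all loops
names = [span.get_text(strip=True)
         for span in bs.select("ul.list_mainsvc li span.ir_wa")]
print(names)
```

A li without a matching span is simply skipped here, whereas indexing find_all(...)[0] inside the loop would raise an IndexError for it.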