Scraping Ganji (赶集网) listings with Python and bs4

2022-07-28

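The script below fetches the Beijing rental listings page on Ganji, extracts the title, layout, and price of each listing, and saves them to a CSV file.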

import csv

import requests
from bs4 import BeautifulSoup

# Send a browser-like User-Agent with the request
headers = {"User-Agent": "Mozilla/5.0 (Linux; Android 6.0; Nexus 5 Build/MRA58N) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/83.0.4103.116 Mobile Safari/537.36"}
url = "http://bj.ganji.com/zufang/f4/pnl/"

# Download the listings page and parse it with the lxml parser
html = requests.get(url, headers=headers).text
soup = BeautifulSoup(html, "lxml")

# Each listing is a <dl> element inside the main list container
items = soup.find("div", class_="f-list js-tips-list").find_all("dl", class_="f-list-item-wrap min-line-height f-clear")

rows = []  # avoid the name "list", which shadows the built-in type
for item in items:
    name = item.find("dd", class_="dd-item title").a.string
    struct = item.find("dd", class_="dd-item size").span.text
    price = item.find("dd", class_="dd-item info").find("span", class_="num").string
    rows.append([name, struct, price])

# Write one CSV row per listing: title, layout, price
with open("赶集网.csv", "w", encoding="utf-8", newline="") as f:
    w = csv.writer(f)
    w.writerows(rows)
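
The CSV step above writes the raw values with no header row. A minimal sketch of a slightly more careful writer follows. The function name `save_listings` and the column names in `HEADER` are assumptions for illustration, not part of the original script. It adds a header, strips surrounding whitespace, turns missing values into empty strings, and uses the `utf-8-sig` encoding, which usually helps Excel detect UTF-8 when the file contains Chinese text.

```python
import csv

# Hypothetical column names, mirroring the variable names used in the script.
HEADER = ["name", "struct", "price"]


def save_listings(rows, path):
    """Write scraped rows to a CSV file with a header row.

    "utf-8-sig" prepends a byte order mark to the file.
    """
    with open(path, "w", encoding="utf-8-sig", newline="") as f:
        writer = csv.writer(f)
        writer.writerow(HEADER)
        for row in rows:
            # None becomes an empty string; other values are stripped of
            # leading and trailing whitespace.
            writer.writerow(["" if v is None else str(v).strip() for v in row])
```

Passing the list built in the loop above together with a file name such as "赶集网.csv" should produce the same data as the original script, plus the header line.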

Original article: https://blog.csdn.net/sober_1/article/details/109640200
