Commit 3df38ab2, authored by CSDN-Ada助手

Merge branch 'master' into dev

{
"export": [],
"export": ["dynamic_page.json"],
"keywords": [],
"children": [
{
......
{
"author": "zxm2015",
"source": "dynamic_page.md",
"depends": [],
"type": "code_options"
}
\ No newline at end of file
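# Scraping a dynamic page
Suppose you want to scrape a page at `url` that loads more content as you scroll down. Which of the following options can retrieve that page's content?
## Answer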
```python
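import time
from selenium import webdriver
from bs4 import BeautifulSoup

# `url` is assumed to be defined elsewhere as the address of the page to scrape
driver = webdriver.Chrome()
driver.get(url)
time.sleep(1)  # give the initial page a moment to load
page_size = 10
for i in range(page_size):
    time.sleep(2)  # wait for newly loaded content to render
    # scroll down so the page loads its next batch of content
    js = "var q=document.documentElement.scrollTop=10000"
    driver.execute_script(js)
page = BeautifulSoup(driver.page_source, 'lxml')
print(page.text)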
```
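The answer above scrolls a fixed number of times. As a variation, the sketch below keeps scrolling until the page height stops growing. The helper name `scroll_to_bottom` and its parameters `pause` and `max_rounds` are hypothetical and not part of the original answer. It assumes only that `driver` exposes `execute_script`, as a Selenium WebDriver does.

```python
import time


def scroll_to_bottom(driver, pause=2.0, max_rounds=10):
    """Scroll until the page height stops growing or max_rounds is reached.

    `driver` is any object with an `execute_script(js)` method, for
    example a Selenium WebDriver. Returns the number of scrolls performed.
    """
    last_height = driver.execute_script("return document.body.scrollHeight")
    rounds = 0
    while rounds < max_rounds:
        # Jump to the current bottom so the page requests its next batch.
        driver.execute_script("window.scrollTo(0, document.body.scrollHeight);")
        rounds += 1
        time.sleep(pause)  # give the new content time to load
        new_height = driver.execute_script("return document.body.scrollHeight")
        if new_height == last_height:
            break  # nothing new was appended, so assume the end was reached
        last_height = new_height
    return rounds
```

With Selenium, this would typically be called right after `driver.get(url)`, and `driver.page_source` would then be passed to `BeautifulSoup` as in the answer. A suitable `pause` depends on how quickly the target page loads new content.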
## Options
### A
```
None of the above is correct
```
### B
```python
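import requests
from bs4 import BeautifulSoup

# `url` is assumed to be defined elsewhere; this fetches only the static HTML
response = requests.get(url=url)
page = BeautifulSoup(response.text, 'lxml')
print(page.text)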
```
### C
```python
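import urllib.request
from bs4 import BeautifulSoup

# `url` is assumed to be defined elsewhere; this fetches only the static HTML
response = urllib.request.urlopen(url)
buff = response.read()
html = buff.decode("utf8")
page = BeautifulSoup(html, 'lxml')
print(page.text)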
```
......@@ -3,4 +3,4 @@
"source": "simulate_login.md",
"depends": [],
"type": "code_options"
}
\ No newline at end of file
}
......@@ -3,4 +3,4 @@
"source": "selenium.md",
"depends": [],
"type": "code_options"
}
\ No newline at end of file
}
......@@ -3,4 +3,4 @@
"source": "pyspider.md",
"depends": [],
"type": "code_options"
}
\ No newline at end of file
}
......@@ -3,4 +3,4 @@
"source": "verification_code.md",
"depends": [],
"type": "code_options"
}
\ No newline at end of file
}