Commit 9f8c3938 (looyolo/scrapy)
Authored Oct 08, 2020 by Andrey Rahmatullin
Committed Oct 08, 2020 by GitHub
Merge pull request #4823 from elacuesta/cookies-revert-header
Do not process cookies from headers
Parents: 45c06cfd, 137c8ba6
Showing 5 changed files with 37 additions and 31 deletions
docs/topics/downloader-middleware.rst +5 -0
docs/topics/request-response.rst +12 -0
docs/topics/settings.rst +5 -0
scrapy/downloadermiddlewares/cookies.py +10 -31
tests/test_downloadermiddleware_cookies.py +5 -0
docs/topics/downloader-middleware.rst
@@ -207,6 +207,11 @@ CookiesMiddleware
 a warning. Refer to :ref:`topics-logging-advanced-customization`
 to customize the logging behaviour.

+.. caution:: Cookies set via the ``Cookie`` header are not considered by the
+    :ref:`cookies-mw`. If you need to set cookies for a request, use the
+    :class:`Request.cookies <scrapy.http.Request>` parameter. This is a known
+    current limitation that is being worked on.
+
 The following settings can be used to configure the cookie middleware:

 * :setting:`COOKIES_ENABLED`
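
Since the caution above points users to the `Request.cookies` parameter, one possible workaround for code that only has a raw `Cookie` header string is to convert it into a dict first. The sketch below uses only the standard library; the helper name `cookie_header_to_dict` is hypothetical and is not part of Scrapy, and `SimpleCookie` may reject or skip cookie strings that a browser would accept.

```python
from http.cookies import SimpleCookie


def cookie_header_to_dict(header_value: str) -> dict:
    """Parse a raw ``Cookie`` header value into a name -> value dict.

    Hypothetical helper for illustration; not part of Scrapy.
    """
    parsed = SimpleCookie()
    parsed.load(header_value)
    return {name: morsel.value for name, morsel in parsed.items()}


cookies = cookie_header_to_dict("default=value; asdf=qwerty")
print(cookies)
# The resulting dict could then be passed through the documented parameter,
# for example: scrapy.Request("https://example.org", cookies=cookies)
```

Passing the dict through `cookies=` keeps the values visible to the cookies middleware, which is the behaviour the caution recommends.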
docs/topics/request-response.rst
@@ -61,6 +61,12 @@ Request objects
 :param headers: the headers of this request. The dict values can be strings
    (for single valued headers) or lists (for multi-valued headers). If
    ``None`` is passed as value, the HTTP header will not be sent at all.
+
+    .. caution:: Cookies set via the ``Cookie`` header are not considered by the
+        :ref:`cookies-mw`. If you need to set cookies for a request, use the
+        :class:`Request.cookies <scrapy.http.Request>` parameter. This is a known
+        current limitation that is being worked on.
+
 :type headers: dict

 :param cookies: the request cookies. These can be sent in two forms.
@@ -102,6 +108,12 @@ Request objects
     )

     For more info see :ref:`cookies-mw`.
+
+    .. caution:: Cookies set via the ``Cookie`` header are not considered by the
+        :ref:`cookies-mw`. If you need to set cookies for a request, use the
+        :class:`Request.cookies <scrapy.http.Request>` parameter. This is a known
+        current limitation that is being worked on.
+
 :type cookies: dict or list

 :param encoding: the encoding of this request (defaults to ``'utf-8'``).
docs/topics/settings.rst
@@ -352,6 +352,11 @@ Default::
 The default headers used for Scrapy HTTP Requests. They're populated in the
 :class:`~scrapy.downloadermiddlewares.defaultheaders.DefaultHeadersMiddleware`.

+.. caution:: Cookies set via the ``Cookie`` header are not considered by the
+    :ref:`cookies-mw`. If you need to set cookies for a request, use the
+    :class:`Request.cookies <scrapy.http.Request>` parameter. This is a known
+    current limitation that is being worked on.
+
 .. setting:: DEPTH_LIMIT

 DEPTH_LIMIT
scrapy/downloadermiddlewares/cookies.py
@@ -97,35 +97,14 @@ class CookiesMiddleware:
     def _get_request_cookies(self, jar, request):
         """
-        Extract cookies from a Request. Values from the `Request.cookies` attribute
-        take precedence over values from the `Cookie` request header.
+        Extract cookies from the Request.cookies attribute
         """
-        def get_cookies_from_header(jar, request):
-            cookie_header = request.headers.get("Cookie")
-            if not cookie_header:
-                return []
-            cookie_gen_bytes = (s.strip() for s in cookie_header.split(b";"))
-            cookie_list_unicode = []
-            for cookie_bytes in cookie_gen_bytes:
-                try:
-                    cookie_unicode = cookie_bytes.decode("utf8")
-                except UnicodeDecodeError:
-                    logger.warning("Non UTF-8 encoded cookie found in request %s: %s",
-                                   request, cookie_bytes)
-                    cookie_unicode = cookie_bytes.decode("latin1", errors="replace")
-                cookie_list_unicode.append(cookie_unicode)
-            response = Response(request.url, headers={"Set-Cookie": cookie_list_unicode})
-            return jar.make_cookies(response, request)
-
-        def get_cookies_from_attribute(jar, request):
-            if not request.cookies:
-                return []
-            elif isinstance(request.cookies, dict):
-                cookies = ({"name": k, "value": v} for k, v in request.cookies.items())
-            else:
-                cookies = request.cookies
-            formatted = filter(None, (self._format_cookie(c, request) for c in cookies))
-            response = Response(request.url, headers={"Set-Cookie": formatted})
-            return jar.make_cookies(response, request)
-
-        return get_cookies_from_header(jar, request) + get_cookies_from_attribute(jar, request)
+        if not request.cookies:
+            return []
+        elif isinstance(request.cookies, dict):
+            cookies = ({"name": k, "value": v} for k, v in request.cookies.items())
+        else:
+            cookies = request.cookies
+        formatted = filter(None, (self._format_cookie(c, request) for c in cookies))
+        response = Response(request.url, headers={"Set-Cookie": formatted})
+        return jar.make_cookies(response, request)
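
The first step of the simplified method accepts `request.cookies` either as a dict or as an iterable of cookie dicts and brings both into one shape. As a rough, standalone illustration of that normalization only (the function name `normalize_request_cookies` is hypothetical, and the formatting and cookie-jar steps of the real method are left out):

```python
def normalize_request_cookies(cookies):
    """Return cookies as a list of {"name": ..., "value": ...} dicts.

    Simplified sketch of the dict-or-list handling shown in the hunk above;
    it is not Scrapy's implementation.
    """
    if not cookies:
        # None, {} and [] all mean "no cookies to send".
        return []
    if isinstance(cookies, dict):
        # Short form: {"name": "value"} pairs.
        return [{"name": k, "value": v} for k, v in cookies.items()]
    # Long form: already an iterable of cookie dicts, possibly with
    # extra keys such as "domain" or "path".
    return list(cookies)


print(normalize_request_cookies({"currency": "USD"}))
print(normalize_request_cookies([{"name": "a", "value": "b", "path": "/"}]))
```

After this commit only the `Request.cookies` attribute goes through such a path; the `Cookie` header is no longer read by the method.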
tests/test_downloadermiddleware_cookies.py
@@ -2,6 +2,8 @@ import logging
 from testfixtures import LogCapture
 from unittest import TestCase

+import pytest
+
 from scrapy.downloadermiddlewares.cookies import CookiesMiddleware
 from scrapy.downloadermiddlewares.defaultheaders import DefaultHeadersMiddleware
 from scrapy.exceptions import NotConfigured
@@ -243,6 +245,7 @@ class CookiesMiddlewareTest(TestCase):
         self.assertIn('Cookie', request.headers)
         self.assertEqual(b'currencyCookie=USD', request.headers['Cookie'])

+    @pytest.mark.xfail(reason="Cookie header is not currently being processed")
     def test_keep_cookie_from_default_request_headers_middleware(self):
         DEFAULT_REQUEST_HEADERS = dict(Cookie='default=value; asdf=qwerty')
         mw_default_headers = DefaultHeadersMiddleware(DEFAULT_REQUEST_HEADERS.items())
@@ -257,6 +260,7 @@ class CookiesMiddlewareTest(TestCase):
         assert self.mw.process_request(req2, self.spider) is None
         self.assertCookieValEqual(req2.headers['Cookie'], b'default=value; a=b; asdf=qwerty')

+    @pytest.mark.xfail(reason="Cookie header is not currently being processed")
     def test_keep_cookie_header(self):
         # keep only cookies from 'Cookie' request header
         req1 = Request('http://scrapytest.org', headers={'Cookie': 'a=b; c=d'})
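
The tests in these hunks are kept but marked as expected failures, so the suite stays green while the missing behaviour remains documented. The commit does this with `pytest.mark.xfail`; the sketch below shows the same idea with the standard library's `unittest.expectedFailure` instead, so that it runs without pytest. The test class and its assertion are hypothetical stand-ins, not Scrapy's tests.

```python
import io
import unittest


class CookieHeaderTest(unittest.TestCase):
    @unittest.expectedFailure
    def test_cookie_header_is_processed(self):
        # Hypothetical stand-in for the real middleware check: the feature
        # is known to be missing, so this assertion fails on purpose.
        processed_header = None
        self.assertEqual(processed_header, b"a=b; c=d")


def run_suite():
    """Run the test case quietly and return the unittest result object."""
    suite = unittest.defaultTestLoader.loadTestsFromTestCase(CookieHeaderTest)
    runner = unittest.TextTestRunner(stream=io.StringIO())
    return runner.run(suite)


result = run_suite()
# An expected failure does not count against the run.
print(result.wasSuccessful(), len(result.expectedFailures))
```

If the behaviour is restored later, the marked test starts passing unexpectedly, which signals that the marker can be removed.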
@@ -291,6 +295,7 @@ class CookiesMiddlewareTest(TestCase):
         assert self.mw.process_request(req3, self.spider) is None
         self.assertCookieValEqual(req3.headers['Cookie'], b'a=\xc3\xa1')

+    @pytest.mark.xfail(reason="Cookie header is not currently being processed")
     def test_request_headers_cookie_encoding(self):
         # 1) UTF8-encoded bytes
         req1 = Request('http://example.org', headers={'Cookie': 'a=á'.encode('utf8')})
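
The encoding test above exercises the header decoding that the removed `get_cookies_from_header` helper performed: try UTF-8 first, and fall back to Latin-1 with a warning. A minimal standalone sketch of that fallback, assuming a hypothetical function name `decode_cookie_bytes` and a module-level logger:

```python
import logging

logger = logging.getLogger(__name__)


def decode_cookie_bytes(cookie_bytes: bytes) -> str:
    """Decode one cookie as UTF-8, falling back to Latin-1.

    Hypothetical helper mirroring the try/except in the removed code.
    """
    try:
        return cookie_bytes.decode("utf8")
    except UnicodeDecodeError:
        logger.warning("Non UTF-8 encoded cookie found: %r", cookie_bytes)
        # Latin-1 maps every byte to a character, so this cannot fail.
        return cookie_bytes.decode("latin1", errors="replace")


print(decode_cookie_bytes("a=á".encode("utf8")))
print(decode_cookie_bytes("a=á".encode("latin1")))
```

Both calls produce the same text here because the Latin-1 bytes are not valid UTF-8 and therefore take the fallback branch.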