Unverified · Commit 6f4c964a authored by Adrián Chaves, committed by GitHub

Cover Scrapy 2.2.0 in the release notes (#4630)

Parent 536643ef
@@ -155,6 +155,9 @@ Finally, try to keep aesthetic changes (:pep:`8` compliance, unused imports
removal, etc) in separate commits from functional changes. This will make pull
requests easier to review and more likely to get merged.
.. _coding-style:
Coding style
============
@@ -163,7 +166,7 @@ Scrapy:
* Unless otherwise specified, follow :pep:`8`.
-* It's OK to use lines longer than 80 chars if it improves the code
+* It's OK to use lines longer than 79 chars if it improves the code
readability.
* Don't put your name in the code you contribute; git provides enough
......
@@ -3,6 +3,201 @@
Release notes
=============
.. _release-2.2.0:
Scrapy 2.2.0 (2020-06-24)
-------------------------
Highlights:
* Python 3.5.2+ is required now
* :ref:`dataclass objects <dataclass-items>` and
:ref:`attrs objects <attrs-items>` are now valid :ref:`item types
<item-types>`
* New :meth:`TextResponse.json <scrapy.http.TextResponse.json>` method
* New :signal:`bytes_received` signal that allows canceling response downloads
* :class:`~scrapy.downloadermiddlewares.cookies.CookiesMiddleware` fixes
Backward-incompatible changes
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
* Support for Python 3.5.0 and 3.5.1 has been dropped; Scrapy now refuses to
run with a Python version lower than 3.5.2, which introduced
:class:`typing.Type` (:issue:`4615`)
Deprecations
~~~~~~~~~~~~
* :meth:`TextResponse.body_as_unicode
<scrapy.http.TextResponse.body_as_unicode>` is now deprecated, use
:attr:`TextResponse.text <scrapy.http.TextResponse.text>` instead
(:issue:`4546`, :issue:`4555`, :issue:`4579`)
* :class:`scrapy.item.BaseItem` is now deprecated, use
:class:`scrapy.item.Item` instead (:issue:`4534`)
New features
~~~~~~~~~~~~
* :ref:`dataclass objects <dataclass-items>` and
:ref:`attrs objects <attrs-items>` are now valid :ref:`item types
<item-types>`, and a new itemadapter_ library makes it easy to
write code that :ref:`supports any item type <supporting-item-types>`
(:issue:`2749`, :issue:`2807`, :issue:`3761`, :issue:`3881`, :issue:`4642`)
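As a minimal sketch of what this enables, a spider can now yield instances of a plain dataclass instead of a ``scrapy.Item`` subclass. The ``BookItem`` class below is invented for illustration:

```python
from dataclasses import dataclass, field


# Hypothetical item class; as of Scrapy 2.2 a spider can yield
# instances of it directly, with no scrapy.Item involved.
@dataclass
class BookItem:
    title: str
    price: float = 0.0
    tags: list = field(default_factory=list)
```

Pipelines and middlewares that need to handle dicts, ``Item`` subclasses, dataclass and attrs objects uniformly can wrap them with the itemadapter_ library's dict-like interface.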
* A new :meth:`TextResponse.json <scrapy.http.TextResponse.json>` method
allows deserializing JSON responses (:issue:`2444`, :issue:`4460`,
:issue:`4574`)
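The method parses the response body once and caches the result; a minimal sketch of that behavior (class name, attributes, and sentinel are illustrative, not Scrapy's actual implementation):

```python
import json

# Sentinel: None cannot be used, because JSON "null" decodes to None.
_NONE = object()


class JsonResponse:
    """Sketch of the caching behavior described for TextResponse.json()."""

    def __init__(self, text):
        self.text = text
        self._cached = _NONE

    def json(self):
        # Decode the body only on the first call; reuse the result after.
        if self._cached is _NONE:
            self._cached = json.loads(self.text)
        return self._cached
```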
* A new :signal:`bytes_received` signal allows monitoring response download
progress and :ref:`stopping downloads <topics-stop-response-download>`
(:issue:`4205`, :issue:`4559`)
* The dictionaries in the result list of a :ref:`media pipeline
<topics-media-pipeline>` now include a new key, ``status``, which indicates
if the file was downloaded or, if the file was not downloaded, why it was
not downloaded; see :meth:`FilesPipeline.get_media_requests
<scrapy.pipelines.files.FilesPipeline.get_media_requests>` for more
information (:issue:`2893`, :issue:`4486`)
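For example, ``item_completed`` code can now filter on that key. The helper and sample data below are invented for illustration; in Scrapy, ``results`` is the list of ``(success, info)`` tuples a media pipeline receives:

```python
def downloaded_paths(results):
    """Return paths of files that were actually fetched this crawl."""
    return [
        info["path"]
        for ok, info in results
        if ok and info.get("status") == "downloaded"
    ]
```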
* When using :ref:`Google Cloud Storage <media-pipeline-gcs>` for
a :ref:`media pipeline <topics-media-pipeline>`, a warning is now logged if
the configured credentials do not grant the required permissions
(:issue:`4346`, :issue:`4508`)
* :ref:`Link extractors <topics-link-extractors>` are now serializable,
as long as you do not use :ref:`lambdas <lambda>` for parameters; for
example, you can now pass link extractors in :attr:`Request.cb_kwargs
<scrapy.http.Request.cb_kwargs>` or
:attr:`Request.meta <scrapy.http.Request.meta>` when :ref:`persisting
scheduled requests <topics-jobs>` (:issue:`4554`)
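Serializable here means picklable: an extractor configured only with plain values (no lambdas) survives a pickle round trip. A sketch with a stand-in class, since the real ``scrapy.linkextractors.LinkExtractor`` is not imported here:

```python
import pickle


class Extractor:
    """Stand-in for a link extractor configured without lambdas."""

    def __init__(self, allow=()):
        self.allow = tuple(allow)


# Round-trip through pickle, as job persistence does for cb_kwargs/meta.
restored = pickle.loads(pickle.dumps(Extractor(allow=(r"/product/",))))
```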
* Upgraded the :ref:`pickle protocol <pickle-protocols>` that Scrapy uses
from protocol 2 to protocol 4, improving serialization capabilities and
performance (:issue:`4135`, :issue:`4541`)
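The difference is a one-argument change at each ``pickle.dumps`` call site; protocol 4 (available since Python 3.4) supports larger objects and generally produces smaller, faster pickles than protocol 2:

```python
import pickle

state = {"url": "https://example.com/page", "depth": 3}

# Explicit protocol selection; Scrapy's internals now use protocol 4.
blob = pickle.dumps(state, protocol=4)
assert pickle.loads(blob) == state
```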
* :func:`scrapy.utils.misc.create_instance` now raises a :exc:`TypeError`
exception if the resulting instance is ``None`` (:issue:`4528`,
:issue:`4532`)
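The behavior can be sketched as follows; this is a simplified illustration of the described checks, not the exact body of ``scrapy.utils.misc.create_instance``:

```python
def create_instance(objcls, settings=None, crawler=None, *args, **kwargs):
    if settings is None:
        if crawler is None:
            raise ValueError("Specify at least one of settings and crawler.")
        settings = crawler.settings
    if crawler is not None and hasattr(objcls, "from_crawler"):
        instance = objcls.from_crawler(crawler, *args, **kwargs)
    elif hasattr(objcls, "from_settings"):
        instance = objcls.from_settings(settings, *args, **kwargs)
    else:
        instance = objcls(*args, **kwargs)
    if instance is None:
        # e.g. a from_crawler/from_settings that forgot to return anything
        raise TypeError(f"{objcls.__qualname__} returned None")
    return instance
```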
.. _itemadapter: https://github.com/scrapy/itemadapter
Bug fixes
~~~~~~~~~
* :class:`~scrapy.downloadermiddlewares.cookies.CookiesMiddleware` no longer
discards cookies defined in :attr:`Request.headers
<scrapy.http.Request.headers>` (:issue:`1992`, :issue:`2400`)
* :class:`~scrapy.downloadermiddlewares.cookies.CookiesMiddleware` no longer
re-encodes cookies defined as :class:`bytes` in the ``cookies`` parameter
of the ``__init__`` method of :class:`~scrapy.http.Request`
(:issue:`2400`, :issue:`3575`)
* When :setting:`FEEDS` defines multiple URIs, :setting:`FEED_STORE_EMPTY` is
``False`` and the crawl yields no items, Scrapy no longer stops feed
exports after the first URI (:issue:`4621`, :issue:`4626`)
* :class:`~scrapy.spiders.Spider` callbacks defined using :doc:`coroutine
syntax <topics/coroutines>` no longer need to return an iterable, and may
instead return a :class:`~scrapy.http.Request` object, an
:ref:`item <topics-items>`, or ``None`` (:issue:`4609`)
* The :command:`startproject` command now ensures that the generated project
folders and files have the right permissions (:issue:`4604`)
* Fixed a :exc:`KeyError` exception that was sometimes raised from
:class:`scrapy.utils.datatypes.LocalWeakReferencedCache` (:issue:`4597`,
:issue:`4599`)
* When :setting:`FEEDS` defines multiple URIs, log messages about items being
stored now contain information from the corresponding feed, instead of
always containing information about only one of the feeds (:issue:`4619`,
:issue:`4629`)
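A settings fragment with multiple feed URIs, to make the scenario concrete (the file names are arbitrary):

```python
# settings.py sketch: with FEED_STORE_EMPTY = False and a crawl that
# yields no items, 2.2 finishes (and logs) exports for every URI here
# instead of stopping after the first one.
FEEDS = {
    "items.json": {"format": "json"},
    "items.csv": {"format": "csv"},
}
FEED_STORE_EMPTY = False
```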
Documentation
~~~~~~~~~~~~~
* Added a new section about :ref:`accessing cb_kwargs from errbacks
<errback-cb_kwargs>` (:issue:`4598`, :issue:`4634`)
* Covered chompjs_ in :ref:`topics-parsing-javascript` (:issue:`4556`,
:issue:`4562`)
* Removed from :doc:`topics/coroutines` the warning about the API being
experimental (:issue:`4511`, :issue:`4513`)
* Removed references to unsupported versions of :doc:`Twisted
<twisted:index>` (:issue:`4533`)
* Updated the description of the :ref:`screenshot pipeline example
<ScreenshotPipeline>`, which now uses :doc:`coroutine syntax
<topics/coroutines>` instead of returning a
:class:`~twisted.internet.defer.Deferred` (:issue:`4514`, :issue:`4593`)
* Removed a misleading import line from the
:func:`scrapy.utils.log.configure_logging` code example (:issue:`4510`,
:issue:`4587`)
* The display-on-hover behavior of internal documentation references now also
covers links to :ref:`commands <topics-commands>`, :attr:`Request.meta
<scrapy.http.Request.meta>` keys, :ref:`settings <topics-settings>` and
:ref:`signals <topics-signals>` (:issue:`4495`, :issue:`4563`)
* It is again possible to download the documentation for offline reading
(:issue:`4578`, :issue:`4585`)
* Removed backslashes preceding ``*args`` and ``**kwargs`` in some function
and method signatures (:issue:`4592`, :issue:`4596`)
.. _chompjs: https://github.com/Nykakin/chompjs
Quality assurance
~~~~~~~~~~~~~~~~~
* Adjusted the code base further to our :ref:`style guidelines
<coding-style>` (:issue:`4237`, :issue:`4525`, :issue:`4538`,
:issue:`4539`, :issue:`4540`, :issue:`4542`, :issue:`4543`, :issue:`4544`,
:issue:`4545`, :issue:`4557`, :issue:`4558`, :issue:`4566`, :issue:`4568`,
:issue:`4572`)
* Removed remnants of Python 2 support (:issue:`4550`, :issue:`4553`,
:issue:`4568`)
* Improved code sharing between the :command:`crawl` and :command:`runspider`
commands (:issue:`4548`, :issue:`4552`)
* Replaced ``chain(*iterable)`` with ``chain.from_iterable(iterable)``
(:issue:`4635`)
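The two spellings are equivalent for finite input, but ``chain.from_iterable`` consumes the outer iterable lazily instead of unpacking it all into call arguments:

```python
from itertools import chain

nested = [[1, 2], [3], [4, 5]]

# chain(*nested) materializes every inner iterable as an argument;
# chain.from_iterable also works when the outer iterable is a
# generator or is too large to unpack at once.
flat = list(chain.from_iterable(nested))
assert flat == list(chain(*nested))
```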
* You may now run the :mod:`asyncio` tests with Tox on any Python version
(:issue:`4521`)
* Updated test requirements to reflect an incompatibility with pytest 5.4 and
5.4.1 (:issue:`4588`)
* Improved :class:`~scrapy.spiderloader.SpiderLoader` test coverage for
scenarios involving duplicate spider names (:issue:`4549`, :issue:`4560`)
* Configured Travis CI to also run the tests with Python 3.5.2
(:issue:`4518`, :issue:`4615`)
* Added a `Pylint <https://www.pylint.org/>`_ job to Travis CI
(:issue:`3727`)
* Added a `Mypy <http://mypy-lang.org/>`_ job to Travis CI (:issue:`4637`)
* Made use of set literals in tests (:issue:`4573`)
* Cleaned up the Travis CI configuration (:issue:`4517`, :issue:`4519`,
:issue:`4522`, :issue:`4537`)
.. _release-2.1.0:
Scrapy 2.1.0 (2020-04-24)
......
@@ -201,6 +201,9 @@ For self-hosting you also might feel the need not to use SSL and not to verify S
.. _s3.scality: https://s3.scality.com/
.. _canned ACLs: https://docs.aws.amazon.com/AmazonS3/latest/dev/acl-overview.html#canned-acl
.. _media-pipeline-gcs:
Google Cloud Storage
---------------------
@@ -475,7 +478,11 @@ See here the methods that you can override in your custom Files Pipeline:
* ``checksum`` - a `MD5 hash`_ of the image contents
-* ``status`` - the file status indication. It can be one of the following:
+* ``status`` - the file status indication.
+
+  .. versionadded:: 2.2
+
+  It can be one of the following:
* ``downloaded`` - file was downloaded.
* ``uptodate`` - file was not downloaded, as it was downloaded recently,
......
@@ -191,7 +191,7 @@ Request objects
In case of a failure to process the request, this dict can be accessed as
``failure.request.cb_kwargs`` in the request's errback. For more information,
-see :ref:`topics-request-response-ref-accessing-callback-arguments-in-errback`.
+see :ref:`errback-cb_kwargs`.
.. method:: Request.copy()
@@ -316,7 +316,7 @@ errors if needed::
request = failure.request
self.logger.error('TimeoutError on %s', request.url)
-.. _topics-request-response-ref-accessing-callback-arguments-in-errback:
+.. _errback-cb_kwargs:
Accessing additional data in errback functions
----------------------------------------------
......
@@ -74,6 +74,8 @@ class TextResponse(Response):
def json(self):
"""
.. versionadded:: 2.2
Deserialize a JSON document to a Python object.
"""
if self._cached_decoded_json is _NONE:
......
@@ -138,8 +138,9 @@ def create_instance(objcls, settings, crawler, *args, **kwargs):
Raises ``ValueError`` if both ``settings`` and ``crawler`` are ``None``.
-Raises ``TypeError`` if the resulting instance is ``None`` (e.g. if an
-extension has not been implemented correctly).
+.. versionchanged:: 2.2
+   Raises ``TypeError`` if the resulting instance is ``None`` (e.g. if an
+   extension has not been implemented correctly).
"""
if settings is None:
if crawler is None:
......