Commit bf8dc61f authored by Alex Cepoi

SEP-017 contracts: pretty-printing and docs

Parent 6f1a5d8d
......@@ -129,6 +129,7 @@ Solving specific problems
faq
topics/debug
topics/testing
topics/firefox
topics/firebug
topics/leaks
......@@ -143,6 +144,9 @@ Solving specific problems
:doc:`topics/debug`
Learn how to debug common problems of your scrapy spider.
:doc:`topics/testing`
Learn how to use contracts for testing your spiders.
:doc:`topics/firefox`
Learn how to scrape with Firefox and some useful add-ons.
......
......@@ -142,6 +142,7 @@ Global commands:
Project-only commands:
* :command:`crawl`
* :command:`check`
* :command:`list`
* :command:`edit`
* :command:`parse`
......@@ -221,6 +222,33 @@ Usage examples::
[ ... myspider starts crawling ... ]
.. command:: check
check
-----
* Syntax: ``scrapy check [-l] <spider>``
* Requires project: *yes*
Run contract checks.
Usage examples::
    $ scrapy check -l
    first_spider
      * parse
      * parse_item
    second_spider
      * parse
      * parse_item

    $ scrapy check
    [FAILED] first_spider:parse_item
    >>> 'RetailPricex' field is missing

    [FAILED] first_spider:parse
    >>> Returned 92 requests, expected 0..4
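The checks are driven by contracts declared in each callback's docstring (see
:ref:`topics-testing`). As an illustrative sketch only (the URL, bounds and
field names below are made up), a callback verified by this command might look
like::

    def parse_item(self, response):
        """ Parse a sample product page.

        @url http://www.example.com/product/1
        @returns items 1 1
        @returns requests 0 4
        @scrapes name price
        """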
.. command:: server
server
......
......@@ -832,6 +832,30 @@ The scheduler to use for crawling.
.. setting:: SPIDER_MIDDLEWARES
SPIDER_CONTRACTS
----------------
Default: ``{}``
A dict containing the Scrapy contracts enabled in your project, used for
testing spiders. For more info see :ref:`topics-testing`.
SPIDER_CONTRACTS_BASE
---------------------
Default::
    {
        'scrapy.contracts.default.UrlContract': 1,
        'scrapy.contracts.default.ReturnsContract': 2,
        'scrapy.contracts.default.ScrapesContract': 3,
    }
A dict containing the Scrapy contracts enabled by default in Scrapy. You should
never modify this setting in your project; modify :setting:`SPIDER_CONTRACTS`
instead. For more info see :ref:`topics-testing`.
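For example, a project could enable an additional contract on top of the
defaults by adding it to its ``settings.py`` (the class path below is
illustrative)::

    # settings.py -- merged with SPIDER_CONTRACTS_BASE, so the default
    # contracts stay enabled alongside the project-specific one below
    SPIDER_CONTRACTS = {
        'myproject.contracts.ResponseCheck': 10,
    }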
SPIDER_MIDDLEWARES
------------------
......
.. _topics-testing:
===============
Testing Spiders
===============
Testing spiders can get particularly annoying and, while nothing prevents you
from writing unit tests, the task gets cumbersome quickly. Scrapy offers an
integrated way of testing your spiders by means of contracts.
This allows you to test each callback of your spider by hardcoding a sample URL
and checking various constraints for how the callback processes the response.
Each contract is prefixed with an ``@`` and included in the callback's
docstring. See the following example::
    def parse(self, response):
        """ This function parses a sample response. Some contracts are mingled
        with this docstring.

        @url http://www.amazon.com/s?field-keywords=selfish+gene
        @returns items 1 16
        @returns requests 0 0
        @scrapes Title Author Year Price
        """
This callback is tested using three built-in contracts:
.. module:: scrapy.contracts.default
.. class:: UrlContract
This contract (``@url``) sets the sample URL used when checking other
contract conditions for this spider. This contract is mandatory. All
callbacks lacking this contract are ignored when running the checks::
@url url
.. class:: ReturnsContract
This contract (``@returns``) sets lower and upper bounds for the items and
requests returned by the spider. The upper bound is optional::
@returns item(s)|request(s) [min [max]]
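For instance, a callback expected to return between one and four items and no
requests at all might declare::

    @returns items 1 4
    @returns requests 0 0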
.. class:: ScrapesContract
This contract (``@scrapes``) checks that all the items returned by the
callback have the specified fields::
@scrapes field_1 field_2 ...
Use the :command:`check` command to run the contract checks.
Custom Contracts
================
If you find you need more power than the built-in Scrapy contracts, you can
create and load your own contracts in the project by using the
:setting:`SPIDER_CONTRACTS` setting::
    SPIDER_CONTRACTS = {
        'myproject.contracts.ResponseCheck': 10,
        'myproject.contracts.ItemValidate': 10,
    }
Each contract must inherit from :class:`scrapy.contracts.Contract` and can
override three methods:
.. module:: scrapy.contracts
.. class:: Contract(method, \*args)
:param method: callback function to which the contract is associated
:type method: function
:param args: list of arguments passed into the docstring (whitespace
separated)
:type args: list
.. method:: Contract.adjust_request_args(args)
This receives a ``dict`` as an argument containing the default arguments for
the :class:`~scrapy.http.Request` object. It must return the same or a
modified version of it.
.. method:: Contract.pre_process(response)
This allows hooking in various checks on the response received from the
sample request, before it is passed to the callback.
.. method:: Contract.post_process(output)
This allows processing the output of the callback. Iterators are
converted to lists before being passed to this hook.
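For example, a contract might override ``adjust_request_args`` to tweak how the
sample request is built. A hypothetical sketch follows (the contract name and
header handling are illustrative, and it assumes the dict is used as keyword
arguments for :class:`~scrapy.http.Request`)::

    from scrapy.contracts import Contract

    class UserAgentContract(Contract):
        """ Hypothetical contract setting a custom User-Agent on the sample request
        @user_agent Mozilla/5.0
        """
        name = 'user_agent'

        def adjust_request_args(self, args):
            # args holds the default keyword arguments for the sample request;
            # return a modified version to change how that request is built
            args['headers'] = {'User-Agent': ' '.join(self.args)}
            return args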
Here is a demo contract which checks the presence of a custom header in the
response received. Raise :class:`scrapy.exceptions.ContractFail` in order to
get the failures pretty printed::
    from scrapy.contracts import Contract
    from scrapy.exceptions import ContractFail

    class HasHeaderContract(Contract):
        """ Demo contract which checks the presence of a custom header
        @has_header X-CustomHeader
        """
        name = 'has_header'

        def pre_process(self, response):
            for header in self.args:
                if header not in response.headers:
                    raise ContractFail('%s not present' % header)
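Once enabled through :setting:`SPIDER_CONTRACTS`, this contract would be
referenced by its ``name`` in a callback docstring; a rough sketch (the URL is
illustrative)::

    def parse(self, response):
        """ Parse the sample page and require the custom header.

        @url http://www.example.com/
        @has_header X-CustomHeader
        """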
......@@ -60,7 +60,9 @@ class Contract(object):
         cb = request.callback
         @wraps(cb)
         def wrapper(response):
-            self.pre_process(response)
+            try: self.pre_process(response)
+            except ContractFail as e:
+                print e.format(self.method)
             return list(iterate_spider_output(cb(response)))
         request.callback = wrapper
......@@ -71,7 +73,9 @@ class Contract(object):
         @wraps(cb)
         def wrapper(response):
             output = list(iterate_spider_output(cb(response)))
-            self.post_process(output)
+            try: self.post_process(output)
+            except ContractFail as e:
+                print e.format(self.method)
             return output
         request.callback = wrapper
......
......@@ -52,4 +52,7 @@ class ScrapyDeprecationWarning(Warning):
 class ContractFail(AssertionError):
     """Error raised in case of a failing contract"""
     pass
+    def format(self, method):
+        return '[FAILED] %s:%s\n>>> %s\n' % \
+            (method.im_class.name, method.__name__, self)