Commit f8f0db8b authored by Pablo Hoffman

doc: several more improvements

--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40635
Parent 137429e5
......@@ -4,7 +4,7 @@ Frequently Asked Questions
==========================
How does Scrapy compare to BeautifulSoup or lxml?
------------------------------------------------
-------------------------------------------------
`BeautifulSoup`_ and `lxml`_ are libraries for parsing HTML and XML. Scrapy is
an application framework for writing web spiders that crawl web sites and
......
......@@ -9,8 +9,8 @@ The basic idea of scrapy is to be a robot that goes through websites, crawling p
The framework is formed by components that take care of different activities.
These components are basically:
* :ref:`spiders`
* :ref:`selectors`
* :ref:`topics-spiders`
* :ref:`topics-selectors`
* Items
* Adaptors
......
......@@ -10,6 +10,8 @@ Finishing the job
To make it simple, we'll export the scraped items to a CSV file using a handy function that Scrapy provides: *items_to_csv*.
This simple function takes a file descriptor or filename and a list of items, and writes their attributes to that file in CSV format.
.. highlight:: python
Let's see how our spider would end up looking after applying this change::
# -*- coding: utf8 -*-
......
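As a rough, hedged illustration of the behaviour described above (this is not Scrapy's actual implementation of *items_to_csv*, only a sketch of a function that takes a file descriptor or filename plus a list of items and writes one CSV row per item)::

    import csv

    def items_to_csv_sketch(file_or_name, items):
        """Illustrative sketch only -- not Scrapy's real items_to_csv."""
        # accept either an already-open file object or a filename,
        # as the tutorial text says
        close_after = isinstance(file_or_name, str)
        out = open(file_or_name, "w", newline="") if close_after else file_or_name
        try:
            writer = csv.writer(out)
            for item in items:
                # one CSV row per item, one column per public attribute
                writer.writerow(v for k, v in vars(item).items()
                                if not k.startswith("_"))
        finally:
            if close_after:
                out.close()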
.. _exceptions:
.. module:: scrapy.core.exceptions
:synopsis: Exceptions definitions
Available Exceptions
====================
......@@ -20,7 +23,7 @@ DropItem
--------
The exception that must be raised by item pipeline stages to stop processing an
Item. For more information see :topic:`item-pipeline`.
Item. For more information see :ref:`topics-item-pipeline`.
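As a hedged sketch of how a pipeline stage uses this exception (the ``process_item`` method name and signature below are an assumption for illustration; the pipeline interface itself is not part of this excerpt)::

    from scrapy.core.exceptions import DropItem

    class PriceRequiredPipeline(object):
        # NOTE: the hook name/signature is assumed; check the item pipeline
        # docs for the exact interface expected by your Scrapy version.
        def process_item(self, item, spider):
            if not getattr(item, 'price', None):
                # raising DropItem stops further pipeline processing of this item
                raise DropItem("Missing price in item")
            return item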
.. exception:: NotConfigured
......
.. _signals:
.. module:: scrapy.core.signals
:synopsis: Signals definitions
Available Signals
=================
......@@ -8,56 +11,53 @@ catch some of those signals in your Scrapy project or extension to perform
additional tasks or extend Scrapy to add functionality not provided out of the
box.
Even though signals provide several arguments, the handlers which catch them
don't have to receive all of them.
For more information about working with signals see the documentation of
`pydispatcher`_ (library used to implement signals).
.. _pydispatcher: http://pydispatcher.sourceforge.net/
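For example, a handler can be connected through pydispatcher's ``dispatcher.connect``; the sketch below assumes the signal objects are importable from ``scrapy.core.signals`` (the module documented here) and, as noted above, declares only the arguments it cares about::

    from pydispatch import dispatcher
    from scrapy.core import signals

    def log_closed_domain(domain):
        # this handler ignores the ``spider`` argument that domain_closed
        # also provides -- handlers only receive what they declare
        print("domain closed: %s" % domain)

    dispatcher.connect(log_closed_domain, signal=signals.domain_closed)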
Here's a list of signals used in Scrapy and their meaning, in alphabetical
order.
.. signal:: domain_closed
domain_closed
-------------
Arguments:
* ``domain`` - the domain (of the spider) which has been closed
* ``spider`` - the spider which has been closed
.. function:: domain_closed(domain, spider)
Sent right after a spider/domain has been closed.
.. signal:: domain_open
domain_open
-----------
``domain`` is a string which contains the domain of the spider which has been closed
``spider`` is the spider which has been closed
Arguments:
* ``domain`` - the domain (of the spider) which is about to be opened
* ``spider`` - the spider which is about to be opened
.. signal:: domain_open
.. function:: domain_open(domain, spider)
Sent right before a spider is opened for crawling.
.. signal:: domain_opened
domain_opened
-------------
``domain`` is a string which contains the domain of the spider which is about
to be opened
``spider`` is the spider which is about to be opened
Arguments:
* ``domain`` - the domain (of the spider) which has been opened
* ``spider`` - the spider which has been opened
.. signal:: domain_opened
.. function:: domain_opened(domain, spider)
Sent right after a spider has been opened for crawling.
.. signal:: domain_idle
domain_idle
-----------
``domain`` is a string with the domain of the spider which has been opened
``spider`` is the spider which has been opened
Arguments:
* ``domain`` - the domain (of the spider) which has gone idle
* ``spider`` - the spider which has gone idle
.. signal:: domain_idle
.. function:: domain_idle(domain, spider)
Sent when a domain has no further:
* requests waiting to be downloaded
* requests scheduled
* items being processed in the item pipeline
``domain`` is a string with the domain of the spider which has gone idle
``spider`` is the spider which has gone idle
If any handler of this signal raises a :exception:`DontCloseDomain` the domain
won't be closed at this time and will wait until another idle signal is sent.
Otherwise (if no handler raises :exception:`DontCloseDomain`) the domain will
......@@ -65,111 +65,85 @@ be closed immediately after all handlers of ``domain_idle`` have finished, and
a :signal:`domain_closed` will thus be sent.
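A hedged sketch of the behaviour just described: an extension that keeps an idle domain open by raising :exception:`DontCloseDomain` (the import path and the ``still_has_pending_work`` helper below are assumptions for illustration)::

    from pydispatch import dispatcher
    from scrapy.core import signals
    from scrapy.core.exceptions import DontCloseDomain  # import path assumed

    def keep_domain_open(domain, spider):
        if still_has_pending_work(domain):  # hypothetical helper
            # postpone closing; another domain_idle signal will be sent
            # later and this check will run again
            raise DontCloseDomain

    dispatcher.connect(keep_domain_open, signal=signals.domain_idle)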
.. signal:: engine_started
engine_started
--------------
Arguments: ``None``
.. function:: engine_started()
Sent when the Scrapy engine is started (for example, when a crawling
process has started).
.. signal:: engine_stopped
engine_stopped
--------------
Arguments: ``None``
.. function:: engine_stopped()
Sent when the Scrapy engine is stopped (for example, when a crawling
process has finished).
.. signal:: request_received
.. function:: request_received(request, spider, response)
request_received
----------------
Arguments:
* ``request`` - the ``HTTPRequest`` received
* ``spider`` - the spider which generated the request
* ``response`` - the ``HTTPResponse`` fed to the spider which generated the
request
Sent when the engine receives a :class:`~scrapy.http.Request` from a spider.
Sent when the engine receives a ``HTTPRequest`` from a spider.
``request`` is the :class:`~scrapy.http.Request` received
``spider`` is the spider which generated the request
``response`` is the :class:`~scrapy.http.Response` fed to the spider which
generated the request
.. signal:: request_uploaded
.. function:: request_uploaded(request, spider)
request_uploaded
----------------
Sent right after the downloader has sent a :class:`~scrapy.http.Request`.
Arguments:
* ``request`` - the ``HTTPRequest`` uploaded/sent
* ``spider`` - the spider which generated the request
Sent right after the downloader has sent a ``HTTPRequest``.
``request`` is the :class:`~scrapy.http.Request` uploaded/sent
``spider`` is the spider which generated the request
.. signal:: response_received
.. function:: response_received(response, spider)
response_received
-----------------
Arguments:
* ``response`` - the ``HTTPResponse`` received
* ``spider`` - the spider for which the response is intended
``response`` is the :class:`~scrapy.http.Response` received
``spider`` is the spider for which the response is intended
Sent when the engine receives a new ``HTTPResponse`` from the downloader.
Sent when the engine receives a new :class:`~scrapy.http.Response` from the
downloader.
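For illustration, a minimal, hedged handler for this signal, connected the same way as the earlier sketch::

    from pydispatch import dispatcher
    from scrapy.core import signals

    def log_response(response, spider):
        # runs every time the engine receives a response from the downloader
        print("response received: %s" % response.url)

    dispatcher.connect(log_response, signal=signals.response_received)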
.. signal:: response_downloaded
response_downloaded
-------------------
Arguments:
* ``response`` - the ``HTTPResponse`` downloaded
* ``spider`` - the spider for which the response is intended
.. function:: response_downloaded(response, spider)
Sent by the downloader right after a ``HTTPResponse`` is downloaded.
.. signal:: item_scraped
item_scraped
------------
``response`` is the ``HTTPResponse`` downloaded
``spider`` is the spider for which the response is intended
Arguments:
* ``item`` - the item scraped
* ``spider`` - the spider which scraped the item
* ``response`` - the response from which the item was scraped
.. signal:: item_scraped
.. function:: item_scraped(item, spider, response)
Sent when the engine receives a new scraped item from the spider, and right
before the item is sent to the :topic:`item-pipeline`.
.. signal:: item_passed
before the item is sent to the :ref:`topics-item-pipeline`.
item_passed
-----------
``item`` is the item scraped
``spider`` is the spider which scraped the item
``response`` is the :class:`~scrapy.http.Response` from which the item was
scraped
Arguments:
* ``item`` - the item passed
* ``spider`` - the spider which scraped the item
* ``response`` - the response from which the item was scraped
* ``pipe_output`` - the output of the item pipeline. Typically, this points to
the same ``item`` object, unless some pipeline stage created a new item.
.. signal:: item_passed
.. function:: item_passed(item, spider, response, pipe_output)
Sent after an item has passed all the :topic:`item-pipeline` stages without
Sent after an item has passed all the :ref:`topics-item-pipeline` stages without
being dropped.
.. signal:: item_dropped
item_dropped
------------
``item`` is the item which passed the pipeline
``spider`` is the spider which scraped the item
``response`` is the :class:`~scrapy.http.Response` from which the item was scraped
``pipe_output`` is the output of the item pipeline. Typically, this points to
the same ``item`` object, unless some pipeline stage created a new item.
Arguments:
* ``item`` - the item dropped
* ``spider`` - the spider which scraped the item
* ``response`` - the response from which the item was scraped
* ``exception`` - the exception that caused the item to be dropped (which must inherit from :exception:`DropItem`)
.. signal:: item_dropped
.. function:: item_dropped(item, spider, response, exception)
Sent after an item has been dropped from the :topic:`item-pipeline` when some stage
Sent after an item has been dropped from the :ref:`topics-item-pipeline` when some stage
raised a :exception:`DropItem` exception.
``item`` is the item dropped from the :ref:`topics-item-pipeline`
``spider`` is the spider which scraped the item
``response`` is the :class:`~scrapy.http.Response` from which the item was scraped
``exception`` is the (:exception:`DropItem` child) exception that caused the
item to be dropped
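A combined, hedged sketch for the item signals described above; as before, each handler declares only the arguments it needs::

    from pydispatch import dispatcher
    from scrapy.core import signals

    def log_scraped(item, spider):
        print("scraped %r from %s" % (item, spider))

    def log_dropped(item, exception):
        # ``exception`` is the DropItem (or subclass) that caused the drop
        print("dropped %r: %s" % (item, exception))

    dispatcher.connect(log_scraped, signal=signals.item_scraped)
    dispatcher.connect(log_dropped, signal=signals.item_dropped)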
.. _topics-downloader-middleware:
=====================
Downloader Middleware
=====================
......
.. _topics-item-pipeline:
=============
Item Pipeline
=============
......
.. _topics-robotstxt:
==================
Obeying robots.txt
==================
Scrapy deals with robots.txt files using a :topic:`downloader-middleware`
Scrapy deals with robots.txt files using a :ref:`topics-downloader-middleware`
called `RobotsTxtMiddleware`.
To make sure Scrapy respects robots.txt files make sure the following
......
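The exact setting is elided in this excerpt; purely for illustration, in later Scrapy releases the equivalent switch is a boolean project setting::

    # settings.py -- illustrative only; the setting name used at this
    # revision of the docs is not shown above. In later Scrapy releases it is:
    ROBOTSTXT_OBEY = True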
.. _selectors:
.. _topics-selectors:
Selectors
---------
......
.. _topics-settings:
========
Settings
========
......
.. _spiders:
.. _topics-spiders:
=======
Spiders
-------
=======
We'll start off with the spiders, because they're the ones that actually use the other components, and they are in turn used by Scrapy's core, so they're the first thing you should know about.
Spiders are, so to speak, little programs whose purpose is to scrape information from HTML pages or other data sources. Having said that, their process is roughly as follows:
......
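To make the description above concrete, here is a minimal, hedged spider sketch; it uses the modern ``scrapy.Spider`` interface (``name``, ``start_urls``, ``parse``), which may differ from the spider class used at this revision of the docs::

    import scrapy

    class ExampleSpider(scrapy.Spider):
        # fetch pages, extract data, and (optionally) follow more links --
        # the "process" the overview describes
        name = "example"
        start_urls = ["http://example.com/"]

        def parse(self, response):
            for title in response.css("h1::text").getall():
                yield {"title": title}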