Commit 5a58d641 authored by Shadab Zafar, committed by Julia Medina

Fix some redirection links in documentation

Fixes #606
Parent 4c11201d
@@ -173,10 +173,10 @@ And their unit-tests are in::
     tests/test_contrib_loader.py
 
 .. _issue tracker: https://github.com/scrapy/scrapy/issues
-.. _scrapy-users: http://groups.google.com/group/scrapy-users
+.. _scrapy-users: https://groups.google.com/forum/#!forum/scrapy-users
 .. _Twisted unit-testing framework: http://twistedmatrix.com/documents/current/core/development/policy/test-standard.html
 .. _AUTHORS: https://github.com/scrapy/scrapy/blob/master/AUTHORS
 .. _tests/: https://github.com/scrapy/scrapy/tree/master/tests
 .. _open issues: https://github.com/scrapy/scrapy/issues
-.. _pull request: http://help.github.com/send-pull-requests/
+.. _pull request: https://help.github.com/send-pull-requests/
 .. _tox: https://pypi.python.org/pypi/tox
@@ -21,8 +21,8 @@ comparing `jinja2`_ to `Django`_.
 .. _BeautifulSoup: http://www.crummy.com/software/BeautifulSoup/
 .. _lxml: http://lxml.de/
-.. _jinja2: http://jinja.pocoo.org/2/
-.. _Django: http://www.djangoproject.com
+.. _jinja2: http://jinja.pocoo.org/
+.. _Django: https://www.djangoproject.com/
 
 .. _faq-python-versions:
@@ -57,7 +57,7 @@ focus on the real problems we need to solve.
 We'd be proud if Scrapy serves as an inspiration for other projects. Feel free
 to steal from us!
 
-.. _Django: http://www.djangoproject.com
+.. _Django: https://www.djangoproject.com/
 
 Does Scrapy work with HTTP proxies?
 -----------------------------------
@@ -221,7 +221,7 @@ more info on how it works see `this page`_. Also, here's an `example spider`_
 which scrapes one of these sites.
 
 .. _this page: http://search.cpan.org/~ecarroll/HTML-TreeBuilderX-ASP_NET-0.09/lib/HTML/TreeBuilderX/ASP_NET.pm
-.. _example spider: http://github.com/AmbientLighter/rpn-fas/blob/master/fas/spiders/rnp.py
+.. _example spider: https://github.com/AmbientLighter/rpn-fas/blob/master/fas/spiders/rnp.py
 
 What's the best way to parse big XML/CSV data feeds?
 ----------------------------------------------------
@@ -18,8 +18,8 @@ Having trouble? We'd like to help!
 * Ask a question in the `#scrapy IRC channel`_.
 * Report bugs with Scrapy in our `issue tracker`_.
 
-.. _archives of the scrapy-users mailing list: http://groups.google.com/group/scrapy-users/
-.. _post a question: http://groups.google.com/group/scrapy-users/
+.. _archives of the scrapy-users mailing list: https://groups.google.com/forum/#!forum/scrapy-users
+.. _post a question: https://groups.google.com/forum/#!forum/scrapy-users
 .. _#scrapy IRC channel: irc://irc.freenode.net/scrapy
 .. _issue tracker: https://github.com/scrapy/scrapy/issues
@@ -21,5 +21,5 @@ middlewares, extensions, or scripts. Feel free (and encouraged!) to share any
 code there.
 
 .. _dirbot: https://github.com/scrapy/dirbot
-.. _Downloads: https://github.com/scrapy/dirbot/archives/master
+.. _Downloads: https://github.com/scrapy/dirbot/downloads
 .. _scrapy tag on Snipplr: http://snipplr.com/all/tags/scrapy/
@@ -37,7 +37,7 @@ Platform specific installation notes
 Windows
 -------
 
-* Install Python 2.7 from http://python.org/download/
+* Install Python 2.7 from https://www.python.org/downloads/
 
   You need to adjust ``PATH`` environment variable to include paths to
   the Python executable and additional scripts. The following paths need to be
@@ -87,8 +87,8 @@ You can follow the generic instructions or install Scrapy from `AUR Scrapy packa
     yaourt -S scrapy
 
-.. _Python: http://www.python.org
-.. _pip: http://www.pip-installer.org/en/latest/installing.html
+.. _Python: https://www.python.org/
+.. _pip: https://pip.pypa.io/en/latest/installing.html
 .. _easy_install: http://pypi.python.org/pypi/setuptools
 .. _Control Panel: http://www.microsoft.com/resources/documentation/windows/xp/all/proddocs/en-us/sysdm_advancd_environmnt_addchange_variable.mspx
 .. _lxml: http://lxml.de/
@@ -258,7 +258,7 @@ interest!
 .. _the community: http://scrapy.org/community/
 .. _screen scraping: http://en.wikipedia.org/wiki/Screen_scraping
 .. _web scraping: http://en.wikipedia.org/wiki/Web_scraping
-.. _Amazon Associates Web Services: http://aws.amazon.com/associates/
+.. _Amazon Associates Web Services: https://affiliate-program.amazon.com/gp/advertising/api/detail/main.html
 .. _Mininova: http://www.mininova.org
 .. _XPath: http://www.w3.org/TR/xpath
 .. _XPath reference: http://www.w3.org/TR/xpath
@@ -26,8 +26,8 @@ Python quickly, we recommend `Learn Python The Hard Way`_. If you're new to pro
 and want to start with Python, take a look at `this list of Python resources
 for non-programmers`_.
 
-.. _Python: http://www.python.org
-.. _this list of Python resources for non-programmers: http://wiki.python.org/moin/BeginnersGuide/NonProgrammers
+.. _Python: https://www.python.org/
+.. _this list of Python resources for non-programmers: https://wiki.python.org/moin/BeginnersGuide/NonProgrammers
 .. _Learn Python The Hard Way: http://learnpythonthehardway.org/book/
 
 Creating a project
@@ -578,7 +578,7 @@ Scrapy changes:
 ------
 
 - added precise to supported ubuntu distros (:commit:`b7e46df`)
-- fixed bug in json-rpc webservice reported in https://groups.google.com/d/topic/scrapy-users/qgVBmFybNAQ/discussion. also removed no longer supported 'run' command from extras/scrapy-ws.py (:commit:`340fbdb`)
+- fixed bug in json-rpc webservice reported in https://groups.google.com/forum/#!topic/scrapy-users/qgVBmFybNAQ/discussion. also removed no longer supported 'run' command from extras/scrapy-ws.py (:commit:`340fbdb`)
 - meta tag attributes for content-type http equiv can be in any order. #123 (:commit:`0cb68af`)
 - replace "import Image" by more standard "from PIL import Image". closes #88 (:commit:`4d17048`)
 - return trial status as bin/runtests.sh exit value. #118 (:commit:`b7b2e7f`)
@@ -902,14 +902,14 @@ Backwards-incompatible changes
 First release of Scrapy.
 
-.. _AJAX crawleable urls: http://code.google.com/web/ajaxcrawling/docs/getting-started.html
+.. _AJAX crawleable urls: https://developers.google.com/webmasters/ajax-crawling/docs/getting-started?csw=1
 .. _chunked transfer encoding: http://en.wikipedia.org/wiki/Chunked_transfer_encoding
 .. _w3lib: https://github.com/scrapy/w3lib
 .. _scrapely: https://github.com/scrapy/scrapely
-.. _marshal: http://docs.python.org/library/marshal.html
+.. _marshal: https://docs.python.org/2/library/marshal.html
 .. _w3lib.encoding: https://github.com/scrapy/w3lib/blob/master/w3lib/encoding.py
 .. _lxml: http://lxml.de/
 .. _ClientForm: http://wwwsearch.sourceforge.net/old/ClientForm/
-.. _resource: http://docs.python.org/library/resource.html
+.. _resource: https://docs.python.org/2/library/resource.html
 .. _queuelib: https://github.com/scrapy/queuelib
 .. _cssselect: https://github.com/SimonSapin/cssselect
@@ -484,7 +484,7 @@ You can also add your custom project commands by using the
 :setting:`COMMANDS_MODULE` setting. See the Scrapy commands in
 `scrapy/commands`_ for examples on how to implement your commands.
 
-.. _scrapy/commands: https://github.com/scrapy/scrapy/blob/master/scrapy/commands
+.. _scrapy/commands: https://github.com/scrapy/scrapy/tree/master/scrapy/commands
 
 .. setting:: COMMANDS_MODULE
 
 COMMANDS_MODULE
@@ -451,7 +451,7 @@ In order to use this storage backend:
 * install `LevelDB python bindings`_ like ``pip install leveldb``
 
 .. _LevelDB: http://code.google.com/p/leveldb/
-.. _leveldb python bindings: http://pypi.python.org/pypi/leveldb
+.. _leveldb python bindings: https://pypi.python.org/pypi/leveldb
 
 HTTPCache middleware settings
@@ -635,8 +635,8 @@ HttpProxyMiddleware
 You can also set the meta key ``proxy`` per-request, to a value like
 ``http://some_proxy_server:port``.
 
-.. _urllib: http://docs.python.org/library/urllib.html
-.. _urllib2: http://docs.python.org/library/urllib2.html
+.. _urllib: https://docs.python.org/2/library/urllib.html
+.. _urllib2: https://docs.python.org/2/library/urllib2.html
 
 RedirectMiddleware
 ------------------
@@ -890,5 +890,5 @@ enable it for :ref:`broad crawls <topics-broad-crawls>`.
 
 .. _DBM: http://en.wikipedia.org/wiki/Dbm
-.. _anydbm: http://docs.python.org/library/anydbm.html
+.. _anydbm: https://docs.python.org/2/library/anydbm.html
 .. _chunked transfer encoding: http://en.wikipedia.org/wiki/Chunked_transfer_encoding
@@ -14,7 +14,7 @@ interfering with the non-blocking IO of the crawler. It also provides a
 simple API for sending attachments and it's very easy to configure, with a few
 :ref:`settings <topics-email-settings>`.
 
-.. _smtplib: http://docs.python.org/library/smtplib.html
+.. _smtplib: https://docs.python.org/2/library/smtplib.html
 .. _Twisted non-blocking IO: http://twistedmatrix.com/documents/current/core/howto/defer-intro.html
 
 Quick example
@@ -297,7 +297,7 @@ CsvItemExporter
    Color TV,1200
    DVD player,200
 
-.. _csv.writer: http://docs.python.org/library/csv.html#csv.writer
+.. _csv.writer: https://docs.python.org/2/library/csv.html#csv.writer
 
 PickleItemExporter
 ------------------
@@ -318,7 +318,7 @@ PickleItemExporter
 Pickle isn't a human readable format, so no output examples are provided.
 
-.. _pickle module documentation: http://docs.python.org/library/pickle.html
+.. _pickle module documentation: https://docs.python.org/2/library/pickle.html
 
 PprintItemExporter
 ------------------
@@ -367,7 +367,7 @@ JsonItemExporter
    stream-friendly format, consider using :class:`JsonLinesItemExporter`
    instead, or splitting the output in multiple chunks.
 
-.. _JSONEncoder: http://docs.python.org/library/json.html#json.JSONEncoder
+.. _JSONEncoder: https://docs.python.org/2/library/json.html#json.JSONEncoder
 
 JsonLinesItemExporter
 ---------------------
@@ -390,4 +390,4 @@ JsonLinesItemExporter
 Unlike the one produced by :class:`JsonItemExporter`, the format produced by
 this exporter is well suited for serializing large amounts of data.
 
-.. _JSONEncoder: http://docs.python.org/library/json.html#json.JSONEncoder
+.. _JSONEncoder: https://docs.python.org/2/library/json.html#json.JSONEncoder
@@ -368,5 +368,5 @@ For more info see `Debugging in Python`.
 This extension only works on POSIX-compliant platforms (ie. not Windows).
 
-.. _Python debugger: http://docs.python.org/library/pdb.html
+.. _Python debugger: https://docs.python.org/2/library/pdb.html
 .. _Debugging in Python: http://www.ferg.org/papers/debugging_in_python.html
@@ -32,7 +32,7 @@ you to inspect the HTML code of the different page elements just by hovering
 your mouse over them. Otherwise you would have to search for the tags manually
 through the HTML body which can be a very tedious task.
 
-.. _Inspect Element: http://www.youtube.com/watch?v=-pT_pDe54aA
+.. _Inspect Element: https://www.youtube.com/watch?v=-pT_pDe54aA
 
 In the following screenshot you can see the `Inspect Element`_ tool in action.
@@ -164,4 +164,4 @@ elements.
    or tags which Therefer in page HTML
    sources may on Firebug inspects the live DOM
 
-.. _has been shut down by Google: http://searchenginewatch.com/article/2096661/Google-Directory-Has-Been-Shut-Down
+.. _has been shut down by Google: http://searchenginewatch.com/sew/news/2096661/google-directory-shut
@@ -74,9 +74,9 @@ extension to create a new cookie, delete existing cookies, see a list of cookies
 for the current site, manage cookies permissions and a lot more.
 
 .. _Firebug: http://getfirebug.com
-.. _Inspect Element: http://www.youtube.com/watch?v=-pT_pDe54aA
-.. _XPather: https://addons.mozilla.org/firefox/addon/1192
-.. _XPath Checker: https://addons.mozilla.org/firefox/addon/1095
-.. _Tamper Data: http://addons.mozilla.org/firefox/addon/966
-.. _Firecookie: https://addons.mozilla.org/firefox/addon/6683
+.. _Inspect Element: https://www.youtube.com/watch?v=-pT_pDe54aA
+.. _XPather: https://addons.mozilla.org/en-US/firefox/addon/xpather/
+.. _XPath Checker: https://addons.mozilla.org/en-US/firefox/addon/xpath-checker/
+.. _Tamper Data: https://addons.mozilla.org/en-US/firefox/addon/tamper-data/
+.. _Firecookie: https://addons.mozilla.org/en-US/firefox/addon/firecookie/
@@ -30,7 +30,7 @@ so you need to install this library in order to use the images pipeline.
    is known to cause troubles in some setups, so we recommend to use `Pillow`_
    instead of `PIL <Python Imaging Library>`_.
 
-.. _Pillow: https://github.com/python-imaging/Pillow
+.. _Pillow: https://github.com/python-pillow/Pillow
 .. _Python Imaging Library: http://www.pythonware.com/products/pil/
 
 Using the Images Pipeline
@@ -104,7 +104,7 @@ Images Storage
 File system is currently the only officially supported storage, but there is
 also (undocumented) support for `Amazon S3`_.
 
-.. _Amazon S3: https://s3.amazonaws.com/
+.. _Amazon S3: http://aws.amazon.com/s3/
 
 File system storage
 -------------------
@@ -15,7 +15,7 @@ purpose.
 They provide a `dictionary-like`_ API with a convenient syntax for declaring
 their available fields.
 
-.. _dictionary-like: http://docs.python.org/library/stdtypes.html#dict
+.. _dictionary-like: https://docs.python.org/2/library/stdtypes.html#dict
 
 .. _topics-items-declaring:
@@ -37,8 +37,8 @@ objects. Here is an example::
    declared similar to `Django Models`_, except that Scrapy Items are much
    simpler as there is no concept of different field types.
 
-.. _Django: http://www.djangoproject.com/
-.. _Django Models: http://docs.djangoproject.com/en/dev/topics/db/models/
+.. _Django: https://www.djangoproject.com/
+.. _Django Models: https://docs.djangoproject.com/en/dev/topics/db/models/
 
 .. _topics-items-fields:
@@ -214,7 +214,7 @@ Item objects
    :class:`Field` objects used in the :ref:`Item declaration
    <topics-items-declaring>`.
 
-.. _dict API: http://docs.python.org/library/stdtypes.html#dict
+.. _dict API: https://docs.python.org/2/library/stdtypes.html#dict
 
 Field objects
 =============
@@ -227,6 +227,6 @@ Field objects
    to support the :ref:`item declaration syntax <topics-items-declaring>`
    based on class attributes.
 
-.. _dict: http://docs.python.org/library/stdtypes.html#dict
+.. _dict: https://docs.python.org/2/library/stdtypes.html#dict
@@ -203,7 +203,7 @@ other cases where the memory leaks could come from other (more or less obscure)
 objects. If this is your case, and you can't find your leaks using ``trackref``,
 you still have another resource: the `Guppy library`_.
 
-.. _Guppy library: http://pypi.python.org/pypi/guppy
+.. _Guppy library: https://pypi.python.org/pypi/guppy
 
 If you use ``pip``, you can install Guppy with the following command::
@@ -264,9 +264,9 @@ though neither Scrapy nor your project are leaking memory. This is due to a
 (not so well) known problem of Python, which may not return released memory to
 the operating system in some cases. For more information on this issue see:
 
-* `Python Memory Management <http://evanjones.ca/python-memory.html>`_
-* `Python Memory Management Part 2 <http://evanjones.ca/python-memory-part2.html>`_
-* `Python Memory Management Part 3 <http://evanjones.ca/python-memory-part3.html>`_
+* `Python Memory Management <http://www.evanjones.ca/python-memory.html>`_
+* `Python Memory Management Part 2 <http://www.evanjones.ca/python-memory-part2.html>`_
+* `Python Memory Management Part 3 <http://www.evanjones.ca/python-memory-part3.html>`_
 
 The improvements proposed by Evan Jones, which are detailed in `this paper`_,
 got merged in Python 2.5, but this only reduces the problem, it doesn't fix it
@@ -280,7 +280,7 @@ completely. To quote the paper:
    to move to a compacting garbage collector, which is able to move objects in
    memory. This would require significant changes to the Python interpreter.*
 
-.. _this paper: http://evanjones.ca/memoryallocator/
+.. _this paper: http://www.evanjones.ca/memoryallocator/
 
 To keep memory consumption reasonable you can split the job into several
 smaller jobs or enable :ref:`persistent job queue <topics-jobs>`
@@ -8,7 +8,7 @@ Scrapy provides a logging facility which can be used through the
 :mod:`scrapy.log` module. The current underlying implementation uses `Twisted
 logging`_ but this may change in the future.
 
-.. _Twisted logging: http://twistedmatrix.com/projects/core/documentation/howto/logging.html
+.. _Twisted logging: http://twistedmatrix.com/documents/current/core/howto/logging.html
 
 The logging service must be explicitly started through the
 :func:`scrapy.log.start` function to catch the top level Scrapy's log messages.
@@ -157,7 +157,7 @@ Request objects
       ``copy()`` or ``replace()`` methods, and can also be accessed, in your
       spider, from the ``response.meta`` attribute.
 
-.. _shallow copied: http://docs.python.org/library/copy.html
+.. _shallow copied: https://docs.python.org/2/library/copy.html
 
    .. method:: Request.copy()
@@ -8,4 +8,4 @@ Scrapyd has been moved into a separate project.
 
 Its documentation is now hosted at:
 
-    http://scrapyd.readthedocs.org/
+    http://scrapyd.readthedocs.org/en/latest/
@@ -38,7 +38,7 @@ For a complete reference of the selectors API see
 
 .. _BeautifulSoup: http://www.crummy.com/software/BeautifulSoup/
 .. _lxml: http://lxml.de/
-.. _ElementTree: http://docs.python.org/library/xml.etree.elementtree.html
+.. _ElementTree: https://docs.python.org/2/library/xml.etree.elementtree.html
 .. _cssselect: https://pypi.python.org/pypi/cssselect/
 .. _XPath: http://www.w3.org/TR/xpath
 .. _CSS: http://www.w3.org/TR/selectors
@@ -403,9 +403,9 @@ Here we first iterate over ``itemscope`` elements, and for each one,
 we look for all ``itemprops`` elements and exclude those that are themselves
 inside another ``itemscope``.
 
-.. _EXSLT: http://www.exslt.org/
-.. _regular expressions: http://www.exslt.org/regexp/index.html
-.. _set manipulation: http://www.exslt.org/set/index.html
+.. _EXSLT: http://exslt.org/
+.. _regular expressions: http://exslt.org/regexp/index.html
+.. _set manipulation: http://exslt.org/set/index.html
 
 Some XPath tips
@@ -26,7 +26,7 @@ The value of ``SCRAPY_SETTINGS_MODULE`` should be in Python path syntax, e.g.
 ``myproject.settings``. Note that the settings module should be on the
 Python `import search path`_.
 
-.. _import search path: http://docs.python.org/2/tutorial/modules.html#the-module-search-path
+.. _import search path: https://docs.python.org/2/tutorial/modules.html#the-module-search-path
 
 Populating the settings
 =======================
@@ -159,7 +159,7 @@ following methods:
       :type spider: :class:`~scrapy.spider.Spider` object
 
-.. _Exception: http://docs.python.org/library/exceptions.html#exceptions.Exception
+.. _Exception: https://docs.python.org/2/library/exceptions.html#exceptions.Exception
 
 .. _topics-spider-middleware-ref:
@@ -706,7 +706,7 @@ Combine SitemapSpider with other sources of urls::
         pass # ... scrape other here ...
 
 .. _Sitemaps: http://www.sitemaps.org
-.. _Sitemap index files: http://www.sitemaps.org/protocol.php#index
+.. _Sitemap index files: http://www.sitemaps.org/protocol.html#index
 .. _robots.txt: http://www.robotstxt.org/
 .. _TLD: http://en.wikipedia.org/wiki/Top-level_domain
-.. _Scrapyd documentation: http://scrapyd.readthedocs.org/
+.. _Scrapyd documentation: http://scrapyd.readthedocs.org/en/latest/