.. _topics-settings:

========
Settings
========

.. module:: scrapy.conf
   :synopsis: Settings manager

The Scrapy settings allow you to customize the behaviour of all Scrapy
components, including the core, extensions, pipelines and spiders themselves.

The settings infrastructure provides a global namespace of key-value mappings
from which the code can pull configuration values. The settings can be
populated through different mechanisms, which are described below.

Read :ref:`settings` for all supported entries.

How to populate settings
========================

Settings can be populated using different mechanisms, each of which has a
different precedence. Here is the list of them in decreasing order of
precedence:

 1. Global overrides (most precedence)
 2. Environment variables
 3. scrapy_settings
 4. Default settings per-command
 5. Default global settings (least precedence)

These mechanisms are described in more detail below.
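
The precedence order listed above can be illustrated with a small,
self-contained sketch. It does not use Scrapy's own implementation: each
mechanism is modelled as a plain dict, and ``collections.ChainMap`` returns the
value from the first layer that defines a key. The layer variable names and the
sample values are assumptions made only for this illustration.

```python
from collections import ChainMap

# Hypothetical layers, listed from most to least precedence.
overrides = {"LOG_ENABLED": False}
environment = {"LOG_FILE": "env.log"}
project_settings = {"LOG_FILE": "project.log", "ROBOTSTXT_ENABLED": True}
command_defaults = {"LOG_ENABLED": True}
global_defaults = {"LOG_ENABLED": True, "LOG_FILE": None,
                   "ROBOTSTXT_ENABLED": False}

# ChainMap searches the mappings left to right and returns the first match.
settings = ChainMap(overrides, environment, project_settings,
                    command_defaults, global_defaults)

print(settings["LOG_ENABLED"])        # taken from the overrides layer
print(settings["LOG_FILE"])           # taken from the environment layer
print(settings["ROBOTSTXT_ENABLED"])  # taken from the project settings layer
```

A value defined in a lower layer is only visible when no higher layer defines
the same key.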

1. Global overrides
-------------------

Global overrides take the highest precedence, and are usually populated by
command line options.

Example::

   >>> from scrapy.conf import settings
   >>> settings.overrides['LOG_ENABLED'] = True

You can also override one (or more) settings from the command line using the
``--set`` command line argument.

.. highlight:: sh

Example::

    scrapy-ctl.py crawl domain.com --set LOG_FILE=scrapy.log

2. Environment variables
------------------------

You can populate settings using environment variables prefixed with
``SCRAPY_``. For example, to change the log file location on Unix systems::

    $ export SCRAPY_LOG_FILE=scrapy.log
    $ scrapy-ctl.py crawl example.com

On Windows systems, you can change environment variables from the Control
Panel by following `these guidelines`_.

.. _these guidelines: http://www.microsoft.com/resources/documentation/windows/xp/all/proddocs/en-us/sysdm_advancd_environmnt_addchange_variable.mspx
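
As a rough illustration of the ``SCRAPY_`` prefix convention, the sketch below
collects prefixed variables from an environment mapping and strips the prefix.
The helper name ``settings_from_env`` is hypothetical and is not part of
Scrapy; it only shows the idea. Note that values read this way are always
strings.

```python
import os

PREFIX = "SCRAPY_"


def settings_from_env(environ=None):
    """Return the settings found in ``environ`` with the prefix removed.

    Hypothetical helper for illustration only; not Scrapy's actual code.
    """
    if environ is None:
        environ = os.environ
    return {
        key[len(PREFIX):]: value
        for key, value in environ.items()
        if key.startswith(PREFIX)
    }


# Simulate `export SCRAPY_LOG_FILE=scrapy.log` with a plain dict.
fake_environ = {"SCRAPY_LOG_FILE": "scrapy.log", "HOME": "/home/user"}
env_settings = settings_from_env(fake_environ)
print(env_settings)
```

Variables without the prefix, such as ``HOME`` here, are ignored.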

3. scrapy_settings
------------------

scrapy_settings is the standard configuration file for your Scrapy project.
It's where most of your custom settings will be populated.
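
Assuming the configuration file is an ordinary Python module in which each
uppercase name is a setting, a minimal sketch could look like the fragment
below. The setting names are the ones used elsewhere on this page; the values
are placeholders.

```python
# scrapy_settings.py (sketch; the values below are placeholders)

# Turn logging on and send it to a file in the current directory.
LOG_ENABLED = True
LOG_FILE = 'scrapy.log'
```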

4. Default settings per-command
-------------------------------

Each scrapy-ctl command can have its own default settings, which override the
global default settings. These per-command settings are located inside the
``scrapy.conf.commands`` module. You can also override them per command inside
your project by writing them in the module referenced by the
:setting:`COMMANDS_SETTINGS_MODULE` setting. Those settings will take more
precedence than the defaults in ``scrapy.conf.commands``.

5. Default global settings
--------------------------

The global defaults are located in ``scrapy.conf.default_settings`` and
documented in the :ref:`settings` page.


How to access settings
======================

.. highlight:: python

Here's an example of the simplest way to access settings from Python code::

   >>> from scrapy.conf import settings
   >>> print settings['LOG_ENABLED']
   True

In other words, settings can be accessed like a dict, but it's usually
preferable to extract the setting in the format you need, to avoid type errors.
To do that, use one of the following methods:

.. class:: Settings()

   The Settings object is automatically instantiated when the
   :mod:`scrapy.conf` module is loaded, and it's usually accessed like this::

      >>> from scrapy.conf import settings

.. method:: Settings.get(name, default=None)

   Get a setting value without affecting its original type.

   ``name`` is a string with the setting name

   ``default`` is the value to return if no setting is found

.. method:: Settings.getbool(name, default=False)

   Get a setting value as a boolean. For example, ``1``, ``'1'`` and ``True``
   return ``True``, while ``0``, ``'0'``, ``False`` and ``None`` return
   ``False``.

   For example, settings populated through environment variables set to ``'0'``
   will return ``False`` when using this method.

   ``name`` is a string with the setting name

   ``default`` is the value to return if no setting is found
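
The boolean conversion described above can be sketched in a few lines. The
``to_bool`` helper is hypothetical and only mirrors the documented results; it
is not taken from Scrapy's source. It converts through ``int()`` because plain
``bool('0')`` is ``True`` in Python (any non-empty string is truthy).

```python
def to_bool(value):
    """Coerce a setting value to a boolean as the text describes.

    Hypothetical helper: None is treated as False, and everything else is
    converted through int() so that '0' is False and '1' is True.
    """
    if value is None:
        return False
    return bool(int(value))


# Show the result for each value mentioned in the description.
for raw in (1, '1', True, 0, '0', False, None):
    print(repr(raw), '->', to_bool(raw))
```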

.. method:: Settings.getint(name, default=0)

   Get a setting value as an int.

   ``name`` is a string with the setting name

   ``default`` is the value to return if no setting is found

.. method:: Settings.getfloat(name, default=0.0)

   Get a setting value as a float.

   ``name`` is a string with the setting name

   ``default`` is the value to return if no setting is found
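
Numeric getters matter mostly for values that arrive as strings, such as those
read from environment variables. The sketch below imitates ``getint`` and
``getfloat`` on a plain dict; the function names ``get_int`` and ``get_float``
and the setting names ``RETRIES`` and ``DELAY`` are made up for illustration,
while the default values follow the signatures documented above.

```python
# Values read from the environment are always strings.
raw = {"RETRIES": "3", "DELAY": "0.5"}  # hypothetical setting names


def get_int(mapping, name, default=0):
    """Return the named value converted with int(), or the default."""
    return int(mapping.get(name, default))


def get_float(mapping, name, default=0.0):
    """Return the named value converted with float(), or the default."""
    return float(mapping.get(name, default))


print(get_int(raw, "RETRIES"))      # an int, not the string '3'
print(get_float(raw, "DELAY"))      # a float, not the string '0.5'
print(get_int(raw, "MISSING", 10))  # falls back to the given default
```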

.. method:: Settings.getlist(name, default=None)

   Get a setting value as a list. If the setting's original type is a list, it
   will be returned verbatim. If it's a string, it will be split on ``,``.

   For example, settings populated through environment variables set to
   ``'one,two'`` will return the list ``['one', 'two']`` when using this method.

   ``name`` is a string with the setting name

   ``default`` is the value to return if no setting is found
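
The list conversion can be sketched the same way. ``to_list`` is a hypothetical
helper that follows the description above (strings are split on commas, lists
pass through unchanged); it is not Scrapy's actual code.

```python
def to_list(value):
    """Return a list for a setting value, splitting strings on commas."""
    if isinstance(value, str):
        return value.split(",")
    # Anything that is not a string is returned as it is.
    return value


print(to_list("one,two"))       # string from an environment variable
print(to_list(["one", "two"]))  # already a list, returned verbatim
```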

Available built-in settings
===========================

See :ref:`settings`.

Rationale for setting names
===========================

Setting names are usually prefixed with the component that they configure. For
example, proper setting names for a fictional robots.txt extension would be
``ROBOTSTXT_ENABLED``, ``ROBOTSTXT_OBEY``, ``ROBOTSTXT_CACHEDIR``, etc.