Commit fbfb5bbd, authored by Lisa Owen, committed by dyozie

docs - add info about pxf jdbc statement properties (#7650)

* docs - add info about pxf jdbc statement properties

* misc edits requested by david
Parent 991e17a8
......@@ -19,13 +19,13 @@ In previous releases of Greenplum Database, you may have specified the JDBC driv
### <a id="cfg_jar"></a>JDBC Driver JAR Registration
The PXF JDBC connector is installed with the `postgresql-8.4-702.jdbc4.jar` JAR file. If you require a different JDBC driver, ensure that you install the JDBC driver JAR file for the external SQL database in the `$PXF_CONF/lib` directory on each segment host. Be sure to install JDBC driver JAR files that are compatible with your JRE version. See [Registering PXF JAR Dependencies](reg_jar_depend.html) for additional information.
The PXF JDBC Connector is installed with the `postgresql-8.4-702.jdbc4.jar` JAR file. If you require a different JDBC driver, ensure that you install the JDBC driver JAR file for the external SQL database in the `$PXF_CONF/lib` directory on each segment host. Be sure to install JDBC driver JAR files that are compatible with your JRE version. See [Registering PXF JAR Dependencies](reg_jar_depend.html) for additional information.
### <a id="cfg_server"></a>JDBC Server Configuration
PXF provides a template configuration file for the JDBC Connector. This server template configuration file, located in `$PXF_CONF/templates/jdbc-site.xml`, identifies properties that you can configure to establish a connection to the external SQL database. The template also includes optional properties that you can set before query execution in the external database session.
PXF provides a template configuration file for the JDBC Connector. This server template configuration file, located in `$PXF_CONF/templates/jdbc-site.xml`, identifies properties that you can configure to establish a connection to the external SQL database. The template also includes optional properties that you can set before executing query or insert commands in the external database session.
The base properties in the `jdbc-site.xml` server template file follow:
The required properties in the `jdbc-site.xml` server template file follow:
| Property | Description | Value |
|----------------|--------------------------------------------|-------|
......@@ -55,6 +55,8 @@ Example: To set the `createDatabaseIfNotExist` connection property on a JDBC con
</property>
```
Ensure that the JDBC driver for the external SQL database supports any connection-level property that you specify.
#### <a id="conntransiso"></a>Connection Transaction Isolation Property
The SQL standard defines four transaction isolation levels. The level that you specify for a given connection to an external SQL database determines how and when the changes made by one transaction executed on the connection are visible to another.
......@@ -82,6 +84,29 @@ For example, to set the transaction isolation level to *Read uncommitted*, add t
Different SQL databases support different transaction isolation levels. Ensure that the external database supports the level that you specify.
#### <a id="stateprop"></a>Statement-Level Properties
The PXF JDBC Connector executes a query or insert command on an external SQL database table in a *statement*. The Connector exposes properties that enable you to configure certain aspects of the statement before the command is executed in the external database. The Connector supports the following statement-level properties:
| Property | Description | Value |
|----------------|--------------------------------------------|-------|
| jdbc.statement.batchSize | The number of rows to write to the external database table in a batch. | The number of rows. The default write batch size is 100. |
| jdbc.statement.fetchSize | The number of rows to fetch/buffer when reading from the external database table. | The number of rows. The default read fetch size is 1000. |
| jdbc.statement.queryTimeout | The amount of time (in seconds) the JDBC driver waits for a statement to execute. This timeout applies to statements created for both read and write operations. | The timeout duration in seconds. The default wait time is unlimited. |
PXF uses the default value for any statement-level property that you do not explicitly configure.
Example: To set the read fetch size to 5000, add the following property block to `jdbc-site.xml`:
``` xml
<property>
<name>jdbc.statement.fetchSize</name>
<value>5000</value>
</property>
```
Ensure that the JDBC driver for the external SQL database supports any statement-level property that you specify.
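The write batch size and the query timeout are set in the same way as the fetch size. As a minimal sketch, assuming a batch of 500 rows and a 60-second timeout (both values are illustrative only, not recommendations), the property blocks in `jdbc-site.xml` might look like:

``` xml
<!-- Illustrative value: write rows to the external table in batches of 500 -->
<property>
<name>jdbc.statement.batchSize</name>
<value>500</value>
</property>
<!-- Illustrative value: wait at most 60 seconds for a statement to execute -->
<property>
<name>jdbc.statement.queryTimeout</name>
<value>60</value>
</property>
```

Because the timeout applies to statements created for both read and write operations, a value chosen for long-running reads also bounds inserts on the same server configuration.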
#### <a id="sessprop"></a>Session-Level Properties
To set session-level properties, add the `jdbc.session.property.<SPROP_NAME>` property to `jdbc-site.xml`. PXF will `SET` these properties in the external database before executing a query.
......@@ -105,6 +130,12 @@ Example: To set the `search_path` parameter before running a query in a PostgreS
</property>
```
Ensure that the JDBC driver for the external SQL database supports any property that you specify.
### <a id="cfg_override"></a>Overriding the JDBC Server Configuration
You can override the JDBC server configuration by directly specifying certain JDBC properties via custom options in the `CREATE EXTERNAL TABLE` command `LOCATION` clause. Refer to [Overriding the JDBC Server Configuration](jdbc_pxf.html#jdbc_override) for additional information.
## <a id="cfg_server_proc"></a>Configuration Procedure
When you configure the PXF JDBC Connector to access an external SQL database, you add at least one named PXF server configuration for the connector. You:
......
......@@ -86,7 +86,9 @@ You include JDBC connector custom options in the `LOCATION` URI, prefacing each
| Option Name | Operation | Description |
|---------------|------------|--------|
| BATCH_SIZE | Write | Integer identifying the number of `INSERT` operations to batch to the external SQL database. PXF always validates a `BATCH_SIZE` option, even when provided on a read operation. Batching is enabled by default. |
| BATCH_SIZE | Write | Integer that identifies the number of `INSERT` operations to batch to the external SQL database. PXF always validates a `BATCH_SIZE` option, even when provided on a read operation. Write batching is enabled by default; the default value is 100. |
| FETCH_SIZE | Read | Integer that identifies the number of rows to buffer when reading from an external SQL database. Read row batching is enabled by default; the default read fetch size is 1000. |
| QUERY_TIMEOUT | Read/Write | Integer that identifies the amount of time (in seconds) that the JDBC driver waits for a statement to execute. The default wait time is infinite. |
| POOL_SIZE | Write | Enable thread pooling on `INSERT` operations and identify the number of threads in the pool. Thread pooling is disabled by default. |
| PARTITION_BY | Read | The partition column, \<column-name\>:\<column-type\>. You may specify only one partition column. The JDBC connector supports `date`, `int`, and `enum` \<column-type\> values. If you do not identify a `PARTITION_BY` column, a single PXF instance services the read request. |
| RANGE | Read | Required when `PARTITION_BY` is specified. The query range, \<start-value\>[:\<end-value\>]. When the partition column is an `enum` type, `RANGE` must specify a list of values, each of which forms its own fragment. If the partition column is an `int` or `date` type, `RANGE` must specify a finite left-closed range. That is, the range includes the \<start-value\> but does *not* include the \<end-value\>. If the partition column is a `date` type, use the `yyyy-MM-dd` date format. |
......@@ -96,9 +98,9 @@ You include JDBC connector custom options in the `LOCATION` URI, prefacing each
#### <a id="batching"></a>Batching Insert Operations (Write)
When the JDBC driver of the external SQL database supports it, batching of `INSERT` operations may significantly increase performance.
*When the JDBC driver of the external SQL database supports it*, batching of `INSERT` operations may significantly increase performance.
Batching is enabled by default, and the default batch size is 100. To disable batching or to modify the default batch size value, create the PXF external table with a `BATCH_SIZE` setting:
Write batching is enabled by default, and the default batch size is 100. To disable batching or to modify the default batch size value, create the PXF external table with a `BATCH_SIZE` setting:
- `BATCH_SIZE=0` or `BATCH_SIZE=1` - disables batching
- `BATCH_SIZE=(n>1)` - sets the `BATCH_SIZE` to `n`
......@@ -108,6 +110,16 @@ When the external database JDBC driver does not support batching, the behaviour
- `BATCH_SIZE` omitted - The JDBC connector inserts without batching.
- `BATCH_SIZE=(n>1)` - The `INSERT` operation fails and the connector returns an error.
#### <a id="fetching"></a>Batching on Read Operations
By default, the PXF JDBC connector automatically batches the rows it fetches from an external database table. The default row fetch size is 1000. To modify the default fetch size value, specify a `FETCH_SIZE` when you create the PXF external table. For example:
``` pre
FETCH_SIZE=5000
```
If the external database JDBC driver does not support batching on read, you must explicitly disable read row batching by setting `FETCH_SIZE=0`.
#### <a id="threadpool"></a>Thread Pooling (Write)
The PXF JDBC connector can further increase write performance by processing `INSERT` operations in multiple threads when threading is supported by the JDBC driver of the external SQL database.
......@@ -132,7 +144,7 @@ When you enable partitioning, the PXF JDBC connector splits a `SELECT` query int
When you specify the `PARTITION_BY` option, tune the `INTERVAL` value and unit based upon the optimal number of JDBC connections to the target database and the optimal distribution of external data across Greenplum Database segments. The `INTERVAL` low boundary is driven by the number of Greenplum Database segments, while the high boundary is driven by the acceptable number of JDBC connections to the target database. The `INTERVAL` setting influences the number of fragments, and ideally should be set neither too high nor too low. Testing with multiple values may help you select the optimal settings.
Example JDBC \<custom-option\> substrings identifying partitioning parameters:
Example JDBC \<custom-option\> substrings that identify partitioning parameters:
``` pre
&PARTITION_BY=year:int&RANGE=2011:2013&INTERVAL=1
......@@ -310,16 +322,19 @@ Perform the following procedure to insert some data into the `forpxf_table1` Pos
## <a id="jdbc_override"></a>Overriding the JDBC Server Configuration
If you are accessing an external SQL database, you can override the JDBC server configuration connection parameters by directly specifying these custom options in the `CREATE EXTERNAL TABLE` `LOCATION` clause:
You can override certain properties in a JDBC server configuration for a specific external database table by directly specifying the corresponding custom option in the `CREATE EXTERNAL TABLE` `LOCATION` clause:
| Custom Option Name | jdbc-site.xml Property Name | Description
|----------------------|-----------------------------|--------|
| JDBC_DRIVER | jdbc.driver | The JDBC driver class name. (Required) |
| DB_URL | jdbc.url | The external database URL. Depends on the external SQL database, typically includes at least the hostname, port, and database name. (Required) |
| USER | jdbc.user | The database user name. Required if `PASS` is provided. |
| PASS | jdbc.password | The database password for `USER`. Required if `USER` is provided. |
| Custom Option Name | jdbc-site.xml Property Name |
|----------------------|-----------------------------|
| JDBC_DRIVER | jdbc.driver |
| DB_URL | jdbc.url |
| USER | jdbc.user |
| PASS | jdbc.password |
| BATCH_SIZE | jdbc.statement.batchSize |
| FETCH_SIZE | jdbc.statement.fetchSize |
| QUERY_TIMEOUT | jdbc.statement.queryTimeout |
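For reference, the server-side settings that the first four custom options override might look like the following in `jdbc-site.xml`. This is a sketch only: the driver class, host, database, user, and password are placeholder values for a hypothetical PostgreSQL server, not defaults shipped with PXF.

``` xml
<!-- Placeholder values: replace with the settings for your own external database -->
<property>
<name>jdbc.driver</name>
<value>org.postgresql.Driver</value>
</property>
<property>
<name>jdbc.url</name>
<value>jdbc:postgresql://pgserverhost:5432/pgtestdb</value>
</property>
<property>
<name>jdbc.user</name>
<value>pguser1</value>
</property>
<property>
<name>jdbc.password</name>
<value>changeme</value>
</property>
```

A custom option in the `LOCATION` clause takes the place of the matching property for that external table only. The server configuration itself is unchanged.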
Example JDBC \<custom-option\> connection strings:
Example JDBC connection strings specified via custom options:
``` pre
&JDBC_DRIVER=org.postgresql.Driver&DB_URL=jdbc:postgresql://pgserverhost:5432/pgtestdb&USER=pguser1&PASS=changeme
......