From 38e6d5f838e6c25d9570a101ceb20f3ec1fc9dcb Mon Sep 17 00:00:00 2001
From: Mel Kiyama
Date: Wed, 3 Jan 2018 16:45:31 -0800
Subject: [PATCH] docs: pl/container updates - plcontainer command, configuration file (#4223)

* docs: pl/container updates - plcontainer command, configuration file

-removed experimental warning.
-replaced plcontainer command with new options.
-updated configuration file - changed element names, added setting element.
-updated default shared volume.

PR for 5X_STABLE
Will be ported to MAIN

* docs: pl/container - updates based on review comments.

* docs: pl/container - more updates based on additional review comments.

* docs: pl/container - minor edit

* docs: pl/container - another minor fix.
---
 .../ref_guide/extensions/pl_container.xml | 949 +++++++++++------
 1 file changed, 559 insertions(+), 390 deletions(-)

diff --git a/gpdb-doc/dita/ref_guide/extensions/pl_container.xml b/gpdb-doc/dita/ref_guide/extensions/pl_container.xml
index e41670644e..e37aa53db0 100644
--- a/gpdb-doc/dita/ref_guide/extensions/pl_container.xml
+++ b/gpdb-doc/dita/ref_guide/extensions/pl_container.xml
@@ -19,10 +19,8 @@
  • - PL/Container is an experimental feature and is not intended for use in a - production environment. Experimental features are subject to change without notice in future - releases.

    PL/Container is compatible with Greenplum Database 5.2.0. PL/Container has not - been tested for compatibility with Greenplum Database 5.1.0 or 5.0.0.

    + PL/Container is compatible with Greenplum Database 5.2.0 and later. + PL/Container has not been tested for compatibility with Greenplum Database 5.1.0 or 5.0.0. @@ -84,9 +82,9 @@

    The Docker container tag represents the PL/Container extension release version (for example, 1.0.0). For example, the full container name for plc_python_shared - is similar to pivotaldata/plc_python_shared:1.0.0, version 1.0.0. This is - the name that is referred to in the default PL/Container configuration. Also, You can also - create custom Docker images and add the image to the PL/Container configuration.

    + is similar to pivotaldata/plc_python_shared:1.0.0. This is the name that is + referred to in the default PL/Container configuration. You can also create custom Docker + images, install them, and add them to the PL/Container configuration.

    @@ -136,6 +134,7 @@ Installing the PL/Container Extension Package +

    Install the PL/Container extension with the Greenplum Database gppkg utility.

    @@ -151,18 +150,16 @@
  • Restart Greenplum Database: gpstop -ra
  • Enable PL/Container for specific databases by running: psql -d your_database -f $GPHOME/share/postgresql/plcontainer/plcontainer_install.sql

    The - SQL script registers the language plcontainer in the database creates - PL/Container specific UDFs.

  • -
  • Initialize PL/Container configuration on the Greenplum Database hosts by running the - plcontainer configure - command.plcontainer configure --reset

    The - plcontainer utility is included with the PL/Container - extension.

  • + SQL script registers the language plcontainer in the database and + creates PL/Container specific UDFs.

    +

    After installing PL/Container, you can manage Docker images and manage the PL/Container + configuration with the Greenplum Database plcontainer utility.

    Building and Installing the PL/Container Extension +

    The PL/Container extension is available as an open source module. For information about the building and installing the module as part of Greenplum Database, see the README file @@ -175,16 +172,16 @@ Installing PL/Container Language Docker Images

    The PL/Container extension includes the plcontainer utility that installs - Docker images in the host Docker repository and adds the installed image to the PL/Container - configuration. The utility adds the Docker image to all Greenplum Database hosts and updates - configuration information on all the hosts. For information about - plcontainer, see .

    + Docker images on the Greenplum Database hosts and adds configuration information to the + PL/Container configuration file. The configuration information allows PL/Container to create + Docker containers with the Docker images. For information about + plcontainer, see .

    Download the tar.gz file that contains the Docker images from Pivotal Network.

      -
    • plcontainer-python-images-1.0.0-beta1.tar.gz
    • -
    • plcontainer-r-images-1.0.0-beta1.tar.gz
    • +
    • plcontainer-python-images-1.0.0.tar.gz
    • +
    • plcontainer-r-images-1.0.0.tar.gz

    The PL/Container open source module contains dockerfiles to build @@ -192,33 +189,56 @@ PL/Python UDFs and a Docker image to run PL/R UDFs. See the dockerfiles in the GitHub repository at https://github.com/greenplum-db/plcontainer.

    -

    Install the Docker images on the Greenplum Database hosts. These examples use the - plcontainer utility to install Docker images for Python and R and add the - images to the PL/Container configuration. The utility installs the images and configures all - the Greenplum Database hosts. The examples assume the Docker images are in - /home/gpadmin.

    -

    This example runs plcontainer to install the Docker image for PL/Python - and add the image to the PL/Container configuration. - plcontainer install -n plc_python_shared -i /home/gpadmin/plcontainer-python-images-0.9.3.tar.gz \ - -c pivotaldata/plc_python_shared:1.0.0 -l python

    -

    This example runs plcontainer to install the Docker image for PL/R and add - the image to the PL/Container configuration.

    - plcontainer install -n plc_r -i /home/gpadmin/plcontainer-r-images-0.9.3.tar.gz \ - -c pivotaldata/plc_r_shared:1.0.0 -l r -

    You can view the host system Docker repository with the docker images - command. The image name specified with the -c option appears in the list of - Docker images.

    -

    You can view the updated the PL/Container configuration file with the plcontainer - configure -s command. A container element in the configuration XML file with the - name specified with the -n option appears in the file.

    +

    Install the Docker images on the Greenplum Database hosts. This example uses the + plcontainer utility to install a Docker image for Python and to update + the PL/Container configuration. The example assumes the Docker image to be installed is in a + file in /home/gpadmin.

    +

    This plcontainer command installs the Docker image for PL/Python from a + Docker image file. + plcontainer image-add -f /home/gpadmin/plcontainer-python-images-1.0.0.tar.gz

    +

    The utility displays progress information as it installs the Docker image on the Greenplum + Database hosts.

    +

    Use the plcontainer image-list command to display the installed Docker + images on the local host.

    +

    This command adds information to the PL/Container configuration file so that PL/Container + can access the Docker image to create a Docker + container: plcontainer runtime-add -r plc_py -i pivotaldata/plcontainer:devel -l python

    +

    The utility displays progress information as it updates the PL/Container configuration file + on the Greenplum Database instances.

    +

    You can view the PL/Container configuration information with the plcontainer + runtime-show -r plc_py command. You can view the PL/Container configuration XML + file with the plcontainer runtime-edit command.

    Uninstalling PL/Container +

    To uninstall PL/Container, remove Docker containers and images, and then remove the + PL/Container support from Greenplum Database.

    When you remove support for the PL/Container extension, the plcontainer user-defined functions that you created in the database will no longer work.

    + + Uninstall Docker Containers and Images + +

    On the Greenplum Database hosts, uninstall the Docker containers and images that are no + longer required.

    +

    The plcontainer image-list command lists the Docker images that are + installed on the local Greenplum Database host.

    +

    The plcontainer image-delete command deletes Docker images from all + Greenplum Database hosts.

    +

    Some Docker containers might exist on a host if the containers were not managed by + PL/Container. You might need to remove the containers with Docker commands. These + docker commands manage Docker containers and images on a local host.

      +
    • The command docker ps -a lists all containers on a host. The + command docker stop stops a container.
    • +
    • The command docker images lists the images on a host.
    • +
    • The command docker rmi removes images.
    • +
    • The command docker rm removes containers.
    • +

    + +
    Remove PL/Container Support for a Database @@ -249,19 +269,6 @@ - - Uninstall Docker Containers and Images - -

    On the Greenplum Database hosts, uninstall the Docker containers and images that are no - longer required.

      -
    • The command docker ps -a lists the containers on a host. The - command docker stop stops a container.
    • -
    • The command docker images lists the images on a host.
    • -
    • The command docker rmi removes images.
    • -
    • The command docker rm removes containers.
    • -

    - -
    Using PL/Container Languages @@ -271,16 +278,15 @@ images. To create a UDF that uses PL/Container, the UDF must have these items.

    • The first line of the UDF must be # container: - name
    • + ID
    • The LANGUAGE attribute must be plcontainer
    -

    The name is the name that PL/Container uses to identify the Docker - container that runs the UDF. in the XML configuration file - plcontainer_configuration.xml, there should be a - container XML element with a corresponding name XML - element that specifies the detail Docker container information. See for information about how PL/Container maps the - name to a Docker container.

    +

    The ID is the name that PL/Container uses to identify the Docker image + that is used to start a Docker container that runs the UDF. In the XML configuration file + plcontainer_configuration.xml, there is a runtime XML + element that contains a corresponding id XML element that specifies the + Docker container startup information. See for + information about how PL/Container maps the ID to a Docker image.

    The PL/Container configuration file is read only on the first invocation of a PL/Container function in each Greenplum Database session that runs PL/Container functions. You can force the configuration file to be re-read by performing a SELECT command on the @@ -288,13 +294,52 @@ SELECT command forces the configuration file to be read.

    select * from plcontainer_refresh_config;

    Running the command executes a PL/Container function that updates the configuration on the - master and segment instances.

    + master and segment instances and returns the status of the
+ refresh.
+ gp_segment_id | plcontainer_refresh_local_config
+---------------+----------------------------------
+             1 | ok
+             0 | ok
+            -1 | ok
+(3 rows)

    Also, you can show all the configurations in the session by performing a SELECT command on the view plcontainer_show_config. For example, this SELECT command returns the PL/Container configurations.

    select * from plcontainer_show_config;

    Running the command executes a PL/Container function that displays configuration - information from the master and segment instances.

    + information from the master and segment instances. This is an example of the start and end
+ of the view output.
+INFO: Container 'plc_python_example1' configuration
+INFO: image = 'pivotaldata/plcontainer_python_with_clients:0.1'
+INFO: memory_mb = '1024'
+INFO: use network = 'no'
+INFO: enable log = 'no'
+INFO: Container 'plc_python_example2' configuration
+INFO: image = 'pivotaldata/plcontainer_python_without_clients:0.1'
+INFO: memory_mb = '1024'
+INFO: use network = 'yes'
+INFO: enable log = 'yes'
+INFO: shared directory from host '/usr/local/greenplum-db/bin/plcontainer_clients' to container '/clientdir'
+INFO: access = readonly
+
+ ...
+
+ gp_segment_id | plcontainer_show_local_config
+---------------+-------------------------------
+             0 | ok
+            -1 | ok
+             1 | ok

    +

    The PL/Container function plcontainer_containers_summary() displays + information about the currently running Docker + containers: select * from plcontainer_containers_summary();

    +

    If a normal (non-superuser) Greenplum Database user runs the function, the function + displays information only for containers created by the user. If a Greenplum Database + superuser runs the function, information for all containers created by Greenplum Database + users is displayed. This is sample output when 2 containers are running.

+ SEGMENT_ID |                           CONTAINER_ID                           |   UP_TIME    |  OWNER  | MEMORY_USAGE(KB)
+------------+------------------------------------------------------------------+--------------+---------+------------------
+          1 | 693a6cb691f1d2881ec0160a44dae2547a0d5b799875d4ec106c09c97da422ea | Up 8 seconds | gpadmin |            12940
+          1 | bc9a0c04019c266f6d8269ffe35769d118bfb96ec634549b2b1bd2401ea20158 | Up 2 minutes | gpadmin |            13628
+(2 rows)
 Examples
@@ -311,12 +356,14 @@ $$ LANGUAGE plcontainer;

    # container: plc_r_shared
    return(log10(100))
    $$ LANGUAGE plcontainer;

    -

    The PL/Container Docker container that you specify, plc_python_shared - and plc_r_shared in the examples, are the name elements - defined in plcontainer_config.xml file, and they are mapped to the - image XML element that specifies the Docker image to be started. - Removing a specific container XML element from the configuration file - makes it impossible for end users to start the container.

    +

    The values in the # container lines of the examples, + plc_python_shared and plc_r_shared, are the + id XML elements defined in the plcontainer_configuration.xml + file. The id element is mapped to the image element that + specifies the Docker image to be started.

    +

    If the # container line in a UDF specifies an ID that is not in the + PL/Container configuration file, Greenplum Database returns an error when you try to + execute the UDF.
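The lookup described above can be pictured with a small Python sketch. It is not PL/Container's implementation: the `RUNTIMES` dictionary is a hypothetical stand-in for the runtime entries in the configuration file, and the function names are invented for illustration. The sketch only shows how the ID on the `# container:` line selects an image, and why an unknown ID has to fail.

```python
import re

# Hypothetical stand-in for the configuration file: runtime ID -> Docker image.
# The IDs and image names are the ones used as examples in this document.
RUNTIMES = {
    "plc_python_shared": "pivotaldata/plc_python_shared:1.0.0",
    "plc_r_shared": "pivotaldata/plc_r_shared:1.0.0",
}

CONTAINER_LINE = re.compile(r"^#\s*container:\s*(\S+)\s*$")


def runtime_id_of(udf_body: str) -> str:
    """Return the ID named on the first non-empty line of a UDF body."""
    for line in udf_body.splitlines():
        if line.strip():
            match = CONTAINER_LINE.match(line.strip())
            if match is None:
                raise ValueError("first line must be '# container: ID'")
            return match.group(1)
    raise ValueError("empty UDF body")


def image_for(udf_body: str) -> str:
    """Map a UDF body to a Docker image, failing for unknown IDs."""
    runtime_id = runtime_id_of(udf_body)
    # Dictionary lookup is case sensitive, matching the note that IDs are.
    if runtime_id not in RUNTIMES:
        raise LookupError(f"runtime ID {runtime_id!r} is not in the configuration")
    return RUNTIMES[runtime_id]
```

Calling `image_for` with the PL/R example body returns the image mapped to `plc_r_shared`; a body that names an ID missing from `RUNTIMES` raises `LookupError`, which mirrors the error Greenplum Database returns.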

    @@ -412,224 +459,307 @@ $$ LANGUAGE plcontainer;

    The Greenplum Database utility plcontainer manages the PL/Container configuration files in a Greenplum Database system. The utility ensures that the configuration files are consistent across the Greenplum Database master and segment - hosts.

    - Modifying the configuration files manually might create different, - incompatible configurations on different Greenplum Database segments that could cause - unexpected behavior. + instances.

    + Modifying the configuration files on the segment instances without using + the utility might create different, incompatible configurations on different Greenplum + Database segments that could cause unexpected behavior.

    Configuration changes that are made with the utility are applied to the XML files on all Greenplum Database segments. However, PL/Container configurations of currently running sessions use the configuration that existed during session start up. To update the PL/Container configuration in a running session, execute this command in the session.

    select * from plcontainer_refresh_config; -

    Running the command executes a PL/Container function that updates the configuration on the - master and segment instances.

    -

    When you change the plcontainer_configuration.xml configuration file with - the plcontainer utility, the utility creates a back up of the original - configuration file in the same directory. The backup file name is - plcontainer_configuration.xml.bakYYYYMMDD_hhmmss. - The timestamp of the change is appended to the file name. Using the plcontainer - configure command with the --restore option, you can roll back - the configuration changes to the previous version.

    +

    Running the command executes a PL/Container function that updates the session configuration + on the master and segment instances.

    - plcontainer Utility + The plcontainer Utility

    The plcontainer utility installs Docker images and manages the - PL/Container configuration. The utility consists of two commands.

    -
      -
    • plcontainer configure - Manages the PL/Container configuration file - on the hosts. You can add Docker image information to the PL/Container configuration - file including the image name, location, and shared folder information. You can also - edit the configuration file.
    • -
    • plcontainer install - Install a Docker image in Docker repository and - add the image information to the PL/Container configuration file on each host.
    • + PL/Container configuration. The utility consists of two sets of commands.

      +
        +
      • image-* commands manage Docker images on the Greenplum Database + system hosts.
      • +
      • runtime-* commands manage the PL/Container configuration file on the + Greenplum Database instances. You can add Docker image information to the PL/Container + configuration file including the image name, location, and shared folder information. + You can also edit the configuration file.
      -

      The plcontainer utility - syntax:plcontainer configure {{-n | --name} container-name - {-i | --image} image-location - {-l | --language} language - {-v | --volume} shared-volumes } | - {[-e --editor [editor] } | - { --reset | --restore } | - { | {-s | --show} | - {-f --file} config-file} - [{-y | --yes)] - [--verbose] - -plcontainer install {-n | --name} container-name - {-i | --image} image-location - {-c | --imagename} docker-image - {-l | --language} language - {-v | --volume} shared-volumes +

      To configure PL/Container to use a Docker image, you install the Docker image on all the + Greenplum Database hosts and then add configuration information to the PL/Container + configuration.

      +

      PL/Container configuration values, such as image names, runtime IDs, and parameter + names and values, are case sensitive.

      +
      + plcontainer Syntax + plcontainer [command] [-h | --help] [--verbose] +

      Where command is one of the following.

+ image-add {{-f | --file} image_file} | {{-u | --URL} image_URL}
+ image-delete {-i | --image} image_name
+ image-list
-plcontainer {configure | install} {-h | --help}

+ runtime-add {-r | --runtime} runtime_id
+   {-i | --image} image_name {-l | --language} {python | r}
+   [{-v | --volume} shared_volume [{-v | --volume} shared_volume...]]
+   [{-s | --setting} param=value [{-s | --setting} param=value ...]]
+ runtime-replace {-r | --runtime} runtime_id
+   {-i | --image} image_name -l {r | python}
+   [{-v | --volume} shared_volume [{-v | --volume} shared_volume...]]
+   [{-s | --setting} param=value [{-s | --setting} param=value ...]]
+ runtime-show [{-r | --runtime} runtime_id]
+ runtime-delete {-r | --runtime} runtime_id
+ runtime-edit [{-e | --editor} editor]
+ runtime-backup {-f | --file} config_file
+ runtime-restore {-f | --file} config_file
+ runtime-verify
+
      - Options - - - {-c | --imagename} local-image - The utility installs the Docker image on the Greenplum Database hosts with the - specified Docker name and uses the name in the PL/Container configuration file - element image when creating a container element in the - configuration file. - - - {-e | --editor } [editor] - Open the file plcontainer_configuration.xml with the specified - editor. The default is the vi editor. - Saving the file updates the configuration file on all Greenplum Database hosts and - saves the previous version of the file. - - - {-f | --file} config-file - -

      The utility replaces the existing PL/Container configuration file with the - specified file. Specify the absolute path to a configuration file. The - configuration file is replaced on all Greenplum Database hosts.

      -
      -
      - - {-i | --image} docker-image - Specify a full Docker image. For example - pivotaldata/plcontainer_python:1.0.0.
        -
      • configure - When creating a container entry - in PL/Container configuration this is the value of configuration file element - image. The Docker image must be installed.
      • -
      • install - Installs the Docker image from the specified - location. You can specify a URL to a Docker registry or the absolute path to a - tar.gz file that contains a docker image. When installing a - docker image, the utility uses --imagename - local-image for the value of configuration file - element image.
      • -
      -
      - - {-l | --language} language - -

      Configure PL/Container language type, supported values are - python (PL/Python) and r (PL/R).

      -
      -
      - - {-n | --name} container-name - When adding a container element in the PL/Container configuration file, this is - the value of the name element. You specify the name in the - Greenplum Database UDF on the # container line. For example, this - line in a PL/Container UDF plc_r_shared specifies using the - information in the plc_r_shared container element to create a - Docker container.# container: plc_r_shared - - - --reset - Reset the configuration file to the default. - - - --restore - Restore the previous version of the PL/Container configuration file. - - - -s | --show - Display the contents of the PL/Container configuration file. - - - {-v | --volume} shared-volume - Optional. Specify a Docker volume to bind mount. You can specify multiple volumes - as a comma separated lists of volumes. - The format for a shared volume: - host-dir:container-dir:[rw|ro]. - The information is stored as attributes in the shared_directory - element of the container element in the PL/Container configuration - file.
        -
      • host-dir - absolute path to a directory on the host system. - The Greenplum Database administrator user (gpadmin) must have appropriate access - to the directory.
      • -
      • container-dir - absolute path to a directory in the Docker - container.
      • -
      • [rw|ro] - read-write or read-only access to the host - directory from the container. Information is stored in the configuration file - element shared_directory.
      • -
      - The utility sets a read-only shared volume when the Docker images are installed. - This is the shared-volume that the utility specifies for the - Greenplum PL/R Docker image. - - /usr/local/greenplum-db/./bin/rclient:/clientdir:ro - - This is the shared-volume that the utility specifies for the - Greenplum PL/Python Docker - image./usr/local/greenplum-db/./bin/pyclient:/clientdir:ro - If needed, you can specify other shared directories. Specifying the same shared - directory as the one that is automatically set by the utility will cause a Docker - container startup failure. - When specifying read-write access to host directory, ensure that the specified - host directory has the correct permissions. Also, if a Docker image managed by - PL/Container is configured with read-write access to a host directory, PL/Container - could run multiple Docker containers on a host that change data in the directory. - This might cause issues when running PL/Container user-defined functions that access - the shared directory. -
      - - --verbose - Enable verbose logging. - - - -y | --yes - Continue without confirmation prompts. - - - h | --help - Display help text. - -
      + plcontainer Commands and Options
      + + + image-add location + Install a Docker image on the Greenplum Database hosts. Specify either the location + of the Docker image file on the host or the URL to the Docker image. These are the + supported location options.
        +
      • {-f | --file} image_file Specify the tar + archive file on the host that contains the Docker image. This example points to an + image file in the gpadmin home directory + /home/gpadmin/test_image.tar.gz
      • +
      • {-u | --URL} image_URL Specify the URL of the + Docker repository and image. This example URL points to a local Docker repository + 192.168.0.1:5000/images/mytest_plc_r:devel
      • +
      + After installing the Docker image, use the runtime-add + command to configure PL/Container to use the Docker image. +
+
+ image-delete {-i | --image} image_name
+ Remove an installed Docker image from all Greenplum Database hosts. Specify the full
+ Docker image name including the tag, for example
+ pivotaldata/plcontainer_python_shared:1.0.0
+
+
+ image-list
+ List the Docker images installed on the host. The command lists only the images on
+ the local host, not remote hosts. The command lists all installed Docker images,
+ including images installed with Docker commands.
+
+
+ runtime-add options
+ Add configuration information to the PL/Container configuration file on all
+ Greenplum Database hosts. If the specified runtime_id exists, the
+ utility returns an error and the configuration information is not added.
+ For information about PL/Container configuration, see .
+ These are the supported options:
+
+
+
+ {-i | --image} docker-image
+ Required. Specify the full Docker image name, including the tag, that is
+ installed on the Greenplum Database hosts. For example
+ pivotaldata/plcontainer_python:1.0.0.
+ The utility does not check if the Docker image is installed.
+ The plcontainer image-list command displays installed image
+ information including the name and tag (the Repository and Tag columns).
+
+
+ {-l | --language} python | r
+ Required. Specify the PL/Container language type. Supported values are
+ python (PL/Python) and r (PL/R). When adding
+ configuration information for a new runtime, the utility adds a startup command
+ to the configuration based on the language you specify.
+ Startup command for the Python
+ language: /clientdir/pyclient.sh
+ Startup command for the R
+ language: /clientdir/rclient.sh
+
+
+ {-r | --runtime} runtime_id
+
+ Required. Add the runtime ID. When adding a runtime element
+ in the PL/Container configuration file, this is the value of the
+ id element in the PL/Container configuration file.
+ You specify the name in the Greenplum Database UDF on the #
+ container line. See .
+
+
+ {-s | --setting}
+ param=value
+ Optional. Specify a setting to add to the runtime configuration information.
+ You can specify this option multiple times. The parameter is the XML attribute
+ of the setting element in the PL/Container configuration file.
+ These are valid parameters.
        +
      • memory_mb - Set the memory allocated for the container. + The value is an integer that specifies the amount of memory in MB.
      • +
      • use_network - Set the type of networking for + communication between the container and Greenplum Database. The value is + either yes (use TCP) or no (use IPC). The + default is no.
      • +
      • logs - Enable or disable Docker logging. The value is + either enable (enable logging) or disable + (disable logging, the default).
      • +
      +
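As a rough illustration of the three settings listed above, the following Python sketch checks a list of `param=value` strings before they would be passed with `-s`. It is a hypothetical helper, not part of the plcontainer utility. The accepted names and values are taken from the list above. Treating `memory_mb` as a non-negative decimal integer is an assumption.

```python
from typing import Callable, Dict, Iterable

# Validators for the setting parameters described above.
# Names and values are compared case sensitively.
ALLOWED: Dict[str, Callable[[str], bool]] = {
    "memory_mb": lambda value: value.isdigit(),  # assumption: plain decimal integer
    "use_network": lambda value: value in ("yes", "no"),
    "logs": lambda value: value in ("enable", "disable"),
}


def parse_settings(args: Iterable[str]) -> Dict[str, str]:
    """Turn ['memory_mb=512', ...] into a dict, rejecting unknown input."""
    settings: Dict[str, str] = {}
    for arg in args:
        name, sep, value = arg.partition("=")
        if not sep:
            raise ValueError(f"expected param=value, got {arg!r}")
        check = ALLOWED.get(name)
        if check is None:
            raise ValueError(f"unknown setting {name!r}")
        if not check(value):
            raise ValueError(f"invalid value {value!r} for setting {name!r}")
        settings[name] = value
    return settings
```

For example, `parse_settings(["memory_mb=512", "use_network=yes"])` returns a dictionary with those two entries, while `"Use_Network=yes"` is rejected because names are case sensitive.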
      + + {-v | --volume} shared-volume + Optional. Specify a Docker volume to bind mount. You can specify this option + multiple times to define multiple volumes. + The format for a shared volume: + host-dir:container-dir:[rw|ro]. + The information is stored as attributes in the shared_directory + element of the runtime element in the PL/Container + configuration file.
        +
      • host-dir - absolute path to a directory on the host + system. The Greenplum Database administrator user (gpadmin) must have + appropriate access to the directory.
      • +
      • container-dir - absolute path to a directory in the + Docker container.
      • +
      • [rw|ro] - read-write or read-only access to the host + directory from the container.
      • +
      + When adding configuration information for a new runtime, the utility adds this + read-only shared volume information. + + greenplum-home/bin/plcontainer_clients:/clientdir:ro + + If needed, you can specify other shared directories. The utility returns an + error if the specified container-dir is the same as the one + that is added by the utility, or if you specify multiple shared volumes with the + same container-dir.Allowing read-write + access to a host directory requires special considerations.
        +
      • When specifying read-write access to a host directory, ensure that the + specified host directory has the correct permissions.
      • +
      • When running PL/Container user-defined functions, multiple concurrent + Docker containers that are running on a host could change data in the host + directory. Ensure that the functions support multiple concurrent access to + the data in the host directory.
      • +
      +
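The shared volume rules above (three colon-separated fields, absolute paths, `rw` or `ro`, and no repeated container directory) can be sketched in Python as follows. This is an illustrative check under stated assumptions, not the utility's own validation: the helper names are invented, and `/clientdir` is assumed to be the container directory that the utility reserves, based on the default shared volume shown above.

```python
from typing import Iterable, List, NamedTuple


class SharedVolume(NamedTuple):
    host_dir: str
    container_dir: str
    access: str


# Container directory of the volume that the utility adds itself (assumed from
# the default greenplum-home/bin/plcontainer_clients:/clientdir:ro entry).
RESERVED_CONTAINER_DIR = "/clientdir"


def parse_volume(spec: str) -> SharedVolume:
    """Parse one host-dir:container-dir:[rw|ro] string."""
    parts = spec.split(":")
    if len(parts) != 3:
        raise ValueError(f"expected host-dir:container-dir:[rw|ro], got {spec!r}")
    host_dir, container_dir, access = parts
    if not host_dir.startswith("/") or not container_dir.startswith("/"):
        raise ValueError(f"paths must be absolute in {spec!r}")
    if access not in ("rw", "ro"):
        raise ValueError(f"access must be rw or ro in {spec!r}")
    return SharedVolume(host_dir, container_dir, access)


def check_volumes(specs: Iterable[str]) -> List[SharedVolume]:
    """Parse all volumes and reject duplicate or reserved container dirs."""
    seen = {RESERVED_CONTAINER_DIR}
    volumes: List[SharedVolume] = []
    for spec in specs:
        volume = parse_volume(spec)
        if volume.container_dir in seen:
            raise ValueError(f"container directory {volume.container_dir!r} is already used")
        seen.add(volume.container_dir)
        volumes.append(volume)
    return volumes
```

The sketch compares container directories as plain strings, so `/opt/share` and `/opt/share/` count as different paths here; a stricter check would normalize paths first.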
      +
      +
      +
      + + runtime-backup {-f | --file} config_file + +

      Copies the PL/Container configuration file to the specified file on the + local host.

      +
      +
      + + runtime-delete {-r | --runtime} runtime_id + +

      Removes runtime configuration information in the PL/Container + configuration file on all Greenplum Database instances. The utility returns a + message if the specified runtime_id does not exist in the + file.

      +
      +
      + + runtime-edit [{-e | --editor} editor] + Edit the XML file plcontainer_configuration.xml with the specified + editor. The default editor is vi. + Saving the file updates the configuration file on all Greenplum Database hosts. If + errors exist in the updated file, the utility returns an error and does not update the + file. + + + runtime-replace options + +

      Replaces runtime configuration information in the PL/Container + configuration file on all Greenplum Database instances. If the + runtime_id does not exist, the information is added to the + configuration file. The utility adds a startup command and shared directory to the + configuration.

      +

      See runtime-add for command options and information added to the + configuration.

      +
      +
      + + runtime-restore {-f | --file} config_file + +

      Replaces information in the PL/Container configuration file + plcontainer_configuration.xml on all Greenplum Database instances + with the information from the specified file on the local host.

      +
      +
      + + runtime-show [{-r | --runtime} runtime_id] + +

      Displays formatted PL/Container runtime configuration information. If a + runtime_id is not specified, the configuration for all runtime + IDs is displayed.

      +
      +
      + + runtime-verify + +

      Checks the PL/Container configuration information on the Greenplum + Database instances with the configuration information on the master. If the utility + finds inconsistencies, you are prompted to replace the remote copy with the local + copy. The utility also performs XML validation.

      +
      +
      + + -h | --help + Display help text. If specified without a command, displays help for all + plcontainer commands. If specified with a command, displays help + for the command. + + + --verbose + Enable verbose logging for the command. + +
      Examples

      These are examples of common commands to manage PL/Container:

      -
        -
      • -

        Initialize the Greenplum Database installation with default configuration file - after installing a PL/Container - package:plcontainer configure --reset

        -
      • -
      -
        -
      • -

        Edit the configuration in an interactive editor of your - choice:plcontainer configure -e vim

        -
      • -
      -
        -
      • -

        Show the current configuration - file:plcontainer configure --show

        -
      • -
      -
        -
      • -

        Restore the previous configuration from a - backup:plcontainer configure --restore

        -
      • -
      -
        -
      • -

        Overwrite the PL/Container configuration file with an XML - file:plcontainer configure -f new_plcontainer_configuration.xml

        -
      • -
      -
        -
      • -

        Add a container entry to the PL/Container configuration - file:plcontainer configure -n plc_python_newpy -l python - -i pivotaldata/plc_python_newimage:latest

        -
      • -
      -
        -
      • -

        Install a Docker image and add a container entry for the image in the PL/Container - configuration - file.plcontainer install -n plc_r_newr -i plc_newr.tar.gz -c pivotaldata/plc_r_newr:latest - -l r

        -
      • +
          +
        • Install a Docker image on all Greenplum Database hosts. This example loads a Docker + image from a file. The utility displays progress information on the command line as + it installs the Docker image on all the + hosts: plcontainer image-add -f plc_newr.tar.gz

          After + installing the Docker image, you add or update a runtime entry in the PL/Container + configuration file to give PL/Container access to the Docker image to start Docker + containers.

        • +
        • Add a container entry to the PL/Container configuration file. This example adds + configuration information for a PL/R runtime, and specifies a shared volume and + settings for memory and network. + plcontainer runtime-add -r runtime2 -i test_image2:0.1 -l r \ + -v /host_dir2/shared2:/container_dir2/shared2:ro \ + -s memory_mb=512 -s use_network=yes

          The + utility displays progress information on the command line as it adds the runtime + configuration to the configuration file and distributes the updated configuration to + all instances.

        • +
        • Show the configuration for a specific runtime ID in the configuration + file: plcontainer runtime-show -r plc_python_shared

          The + utility displays configuration information similar to this + output:
PL/Container Runtime Configuration:
+---------------------------------------------------------
+ Runtime ID: plc_python_shared
+ Linked Docker Image: test1:latest
+ Runtime Setting(s):
+ Shared Directory:
+ ---- Shared Directory From HOST '/usr/local/greenplum-db/bin/plcontainer_clients' to Container '/clientdir', access mode is 'ro'
+ ---- Shared Directory From HOST '/home/gpadmin/share/' to Container '/opt/share', access mode is 'rw'
+---------------------------------------------------------

        • +
        • Edit the configuration in an interactive editor of your choice. This example edits + the configuration file with the vim + editor: plcontainer runtime-edit -e vim

          When you save the + file, the utility displays progress information on the command line as it + distributes the file to the Greenplum Database hosts.

        • +
        • Save the current PL/Container configuration to a file. This example saves the file + to the local file + /home/gpadmin/saved_plc_config.xml: plcontainer runtime-backup -f /home/gpadmin/saved_plc_config.xml
        • +
        • Overwrite the PL/Container configuration file with an XML file. This example replaces + the information in the configuration file with the information from the file in the + /home/gpadmin + directory: plcontainer runtime-restore -f /home/gpadmin/new_plcontainer_configuration.xml
The + utility displays progress information on the command line as it distributes the + updated file to the Greenplum Database instances.
      @@ -637,29 +767,40 @@ $$ LANGUAGE plcontainer;

      PL/Container Configuration File -

      The default PL/Container configuration file is in - $GPHOME/share/postgresql/plcontainer/plcontainer_configuration.xml of - each host. The PL/Container configuration file is an XML file. In the XML file, the root - element configuration contains a one or more container - elements, one element for each PL/Container language in the Greenplum Database - installation. - <configuration> - <container> - <name>plc_python_shared</name> - <image>pivotaldata/plcontainer_python:1.0.0</image> - <command>./client</command> - <memory_mb>128</memory_mb> - <use_network>no</use_network> - <shared_directory access="ro" container="/clientdir" host="/path/to/pyclient"/> - </container> - <container> - <name>plc_r</name> - <image>pivotaldata/plcontainer_r:1.0.0</image> - <command>/rclient.sh</command> - <memory_mb>256</memory_mb> - <use_network>yes</use_network> - <shared_directory access="ro" container="/clientdir" host="/usr/local/greenplum-db/./bin/rclient"/> - </container> +

      PL/Container maintains a configuration file + plcontainer_configuration.xml in the data directory of all Greenplum + Database segments. The PL/Container configuration file is an XML file. In the XML file, + the root element configuration contains one or more + runtime elements. You specify the id of the + runtime element in the # container: line of a + PL/Container function definition.

      +

      In an XML file, names, such as element and attribute names, and values are case + sensitive.
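The link between a function's `# container:` line and a runtime `id` can be sketched in Python. This is a minimal illustration, not PL/Container's own lookup: the helper names `runtime_id_of` and `is_configured` are hypothetical, the regular expression assumes the runtime ID is the only token after `# container:` on its line, and the embedded XML is modeled on the example file in this section. The comparison is exact, so it also reflects the case sensitivity of XML values.

```python
import re
import xml.etree.ElementTree as ET

# Hypothetical configuration, modeled on the example file in this section.
CONFIG_XML = """<?xml version="1.0" ?>
<configuration>
  <runtime>
    <id>plc_python_example1</id>
    <image>pivotaldata/plcontainer_python_with_clients:0.1</image>
    <command>./pyclient</command>
  </runtime>
</configuration>"""

# Assumption: the runtime ID is the single token that follows "# container:".
CONTAINER_LINE = re.compile(r"^\s*#\s*container\s*:\s*(\S+)\s*$", re.MULTILINE)


def runtime_id_of(function_body):
    """Return the runtime ID named in the function body, or None if absent."""
    match = CONTAINER_LINE.search(function_body)
    return match.group(1) if match else None


def is_configured(function_body, config_xml):
    """Check that the function's runtime ID matches an <id> value exactly."""
    root = ET.fromstring(config_xml)
    ids = {el.text.strip() for el in root.iter("id") if el.text}
    return runtime_id_of(function_body) in ids


if __name__ == "__main__":
    body = "# container: plc_python_example1\nreturn 1"
    print(runtime_id_of(body), is_configured(body, CONFIG_XML))
```

A function body that names `PLC_PYTHON_EXAMPLE1` would not match in this sketch, because the comparison does not fold case.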

      +

      This is an example + file:
<?xml version="1.0" ?>
+<configuration>
+  <runtime>
+    <id>plc_python_example1</id>
+    <image>pivotaldata/plcontainer_python_with_clients:0.1</image>
+    <command>./pyclient</command>
+  </runtime>
+  <runtime>
+    <id>plc_python_example2</id>
+    <image>pivotaldata/plcontainer_python_without_clients:0.1</image>
+    <command>/clientdir/pyclient.sh</command>
+    <shared_directory access="ro" container="/clientdir" host="/usr/local/greenplum-db/bin/plcontainer_clients"/>
+    <setting memory_mb="512"/>
+    <setting use_network="yes"/>
+    <setting logs="enable"/>
+  </runtime>
+  <runtime>
+    <id>plc_r_example</id>
+    <image>pivotaldata/plcontainer_r_without_clients:0.2</image>
+    <command>/clientdir/rclient.sh</command>
+    <shared_directory access="ro" container="/clientdir" host="/usr/local/greenplum-db/bin/plcontainer_clients"/>
+    <setting logs="enable"/>
+  </runtime>
 </configuration>

      These are the XML elements and attributes in a PL/Container configuration file.

      @@ -668,97 +809,119 @@ $$ LANGUAGE plcontainer;

      Root element for the XML file. - container - One element for each specific container available in the system. Child element of - the configuration element. + runtime + One element for each specific container available in the system. These are child + elements of the configuration element. - name - Required. The value is used to reference a Docker container from a function. - Only containers defined in the PL/Container configuration file can be specified - in PL/Container functions. A Docker container cannot be referenced by its full - Docker name (container ID) for security reasons. This name must be unique in the - configuration file. + id + Required. The value is used to reference a Docker container from a + PL/Container user-defined function. The id value must be unique + in the configuration. + The id specifies which Docker image to use when PL/Container + creates a Docker container to execute a user-defined function. - container_id + image

      Required. The value is the full Docker image name, including the image tag, specified the same way as when you start the container in Docker. The configuration can contain many container objects that reference the same image name; in Docker, they are represented by identical containers.

      -

      For example, you might have two containers named - plc_python_128 and plc_python_256, both - referencing the Docker image - pivotaldata/plcontainer_python:1.0.0, but first one with - 128MB RAM limit and the second one with 256MB limit that is specified by the - memory_mb element.

      +

      For example, you might have two runtime elements, with + different id elements, plc_python_128 and + plc_python_256, both referencing the Docker image + pivotaldata/plcontainer_python:1.0.0. The first + runtime specifies a 128MB RAM limit and the second one + specifies a 256MB limit. Each limit is set by the memory_mb + attribute of a setting element.

      command Required. The value is the command to be run inside of container to start the - client process inside in the container. - You should modify it only if you build your custom container and want to + client process in the container. When creating a runtime + element, the plcontainer utility adds a + command element based on the language (the + -l option). + command element for the python + language: <command>/clientdir/pyclient.sh</command> + command element for the R + language: <command>/clientdir/rclient.sh</command> + You should modify the value only if you build a custom container and want to implement some additional initialization logic before the container - starts.This element cannot be set with the plcontainer - install command. You can update the configuration file with the - with the plcontainer configure -e command. - - - memory_mb - The value specifies the amount of memory container is allowed to use, in MB. - Each container is started with this amount of RAM and twice the amount of swap - space. The container memory consumption is limited by the host system - cgroupsconfiguration, which means in case of memory - overcommit, the container is killed by the System.You can add this element - by editing the configuration file with the plcontainer configure - -e command. + starts. This element cannot be set with the plcontainer + utility. You can update the configuration file with the + plcontainer runtime-edit command.
      shared_directory - Required. This element specifies one or more shared directories for a - container, with different sharing options. There must be at least one shared - directory between client location and the directory in the container, - /clientdir usually in the Pivotal provided image. - XML attributes allowed:
        -
      • host - specifies a shared directory location on the host - system.
      • -
      • container - specifies a directory location inside of + Optional. This element specifies a Docker shared volume for a container + with access information. Multiple shared_directory elements are + allowed. Each shared_directory element specifies a single + shared volume. XML attributes for the shared_directory + element:
          +
        • host - a directory location on the host system.
        • +
        • container - a directory location inside of container.
        • -
        • access - specifies access level to this shared directory, - which can be either ro (read-only) or rw - (read-write).
        • +
        • access - access level to the host directory, which can be + either ro (read-only) or rw (read-write). +
        - The plcontainer utility sets a read-only shared volume when - the Docker images are installed. - This is the shared_directory element that the utility creates - for the Greenplum PL/R Docker image. - - <shared_directory access="ro" container="/clientdir" host="/usr/local/greenplum-db/./bin/rclient"/> - - This is the shared_directory element that the utility creates - for the Greenplum PL/Python Docker - image.<shared_directory access="ro" container="/clientdir" host="/usr/local/greenplum-db/./bin/pyclient"/> - If needed, you can specify other shared directories. Specifying the same - shared directory as the one that is automatically set by the utility will cause - a Docker container startup failure. - When specifying read-write access to host directory, ensure that the specified - host directory has the correct permissions. Also, if a PL/Container - container is configured with read-write access to a host - directory, PL/Container could run multiple Docker containers on a host that - change data in the directory. This might cause issues when running PL/Container - user-defined functions that access the shared directory. + When creating a runtime element, the + plcontainer utility adds a shared_directory + element.<shared_directory access="ro" container="/clientdir" host="/usr/local/greenplum-db/bin/plcontainer_clients"/> + For each runtime element, the container + attribute of the shared_directory elements must be unique. For + example, a runtime element cannot have two + shared_directory elements with attribute + container="/clientdir". Allowing + read-write access to a host directory requires special consideration.
          +
          • When specifying read-write access to a host directory, ensure that the + specified host directory has the correct permissions.
        • +
          • When running PL/Container user-defined functions, multiple concurrent + Docker containers that are running on a host could change data in the host + directory. Ensure that the functions support concurrent access to + the data in the host directory.
        • +
        - use_network - -

        Optional. The value can be either yes or no - to specify whether use TCP or IPC for - communication between the Greenplum Database process and the Docker container - process. The default is no use IPC.

        -
        + setting + Optional. This element specifies Docker container configuration information. + Each setting element contains one attribute. The element + attribute specifies logging, memory, or networking information. For example, + this element enables + logging: <setting logs="enable"/> + These are the valid attributes. + + logs="{enable | disable}" + Enables or disables Docker logging for the container. The attribute + logs="enable" enables logging. The attribute + logs="disable" disables logging (the default). + On Red Hat 7 or CentOS 7 systems, the log is sent to the + journald service. On Red Hat 6 or CentOS 6 systems, the + log is sent to the syslogd service. + + + memory_mb="size" + Optional. The value specifies the amount of memory, in MB, that a + container is allowed to use. Each container is started with this amount of + RAM and twice the amount of swap space. The container memory consumption + is limited by the host system cgroups configuration, + which means in case of memory overcommit, the container is killed by the + system. + + + use_network="{yes | no}" + The value can be either yes or no to + specify whether to use TCP (the value yes) or IPC (the + value no) for communication between the Greenplum + Database process and the Docker container process. The default is + no (use IPC). + +
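The element and attribute rules described above can be checked before a file is handed to the utility. The following Python sketch is one possible pre-check, not the validation that `plcontainer runtime-verify` performs. The function name `check_configuration` and the message texts are hypothetical, and treating `memory_mb` as a positive integer is an assumption, since the text only gives it as a size in MB.

```python
import xml.etree.ElementTree as ET

ALLOWED_ACCESS = {"ro", "rw"}
# None means "no fixed value list"; memory_mb is assumed to be a positive integer.
ALLOWED_SETTINGS = {
    "logs": {"enable", "disable"},
    "use_network": {"yes", "no"},
    "memory_mb": None,
}


def check_configuration(config_xml):
    """Return a list of problems; an empty list means none were detected."""
    problems = []
    root = ET.fromstring(config_xml)
    if root.tag != "configuration":
        problems.append("root element must be <configuration>")
    seen_ids = set()
    for runtime in root.findall("runtime"):
        rid = (runtime.findtext("id") or "").strip()
        label = rid or "<missing id>"
        if not rid:
            problems.append("runtime without an <id>")
        elif rid in seen_ids:
            problems.append(f"duplicate id: {rid}")
        seen_ids.add(rid)
        # id, image, and command are the elements marked Required.
        for required in ("image", "command"):
            if not (runtime.findtext(required) or "").strip():
                problems.append(f"{label}: missing <{required}>")
        # The container attribute must be unique within one runtime.
        containers = set()
        for shared in runtime.findall("shared_directory"):
            access = shared.get("access")
            if access not in ALLOWED_ACCESS:
                problems.append(f"{label}: invalid access {access!r}")
            target = shared.get("container")
            if target in containers:
                problems.append(f"{label}: duplicate container path {target}")
            containers.add(target)
        # Each setting element carries exactly one attribute.
        for setting in runtime.findall("setting"):
            if len(setting.attrib) != 1:
                problems.append(f"{label}: setting must have one attribute")
                continue
            name, value = next(iter(setting.attrib.items()))
            allowed = ALLOWED_SETTINGS.get(name)
            if name not in ALLOWED_SETTINGS:
                problems.append(f"{label}: unknown setting {name}")
            elif allowed is None:
                if not value.isdigit() or int(value) <= 0:
                    problems.append(f"{label}: invalid value {value!r} for setting {name}")
            elif value not in allowed:
                problems.append(f"{label}: invalid value {value!r} for setting {name}")
    return problems
```

Running `check_configuration` on a file with two `shared_directory` elements that both use `container="/clientdir"` in one runtime would report a duplicate container path in this sketch.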
        @@ -769,31 +932,35 @@ $$ LANGUAGE plcontainer;

        Updating the PL/Container Configuration -

        You can add a container element to the PL/Container configuration file - with the plcontainer configure command specifying options with options - that specify values such as the name, Docker image, command, and shared directory. You can - use the plcontainer configure command with the -e option to edit the - configuration file. The utility updates the configuration file on all hosts.

        -

        The PL/Container configuration file can contain multiple container +

        You can add a runtime element to the PL/Container configuration file + with the plcontainer runtime-add command. The command options specify + information such as the runtime ID, Docker image, and language. You can use the + plcontainer runtime-replace command to update an existing + runtime element. The utility updates the configuration file on the + master and all segment instances.

        +

        The PL/Container configuration file can contain multiple runtime elements that reference the same Docker image specified by the XML element - image. In the example configuration file, the image - specifies contains container elements named - plc_python_128 and plc_python_256, both referencing - the Docker container pivotaldata/plcontainer_python:1.0.0. The first - element is defined with a 128MB RAM limit and the second one with a 256MB RAM limit.

        + image. In the example configuration file, the runtime + elements contain id elements named plc_python_128 and + plc_python_256, both referencing the Docker image + pivotaldata/plcontainer_python:1.0.0. The first + runtime element is defined with a 128MB RAM limit and the second one + with a 256MB RAM limit.

        <configuration>
-  <container>
-    <name>plc_python_128</name>
+  <runtime>
+    <id>plc_python_128</id>
     <image>pivotaldata/plcontainer_python:1.0.0</image>
     <command>./client</command>
-    <memory_mb>128</memory_mb>
-  </container>
-  <container>
-    <name>plc_python_256</name>
-    <cimage>pivotaldata/plcontainer_python:1.0.0</image>
+    <shared_directory access="ro" container="/clientdir" host="/usr/local/gpdb/bin/plcontainer_clients"/>
+    <setting memory_mb="128"/>
+  </runtime>
+  <runtime>
+    <id>plc_python_256</id>
+    <image>pivotaldata/plcontainer_python:1.0.0</image>
     <command>./client</command>
-    <memory_mb>256</memory_mb>
-  </container>
+    <shared_directory access="ro" container="/clientdir" host="/usr/local/gpdb/bin/plcontainer_clients"/>
+    <setting memory_mb="256"/>
+  </runtime>
 </configuration>
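To see which runtime IDs share a Docker image and how their memory limits differ, the configuration can be grouped by `image`. The Python sketch below is illustrative only: the function name `memory_limits_by_image` is hypothetical, and the embedded XML repeats the two runtimes from the example above.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict

# The two runtimes from the example above, both referencing one image.
CONFIG_XML = """<configuration>
  <runtime>
    <id>plc_python_128</id>
    <image>pivotaldata/plcontainer_python:1.0.0</image>
    <command>./client</command>
    <shared_directory access="ro" container="/clientdir" host="/usr/local/gpdb/bin/plcontainer_clients"/>
    <setting memory_mb="128"/>
  </runtime>
  <runtime>
    <id>plc_python_256</id>
    <image>pivotaldata/plcontainer_python:1.0.0</image>
    <command>./client</command>
    <shared_directory access="ro" container="/clientdir" host="/usr/local/gpdb/bin/plcontainer_clients"/>
    <setting memory_mb="256"/>
  </runtime>
</configuration>"""


def memory_limits_by_image(config_xml):
    """Map each Docker image to {runtime id: memory_mb, or None if unset}."""
    limits = defaultdict(dict)
    for runtime in ET.fromstring(config_xml).findall("runtime"):
        rid = runtime.findtext("id").strip()
        image = runtime.findtext("image").strip()
        memory = None
        for setting in runtime.findall("setting"):
            if "memory_mb" in setting.attrib:
                memory = int(setting.attrib["memory_mb"])
        limits[image][rid] = memory
    return dict(limits)


if __name__ == "__main__":
    print(memory_limits_by_image(CONFIG_XML))
    # {'pivotaldata/plcontainer_python:1.0.0': {'plc_python_128': 128, 'plc_python_256': 256}}
```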
        @@ -801,15 +968,17 @@ $$ LANGUAGE plcontainer;

        Notes
          -
        • PL/Container configuration file plcontainer_configuration.xml is - stored in all the Greenplum Database data directories for all the Greenplum Database - segment instances: master, standby master, primary and mirror. This query lists the - Greenplum Database system data - directories:select g.hostname, fe.fselocation as directory - from pg_filespace as f, pg_filespace_entry as fe, - gp_segment_configuration as g - where f.oid = fe.fsefsoid and g.dbid = fe.fsedbid - and f.fsname = 'pg_system';
        • +
        • PL/Container maintains the configuration file + plcontainer_configuration.xml in the data directory of all Greenplum + Database segment instances: master, standby master, primary, and mirror. This query + lists the Greenplum Database system data + directories:
SELECT g.hostname, fe.fselocation AS directory
+  FROM pg_filespace AS f, pg_filespace_entry AS fe,
+    gp_segment_configuration AS g
+  WHERE f.oid = fe.fsefsoid AND g.dbid = fe.fsedbid
+    AND f.fsname = 'pg_system';

          A + sample PL/Container configuration file is in + $GPHOME/share/postgresql/plcontainer.

        • In some cases, when PL/Container is running in a high concurrency environment, the Docker daemon hangs with log entries that indicate a memory shortage. This can happen even when the system seems to have adequate free memory.

          The issue seems to be -- GitLab