Commit 4bddf0e6 authored by David Yozie, committed by GitHub

DOCS: Adding security guide source (#2698)

* DOCS: Adding security guide source

* Proposed updates from review
Parent 464ecf37
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE map PUBLIC "-//OASIS//DTD DITA Map//EN" "map.dtd">
<map>
<title>Security Configuration Guide</title>
<topicref href="topics/preface.xml" id="preface"/>
<topicref href="topics/SecuringGPDB.xml"/>
<topicref href="topics/ports_and_protocols.xml"/>
<topicref href="topics/Authenticate.xml"/>
<topicref href="topics/Authorization.xml"/>
<topicref href="topics/gpcc.xml" otherprops="pivotal"/>
<topicref href="topics/Auditing.xml"/>
<topicref href="topics/Encryption.xml"/>
<topicref href="topics/kerberos-hdfs.xml"/>
<topicref href="topics/BestPractices.xml"/>
</map>
<?xml version="1.0" encoding="utf-8"?>
<val>
<prop att="otherprops" val="op-draft" action="exclude"/>
<prop att="otherprops" val="op-help" action="exclude"/>
<prop att="otherprops" val="op-hidden" action="exclude"/>
<prop att="otherprops" val="op-print" action="include"/>
</val>
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE topic PUBLIC "-//OASIS//DTD DITA Topic//EN" "topic.dtd">
<topic id="topic_ivr_cs2_jr">
<title>Configuring Database Authorization</title>
<body>
<p>Authorization governs access to Greenplum Database database objects. </p>
</body>
<topic id="topic_k35_qtx_kr">
<title>Access Permissions and Roles</title>
<body>
<p>Greenplum Database manages database access permissions using <i>roles</i>. The concept of
roles subsumes the concepts of users and groups. A role can be a database user, a group, or
both. Roles can own database objects (for example, tables) and can assign privileges on
those objects to other roles to control access to the objects. Roles can be members of other
roles, thus a member role can inherit the object privileges of its parent role. </p>
<p>Every Greenplum Database system contains a set of database roles (users and groups). Those
roles are separate from the users and groups managed by the operating system on which the
server runs. However, for convenience you may want to maintain a relationship between
operating system user names and Greenplum Database role names, since many of the client
applications use the current operating system user name as the default. </p>
    <p> In Greenplum Database, users log in and connect through the master instance, which
      verifies their role and access privileges. The master then issues commands to the
      segment instances behind the scenes using the currently logged-in role. </p>
<p> Roles are defined at the system level, so they are valid for all databases in the system. </p>
<p> To bootstrap the Greenplum Database system, a freshly initialized system always contains
one predefined superuser role (also referred to as the system user). This role will have the
same name as the operating system user that initialized the Greenplum Database system.
Customarily, this role is named <codeph>gpadmin</codeph>. To create more roles you first
must connect as this initial role. </p>
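The role-membership model described above (a role can be a user, a group, or both, and member roles inherit the object privileges of their parent roles) can be sketched as a small Python model. This is illustrative only; the `Role` class and its methods are invented for this sketch and are not a Greenplum API.

```python
# Minimal illustrative model of Greenplum roles: a role can be a user,
# a group, or both, and members inherit privileges from parent roles.
class Role:
    def __init__(self, name, can_login=False):
        self.name = name
        self.can_login = can_login        # the LOGIN attribute
        self.privileges = set()           # privileges granted directly
        self.memberships = []             # parent (group) roles

    def grant_membership(self, group):
        self.memberships.append(group)

    def effective_privileges(self):
        # Direct privileges plus everything inherited from parent roles.
        privs = set(self.privileges)
        for parent in self.memberships:
            privs |= parent.effective_privileges()
        return privs

admins = Role("admins")                      # a group role (no LOGIN)
admins.privileges.add(("mytable", "SELECT"))
jsmith = Role("jsmith", can_login=True)      # a user role
jsmith.grant_membership(admins)
print(jsmith.effective_privileges())         # inherits SELECT on mytable
```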
</body>
</topic>
<topic id="managing_obj_priv">
<title>Managing Object Privileges</title>
<body>
<p> When an object (table, view, sequence, database, function, language, schema, or
tablespace) is created, it is assigned an owner. The owner is normally the role that
executed the creation statement. For most kinds of objects, the initial state is that only
the owner (or a superuser) can do anything with the object. To allow other roles to use it,
privileges must be granted. Greenplum Database supports the following privileges for each
object type: </p>
<table>
        <tgroup cols="2">
<colspec colwidth="50*" align="left"/>
<colspec colwidth="50*" align="left"/>
<thead>
<row>
<entry> Object Type </entry>
<entry> Privileges </entry>
</row>
</thead>
<tbody>
<row>
<entry> Tables, Views, Sequences </entry>
<entry>
<sl>
<sli>
<codeph>SELECT</codeph>
</sli>
<sli>
<codeph>INSERT</codeph>
</sli>
<sli>
<codeph>UPDATE</codeph>
</sli>
<sli>
<codeph>DELETE</codeph>
</sli>
<sli>
<codeph>RULE</codeph>
</sli>
<sli>
<codeph>ALL</codeph>
</sli>
</sl>
</entry>
</row>
<row>
<entry> External Tables </entry>
<entry>
<sl>
<sli>
<codeph>SELECT</codeph>
</sli>
<sli>
<codeph>RULE</codeph>
</sli>
<sli>
<codeph>ALL</codeph>
</sli>
</sl>
</entry>
</row>
<row>
<entry> Databases </entry>
<entry>
<sl>
<sli>
<codeph>CONNECT</codeph>
</sli>
<sli>
<codeph>CREATE</codeph>
</sli>
<sli>
<codeph>TEMPORARY | TEMP</codeph>
</sli>
<sli>
<codeph>ALL</codeph>
</sli>
</sl>
</entry>
</row>
<row>
<entry> Functions </entry>
<entry>
<codeph>EXECUTE</codeph>
</entry>
</row>
<row>
<entry> Procedural Languages </entry>
<entry>
<codeph>USAGE</codeph>
</entry>
</row>
<row>
<entry> Schemas </entry>
<entry>
<sl>
<sli>
<codeph>CREATE</codeph>
</sli>
<sli>
<codeph>USAGE</codeph>
</sli>
<sli>
<codeph>ALL</codeph>
</sli>
</sl>
</entry>
</row>
</tbody>
</tgroup>
</table>
<p> Privileges must be granted for each object individually. For example, granting
<codeph>ALL</codeph> on a database does not grant full access to the objects within that
database. It only grants all of the database-level privileges (<codeph>CONNECT</codeph>,
<codeph>CREATE</codeph>, <codeph>TEMPORARY</codeph>) to the database itself. </p>
<p> Use the <codeph>GRANT</codeph> SQL command to give a specified role privileges on an
object. For example: <codeblock>=# GRANT INSERT ON mytable TO jsmith; </codeblock></p>
<p> To revoke privileges, use the <codeph>REVOKE</codeph> command. For example:
<codeblock>=# REVOKE ALL PRIVILEGES ON mytable FROM jsmith; </codeblock></p>
<p> You can also use the <codeph>DROP OWNED</codeph> and <codeph>REASSIGN OWNED</codeph>
commands for managing objects owned by deprecated roles. (Note: only an object's owner or a
superuser can drop an object or reassign ownership.) For example:
<codeblock> =# REASSIGN OWNED BY sally TO bob;
=# DROP OWNED BY visitor; </codeblock></p>
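As a rough illustration of the grant/revoke semantics above (only the owner or a superuser can use a new object until privileges are granted), here is a toy Python access-control list. The `Catalog` class is invented for this sketch and is not Greenplum code.

```python
# Toy ACL illustrating GRANT/REVOKE semantics: only the owner (or a
# superuser) can use a new object until privileges are granted.
class Catalog:
    def __init__(self):
        self.owners = {}           # object -> owning role
        self.acl = {}              # (object, role) -> set of privileges

    def create(self, obj, owner):
        self.owners[obj] = owner

    def grant(self, priv, obj, role):
        self.acl.setdefault((obj, role), set()).add(priv)

    def revoke_all(self, obj, role):
        self.acl.pop((obj, role), None)

    def has_privilege(self, role, priv, obj, superusers=()):
        # Owners and superusers bypass the ACL entirely.
        if role in superusers or self.owners.get(obj) == role:
            return True
        return priv in self.acl.get((obj, role), set())

cat = Catalog()
cat.create("mytable", "sally")
cat.grant("INSERT", "mytable", "jsmith")   # GRANT INSERT ON mytable TO jsmith
print(cat.has_privilege("jsmith", "INSERT", "mytable"))   # True
cat.revoke_all("mytable", "jsmith")        # REVOKE ALL PRIVILEGES ... FROM jsmith
print(cat.has_privilege("jsmith", "INSERT", "mytable"))   # False
```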
</body>
</topic>
<topic id="using-ssh-256">
    <title>Using SHA-256 Encryption</title>
<body>
<p>Greenplum Database access control corresponds roughly to the Orange Book 'C2' level of
security, not the 'B1' level. Greenplum Database currently supports access privileges at the
object level. Row-level or column-level access is not supported, nor is labeled security. </p>
<p>Row-level and column-level access can be simulated using views to restrict the columns
and/or rows that are selected. Row-level labels can be simulated by adding an extra column
to the table to store sensitivity information, and then using views to control row-level
access based on this column. Roles can then be granted access to the views rather than the
        base table. While these workarounds do not provide the same protection as "B1" level
        security, they may still be a viable alternative for many organizations. </p>
<p> To use SHA-256 encryption, you must set a parameter either at the system or the session
level. This section outlines how to use a server parameter to implement SHA-256 encrypted
password storage. Note that in order to use SHA-256 encryption for storage, the client
authentication method must be set to password rather than the default, MD5. (See <xref
href="Authenticate.xml#topic_fzv_wb2_jr/config_ssl_client_conn"/> for more details.) This
means that the password is transmitted in clear text over the network, so we highly
recommend that you set up SSL to encrypt the client server communication channel. </p>
<p> You can set your chosen encryption method system-wide or on a per-session basis. The
available encryption methods are SHA-256 and MD5 (for backward compatibility). </p>
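To make the two storage formats concrete, the following Python sketch computes both hash styles. The MD5 format (the literal prefix `md5` plus the MD5 of the password concatenated with the user name) follows PostgreSQL's documented convention; the SHA-256 line assumes a plain, unsalted hash with a `sha256` prefix, matching the 64-hexadecimal-character form shown later in this section. Verify the exact salting against your Greenplum version before relying on this sketch.

```python
import hashlib

def md5_stored_hash(password, username):
    # PostgreSQL-style MD5 storage: 'md5' + md5(password || username)
    return "md5" + hashlib.md5((password + username).encode()).hexdigest()

def sha256_stored_hash(password):
    # Assumed Greenplum SHA-256 storage: 'sha256' + 64 hex characters
    # (unsalted here for illustration; check your version's behavior)
    return "sha256" + hashlib.sha256(password.encode()).hexdigest()

h = sha256_stored_hash("testdb12345#")
print(md5_stored_hash("testdb12345#", "testdb"))
print(h)   # 'sha256' followed by 64 hexadecimal characters
```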
</body>
<topic id="system-wide">
<title>Setting Encryption Method System-wide</title>
<body>
<p>To set the <codeph>password_hash_algorithm</codeph> server parameter on a complete
Greenplum system (master and its segments): <ol id="ol_hcg_hw2_jr">
<li> Log in to your Greenplum Database instance as a superuser. </li>
<li> Execute <codeph>gpconfig</codeph> with the <codeph>password_hash_algorithm</codeph>
set to
SHA-256:<codeblock>$ gpconfig -c password_hash_algorithm -v 'SHA-256' </codeblock></li>
          <li> Verify the setting: <codeblock>$ gpconfig -s password_hash_algorithm</codeblock><p> You will see:
              <codeblock>Master value: SHA-256
Segment value: SHA-256 </codeblock></p></li>
</ol></p>
</body>
</topic>
<topic id="individual_session">
<title>Setting Encryption Method for an Individual Session</title>
<body>
<p> To set the <codeph>password_hash_algorithm</codeph> server parameter for an individual
session: </p>
<ol id="ol_iv1_cvx_kr">
<li> Log in to your Greenplum Database instance as a superuser. </li>
<li> Set the <codeph>password_hash_algorithm</codeph> to
SHA-256:<codeblock># set password_hash_algorithm = 'SHA-256'
</codeblock></li>
<li> Verify the setting: <codeblock># show password_hash_algorithm;</codeblock><p> You
will see: </p><codeblock>SHA-256 </codeblock></li>
</ol>
<p> Following is an example of how the new setting works: </p>
<ol id="ol_ex1_cvx_kr">
<li> Log in as a super user and verify the password hash algorithm
setting:<codeblock># show password_hash_algorithm
password_hash_algorithm
-------------------------------
SHA-256</codeblock></li>
          <li> Create a new role with a password and login privileges.
            <codeblock>create role testdb with password 'testdb12345#' LOGIN; </codeblock></li>
<li> Change the client authentication method to allow for storage of SHA-256 encrypted
passwords:<p> Open the <codeph>pg_hba.conf</codeph> file on the master and add the
following line: </p><codeblock>host all testdb 0.0.0.0/0 password
</codeblock></li>
<li> Restart the cluster. </li>
<li> Log in to the database as the user just created, <codeph>testdb</codeph>.
<codeblock>psql -U testdb</codeblock></li>
<li> Enter the correct password at the prompt. </li>
          <li> Verify that the password is stored as a SHA-256 hash. <p> Password hashes are stored
              in <codeph>pg_authid.rolpassword</codeph>. </p></li>
<li> Log in as the super user. </li>
<li> Execute the following query:
<codeblock>
# SELECT rolpassword FROM pg_authid WHERE rolname = 'testdb';
Rolpassword
-----------
 sha256&lt;64 hexadecimal characters&gt;
</codeblock></li>
</ol>
</body>
</topic>
</topic>
<topic id="time-based-restriction">
<title>Restricting Access by Time</title>
<body>
<p> Greenplum Database enables the administrator to restrict access to certain times by role.
Use the <codeph>CREATE ROLE</codeph> or <codeph>ALTER ROLE</codeph> commands to specify
time-based constraints. </p>
<p> Access can be restricted by day or by day and time. The constraints are removable without
deleting and recreating the role. </p>
<p> Time-based constraints only apply to the role to which they are assigned. If a role is a
member of another role that contains a time constraint, the time constraint is not
inherited. </p>
<p> Time-based constraints are enforced only during login. The <codeph>SET ROLE</codeph> and
<codeph>SET SESSION AUTHORIZATION</codeph> commands are not affected by any time-based
constraints. </p>
<p> Superuser or <codeph>CREATEROLE</codeph> privileges are required to set time-based
constraints for a role. No one can add time-based constraints to a superuser. </p>
<p> There are two ways to add time-based constraints. Use the keyword <codeph>DENY</codeph> in
the <codeph>CREATE ROLE</codeph> or <codeph>ALTER ROLE</codeph> command followed by one of
the following. <ul id="ul_bjq_jz2_jr">
<li> A day, and optionally a time, when access is restricted. For example, no access on
Wednesdays. </li>
<li> An interval—that is, a beginning and ending day and optional time—when access is
restricted. For example, no access from Wednesday 10 p.m. through Thursday at 8 a.m.
</li>
</ul></p>
<p> You can specify more than one restriction; for example, no access Wednesdays at any time
and no access on Fridays between 3:00 p.m. and 5:00 p.m.  </p>
<p> There are two ways to specify a day. Use the word <codeph>DAY</codeph> followed by either
the English term for the weekday, in single quotation marks, or a number between 0 and 6, as
shown in the table below. </p>
<table id="table_az1_cvx_kr">
        <tgroup cols="2">
<colspec colwidth="50*" align="left"/>
<colspec colwidth="50*" align="left"/>
<thead>
<row>
<entry> English Term </entry>
<entry> Number </entry>
</row>
</thead>
<tbody>
<row>
<entry> DAY 'Sunday' </entry>
<entry> DAY 0 </entry>
</row>
<row>
<entry> DAY 'Monday' </entry>
<entry> DAY 1 </entry>
</row>
<row>
<entry> DAY 'Tuesday' </entry>
<entry> DAY 2 </entry>
</row>
<row>
<entry> DAY 'Wednesday' </entry>
<entry> DAY 3 </entry>
</row>
<row>
<entry> DAY 'Thursday' </entry>
<entry> DAY 4 </entry>
</row>
<row>
<entry> DAY 'Friday' </entry>
<entry> DAY 5 </entry>
</row>
<row>
<entry> DAY 'Saturday' </entry>
<entry> DAY 6 </entry>
</row>
</tbody>
</tgroup>
</table>
<p> A time of day is specified in either 12- or 24-hour format. The word <codeph>TIME</codeph>
is followed by the specification in single quotation marks. Only hours and minutes are
specified and are separated by a colon ( : ). If using a 12-hour format, add
<codeph>AM</codeph> or <codeph>PM</codeph> at the end. The following examples show various
time specifications. </p>
<codeblock>TIME '14:00' # 24-hour time implied
TIME '02:00 PM' # 12-hour time specified by PM
TIME '02:00' # 24-hour time implied. This is equivalent to TIME '02:00 AM'. </codeblock>
<note type="important">Time-based authentication is enforced with the server time. Timezones
are disregarded. </note>
<p> To specify an interval of time during which access is denied, use two day/time
specifications with the words <codeph>BETWEEN</codeph> and <codeph>AND</codeph>, as shown.
<codeph>DAY</codeph> is always required. </p>
<codeblock>BETWEEN DAY 'Monday' AND DAY 'Tuesday'
BETWEEN DAY 'Monday' TIME '00:00' AND
       DAY 'Monday' TIME '01:00'
BETWEEN DAY 'Monday' TIME '12:00 AM' AND
       DAY 'Tuesday' TIME '02:00 AM'
BETWEEN DAY 'Monday' TIME '00:00' AND
       DAY 'Tuesday' TIME '02:00'
BETWEEN DAY 1 TIME '00:00' AND
       DAY 2 TIME '02:00'</codeblock>
<p>The last three statements are equivalent. </p>
<note>Intervals of days cannot wrap past Saturday.</note>
<p>The following syntax is not correct:
<codeblock>DENY BETWEEN DAY 'Saturday' AND DAY 'Sunday'</codeblock>
</p>
<p> The correct specification uses two DENY clauses, as follows: </p>
<codeblock>DENY DAY 'Saturday'
DENY DAY 'Sunday'</codeblock>
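The DENY semantics above (single-day restrictions, day/time intervals that cannot wrap past Saturday, enforcement at login only) can be sketched as a small Python checker. This is an illustrative model, not Greenplum internals: days are numbered 0 (Sunday) through 6 (Saturday) as in the table above, and times are minutes since midnight.

```python
# Illustrative model of DENY constraints: days are 0 (Sunday) .. 6
# (Saturday); times are minutes since midnight. Each constraint is a
# closed interval of (day, minutes) pairs, compared lexicographically.
def deny_day(day):                       # DENY DAY 'Sunday', etc.
    return ((day, 0), (day, 24 * 60))

def deny_between(d1, t1, d2, t2):        # DENY BETWEEN ... AND ...
    # Intervals cannot wrap past Saturday, so start must not exceed end.
    assert (d1, t1) <= (d2, t2), "intervals cannot wrap past Saturday"
    return ((d1, t1), (d2, t2))

def login_allowed(constraints, day, minutes):
    # Enforced only at login: deny if the login time falls in any interval.
    return not any(start <= (day, minutes) <= end
                   for start, end in constraints)

# No access on weekends: DENY DAY 'Saturday'; DENY DAY 'Sunday'
weekend = [deny_day(6), deny_day(0)]
print(login_allowed(weekend, 3, 10 * 60))   # Wednesday 10:00 -> True
print(login_allowed(weekend, 6, 10 * 60))   # Saturday 10:00 -> False
```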
<p> The following examples demonstrate creating a role with time-based constraints and
modifying a role to add time-based constraints. Only the statements needed for time-based
constraints are shown. For more details on creating and altering roles see the descriptions
      of <codeph>CREATE ROLE</codeph> and <codeph>ALTER ROLE</codeph> in the <i>Greenplum
        Database Reference Guide</i>. </p>
<section>
<title> Example 1 – Create a New Role with Time-based Constraints </title>
<p> No access is allowed on weekends. </p>
<codeblock> CREATE ROLE generaluser
DENY DAY 'Saturday'
DENY DAY 'Sunday'
... </codeblock>
</section>
<section>
<title>Example 2 – Alter a Role to Add Time-based Constraints </title>
<p> No access is allowed every night between 2:00 a.m. and 4:00 a.m. </p>
<codeblock>ALTER ROLE generaluser
DENY BETWEEN DAY 'Monday' TIME '02:00' AND DAY 'Monday' TIME '04:00'
DENY BETWEEN DAY 'Tuesday' TIME '02:00' AND DAY 'Tuesday' TIME '04:00'
DENY BETWEEN DAY 'Wednesday' TIME '02:00' AND DAY 'Wednesday' TIME '04:00'
DENY BETWEEN DAY 'Thursday' TIME '02:00' AND DAY 'Thursday' TIME '04:00'
DENY BETWEEN DAY 'Friday' TIME '02:00' AND DAY 'Friday' TIME '04:00'
DENY BETWEEN DAY 'Saturday' TIME '02:00' AND DAY 'Saturday' TIME '04:00'
DENY BETWEEN DAY 'Sunday' TIME '02:00' AND DAY 'Sunday' TIME '04:00'
... </codeblock>
</section>
<section>
      <title>Example 3 – Alter a Role to Add Time-based Constraints </title>
<p> No access is allowed Wednesdays or Fridays between 3:00 p.m. and 5:00 p.m. </p>
<codeblock>ALTER ROLE generaluser
DENY DAY 'Wednesday'
DENY BETWEEN DAY 'Friday' TIME '15:00' AND DAY 'Friday' TIME '17:00'
</codeblock>
</section>
</body>
</topic>
<topic id="drop_timebased_restriction">
<title> Dropping a Time-based Restriction </title>
<body>
<p> To remove a time-based restriction, use the ALTER ROLE command. Enter the keywords DROP
DENY FOR followed by a day/time specification to drop. </p>
<codeblock>DROP DENY FOR DAY 'Sunday' </codeblock>
    <p> Any constraint containing all or part of the conditions in a DROP clause is removed. For
      example, if an existing constraint denies access on Mondays and Tuesdays, and the DROP
      clause removes constraints for Mondays, the existing constraint is dropped completely. The
      DROP clause removes all constraints that overlap with the constraint in the DROP
      clause. The overlapping constraints are removed completely even if they contain more
      restrictions than those mentioned in the DROP clause. </p>
<p> Example 1 - Remove a Time-based Restriction from a Role </p>
<codeblock> ALTER ROLE generaluser
DROP DENY FOR DAY 'Monday'
... </codeblock>
<p> This statement would remove all constraints that overlap with a Monday constraint for the
role <codeph>generaluser</codeph> in Example 2, even if there are additional
constraints.</p>
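The "any overlapping constraint is removed entirely" behavior can be sketched like this, reusing the (day, minutes) interval representation from above. This is an illustrative model, not Greenplum internals.

```python
# Illustrative model: DROP DENY removes every stored constraint whose
# interval overlaps the dropped specification, even when the stored
# constraint is broader than that specification.
def overlaps(a, b):
    (a_start, a_end), (b_start, b_end) = a, b
    return a_start <= b_end and b_start <= a_end

def drop_deny(constraints, spec):
    return [c for c in constraints if not overlaps(c, spec)]

# One constraint denying Monday (day 1) through Tuesday (day 2), and one
# denying Friday 15:00-17:00 (minutes since midnight).
constraints = [((1, 0), (2, 1440)), ((5, 900), (5, 1020))]
monday = ((1, 0), (1, 1440))              # DROP DENY FOR DAY 'Monday'
remaining = drop_deny(constraints, monday)
print(remaining)   # the broader Monday-Tuesday constraint is dropped whole
```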
</body>
</topic>
</topic>
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE topic PUBLIC "-//OASIS//DTD DITA Topic//EN" "topic.dtd">
<topic id="topic_nyc_mdf_jr">
<title>Security Best Practices</title>
<body>
<p> This chapter describes basic security best practices that you should follow to ensure the
highest level of system security.  </p>
<p>Greenplum Database default security configuration:<ul id="ul_mhk_rrg_4r">
<li>Only local connections are allowed. </li>
<li>Basic authentication is configured for the superuser (<codeph>gpadmin</codeph>).</li>
<li>The superuser is authorized to do anything.</li>
<li>Only database role passwords are encrypted.</li>
</ul></p>
<section>
<title>System User (gpadmin)</title>
<p>Secure and limit access to the <codeph>gpadmin</codeph> system user. </p>
<p>Greenplum requires a UNIX user id to install and initialize the Greenplum Database system.
This system user is referred to as <codeph>gpadmin</codeph> in the Greenplum documentation.
The <codeph>gpadmin</codeph> user is the default database superuser in Greenplum Database,
as well as the file system owner of the Greenplum installation and its underlying data
files. The default administrator account is fundamental to the design of Greenplum Database.
The system cannot run without it, and there is no way to limit the access of the
<codeph>gpadmin</codeph> user id. </p>
<p>The <codeph>gpadmin</codeph> user can bypass all security features of Greenplum Database.
Anyone who logs on to a Greenplum host with this user id can read, alter, or delete any
data, including system catalog data and database access rights. Therefore, it is very
important to secure the <codeph>gpadmin</codeph> user id and only allow essential system
administrators access to it. </p>
<p>Administrators should only log in to Greenplum as <codeph>gpadmin</codeph> when performing
certain system maintenance tasks (such as upgrade or expansion). </p>
<p>Database users should never log on as <codeph>gpadmin</codeph>, and ETL or production
workloads should never run as <codeph>gpadmin</codeph>. </p>
</section>
<section>
<title>Superusers</title>
<p>Roles granted the <codeph>SUPERUSER</codeph> attribute are superusers. Superusers bypass
all access privilege checks and resource queues. Only system administrators should be given
superuser rights. </p>
<p> See "Altering Role Attributes" in the <i>Greenplum Database Administrator Guide</i>. </p>
</section>
<section>
<title>Login Users</title>
<p>Assign a distinct role to each user who logs in and set the <codeph>LOGIN</codeph>
attribute. </p>
<p>For logging and auditing purposes, each user who is allowed to log in to Greenplum Database
should be given their own database role. For applications or web services, consider creating
a distinct role for each application or service. See "Creating New Roles (Users)" in the
<i>Greenplum Database Administrator Guide</i>. </p>
<p>Each login role should be assigned to a single, non-default resource queue.</p>
</section>
<section>
<title>Groups</title>
<p>Use groups to manage access privileges.</p>
<p>Create a group for each logical grouping of object/access permissions. </p>
<p>Every login user should belong to one or more roles. Use the <codeph>GRANT</codeph>
statement to add group access to a role. Use the <codeph>REVOKE</codeph> statement to remove
group access from a role. </p>
<p>The <codeph>LOGIN</codeph> attribute should not be set for group roles. </p>
<p>See "Creating Groups (Role Membership)" in the <i>Greenplum Database Administrator
Guide</i>. </p>
</section>
<section>
<title>Object Privileges</title>
      <p>Only the owner and superusers have full permissions to new objects. Permission must be
        granted to allow other roles (users or groups) to access objects. Each type of database
object has different privileges that may be granted. Use the <codeph>GRANT</codeph>
statement to add a permission to a role and the <codeph>REVOKE</codeph> statement to remove
the permission.</p>
      <p>You can change the owner of an object using the <codeph>REASSIGN OWNED BY</codeph>
        statement. For example, to prepare to drop a role, change the owner of the objects that
        belong to the role. Use the <codeph>DROP OWNED BY</codeph> statement to drop objects, including
        dependent objects, that are owned by a role.</p>
<p>Schemas can be used to enforce an additional layer of object permissions checking, but
schema permissions do not override object privileges set on objects contained within the
schema.</p>
</section>
<section id="password-strength-recommendations">
<title>Operating System Users and File System</title>
      <p> To protect the network from intrusion, system administrators should verify that the passwords
        used within an organization are sufficiently strong. The following recommendations can
        strengthen a password: </p>
<ul>
<li> Minimum password length recommendation: At least 9 characters. MD5 passwords should be
15 characters or longer. </li>
<li> Mix upper and lower case letters. </li>
<li> Mix letters and numbers. </li>
<li> Include non-alphanumeric characters. </li>
<li> Pick a password you can remember. </li>
</ul>
<p> The following are recommendations for password cracker software that you can use to
determine the strength of a password. </p>
<ul id="ul_gd3_fxg_4r">
<li> John The Ripper. A fast and flexible password cracking program. It allows the use of
multiple word lists and is capable of brute-force password cracking. It is available
online at <xref href="http://www.openwall.com/john/" format="html" scope="external"
>http://www.openwall.com/john/</xref>. </li>
<li>Crack. Perhaps the most well-known password cracking software, Crack is also very fast,
though not as easy to use as John The Ripper. It can be found online at <xref
href="https://dropsafe.crypticide.com/alecm/software/crack/c50-faq.html" format="html"
scope="external"
            >https://dropsafe.crypticide.com/alecm/software/crack/c50-faq.html</xref>.</li>
</ul>
<p>The security of the entire system depends on the strength of the root password. This
password should be at least 12 characters long and include a mix of capitalized letters,
lowercase letters, special characters, and numbers. It should not be based on any dictionary
word. </p>
<p> Password expiration parameters should be configured. </p>
<p> Ensure the following line exists within the file <codeph>/etc/libuser.conf</codeph> under
the <codeph>[import]</codeph> section. </p>
<codeblock>login_defs = /etc/login.defs
</codeblock>
<p> Ensure no lines in the <codeph>[userdefaults]</codeph> section begin with the following
text, as these words override settings from <codeph>/etc/login.defs</codeph>: </p>
<ul id="ul_ef3_fxg_4r">
<li>
<codeph>LU_SHADOWMAX</codeph>
</li>
<li>
<codeph>LU_SHADOWMIN</codeph>
</li>
<li>
<codeph>LU_SHADOWWARNING</codeph>
</li>
</ul>
<p> Ensure the following command produces no output. Any accounts listed by running this
command should be locked. </p>
<codeblock>
grep &quot;^+:&quot; /etc/passwd /etc/shadow /etc/group
</codeblock>
      <p> Note: We strongly recommend that customers change their passwords after initial setup. Verify that the password-related files in <codeph>/etc</codeph> are owned by root and have restrictive permissions: </p>
<codeblock>
cd /etc
chown root:root passwd shadow group gshadow
chmod 644 passwd group
chmod 400 shadow gshadow
</codeblock>
<p> Find all the files that are world-writable and that do not have their sticky bits set. </p>
<codeblock>
find / -xdev -type d \( -perm -0002 -a ! -perm -1000 \) -print
</codeblock>
<p> Set the sticky bit (<codeph># chmod +t {dir}</codeph>) for all the directories that result
from running the previous command. </p>
<p> Find all the files that are world-writable and fix each file listed. </p>
<codeblock>
find / -xdev -type f -perm -0002 -print
</codeblock>
<p> Set the right permissions (<codeph># chmod o-w {file}</codeph>) for all the files
generated by running the aforementioned command. </p>
<p> Find all the files that do not belong to a valid user or group and either assign an owner
or remove the file, as appropriate. </p>
<codeblock>
find / -xdev \( -nouser -o -nogroup \) -print
</codeblock>
<p> Find all the directories that are world-writable and ensure they are owned by either root
or a system account (assuming only system accounts have a User ID lower than 500). If the
command generates any output, verify the assignment is correct or reassign it to root. </p>
<codeblock>
find / -xdev -type d -perm -0002 -uid +500 -print
</codeblock>
<p> Authentication settings such as password quality, password expiration policy, password
reuse, password retry attempts, and more can be configured using the Pluggable
Authentication Modules (PAM) framework. PAM looks in the directory
<codeph>/etc/pam.d</codeph> for application-specific configuration information. Running
<codeph>authconfig</codeph> or <codeph>system-config-authentication</codeph> will re-write
the PAM configuration files, destroying any manually made changes and replacing them with
system defaults. </p>
<p> The default <codeph>pam_cracklib</codeph> PAM module provides strength checking for
passwords. To configure <codeph>pam_cracklib</codeph> to require at least one uppercase
character, lowercase character, digit, and special character, as recommended by the U.S.
Department of Defense guidelines, edit the file <codeph>/etc/pam.d/system-auth</codeph> to
include the following parameters in the line corresponding to password requisite
<codeph>pam_cracklib.so try_first_pass</codeph>. </p>
      <codeblock>retry=3       Number of retries allowed
dcredit=-1    Require at least one digit
ucredit=-1    Require at least one uppercase character
ocredit=-1    Require at least one special character
lcredit=-1    Require at least one lowercase character
minlen=14     Require a minimum password length of 14</codeblock>
<p> For example: </p>
      <codeblock>
password required pam_cracklib.so try_first_pass retry=3 minlen=14 dcredit=-1 ucredit=-1 ocredit=-1 lcredit=-1
      </codeblock>
<p> These parameters can be set to reflect your security policy requirements. Note that the
password restrictions are not applicable to the root password. </p>
<p> The <codeph>pam_tally2</codeph> PAM module provides the capability to lock out user
accounts after a specified number of failed login attempts. To enforce password lockout,
edit the file <codeph>/etc/pam.d/system-auth</codeph> to include the following lines:<ul
id="ul_ehw_ls2_lr">
<li>The first of the auth lines should
include:<codeblock>auth required pam_tally2.so deny=5 onerr=fail unlock_time=900</codeblock></li>
<li>The first of the account lines should
include:<codeblock>account required pam_tally2.so</codeblock></li>
</ul></p>
<p>Here, the deny parameter is set to limit the number of retries to 5 and the
<codeph>unlock_time</codeph> has been set to 900 seconds to keep the account locked for
900 seconds before it is unlocked. These parameters may be configured appropriately to
reflect your security policy requirements. A locked account can be manually unlocked using
the <codeph>pam_tally2</codeph>
        utility:<codeblock>
/sbin/pam_tally2 --user {username} --reset
      </codeblock></p>
      <p> You can use PAM to limit the reuse of recent passwords. The <codeph>remember</codeph> option for the
        <codeph>pam_unix</codeph> module can be set to remember recent passwords and prevent
        their reuse. To accomplish this, edit the appropriate line in
        <codeph>/etc/pam.d/system-auth</codeph> to include the <codeph>remember</codeph> option. </p>
<p> For example: </p>
      <codeblock>
password sufficient pam_unix.so [ … existing_options …] remember=5
      </codeblock>
<p> You can set the number of previous passwords to remember to appropriately reflect your
security policy requirements. </p>
</section>
</body>
</topic>
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE topic PUBLIC "-//OASIS//DTD DITA Topic//EN" "topic.dtd">
<topic id="topic_iqr_ym2_jr">
<title>Securing the Database</title>
<body>
<p>The intent of security configuration is to configure the Greenplum Database server to
eliminate as many security vulnerabilities as possible. This guide provides a baseline for
minimum security requirements, and is supplemented by additional security documentation. </p>
<p>The essential security requirements fall into the following categories: <ul
id="ul_fwy_bn2_jr">
<li><xref href="Authenticate.xml#topic_n5w_gtd_jr">Authentication</xref> covers the
mechanisms that are supported and that can be used by the Greenplum database server to
establish the identity of a client application.</li>
<li><xref href="Authorization.xml#topic_ivr_cs2_jr">Authorization</xref> pertains to the
privilege and permission models used by the database to authorize client access. </li>
<li><xref href="Auditing.xml#topic_ufw_zn2_jr">Auditing</xref>, or log settings, covers the
logging options available in Greenplum Database to track successful or failed user
actions.</li>
<li><xref href="Encryption.xml#topic_th5_5bf_jr">Data Encryption</xref> addresses the
encryption capabilities that are available for protecting data at rest and data in
transit. This includes the security certifications that are relevant to the Greenplum
Database. </li>
</ul></p>
<section>
<title>Accessing a Kerberized Hadoop Cluster</title>
<p>Greenplum Database can read or write external tables in a Hadoop file system. If the Hadoop
cluster is secured with Kerberos ("Kerberized"), Greenplum Database must be configured to
allow external table owners to authenticate with Kerberos. See <xref
href="kerberos-hdfs.xml#topic_lhr_yrf_qr"/> for the steps to perform this setup. </p>
</section>
<section>
<title>Platform Hardening</title>
      <p>Platform hardening involves assessing and minimizing system vulnerability by following best
        practices and enforcing federal security standards. Hardening the product is based on the US
        Department of Defense (DoD) Security Technical Implementation Guides (STIGs).
        Hardening removes unnecessary packages, disables services that are not required, sets up
        restrictive file and directory permissions, removes unowned files and directories, performs
        authentication for single-user mode, and provides options for end users to configure the
        package to be compliant with the latest STIGs.  </p>
</section>
</body>
</topic>
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE topic PUBLIC "-//OASIS//DTD DITA Topic//EN" "topic.dtd">
<topic id="topic_zyt_rxp_f5">
<title>Greenplum Command Center Security</title>
<body>
<p>Greenplum Command Center (GPCC) is a web-based application for monitoring and managing
Greenplum clusters. GPCC works with data collected by agents running on the segment hosts and
saved to the gpperfmon database. The gpperfmon database is created by running the
<codeph>gpperfmon_install</codeph> utility, which also creates the <codeph>gpmon</codeph>
database role that GPCC uses to access the gpperfmon database. </p>
<section>
<title>The gpmon User</title>
<p>The <codeph>gpperfmon_install</codeph> utility creates the <codeph>gpmon</codeph> database
role and adds the role to the <codeph>pg_hba.conf</codeph> file with the following
entries:<codeblock>local gpperfmon gpmon md5
host all gpmon 127.0.0.1/28 md5
host all gpmon ::1/128 md5</codeblock>These
entries allow <codeph>gpmon</codeph> to establish a local socket connection to the gpperfmon
database and a TCP/IP connection to any database. </p>
<p>The <codeph>gpmon</codeph> database role is a superuser. In a secure or production
environment, it may be desirable to restrict the <codeph>gpmon</codeph> user to just the
gpperfmon database. Do this by editing the <codeph>gpmon</codeph> host entry in the
<codeph>pg_hba.conf</codeph> file and changing <codeph>all</codeph> in the database field
to
<codeph>gpperfmon</codeph>:<codeblock>local gpperfmon gpmon md5
host gpperfmon gpmon 127.0.0.1/28 md5
host gpperfmon gpmon ::1/128 md5</codeblock></p>
<p>The password used to authenticate the <codeph>gpmon</codeph> user is set by the
<codeph>gpperfmon_install</codeph> utility and is stored in the <codeph>gpadmin</codeph>
home directory in the <codeph>~/.pgpass</codeph> file. The <codeph>~/.pgpass</codeph> file
must be owned by the <codeph>gpadmin</codeph> user and be RW-accessible only by the
<codeph>gpadmin</codeph> user. To change the <codeph>gpmon</codeph> password, use the
<codeph>ALTER ROLE</codeph> command to change the password in the database, change the
password in the <codeph>~/.pgpass</codeph> file, and then restart GPCC with the
<codeph>gpcmdr --restart <varname>instance_name</varname></codeph> command. </p>
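The <codeph>~/.pgpass</codeph> handling described above can be sketched with a throwaway file; every field value below is a placeholder rather than output from a real installation (libpq's field order is host:port:database:user:password):

```shell
# Sketch of the ~/.pgpass requirements described above, using a temp file.
# All field values (host, port, database, user, password) are placeholders.
PGPASS=$(mktemp)
echo "mdw:5432:gpperfmon:gpmon:newsecret" > "$PGPASS"
chmod 600 "$PGPASS"              # libpq ignores the file unless only the owner can read it
stat -c '%a' "$PGPASS"           # prints 600
awk -F: '{print $5}' "$PGPASS"   # prints the password field, newsecret
```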
<note>The GPCC web server can be configured to encrypt connections with SSL. Two-way
authentication with public keys can also be enabled for GPCC users. However, the
<codeph>gpmon</codeph> user always uses md5 authentication with the password saved in the
<codeph>~/.pgpass</codeph> file.</note>
<p>GPCC does not allow logins from any role configured with trust authentication, including
the <codeph>gpadmin</codeph> user. </p>
<p>The <codeph>gpmon</codeph> user can log in to the Command Center Console and has access to
all of the application's features. You can allow other database roles access to GPCC so that
you can secure the <codeph>gpmon</codeph> user and restrict other users' access to GPCC
features. Setting up other GPCC users is described in the next section. </p>
</section>
<section>
<title>Greenplum Command Center Users</title>
<p>GPCC has the following types of users:<ul id="ul_tdv_qnt_g5">
<li><i>Self Only</i> users can view metrics and view and cancel their own queries. Any
Greenplum Database user successfully authenticated through the Greenplum Database
authentication system can access Greenplum Command Center with Self Only permission.
          Higher permission levels are required to view and cancel others’ queries and to access
the System and Admin Control Center features.</li>
<li><i>Basic</i> users can view metrics, view all queries, and cancel their own queries.
Users with Basic permission are members of the Greenplum Database
<codeph>gpcc_basic</codeph> group. </li>
<li><i>Operator Basic</i> users can view metrics, view their own and others’ queries,
cancel their own queries, and view the System and Admin screens. Users with Operator
Basic permission are members of the Greenplum Database
<codeph>gpcc_operator_basic</codeph> group.</li>
<li><i>Operator</i> users can view their own and others’ queries, cancel their own and
          others’ queries, and view the System and Admin screens. Users with Operator permission
are members of the Greenplum Database <codeph>gpcc_operator</codeph> group.</li>
<li><i>Admin</i> users can access all views and capabilities in the Command Center.
Greenplum Database users with the <codeph>SUPERUSER</codeph> privilege have Admin
permissions in Command Center.</li>
</ul></p>
<p>To log in to the GPCC web application, a user must be allowed access to the gpperfmon
database in <codeph>pg_hba.conf</codeph>. For example, to make <codeph>user1</codeph> a
regular GPCC user, edit the <codeph>pg_hba.conf</codeph> file and either add or edit a line
for the user so that the gpperfmon database is included in the database field. For
example:</p>
<codeblock>host gpperfmon,accounts user1 127.0.0.1/28 md5</codeblock>
<p>To designate a user as an operator, grant the <codeph>gpcc_operator</codeph> role to the
user:<codeblock>=# GRANT gpcc_operator TO <varname>user</varname>;</codeblock></p>
<p>You can also grant <codeph>gpcc_operator</codeph> to a group role to make all members of
the group GPCC operators.</p>
<p>See the <codeph>gpperfmon_install</codeph> reference in <cite>Greenplum Database Utility
Guide</cite> for more information about managing the <codeph>gpperfmon</codeph>
database.</p>
</section>
<section>
<title>Enabling SSL for Greenplum Command Center</title>
<p>The GPCC web server can be configured to support SSL so that client connections are
encrypted. A server certificate can be generated when the Command Center instance is created
or you can supply an existing certificate. </p>
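As one hedged example of supplying your own certificate, a self-signed key and certificate can be generated with OpenSSL; the subject name and file names below are placeholders, and the <cite>Greenplum Command Center Administration Guide</cite> describes where to install the files:

```shell
# Sketch: generate a self-signed key and certificate that could be supplied
# when the Command Center instance is created (names are placeholders)
openssl req -new -x509 -nodes -newkey rsa:2048 -days 365 \
    -subj "/CN=mdw.example.com" \
    -keyout gpcc.key -out gpcc.crt
openssl x509 -noout -subject -in gpcc.crt   # confirm the certificate subject
```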
<p>Two-way authentication with public key encryption can also be enabled for GPCC. See the
<cite>Greenplum Command Center Administration Guide</cite> for instructions. </p>
</section>
<section>
<title>Enabling Kerberos Authentication for Greenplum Command Center Users</title>
<p>If Kerberos authentication is enabled for Greenplum Database, Command Center users can also
authenticate with Kerberos. Command Center supports three Kerberos authentication modes:
<i>strict</i>, <i>normal</i>, and <i>gpmon-only</i>. </p>
<parml>
<plentry>
<pt>Strict</pt>
<pd>Command Center has a Kerberos keytab file containing the Command Center service
principal and a principal for every Command Center user. If the principal in the
client’s connection request is in the keytab file, the web server grants the client
access and the web server connects to Greenplum Database using the client’s principal
name. If the principal is not in the keytab file, the connection request fails.</pd>
</plentry>
<plentry>
<pt>Normal</pt>
<pd>The Command Center Kerberos keytab file contains the Command Center principal and may
contain principals for Command Center users. If the principal in the client’s connection
request is in Command Center’s keytab file, it uses the client’s principal for database
connections. Otherwise, Command Center uses the <codeph>gpmon</codeph> user for database
connections.</pd>
</plentry>
<plentry>
<pt>gpmon-only</pt>
<pd>The Command Center uses the <codeph>gpmon</codeph> database role for all Greenplum
Database connections. No client principals are needed in the Command Center’s keytab
file.</pd>
</plentry>
</parml>
</section>
    <p>See the <xref href="http://gpcc.docs.pivotal.io" format="html" scope="external">Greenplum
        Command Center documentation</xref> for instructions to enable Kerberos authentication with
      Greenplum Command Center.</p>
</body>
</topic>
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE topic PUBLIC "-//OASIS//DTD DITA Topic//EN" "topic.dtd">
<topic id="topic_ewg_q3c_pr">
<title>Installing Greenplum Workload Manager</title>
<body>
    <section>
      <title>Prerequisites</title>
      <ul id="ul_il5_53c_pr">
        <li>Red Hat Enterprise Linux (RHEL) 5 or 6</li>
        <li>Greenplum Database version 4.3.<i>x</i></li>
      </ul>
    </section>
<section>
<title>Running the Greenplum Workload Manager Installer</title>
      <p>The Greenplum Workload Manager installer binary is run on the Greenplum Database
        master node. It then automatically distributes Workload Manager to all segment servers in
        the database cluster.</p>
<p>
<ol id="ol_sg3_bjc_pr">
          <li>Run the Greenplum Workload Manager installer. You must specify the absolute path to an
            installation directory where you have write permission. For example:
            <codeblock>$ /bin/bash gp-wlm.bin --install=/usr/local/</codeblock> This command
            installs Greenplum Workload Manager in the <codeph>gp-wlm</codeph> subdirectory of the
            specified directory on all of the segments, creating directories as needed. For example,
            the above command installs Workload Manager in the
            <codeph>/usr/local/gp-wlm</codeph> directory.</li>
          <li>For convenience, you can source
            <codeph><i>INSTALL-DIR</i>/gp-wlm/gp-wlm_path.sh</codeph> to add the Workload Manager
            executables to your path.</li>
        </ol>
      </p>
      <p>To uninstall Greenplum Workload Manager, run the following command:
        <codeblock>$ <i>INSTALL-DIR</i>/gp-wlm/bin/uninstall.sh --symlink <i>INSTALL-DIR</i>/gp-wlm</codeblock></p>
</section>
</body>
</topic>
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE topic PUBLIC "-//OASIS//DTD DITA Topic//EN" "topic.dtd">
<topic id="topic_lhr_yrf_qr">
<title>Enabling gphdfs Authentication with a Kerberos-secured Hadoop Cluster</title>
<shortdesc>Using external tables and the <codeph>gphdfs</codeph> protocol, Greenplum Database can
read files from and write files to a Hadoop File System (HDFS). Greenplum segments read and
write files in parallel from HDFS for fast performance. </shortdesc>
<body>
<p>When a Hadoop cluster is secured with Kerberos ("Kerberized"), Greenplum Database must be
configured to allow the Greenplum Database gpadmin role, which owns external tables in HDFS,
to authenticate through Kerberos. This topic provides the steps for configuring Greenplum
Database to work with a Kerberized HDFS, including verifying and troubleshooting the
configuration.</p>
<ul id="ul_i3b_xvm_qr">
<li>
<xref href="#topic_gbw_rjl_qr" format="dita"/>
</li>
<li>
<xref href="#topic_t11_tjl_qr" format="dita"/>
</li>
<li>
<xref href="#topic_wtt_y1r_zr" format="dita"/>
</li>
<li>
<xref href="#topic_jsb_cll_qr" format="dita"/>
</li>
<li>
<xref href="#topic_dlj_yjv_yr" format="dita"/>
</li>
<li>
<xref href="#topic_mwd_rlm_qr" format="dita"/>
</li>
</ul>
</body>
<topic id="topic_gbw_rjl_qr">
<title>Prerequisites</title>
<body>
<p>Make sure the following components are functioning and accessible on the network:</p>
<ul id="ul_krv_t5f_qr">
<li>Greenplum Database cluster</li>
<li>Kerberos-secured Hadoop cluster. See the <i>Greenplum Database Release Notes</i> for
supported Hadoop versions.</li>
<li>Kerberos Key Distribution Center (KDC) server.</li>
</ul>
</body>
</topic>
<topic id="topic_t11_tjl_qr">
<title>Configuring the Greenplum Cluster</title>
<body>
<p>The hosts in the Greenplum Cluster must have a Java JRE, Hadoop client files, and Kerberos
clients installed. </p>
<p>Follow these steps to prepare the Greenplum Cluster. </p>
<ol id="ol_ykk_rwf_qr">
      <li>Install a Java 1.6 or later JRE on all Greenplum cluster hosts. <p>Match the JRE version
          the Hadoop cluster is running. You can find the JRE version by running <codeph>java
            -version</codeph> on a Hadoop node.</p></li>
<li><i>(Optional) </i>Confirm that Java Cryptography Extension (JCE) is present.<p>The
default location of the JCE libraries is
<filepath><varname>JAVA_HOME</varname>/lib/security</filepath>. If a JDK is installed,
the directory is <filepath><varname>JAVA_HOME</varname>/jre/lib/security</filepath>. The
files <filepath>local_policy.jar</filepath> and
<filepath>US_export_policy.jar</filepath> should be present in the JCE
directory.</p><p>The Greenplum cluster and the Kerberos server should, preferably, use
the same version of the JCE libraries. You can copy the JCE files from the Kerberos
server to the Greenplum cluster, if needed.</p></li>
<li>Set the <codeph>JAVA_HOME</codeph> environment variable to the location of the JRE in
the <filepath>.bashrc</filepath> or <filepath>.bash_profile</filepath> file for the
<codeph>gpadmin</codeph> account. For
example:<codeblock>export JAVA_HOME=/usr/java/default</codeblock></li>
<li>Source the <filepath>.bashrc</filepath> or <filepath>.bash_profile</filepath> file to
apply the change to your environment. For
example:<codeblock>$ source ~/.bashrc</codeblock></li>
<li>Install the Kerberos client utilities on all cluster hosts. Ensure the libraries match
the version on the KDC server before you install them. <p>For example, the following
command installs the Kerberos client files on Red Hat or CentOS
Linux:<codeblock>$ sudo yum install krb5-libs krb5-workstation</codeblock></p><p>Use the
<codeph>kinit</codeph> command to confirm the Kerberos client is installed and
correctly configured.</p></li>
<li>Install Hadoop client files on all hosts in the Greenplum Cluster. Refer to the
documentation for your Hadoop distribution for instructions.</li>
<li>Set the Greenplum Database server configuration parameters for Hadoop. The
<codeph>gp_hadoop_target_version</codeph> parameter specifies the version of the Hadoop
cluster. See the <i>Greenplum Database Release Notes</i> for the target version value that
corresponds to your Hadoop distribution. The <codeph>gp_hadoop_home</codeph> parameter
specifies the Hadoop installation
directory.<codeblock>$ gpconfig -c gp_hadoop_target_version -v "hdp2"
$ gpconfig -c gp_hadoop_home -v "/usr/lib/hadoop"</codeblock><p>See
the <i>Greenplum Database Reference Guide</i> for more information.</p></li>
<li>Reload the updated <filepath>postgresql.conf</filepath> files for master and
segments:<codeblock>gpstop -u</codeblock><p>You can confirm the changes with the
following
commands:<codeblock>$ gpconfig -s gp_hadoop_target_version
$ gpconfig -s gp_hadoop_home</codeblock></p></li>
      <li>Grant Greenplum Database gphdfs protocol privileges to roles that own external tables in
        HDFS, including <codeph>gpadmin</codeph> and other superuser roles. Grant
          <codeph>SELECT</codeph> privileges to enable creating readable external tables in HDFS.
        Grant <codeph>INSERT</codeph> privileges to enable creating writable external tables in
        HDFS.
        <codeblock>=# GRANT SELECT ON PROTOCOL gphdfs TO gpadmin;
=# GRANT INSERT ON PROTOCOL gphdfs TO gpadmin;</codeblock></li>
<li>Grant Greenplum Database external table privileges to external table owner
roles:<codeblock>ALTER ROLE <varname>HDFS_USER</varname> CREATEEXTTABLE (type='readable');
ALTER ROLE <varname>HDFS_USER</varname> CREATEEXTTABLE (type='writable');</codeblock><note>It
is best practice to review database privileges, including gphdfs external table
privileges, at least annually. </note></li>
</ol>
</body>
</topic>
<topic id="topic_wtt_y1r_zr">
<title>Creating and Installing Keytab Files</title>
<body>
<ol id="ol_tch_2br_zr">
<li>Log in to the KDC server as root.</li>
<li>Use the <codeph>kadmin.local</codeph> command to create a new principal for the
<codeph>gpadmin</codeph>
user:<codeblock># kadmin.local -q "addprinc -randkey gpadmin@<i>LOCAL.DOMAIN</i>"</codeblock></li>
<li>Use <codeph>kadmin.local</codeph> to generate a Kerberos service principal for each host
in the Greenplum Database cluster. The service principal should be of the form
<varname>name</varname>/<varname>role</varname>@<varname>REALM</varname>, where:<ul
id="ul_cbd_dcr_zr">
<li><i>name</i> is the gphdfs service user name. This example uses
<codeph>gphdfs</codeph>.</li>
<li><i>role</i> is the DNS-resolvable host name of a Greenplum cluster host (the output
of the <codeph>hostname -f</codeph> command).</li>
<li><i>REALM</i> is the Kerberos realm, for example <codeph>LOCAL.DOMAIN</codeph>. </li>
</ul><p>For example, the following commands add service principals for four Greenplum
Database hosts, mdw.example.com, smdw.example.com, sdw1.example.com, and
sdw2.example.com:<codeblock># kadmin.local -q "addprinc -randkey gphdfs/mdw.example.com@LOCAL.DOMAIN"
# kadmin.local -q "addprinc -randkey gphdfs/smdw.example.com@LOCAL.DOMAIN"
# kadmin.local -q "addprinc -randkey gphdfs/sdw1.example.com@LOCAL.DOMAIN"
# kadmin.local -q "addprinc -randkey gphdfs/sdw2.example.com@LOCAL.DOMAIN"</codeblock></p><p>Create
a principal for each Greenplum cluster host. Use the same principal name and realm,
substituting the fully-qualified domain name for each host. </p></li>
<li>Generate a keytab file for each principal that you created (<codeph>gpadmin</codeph> and
each <codeph>gphdfs</codeph> service principal). You can store the keytab files in any
convenient location (this example uses the directory
<filepath>/etc/security/keytabs</filepath>). You will deploy the service principal
keytab files to their respective Greenplum host machines in a later
          step:<codeblock># kadmin.local -q "xst -k /etc/security/keytabs/gphdfs.service.keytab gpadmin@LOCAL.DOMAIN"
# kadmin.local -q "xst -k /etc/security/keytabs/mdw.service.keytab gpadmin/mdw gphdfs/mdw.example.com@LOCAL.DOMAIN"
# kadmin.local -q "xst -k /etc/security/keytabs/smdw.service.keytab gpadmin/smdw gphdfs/smdw.example.com@LOCAL.DOMAIN"
# kadmin.local -q "xst -k /etc/security/keytabs/sdw1.service.keytab gpadmin/sdw1 gphdfs/sdw1.example.com@LOCAL.DOMAIN"
# kadmin.local -q "xst -k /etc/security/keytabs/sdw2.service.keytab gpadmin/sdw2 gphdfs/sdw2.example.com@LOCAL.DOMAIN"
# kadmin.local -q "listprincs"</codeblock></li>
<li>Change the ownership and permissions on <codeph>gphdfs.service.keytab</codeph> as
follows:<codeblock># chown gpadmin:gpadmin /etc/security/keytabs/gphdfs.service.keytab
# chmod 440 /etc/security/keytabs/gphdfs.service.keytab</codeblock></li>
<li>Copy the keytab file for <codeph>gpadmin@LOCAL.DOMAIN</codeph> to the Greenplum master
host:<codeblock># scp /etc/security/keytabs/gphdfs.service.keytab mdw_fqdn:/home/gpadmin/gphdfs.service.keytab</codeblock></li>
<li>Copy the keytab file for each service principal to its respective Greenplum
host:<codeblock># scp /etc/security/keytabs/mdw.service.keytab mdw_fqdn:/home/gpadmin/mdw.service.keytab
# scp /etc/security/keytabs/smdw.service.keytab smdw_fqdn:/home/gpadmin/smdw.service.keytab
# scp /etc/security/keytabs/sdw1.service.keytab sdw1_fqdn:/home/gpadmin/sdw1.service.keytab
# scp /etc/security/keytabs/sdw2.service.keytab sdw2_fqdn:/home/gpadmin/sdw2.service.keytab</codeblock></li>
</ol>
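Because the <codeph>addprinc</codeph> and <codeph>scp</codeph> commands in this procedure follow a fixed per-host pattern, they can be generated with a small loop rather than typed individually; the hostnames and realm below are the same examples used above:

```shell
# Sketch: print the per-host addprinc commands to run on the KDC
# (hostnames and realm are the examples from this procedure)
REALM=LOCAL.DOMAIN
HOSTS="mdw.example.com smdw.example.com sdw1.example.com sdw2.example.com"
for host in $HOSTS; do
    echo "kadmin.local -q \"addprinc -randkey gphdfs/${host}@${REALM}\""
done
```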
</body>
</topic>
<topic id="topic_jsb_cll_qr">
<title>Configuring gphdfs for Kerberos</title>
<body>
<ol id="ol_lnj_f4l_qr">
<li>Edit the Hadoop <filepath>core-site.xml</filepath> client configuration file on all
Greenplum cluster hosts. Enable service-level authorization for Hadoop by setting the
<codeph>hadoop.security.authorization</codeph> property to <codeph>true</codeph>. For
example:<codeblock>&lt;property>
&lt;name>hadoop.security.authorization&lt;/name>
&lt;value>true&lt;/value>
&lt;/property></codeblock></li>
<li>Edit the <filepath>yarn-site.xml</filepath> client configuration file on all cluster
        hosts. Set the resource manager address and the YARN Kerberos service principal. For
example:<codeblock>&lt;property>
&lt;name>yarn.resourcemanager.address&lt;/name>
&lt;value><varname>hostname</varname>:<varname>8032</varname>&lt;/value>
&lt;/property>
&lt;property>
&lt;name>yarn.resourcemanager.principal&lt;/name>
&lt;value>yarn/<varname>hostname</varname>@<varname>DOMAIN</varname>&lt;/value>
&lt;/property></codeblock></li>
<li>Edit the <filepath>hdfs-site.xml</filepath> client configuration file on all cluster
hosts. Set properties to identify the NameNode Kerberos principals, the location of the
Kerberos keytab file, and the principal it is for:<ul id="ul_t3s_w4l_qr">
<li><codeph>dfs.namenode.kerberos.principal</codeph> - the Kerberos principal name the
gphdfs protocol will use for the NameNode, for example
<codeph>gpadmin@LOCAL.DOMAIN</codeph>.</li>
<li><codeph>dfs.namenode.https.principal</codeph> - the Kerberos principal name the
gphdfs protocol will use for the NameNode's secure HTTP server, for example
<codeph>gpadmin@LOCAL.DOMAIN</codeph>.</li>
          <li><codeph>com.emc.greenplum.gpdb.hdfsconnector.security.user.keytab.file</codeph> -
            the path to the keytab file for the Kerberos HDFS service, for example
            <codeph>/home/gpadmin/mdw.service.keytab</codeph>.</li>
<li><codeph>com.emc.greenplum.gpdb.hdfsconnector.security.user.name</codeph> - the
gphdfs service principal for the host, for example
<codeph>gphdfs/mdw.example.com@LOCAL.DOMAIN</codeph>.</li>
</ul><p>For
example:</p><codeblock>&lt;property>
&lt;name>dfs.namenode.kerberos.principal&lt;/name>
&lt;value>gphdfs/gpadmin@LOCAL.DOMAIN&lt;/value>
&lt;/property>
&lt;property>
&lt;name>dfs.namenode.https.principal&lt;/name>
&lt;value>gphdfs/gpadmin@LOCAL.DOMAIN&lt;/value>
&lt;/property>
&lt;property>
&lt;name>com.emc.greenplum.gpdb.hdfsconnector.security.user.keytab.file&lt;/name>
&lt;value>/home/gpadmin/gpadmin.hdfs.keytab&lt;/value>
&lt;/property>
&lt;property>
&lt;name>com.emc.greenplum.gpdb.hdfsconnector.security.user.name&lt;/name>
        &lt;value>gpadmin@LOCAL.DOMAIN&lt;/value>
&lt;/property></codeblock></li>
</ol>
</body>
</topic>
<topic id="topic_dlj_yjv_yr">
<title>Testing Greenplum Database Access to HDFS</title>
<body>
<p>Confirm that HDFS is accessible via Kerberos authentication on all hosts in the Greenplum
cluster. For example, enter the following command to list an HDFS
directory:<codeblock>hdfs dfs -ls hdfs://<varname>namenode</varname>:8020</codeblock></p>
<section>
<title>Create a Readable External Table in HDFS</title>
      <p>Follow these steps to verify that you can create a readable external table in a
        Kerberized Hadoop cluster. </p>
<ol id="ol_elf_2ql_qr">
<li>Create a comma-delimited text file, <codeph>test1.txt</codeph>, with contents such as
the following:<codeblock>25, Bill
19, Anne
32, Greg
27, Gloria</codeblock></li>
<li>Persist the sample text file in
HDFS:<codeblock>hdfs dfs -put <varname>test1.txt</varname> hdfs://<varname>namenode</varname>:8020/tmp</codeblock></li>
<li>Log in to Greenplum Database and create a readable external table that points to the
<codeph>test1.txt</codeph> file in
Hadoop:<codeblock>CREATE EXTERNAL TABLE test_hdfs (age int, name text)
LOCATION('gphdfs://<varname>namenode</varname>:<varname>8020</varname>/tmp/test1.txt')
FORMAT 'text' (delimiter ',');</codeblock></li>
<li>Read data from the external table:<codeblock>SELECT * FROM test_hdfs;</codeblock></li>
</ol>
</section>
<section>
<title>Create a Writable External Table in HDFS</title>
<p>Follow these steps to verify that you can create a writable external table in a
Kerberized Hadoop cluster. The steps use the <codeph>test_hdfs</codeph> readable external
table created previously. </p>
<ol id="ol_ht3_kfm_qr">
<li>Log in to Greenplum Database and create a writable external table pointing to a text
file in
          HDFS:<codeblock>CREATE WRITABLE EXTERNAL TABLE test_hdfs2 (LIKE test_hdfs)
   LOCATION ('gphdfs://<varname>namenode</varname>:8020/tmp/test2.txt')
   FORMAT 'text' (DELIMITER ',');</codeblock></li>
<li>Load data into the writable external
table:<codeblock>INSERT INTO test_hdfs2
SELECT * FROM test_hdfs;</codeblock></li>
<li>Check that the file exists in
HDFS:<codeblock>hdfs dfs -ls hdfs://<varname>namenode</varname>:8020/tmp/test2.txt</codeblock></li>
<li>Verify the contents of the external
file:<codeblock>hdfs dfs -cat hdfs://<varname>namenode</varname>:8020/tmp/test2.txt</codeblock></li>
</ol>
</section>
</body>
</topic>
<topic id="topic_mwd_rlm_qr">
<title>Troubleshooting HDFS with Kerberos</title>
<body>
<section>
<title>Forcing Classpaths</title>
<p>If you encounter "class not found" errors when executing <codeph>SELECT</codeph>
statements from <codeph>gphdfs</codeph> external tables, edit the
<filepath>$GPHOME/lib/hadoop-env.sh</filepath> file and add the following lines towards
the end of the file, before the <codeph>JAVA_LIBRARY_PATH</codeph> is set. Update the
script on all of the cluster
hosts.<codeblock>if [ -d "/usr/hdp/current" ]; then
for f in /usr/hdp/current/**/*.jar; do
CLASSPATH=${CLASSPATH}:$f;
done
fi</codeblock></p>
</section>
<section>
<title>Enabling Kerberos Client Debug Messages</title>
<p>To see debug messages from the Kerberos client, edit the
<filepath>$GPHOME/lib/hadoop-env.sh</filepath> client shell script on all cluster hosts
and set the <codeph>HADOOP_OPTS</codeph> variable as
        follows:<codeblock>export HADOOP_OPTS="-Djava.net.preferIPv4Stack=true -Dsun.security.krb5.debug=true ${HADOOP_OPTS}"</codeblock></p>
</section>
<section>
<title>Adjusting JVM Process Memory on Segment Hosts</title>
<p>Each segment launches a JVM process when reading or writing an external table in HDFS. To
change the amount of memory allocated to each JVM process, configure the
<codeph>GP_JAVA_OPT</codeph> environment variable. </p>
<p>Edit the <filepath>$GPHOME/lib/hadoop-env.sh</filepath> client shell script on all
cluster hosts. </p>
<p>For example:<codeblock>export GP_JAVA_OPT=-Xmx1000m</codeblock></p>
</section>
<section>
<title>Verify Kerberos Security Settings</title>
<p>Review the <filepath>/etc/krb5.conf</filepath> file:</p>
<ul id="ul_os5_f4m_qr">
<li>If AES256 encryption is not disabled, ensure that all cluster hosts have the JCE
Unlimited Strength Jurisdiction Policy Files installed.</li>
        <li>Ensure all encryption types in the Kerberos keytab file match definitions in the
          <filepath>krb5.conf</filepath> file.
          <codeblock>grep supported_enctypes /etc/krb5.conf</codeblock></li>
</ul>
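One way to make this comparison is with <codeph>klist</codeph> from the Kerberos client utilities, which prints the encryption type of each keytab entry; the keytab path below is an example:

```shell
# Sketch: list the encryption types recorded in a keytab so they can be
# compared with the supported_enctypes definitions in /etc/krb5.conf
klist -k -e /home/gpadmin/mdw.service.keytab
```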
</section>
<section>
<title>Test Connectivity on an Individual Segment Host</title>
      <p>Follow these steps to test that a single Greenplum Database host can read HDFS data. This
        test method executes the Greenplum <codeph>HDFSReader</codeph> Java class at the
        command line, and can help troubleshoot connectivity problems outside of the database. </p>
<ol id="ol_vm1_m4m_qr">
<li>Save a sample data file in HDFS.
<codeblock>hdfs dfs -put test1.txt hdfs://<varname>namenode</varname>:8020/tmp</codeblock></li>
<li>On the segment host to be tested, create an environment script,
<codeph>env.sh</codeph>, like the
following:<codeblock>export JAVA_HOME=/usr/java/default
export HADOOP_HOME=/usr/lib/hadoop
export GP_HADOOP_CON_VERSION=hdp2
export GP_HADOOP_CON_JARDIR=/usr/lib/hadoop</codeblock></li>
<li>Source all environment
scripts:<codeblock>source /usr/local/greenplum-db/greenplum_path.sh
source env.sh
source $GPHOME/lib/hadoop-env.sh</codeblock></li>
<li>Test the Greenplum Database HDFS
reader:<codeblock>java com.emc.greenplum.gpdb.hdfsconnector.HDFSReader 0 32 TEXT hdp2 gphdfs://<varname>namenode</varname>:8020/tmp/test1.txt</codeblock></li>
</ol>
</section>
</body>
</topic>
</topic>
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE topic PUBLIC "-//OASIS//DTD DITA Topic//EN" "topic.dtd">
<topic id="greenplum_database_ports_and_protocols">
<title>Greenplum Database Ports and Protocols</title>
<body>
<p>Greenplum Database clients connect with TCP to the Greenplum master instance at the client
connection port, 5432 by default. The listen port can be reconfigured in the
<filepath>postgresql.conf</filepath> configuration file. Client connections use the
PostgreSQL libpq API. The <codeph>psql</codeph> command-line interface, several Greenplum
utilities, and language-specific programming APIs all either use the libpq library directly or
implement the libpq protocol internally. </p>
<p>Each segment instance also has a client connection port, used solely by the master instance
to coordinate database operations with the segments. The <codeph>gpstate -p</codeph> command,
executed on the Greenplum master, lists the port assignments for the Greenplum master and the
primary segments and mirrors. For example:
<codeblock>[gpadmin@mdw ~]$ gpstate -p
20160126:15:40:22:028389 gpstate:mdw:gpadmin-[INFO]:-Starting gpstate with args: -p
20160126:15:40:22:028389 gpstate:mdw:gpadmin-[INFO]:-local Greenplum Version: 'postgres (Greenplum Database) 4.3.6.0 build 62994'
20160126:15:40:22:028389 gpstate:mdw:gpadmin-[INFO]:-master Greenplum Version: 'PostgreSQL 8.2.15 (Greenplum Database 4.3.6.0 build 62994) on x86_64-unknown-linux-gnu, compiled by GCC gcc (GCC) 4.4.2 compiled on Jul 24 2015 11:35:08'
20160126:15:40:22:028389 gpstate:mdw:gpadmin-[INFO]:-Obtaining Segment details from master...
20160126:15:40:22:028389 gpstate:mdw:gpadmin-[INFO]:--Master segment instance /data/master/gpseg-1 port = 5432
20160126:15:40:22:028389 gpstate:mdw:gpadmin-[INFO]:--Segment instance port assignments
20160126:15:40:22:028389 gpstate:mdw:gpadmin-[INFO]:-----------------------------------
20160126:15:40:22:028389 gpstate:mdw:gpadmin-[INFO]:- Host Datadir Port
20160126:15:40:22:028389 gpstate:mdw:gpadmin-[INFO]:- sdw1 /data/primary/gpseg0 40000
20160126:15:40:22:028389 gpstate:mdw:gpadmin-[INFO]:- sdw2 /data/mirror/gpseg0 50000
20160126:15:40:22:028389 gpstate:mdw:gpadmin-[INFO]:- sdw2 /data/primary/gpseg1 40000
20160126:15:40:22:028389 gpstate:mdw:gpadmin-[INFO]:- sdw1 /data/mirror/gpseg1 50001
20160126:15:40:22:028389 gpstate:mdw:gpadmin-[INFO]:- sdw3 /data/primary/gpseg2 40000
20160126:15:40:22:028389 gpstate:mdw:gpadmin-[INFO]:- sdw4 /data/mirror/gpseg2 50000
20160126:15:40:22:028389 gpstate:mdw:gpadmin-[INFO]:- sdw4 /data/primary/gpseg3 40000
20160126:15:40:22:028389 gpstate:mdw:gpadmin-[INFO]:- sdw3 /data/mirror/gpseg3 50001
</codeblock></p>
<p>Additional Greenplum Database network connections are created for features such as standby
replication, segment mirroring, statistics collection, and data exchange between segments.
Some persistent connections are established when the database starts up and other transient
connections are created during operations such as query execution. Transient connections for
query execution processes, data movement, and statistics collection use available ports in the
range 1025 to 65535 with both TCP and UDP protocols. </p>
<p>Some add-on products and services that work with Greenplum Database have additional
networking requirements. The following table lists ports and protocols used within the
Greenplum cluster, and includes services and applications that integrate with Greenplum
Database.</p>
<table frame="all" rowsep="1" colsep="1" id="table_zf3_lzz_s5">
<title>Greenplum Database Ports and Protocols</title>
<tgroup cols="3">
<colspec colname="newCol1" colnum="1" colwidth="1*"/>
<colspec colname="c1" colnum="2" colwidth="1.0*"/>
<colspec colname="c3" colnum="3" colwidth="2.0*"/>
<thead>
<row>
<entry>Service</entry>
<entry>Protocol/Port</entry>
<entry>Description</entry>
</row>
</thead>
<tbody>
<row>
<entry>Master SQL client connection</entry>
<entry>TCP 5432, libpq</entry>
<entry>SQL client connection port on the Greenplum master host. Supports clients using
the PostgreSQL libpq API. Configurable.</entry>
</row>
<row>
<entry>Segment SQL client connection</entry>
<entry>varies, libpq</entry>
<entry>The SQL client connection port for a segment instance. Each primary and mirror
segment on a host must have a unique port. Ports are assigned when the Greenplum
system is initialized or expanded. The <codeph>gp_segment_configuration</codeph>
system catalog records port numbers for each segment in the <codeph>port</codeph>
column. Run <codeph>gpstate -p</codeph> to view the ports in use.</entry>
</row>
<row>
<entry>Segment mirroring port</entry>
<entry>varies, libpq</entry>
<entry>The port where a segment receives mirrored blocks from its primary. The port is
assigned when the mirror is set up. The port number is stored in the
<codeph>gp_segment_configuration</codeph> system catalog in the
<codeph>mirror_port</codeph> column.</entry>
</row>
<row>
<entry>Greenplum Database Interconnect</entry>
<entry>UDP 1025-65535, dynamically allocated</entry>
<entry>The Interconnect transports database tuples between Greenplum segments during
query execution. </entry>
</row>
<row>
<entry>Standby master client listener</entry>
<entry>TCP 5432, libpq</entry>
<entry>SQL client connection port on the standby master host. Usually the same as the
master client connection port. Configure with the <codeph>gpinitstandby</codeph>
utility <codeph>-P</codeph> option.</entry>
</row>
<row>
<entry>Standby master replicator</entry>
<entry>TCP 1025-65535, gpsyncmaster</entry>
<entry>The <codeph>gpsyncmaster</codeph> process on the master host establishes a
connection to the secondary master host to replicate the master's log to the standby
master. </entry>
</row>
<row otherprops="pivotal">
<entry>Greenplum Control Center (GPCC)</entry>
<entry>TCP 28080, HTTP/HTTPS</entry>
<entry>Default listen port for the GPCC console web server, which is usually installed
on the master host. Configured in the <filepath>lighttpd.conf</filepath> file in the
GPCC instance.</entry>
</row>
<row>
<entry>Greenplum database file load and transfer utilities: gpfdist, gpload,
gptransfer</entry>
<entry>TCP 8080, HTTP<p>TCP 9000, HTTPS</p></entry>
<entry>The gpfdist file serving utility can run on Greenplum hosts or external hosts.
Specify the connection port with the <codeph>-p</codeph> option when starting the
server. <p>The gpload and gptransfer utilities run one or more instances of gpfdist
with ports or port ranges specified in a configuration file.</p></entry>
</row>
<row>
<entry>GPCC agents</entry>
<entry>TCP 8888</entry>
<entry>Connection port for GPCC agents executing on each Greenplum host. Configure by
setting the <codeph>gpperfmon_port</codeph> configuration variable in
<filepath>postgresql.conf</filepath> on master and segment hosts.</entry>
</row>
<row>
<entry>Backup completion notification</entry>
<entry>TCP 25, TCP 587, SMTP</entry>
<entry>The gpcrondump backup utility can optionally send email to a list of email
addresses at completion of a backup. The SMTP service must be enabled on the Greenplum
master host. </entry>
</row>
<row>
<entry>Greenplum Database secure shell (SSH): gpssh, gpscp, gpssh-exkeys, gppkg,
gpseginstall</entry>
<entry>TCP 22, SSH</entry>
<entry>Many Greenplum utilities use scp and ssh to transfer files between hosts and
manage the Greenplum system within the cluster. </entry>
</row>
<row>
<entry>gphdfs</entry>
<entry>TCP 8020</entry>
<entry>The gphdfs protocol allows access to data in a Hadoop file system via Greenplum
external tables. The URL in the <codeph>LOCATION</codeph> clause of the <codeph>CREATE
EXTERNAL TABLE</codeph> command specifies the host address and port number for the
Hadoop namenode service. </entry>
</row>
<row otherprops="pivotal">
<entry morerows="2">Greenplum Workload Manager (GP-WLM)</entry>
<entry>TCP 4369, epmd</entry>
<entry>Erlang port mapper (epmd) allows nodes in the cluster to resolve node names.
</entry>
</row>
<row otherprops="pivotal">
          <entry>TCP 25672</entry>
          <entry>RabbitMQ clustering port.</entry>
</row>
<row otherprops="pivotal">
          <entry>TCP 7777</entry>
          <entry>RabbitMQ main communication port.</entry>
</row>
<row otherprops="pivotal">
<entry morerows="3">EMC Data Domain and DD Boost</entry>
<entry>TCP/UDP 111, NFS portmapper </entry>
<entry>Used to assign a random port for the mountd service used by NFS and DD Boost. The
mountd service port can be statically assigned on the Data Domain server.</entry>
</row>
<row otherprops="pivotal">
<entry>TCP 2052</entry>
          <entry>Main port used by NFS mountd. This port can be set on the Data Domain system
            using the <codeph>nfs set mountd-port</codeph> command.</entry>
</row>
<row otherprops="pivotal">
<entry>TCP 2049, NFS</entry>
<entry>Main port used by NFS. This port can be configured using the <codeph>nfs set
server-port</codeph> command on the Data Domain server. </entry>
</row>
<row otherprops="pivotal">
<entry>TCP 2051, replication</entry>
<entry>Used when replication is configured on the Data Domain system. This port can be
configured using the <codeph>replication modify</codeph> command on the Data Domain
server.</entry>
</row>
<row>
<entry morerows="1">Symantec NetBackup</entry>
          <entry>TCP/UDP 1556, veritas-pbx</entry>
<entry>The Symantec NetBackup client network port. </entry>
</row>
<row>
<entry>TCP 13724, vnetd</entry>
<entry>Symantec NetBackup vnetd communication port.</entry>
</row>
<row>
<entry>Pgbouncer connection pooler</entry>
<entry>TCP, libpq</entry>
<entry>The pgbouncer connection pooler runs between libpq clients and Greenplum (or
PostgreSQL) databases. It can be run on the Greenplum master host, but running it on a
host outside of the Greenplum cluster is recommended. When it runs on a separate host,
pgbouncer can act as a warm standby mechanism for the Greenplum master host, switching
to the Greenplum standby host without requiring clients to reconfigure. Set the client
connection port and the Greenplum master host address and port in the
<filepath>pgbouncer.ini</filepath> configuration file. </entry>
</row>
<row>
<entry>stunnel SSL proxy</entry>
<entry>TCP, ssh, libpq</entry>
          <entry>A stunnel SSL proxy can be used to add SSL support for database clients accessing
            the database through a pgbouncer connection pool. Create a secure tunnel by running
            stunnel on both the client host and the pgbouncer host. Newer versions of stunnel
            that support encrypted libpq connections require stunnel only on the pgbouncer host.
            The stunnel proxy's connection ports and the pgbouncer host and port are specified in
            the <filepath>stunnel.conf</filepath> configuration file.</entry>
</row>
</tbody>
</tgroup>
</table>
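    <p>The segment port assignments described in the table above can be verified on a running
      cluster. The following commands are a sketch; output varies by system, and they assume a
      working <codeph>psql</codeph> connection to the master:</p>
    <codeblock># Show the client connection ports in use by all segment instances
$ gpstate -p

# Or query the system catalog directly for segment port assignments
$ psql -c "SELECT dbid, content, role, port, hostname FROM gp_segment_configuration ORDER BY content;"</codeblock>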
</body>
</topic>
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE topic
PUBLIC "-//OASIS//DTD DITA Composite//EN" "ditabase.dtd">
<topic id="topic1" xml:lang="en">
<title id="nk110126">About This Guide</title>
<body>
<p>This guide describes how to secure a Greenplum Database system. The guide consists of the
following sections:</p>
<ul id="ul_tnm_vl3_34">
<li id="nk169273"><xref href="SecuringGPDB.xml#topic_iqr_ym2_jr"/> introduces Greenplum
Database security topics. </li>
<li><xref href="ports_and_protocols.xml#greenplum_database_ports_and_protocols"/> lists
network ports and protocols used within the Greenplum cluster. </li>
<li id="nk169420"><xref href="Authenticate.xml#topic_n5w_gtd_jr"/> describes the available
methods for authenticating Greenplum Database clients. </li>
<li id="nk169450"><xref href="Authorization.xml#topic_ivr_cs2_jr"/> describes how to restrict
access to database data at the user level by using roles and permissions.</li>
<li id="nk169480"><xref href="Auditing.xml#topic_ufw_zn2_jr"/> describes Greenplum Database
events that are logged and should be monitored to detect security threats. </li>
    <li id="nk169507"><xref href="Encryption.xml#topic_th5_5bf_jr"/> describes how to encrypt data
      in the database or in transit over the network, to protect against eavesdroppers and
      man-in-the-middle attacks. </li>
<li><xref href="kerberos-hdfs.xml#topic_lhr_yrf_qr"/> provides steps for configuring Greenplum
Database to access external tables in a Hadoop cluster secured with Kerberos.</li>
<li><xref href="BestPractices.xml#topic_nyc_mdf_jr"/> provides steps for securing Greenplum
Database hosts and the cluster. </li>
</ul>
  <p>This guide assumes knowledge of Linux/UNIX system administration and database management
    systems. Familiarity with Structured Query Language (SQL) is helpful.</p>
  <p>Because Greenplum Database is based on PostgreSQL 8.3.23, this guide assumes some familiarity
with PostgreSQL. References to <xref
href="http://www.postgresql.org/docs/8.3/static/index.html" scope="external" format="html"
><ph>PostgreSQL documentation</ph></xref> are provided throughout this guide for features
that are similar to those in Greenplum Database.</p>
<p>This guide provides information for system administrators responsible for administering a
Greenplum Database system.</p>
</body>
</topic>