Greenplum
Gpdb
Commit 014a86ac
Authored Aug 11, 2002 by Tom Lane
Editorial improvements.
Parent 74ce5c93
Showing 1 changed file with 38 additions and 50 deletions
doc/src/sgml/ref/cluster.sgml
<!--
-$Header: /cvsroot/pgsql/doc/src/sgml/ref/cluster.sgml,v 1.18 2002/08/10 21:03:33 momjian Exp $
+$Header: /cvsroot/pgsql/doc/src/sgml/ref/cluster.sgml,v 1.19 2002/08/11 02:43:57 tgl Exp $
PostgreSQL documentation
-->
...
...
@@ -73,19 +73,6 @@ CLUSTER
</para>
</listitem>
</varlistentry>
-<varlistentry>
-<term><computeroutput>
-ERROR: Relation <replaceable class="PARAMETER">table</replaceable> does not exist!
-</computeroutput></term>
-<listitem>
-<para>
-<comment>
-The specified relation was not shown in the error message,
-which contained a random string instead of the relation name.
-</comment>
-</para>
-</listitem>
-</varlistentry>
</variablelist>
</para>
</refsect2>
...
...
@@ -101,7 +88,7 @@ ERROR: Relation <replaceable class="PARAMETER">table</replaceable> does not exis
<para>
<command>CLUSTER</command> instructs <productname>PostgreSQL</productname>
to cluster the table specified
-by <replaceable class="parameter">table</replaceable> approximately
+by <replaceable class="parameter">table</replaceable>
based on the index specified by
<replaceable class="parameter">indexname</replaceable>. The index must
already have been defined on
...
...
@@ -110,11 +97,11 @@ ERROR: Relation <replaceable class="PARAMETER">table</replaceable> does not exis
<para>
When a table is clustered, it is physically reordered
-based on the index information. The clustering is static.
-In other words, as the table is updated, the changes are
-not clustered. No attempt is made to keep new instances or
-updated tuples clustered. If one wishes, one can
-re-cluster manually by issuing the command again.
+based on the index information. Clustering is a one-time operation:
+when the table is subsequently updated, the changes are
+not clustered. That is, no attempt is made to store new or
+updated tuples according to their index order. If one wishes, one can
+periodically re-cluster by issuing the command again.
</para>
<refsect2 id="R2-SQL-CLUSTER-3">
...
...
@@ -146,18 +133,34 @@ ERROR: Relation <replaceable class="PARAMETER">table</replaceable> does not exis
</para>
<para>
-There are two ways to cluster data. The first is with the
-<command>CLUSTER</command> command, which reorders the original table with
+During the cluster operation, a temporary copy of the table is created
+that contains the table data in the index order. Temporary copies of
+each index on the table are created as well. Therefore, you need free
+space on disk at least equal to the sum of the table size and the index
+sizes.
+</para>
+<para>
+CLUSTER preserves GRANT, inheritance, index, foreign key, and other
+ancillary information about the table.
+</para>
+<para>
+Because the optimizer records statistics about the ordering of tables, it
+is advisable to run <command>ANALYZE</command> on the newly clustered
+table. Otherwise, the optimizer may make poor choices of query plans.
+</para>
+<para>
+There is another way to cluster data. The
+<command>CLUSTER</command> command reorders the original table using
the ordering of the index you specify. This can be slow
on large tables because the rows are fetched from the heap
in index order, and if the heap table is unordered, the
entries are on random pages, so there is one disk page
-retrieved for every row moved. <productname>PostgreSQL</productname> has a cache,
-but the majority of a big table will not fit in the cache.
-</para>
-<para>
-Another way to cluster data is to use
+retrieved for every row moved. (<productname>PostgreSQL</productname> has a cache,
+but the majority of a big table will not fit in the cache.)
+The other way to cluster a table is to use
<programlisting>
SELECT <replaceable class="parameter">columnlist</replaceable> INTO TABLE <replaceable class="parameter">newtable</replaceable>
...
...
@@ -165,30 +168,15 @@ SELECT <replaceable class="parameter">columnlist</replaceable> INTO TABLE <repla
</programlisting>
which uses the <productname>PostgreSQL</productname> sorting code in
-the ORDER BY clause to match the index, and which is much faster for
+the ORDER BY clause to create the desired order; this is usually much
+faster than an indexscan for
unordered data. You then drop the old table, use
<command>ALTER TABLE...RENAME</command>
to rename <replaceable class="parameter">newtable</replaceable> to the old name, and
-recreate the table's indexes. The only problem is that <acronym>OID</acronym>s
-will not be preserved. From then on, <command>CLUSTER</command> should be
-fast because most of the heap data has already been
-ordered, and the existing index is used.
-</para>
-<para>
-During the cluster operation, a temporal table is created that contains
-the table in the index order. Due to this, you need to have free space
-on disk at least the size of the table itself, or the biggest index if
-you have one that is larger than the table.
-</para>
-<para>
-CLUSTER preserves GRANT, inheritance index, and foreign key information.
-</para>
-<para>
-Because the optimizer records the cluster status of tables, it is
-advised to run <command>ANALYZE</command> on the newly clustered table.
+recreate the table's indexes. However, this approach does not preserve
+OIDs, constraints, foreign key relationships, granted privileges, and
+other ancillary properties of the table --- all such items must be
+manually recreated.
</para>
</refsect2>
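Reviewer's illustration (not part of the commit): the copy, drop, rename, and re-index sequence that the revised paragraph describes can be sketched as below. This is only a minimal sketch under stated assumptions. It uses Python's built-in `sqlite3` module as a stand-in for PostgreSQL, so `CREATE TABLE ... AS SELECT` takes the place of `SELECT ... INTO TABLE`. The helper name `rebuild_in_order` and the `_sorted` suffix are hypothetical, and the `emp` / `emp_ind` names are borrowed from the Usage example in this file.

```python
import sqlite3


def rebuild_in_order(conn, table, order_col, index_name):
    """Hypothetical helper: rewrite `table` sorted by `order_col`, then swap it in.

    Assumption: identifiers are trusted, since they are interpolated into SQL.
    """
    cur = conn.cursor()
    # 1. Copy the rows into a new table using the sort code (ORDER BY).
    cur.execute(
        f'CREATE TABLE "{table}_sorted" AS '
        f'SELECT * FROM "{table}" ORDER BY "{order_col}"'
    )
    # 2. Drop the old table; in SQLite this also drops its indexes.
    cur.execute(f'DROP TABLE "{table}"')
    # 3. Rename the sorted copy to the old name.
    cur.execute(f'ALTER TABLE "{table}_sorted" RENAME TO "{table}"')
    # 4. Recreate the index by hand; nothing else about the old table survives.
    cur.execute(f'CREATE INDEX "{index_name}" ON "{table}" ("{order_col}")')
    conn.commit()


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE emp (id INTEGER, name TEXT)")
conn.execute("CREATE INDEX emp_ind ON emp (id)")
conn.executemany("INSERT INTO emp VALUES (?, ?)", [(3, "c"), (1, "a"), (2, "b")])

rebuild_in_order(conn, "emp", "id", "emp_ind")

# rowid reflects insertion order in the rebuilt table, so this shows the new layout.
rows = conn.execute("SELECT rowid, id FROM emp ORDER BY rowid").fetchall()
indexes = conn.execute("SELECT name FROM sqlite_master WHERE type = 'index'").fetchall()
```

As the new text warns, constraints, privileges, and similar properties are not carried over by a copy of this kind; the sketch recreates only the single index.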
...
...
@@ -199,7 +187,7 @@ SELECT <replaceable class="parameter">columnlist</replaceable> INTO TABLE <repla
Usage
</title>
<para>
-Cluster the employees relation on the basis of its salary attribute:
+Cluster the employees relation on the basis of its ID attribute:
</para>
<programlisting>
CLUSTER emp_ind ON emp;
...
...