# 71.2. System Catalog Initial Data

71.2.1. Data File Format

71.2.2. OID Assignment

71.2.3. OID Reference Lookup

71.2.4. Automatic Creation of Array Types

71.2.5. Recipes for Editing Data Files

Each catalog that has any manually-created initial data (some do not) has a corresponding .dat file that contains its initial data in an editable format.

# 71.2.1. Data File Format

Each .dat file contains Perl data structure literals that are simply eval'd to produce an in-memory data structure consisting of an array of hash references, one per catalog row. A slightly modified excerpt from pg_database.dat will demonstrate the key features:

    [

    # A comment could appear here.
    { oid => '1', oid_symbol => 'TemplateDbOid',
      descr => 'database\'s default template',
      datname => 'template1', encoding => 'ENCODING', datcollate => 'LC_COLLATE',
      datctype => 'LC_CTYPE', datistemplate => 't', datallowconn => 't',
      datconnlimit => '-1', datlastsysoid => '0', datfrozenxid => '0',
      datminmxid => '1', dattablespace => 'pg_default', datacl => '_null_' },

    ]

Points to note:

  • The overall file layout is: open square bracket, one or more sets of curly braces each of which represents a catalog row, close square bracket. Write a comma after each closing curly brace.

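  • Within each catalog row, write comma-separated *key* => *value* pairs. The allowed *key*s are the names of the catalog's columns, plus the metadata keys oid, oid_symbol, array_type_oid, and descr. (The use of oid and oid_symbol is described in Section 71.2.2 below, while array_type_oid is described in Section 71.2.4. descr supplies a description string for the object, which will be inserted into pg_description or pg_shdescription as appropriate.) While the metadata keys are optional, the catalog's defined columns must all be provided, except when the catalog's .h file specifies a default value for the column. (In the example above, the datdba field has been omitted because pg_database.h supplies a suitable default value for it.)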

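  • All values must be single-quoted. Escape single quotes used within a value with a backslash. Backslashes meant as data can, but need not, be doubled; this follows Perl's rules for simple quoted literals. Note that backslashes appearing as data will be treated as escapes by the bootstrap scanner, according to the same rules as for escape string constants (see Section 4.1.2.2); for example \t converts to a tab character. If you actually want a backslash in the final value, you will need to write four of them: Perl strips two, leaving \\ for the bootstrap scanner to see.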

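  • Null values are represented by _null_. (Note that there is no way to create a value that is just that string.)

  • Comments are preceded by #, and must be on their own lines.

  • Field values that are OIDs of other catalog entries should be represented by symbolic names rather than actual numeric OIDs. (In the example above, dattablespace contains such a reference.) This is described in Section 71.2.3 below.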

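  • Since hashes are unordered data structures, field order and line layout aren't semantically significant. However, to maintain a consistent appearance, we set a few rules that are applied by the formatting script reformat_dat_file.pl:

    • Within each pair of curly braces, the metadata fields oid, oid_symbol, array_type_oid, and descr (if present) come first, in that order, then the catalog's own fields appear in their defined order.

    • Newlines are inserted between fields as needed to limit line length to 80 characters, if possible. A newline is also inserted between the metadata fields and the regular fields.

    • If the catalog's .h file specifies a default value for a column, and a data entry has that same value, reformat_dat_file.pl will omit it from the data file. This keeps the data representation compact.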

    • reformat_dat_file.pl preserves blank lines and comment lines as-is.

      It's recommended to run reformat_dat_file.pl before submitting catalog data patches. For convenience, you can simply change to src/include/catalog/ and run make reformat-dat-files.

  • If you want to add a new method of making the data representation smaller, you must implement it in reformat_dat_file.pl and also teach Catalog::ParseData() how to expand the data back into the full representation.
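The quoting rule in the notes above involves two layers of backslash processing, which is easy to get wrong. The following Python sketch models both layers to show why four backslashes in a data file end up as a single backslash in the final value. It is only an illustration: the function names are hypothetical, and the model of the bootstrap scanner handles just a few escapes, which is a simplifying assumption rather than the scanner's full rule set.

```python
def perl_single_quoted(body: str) -> str:
    """Decode the text between the quotes of a Perl single-quoted literal.

    Only a backslash followed by a backslash or a single quote is an escape;
    any other backslash is kept literally.
    """
    out = []
    i = 0
    while i < len(body):
        ch = body[i]
        if ch == "\\" and i + 1 < len(body) and body[i + 1] in ("\\", "'"):
            out.append(body[i + 1])
            i += 2
        else:
            out.append(ch)
            i += 1
    return "".join(out)


# Assumption: only these escapes are modeled; the real scanner knows more.
_SCANNER_ESCAPES = {"t": "\t", "n": "\n", "\\": "\\"}


def bootstrap_unescape(text: str) -> str:
    """Simplified model of the escape processing applied to a data value."""
    out = []
    i = 0
    while i < len(text):
        ch = text[i]
        if ch == "\\" and i + 1 < len(text):
            nxt = text[i + 1]
            out.append(_SCANNER_ESCAPES.get(nxt, nxt))
            i += 2
        else:
            out.append(ch)
            i += 1
    return "".join(out)


def final_value(dat_body: str) -> str:
    """Apply both layers in order: Perl first, then the scanner model."""
    return bootstrap_unescape(perl_single_quoted(dat_body))
```

Under this model, a value written as a\tb or as a\\tb in the data file both yield a tab between a and b, which matches the note that data backslashes can, but need not, be doubled.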

# 71.2.2. OID Assignment

A catalog row appearing in the initial data can be given a manually-assigned OID by writing an oid => *nnnn* metadata field. Furthermore, if an OID is assigned, a C macro for that OID can be created by writing an oid_symbol => *name* metadata field.

Pre-loaded catalog rows must have preassigned OIDs if there are OID references to them in other pre-loaded rows. A preassigned OID is also needed if the row's OID must be referenced from C code. If neither case applies, the oid metadata field can be omitted, in which case the bootstrap code assigns an OID automatically. In practice we usually preassign OIDs for all or none of the pre-loaded rows in a given catalog, even if only some of them are actually cross-referenced.

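Writing the actual numeric value of any OID in C code is considered very bad form; always use a macro instead. Direct references to pg_proc OIDs are common enough that there's a special mechanism to create the necessary macros automatically; see src/backend/utils/Gen_fmgrtab.pl. Similarly (but, for historical reasons, not done the same way) there's an automatic method for creating macros for pg_type OIDs. oid_symbol entries are therefore not necessary in those two catalogs. Likewise, macros for the pg_class OIDs of system catalogs and indexes are set up automatically. For all other system catalogs, you have to manually specify any macros you need via oid_symbol entries.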

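To find an available OID for a new pre-loaded row, run the script src/include/catalog/unused_oids. It prints inclusive ranges of unused OIDs (e.g., the output line 45-900 means OIDs 45 through 900 have not been allocated yet). Currently, OIDs 1-9999 are reserved for manual assignment; the unused_oids script simply looks through the catalog headers and .dat files to see which ones do not appear. You can also use the duplicate_oids script to check for mistakes. (genbki.pl will assign OIDs for any rows that didn't get one hand-assigned to them, and it will also detect duplicate OIDs at compile time.)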
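As a rough illustration of what such a report involves, the Python sketch below computes inclusive ranges of unused OIDs from a collection of assigned ones, and lists OIDs that were assigned more than once. It is not the project's unused_oids or duplicate_oids script: the function names, the input (a plain list of integers) and the exact output formatting are all assumptions made for this example.

```python
from collections import Counter


def unused_oid_ranges(used, first=1, last=9999):
    """Return inclusive (start, end) ranges in [first, last] absent from `used`."""
    used = set(used)
    ranges = []
    start = None  # start of the unused run currently being scanned, if any
    for oid in range(first, last + 1):
        if oid in used:
            if start is not None:
                ranges.append((start, oid - 1))
                start = None
        elif start is None:
            start = oid
    if start is not None:
        ranges.append((start, last))
    return ranges


def format_ranges(ranges):
    """Render ranges as 'a-b' lines; a one-element range is shown as 'a'."""
    return [f"{a}-{b}" if a != b else str(a) for a, b in ranges]


def duplicated_oids(assigned):
    """Return the sorted OIDs that occur more than once in `assigned`."""
    counts = Counter(assigned)
    return sorted(oid for oid, n in counts.items() if n > 1)
```

For example, with OIDs 1, 2 and 5 assigned and the search limited to 1 through 8, format_ranges(unused_oid_ranges([1, 2, 5], 1, 8)) yields the two lines 3-4 and 6-8.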

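When choosing OIDs for a patch that is not expected to be committed immediately, best practice is to use a group of more-or-less consecutive OIDs starting with some random choice in the range 8000-9999. This minimizes the risk of OID collisions with other patches being developed concurrently. To keep the 8000-9999 range free for development purposes, after a patch has been committed to the master git repository its OIDs should be renumbered into available space below that range. Typically, this will be done near the end of each development cycle, moving all OIDs consumed by patches committed in that cycle at the same time. The script renumber_oids.pl can be used for this purpose. If an uncommitted patch is found to have OID conflicts with some recently-committed patch, renumber_oids.pl may also be useful for recovering from that situation.

Because of this convention of possibly renumbering OIDs assigned by patches, the OIDs assigned by a patch should not be considered stable until the patch has been included in an official release. We do not change manually-assigned object OIDs once released, however, as that would create assorted compatibility problems.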

If genbki.pl needs to assign an OID to a catalog entry that does not have a manually-assigned OID, it will use a value in the range 10000 to 11999. The server's OID counter is set to 12000 at the start of a bootstrap run. Thus objects created by regular SQL commands during the later phases of bootstrap, such as objects created while running the information_schema.sql script, receive OIDs of 12000 or above.

OIDs assigned during normal database operation are constrained to be 16384 or higher. This ensures that the range 10000 to 16383 is free for OIDs assigned automatically by genbki.pl or during bootstrap. These automatically-assigned OIDs are not considered stable, and may change from one installation to another.
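The ranges described in this section can be summarized in a small Python sketch that classifies an OID by how it would have been assigned. The boundary values come from the text above; the constant names, the function name and the category labels are invented for this example and are not taken from the PostgreSQL source.

```python
# Boundaries as stated in the text; the names are hypothetical.
LAST_MANUAL_OID = 9999       # 1-9999: reserved for manual assignment
LAST_GENBKI_OID = 11999      # 10000-11999: assigned by genbki.pl
LAST_BOOTSTRAP_OID = 16383   # 12000-16383: assigned during later bootstrap phases


def classify_oid(oid: int) -> str:
    """Return a label describing which assignment range `oid` falls into."""
    if oid < 1:
        raise ValueError(f"not a valid object OID: {oid}")
    if oid <= LAST_MANUAL_OID:
        return "manual"
    if oid <= LAST_GENBKI_OID:
        return "genbki"
    if oid <= LAST_BOOTSTRAP_OID:
        return "bootstrap"
    return "normal"


def is_stable(oid: int) -> bool:
    """Only manually assigned OIDs are described as stable (once released)."""
    return classify_oid(oid) == "manual"
```

Note that is_stable ignores the caveat that an OID assigned by a patch is not stable until the patch has appeared in an official release; the sketch has no way to know that.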

# 71.2.3. OID Reference Lookup

In principle, cross-references from one initial catalog row to another could be written just by writing the preassigned OID of the referenced row in the referencing field. However, that is against project policy, because it is error-prone, hard to read, and subject to breakage if a newly-assigned OID is renumbered. Therefore genbki.pl provides mechanisms to write symbolic references instead. The rules are as follows:

  • Use of symbolic references is enabled in a particular catalog column by attaching BKI_LOOKUP(*lookuprule*) to the column's definition, where *lookuprule* is the name of the referenced catalog, e.g., pg_proc. BKI_LOOKUP can be attached to columns of type Oid, regproc, oidvector, or Oid[]; in the latter two cases it implies performing a lookup on each element of the array.

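  • It is also allowed to attach BKI_LOOKUP(encoding) to integer columns to reference character set encodings, which are not currently represented as catalog OIDs, but have a set of values known to genbki.pl.

  • In some catalog columns, it's allowed for entries to be zero instead of a valid reference. If this is allowed, write BKI_LOOKUP_OPT instead of BKI_LOOKUP. Then you can write 0 for an entry. (If the column is declared regproc, you can optionally write - instead of 0.) Except for this special case, all entries in a BKI_LOOKUP column must be symbolic references. genbki.pl will warn about unrecognized names.

  • Most kinds of catalog objects are simply referenced by their names. Note that type names must exactly match the referenced pg_type entry's typname; you do not get to use any aliases such as integer for int4.

  • A function can be represented by its proname, if that is unique among the pg_proc.dat entries (this works like regproc input). Otherwise, write it as *proname(argtypename,argtypename,...)*, like regprocedure. The argument type names must be spelled exactly as they are in the pg_proc.dat entry's proargtypes field. Do not insert any spaces.

  • Operators are represented by *oprname(lefttype,righttype)*, writing the type names exactly as they appear in the pg_operator.dat entry's oprleft and oprright fields. (Write 0 for the omitted operand of a unary operator.)

  • The names of opclasses and opfamilies are only unique within an access method, so they are represented by *access_method_name/object_name*.

  • In none of these cases is there any provision for schema-qualification; all objects created during bootstrap are expected to be in the pg_catalog schema.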

genbki.pl resolves all symbolic references while it runs, and puts simple numeric OIDs into the emitted BKI file. There is therefore no need for the bootstrap backend to deal with symbolic references.
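The lookup rules above can be pictured with a short Python sketch that turns a symbolic reference into a numeric OID. This is a simplified model and not how genbki.pl is written: the function names are hypothetical, the lookup tables hold only a few sample entries, the element-wise helper assumes a whitespace-separated list, and an unrecognized name raises an error here, whereas the text only says that genbki.pl warns.

```python
# Small sample lookup tables, keyed by object name (illustrative only).
TABLESPACE_OIDS = {"pg_default": 1663, "pg_global": 1664}
TYPE_OIDS = {"int4": 23, "text": 25}


def resolve_reference(value, lookup, optional=False, is_regproc=False):
    """Resolve one symbolic reference to a numeric OID.

    `optional` models a BKI_LOOKUP_OPT column, where '0' is accepted
    (and '-' as well when the column is declared regproc).
    """
    if optional and (value == "0" or (is_regproc and value == "-")):
        return 0
    if value not in lookup:
        raise LookupError(f"unrecognized name: {value!r}")
    return lookup[value]


def resolve_vector(value, lookup):
    """Resolve each element of a whitespace-separated list of names."""
    return [resolve_reference(name, lookup) for name in value.split()]
```

With these tables, resolve_reference("pg_default", TABLESPACE_OIDS) gives 1663, while resolve_reference("integer", TYPE_OIDS) fails, since aliases such as integer are not accepted in place of int4.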

It's desirable to mark OID reference columns with BKI_LOOKUP or BKI_LOOKUP_OPT even if the catalog has no initial data that requires lookup. This allows genbki.pl to record the foreign key relationships that exist in the system catalogs. That information is used in the regression tests to check for incorrect entries. See also the macros DECLARE_FOREIGN_KEY, DECLARE_FOREIGN_KEY_OPT, DECLARE_ARRAY_FOREIGN_KEY, and DECLARE_ARRAY_FOREIGN_KEY_OPT, which are used to declare foreign key relationships that are too complex for BKI_LOOKUP (typically, multi-column foreign keys).

# 71.2.4. Automatic Creation of Array Types

Most scalar data types should have a corresponding array type (that is, a standard varlena array type whose element type is the scalar type, and which is referenced by the typarray field of the scalar type's pg_type entry). genbki.pl is able to generate the pg_type entry for the array type automatically in most cases.

To use this facility, just write an array_type_oid => *nnnn* metadata field in the scalar type's pg_type entry, specifying the OID to use for the array type. You may then omit the typarray field, since it will be filled automatically with that OID.

The generated array type's name is the scalar type's name with an underscore prepended. The array entry's other fields are filled from BKI_ARRAY_DEFAULT(*value*) annotations in pg_type.h, or if there isn't one, copied from the scalar type. (There's also a special case for typalign.) Then the typelem and typarray fields of the two entries are set to cross-reference each other.
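The derivation just described can be sketched in Python as follows. This is an approximation for illustration only: the function name is hypothetical, the array defaults are passed in as a plain dictionary standing in for the BKI_ARRAY_DEFAULT annotations, the special case for typalign is not modeled, and leaving the metadata fields (such as descr) out of the generated entry is an assumption of this sketch.

```python
METADATA_KEYS = ("oid", "oid_symbol", "array_type_oid", "descr")


def generate_array_type(scalar, array_defaults):
    """Return (updated_scalar, array_entry) for a scalar pg_type row.

    `scalar` must contain 'oid', 'typname' and 'array_type_oid'.
    `array_defaults` maps column names to the values an array type should get.
    """
    array_oid = scalar["array_type_oid"]

    # Start from the scalar's regular columns, then apply the array defaults.
    array = {k: v for k, v in scalar.items() if k not in METADATA_KEYS}
    array.update(array_defaults)

    # Name: the scalar type's name with an underscore prepended.
    array["oid"] = array_oid
    array["typname"] = "_" + scalar["typname"]

    # Cross-references: the array points at its element type, and the
    # scalar's typarray is filled with the array type's OID.
    array["typelem"] = scalar["oid"]
    updated = {k: v for k, v in scalar.items() if k != "array_type_oid"}
    updated["typarray"] = array_oid
    return updated, array
```

For a scalar row with typname int4, oid 23 and array_type_oid 1007, this produces an array row named _int4 with oid 1007 and typelem 23, and sets the scalar row's typarray to 1007.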

# 71.2.5. Recipes for Editing Data Files

Here are some suggestions about the easiest ways to perform common tasks when updating catalog data files.

**Add a new column with a default to a catalog:** Add the column to the header file with a BKI_DEFAULT(*value*) annotation. The data file need only be adjusted by adding the field in existing rows where a non-default value is needed.

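**Add a default value to an existing column that doesn't have one:** Add a BKI_DEFAULT annotation to the header file, then run make reformat-dat-files to remove now-redundant field entries.

**Remove a column, whether it has a default or not:** Remove the column from the header, then run make reformat-dat-files to remove now-useless field entries.

**Change or remove an existing default value:** You cannot simply change the header file, since that will cause the current data to be interpreted incorrectly. First run make expand-dat-files to rewrite the data files with all default values inserted explicitly, then change or remove the BKI_DEFAULT annotation, then run make reformat-dat-files to remove superfluous fields again.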
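To see why the order of operations matters when a default changes, consider the following Python sketch. The helpers expand and compact are hypothetical stand-ins for what expanding and reformatting a data file do to a single row, and the column name someflag is invented; the real tools operate on whole files and know the defaults from the catalog headers.

```python
def expand(row, defaults):
    """Fill in every defaulted column that the row does not mention."""
    return {**defaults, **row}


def compact(row, defaults):
    """Drop every field whose value equals the column's default."""
    return {k: v for k, v in row.items()
            if not (k in defaults and defaults[k] == v)}


old_defaults = {"someflag": "f"}
new_defaults = {"someflag": "t"}

# A compact row that relies on the old default: someflag is implicitly 'f'.
stored = {"name": "example"}

# Wrong: changing only the header silently changes the row's meaning.
misread = expand(stored, new_defaults)

# Right: expand under the old default, switch defaults, then compact again.
rewritten = compact(expand(stored, old_defaults), new_defaults)
reread = expand(rewritten, new_defaults)
```

Here misread["someflag"] is 't', which is not what the row meant, while rewritten carries an explicit someflag of 'f' and reread therefore still has the original value.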

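**Ad-hoc bulk editing:** reformat_dat_file.pl can be adapted to perform many kinds of bulk changes. Look for its block comments showing where one-off code can be inserted. In the following example, we are going to consolidate two Boolean fields in pg_proc into a char field: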

  1. Add the new column, with a default, to pg_proc.h:

    +    /* see PROKIND_ categories below */
    +    char        prokind BKI_DEFAULT(f);
    
  2. Create a new script based on reformat_dat_file.pl to insert appropriate values on-the-fly:

    -           # At this point we have the full row in memory as a hash
    -           # and can do any operations we want. As written, it only
    -           # removes default values, but this script can be adapted to
    -           # do one-off bulk-editing.
    +           # One-off change to migrate to prokind
    +           # Default has already been filled in by now, so change to other
    +           # values as appropriate
    +           if ($values{proisagg} eq 't')
    +           {
    +               $values{prokind} = 'a';
    +           }
    +           elsif ($values{proiswindow} eq 't')
    +           {
    +               $values{prokind} = 'w';
    +           }
    
  3. Run the new script:

    $ cd src/include/catalog
    $ perl  rewrite_dat_with_prokind.pl  pg_proc.dat
    

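    At this point pg_proc.dat has all three columns, prokind, proisagg, and proiswindow, though they will appear only in rows where they have non-default values.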

  4. Remove the old columns from pg_proc.h:

    -    /* is it an aggregate? */
    -    bool        proisagg BKI_DEFAULT(f);
    -
    -    /* is it a window function? */
    -    bool        proiswindow BKI_DEFAULT(f);
    
  5. Finally, run make reformat-dat-files to remove the useless old entries from pg_proc.dat.

    For further examples of scripts used for bulk editing, see convert_oid2name.pl and remove_pg_type_oid_symbols.pl attached to this message: https://www.postgresql.org/message-id/CAJVSVGVX8gXnPm+Xa=DxR7kFYprcQ1tNcCT5D0O3ShfnM6jehA@mail.gmail.com
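For readers who want to trace the effect of the five steps on individual rows, the Python sketch below replays the prokind migration on in-memory dictionaries. It is a model of the data flow only, not a replacement for the Perl script: the helper names are hypothetical, rows are plain dictionaries, and the sample rows are abbreviated to the fields that matter here.

```python
def expand(row, defaults):
    """Fill in defaulted columns the row does not mention."""
    return {**defaults, **row}


def compact(row, defaults):
    """Drop fields whose value equals the column's default."""
    return {k: v for k, v in row.items()
            if not (k in defaults and defaults[k] == v)}


def set_prokind(row):
    """Mirror of the one-off edit: derive prokind from the two old flags."""
    row = dict(row)
    if row["proisagg"] == "t":
        row["prokind"] = "a"
    elif row["proiswindow"] == "t":
        row["prokind"] = "w"
    return row


def migrate(rows):
    """Run steps 1 to 5 on a list of compact pg_proc-like rows."""
    with_old = {"prokind": "f", "proisagg": "f", "proiswindow": "f"}
    without_old = {"prokind": "f"}
    result = []
    for row in rows:
        full = set_prokind(expand(row, with_old))   # steps 1 to 3
        for old in ("proisagg", "proiswindow"):     # step 4: columns removed
            del full[old]
        result.append(compact(full, without_old))   # step 5: reformat
    return result
```

A row marked proisagg => 't' comes out with prokind => 'a' and no proisagg field, and a row with neither flag set comes out unchanged, because prokind keeps its default and is therefore omitted.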