提交 1c097891 编写于 作者: J J. Bruce Fields

user-manual: rewrite index discussion

Add an example using git-ls-files, standardize on the new "index"
terminology (as opposed to "cache"), attempt to clarify discussion and
make it a little shorter, avoid some unnecessary jargon ("write-back
cache").
Signed-off-by: NJ. Bruce Fields <bfields@citi.umich.edu>
上级 1c6045ff
......@@ -2911,57 +2911,63 @@ gitlink:git-verify-tag[1].
[[the-index]]
The "index" aka "Current Directory Cache"
-----------------------------------------
The index
-----------
The index is a binary file (generally kept in .git/index) containing a
sorted list of path names, each with permissions and the SHA1 of a blob
object; gitlink:git-ls-files[1] can show you the contents of the index:
The index is a simple binary file, which contains an efficient
representation of the contents of a virtual directory. It
does so by a simple array that associates a set of names, dates,
permissions and content (aka "blob") objects together. The cache is
always kept ordered by name, and names are unique (with a few very
specific rules) at any point in time, but the cache has no long-term
meaning, and can be partially updated at any time.
In particular, the index certainly does not need to be consistent with
the current directory contents (in fact, most operations will depend on
different ways to make the index 'not' be consistent with the directory
hierarchy), but it has three very important attributes:
'(a) it can re-generate the full state it caches (not just the
directory structure: it contains pointers to the "blob" objects so
that it can regenerate the data too)'
As a special case, there is a clear and unambiguous one-way mapping
from a current directory cache to a "tree object", which can be
efficiently created from just the current directory cache without
actually looking at any other data. So a directory cache at any one
time uniquely specifies one and only one "tree" object (but has
additional data to make it easy to match up that tree object with what
has happened in the directory)
'(b) it has efficient methods for finding inconsistencies between that
cached state ("tree object waiting to be instantiated") and the
current state.'
'(c) it can additionally efficiently represent information about merge
conflicts between different tree objects, allowing each pathname to be
-------------------------------------------------
$ git ls-files --stage
100644 63c918c667fa005ff12ad89437f2fdc80926e21c 0 .gitignore
100644 5529b198e8d14decbe4ad99db3f7fb632de0439d 0 .mailmap
100644 6ff87c4664981e4397625791c8ea3bbb5f2279a3 0 COPYING
100644 a37b2152bd26be2c2289e1f57a292534a51a93c7 0 Documentation/.gitignore
100644 fbefe9a45b00a54b58d94d06eca48b03d40a50e0 0 Documentation/Makefile
...
100644 2511aef8d89ab52be5ec6a5e46236b4b6bcd07ea 0 xdiff/xtypes.h
100644 2ade97b2574a9f77e7ae4002a4e07a6a38e46d07 0 xdiff/xutils.c
100644 d5de8292e05e7c36c4b68857c1cf9855e3d2f70a 0 xdiff/xutils.h
-------------------------------------------------
Note that in older documentation you may see the index called the
"current directory cache" or just the "cache". It has three important
properties:
1. The index contains all the information necessary to generate a single
(uniquely determined) tree object.
+
For example, running gitlink:git-commit[1] generates this tree object
from the index, stores it in the object database, and uses it as the
tree object associated with the new commit.
2. The index enables fast comparisons between the tree object it defines
and the working tree.
+
It does this by storing some additional data for each entry (such as
the last modified time). This data is not displayed above, and is not
stored in the created tree object, but it can be used to determine
quickly which files in the working directory differ from what was
stored in the index, and thus save git from having to read all of the
data from such files to look for changes.
3. It can efficiently represent information about merge conflicts
between different tree objects, allowing each pathname to be
associated with sufficient information about the trees involved that
you can create a three-way merge between them.'
Those are the ONLY three things that the directory cache does. It's a
cache, and the normal operation is to re-generate it completely from a
known tree object, or update/compare it with a live tree that is being
developed. If you blow the directory cache away entirely, you generally
haven't lost any information as long as you have the name of the tree
that it described.
At the same time, the index is also the staging area for creating
new trees, and creating a new tree always involves a controlled
modification of the index file. In particular, the index file can
have the representation of an intermediate tree that has not yet been
instantiated. So the index can be thought of as a write-back cache,
which can contain dirty information that has not yet been written back
to the backing store.
you can create a three-way merge between them.
+
We saw in <<conflict-resolution>> that during a merge the index can
store multiple versions of a single file (called "stages"). The third
column in the gitlink:git-ls-files[1] output above is the stage
number, and will take on values other than 0 for files with merge
conflicts.
The index is thus a sort of temporary staging area, which is filled with
a tree which you are in the process of working on.
If you blow the index away entirely, you generally haven't lost any
information as long as you have the name of the tree that it described.
[[low-level-operations]]
Low-level git operations
......
Markdown is supported
0% .
You are about to add 0 people to the discussion. Proceed with caution.
先完成此消息的编辑!
想要评论请 注册