Commits · ebea9dd4f1b62cb3c8302f10aaca3af0231e9818 · 李少辉-开发者 / git

19 Jan, 2007 · 2 commits

Use fixed-size integers when writing out the index in fast-import. · ebea9dd4

Committed by Shawn O. Pearce on Jan 18, 2007

Currently the pack .idx file format uses 32-bit unsigned integers
for the fan-out table and the object offsets.  We had previously
defined these as 'unsigned int', but not every system will define
that type to be a 32 bit value.  To ensure maximum portability we
should always use 'uint32_t'.
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
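
A minimal sketch of why the fixed-width type matters when the fan-out table is serialized. The helper names `put_be32` and `build_fanout` are hypothetical and are not fast-import's own functions; the sketch assumes the usual .idx layout of 256 fan-out entries, each a 4-byte count in network byte order.

```c
#include <stddef.h>
#include <stdint.h>

/* Store a 32-bit value in network (big-endian) byte order. */
void put_be32(unsigned char *out, uint32_t v)
{
	out[0] = (unsigned char)(v >> 24);
	out[1] = (unsigned char)(v >> 16);
	out[2] = (unsigned char)(v >> 8);
	out[3] = (unsigned char)v;
}

/*
 * Fill a 256-entry fan-out table: entry i holds the number of objects
 * whose first SHA-1 byte is <= i.  first_bytes[] (hypothetical input)
 * holds the first byte of each object name.
 */
void build_fanout(const unsigned char *first_bytes, size_t n,
		  unsigned char out[256 * 4])
{
	uint32_t counts[256] = {0};
	uint32_t total = 0;
	size_t i;

	for (i = 0; i < n; i++)
		counts[first_bytes[i]]++;
	for (i = 0; i < 256; i++) {
		total += counts[i];
		put_be32(out + i * 4, total);
	}
}
```

Because every entry is a `uint32_t` written as exactly four bytes, the result does not depend on how wide `unsigned int` happens to be on the build platform.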

Always use struct pack_header for pack header in fast-import. · 566f4425

Committed by Shawn O. Pearce on Jan 18, 2007

Previously we were using 'unsigned int' to update the hdr_entries
field of the pack header after the file had been completed and
was being hashed.  This may not be 32 bits on all platforms.
Instead we want to always use uint32_t.

I'm actually cheating here by just using the pack_header like the
rest of Git and letting the struct definition declare the correct
type.  Right now that field is still 'unsigned int' (wrong) but a
pending change submitted by Simon 'corecode' Schubert changes it
to uint32_t.  After that change is merged in fast-import will do
the right thing all of the time.
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>

17 Jan, 2007 · 5 commits

Correct packfile edge output in fast-import. · 69e74e74

Committed by Shawn O. Pearce on Jan 17, 2007

Branches are only contained by a packfile if the branch actually
had its most recent commit in that packfile.  So new branches are
set to MAX_PACK_ID to ensure they don't cause their commit to list
as part of the first packfile when it closes out if the commit was
actually in existence before fast-import started.

Also corrected the type of last_commit to be uintmax_t to prevent
overflow and wraparound on very large imports.  Though that is
highly unlikely to occur as we're talking 4 billion commits, which
no real project has right now.
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>

Declare no-arg functions as (void) in fast-import. · fd99224e

Committed by Shawn O. Pearce on Jan 17, 2007

Apparently the git convention is to declare any function which
takes no arguments as taking void.  I did not do this during the
early fast-import development, but should have.
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
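
For illustration, the two declaration styles look like this. The function names are hypothetical; the point is only the difference between an empty parameter list and an explicit `(void)`.

```c
/*
 * Old style: before C23 an empty parameter list leaves the arguments
 * unspecified, so a mistaken call such as count_old(1, 2) can compile
 * without a diagnostic.
 */
int count_old();

/*
 * Preferred: (void) states that the function takes no arguments, so
 * the compiler rejects count_new(1, 2).
 */
int count_new(void);

int count_old()
{
	return 1;
}

int count_new(void)
{
	return 2;
}
```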

Correct a few types to be unsigned in fast-import. · 6f64f6d9

Committed by Shawn O. Pearce on Jan 17, 2007

The length of an atom string cannot be negative.  So make it
explicit and declare it as an unsigned value.

The shift width in a mark table node also cannot be negative.
I'm also moving it to after the pointer arrays to prevent any
possible alignment problems on a 64 bit system.
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>

Corrected BNF input documentation for fast-import. · 2104838b

Committed by Shawn O. Pearce on Jan 17, 2007

Now that fast-import uses uintmax_t (the largest available unsigned
integer type) for marks we don't want to say it's an unsigned 32
bit integer in ASCII base 10 notation.  It could be much larger,
especially on 64 bit systems, and especially if a frontend uses
a very large number of marks (1 per file revision on a very, very
large import).
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>

Print out the edge commits for each packfile in fast-import. · 2369ed79

Committed by Shawn O. Pearce on Jan 16, 2007

To help callers repack very large repositories into a series of
packfiles fast-import now outputs the last commits/tags it wrote to
a packfile when it prints out the packfile name.  This information
can be fed to pack-objects --revs to repack.  For the first pack
of an initial import this is pretty easy (just feed those SHA1s on
stdin) but for subsequent packs you want to feed the subsequent
pack's final SHA1s but also all prior packs' SHA1s prefixed with
the negation operator.  This way the prior pack's data does not
get included into the subsequent pack.
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
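
The input described above could be assembled roughly as follows. This is a sketch under assumptions: `format_revs` and its parameters are hypothetical names, and the caller is assumed to have already collected the edge SHA1s that fast-import printed for each packfile.

```c
#include <stddef.h>
#include <stdio.h>

/*
 * Write the --revs input for one pack into buf: the pack's own edge
 * commits as-is, then every earlier pack's edge commits prefixed with
 * '^' (the negation operator).  Returns the number of bytes written,
 * or 0 if buf is too small.
 */
size_t format_revs(char *buf, size_t cap,
		   const char *const *include, size_t n_include,
		   const char *const *exclude, size_t n_exclude)
{
	size_t used = 0, i;
	int n;

	for (i = 0; i < n_include; i++) {
		n = snprintf(buf + used, cap - used, "%s\n", include[i]);
		if (n < 0 || (size_t)n >= cap - used)
			return 0;
		used += (size_t)n;
	}
	for (i = 0; i < n_exclude; i++) {
		n = snprintf(buf + used, cap - used, "^%s\n", exclude[i]);
		if (n < 0 || (size_t)n >= cap - used)
			return 0;
		used += (size_t)n;
	}
	return used;
}
```

The resulting text would then be written to the standard input of `git pack-objects --revs`. For the first pack the exclude list is simply empty.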

16 Jan, 2007 · 9 commits

Correct object_count type and stat output in fast-import. · a7ddc487

Committed by Shawn O. Pearce on Jan 16, 2007

Since object_count is limited to 'unsigned long' (really an
unsigned 32 bit integer value) by the pack file format we may as
well use exactly that type here in fast-import for that counter.
An earlier change by me incorrectly made it uintmax_t.

But since object_count is a counter for the current packfile only,
we don't want to output its value at the end.  Instead we should
sum up the individual type counters and report that total, as that
will cover all of the packfiles.
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>

Correct max_packsize default in fast-import. · eec11c24

Committed by Shawn O. Pearce on Jan 16, 2007

Apparently amd64 has defined 'unsigned long' to be a 64 bit value,
which means -1 was way over the 4 GiB packfile limit.  Whoops.
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
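
A small sketch of the pitfall. The function name `default_max_packsize` is hypothetical, and the exact expression used in fast-import may differ; the sketch only shows a default that does not depend on the width of `unsigned long`.

```c
#include <stdint.h>

/*
 * Largest size a packfile may have under the 4 GiB limit: 2^32 - 1
 * bytes.  Spelling the constant out avoids relying on the width of
 * 'unsigned long'; on amd64 that type is 64 bits wide, so
 * (unsigned long)-1 is far above the limit.
 */
uintmax_t default_max_packsize(void)
{
	return (UINTMAX_C(1) << 32) - 1;
}
```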

Remove unnecessary pack_fd global in fast-import. · 0fcbcae7

Committed by Shawn O. Pearce on Jan 16, 2007

Much like the pack_sha1 the pack_fd is an unnecessary global
variable; we already have the fd stored in our struct packed_git
*pack_data so that the core library functions in sha1_file.c are
able to look up and decompress object data that we have previously
written.  Keeping an extra copy of this value in our own variable
is just a hold-over from earlier versions of fast-import and is
now completely unnecessary.
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>

Ensure we close the packfile after creating it in fast-import. · 12801587

Committed by Shawn O. Pearce on Jan 16, 2007

Because we are renaming the packfile into its file destination we
need to be sure it's not open when the rename is called, otherwise
some operating systems (e.g. Windows) may prevent the rename from
occurring.
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>

Use .keep files in fast-import during processing. · 8455e484

Committed by Shawn O. Pearce on Jan 16, 2007

Because fast-import automatically updates all references (heads
and tags) at the end of its run the repository is corrupt unless
the objects are available in the .git/objects/pack directory prior
to the refs being modified.  The easiest way to ensure that is true
is to move the packfile and its associated index directly into the
.git/objects/pack directory as soon as we have finished output to it.

But the only safe way to do this is to create a temporary .keep
file for that pack, so we use the same tricks that index-pack uses
when it's being invoked by receive-pack.
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>

Reuse sha1 in packed_git in fast-import. · 09543c96

Committed by Shawn O. Pearce on Jan 16, 2007

Rather than maintaining our own packfile level sha1 variable we
can make use of the one already available in struct packed_git.
It's meant for the SHA1 of the index but it can also hold the
SHA1 of the packfile itself between final checksumming of the
packfile and creation of the index.
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>

Replace redundant yread() with read_in_full() in fast-import. · 6cf09261

Committed by Shawn O. Pearce on Jan 16, 2007

Prior to git having read_in_full() fast-import used its own private
function yread to perform the header reading task.  No sense in
keeping that around now that read_in_full is a public, stable
function.
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>

Use uintmax_t for marks in fast-import. · 0ea9f045

Committed by Shawn O. Pearce on Jan 16, 2007

If a frontend wants to use a mark per file revision and per commit
and is doing a truly huge import (such as a 32 GiB SVN repository)
we may need more than 2**32 unique mark values, especially if the
frontend is unable (or unwilling) to recycle mark values.  For mark
idnums we should use the largest unsigned integer type available,
hoping that will be at least 64 bits when we are compiled as a 64
bit executable.  This way we may consume huge amounts of memory
storing our mark table, but we'll at least be able to process
the entire import without failing.
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
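
One way a mark table keyed by `uintmax_t` can stay sparse is a radix tree that grows upward as larger idnums arrive. The sketch below is an illustration under assumptions, not fast-import's actual code: the names `mark_node`, `new_node`, `insert_mark` and `find_mark`, the 1024-way fan-out, and the use of `calloc` are all choices made for this example, and nodes are never freed.

```c
#include <stdint.h>
#include <stdlib.h>

#define FANOUT_BITS 10
#define FANOUT (1u << FANOUT_BITS)

/* One node of the table; leaves (shift == 0) store the values. */
struct mark_node {
	union {
		void *marked[FANOUT];
		struct mark_node *sets[FANOUT];
	} data;
	unsigned int shift;	/* kept after the pointer arrays */
};

struct mark_node *new_node(unsigned int shift)
{
	struct mark_node *n = calloc(1, sizeof(*n));

	if (!n)
		abort();
	n->shift = shift;
	return n;
}

/* Store value under idnum, adding levels on top while idnum is too big. */
void insert_mark(struct mark_node **root, uintmax_t idnum, void *value)
{
	struct mark_node *s;

	if (!*root)
		*root = new_node(0);
	while ((idnum >> (*root)->shift) >= FANOUT) {
		s = new_node((*root)->shift + FANOUT_BITS);
		s->data.sets[0] = *root;
		*root = s;
	}
	s = *root;
	while (s->shift) {
		uintmax_t i = idnum >> s->shift;

		idnum -= i << s->shift;
		if (!s->data.sets[i])
			s->data.sets[i] = new_node(s->shift - FANOUT_BITS);
		s = s->data.sets[i];
	}
	s->data.marked[idnum] = value;
}

/* Return the value stored under idnum, or NULL if there is none. */
void *find_mark(struct mark_node *root, uintmax_t idnum)
{
	struct mark_node *s = root;

	if (!s || (idnum >> s->shift) >= FANOUT)
		return NULL;
	while (s && s->shift) {
		uintmax_t i = idnum >> s->shift;

		idnum -= i << s->shift;
		s = s->data.sets[i];
	}
	return s ? s->data.marked[idnum] : NULL;
}
```

With this shape, memory is spent only along the paths that are actually used, so a mark above 2**32 costs a few extra nodes rather than a flat array of that size.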

Corrected buffer overflow during automatic checkpoint in fast-import. · 5d6f3ef6

Committed by Shawn O. Pearce on Jan 15, 2007

If we previously were using a delta but we needed to checkpoint the
current packfile and switch to a new packfile we need to throw away
the delta and compress the raw object by itself, as delta chains
cannot span non-thin packfiles. Unfortunately the output buffer
in this case needs to grow, as the size of the compressed object
may be quite a bit larger than the size of the compressed delta.

I've also avoided recompressing the object if we are checkpointing
and we didn't use a delta. In this case the output buffer is the
correct size and has already been populated with the right data,
we just need to close out the current packfile and open a new one.
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>

15 Jan, 2007 · 9 commits

Print the packfile names to stdout from fast-import. · 9d1b1b5e

Committed by Shawn O. Pearce on Jan 15, 2007

Caller scripts may want to know what packfiles the fast-import
process just wrote out for them.  This is now output to stdout,
one packfile name per line, after we checkpoint each packfile.
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>

Implemented automatic checkpoints within fast-import. · d9ee53ce

Committed by Shawn O. Pearce on Jan 15, 2007

When the number of objects or number of bytes gets close to the limit
allowed by the packfile format (or configured on the command line by
our caller) we should automatically checkpoint the current packfile
and start a new one before writing the object out.  This does however
require that we abandon the delta (if we had one) as it's not valid
in a new packfile.

I also added the simple rule that if we got a delta back but the
delta itself is the same size as or larger than the uncompressed
object to ignore the delta and just store the object data.  This
should avoid some really bad behavior caused by our current delta
strategy.
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
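
The two rules can be sketched as small predicates. The names `needs_checkpoint` and `keep_delta` and their parameters are hypothetical, and the sketch assumes "close to the limit" means "the next object would no longer fit"; the thresholds fast-import actually uses may differ.

```c
#include <stdint.h>

/*
 * Return 1 if writing an object of 'len' bytes would push the current
 * packfile past either limit, so a checkpoint is needed first.
 */
int needs_checkpoint(uintmax_t pack_size, uintmax_t len,
		     uintmax_t max_packsize,
		     uint32_t object_count, uint32_t max_objects)
{
	if (object_count >= max_objects)
		return 1;
	/* Compare without computing pack_size + len, which could wrap. */
	if (pack_size > max_packsize || len > max_packsize - pack_size)
		return 1;
	return 0;
}

/* A delta is kept only if it is strictly smaller than the object. */
int keep_delta(uintmax_t delta_len, uintmax_t object_len)
{
	return delta_len < object_len;
}
```

When `needs_checkpoint` fires and a delta was in hand, the delta has to be dropped as well, since its base lives in the packfile that is being closed.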

Optimize index creation on large object sets in fast-import. · 2fce1f3c

Committed by Shawn O. Pearce on Jan 15, 2007

When we are generating multiple packfiles at once we only need
to scan the blocks of object_entry structs which contain objects
for the current packfile.  Because the most recent blocks are at
the front of the linked list, and because all new objects going
into the current file are allocated from the front of that list,
we can stop scanning for objects as soon as we identify one which
doesn't belong to the current packfile.
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>

Don't create a final empty packfile in fast-import. · 3e005baf

Committed by Shawn O. Pearce on Jan 15, 2007

If the last packfile is going to be empty (has 0 objects) then it
shouldn't be kept after the import has terminated, as there is no
point to the packfile.  So rather than hashing it and making the
index file, just delete the packfile.
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>

Implemented manual packfile switching in fast-import. · 7bfe6e26

Committed by Shawn O. Pearce on Jan 15, 2007

To help importers which are dealing with massive amounts of data
fast-import needs to be able to close the packfile it is currently
writing to and open a new packfile for any additional data that
will be received. A new 'checkpoint' command has been introduced
which can be used by the frontend import process to force this
to occur at any time. This may be useful to ensure a very long
running import doesn't lose any work due to unexpected failures.
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>

Remove unnecessary duplicate_count in fast-import. · 80144727

Committed by Shawn O. Pearce on Jan 15, 2007

There is little reason to be keeping a global duplicate_count
value when we also keep it per object type.  The global counter can
easily be computed at the end, once all processing has completed.
This saves us a couple of machine instructions in an unimportant
part of code.  But it looks slightly better to me to not keep
two counters around.
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>

Restructure fast-import to support creating multiple packfiles. · f70b6534

Committed by Shawn O. Pearce on Jan 15, 2007

Now that we are starting to see some really large projects (such
as KDE or a fork of FreeBSD) get imported into Git we're running
into the upper limit on packfile object count as well as overall
byte length. The KDE and FreeBSD projects are both likely to
require more than 4 GiB to store their current history, which means
we really need multiple packfiles to handle their content.

This is a fairly simple restructuring of the internal code to help
us support creating multiple packfiles from within fast-import.
We are now adding a 5 digit incrementing suffix to the end of the
basename supplied to us by the caller, permitting up to 99,999
packs to be generated in a single fast-import run.
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
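
A sketch of how such a name could be built. The function `packfile_name` is hypothetical, and the exact format string is an assumption (a zero-padded 5-digit counter appended directly to the basename).

```c
#include <stddef.h>
#include <stdio.h>

/*
 * Format "<base><5-digit pack id>" into buf.  Returns 0 on success,
 * -1 if the id needs more than 5 digits or buf is too small.
 */
int packfile_name(char *buf, size_t cap, const char *base,
		  unsigned int pack_id)
{
	int n;

	if (pack_id > 99999)
		return -1;
	n = snprintf(buf, cap, "%s%05u", base, pack_id);
	return (n < 0 || (size_t)n >= cap) ? -1 : 0;
}
```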

Misc. type cleanups within fast-import. · 03842d8e

Committed by Shawn O. Pearce on Jan 15, 2007

Signed-off-by: Shawn O. Pearce <spearce@spearce.org>

Improve reuse of sha1_file library within fast-import. · d489bc14

Committed by Shawn O. Pearce on Jan 14, 2007

Now that the sha1_file.c library routines use the sliding mmap
routines to perform efficient access to portions of a packfile
I can remove that code from fast-import.c and just invoke it.
One benefit is we now have reloading support for any packfile which
uses OBJ_OFS_DELTA.  Another is we have significantly less code
to maintain.

This code reuse change *requires* that fast-import generate only
an OBJ_OFS_DELTA format packfile, as there is absolutely no index
available to perform OBJ_REF_DELTA lookup in while unpacking
an object.  This is probably reasonable to require as the delta
offsets result in smaller packfiles and are faster to unpack,
as no index searching is required.  It's also only a temporary
requirement as users could always repack without offsets before
making the import available to older versions of Git.
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
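
To show why no index lookup is needed, here is a sketch of the variable-length encoding an OBJ_OFS_DELTA entry uses for the distance back to its base object. The function names `encode_ofs` and `decode_ofs` are hypothetical, and the decoder assumes well-formed input (it does no bounds checking).

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Encode the distance back to the base object.  Writes at most 10
 * bytes into out and returns the number of bytes used.
 */
size_t encode_ofs(uint64_t ofs, unsigned char out[10])
{
	unsigned char tmp[10];
	size_t pos = sizeof(tmp) - 1, len, i;

	/* Lowest 7 bits go last; earlier bytes carry the high bit. */
	tmp[pos] = (unsigned char)(ofs & 127);
	while (ofs >>= 7)
		tmp[--pos] = (unsigned char)(128 | (--ofs & 127));
	len = sizeof(tmp) - pos;
	for (i = 0; i < len; i++)
		out[i] = tmp[pos + i];
	return len;
}

/* Decode an offset; *used receives the number of bytes consumed. */
uint64_t decode_ofs(const unsigned char *in, size_t *used)
{
	size_t i = 0;
	unsigned char c = in[i++];
	uint64_t ofs = c & 127;

	while (c & 128) {
		ofs += 1;
		c = in[i++];
		ofs = (ofs << 7) + (c & 127);
	}
	*used = i;
	return ofs;
}
```

The reader subtracts the decoded value from the delta's own position in the packfile and finds the base there directly, whereas OBJ_REF_DELTA names the base by SHA1 and therefore needs an index to locate it.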

14 Jan, 2007 · 15 commits

Merge branch 'master' into sp/fast-import · 1fcdd62a

Committed by Shawn O. Pearce on Jan 14, 2007

I'm bringing master in early so that the OBJ_OFS_DELTA implementation
is available as part of the topic.  This way git-fast-import can
learn about this new slightly smaller and faster packfile format,
and can generate them directly rather than needing to have them be
repacked with git-pack-objects.

Due to the API changes in master during the period of development
of git-fast-import, a few minor tweaks to fast-import.c are needed
to produce a working merge.  I've done them here as part of the
merge to ensure bisection always works.
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>

Allow creating branches without committing in fast-import. · 9938ffc5

Committed by Shawn O. Pearce on Jan 11, 2007

Some importers may want to create a branch long before they actually
commit to it, or in some cases they may never commit to the branch
but they still need the ref to be created in the repository after
the import is complete.

This extends the 'reset ' command to automatically create a new
branch if the supplied reference isn't already known as a branch.

While I'm at it I also modified the syntax of the reset command
to terminate with an empty line, like commit and tag operate.
This just makes the command set more consistent.
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>

Support creation of merge commits in fast-import. · 62b6f483

Committed by Shawn O. Pearce on Jan 11, 2007

Some importers are able to determine when branch merges occurred
within their source data. In these cases they will want to supply
the correct commits to fast-import so that a proper merge commit
will exist in Git. This is now supported by supplying a 'merge '
command after the commit message and optional from command.

A merge is not actually performed by fast-import; it's assumed that
the frontend performed any sort of merging activity already and
that fast-import should simply be storing its result.
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>

Fix repository corruption when using marks for modified blobs. · cacbdd0a

Committed by Shawn O. Pearce on Jan 11, 2007

Apparently we did not copy the blob SHA1 into the stack variable
'sha1' when a mark is used to refer to a prior blob.  This code
was not previously tested as the Mozilla CVS -> git-fast-import
program always fed us full SHA1s for modified blobs and did not
use the mark feature there.
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>

Additional fast-import tree delta corruption cleanups. · 8a8c55ea

Committed by Shawn O. Pearce on Aug 28, 2006

Signed-off-by: Shawn O. Pearce <spearce@spearce.org>

Correct tree corruption problems in fast-import. · b54d6422

Committed by Shawn O. Pearce on Aug 28, 2006

The new tree delta implementation caused blob SHA1s to be used
instead of a tree SHA1 when a tree was written out.  This really
only appeared to happen when converting an existing file to a tree,
but may have been possible in some other situations.
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>

Replace ywrite in fast-import with the standard write_or_die. · 23bc886c

Committed by Shawn O. Pearce on Aug 28, 2006

Signed-off-by: Shawn O. Pearce <spearce@spearce.org>

Reuse the same buffer for all commits/tags in fast-import. · 243f801d

Committed by Shawn O. Pearce on Aug 28, 2006

Since most commits and tag objects are around the same size and we
only generate one at a time we can reuse the same buffer rather than
xmalloc'ing and free'ing the buffer every time we generate a commit.
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>

Recycle data buffers for tree generation in fast-import. · e2eb469d

Committed by Shawn O. Pearce on Aug 28, 2006

We only ever generate at most two tree streams at a time. Since most
trees are around the same size we can simply recycle the buffers from
one tree generation to the next rather than constantly xmalloc'ing
and free'ing them. This should perform slightly better when handling
a large number of trees as malloc has less work to do.
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
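
The recycling idea can be sketched as a grow-only scratch buffer. The type `scratch_buf` and the function `scratch_reserve` are hypothetical names for this example, and plain `realloc` stands in for whatever allocation wrapper the real code uses.

```c
#include <stdlib.h>

struct scratch_buf {
	char *data;
	size_t cap;
};

/*
 * Make sure the buffer can hold 'need' bytes and return it.  The
 * allocation only ever grows, so once it has reached the typical
 * tree size most calls do no allocator work at all.  Returns NULL
 * if growing fails (the old buffer stays valid).
 */
char *scratch_reserve(struct scratch_buf *b, size_t need)
{
	if (need > b->cap) {
		char *p = realloc(b->data, need);

		if (!p)
			return NULL;
		b->data = p;
		b->cap = need;
	}
	return b->data;
}
```

Keeping one such buffer per tree stream (two in total, per the message above) replaces a malloc/free pair per tree with an occasional realloc.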

Implemented tree delta compression in fast-import. · 4cabf858

Committed by Shawn O. Pearce on Aug 28, 2006

We now store for every tree entry two modes and two sha1 values;
the base (aka "version 0") and the current/new (aka "version 1").
When we generate a tree object we also regenerate the prior version
object and use that as our base object for a delta. This strategy
saves a significant amount of memory as we can continue to use the
atom pool for file/directory names and only increases each tree
entry by an additional 24 bytes of memory.

Branches should automatically delta against their ancestor tree,
unless the ancestor tree is already at the delta chain limit.
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
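
The 24-byte figure follows from the layout, as this sketch shows. The struct names are hypothetical and `uint32_t` is used for the mode so the size is predictable; the real definition in fast-import.c may differ in detail.

```c
#include <stdint.h>

/* One version of an entry: its mode plus a 20-byte SHA-1 (24 bytes). */
struct entry_version {
	uint32_t mode;
	unsigned char sha1[20];
};

/*
 * versions[0] is the base ("version 0"), versions[1] the current one
 * ("version 1").  The name is held once, by pointer, so the second
 * version adds only sizeof(struct entry_version) to each entry.
 */
struct tree_entry_sketch {
	const char *name;
	struct entry_version versions[2];
};
```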

Converted hash memcpy/memcmp to new hashcpy/hashcmp/hashclr. · 445b8599

Committed by Shawn O. Pearce on Aug 28, 2006

Signed-off-by: Shawn O. Pearce <spearce@spearce.org>

Don't crash fast-import if no branch log was requested. · 08d7e892

Committed by Shawn O. Pearce on Aug 27, 2006

Signed-off-by: Shawn O. Pearce <spearce@spearce.org>

Added 'reset' command to clear a branch's tree. · 5fced8dc

Committed by Shawn O. Pearce on Aug 27, 2006

Sometimes an import frontend may need to work with a temporary branch
which will actually contain many different branches over the life
of the import.  This is especially useful when the frontend needs
to create a tag from a set of file versions which are otherwise
never a commit.
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>

Map only part of the generated pack file at any point in time. · 53dbce78

Committed by Shawn O. Pearce on Aug 27, 2006

When generating a very large pack file (for example close to 1 GB
in size) it may be impossible for the kernel to find a contiguous
free range within a 32 bit address space for the mapping to be
located at.  This is especially problematic on large imports where
there is a lot of malloc activity occurring within the same process
and the malloc'd regions may straddle the previously mapped regions,
thereby creating large holes in the address space.

So instead we map only 128 MB of the pack at any given time.
This will likely increase the number of times the file gets mapped
(with additional system time required to update the page tables
more frequently) but will allow the program to handle packs up to
4 GB in size.
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
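
The window arithmetic might look like the sketch below. It assumes fixed, 128 MB aligned windows, which is a simplification chosen for this example; the names `window_start` and `window_length` are hypothetical and the real code may position its mapping differently. Callers are assumed to pass an offset smaller than the pack size.

```c
#include <stdint.h>

#define WINDOW_SIZE (UINT64_C(128) * 1024 * 1024)	/* 128 MB */

/* Start of the window that contains byte 'offset' of the pack. */
uint64_t window_start(uint64_t offset)
{
	return offset - (offset % WINDOW_SIZE);
}

/* Bytes to map: a full window, or less at the end of the pack. */
uint64_t window_length(uint64_t offset, uint64_t pack_size)
{
	uint64_t start = window_start(offset);
	uint64_t left = pack_size - start;

	return left < WINDOW_SIZE ? left : WINDOW_SIZE;
}
```

Only when an access falls outside the current window does the file need to be unmapped and mapped again, which is the extra system time the message mentions.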

Fixed compile error in fast-import. · 35ef237c

Committed by Shawn O. Pearce on Aug 26, 2006

Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
