Commits · 40db58b8dc17ee9fac86ad126d442bb87b5ad549 · 李少辉-开发者 / git

08 Feb, 2007 1 commit

fast-import: Fix compile warnings · 40db58b8

Authored by Johannes Schindelin on Feb 07, 2007

size_t and unsigned long are not equivalent on all platforms.
Since I do not know how portable %z is, I play it safe and just
cast the respective variables to unsigned long.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <junkio@cox.net>

40db58b8

07 Feb, 2007 8 commits

Don't crash fast-import if the marks cannot be exported. · 22c9f7e4

Authored by Shawn O. Pearce on Feb 07, 2007

Apparently fast-import used to die a horrible death if we
were unable to open the marks file for output.  This is
slightly less than ideal, especially now that we dump
the marks as part of the `checkpoint` command.
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>

22c9f7e4

Dump all refs and marks during a checkpoint in fast-import. · 820b9310

Authored by Shawn O. Pearce on Feb 07, 2007

If the frontend asks us to checkpoint (via the explicit checkpoint
command) it's probably because they are afraid the current import
will crash/fail/whatever and want to make sure they can pick up from
the last checkpoint. To do that sort of recovery, we will need the
current tip of every branch and tag available at the next startup.
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>

820b9310

Teach fast-import how to sit quietly in the corner. · c499d768

Authored by Shawn O. Pearce on Feb 07, 2007

Often users will be running fast-import from within a larger frontend
process, and this may be a frequent periodic tool such as a future
edition of `git-svn fetch`. We don't want to bombard users with our
large stats output if they won't be interested in it, so `--quiet`
is now an option to make gfi be more silent.
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>

c499d768

Teach fast-import how to clear the internal branch content. · 825769a8

Authored by Shawn O. Pearce on Feb 07, 2007

Some frontends may not be able to (easily) keep track of which files
are included in the branch, and which aren't.  Performing this
tracking can be tedious and error prone for the frontend to do,
especially if its foreign data source cannot supply the changed
path list on a per-commit basis.

fast-import now allows a frontend to request that a branch's tree
be wiped clean (reset to the empty tree) at the start of a commit,
allowing the frontend to feed in all paths which belong on the branch.

This is ideal for a tar-file importer frontend, for example, as
the frontend just needs to reformat the tar data stream into a gfi
data stream, which may be something a few Perl regexps can take
care of. :)
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>

825769a8
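
As a rough illustration of the tree-clearing commit described above, the sketch below builds one snapshot commit the way a tar-style frontend might. It assumes the command is spelled `deleteall`, as in Git's fast-import documentation, and that file content is supplied as `inline` data. The function names, ref, and committer string are hypothetical.

```python
def data_block(payload):
    """Format a payload as an exact-length 'data n' command."""
    return b"data %d\n" % len(payload) + payload + b"\n"


def snapshot_commit(ref, committer, message, files):
    """Build one commit that wipes the branch tree, then re-adds every path.

    files maps a path (str) to its content (bytes). Because the tree is
    cleared first, the frontend never has to work out which paths were
    removed since the previous snapshot.
    """
    out = [
        b"commit " + ref.encode() + b"\n",
        b"committer " + committer.encode() + b"\n",
        data_block(message.encode()),
        b"deleteall\n",  # assumed spelling of the tree-clearing command
    ]
    for path in sorted(files):
        out.append(b"M 100644 inline " + path.encode() + b"\n")
        out.append(data_block(files[path]))
    out.append(b"\n")
    return b"".join(out)


stream = snapshot_commit(
    "refs/heads/import",
    "A U Thor <author@example.com> 1170806400 +0000",  # hypothetical identity
    "snapshot",
    {"b.txt": b"two\n", "a.txt": b"one\n"},
)
```

The resulting bytes could be written to the standard input of `git fast-import`; every path absent from `files` disappears from the branch in that commit.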

S_IFLNK != 0140000 · 9981b6d9

Authored by Junio C Hamano on Feb 06, 2007

Signed-off-by: Shawn O. Pearce <spearce@spearce.org>

9981b6d9

Don't do non-fastforward updates in fast-import. · 7073e69e

Authored by Shawn O. Pearce on Feb 06, 2007

If fast-import is being used to update an existing branch of
a repository, the user may not want to lose commits if another
process updates the same ref at the same time.  For example, the
user might be using fast-import to make just one or two commits
against a live branch.

We now perform a fast-forward check during the ref updating process.
If updating a branch would cause commits in that branch to be lost,
we skip over it and display the new SHA1 to standard error.

This new default behavior can be overridden with `--force`, like
git-push and git-fetch.
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>

7073e69e
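
The fast-forward rule above can be sketched with a toy commit graph. This is only a model of the idea, not fast-import's implementation: the graph is a plain dictionary mapping each commit to its parents, and `is_fast_forward` and `update_ref` are hypothetical names.

```python
import sys


def is_fast_forward(parents, old_tip, new_tip):
    """Return True if old_tip is reachable from new_tip through parent links."""
    seen, stack = set(), [new_tip]
    while stack:
        commit = stack.pop()
        if commit == old_tip:
            return True
        if commit in seen:
            continue
        seen.add(commit)
        stack.extend(parents.get(commit, ()))
    return False


def update_ref(refs, parents, name, new_tip, force=False):
    """Update refs[name] unless doing so would drop commits from the branch."""
    old_tip = refs.get(name)
    if old_tip is None or force or is_fast_forward(parents, old_tip, new_tip):
        refs[name] = new_tip
        return True
    # Skip the update, but report the new id so the commits are not lost track of.
    print("warning: not updating %s (new tip %s)" % (name, new_tip), file=sys.stderr)
    return False
```

With `parents = {"c": ["b"], "b": ["a"], "x": ["a"]}` and a branch at `b`, moving to `c` succeeds, while moving to `x` is refused unless `force=True` is passed.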

Support RFC 2822 date parsing in fast-import. · 63e0c8b3

Authored by Shawn O. Pearce on Feb 06, 2007

Since some frontends may be working with source material where
the dates are only readily available as RFC 2822 strings, it is
more friendly if fast-import exposes Git's parse_date() function
to handle the conversion.  This way the frontend doesn't need
to perform the parsing itself.

The new --date-format option to fast-import can be used by a
frontend to select which format it will supply date strings in.
The default is the standard `raw` Git format, which fast-import
has always supported.  Format rfc2822 can be used to activate the
parse_date() function instead.

Because fast-import could also be useful for creating new, current
commits, the format `now` is also supported to generate the current
system timestamp.  The implementation of `now` is a trivial call
to datestamp(), but is actually a whole whopping 3 lines so that
fast-import can verify the frontend really meant `now`.

As part of this change I have added validation of the `raw` date
format.  Prior to this change fast-import would accept anything
in a `committer` command, even if it was seriously malformed.
Now fast-import requires the '> ' near the end of the string and
verifies the timestamp is formatted properly.
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>

63e0c8b3
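
To make the three date formats concrete, here is a minimal sketch of the conversion a frontend would otherwise have to do itself. It is not Git's `parse_date()`: it uses Python's `email.utils` for RFC 2822 parsing, the raw-format check is an assumed approximation of the validation described above, and `now` is reported in UTC for simplicity.

```python
import re
import time
from datetime import timezone
from email.utils import parsedate_to_datetime

# Raw form: "<seconds since epoch> <+hhmm or -hhmm>"
RAW_DATE = re.compile(r"\d+ [+-]\d{4}")


def to_raw_date(value, date_format="raw"):
    """Return value converted to the raw '<seconds> <tz>' form."""
    if date_format == "raw":
        if not RAW_DATE.fullmatch(value):
            raise ValueError("malformed raw date: %r" % value)
        return value
    if date_format == "rfc2822":
        dt = parsedate_to_datetime(value)
        if dt.tzinfo is None:  # "-0000" parses as naive; treat it as UTC
            dt = dt.replace(tzinfo=timezone.utc)
        minutes = int(dt.utcoffset().total_seconds()) // 60
        sign = "-" if minutes < 0 else "+"
        hours, mins = divmod(abs(minutes), 60)
        return "%d %s%02d%02d" % (dt.timestamp(), sign, hours, mins)
    if date_format == "now":
        # Require the literal string so a stray real date is not silently ignored.
        if value != "now":
            raise ValueError("expected the literal string 'now'")
        return "%d +0000" % time.time()
    raise ValueError("unknown date format: %r" % date_format)
```

For example, `to_raw_date("Tue, 06 Feb 2007 11:22:18 -0500", "rfc2822")` yields `"1170778938 -0500"`.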

Remove unnecessary null pointer checks in fast-import. · e7d06a4b

Authored by Shawn O. Pearce on Feb 06, 2007

There is no need to check for a NULL pointer before invoking free();
the runtime library automatically performs this check anyway and
does nothing if a NULL pointer is supplied.
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>

e7d06a4b

06 Feb, 2007 6 commits

Correct minor style issue in fast-import. · e5b1444b

Authored by Shawn O. Pearce on Feb 06, 2007

Junio noticed that I was using a different style in fast-import
for returned pointers than the rest of Git.  Before merging this
code into the main git.git tree I'd like to make it consistent,
as this style variation was not intentional.
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>

e5b1444b

Correct compiler warnings in fast-import. · 10e8d688

Authored by Shawn O. Pearce on Feb 06, 2007

Junio noticed these warnings/errors in fast-import when compiling
with `-Werror -ansi -pedantic`.  A few changes are to reduce compiler
warnings, while one (in cmd_merge) is a bug fix.
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>

10e8d688

Remove --branch-log from fast-import. · 0b868e02

Authored by Shawn O. Pearce on Feb 06, 2007

The --branch-log option and its associated code haven't been used in
several months, as it's not really very useful for debugging fast-import
or a frontend. I don't plan on supporting it in this state long-term,
so I'm killing it now before it gets distributed to a wider audience.
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>

0b868e02

Don't support shell-quoted refnames in fast-import. · 6c3aac1c

Authored by Shawn O. Pearce on Feb 05, 2007

The current implementation of shell-style quoted refnames and
SHA-1 expressions within fast-import contains a bad memory leak.
We leak the unquoted strings used by the `from` and `merge`
commands, maybe others.  It's also just muddling up the docs.

Since Git refnames cannot contain LF, and that is our delimiter
for the end of the refname, and we accept any other character
as-is, there is no reason for these strings to support quoting,
except to be nice to frontends.  But frontends shouldn't be
expecting to use funny refs in Git, and it's just as simple to
never quote them as it is to always pass them through the same
quoting filter as pathnames.  So frontends should never quote
refs, or ref expressions.
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>

6c3aac1c

Reduce memory usage of fast-import. · 10831c55

Authored by Shawn O. Pearce on Feb 05, 2007

Some structs are allocated rather frequently, but were using integer
types which were far larger than required to actually store their
full value range.

As packfiles are limited to 4 GiB we don't need more than 32 bits to
store the offset of an object within that packfile; an `unsigned long`
on a 64 bit system is likely a 64 bit unsigned value.  Saving 4 bytes
per object on a 64 bit system can add up fast on any sizable import.

As atom strings are strictly single components in a path name these
are probably limited to just 255 bytes by the underlying OS.  Going
to that short of a string is probably too restrictive, but certainly
`unsigned int` is far too large for their lengths.  `unsigned short`
is a reasonable limit.

Modes within a tree really only need two bytes to store their whole
value; using `unsigned int` here is vast overkill.  Saving 4 bytes
per file entry in an active branch can add up quickly on a project
with a large number of files.
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>

10831c55

Include checkpoint command in the BNF. · 8c1f22da

Authored by Shawn O. Pearce on Feb 05, 2007

This command isn't encouraged (as it's slow) but it does exist and
is accepted, so it still should be covered in the BNF.
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>

8c1f22da

19 Jan, 2007 5 commits

Accept 'inline' file data in fast-import commit structure. · b715cfbb

Authored by Shawn O. Pearce on Jan 18, 2007

It's very annoying to need to specify the file content ahead of a
commit and use marks to connect the individual blobs to the commit's
file modification entry, especially if the frontend can't/won't
generate the blob SHA1s itself.  Instead it would be much easier to
use if we could accept the blob data at the same time as we receive
each file_change line.

Now fast-import accepts 'inline' instead of a mark idnum or blob
SHA1 within the 'M' type file_change command.  If an inline is
detected the very next line must be a 'data n' command, supplying
the file data.
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>

b715cfbb
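
The difference between the two styles can be sketched from the frontend's side. The helpers below are hypothetical and only format stream fragments: the first sends a blob ahead of the commit and refers to it by mark, the second supplies the content directly after the `M` line using `inline`.

```python
def file_via_mark(path, content, mark=1):
    """Return (blob command, file_change line) linked through a mark."""
    blob = b"blob\nmark :%d\ndata %d\n" % (mark, len(content)) + content + b"\n"
    change = b"M 100644 :%d %s\n" % (mark, path.encode())
    return blob, change


def file_inline(path, content):
    """Return a file_change line followed immediately by its 'data n' command."""
    header = b"M 100644 inline %s\ndata %d\n" % (path.encode(), len(content))
    return header + content + b"\n"
```

With marks the frontend must emit `blob` before the commit and remember the mark number; with `inline` the content travels inside the commit itself.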

Support delimited data regions in fast-import. · 3b4dce02

Authored by Shawn O. Pearce on Jan 18, 2007

During testing it's nice to not have to feed the length of a data
chunk to the 'data' command of fast-import. Instead we would
prefer to be able to establish a data chunk much like shell's <<
operator and use a line delimiter to denote the end of the input.

So now if a data command is started as 'data <<EOF' we will look
for a terminator line containing only the string EOF on that line.
Once found, we stop the data command. Everything between the two
lines is used as the data value.

The 'data <<' syntax is slower than 'data n', as we don't know how
many bytes to expect and instead must grow our buffer on the fly.
It also has the problem that the frontend must use a string which
will not appear on a line by itself in the input, and the data
region will always end in an LF. For these reasons real import
frontends are encouraged to continue to use _only_ 'data n'.
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>

3b4dce02
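
A small reader for both forms shows why the delimited variant is the slower and more restrictive one. This is a sketch of the idea in Python, not fast-import's parser; `read_data` is a hypothetical name, and handling of any LF that follows an exact-length payload is left out.

```python
import io


def read_data(stream):
    """Read one 'data' command from a binary stream and return its payload."""
    header = stream.readline()
    if not header.startswith(b"data "):
        raise ValueError("expected a data command, got %r" % header)
    arg = header[len(b"data "):].rstrip(b"\n")
    if arg.startswith(b"<<"):
        # Delimited form: length unknown, so accumulate line by line.
        terminator = arg[2:] + b"\n"
        lines = []
        while True:
            line = stream.readline()
            if not line:
                raise ValueError("input ended before the delimiter line")
            if line == terminator:
                return b"".join(lines)  # payload always ends with LF
            lines.append(line)
    # Exact form: one read of a known size, any bytes allowed.
    return stream.read(int(arg))
```

`read_data(io.BytesIO(b"data <<EOF\nhello\nEOF\n"))` returns `b"hello\n"`, while `read_data(io.BytesIO(b"data 5\nhello"))` returns `b"hello"` with no forced trailing LF.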

Remove unnecessary options from fast-import. · e5808826

Authored by Shawn O. Pearce on Jan 18, 2007

The --objects command line option is rather unnecessary.  Internally
we allocate objects in 5000 unit blocks, ensuring that any sort
of malloc overhead is amortized over the individual objects to
almost nothing.  Since most frontends don't know how many objects
they will need for a given import run (and it's hard for them to
predict without just doing the run) we probably won't see anyone
using --objects.  Further since there's really no major benefit
to using the option, most frontends won't even bother supplying
it even if they could estimate the number of objects.  So I'm
removing it.

The --max-objects-per-pack option was probably a mistake to even
have added in the first place.  The packfile format is limited
to 4 GiB today; given that objects need at least 3 bytes of data
(and probably need even more) there's no way we are going to exceed
the limit of (1<<32)-1 objects before we reach the file size limit.
So I'm removing it (to slightly reduce the complexity of the code)
before anyone gets any wise ideas and tries to use it.
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>

e5808826
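
The size argument for dropping --max-objects-per-pack can be checked with a few lines of arithmetic. The constants restate the figures from the message (a 4 GiB packfile limit, at least 3 bytes per object, a 32-bit object counter); the pack header and trailer are ignored, which only makes the bound looser.

```python
PACK_SIZE_LIMIT = 1 << 32            # 4 GiB packfile limit
OBJECT_COUNT_LIMIT = (1 << 32) - 1   # largest value a 32-bit counter can hold
MIN_OBJECT_BYTES = 3                 # smallest possible object, per the message


def max_objects_by_size(size_limit=PACK_SIZE_LIMIT, min_bytes=MIN_OBJECT_BYTES):
    """Upper bound on how many minimum-size objects fit under the size limit."""
    return size_limit // min_bytes


# Roughly 1.43 billion objects fit by size, well under the 4.29 billion
# the counter allows, so the size limit is always reached first.
size_bound = max_objects_by_size()
```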

Use fixed-size integers when writing out the index in fast-import. · ebea9dd4

Authored by Shawn O. Pearce on Jan 18, 2007

Currently the pack .idx file format uses 32-bit unsigned integers
for the fan-out table and the object offsets.  We had previously
defined these as 'unsigned int', but not every system will define
that type to be a 32 bit value.  To ensure maximum portability we
should always use 'uint32_t'.
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>

ebea9dd4
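
The point about fixed-width fields can be illustrated with the fan-out table itself. The sketch below is written in Python rather than C, so `struct.pack(">I", ...)` stands in for writing a `uint32_t` in network byte order: it always emits exactly 4 bytes regardless of platform. The layout assumed here is 256 cumulative counts indexed by the first byte of each object name; `fanout_table` is a hypothetical helper.

```python
import struct


def fanout_table(sha1s):
    """Build a 256-entry fan-out table of big-endian 32-bit cumulative counts.

    Entry i holds the number of objects whose first name byte is <= i.
    """
    counts = [0] * 256
    for sha1 in sha1s:
        counts[sha1[0]] += 1
    table = []
    running = 0
    for count in counts:
        running += count
        table.append(struct.pack(">I", running))  # always 4 bytes
    return b"".join(table)
```

The table is always 1024 bytes long, and its last entry equals the total object count.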

Always use struct pack_header for pack header in fast-import. · 566f4425

Authored by Shawn O. Pearce on Jan 18, 2007

Previously we were using 'unsigned int' to update the hdr_entries
field of the pack header after the file had been completed and
was being hashed.  This may not be 32 bits on all platforms.
Instead we want to always use uint32_t.

I'm actually cheating here by just using the pack_header like the
rest of Git and letting the struct definition declare the correct
type.  Right now that field is still 'unsigned int' (wrong) but a
pending change submitted by Simon 'corecode' Schubert changes it
to uint32_t.  After that change is merged in fast-import will do
the right thing all of the time.
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>

566f4425

17 Jan, 2007 5 commits

Correct packfile edge output in fast-import. · 69e74e74

Authored by Shawn O. Pearce on Jan 17, 2007

Branches are only contained by a packfile if the branch actually
had its most recent commit in that packfile.  So new branches are
set to MAX_PACK_ID to ensure they don't cause their commit to list
as part of the first packfile when it closes out if the commit was
actually in existence before fast-import started.

Also corrected the type of last_commit to be uintmax_t to prevent
overflow and wraparound on very large imports.  Though that is
highly unlikely to occur as we're talking 4 billion commits, which
no real project has right now.
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>

69e74e74

Declare no-arg functions as (void) in fast-import. · fd99224e

Authored by Shawn O. Pearce on Jan 17, 2007

Apparently the git convention is to declare any function which
takes no arguments as taking void.  I did not do this during the
early fast-import development, but should have.
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>

fd99224e

Correct a few types to be unsigned in fast-import. · 6f64f6d9

Authored by Shawn O. Pearce on Jan 17, 2007

The length of an atom string cannot be negative.  So make it
explicit and declare it as an unsigned value.

The shift width in a mark table node also cannot be negative.
I'm also moving it to after the pointer arrays to prevent any
possible alignment problems on a 64 bit system.
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>

6f64f6d9

Corrected BNF input documentation for fast-import. · 2104838b

Authored by Shawn O. Pearce on Jan 17, 2007

Now that fast-import uses uintmax_t (the largest available unsigned
integer type) for marks we don't want to say it's an unsigned 32
bit integer in ASCII base 10 notation.  It could be much larger,
especially on 64 bit systems, and especially if a frontend uses
a very large number of marks (1 per file revision on a very, very
large import).
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>

2104838b

Print out the edge commits for each packfile in fast-import. · 2369ed79

Authored by Shawn O. Pearce on Jan 16, 2007

To help callers repack very large repositories into a series of
packfiles fast-import now outputs the last commits/tags it wrote to
a packfile when it prints out the packfile name.  This information
can be fed to pack-objects --revs to repack.  For the first pack
of an initial import this is pretty easy (just feed those SHA1s on
stdin) but for subsequent packs you want to feed the subsequent
pack's final SHA1s but also all prior packs' SHA1s prefixed with
the negation operator.  This way the prior packs' data does not
get included into the subsequent pack.
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>

2369ed79
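
The repacking recipe above amounts to building one revision list per packfile. A minimal sketch, assuming the caller has already collected the edge SHA1s printed for each packfile into a list of lists (oldest pack first); `revs_for_pack` is a hypothetical helper and the ids in the example are placeholders, not real object names.

```python
def revs_for_pack(edges_per_pack, index):
    """Return pack-objects --revs input for the pack at position index.

    The pack's own edge commits are listed as-is; the edges of every earlier
    pack are prefixed with '^' so their history is excluded.
    """
    lines = list(edges_per_pack[index])
    for earlier in edges_per_pack[:index]:
        lines.extend("^" + sha1 for sha1 in earlier)
    return "\n".join(lines) + "\n"


edges = [["aaaa", "bbbb"], ["cccc"], ["dddd"]]   # placeholder ids
third_pack_revs = revs_for_pack(edges, 2)
```

The resulting text could then be piped to `git pack-objects --revs <base-name>` on standard input.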

16 Jan, 2007 9 commits

Correct object_count type and stat output in fast-import. · a7ddc487

Authored by Shawn O. Pearce on Jan 16, 2007

Since object_count is limited to 'unsigned long' (really an
unsigned 32 bit integer value) by the pack file format we may as
well use exactly that type here in fast-import for that counter.
An earlier change by me incorrectly made it uintmax_t.

But since object_count is a counter for the current packfile only,
we don't want to output its value at the end.  Instead we should
sum up the individual type counters and report that total, as that
will cover all of the packfiles.
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>

a7ddc487

Correct max_packsize default in fast-import. · eec11c24

Authored by Shawn O. Pearce on Jan 16, 2007

Apparently amd64 has defined 'unsigned long' to be a 64 bit value,
which means -1 was way over the 4 GiB packfile limit.  Whoops.
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>

eec11c24

Remove unnecessary pack_fd global in fast-import. · 0fcbcae7

Authored by Shawn O. Pearce on Jan 16, 2007

Much like the pack_sha1, the pack_fd is an unnecessary global
variable; we already have the fd stored in our struct packed_git
*pack_data so that the core library functions in sha1_file.c are
able to lookup and decompress object data that we have previously
written.  Keeping an extra copy of this value in our own variable
is just a hold-over from earlier versions of fast-import and is
now completely unnecessary.
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>

0fcbcae7

Ensure we close the packfile after creating it in fast-import. · 12801587

Authored by Shawn O. Pearce on Jan 16, 2007

Because we are renaming the packfile into its file destination we
need to be sure it's not open when the rename is called, otherwise
some operating systems (e.g. Windows) may prevent the rename from
occurring.
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>

12801587

Use .keep files in fast-import during processing. · 8455e484

Authored by Shawn O. Pearce on Jan 16, 2007

Because fast-import automatically updates all references (heads
and tags) at the end of its run the repository is corrupt unless
the objects are available in the .git/objects/pack directory prior
to the refs being modified.  The easiest way to ensure that is true
is to move the packfile and its associated index directly into the
.git/objects/pack directory as soon as we have finished output to it.

But the only safe way to do this is to create a temporary .keep
file for that pack, so we use the same tricks that index-pack uses
when it's being invoked by receive-pack.
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>

8455e484

Reuse sha1 in packed_git in fast-import. · 09543c96

Authored by Shawn O. Pearce on Jan 16, 2007

Rather than maintaining our own packfile level sha1 variable we
can make use of the one already available in struct packed_git.
It's meant for the SHA1 of the index but it can also hold the
SHA1 of the packfile itself between final checksumming of the
packfile and creation of the index.
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>

09543c96

Replace redundant yread() with read_in_full() in fast-import. · 6cf09261

Authored by Shawn O. Pearce on Jan 16, 2007

Prior to git having read_in_full() fast-import used its own private
function yread to perform the header reading task.  No sense in
keeping that around now that read_in_full is a public, stable
function.
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>

6cf09261

Use uintmax_t for marks in fast-import. · 0ea9f045

Authored by Shawn O. Pearce on Jan 16, 2007

If a frontend wants to use a mark per file revision and per commit
and is doing a truly huge import (such as a 32 GiB SVN repository)
we may need more than 2**32 unique mark values, especially if the
frontend is unable (or unwilling) to recycle mark values.  For mark
idnums we should use the largest unsigned integer type available,
hoping that will be at least 64 bits when we are compiled as a 64
bit executable.  This way we may consume huge amounts of memory
storing our mark table, but we'll at least be able to process
the entire import without failing.
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>

0ea9f045

Corrected buffer overflow during automatic checkpoint in fast-import. · 5d6f3ef6

Authored by Shawn O. Pearce on Jan 15, 2007

If we previously were using a delta but we needed to checkpoint the
current packfile and switch to a new packfile we need to throw away
the delta and compress the raw object by itself, as delta chains
cannot span non-thin packfiles. Unfortunately the output buffer
in this case needs to grow, as the size of the compressed object
may be quite a bit larger than the size of the compressed delta.

I've also avoided recompressing the object if we are checkpointing
and we didn't use a delta. In this case the output buffer is the
correct size and has already been populated with the right data;
we just need to close out the current packfile and open a new one.
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>

5d6f3ef6

15 Jan, 2007 6 commits

Print the packfile names to stdout from fast-import. · 9d1b1b5e

Authored by Shawn O. Pearce on Jan 15, 2007

Caller scripts may want to know what packfiles the fast-import
process just wrote out for them.  This is now output to stdout,
one packfile name per line, after we checkpoint each packfile.
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>

9d1b1b5e

Implemented automatic checkpoints within fast-import. · d9ee53ce

Authored by Shawn O. Pearce on Jan 15, 2007

When the number of objects or number of bytes gets close to the limit
allowed by the packfile format (or configured on the command line by
our caller) we should automatically checkpoint the current packfile
and start a new one before writing the object out.  This does however
require that we abandon the delta (if we had one) as it's not valid
in a new packfile.

I also added the simple rule that if we get a delta back but the
delta itself is the same size as or larger than the uncompressed
object, we ignore the delta and just store the object data.  This
should avoid some really bad behavior caused by our current delta
strategy.
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>

d9ee53ce
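
The two rules in this message can be sketched as a single decision function. This is a simplified model, not fast-import's code: it compares plain byte lengths and ignores zlib compression, and `plan_write` with its parameters is a hypothetical name.

```python
def plan_write(raw, delta, pack_size, object_count, max_size, max_count):
    """Decide how to store one object.

    Returns (checkpoint_first, use_delta, payload_length).
    """
    # Rule 2: a delta that is not smaller than the object is not worth keeping.
    use_delta = delta is not None and len(delta) < len(raw)
    payload = delta if use_delta else raw

    # Rule 1: switch packfiles before writing if a limit would be exceeded.
    checkpoint_first = (
        pack_size + len(payload) > max_size or object_count + 1 > max_count
    )
    if checkpoint_first and use_delta:
        # The delta base lives in the old packfile, so fall back to the full object.
        use_delta = False
        payload = raw
    return checkpoint_first, use_delta, len(payload)
```

For a 100-byte object with a 10-byte delta, a nearly full packfile forces both a checkpoint and the full 100-byte write.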

Optimize index creation on large object sets in fast-import. · 2fce1f3c

Authored by Shawn O. Pearce on Jan 15, 2007

When we are generating multiple packfiles at once we only need
to scan the blocks of object_entry structs which contain objects
for the current packfile.  Because the most recent blocks are at
the front of the linked list, and because all new objects going
into the current file are allocated from the front of that list,
we can stop scanning for objects as soon as we identify one which
doesn't belong to the current packfile.
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>

2fce1f3c
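
The early exit described above can be sketched as follows. The data layout is an assumption made for illustration: `blocks` is a newest-first list of blocks, and each block lists its `(sha1, pack_id)` entries newest first as well, so all entries of the current packfile form a prefix of the scan.

```python
def objects_for_pack(blocks, pack_id):
    """Collect the names of objects in the current packfile, stopping early."""
    found = []
    for block in blocks:
        for sha1, entry_pack_id in block:
            if entry_pack_id != pack_id:
                # Everything from here on belongs to older packfiles.
                return found
            found.append(sha1)
    return found
```

With three packfiles already written, indexing the newest one touches only its own entries plus one entry from the previous packfile.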

Don't create a final empty packfile in fast-import. · 3e005baf

Authored by Shawn O. Pearce on Jan 15, 2007

If the last packfile is going to be empty (has 0 objects) then it
shouldn't be kept after the import has terminated, as there is no
point to the packfile.  So rather than hashing it and making the
index file, just delete the packfile.
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>

3e005baf

Implemented manual packfile switching in fast-import. · 7bfe6e26

Authored by Shawn O. Pearce on Jan 15, 2007

To help importers which are dealing with massive amounts of data
fast-import needs to be able to close the packfile it is currently
writing to and open a new packfile for any additional data that
will be received. A new 'checkpoint' command has been introduced
which can be used by the frontend import process to force this
to occur at any time. This may be useful to ensure a very long
running import doesn't lose any work due to unexpected failures.
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>

7bfe6e26

Remove unnecessary duplicate_count in fast-import. · 80144727

Authored by Shawn O. Pearce on Jan 15, 2007

There is little reason to be keeping a global duplicate_count
value when we also keep it per object type.  The global counter can
easily be computed at the end, once all processing has completed.
This saves us a couple of machine instructions in an unimportant
part of code.  But it looks slightly better to me to not keep
two counters around.
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>

80144727
