1. 19 Aug, 2007 (8 commits)
    • Generate crash reports on die in fast-import · 8acb3297
      Committed by Shawn O. Pearce
      As fast-import is quite strict about its input and die()'s anytime
      something goes wrong it can be difficult for a frontend developer
      to troubleshoot why fast-import rejected their input, or to even
      determine what input command it rejected.
      
      This change introduces a custom handler for Git's die() routine.
      When we receive a die() for any reason (fast-import or a lower level
      core Git routine we called) the error is first dumped onto stderr
      and then a more extensive crash report file is prepared in GIT_DIR.
      Finally we exit the process with status 128, just like the stock
      builtin die handler.
      
      An internal flag is set to prevent any further die()'s that may be
      invoked during the crash report generator from causing us to enter
      into an infinite loop.  We shouldn't die() from our crash report
      handler, but just in case someone makes a future code change we are
      prepared to guard against small mistakes turning into huge problems
      for the end-user.
      Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
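The guard described in this commit message can be modeled in a few lines. This is a minimal sketch in Python rather than Git's C; the class `CrashReportingDie`, the `write_report` callback, and `faulty_report` are hypothetical names used only for illustration, and the real handler writes a report file in GIT_DIR instead of calling a callback.

```python
import io


class CrashReportingDie:
    """Toy model of a die() handler that writes a crash report at most once."""

    def __init__(self, stderr, write_report):
        self.stderr = stderr
        self.write_report = write_report  # hypothetical report callback
        self.reporting = False            # guard flag against recursion

    def die(self, message):
        # The error always goes to stderr first.
        self.stderr.write("fatal: " + message + "\n")
        # Write the report only if we are not already inside the reporter.
        if not self.reporting:
            self.reporting = True
            self.write_report(self, message)
        # Exit with status 128, like the stock die handler.
        raise SystemExit(128)


def faulty_report(handler, message):
    # Simulates a future bug: the report generator itself calls die().
    handler.die("could not write crash report")


err = io.StringIO()
handler = CrashReportingDie(err, faulty_report)
try:
    handler.die("unsupported command")
except SystemExit as exc:
    status = exc.code
```

Because the flag is set before the report is written, the nested die() skips the report step and exits directly instead of looping.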
    • Allow frontends to bidirectionally communicate with fast-import · ac053c02
      Committed by Shawn O. Pearce
      The existing checkpoint command is very useful to force fast-import
      to dump the branches out to disk so that standard Git tools can
      access them and the objects they refer to.  However there was not a
      way to know when fast-import had finished executing the checkpoint
      and it was safe to read those refs.
      
      The progress command can be used to make fast-import output any
      message of the frontend's choosing to standard out.  The frontend
      can scan for these messages using select() or poll() to monitor a
      pipe connected to the standard output of fast-import.
      Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
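A frontend could pair the two commands roughly as follows. This is a sketch only: the helper names and the token `checkpoint-1-done` are made up, and an in-memory `io.StringIO` stands in for the pipe connected to fast-import's standard output. It assumes fast-import echoes the `progress` line unchanged.

```python
import io


def checkpoint_commands(token):
    """Commands a frontend could send: flush refs, then ask for an echo."""
    return "checkpoint\nprogress " + token + "\n"


def wait_for_progress(stdout_lines, token):
    """Consume output lines until the requested progress line is seen."""
    expected = "progress " + token
    for line in stdout_lines:
        if line.rstrip("\n") == expected:
            return True
    return False


sent = checkpoint_commands("checkpoint-1-done")
# Stand-in for the pipe connected to fast-import's standard output.
fake_stdout = io.StringIO("progress checkpoint-1-done\n")
seen = wait_for_progress(fake_stdout, "checkpoint-1-done")
```

Once `seen` is true, the frontend knows the preceding checkpoint has been processed and the refs can be read with ordinary Git tools.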
    • Make trailing LF optional for all fast-import commands · 1fdb649c
      Committed by Shawn O. Pearce
      For the same reasons as the prior change we want to allow frontends
      to omit the trailing LF that usually delimits commands.  In some
      cases these just make the input stream more verbose looking than
      it needs to be, and it's just simpler for the frontend developer to
      get started if our parser is slightly more lenient about where an
      LF is required and where it isn't.
      
      To make this optional LF feature work we now have to buffer up to one
      line of input in command_buf.  This buffering can happen if we look
      at the current input command but don't recognize it at this point
      in the code.  In such a case we need to "unget" the entire line,
      but we cannot depend upon the stdio library to let us do ungetc()
      for that many characters at once.
      Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
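The one-line pushback buffer described here can be illustrated as below. This is a Python sketch, not the C code; `CommandReader`, `unget`, and `skip_optional_blank` are hypothetical names, and the `pending` slot plays the role that the message assigns to command_buf.

```python
import io


class CommandReader:
    """Line reader that can push back at most one whole line."""

    def __init__(self, stream):
        self.stream = stream
        self.pending = None  # the single buffered ("ungot") line, if any

    def next_line(self):
        # Serve the pushed-back line before touching the stream again.
        if self.pending is not None:
            line, self.pending = self.pending, None
            return line
        line = self.stream.readline()
        return line.rstrip("\n") if line else None  # None means end of input

    def unget(self, line):
        self.pending = line


def skip_optional_blank(reader):
    """Consume a blank separator line if present, else leave input intact."""
    line = reader.next_line()
    if line is not None and line != "":
        reader.unget(line)


reader = CommandReader(io.StringIO("checkpoint\nprogress done\n"))
first = reader.next_line()
skip_optional_blank(reader)  # no blank line follows, so the line is pushed back
second = reader.next_line()
```

Buffering the whole line in the reader avoids relying on stdio to push back more than one character.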
    • Make trailing LF following fast-import `data` commands optional · 2c570cde
      Committed by Shawn O. Pearce
      A few fast-import frontend developers have found it odd that we
      require the LF following a `data` command, especially in the exact
      byte count format.  Technically we don't need this LF to parse
      the stream properly, but having it here does make the stream more
      readable to humans.  We can easily make the LF optional by peeking
      at the next byte available from the stream and pushing it back into
      the buffer if it's not LF.
      Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
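The peek-and-push-back idea for the exact byte count format might look like this in a reader. The sketch assumes the `data <count>` header line has already been parsed; `read_exact_data` is a hypothetical helper, and Python's `BufferedReader.peek` is used in place of an explicit push back.

```python
import io


def read_exact_data(stream, count):
    """Read `count` payload bytes, then swallow one LF if it follows."""
    payload = stream.read(count)
    if len(payload) != count:
        raise EOFError("stream ended inside data payload")
    # Look at the next byte without consuming it.
    if stream.peek(1)[:1] == b"\n":
        stream.read(1)  # the optional LF is present, so consume it
    return payload


with_lf = io.BufferedReader(io.BytesIO(b"hello\ncommit refs/heads/master\n"))
without_lf = io.BufferedReader(io.BytesIO(b"hellocommit refs/heads/master\n"))

payload_a = read_exact_data(with_lf, 5)
next_a = with_lf.readline()
payload_b = read_exact_data(without_lf, 5)
next_b = without_lf.readline()
```

Both streams yield the same payload and the same following command, whether or not the LF after the payload is present.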
    • Teach fast-import to ignore lines starting with '#' · 401d53fa
      Committed by Shawn O. Pearce
      Several frontend developers have asked that some form of stream
      comments be permitted within a fast-import data stream.  This way
      they can include information from their own frontend program about
      where specific data was taken from in the source system, or about
      a decision that their frontend may have made while creating the
      fast-import data stream.
      
      This change introduces comments in the Bourne-shell/Tcl/Perl style.
      Lines starting with '#' are ignored, up to and including the LF.
      Unlike the above mentioned three languages however we do not look for
      and ignore leading whitespace.  This just simplifies the definition
      of the comment format and the code that parses them.
      
      To make comments work we had to stop using read_next_command() within
      cmd_data() and directly invoke read_line() during the inline variant
      of the function.  This is necessary to retain any lines of the
      input data that might otherwise look like a comment to fast-import.
      Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
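The split between command reading (which skips comments) and inline data reading (which must not) can be sketched as follows. The function names and the sample stream are illustrative assumptions; lines are modeled as already-split strings without their LF.

```python
def next_command(lines):
    """Return the next line that is not a comment, or None at end of input."""
    for line in lines:
        if line.startswith("#"):  # '#' must be the very first character
            continue
        return line
    return None


def read_delimited_data(lines, delimiter):
    """Read raw payload lines up to the delimiter; comments are NOT skipped."""
    body = []
    for line in lines:
        if line == delimiter:
            return body
        body.append(line)
    raise EOFError("delimiter not found")


stream = iter([
    "# exported by a hypothetical frontend",
    "data <<EOT",
    "# this line is payload, not a comment",
    "EOT",
    " # leading space, so this is not treated as a comment",
])
command = next_command(stream)
payload = read_delimited_data(stream, command[len("data <<"):])
leftover = next_command(stream)
```

The payload keeps its '#' line because the data reader bypasses the comment filter, and the last line is returned as a command because leading whitespace is not skipped.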
    • Use handy ALLOC_GROW macro in fast-import when possible · 31490074
      Committed by Shawn O. Pearce
      Instead of growing our buffer by hand during the inline variant of
      cmd_data() we can save a few lines of code and just use the nifty
      new ALLOC_GROW macro already available to us.
      Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
    • Actually allow TAG_FIXUP branches in fast-import · ea08a6fd
      Committed by Shawn O. Pearce
      Michael Haggerty <mhagger@alum.mit.edu> noticed while debugging a
      Git backend for cvs2svn that fast-import was barfing when he tried
      to use "TAG_FIXUP" as a branch name for temporary work needed to
      clean up the tree prior to creating an annotated tag object.
      
      The reason we were rejecting the branch name was check_ref_format()
      returns -2 when there are less than 2 '/' characters in the input
      name.  TAG_FIXUP has 0 '/' characters, but is technically just as
      valid of a ref as HEAD and MERGE_HEAD, so we really should permit it
      (and any other similar looking name) during import.
      
      New test cases have been added to make sure we still detect very
      wrong branch names (e.g. containing [ or starting with .) and yet
      still permit reasonable names (e.g. TAG_FIXUP).
      Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
    • Fix whitespace in "Format of STDIN stream" of fast-import · c905e090
      Committed by Alex Riesen
      Something probably assumed that HT indentation is 4 characters.
      Signed-off-by: Alex Riesen <raa.lkml@gmail.com>
      Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
  2. 15 Aug, 2007 (1 commit)
  3. 15 Jul, 2007 (1 commit)
    • Teach fast-import to recursively copy files/directories · b6f3481b
      Committed by Shawn O. Pearce
      Some source material (e.g. Subversion dump files) perform directory
      renames by telling us the directory was copied, then deleted in the
      same revision.  This makes it difficult for a frontend to convert
      such data formats to a fast-import stream, as all the frontend has
      on hand is "Copy a/ to b/; Delete a/" with no details about what
      files are in a/, unless the frontend also kept track of all files.
      
      The new 'C' subcommand within a commit allows the frontend to make a
      recursive copy of one path to another path within the branch, without
      needing to keep track of the individual file paths.  The metadata
      copy is performed in memory efficiently, but is implemented as a
      copy-immediately operation, rather than copy-on-write.
      
      With this new 'C' subcommand frontends could obviously implement an
      'R' (rename) on their own as a combination of 'C' and 'D' (delete),
      but since we have already offered up 'R' in the past and it is a
      trivial thing to keep implemented I'm not going to deprecate it.
      Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
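The claim that 'R' is equivalent to 'C' followed by 'D' can be checked on a toy model. Here a branch's tree is assumed to be a flat dict from path to blob name, which is far simpler than fast-import's in-memory tree; the function names are hypothetical.

```python
def copy_path(tree, src, dst):
    """Model of 'C': copy a file, or every file under a directory."""
    src, dst = src.rstrip("/"), dst.rstrip("/")
    updates = {}
    for path, blob in tree.items():
        if path == src:
            updates[dst] = blob
        elif path.startswith(src + "/"):
            updates[dst + path[len(src):]] = blob
    tree.update(updates)


def delete_path(tree, src):
    """Model of 'D': remove a file, or every file under a directory."""
    src = src.rstrip("/")
    doomed = [p for p in tree if p == src or p.startswith(src + "/")]
    for path in doomed:
        del tree[path]


def rename_path(tree, src, dst):
    """Model of 'R', expressed as 'C' followed by 'D'."""
    copy_path(tree, src, dst)
    delete_path(tree, src)


tree = {"a/x.c": "blob1", "a/sub/y.c": "blob2", "README": "blob3"}
rename_path(tree, "a/", "b/")
```

The frontend only names the two directories; it never has to enumerate the files under `a/`.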
  4. 10 Jul, 2007 (1 commit)
    • Support wholesale directory renames in fast-import · f39a946a
      Committed by Shawn O. Pearce
      Some source material (e.g. Subversion dump files) perform directory
      renames without telling us exactly which files in that subdirectory
      were moved.  This makes it hard for a frontend to convert such data
      formats to a fast-import stream, as all the frontend has on hand
      is "Rename a/ to b/" with no details about what files are in a/,
      unless the frontend also kept track of all files.
      
      The new 'R' subcommand within a commit allows the frontend to
      rename either a file or an entire subdirectory, without needing to
      know the object's SHA-1 or the specific files contained within it.
      The rename is performed as efficiently as possible internally,
      making it cheaper than a 'D'/'M' pair for a file rename.
      Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
  5. 24 May, 2007 (4 commits)
  6. 11 May, 2007 (1 commit)
  7. 03 May, 2007 (1 commit)
  8. 29 Apr, 2007 (1 commit)
  9. 25 Apr, 2007 (1 commit)
  10. 20 Apr, 2007 (1 commit)
    • Don't repack existing objects in fast-import · a5c1780a
      Committed by Shawn O. Pearce
      Some users of fast-import have been trying to use it to rewrite
      commits and trees, an activity where all of the relevant blobs
      are already available from the existing packfiles.  In such a case
      we don't want to repack a blob, even if the frontend application
      has supplied us the raw data rather than a mark or a SHA-1 name.
      
      I'm intentionally only checking the packfiles that existed when
      fast-import started and am always ignoring all loose object files.
      
      We ignore loose objects because fast-import tends to operate on a
      very large number of objects in a very short timespan, and it is
      usually creating new objects, not reusing existing ones.  In such
      a situation the majority of the objects will not be found in the
      existing packfiles, nor will they be loose object files.  If the
      frontend application really wants us to look at loose object files,
      then they can just repack the repository before running fast-import.
      Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
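The check described above can be sketched as a lookup against a snapshot of object names taken at startup. `BlobWriter` and its fields are hypothetical; the blob naming rule (SHA-1 over a `blob <size>` header, a NUL byte, and the content) is how Git names blobs, while loose objects are simply never consulted here.

```python
import hashlib


def git_blob_id(data):
    """Object name Git assigns to a blob with this content."""
    header = b"blob " + str(len(data)).encode("ascii") + b"\0"
    return hashlib.sha1(header + data).hexdigest()


class BlobWriter:
    """Toy model: skip blobs already present in packs known at startup."""

    def __init__(self, packed_ids):
        self.packed_ids = frozenset(packed_ids)  # snapshot, never refreshed
        self.new_pack = {}                       # object name -> raw data

    def store(self, data):
        oid = git_blob_id(data)
        # Write only objects that are in neither the old packs nor our own.
        if oid not in self.packed_ids and oid not in self.new_pack:
            self.new_pack[oid] = data
        return oid


existing = {git_blob_id(b"")}  # pretend the empty blob is already packed
writer = BlobWriter(existing)
empty_id = writer.store(b"")             # supplied as raw data, not repacked
new_id = writer.store(b"new content\n")  # genuinely new, goes into the pack
```

The frontend still gets an object name back for every blob it supplies, even when nothing is written.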
  11. 31 Mar, 2007 (1 commit)
    • Rename warn() to warning() to fix symbol conflicts on BSD and Mac OS · 46efd2d9
      Committed by Theodore Ts'o
      This fixes a problem reported by Randal Schwartz:
      
      >I finally tracked down all the (albeit inconsequential) errors I was getting
      >on both OpenBSD and OSX.  It's the warn() function in usage.c.  There's
      >warn(3) in BSD-style distros.  It'd take a "great rename" to change it, but if
      >someone with better C skills than I have could do that, my linker and I would
      >appreciate it.
      
      It was annoying to me, too, when I was doing some mergetool testing on
      Mac OS X, so here's a fix.
      Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
      Cc: "Randal L. Schwartz" <merlyn@stonehenge.com>
      Signed-off-by: Junio C Hamano <junkio@cox.net>
  12. 25 Mar, 2007 (1 commit)
  13. 13 Mar, 2007 (2 commits)
    • Remove unnecessary casts from fast-import · 061e35c5
      Committed by Shawn O. Pearce
      Jeff King pointed out that these casts are quite unnecessary, as
      the compiler should be doing them anyway, and may cause problems
      in the future if the size of the argument for to_atom were to ever
      be increased.
      Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
    • fast-import: grow tree storage more aggressively · f022f85f
      Committed by Jeff King
      When building up a tree for a commit, fast-import
      dynamically allocates memory for the tree entries. When more
      space is needed, the allocated memory is increased by a
      constant amount. For very large trees, this means
      re-allocating and memcpy()ing the memory O(n) times.
      
      To compound this problem, releasing the previous tree
      resource does not free the memory; it is kept in a pool
      for future trees. This means that each of the O(n)
      allocations will consume increasing amounts of memory,
      giving O(n^2) memory consumption.
      Signed-off-by: Jeff King <peff@peff.net>
      Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
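The cost difference can be made concrete with a small simulation. The increment of 8 entries is an arbitrary assumption, and the geometric rule `(capacity + 16) * 3 // 2` mirrors Git's `alloc_nr` helper; the log does not say which exact rule this commit adopted.

```python
def simulate(n_entries, grow):
    """Count reallocations and total pooled capacity while adding entries."""
    capacity = reallocs = pooled = 0
    for used in range(1, n_entries + 1):
        if used > capacity:
            capacity = grow(capacity)
            reallocs += 1
            pooled += capacity  # old buffers stay in the pool, so sizes add up
    return reallocs, pooled


def constant_step(capacity):
    return capacity + 8  # assumed fixed increment


def geometric_step(capacity):
    return (capacity + 16) * 3 // 2  # same shape as Git's alloc_nr()


constant = simulate(1000, constant_step)
geometric = simulate(1000, geometric_step)
```

For 1000 entries the constant step reallocates 125 times and pools 63000 entry slots in total, while the geometric step reallocates 8 times and pools 3158.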
  14. 08 Mar, 2007 (6 commits)
    • Allow fast-import frontends to reload the marks table · e8438420
      Committed by Shawn O. Pearce
      I'm giving fast-import a lesson on how to reload the marks table
      using the same format it outputs with --export-marks.  This way
      a frontend can reload the marks table from a prior import, making
      incremental imports less painful.
      Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
    • Use atomic updates to the fast-import mark file · 60b9004c
      Committed by Shawn O. Pearce
      When we allow fast-import frontends to reload a mark file from a
      prior session we want to let them use the same file as they exported
      the marks to.  This makes it very simple for the frontend to save
      state across incremental imports.
      
      But we don't want to lose the old marks table if anything goes wrong
      while writing our current marks table.  So instead of truncating and
      overwriting the path specified to --export-marks we use the standard
      lockfile code to write the current marks out to a temporary file,
      then rename it over the old marks table.
      Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
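The write-then-rename pattern, together with reloading the result, can be sketched as follows. This is not Git's lockfile code: it uses `tempfile.mkstemp` and `os.replace` as stand-ins, the helper names are hypothetical, and the marks format is assumed to be one `:<mark> <sha1>` pair per line.

```python
import os
import tempfile


def write_marks_atomically(path, marks):
    """Write marks to a temporary file, then rename it over `path`."""
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp_path = tempfile.mkstemp(dir=directory, suffix=".lock")
    try:
        with os.fdopen(fd, "w") as tmp:
            for mark in sorted(marks):
                tmp.write(":%d %s\n" % (mark, marks[mark]))
        os.replace(tmp_path, path)  # the old table survives until this point
    except BaseException:
        os.unlink(tmp_path)         # failed before the rename: discard temp file
        raise


def load_marks(path):
    """Parse a marks file back into a dict of mark number -> object name."""
    marks = {}
    with open(path) as handle:
        for line in handle:
            mark, sha1 = line.split()
            marks[int(mark[1:])] = sha1
    return marks


with tempfile.TemporaryDirectory() as workdir:
    marks_path = os.path.join(workdir, "marks")
    write_marks_atomically(marks_path, {1: "a" * 40})
    write_marks_atomically(marks_path, {1: "a" * 40, 2: "b" * 40})
    reloaded = load_marks(marks_path)
    leftovers = os.listdir(workdir)
```

Because the temporary file lives in the same directory as the target, the final rename stays on one filesystem, and a failure while writing leaves the previous marks file untouched.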
    • Preallocate memory earlier in fast-import · 93e72d8d
      Committed by Shawn O. Pearce
      I'm about to teach fast-import how to reload the marks file created
      by a prior session.  The general approach that I want to use is to
      immediately parse the marks file when the specific argument is found
      in argv, thereby allowing the caller to supply multiple marks files,
      as the mark space can be sparsely populated.
      
      To make that work out we need to allocate our object tables before
      we parse the command line options.  Since none of these tables
      depend on the command line options, we can easily relocate them.
      Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
    • Use off_t in pack-objects/fast-import when we mean an offset · 6777a59f
      Committed by Shawn O. Pearce
      Always use an off_t value in pack-objects anytime we are dealing
      with an offset to some data within a packfile.
      
      Also fixed a minor uintmax_t that was incorrectly defined before.
      Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
      Signed-off-by: Junio C Hamano <junkio@cox.net>
    • Use off_t when we really mean a file offset. · c4001d92
      Committed by Shawn O. Pearce
      Not all platforms have declared 'unsigned long' to be a 64 bit value,
      but we want to support a 64 bit packfile (or close enough anyway)
      in the near future as some projects are getting large enough that
      their packed size exceeds 4 GiB.
      
      By using off_t, the POSIX type that is declared to mean an offset
      within a file, we support whatever maximum file size the underlying
      operating system will handle.  For most modern systems this is up
      around 2^60 or higher.
      Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
      Signed-off-by: Junio C Hamano <junkio@cox.net>
    • General const correctness fixes · 3a55602e
      Committed by Shawn O. Pearce
      We shouldn't attempt to assign constant strings into char*, as the
      string is not writable at runtime.  Likewise we should always be
      treating unsigned values as unsigned values, not as signed values.
      
      Most of these are very straightforward.  The only exception is the
      (unnecessary) xstrdup/free in builtin-branch.c for the detached
      head case.  Since this is a user-level interactive type program
      and that particular code path is executed no more than once, I feel
      that the extra xstrdup call is well worth the easy elimination of
      this warning.
      Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
      Signed-off-by: Junio C Hamano <junkio@cox.net>
  15. 06 Mar, 2007 (2 commits)
    • fast-import: Fail if a non-existant commit is used for merge · 2f6dc35d
      Committed by Shawn O. Pearce
      Johannes Sixt noticed during one of his own imports that fast-import
      did not fail if a nonexistent commit is referenced by SHA-1 value
      as an argument to the 'merge' command.  This allowed the user to
      unknowingly create commits that would fail in fsck, as the commit
      contents would not be completely reachable.
      
      A side effect of this bug was that a frontend process could mark
      any SHA-1 object (blob, tree, tag) as a parent of a merge commit.
      This should also fail in fsck, as the commit is not a valid commit.
      
      We now use the same rule as the 'from' command.  If a commit is
      referenced in the 'merge' command by hex formatted SHA-1 then the
      SHA-1 must be a commit or a tag that can be peeled back to a commit,
      the commit must already exist, and must be readable by the core Git
      infrastructure code.  This requirement means that the commit must
      have existed prior to fast-import starting, or the commit must have
      been flushed out by a prior 'checkpoint' command.
      Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
    • fast-import: Avoid infinite loop after reset · 734c91f9
      Committed by Shawn O. Pearce
      Johannes Sixt noticed that a 'reset' command applied to a branch that
      is already active in the branch LRU cache can cause fast-import to
      relink the same branch into the LRU cache twice.  This will cause
      the LRU cache to contain a cycle, making unload_one_branch run in an
      infinite loop as it tries to select the oldest branch for eviction.
      
      I have trivially fixed the problem by adding an active bit to
      each branch object; this bit indicates if the branch is already
      in the LRU and allows us to avoid trying to add it a second time.
      Converting the pack_id field into a bitfield makes this change take
      up no additional memory.
      Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
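The effect of the active bit can be shown on a toy intrusive list. `Branch`, `BranchCache`, and their fields are hypothetical names, and the list here is a plain singly linked stack, which is only an approximation of fast-import's branch LRU.

```python
class Branch:
    def __init__(self, name):
        self.name = name
        self.active = False      # True while the branch is linked in the cache
        self.next_active = None  # intrusive link to the next cached branch


class BranchCache:
    """Toy model of a branch cache kept as a singly linked list."""

    def __init__(self):
        self.head = None

    def activate(self, branch):
        # Without this check, relinking the current head would set
        # branch.next_active to the branch itself and create a cycle.
        if branch.active:
            return
        branch.next_active = self.head
        self.head = branch
        branch.active = True

    def names(self, limit=100):
        """Walk the list; `limit` bounds the walk in case of a cycle."""
        out, node = [], self.head
        while node is not None and len(out) < limit:
            out.append(node.name)
            node = node.next_active
        return out


master, topic = Branch("master"), Branch("topic")
cache = BranchCache()
cache.activate(master)
cache.activate(topic)
cache.activate(master)  # e.g. triggered again by a 'reset' on an active branch
linked = cache.names()
```

With the flag in place the second activation of `master` is a no-op, so the list stays acyclic and a walk over it terminates.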
  16. 27 Feb, 2007 (2 commits)
  17. 21 Feb, 2007 (3 commits)
    • prefixcmp(): fix-up mechanical conversion. · 599065a3
      Committed by Junio C Hamano
      Previous step converted use of strncmp() with literal string
      mechanically even when the result is only used as a boolean:
      
          if (!strncmp("foo", arg, 3)) ==> if (!(-prefixcmp(arg, "foo")))
      
      This step manually cleans them up to read:
      
          if (!prefixcmp(arg, "foo"))
      Signed-off-by: Junio C Hamano <junkio@cox.net>
    • Mechanical conversion to use prefixcmp() · cc44c765
      Committed by Junio C Hamano
      This mechanically converts strncmp() to use prefixcmp(), but only when
      the parameters match specific patterns, so that they can be verified
      easily.  Leftover from this will be fixed in a separate step, including
      idiotic conversions like
      
          if (!strncmp("foo", arg, 3))
      
        =>
      
          if (!(-prefixcmp(arg, "foo")))
      
      This was done by using this script in px.perl
      
         #!/usr/bin/perl -i.bak -p
         if (/strncmp\(([^,]+), "([^\\"]*)", (\d+)\)/ && (length($2) == $3)) {
                 s|strncmp\(([^,]+), "([^\\"]*)", (\d+)\)|prefixcmp($1, "$2")|;
         }
         if (/strncmp\("([^\\"]*)", ([^,]+), (\d+)\)/ && (length($1) == $3)) {
                 s|strncmp\("([^\\"]*)", ([^,]+), (\d+)\)|(-prefixcmp($2, "$1"))|;
         }
      
      and running:
      
         $ git grep -l strncmp -- '*.c' | xargs perl px.perl
      Signed-off-by: Junio C Hamano <junkio@cox.net>
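For readers who do not read Perl, the two substitutions in the script above can be restated with Python's `re` module. This is a line-by-line paraphrase for illustration, not the tool that was actually run; `convert` is a hypothetical name.

```python
import re

# strncmp(<expr>, "<literal>", <n>)
LITERAL_SECOND = re.compile(r'strncmp\(([^,]+), "([^\\"]*)", (\d+)\)')
# strncmp("<literal>", <expr>, <n>)
LITERAL_FIRST = re.compile(r'strncmp\("([^\\"]*)", ([^,]+), (\d+)\)')


def convert(line):
    """Rewrite one source line; only when <n> equals the literal's length."""
    m = LITERAL_SECOND.search(line)
    if m and len(m.group(2)) == int(m.group(3)):
        line = LITERAL_SECOND.sub(
            lambda m: 'prefixcmp(%s, "%s")' % (m.group(1), m.group(2)),
            line, count=1)
    m = LITERAL_FIRST.search(line)
    if m and len(m.group(1)) == int(m.group(3)):
        # Swapping the arguments flips the sign, hence the negation.
        line = LITERAL_FIRST.sub(
            lambda m: '(-prefixcmp(%s, "%s"))' % (m.group(2), m.group(1)),
            line, count=1)
    return line


swapped = convert('if (!strncmp("foo", arg, 3))')
plain = convert('if (!strncmp(arg, "--foo=", 6))')
untouched = convert('if (!strncmp(arg, "foo", 2))')
```

The length check is what keeps the conversion safe: a call that compares fewer characters than the literal contains is not a prefix test and is left alone.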
    • Check for PRIuMAX rather than NO_C99_FORMAT in fast-import.c. · 3efb1f34
      Committed by Jason Riedy
      Thanks to Simon 'corecode' Schubert <corecode@fs.ei.tum.de> for
      the clean-up.  Defining the C99 standard PRIuMAX when necessary
      replaces UM_FMT and the awkward UM10_FMT.  There are no direct
      C99 translations for other uses of NO_C99_FORMAT in git, alas.
      Signed-off-by: Jason Riedy <ejr@cs.berkeley.edu>
      Signed-off-by: Junio C Hamano <junkio@cox.net>
  18. 20 Feb, 2007 (1 commit)
  19. 13 Feb, 2007 (1 commit)
    • fast-import: Support reusing 'from' and brown paper bag fix reset. · ea5e370a
      Committed by Shawn O. Pearce
      It was suggested on the mailing list that being able to use `from`
      in any commit to reset the current branch is useful in some types of
      importers, such as a darcs importer.
      
      We originally did not permit resetting an existing branch with a
      new `from` command during a `commit` command, but this restriction
      was only to help debug the hacked up cvs2svn that Jon Smirl was
      developing in parallel with git-fast-import.  It is probably more
      of a problem to disallow it than to allow it.  So now we permit a
      `from` during any `commit`.
      
      While making the changes required to permit multiple `from`
      commands on the same branch, I discovered we no longer needed the
      last_commit field to be set to 0 during a reset, so that was removed.
      (Reset was originally setting the field to 0 to signal cmd_from()
      that it was OK to execute on the branch.)
      
      While poking around in this section of fast-import I also realized
      the `reset` command was not working as intended if the corresponding
      `from` command was omitted (as allowed by the BNF grammar and the
      code).  If `from` was omitted we cleared out the tree but we left
      the tree SHA-1 and parent commit SHA-1 intact.  This is not what
      the user intended in this case.  Instead they would be trying to
      reset the branch to have no parent and to have no tree, making the
      branch look new-born during the next commit.  We now clear these
      SHA-1 values during `reset`, ensuring the branch looks new-born if
      `from` does not get supplied.
      
      New test cases for these were also added.
      Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
  20. 12 Feb, 2007 (1 commit)
    • fast-import: Hide the pack boundary commits by default. · bdf1c06d
      Committed by Shawn O. Pearce
      Most users don't need the pack boundary information that fast-import
      was printing to standard output, especially if they were calling
      it with --quiet.
      
      Those users who do want this information probably want it captured
      so they can go back and use it to repack the imported repository.
      So dumping the boundary commits to a log file makes more sense than
      printing them to standard output.
      Signed-off-by: Shawn O. Pearce <spearce@spearce.org>