1. 21 Sep, 2007 2 commits
    • P
      Rework unquote_c_style to work on a strbuf. · 7fb1011e
      Pierre Habouzit committed
      If the gain is not obvious in the diffstat, the resulting code is more
      readable, _and_ in checkout-index/update-index we now reuse the same buffer
      to unquote strings instead of always freeing/mallocing.
      
      This also is more coherent with the next patch that reworks quoting
      functions.
      
      The quoting function is also made more efficient by scanning for backslashes
      and handling each backslash-free portion of the string in one step.
      Signed-off-by: Pierre Habouzit <madcoder@debian.org>
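The run-at-a-time scan mentioned in this message can be sketched roughly as follows. This is only an illustration, not the code from the patch: `unquote_runs` is a hypothetical helper, it writes into a caller-supplied array instead of a strbuf, and it understands just four escapes.

```c
#include <assert.h>
#include <string.h>

/*
 * Hypothetical helper (not from the patch). Copies `in` to `out`,
 * decoding \n, \t, \\ and \" only. `out` must have room for
 * strlen(in) + 1 bytes. Returns the output length, or (size_t)-1
 * when an escape is unknown or the input ends in a lone backslash.
 */
static size_t unquote_runs(const char *in, char *out)
{
	size_t n = 0;

	assert(in != NULL && out != NULL);
	for (;;) {
		/* Copy everything up to the next backslash in one go. */
		size_t run = strcspn(in, "\\");

		memcpy(out + n, in, run);
		n += run;
		in += run;
		if (*in == '\0')
			break;

		/* *in is a backslash: decode the escape that follows. */
		switch (in[1]) {
		case 'n':  out[n++] = '\n'; break;
		case 't':  out[n++] = '\t'; break;
		case '\\': out[n++] = '\\'; break;
		case '"':  out[n++] = '"';  break;
		default:   return (size_t)-1;
		}
		in += 2;
	}
	out[n] = '\0';
	return n;
}
```

Input without any backslash is handled by a single `memcpy`, which is the point of scanning for runs instead of copying byte by byte.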
    • P
      strbuf API additions and enhancements. · c76689df
      Pierre Habouzit committed
      Add strbuf_remove, change strbuf_insert:
        As both are special cases of strbuf_splice, implement them as such.
        gcc is able to do the math and generate almost optimal code this way.
      
      Add strbuf_swap:
        Exchange the values of its arguments.
        Use it in fast-import.c
      
      Also fix spacing issues in strbuf.h
      Signed-off-by: Pierre Habouzit <madcoder@debian.org>
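To illustrate why insert and remove fall out of a single splice primitive, here is a minimal sketch. `struct mini_buf` and the `mini_*` functions are hypothetical stand-ins written for this example; they are not Git's `struct strbuf` or its actual function signatures.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical miniature of a string buffer; not Git's struct strbuf. */
struct mini_buf {
	size_t alloc;
	size_t len;
	char *buf;
};

/* Ensure room for `extra` more bytes plus the trailing NUL. */
static void mini_grow(struct mini_buf *sb, size_t extra)
{
	size_t need = sb->len + extra + 1;

	if (need > sb->alloc) {
		size_t n = sb->alloc ? sb->alloc * 2 : 16;
		char *p;

		if (n < need)
			n = need;
		p = realloc(sb->buf, n);
		assert(p != NULL);	/* sketch only: no real error handling */
		sb->buf = p;
		sb->alloc = n;
	}
}

/* Replace `len` bytes at `pos` with `dlen` bytes taken from `data`. */
static void mini_splice(struct mini_buf *sb, size_t pos, size_t len,
			const void *data, size_t dlen)
{
	assert(pos <= sb->len && len <= sb->len - pos);
	mini_grow(sb, dlen > len ? dlen - len : 0);
	/* Move the tail to its new place, then drop the new bytes in. */
	memmove(sb->buf + pos + dlen, sb->buf + pos + len,
		sb->len - pos - len);
	if (dlen)
		memcpy(sb->buf + pos, data, dlen);
	sb->len = sb->len - len + dlen;
	sb->buf[sb->len] = '\0';
}

/* Insert: splice that replaces zero bytes. */
static void mini_insert(struct mini_buf *sb, size_t pos,
			const void *data, size_t len)
{
	mini_splice(sb, pos, 0, data, len);
}

/* Remove: splice that replaces `len` bytes with nothing. */
static void mini_remove(struct mini_buf *sb, size_t pos, size_t len)
{
	mini_splice(sb, pos, len, NULL, 0);
}

/* Swap: exchange the two structures wholesale. */
static void mini_swap(struct mini_buf *a, struct mini_buf *b)
{
	struct mini_buf tmp = *a;

	*a = *b;
	*b = tmp;
}
```

Because the wrappers pass constant zero lengths, a compiler can usually fold away the unused half of the splice, which is presumably what the remark about gcc "doing the math" refers to.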
  2. 19 Sep, 2007 1 commit
  3. 18 Sep, 2007 3 commits
  4. 17 Sep, 2007 1 commit
  5. 11 Sep, 2007 1 commit
    • P
      Strbuf API extensions and fixes. · f1696ee3
      Pierre Habouzit committed
        * Add strbuf_rtrim to remove trailing spaces.
        * Add strbuf_insert to insert data at a given position.
        * Off-by-one fix in strbuf_addf: strbuf_avail() does not count the final
          \0, so the overflow test for snprintf is the strict comparison. This is
          not critical, as the growth mechanism chosen will always allocate _more_
          memory than asked, so the second test will not fail. It's some kind of
          miracle though.
        * Add size extension hints for strbuf_init and strbuf_read. If 0, default
          applies, else:
            + initial buffer has the given size for strbuf_init.
            + first growth checks it has at least this size rather than the
              default 8192.
      Signed-off-by: Pierre Habouzit <madcoder@debian.org>
      Signed-off-by: Junio C Hamano <gitster@pobox.com>
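Two of the points above, trimming trailing whitespace and the strict comparison in the snprintf overflow test, can be sketched on plain C strings. Both helpers are hypothetical and written for this example; they do not reproduce the strbuf implementation.

```c
#include <assert.h>
#include <ctype.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: drop trailing whitespace, return the new length. */
static size_t rtrim_len(char *s, size_t len)
{
	assert(s != NULL);
	while (len > 0 && isspace((unsigned char)s[len - 1]))
		len--;
	s[len] = '\0';
	return len;
}

/*
 * Hypothetical helper. `avail` counts usable bytes and excludes the
 * slot reserved for the trailing NUL, so `dst` must hold avail + 1
 * bytes. snprintf() returns the length it wanted to write, so the
 * output was truncated exactly when that length is strictly greater
 * than `avail`.
 */
static int format_fits(char *dst, size_t avail, const char *s)
{
	int len = snprintf(dst, avail + 1, "%s", s);

	return len >= 0 && (size_t)len <= avail;
}
```

With `avail` equal to 3, a three-character string fits exactly and a four-character string does not, which is the boundary case the off-by-one fix is about.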
  6. 07 Sep, 2007 2 commits
    • P
      fast-import: Use strbuf API, and simplify cmd_data() · 4a241d79
      Pierre Habouzit committed
        This patch features the use of strbuf_detach, and prevents the programmer
      from messing with allocation directly. The code is as efficient as before, just
      more concise and more straightforward.
      Signed-off-by: Junio C Hamano <gitster@pobox.com>
    • P
      Rework strbuf API and semantics. · b449f4cf
      Pierre Habouzit committed
        The gory details are explained in strbuf.h. The change of semantics this
      patch enforces is that the embedded buffer always has a '\0' character after
      its last byte, to always make it a C-string. The off-by-one changes are all
      related to that very change.
      
        A strbuf can be used to store byte arrays, or as an extended string
      library. The `buf' member can be passed to any C legacy string function,
      because strbuf operations always ensure there is a terminating \0 at the end
      of the buffer, not accounted in the `len' field of the structure.
      
        A strbuf can be used to generate a string/buffer whose final size is not
      really known, and then "strbuf_detach" can be used to get the built buffer,
      and keep the wrapping "strbuf" structure usable for further work again.
      
        Another interesting feature: strbuf_grow(sb, size) ensures that there is
      enough allocated space in `sb' to put `size' new octets of data in the
      buffer. It helps avoid reallocating data for nothing when the problem the
      strbuf helps to solve has a known typical size.
      Signed-off-by: Pierre Habouzit <madcoder@debian.org>
      Signed-off-by: Junio C Hamano <gitster@pobox.com>
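The semantics described in this message (a NUL always follows the last byte, `len` excludes it, and detaching leaves the wrapper reusable) can be sketched as below. `struct text_buf` and the `text_*` functions are hypothetical names for this example; the real definitions live in strbuf.h and may differ in detail.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for a strbuf; field names follow the message. */
struct text_buf {
	size_t alloc;
	size_t len;	/* bytes stored, not counting the trailing NUL */
	char *buf;
};

/* Make sure `extra` more bytes plus the trailing NUL will fit. */
static void text_grow(struct text_buf *sb, size_t extra)
{
	size_t need = sb->len + extra + 1;

	if (need > sb->alloc) {
		char *p = realloc(sb->buf, need);

		assert(p != NULL);	/* sketch only: no real error handling */
		sb->buf = p;
		sb->alloc = need;
	}
}

/* Append bytes and restore the invariant buf[len] == '\0'. */
static void text_add(struct text_buf *sb, const char *data, size_t len)
{
	text_grow(sb, len);
	memcpy(sb->buf + sb->len, data, len);
	sb->len += len;
	sb->buf[sb->len] = '\0';
}

/* Hand the built string to the caller and leave `sb` empty but usable. */
static char *text_detach(struct text_buf *sb)
{
	char *out;

	text_grow(sb, 0);	/* guarantees a buffer even if nothing was added */
	sb->buf[sb->len] = '\0';
	out = sb->buf;
	sb->buf = NULL;
	sb->alloc = sb->len = 0;
	return out;
}
```

Calling `text_grow(&sb, 64)` up front, when the typical size is known, means the later `text_add` calls in that range do not reallocate. The caller owns the detached pointer and must `free()` it.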
  7. 21 Aug, 2007 1 commit
  8. 20 Aug, 2007 1 commit
    • J
      fast-import pull request · 7e5dcea8
      Junio C Hamano committed
      * skip_optional_lf() decl is old-style -- please say
      
      	static void skip_optional_lf(void)
      	{
      		...
      	}
      
      * t9300 #14 fails, like this:
      
      * expecting failure: git-fast-import <input
      fatal: Branch name doesn't conform to GIT standards: .badbranchname
      fast-import: dumping crash report to .git/fast_import_crash_14354
      ./test-lib.sh: line 143: 14354 Segmentation fault      git-fast-import <input
      
      -- >8 --
      Subject: [PATCH] fastimport: Fix re-use of va_list
      
      A va_list is designed to be used only once. The current code
      reuses its va_list argument, which may cause a segmentation fault.
      Copy and release the arguments to avoid this problem.
      
      While we are at it, fix old-style function declaration of
      skip_optional_lf().
      Signed-off-by: Junio C Hamano <gitster@pobox.com>
      Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
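The "copy and release" pattern from this fix can be sketched with the standard `va_copy` macro. The function names below are hypothetical and the code is not taken from fast-import; it only shows a function that needs to walk the same argument list twice.

```c
#include <assert.h>
#include <stdarg.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical helper: format into a freshly allocated string.
 * The arguments are needed twice (once to measure, once to write),
 * so each pass consumes its own copy and `ap` itself is never
 * traversed here.
 */
static char *vformat_alloc(const char *fmt, va_list ap)
{
	va_list cp;
	int len;
	char *out;

	assert(fmt != NULL);
	va_copy(cp, ap);
	len = vsnprintf(NULL, 0, fmt, cp);	/* pass 1: measure */
	va_end(cp);
	if (len < 0)
		return NULL;

	out = malloc((size_t)len + 1);
	if (!out)
		return NULL;

	va_copy(cp, ap);
	vsnprintf(out, (size_t)len + 1, fmt, cp);	/* pass 2: write */
	va_end(cp);
	return out;
}

/* Variadic front end for the helper above. */
static char *format_alloc(const char *fmt, ...)
{
	va_list ap;
	char *out;

	va_start(ap, fmt);
	out = vformat_alloc(fmt, ap);
	va_end(ap);
	return out;
}
```

Passing `ap` itself to both `vsnprintf` calls would leave it in an indeterminate state after the first one, which is the kind of reuse the commit message warns about.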
  9. 19 Aug, 2007 9 commits
    • S
      Include recent command history in fast-import crash reports · 904b1941
      Shawn O. Pearce committed
      When we crash, the frontend developer (or end-user) may need to know
      roughly around what part of the input stream we had a problem with
      and aborted on.  Because line numbers aren't very useful in this
      sort of application we instead just keep the last 100 commands in
      a FIFO queue and print them as part of the crash report.
      
      Currently one problem with this design is a commit that has
      more than 100 modified files in it will flood the FIFO and any
      context regarding branch/from/committer/mark/comments will be lost.
      We really should save only the last few (10?) file changes for the
      current commit, ensuring we have some prior higher level commands
      in the FIFO when we crash on a file M/D/C/R command.
      
      Another issue with this approach is the FIFO only includes the
      commands, it does not include the commit messages.  Yet having a
      commit message may be useful to help locate the relevant change in
      the source material.  In practice I don't think this is going to be a
      major concern as the frontend can always embed its own source change
      set identifier as a comment (which will appear in the crash report)
      and the commit message(s) for the most recent commits of any given
      branch should be obtainable from the (packed) commit objects.
      Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
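A fixed-size history of the most recent commands can be kept in a small ring buffer. The sketch below is an assumption about how such a FIFO might look, not the data structure fast-import uses; `RECENT_MAX` is set to 4 here so the wrap-around is easy to see, where the message above speaks of 100.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#define RECENT_MAX 4	/* the commit message uses 100 */

/* Hypothetical ring buffer holding copies of the last few commands. */
struct recent {
	char *cmd[RECENT_MAX];
	size_t next;	/* slot the next command will overwrite */
	size_t count;	/* number of valid entries, at most RECENT_MAX */
};

/* Remember `line`, discarding the oldest entry once the ring is full. */
static void recent_push(struct recent *r, const char *line)
{
	size_t n = strlen(line) + 1;
	char *copy = malloc(n);

	assert(copy != NULL);	/* sketch only: no real error handling */
	memcpy(copy, line, n);
	free(r->cmd[r->next]);
	r->cmd[r->next] = copy;
	r->next = (r->next + 1) % RECENT_MAX;
	if (r->count < RECENT_MAX)
		r->count++;
}

/* Return entry `i`, where 0 is the oldest command still remembered. */
static const char *recent_get(const struct recent *r, size_t i)
{
	size_t start;

	assert(i < r->count);
	start = (r->next + RECENT_MAX - r->count) % RECENT_MAX;
	return r->cmd[(start + i) % RECENT_MAX];
}

/* Release all stored commands. */
static void recent_clear(struct recent *r)
{
	size_t i;

	for (i = 0; i < RECENT_MAX; i++) {
		free(r->cmd[i]);
		r->cmd[i] = NULL;
	}
	r->next = r->count = 0;
}
```

Pushing five commands into this four-slot ring drops the first one, which is the flooding problem described above: a commit with many file changes pushes the `commit` line itself out of the history.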
    • S
      Generate crash reports on die in fast-import · 8acb3297
      Shawn O. Pearce committed
      As fast-import is quite strict about its input and die()'s anytime
      something goes wrong, it can be difficult for a frontend developer
      to troubleshoot why fast-import rejected their input, or to even
      determine what input command it rejected.
      
      This change introduces a custom handler for Git's die() routine.
      When we receive a die() for any reason (fast-import or a lower level
      core Git routine we called) the error is first dumped onto stderr
      and then a more extensive crash report file is prepared in GIT_DIR.
      Finally we exit the process with status 128, just like the stock
      builtin die handler.
      
      An internal flag is set to prevent any further die()'s that may be
      invoked during the crash report generator from causing us to enter
      into an infinite loop.  We shouldn't die() from our crash report
      handler, but just in case someone makes a future code change we are
      prepared to guard against small mistakes turning into huge problems
      for the end-user.
      Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
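The re-entrancy guard described above can be sketched as follows. Everything here is hypothetical: `on_fatal` is not Git's die handler, and unlike a real handler it returns instead of calling `exit(128)`, purely so that the effect of the flag can be observed.

```c
#include <assert.h>
#include <stdio.h>

static int in_crash_report;	/* set once report generation has started */
static int reports_written;	/* counter for this demo only */

/*
 * Hypothetical handler. `write_report` stands for the code that
 * produces the crash report and might, by mistake, trigger the
 * handler again. A real handler would end with exit(128).
 */
static void on_fatal(const char *msg, void (*write_report)(void))
{
	assert(msg != NULL);
	fprintf(stderr, "fatal: %s\n", msg);
	if (!in_crash_report) {
		in_crash_report = 1;
		if (write_report)
			write_report();
		reports_written++;
	}
}

/* A report writer that itself fails, to exercise the guard. */
static void failing_report(void)
{
	on_fatal("could not write crash report", failing_report);
}
```

Calling `on_fatal("bad input", failing_report)` prints two messages but enters the report code only once; without the flag the two functions would call each other without end.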
    • S
      Allow frontends to bidirectionally communicate with fast-import · ac053c02
      Shawn O. Pearce committed
      The existing checkpoint command is very useful to force fast-import
      to dump the branches out to disk so that standard Git tools can
      access them and the objects they refer to.  However there was not a
      way to know when fast-import had finished executing the checkpoint
      and it was safe to read those refs.
      
      The progress command can be used to make fast-import output any
      message of the frontend's choosing to standard out.  The frontend
      can scan for these messages using select() or poll() to monitor a
      pipe connected to the standard output of fast-import.
      Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
    • S
      Make trailing LF optional for all fast-import commands · 1fdb649c
      Shawn O. Pearce committed
      For the same reasons as the prior change we want to allow frontends
      to omit the trailing LF that usually delimits commands.  In some
      cases these LFs just make the input stream look more verbose than
      it needs to be, and it's just simpler for the frontend developer to
      get started if our parser is slightly more lenient about where an
      LF is required and where it isn't.
      
      To make this optional LF feature work we now have to buffer up to one
      line of input in command_buf.  This buffering can happen if we look
      at the current input command but don't recognize it at this point
      in the code.  In such a case we need to "unget" the entire line,
      but we cannot depend upon the stdio library to let us do ungetc()
      for that many characters at once.
      Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
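Since `ungetc()` only guarantees one character of pushback, "ungetting" a whole line means keeping it in a buffer of our own. The sketch below shows one way to do that; `struct line_reader` and its functions are hypothetical and are not the `command_buf` handling from fast-import.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical reader holding at most one line of lookahead. */
struct line_reader {
	FILE *in;
	char line[256];	/* assumes commands fit; longer lines would be split */
	int unread;	/* nonzero if `line` should be handed out again */
};

/* Return the next line without its LF, or NULL at end of input. */
static const char *next_line(struct line_reader *r)
{
	if (r->unread) {
		r->unread = 0;
		return r->line;
	}
	if (!fgets(r->line, sizeof(r->line), r->in))
		return NULL;
	r->line[strcspn(r->line, "\n")] = '\0';
	return r->line;
}

/* Push the line most recently returned back onto the reader. */
static void unget_line(struct line_reader *r)
{
	assert(!r->unread);	/* only one line of pushback is kept */
	r->unread = 1;
}
```

A parser that reads a line, finds it is not the optional element it was looking for, and calls `unget_line` leaves the line in place for whichever parsing routine runs next.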
    • S
      Make trailing LF following fast-import `data` commands optional · 2c570cde
      Shawn O. Pearce committed
      A few fast-import frontend developers have found it odd that we
      require the LF following a `data` command, especially in the exact
      byte count format.  Technically we don't need this LF to parse
      the stream properly, but having it here does make the stream more
      readable to humans.  We can easily make the LF optional by peeking
      at the next byte available from the stream and pushing it back into
      the buffer if it's not LF.
      Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
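Peeking at one byte and pushing it back is exactly what the standard `getc()`/`ungetc()` pair supports. A minimal sketch, using a hypothetical function that takes the stream as a parameter (the real fast-import code reads from its own input):

```c
#include <assert.h>
#include <stdio.h>

/*
 * Hypothetical helper: consume one LF if it is the next byte,
 * otherwise leave the stream position unchanged.
 */
static void skip_optional_lf_from(FILE *in)
{
	int c;

	assert(in != NULL);
	c = getc(in);
	if (c != '\n' && c != EOF)
		ungetc(c, in);	/* one byte of pushback is always allowed */
}
```

Only a single byte is ever pushed back here, which stays within what the C standard guarantees for `ungetc()`.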
    • S
      Teach fast-import to ignore lines starting with '#' · 401d53fa
      Shawn O. Pearce committed
      Several frontend developers have asked that some form of stream
      comments be permitted within a fast-import data stream.  This way
      they can include information from their own frontend program about
      where specific data was taken from in the source system, or about
      a decision that their frontend may have made while creating the
      fast-import data stream.
      
      This change introduces comments in the Bourne-shell/Tcl/Perl style.
      Lines starting with '#' are ignored, up to and including the LF.
      Unlike the three languages mentioned above, however, we do not look for
      and ignore leading whitespace.  This just simplifies the definition
      of the comment format and the code that parses them.
      
      To make comments work we had to stop using read_next_command() within
      cmd_data() and directly invoke read_line() during the inline variant
      of the function.  This is necessary to retain any lines of the
      input data that might otherwise look like a comment to fast-import.
      Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
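The comment rule described above (a line is a comment only if its very first byte is '#') can be sketched as a command reader that skips such lines. `next_command` is a hypothetical function for this example and assumes every line fits in the caller's buffer.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical command reader: returns the next non-comment line
 * without its LF, or NULL at end of input. Leading whitespace is
 * not skipped, so "  # text" counts as a command, not a comment.
 * Assumes each line fits in `buf`.
 */
static char *next_command(FILE *in, char *buf, int size)
{
	assert(in != NULL && buf != NULL && size > 1);
	while (fgets(buf, size, in)) {
		if (buf[0] == '#')
			continue;	/* comment: ignored through its LF */
		buf[strcspn(buf, "\n")] = '\0';
		return buf;
	}
	return NULL;
}
```

Inline data must not be read through a function like this one, since a data line that happens to start with '#' would be dropped; that is the reason the message gives for reading data lines directly.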
    • S
      Use handy ALLOC_GROW macro in fast-import when possible · 31490074
      Shawn O. Pearce committed
      Instead of growing our buffer by hand during the inline variant of
      cmd_data() we can save a few lines of code and just use the nifty
      new ALLOC_GROW macro already available to us.
      Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
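A grow-on-demand macro in the spirit of ALLOC_GROW might look like the sketch below. `GROW_ARRAY` is a hypothetical name, and the growth formula and the `assert`-based error handling are assumptions made for this example; consult Git's own headers for the real macro.

```c
#include <assert.h>
#include <stdlib.h>

/*
 * Hypothetical macro: make sure `ptr` can hold at least `needed`
 * elements, updating `alloc` (a size_t lvalue) to the new capacity.
 * Capacity grows by roughly half each time to keep the number of
 * reallocations small.
 */
#define GROW_ARRAY(ptr, needed, alloc) \
	do { \
		if ((needed) > (alloc)) { \
			size_t new_alloc_ = ((alloc) + 16) * 3 / 2; \
			void *p_; \
			if (new_alloc_ < (needed)) \
				new_alloc_ = (needed); \
			p_ = realloc((ptr), new_alloc_ * sizeof(*(ptr))); \
			assert(p_ != NULL); /* sketch only */ \
			(ptr) = p_; \
			(alloc) = new_alloc_; \
		} \
	} while (0)
```

A typical call site is `GROW_ARRAY(v, nr + 1, alloc); v[nr++] = value;`, which replaces the hand-written "check, compute new size, realloc" sequence with one line.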
    • S
      Actually allow TAG_FIXUP branches in fast-import · ea08a6fd
      Shawn O. Pearce committed
      Michael Haggerty <mhagger@alum.mit.edu> noticed while debugging a
      Git backend for cvs2svn that fast-import was barfing when he tried
      to use "TAG_FIXUP" as a branch name for temporary work needed to
      cleanup the tree prior to creating an annotated tag object.
      
      The reason we were rejecting the branch name was that check_ref_format()
      returns -2 when there are fewer than 2 '/' characters in the input
      name.  TAG_FIXUP has 0 '/' characters, but is technically just as
      valid a ref as HEAD and MERGE_HEAD, so we really should permit it
      (and any other similar looking name) during import.
      
      New test cases have been added to make sure we still detect very
      wrong branch names (e.g. containing [ or starting with .) and yet
      still permit reasonable names (e.g. TAG_FIXUP).
      Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
    • A
      Fix whitespace in "Format of STDIN stream" of fast-import · c905e090
      Alex Riesen committed
      Something probably assumed that HT indentation is 4 characters.
      Signed-off-by: Alex Riesen <raa.lkml@gmail.com>
      Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
  10. 15 Aug, 2007 1 commit
  11. 15 Jul, 2007 1 commit
    • S
      Teach fast-import to recursively copy files/directories · b6f3481b
      Shawn O. Pearce committed
      Some source material (e.g. Subversion dump files) performs directory
      renames by telling us the directory was copied, then deleted in the
      same revision.  This makes it difficult for a frontend to convert
      such data formats to a fast-import stream, as all the frontend has
      on hand is "Copy a/ to b/; Delete a/" with no details about what
      files are in a/, unless the frontend also kept track of all files.
      
      The new 'C' subcommand within a commit allows the frontend to make a
      recursive copy of one path to another path within the branch, without
      needing to keep track of the individual file paths.  The metadata
      copy is performed in memory efficiently, but is implemented as a
      copy-immediately operation, rather than copy-on-write.
      
      With this new 'C' subcommand frontends could obviously implement an
      'R' (rename) on their own as a combination of 'C' and 'D' (delete),
      but since we have already offered up 'R' in the past and it is a
      trivial thing to keep implemented I'm not going to deprecate it.
      Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
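From the frontend's side, the equivalence mentioned above (a rename is a copy followed by a delete) comes down to which lines it writes into the stream. The sketch below formats those lines; `format_rename` is a hypothetical helper and assumes paths that need no quoting (no spaces or special characters).

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical frontend helper: write the commit subcommands that
 * move `src` to `dst`, either as a single 'R' line or as a 'C'
 * line followed by a 'D' line. Assumes the paths need no quoting.
 * Returns what snprintf() returns.
 */
static int format_rename(char *out, size_t size,
			 const char *src, const char *dst, int use_r)
{
	assert(out != NULL && src != NULL && dst != NULL);
	if (use_r)
		return snprintf(out, size, "R %s %s\n", src, dst);
	return snprintf(out, size, "C %s %s\nD %s\n", src, dst, src);
}
```

Either form lets the frontend relay "Copy a/ to b/; Delete a/" without knowing which files live under the source directory.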
  12. 10 Jul, 2007 1 commit
    • S
      Support wholesale directory renames in fast-import · f39a946a
      Shawn O. Pearce committed
      Some source material (e.g. Subversion dump files) performs directory
      renames without telling us exactly which files in that subdirectory
      were moved.  This makes it hard for a frontend to convert such data
      formats to a fast-import stream, as all the frontend has on hand
      is "Rename a/ to b/" with no details about what files are in a/,
      unless the frontend also kept track of all files.
      
      The new 'R' subcommand within a commit allows the frontend to
      rename either a file or an entire subdirectory, without needing to
      know the object's SHA-1 or the specific files contained within it.
      The rename is performed as efficiently as possible internally,
      making it cheaper than a 'D'/'M' pair for a file rename.
      Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
  13. 24 May, 2007 4 commits
  14. 11 May, 2007 1 commit
  15. 03 May, 2007 1 commit
  16. 29 Apr, 2007 1 commit
  17. 25 Apr, 2007 1 commit
  18. 20 Apr, 2007 1 commit
    • S
      Don't repack existing objects in fast-import · a5c1780a
      Shawn O. Pearce committed
      Some users of fast-import have been trying to use it to rewrite
      commits and trees, an activity where all of the relevant blobs
      are already available from the existing packfiles.  In such a case
      we don't want to repack a blob, even if the frontend application
      has supplied us the raw data rather than a mark or a SHA-1 name.
      
      I'm intentionally only checking the packfiles that existed when
      fast-import started and am always ignoring all loose object files.
      
      We ignore loose objects because fast-import tends to operate on a
      very large number of objects in a very short timespan, and it is
      usually creating new objects, not reusing existing ones.  In such
      a situation the majority of the objects will not be found in the
      existing packfiles, nor will they be loose object files.  If the
      frontend application really wants us to look at loose object files,
      then they can just repack the repository before running fast-import.
      Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
  19. 31 Mar, 2007 1 commit
    • T
      Rename warn() to warning() to fix symbol conflicts on BSD and Mac OS · 46efd2d9
      Theodore Ts'o committed
      This fixes a problem reported by Randal Schwartz:
      
      >I finally tracked down all the (albeit inconsequential) errors I was getting
      >on both OpenBSD and OSX.  It's the warn() function in usage.c.  There's
      >warn(3) in BSD-style distros.  It'd take a "great rename" to change it, but if
      >someone with better C skills than I have could do that, my linker and I would
      >appreciate it.
      
      It was annoying to me, too, when I was doing some mergetool testing on
      Mac OS X, so here's a fix.
      Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
      Cc: "Randal L. Schwartz" <merlyn@stonehenge.com>
      Signed-off-by: Junio C Hamano <junkio@cox.net>
  20. 25 Mar, 2007 1 commit
  21. 13 Mar, 2007 2 commits
    • S
      Remove unnecessary casts from fast-import · 061e35c5
      Shawn O. Pearce committed
      Jeff King pointed out that these casts are quite unnecessary, as
      the compiler should be doing them anyway, and may cause problems
      in the future if the size of the argument for to_atom were to ever
      be increased.
      Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
    • J
      fast-import: grow tree storage more aggressively · f022f85f
      Jeff King committed
      When building up a tree for a commit, fast-import
      dynamically allocates memory for the tree entries. When more
      space is needed, the allocated memory is increased by a
      constant amount. For very large trees, this means
      re-allocating and memcpy()ing the memory O(n) times.
      
      To compound this problem, releasing the previous tree
      resource does not free the memory; it is kept in a pool
      for future trees. This means that each of the O(n)
      allocations will consume increasing amounts of memory,
      giving O(n^2) memory consumption.
      Signed-off-by: Jeff King <peff@peff.net>
      Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
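The cost of constant-increment growth can be made concrete by counting reallocations. The two counters below are a hypothetical illustration; in particular, doubling is only assumed here as an example of a "more aggressive" policy, since the message above does not say which policy the patch adopts.

```c
#include <assert.h>
#include <stddef.h>

/* Reallocations needed to reach `target` entries, adding `step` each time. */
static size_t steps_constant(size_t start, size_t target, size_t step)
{
	size_t alloc = start, steps = 0;

	assert(step > 0);
	while (alloc < target) {
		alloc += step;
		steps++;
	}
	return steps;
}

/* Reallocations needed to reach `target` entries, doubling each time. */
static size_t steps_doubling(size_t start, size_t target)
{
	size_t alloc = start, steps = 0;

	assert(start > 0);
	while (alloc < target) {
		alloc *= 2;
		steps++;
	}
	return steps;
}
```

Starting from 8 entries and growing to 1000, the constant policy with a step of 8 reallocates 124 times, while doubling reallocates 7 times. When every discarded block stays in a pool, as described above, the first policy also leaves far more dead memory behind.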
  22. 08 Mar, 2007 3 commits
    • S
      Allow fast-import frontends to reload the marks table · e8438420
      Shawn O. Pearce committed
      I'm giving fast-import a lesson on how to reload the marks table
      using the same format it outputs with --export-marks.  This way
      a frontend can reload the marks table from a prior import, making
      incremental imports less painful.
      Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
    • S
      Use atomic updates to the fast-import mark file · 60b9004c
      Shawn O. Pearce committed
      When we allow fast-import frontends to reload a mark file from a
      prior session we want to let them use the same file as they exported
      the marks to.  This makes it very simple for the frontend to save
      state across incremental imports.
      
      But we don't want to lose the old marks table if anything goes wrong
      while writing our current marks table.  So instead of truncating and
      overwriting the path specified to --export-marks we use the standard
      lockfile code to write the current marks out to a temporary file,
      then rename it over the old marks table.
      Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
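The write-then-rename idea can be sketched with standard C I/O. This is a simplified assumption of the approach, not Git's lockfile code: `write_replacing` is a hypothetical helper, it does not fail when the ".lock" file already exists as a real lock would, and it relies on POSIX `rename()` semantics for replacing an existing file.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper: write `data` to "<path>.lock", then rename
 * that file over `path`. Returns 0 on success and -1 on failure,
 * in which case the old contents of `path` are left untouched.
 */
static int write_replacing(const char *path, const char *data)
{
	char tmp[512];
	FILE *f;
	int n;

	assert(path != NULL && data != NULL);
	n = snprintf(tmp, sizeof(tmp), "%s.lock", path);
	if (n < 0 || (size_t)n >= sizeof(tmp))
		return -1;

	f = fopen(tmp, "w");
	if (!f)
		return -1;
	if (fputs(data, f) == EOF) {
		fclose(f);
		remove(tmp);
		return -1;
	}
	/* Only a fully written and closed file may replace the old one. */
	if (fclose(f) != 0 || rename(tmp, path) != 0) {
		remove(tmp);
		return -1;
	}
	return 0;
}
```

If anything goes wrong before the rename, the previous marks file is still complete, which is the property the message asks for.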
    • S
      Preallocate memory earlier in fast-import · 93e72d8d
      Shawn O. Pearce committed
      I'm about to teach fast-import how to reload the marks file created
      by a prior session.  The general approach that I want to use is to
      immediately parse the marks file when the specific argument is found
      in argv, thereby allowing the caller to supply multiple marks files,
      as the mark space can be sparsely populated.
      
      To make that work out we need to allocate our object tables before
      we parse the command line options.  Since none of these tables
      depend on the command line options, we can easily relocate them.
      Signed-off-by: Shawn O. Pearce <spearce@spearce.org>