1. 16 February 2008 (3 commits)
  2. 21 January 2008 (2 commits)
  3. 19 January 2008 (1 commit)
  4. 18 January 2008 (2 commits)
    • Fix random fast-import errors when compiled with NO_MMAP · c9ced051
      Committed by Shawn O. Pearce
      fast-import was relying on the fact that on most systems mmap() and
      write() are synchronized by the filesystem's buffer cache.  We were
      relying on the ability to mmap() 20 bytes beyond the current end
      of the file, then later fill in those bytes with a future write()
      call, then read them through the previously obtained mmap() address.
      
      This isn't always true with some implementations of NFS, but it is
      especially not true with our NO_MMAP=YesPlease build-time option used
      on some platforms.  If fast-import is built with NO_MMAP=YesPlease
      we use the malloc()+pread() emulation, and the subsequent write()
      call does not update the trailing 20 bytes of a previously obtained
      "mmap()" (aka malloc'd) address.
      
      Under NO_MMAP that behavior causes unpack_entry() in sha1_file.c to
      be unable to read an object header (or data) that has been unlucky
      enough to be written to the packfile at a location such that it
      is in the trailing 20 bytes of a window previously opened on that
      same packfile.
      
      This bug has gone unnoticed for a very long time as it is highly data
      dependent.  Not only does the object have to be placed at the right
      position, but it also needs to be positioned behind some other object
      that has been accessed due to a branch cache invalidation.  In other
      words the stars had to align just right, and if you did run into
      this bug you probably should also have purchased a lottery ticket.
      
      Fortunately the workaround is a lot easier than the bug explanation.
      
      Before we allow unpack_entry() to read data from a pack window
      that has also (possibly) been modified through write() we force
      all existing windows on that packfile to be closed.  By closing
      the windows we ensure that any new access via the emulated mmap()
      will reread the packfile, updating to the current file content.
      
      This comes with a slight performance degradation, as we cannot reuse
      previously cached windows when we update the packfile.  But it
      is a fairly minor difference as the window closes happen at only
      two points:
      
       - When the packfile is finalized and its .idx is generated:
      
         At this stage we are getting ready to update the refs and any
         data access into the packfile is going to be random, and is
         going after only the branch tips (to ensure they are valid).
         Our existing windows (if any) are not likely to be positioned
         at useful locations to access those final tip commits so we
         probably were closing them before anyway.
      
       - When the branch cache missed and we need to reload:
      
         At this point fast-import is getting change commands for the next
         commit and it needs to go re-read a tree object it previously
         had written out to the packfile.  What windows we had (if any)
         are not likely to cover the tree in question so we probably were
         closing them before anyway.
      
      We do try to avoid unnecessarily closing windows in the second case
      by checking to see if the packfile size has increased since the
      last time we called unpack_entry() on that packfile.  If the size
      has not changed then we have not written additional data, and any
      existing window is still valid.  This nicely handles the cases where
      fast-import is going through a branch cache reload and needs to read
      many trees at once.  During such an event we are not likely to be
      updating the packfile so we do not cycle the windows between reads.
      
      With this change in place t9301-fast-export.sh (which was broken
      by c3b0dec5) finally works again.
      Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
      Signed-off-by: Junio C Hamano <gitster@pobox.com>
      c9ced051
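
      A sketch of the workaround described above (not the actual patch): it
      assumes the unpack_entry() and close_pack_windows() helpers from
      sha1_file.c of this period, and the wrapper and size-tracking names
      are illustrative.

        static void *read_from_active_pack(struct packed_git *p, off_t ofs,
                                           enum object_type *type,
                                           unsigned long *sizep)
        {
            static off_t size_at_last_read;

            if (p->pack_size != size_at_last_read) {
                /*
                 * The pack grew since our last read: emulated-mmap windows
                 * may hold stale trailing bytes, so close them all and
                 * force the next access to re-read the file.
                 */
                close_pack_windows(p);
                size_at_last_read = p->pack_size;
            }
            return unpack_entry(p, ofs, type, sizep);
        }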
    • fast-import.c: don't try to commit marks file if write failed · fb54abd6
      Committed by Brandon Casey
      We also move the assignment of -1 to the lock file descriptor
      up, so that rollback_lock_file() can be called safely after a
      possible attempt to fclose(). This matches the contents of
      the 'if' statement just above, which tests whether fdopen() succeeded.
      Signed-off-by: Brandon Casey <casey@nrlssc.navy.mil>
      Acked-by: Shawn O. Pearce <spearce@spearce.org>
      Signed-off-by: Junio C Hamano <gitster@pobox.com>
      fb54abd6
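
      A sketch of the ordering this entry describes, assuming the lock_file
      API of this period; the function is a simplified stand-in for
      fast-import's marks writer, not the actual code.

        static void save_marks(struct lock_file *lock, const char *path)
        {
            FILE *f;
            int fd = hold_lock_file_for_update(lock, path, 0);

            if (fd < 0)
                return;
            f = fdopen(fd, "w");
            if (!f) {
                rollback_lock_file(lock);
                return;
            }
            lock->fd = -1;    /* the FILE* owns the descriptor from here on */
            /* ... write the marks through f ... */
            if (ferror(f) || fclose(f)) {
                rollback_lock_file(lock);    /* write failed: do not commit */
                return;
            }
            commit_lock_file(lock);
        }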
  5. 17 January 2008 (1 commit)
  6. 10 January 2008 (1 commit)
  7. 03 January 2008 (1 commit)
    • Update callers of check_ref_format() · 257f3020
      Committed by Junio C Hamano
      This updates send-pack and fast-import to use symbolic constants
      for checking the return values from check_ref_format(), and also
      futureproofs the logic in lock_any_ref_for_update() to explicitly
      name the case that is usually considered an error but is OK for
      this particular use.
      Signed-off-by: Junio C Hamano <gitster@pobox.com>
      257f3020
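
      A sketch of the calling convention this refers to; CHECK_REF_FORMAT_ONELEVEL
      is the symbolic name I believe refs.h exported at the time, so treat it
      as an assumption.

        static void verify_branch_name(const char *name)
        {
            switch (check_ref_format(name)) {
            case 0:
                break;    /* valid */
            case CHECK_REF_FORMAT_ONELEVEL:
                break;    /* too few '/', but OK for this particular use */
            default:
                die("Branch name doesn't conform to GIT standards: %s", name);
            }
        }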
  8. 15 December 2007 (1 commit)
  9. 14 November 2007 (1 commit)
    • Don't allow fast-import tree delta chains to exceed maximum depth · 436e7a74
      Committed by Shawn O. Pearce
      Brian Downing noticed fast-import can produce tree depths of up
      to 6,035 objects and even deeper.  Long delta chains can create
      very small packfiles but cause problems during repacking as git
      needs to unpack each tree to count the reachable blobs.
      
      What's happening here is the active branch cache isn't big enough.
      We're swapping out the branch and thus recycling the tree information
      (struct tree_content) back into the free pool.  When we later reload
      the tree we set the delta_depth to 0 but we kept the tree we just
      reloaded as a delta base.
      
      So if the tree we reloaded was already at the maximum depth we
      wouldn't know it and make the new tree a delta.  Multiply the
      number of times the branch cache has to swap out the tree by
      max_depth (10) and you get the maximum delta depth of a tree created
      by fast-import.  In Brian's case above the active branch cache had
      to swap the branch out 603/604 times during this import to produce
      a tree with a delta depth of 6035.
      Acked-by: Brian Downing <bdowning@lavos.net>
      Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
      Signed-off-by: Junio C Hamano <gitster@pobox.com>
      436e7a74
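
      A conceptual sketch of the guard this implies, assuming each reloaded
      tree remembers the delta depth it already had; the names here are
      illustrative, not fast-import's.

        struct tree_base {
            unsigned int delta_depth;    /* depth the reloaded tree already has */
            void *data;
        };

        static void *usable_delta_base(struct tree_base *reloaded,
                                       unsigned int max_depth)
        {
            if (!reloaded || reloaded->delta_depth + 1 > max_depth)
                return NULL;             /* chain already too long: store in full */
            return reloaded->data;       /* safe to delta against it */
        }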
  10. 27 October 2007 (1 commit)
  11. 21 October 2007 (1 commit)
  12. 29 September 2007 (1 commit)
    • strbuf change: be sure ->buf is never ever NULL. · b315c5c0
      Committed by Pierre Habouzit
      For that purpose, the ->buf is always initialized with a char * buf living
      in the strbuf module. It is made a char * so that we can sloppily accept
      things that perform: sb->buf[0] = '\0', and because you can't pass "" as an
      initializer for ->buf without making gcc unhappy for very good reasons.
      
      strbuf_init/_detach/_grow have been fixed to trust ->alloc and not ->buf
      anymore.
      
      As a consequence, strbuf_detach is _mandatory_ to detach a buffer:
      copying ->buf is no longer an option if ->buf is going to escape from
      the scope and eventually be freed.
      
      API changes:
        * strbuf_setlen now always works, so just make strbuf_reset a convenience
          macro.
        * strbuf_detach takes an optional size_t* argument (it may be NULL)
          into which the buffer's len is copied; this was needed for the
          refactor to keep the code readable and to match how the callers work.
      Signed-off-by: Pierre Habouzit <madcoder@debian.org>
      Signed-off-by: Junio C Hamano <gitster@pobox.com>
      b315c5c0
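
      A minimal sketch of the arrangement, assuming the strbuf type and
      strbuf_grow() from strbuf.h; the real module keeps a one-byte "slop"
      buffer in essentially this way.

        char strbuf_slopbuf[1];    /* shared, always '\0', never written through */

        void strbuf_init(struct strbuf *sb, size_t hint)
        {
            sb->buf = strbuf_slopbuf;    /* ->buf is never NULL, even when empty */
            sb->len = 0;
            sb->alloc = 0;               /* growth decisions trust ->alloc only */
            if (hint)
                strbuf_grow(sb, hint);
        }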
  13. 21 September 2007 (2 commits)
    • Rework unquote_c_style to work on a strbuf. · 7fb1011e
      Committed by Pierre Habouzit
      Even if the gain is not obvious in the diffstat, the resulting code is more
      readable, _and_ in checkout-index/update-index we now reuse the same buffer
      to unquote strings instead of always freeing and mallocing.
      
      This also is more coherent with the next patch that reworks quoting
      functions.
      
      The quoting function is also made more efficient by scanning for backslashes
      and treating backslash-free portions of the string at once.
      Signed-off-by: Pierre Habouzit <madcoder@debian.org>
      7fb1011e
    • strbuf API additions and enhancements. · c76689df
      Committed by Pierre Habouzit
      Add strbuf_remove, change strbuf_insert:
        As both are special cases of strbuf_splice, implement them as such.
        gcc is able to do the math and generate almost optimal code this way.
      
      Add strbuf_swap:
        Exchange the values of its arguments.
        Use it in fast-import.c
      
      Also fix spacing issues in strbuf.h
      Signed-off-by: Pierre Habouzit <madcoder@debian.org>
      c76689df
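
      A sketch of the relationship described above: strbuf_splice(sb, pos, len,
      data, dlen) replaces len bytes at pos with the dlen bytes at data, so
      both helpers reduce to it (assumes strbuf.h; not necessarily the exact
      committed code).

        void strbuf_insert(struct strbuf *sb, size_t pos, const void *data, size_t len)
        {
            strbuf_splice(sb, pos, 0, data, len);    /* remove nothing, insert len bytes */
        }

        void strbuf_remove(struct strbuf *sb, size_t pos, size_t len)
        {
            strbuf_splice(sb, pos, len, NULL, 0);    /* remove len bytes, insert nothing */
        }

        void strbuf_swap(struct strbuf *a, struct strbuf *b)
        {
            struct strbuf tmp = *a;
            *a = *b;
            *b = tmp;
        }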
  14. 19 September 2007 (1 commit)
  15. 18 September 2007 (3 commits)
  16. 17 September 2007 (1 commit)
  17. 11 September 2007 (1 commit)
    • Strbuf API extensions and fixes. · f1696ee3
      Committed by Pierre Habouzit
        * Add strbuf_rtrim to remove trailing spaces.
        * Add strbuf_insert to insert data at a given position.
        * Off-by-one fix in strbuf_addf: strbuf_avail() does not count the final
          \0, so the overflow test for snprintf is the strict comparison. This is
          not critical as the growth mechanism chosen will always allocate _more_
          memory than asked, so the second test will not fail. It's some kind of
          miracle though.
        * Add size extension hints for strbuf_init and strbuf_read. If 0, default
          applies, else:
            + initial buffer has the given size for strbuf_init.
            + first growth checks it has at least this size rather than the
              default 8192.
      Signed-off-by: Pierre Habouzit <madcoder@debian.org>
      Signed-off-by: Junio C Hamano <gitster@pobox.com>
      f1696ee3
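
      A minimal sketch of strbuf_rtrim from the first bullet, relying only on
      the strbuf invariants (assumes strbuf.h).

        #include <ctype.h>

        void strbuf_rtrim(struct strbuf *sb)
        {
            while (sb->len > 0 && isspace((unsigned char)sb->buf[sb->len - 1]))
                sb->len--;
            sb->buf[sb->len] = '\0';    /* keep the C-string invariant */
        }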
  18. 07 September 2007 (2 commits)
    • fast-import: Use strbuf API, and simplify cmd_data() · 4a241d79
      Committed by Pierre Habouzit
        This patch features the use of strbuf_detach, and prevents the programmer
      from messing with allocation directly. The code is as efficient as before,
      just more concise and more straightforward.
      Signed-off-by: Junio C Hamano <gitster@pobox.com>
      4a241d79
    • Rework strbuf API and semantics. · b449f4cf
      Committed by Pierre Habouzit
        The gory details are explained in strbuf.h. The change of semantics this
      patch enforces is that the embedded buffer always has a '\0' character after
      its last byte, to always make it a C-string. The off-by-one changes are all
      related to that very change.
      
        A strbuf can be used to store byte arrays, or as an extended string
      library. The `buf' member can be passed to any legacy C string function,
      because strbuf operations always ensure there is a terminating \0 at the end
      of the buffer, not accounted for in the `len' field of the structure.
      
        A strbuf can be used to generate a string/buffer whose final size is not
      really known, and then "strbuf_detach" can be used to get the built buffer,
      and keep the wrapping "strbuf" structure usable for further work again.
      
        Another interesting feature: strbuf_grow(sb, size) ensures that there is
      enough allocated space in `sb' to put `size' new octets of data in the
      buffer. It helps avoid reallocating data for nothing when the problem the
      strbuf helps to solve has a known typical size.
      Signed-off-by: Pierre Habouzit <madcoder@debian.org>
      Signed-off-by: Junio C Hamano <gitster@pobox.com>
      b449f4cf
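
      A sketch of the structure and a typical call sequence; the helper is
      illustrative, and the two-argument strbuf_detach() shown is the later
      form added in b315c5c0.

        struct strbuf {
            size_t alloc;    /* bytes allocated */
            size_t len;      /* bytes used, not counting the trailing '\0' */
            char *buf;       /* always NUL-terminated at buf[len] */
        };

        static char *build_ref_name(const char *branch)
        {
            struct strbuf sb;

            strbuf_init(&sb, 0);
            strbuf_addstr(&sb, "refs/heads/");
            strbuf_addstr(&sb, branch);
            return strbuf_detach(&sb, NULL);    /* caller now owns (and frees) the buffer */
        }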
  19. 21 August 2007 (1 commit)
  20. 20 August 2007 (1 commit)
    • fast-import pull request · 7e5dcea8
      Committed by Junio C Hamano
      * skip_optional_lf() decl is old-style -- please say
      
        static skip_optional_lf(void)
        {
                ...
        }
      
      * t9300 #14 fails, like this:
      
      * expecting failure: git-fast-import <input
      fatal: Branch name doesn't conform to GIT standards: .badbranchname
      fast-import: dumping crash report to .git/fast_import_crash_14354
      ./test-lib.sh: line 143: 14354 Segmentation fault      git-fast-import <input
      
      -- >8 --
      Subject: [PATCH] fastimport: Fix re-use of va_list
      
      The va_list is designed to be used only once.  The current code
      reuses the va_list argument, which may cause a segmentation fault.
      Copy and release the arguments to avoid this problem.
      
      While we are at it, fix old-style function declaration of
      skip_optional_lf().
      Signed-off-by: Junio C Hamano <gitster@pobox.com>
      Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
      7e5dcea8
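
      A minimal sketch of the va_copy pattern the patch describes; the helper
      name is illustrative.

        #include <stdarg.h>
        #include <stdio.h>

        static int format_message(char *buf, size_t n, const char *fmt, va_list ap)
        {
            va_list cp;
            int len;

            va_copy(cp, ap);    /* work on a copy; never consume ap itself twice */
            len = vsnprintf(buf, n, fmt, cp);
            va_end(cp);
            return len;         /* the caller may still hand ap to another v*printf */
        }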
  21. 19 August 2007 (9 commits)
    • Include recent command history in fast-import crash reports · 904b1941
      Committed by Shawn O. Pearce
      When we crash, the frontend developer (or end-user) may need to know
      roughly what part of the input stream we had a problem with and
      aborted on.  Because line numbers aren't very useful in this
      sort of application we instead just keep the last 100 commands in
      a FIFO queue and print them as part of the crash report.
      
      Currently one problem with this design is that a commit with
      more than 100 modified files in it will flood the FIFO and any
      context regarding branch/from/committer/mark/comments will be lost.
      We really should save only the last few (10?) file changes for the
      current commit, ensuring we have some prior higher level commands
      in the FIFO when we crash on a file M/D/C/R command.
      
      Another issue with this approach is the FIFO only includes the
      commands, it does not include the commit messages.  Yet having a
      commit message may be useful to help locate the relevant change in
      the source material.  In practice I don't think this is going to be a
      major concern as the frontend can always embed its own source change
      set identifier as a comment (which will appear in the crash report)
      and the commit message(s) for the most recent commits of any given
      branch should be obtainable from the (packed) commit objects.
      Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
      904b1941
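
      A minimal sketch of such a FIFO of recent commands; the size and the
      names are illustrative.

        #include <stdlib.h>
        #include <string.h>

        #define CMD_HISTORY 100

        static char *recent_cmds[CMD_HISTORY];
        static unsigned long cmd_count;

        static void remember_command(const char *line)
        {
            unsigned int slot = cmd_count++ % CMD_HISTORY;

            free(recent_cmds[slot]);          /* overwrite the oldest entry */
            recent_cmds[slot] = strdup(line); /* the crash report walks this ring */
        }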
    • Generate crash reports on die in fast-import · 8acb3297
      Committed by Shawn O. Pearce
      As fast-import is quite strict about its input and die()'s anytime
      something goes wrong it can be difficult for a frontend developer
      to troubleshoot why fast-import rejected their input, or to even
      determine what input command it rejected.
      
      This change introduces a custom handler for Git's die() routine.
      When we receive a die() for any reason (fast-import or a lower level
      core Git routine we called) the error is first dumped onto stderr
      and then a more extensive crash report file is prepared in GIT_DIR.
      Finally we exit the process with status 128, just like the stock
      builtin die handler.
      
      An internal flag is set to prevent any further die()'s that may be
      invoked during the crash report generator from causing us to enter
      into an infinite loop.  We shouldn't die() from our crash report
      handler, but just in case someone makes a future code change we are
      prepared to guard against small mistakes turning into huge problems
      for the end-user.
      Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
      8acb3297
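
      A conceptual sketch of the guarded handler; apart from the exit status,
      the names here (including the report helper) are illustrative rather
      than fast-import's.

        #include <stdarg.h>
        #include <stdio.h>
        #include <stdlib.h>

        static int writing_crash_report;

        static void write_crash_report(void)
        {
            /* hypothetical: dump recent commands, branches, etc. into GIT_DIR */
        }

        static void die_nicely(const char *err, va_list params)
        {
            fputs("fatal: ", stderr);
            vfprintf(stderr, err, params);    /* report on stderr first, as before */
            fputc('\n', stderr);

            if (!writing_crash_report) {
                writing_crash_report = 1;     /* a die() during the report must not recurse */
                write_crash_report();
            }
            exit(128);    /* same status as the stock builtin die handler */
        }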
    • Allow frontends to bidirectionally communicate with fast-import · ac053c02
      Committed by Shawn O. Pearce
      The existing checkpoint command is very useful to force fast-import
      to dump the branches out to disk so that standard Git tools can
      access them and the objects they refer to.  However there was not a
      way to know when fast-import had finished executing the checkpoint
      and it was safe to read those refs.
      
      The progress command can be used to make fast-import output any
      message of the frontend's choosing to standard out.  The frontend
      can scan for these messages using select() or poll() to monitor a
      pipe connected to the standard output of fast-import.
      Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
      ac053c02
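
      An illustrative input fragment (the message text is made up): once the
      checkpoint completes, fast-import echoes the progress line on its
      standard output, which the frontend can wait for.

        checkpoint
        progress checkpoint done, refs are safe to read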
    • Make trailing LF optional for all fast-import commands · 1fdb649c
      Committed by Shawn O. Pearce
      For the same reasons as the prior change we want to allow frontends
      to omit the trailing LF that usually delimits commands.  In some
      cases these just make the input stream more verbose looking than
      it needs to be, and it's just simpler for the frontend developer to
      get started if our parser is slightly more lenient about where an
      LF is required and where it isn't.
      
      To make this optional LF feature work we now have to buffer up to one
      line of input in command_buf.  This buffering can happen if we look
      at the current input command but don't recognize it at this point
      in the code.  In such a case we need to "unget" the entire line,
      but we cannot depend upon the stdio library to let us do ungetc()
      for that many characters at once.
      Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
      1fdb649c
    • Make trailing LF following fast-import `data` commands optional · 2c570cde
      Committed by Shawn O. Pearce
      A few fast-import frontend developers have found it odd that we
      require the LF following a `data` command, especially in the exact
      byte count format.  Technically we don't need this LF to parse
      the stream properly, but having it here does make the stream more
      readable to humans.  We can easily make the LF optional by peeking
      at the next byte available from the stream and pushing it back into
      the buffer if it's not LF.
      Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
      2c570cde
    • Teach fast-import to ignore lines starting with '#' · 401d53fa
      Committed by Shawn O. Pearce
      Several frontend developers have asked that some form of stream
      comments be permitted within a fast-import data stream.  This way
      they can include information from their own frontend program about
      where specific data was taken from in the source system, or about
      a decision that their frontend may have made while creating the
      fast-import data stream.
      
      This change introduces comments in the Bourne-shell/Tcl/Perl style.
      Lines starting with '#' are ignored, up to and including the LF.
      Unlike the three languages mentioned above, however, we do not look for
      and ignore leading whitespace.  This just simplifies the definition
      of the comment format and the code that parses them.
      
      To make comments work we had to stop using read_next_command() within
      cmd_data() and directly invoke read_line() during the inline variant
      of the function.  This is necessary to retain any lines of the
      input data that might otherwise look like a comment to fast-import.
      Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
      401d53fa
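
      An illustrative comment line (the revision reference is made up); the
      whole line is swallowed by fast-import and never reaches any object.

        # converted from svn r1234 of repository/trunk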
    • Use handy ALLOC_GROW macro in fast-import when possible · 31490074
      Committed by Shawn O. Pearce
      Instead of growing our buffer by hand during the inline variant of
      cmd_data() we can save a few lines of code and just use the nifty
      new ALLOC_GROW macro already available to us.
      Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
      31490074
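
      A sketch of the macro's use, assuming the ALLOC_GROW(ary, needed, alloc)
      definition from cache.h; the buffer names are illustrative.

        #include <string.h>

        static char *buffer;
        static size_t buffer_len, buffer_alloc;

        static void append_bytes(const char *data, size_t n)
        {
            ALLOC_GROW(buffer, buffer_len + n, buffer_alloc);    /* reallocates only when needed */
            memcpy(buffer + buffer_len, data, n);
            buffer_len += n;
        }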
    • Actually allow TAG_FIXUP branches in fast-import · ea08a6fd
      Committed by Shawn O. Pearce
      Michael Haggerty <mhagger@alum.mit.edu> noticed while debugging a
      Git backend for cvs2svn that fast-import was barfing when he tried
      to use "TAG_FIXUP" as a branch name for temporary work needed to
      clean up the tree prior to creating an annotated tag object.
      
      The reason we were rejecting the branch name was that check_ref_format()
      returns -2 when there are fewer than 2 '/' characters in the input
      name.  TAG_FIXUP has 0 '/' characters, but is technically just as
      valid of a ref as HEAD and MERGE_HEAD, so we really should permit it
      (and any other similar looking name) during import.
      
      New test cases have been added to make sure we still detect very
      wrong branch names (e.g. containing [ or starting with .) and yet
      still permit reasonable names (e.g. TAG_FIXUP).
      Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
      ea08a6fd
    • Fix whitespace in "Format of STDIN stream" of fast-import · c905e090
      Committed by Alex Riesen
      Something probably assumed that HT indentation is 4 characters.
      Signed-off-by: Alex Riesen <raa.lkml@gmail.com>
      Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
      c905e090
  22. 15 August 2007 (1 commit)
  23. 15 July 2007 (1 commit)
    • Teach fast-import to recursively copy files/directories · b6f3481b
      Committed by Shawn O. Pearce
      Some source material (e.g. Subversion dump files) performs directory
      renames by telling us the directory was copied, then deleted in the
      same revision.  This makes it difficult for a frontend to convert
      such data formats to a fast-import stream, as all the frontend has
      on hand is "Copy a/ to b/; Delete a/" with no details about what
      files are in a/, unless the frontend also kept track of all files.
      
      The new 'C' subcommand within a commit allows the frontend to make a
      recursive copy of one path to another path within the branch, without
      needing to keep track of the individual file paths.  The metadata
      copy is performed in memory efficiently, but is implemented as a
      copy-immediately operation, rather than copy-on-write.
      
      With this new 'C' subcommand frontends could obviously implement an
      'R' (rename) on their own as a combination of 'C' and 'D' (delete),
      but since we have already offered up 'R' in the past and it is a
      trivial thing to keep implemented I'm not going to deprecate it.
      Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
      b6f3481b
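
      An illustrative pair of file commands inside a commit (paths made up):
      copying everything under lib to src and then deleting lib gives the
      rename-like behaviour described above.

        C lib src
        D lib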
  24. 10 July 2007 (1 commit)
    • Support wholesale directory renames in fast-import · f39a946a
      Committed by Shawn O. Pearce
      Some source material (e.g. Subversion dump files) performs directory
      renames without telling us exactly which files in that subdirectory
      were moved.  This makes it hard for a frontend to convert such data
      formats to a fast-import stream, as all the frontend has on hand
      is "Rename a/ to b/" with no details about what files are in a/,
      unless the frontend also kept track of all files.
      
      The new 'R' subcommand within a commit allows the frontend to
      rename either a file or an entire subdirectory, without needing to
      know the object's SHA-1 or the specific files contained within it.
      The rename is performed as efficiently as possible internally,
      making it cheaper than a 'D'/'M' pair for a file rename.
      Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
      f39a946a
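
      An illustrative file command inside a commit (paths made up): a single
      'R' line renames a file or a whole subdirectory without spelling out
      its contents.

        R lib src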