1. 26 Sep, 2015 1 commit
    • J
      add git_path_buf helper function · bb3788ce
      Jeff King committed
      If you have a function that uses git_path a lot, but would
      prefer to avoid the static buffers, it's useful to keep a
      single scratch buffer locally and reuse it for each call.
      You used to be able to do this with git_snpath:
      
        char buf[PATH_MAX];
      
        foo(git_snpath(buf, sizeof(buf), "foo"));
        bar(git_snpath(buf, sizeof(buf), "bar"));
      
      but since 1a83c240, git_snpath has been replaced with
      strbuf_git_path. This is good, because it removes the
      arbitrary PATH_MAX limit. But using strbuf_git_path is more
      awkward for two reasons:
      
        1. It adds to the buffer, rather than replacing it. This
           is consistent with other strbuf functions, but makes
           reuse of a single buffer more tedious.
      
        2. It doesn't return the buffer, so you can't format
           as part of a function's arguments.
      
      The new git_path_buf solves both of these, so you can use it
      like:
      
        struct strbuf buf = STRBUF_INIT;
      
        foo(git_path_buf(&buf, "foo"));
        bar(git_path_buf(&buf, "bar"));
      
        strbuf_release(&buf);
      Signed-off-by: Jeff King <peff@peff.net>
      Signed-off-by: Junio C Hamano <gitster@pobox.com>
      bb3788ce
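A minimal sketch of the "reset, format, and return the buffer" idea behind the helper described above. It does not use git's strbuf; `struct scratch_buf`, `path_buf_sketch`, `scratch_buf_release`, and the hard-coded ".git" prefix are all hypothetical names and assumptions made for illustration.

```c
#include <assert.h>
#include <stdarg.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for a growable string buffer (not git's strbuf). */
struct scratch_buf {
	char *data;
	size_t cap;
};

#define SCRATCH_BUF_INIT { NULL, 0 }

/* Assumed repository directory, hard-coded for the sketch. */
static const char sketch_git_dir[] = ".git";

/*
 * Overwrite buf with "<git dir>/<formatted path>" and return the string,
 * so the call can be used directly as a function argument.
 * Returns NULL on formatting or allocation failure.
 */
const char *path_buf_sketch(struct scratch_buf *buf, const char *fmt, ...)
{
	va_list ap;
	int path_len;
	size_t prefix_len, need;

	assert(buf != NULL && fmt != NULL);

	/* First pass: measure the formatted path without writing it. */
	va_start(ap, fmt);
	path_len = vsnprintf(NULL, 0, fmt, ap);
	va_end(ap);
	if (path_len < 0)
		return NULL;

	prefix_len = strlen(sketch_git_dir) + 1; /* directory plus '/' */
	need = prefix_len + (size_t)path_len + 1;
	if (need > buf->cap) {
		char *grown = realloc(buf->data, need);
		if (!grown)
			return NULL;
		buf->data = grown;
		buf->cap = need;
	}

	/* Second pass: replace (not append to) the previous contents. */
	snprintf(buf->data, buf->cap, "%s/", sketch_git_dir);
	va_start(ap, fmt);
	vsnprintf(buf->data + prefix_len, buf->cap - prefix_len, fmt, ap);
	va_end(ap);
	return buf->data;
}

/* Free the buffer and return it to its initial state. */
void scratch_buf_release(struct scratch_buf *buf)
{
	free(buf->data);
	buf->data = NULL;
	buf->cap = 0;
}
```

Because each call starts from the beginning of the buffer, one scratch buffer can serve any number of consecutive calls, and the caller releases it once at the end.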
  2. 25 Aug, 2015 1 commit
    • J
      write_file(): drop "fatal" parameter · 12d6ce1d
      Junio C Hamano committed
      All callers except three passed 1 for the "fatal" parameter to ask
      this function to die upon error, but to a casual reader of the code,
      it was not at all obvious what that 1 meant.  Instead, split the
      function into two based on a common write_file_v() that takes the
      flag, introduce write_file_gently() as a new way to attempt creating
      a file without dying on error, and make the three callers call it.
      Signed-off-by: Junio C Hamano <gitster@pobox.com>
      12d6ce1d
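The split described above might look roughly like the following. This is only a sketch under assumptions: the `_sketch` names, the exact signatures, the error text, and the use of exit status 128 for the fatal case are illustrative and not taken from the patch itself.

```c
#include <assert.h>
#include <stdarg.h>
#include <stdio.h>
#include <stdlib.h>

/* Report an error; exit if the caller asked for fatal behavior. */
static int fail_sketch(int fatal, const char *what, const char *path)
{
	if (fatal) {
		fprintf(stderr, "fatal: could not %s '%s'\n", what, path);
		exit(128);
	}
	return -1;
}

/* Common worker: the only place that still sees the "fatal" flag. */
static int write_file_v_sketch(const char *path, int fatal,
			       const char *fmt, va_list ap)
{
	FILE *fp;
	int failed;

	assert(path != NULL && fmt != NULL);
	fp = fopen(path, "w");
	if (!fp)
		return fail_sketch(fatal, "open", path);

	failed = vfprintf(fp, fmt, ap) < 0;
	if (fclose(fp) != 0)
		failed = 1;
	return failed ? fail_sketch(fatal, "write to", path) : 0;
}

/* Dies on error, so call sites no longer pass a bare 1. */
void write_file_sketch(const char *path, const char *fmt, ...)
{
	va_list ap;

	va_start(ap, fmt);
	write_file_v_sketch(path, 1, fmt, ap);
	va_end(ap);
}

/* Returns -1 on error instead of dying. */
int write_file_gently_sketch(const char *path, const char *fmt, ...)
{
	va_list ap;
	int ret;

	va_start(ap, fmt);
	ret = write_file_v_sketch(path, 0, fmt, ap);
	va_end(ap);
	return ret;
}
```

With this shape the function name at each call site says whether the call can die, which was the readability problem with the bare 1.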
  3. 24 Aug, 2015 1 commit
    • J
      config: silence warnings for command names with invalid keys · 9e9de18f
      Jeff King committed
      When we are running the git command "foo", we may have to
      look up the config keys "pager.foo" and "alias.foo". These
      config schemes are mis-designed, as the command names can be
      anything, but the config syntax has some restrictions. For
      example:
      
        $ git foo_bar
        error: invalid key: pager.foo_bar
        error: invalid key: alias.foo_bar
        git: 'foo_bar' is not a git command. See 'git --help'.
      
      You cannot name an alias with an underscore. And if you have
      an external command with one, you cannot configure its
      pager.
      
      In the long run, we may develop a different config scheme
      for these features. But in the near term (and because we'll
      need to support the existing scheme indefinitely), we should
      at least squelch the error messages shown above.
      
      These errors come from git_config_parse_key. Ideally we
      would pass a "quiet" flag to the config machinery, but there
      are many layers between the pager code and the key parsing.
      Passing a flag through all of those would be an invasive
      change.
      
      Instead, let's provide a config function to report on
      whether a key is syntactically valid, and have the pager and
      alias code skip lookup for bogus keys. We can build this
      easily around the existing git_config_parse_key, with two
      minor modifications:
      
        1. We now handle a NULL store_key, to validate but not
           write out the normalized key.
      
        2. We accept a "quiet" flag to avoid writing to stderr.
           This doesn't need to be a full-blown public "flags"
           field, because we can make the existing implementation
           a static helper function, keeping the mess contained
           inside config.c.
      Signed-off-by: Jeff King <peff@peff.net>
      Signed-off-by: Junio C Hamano <gitster@pobox.com>
      9e9de18f
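To make the "validate quietly, then skip the lookup" idea concrete, here is a simplified sketch of a syntactic key check. The function name `config_key_looks_valid` and the exact rules are assumptions: section and variable names may contain only alphanumerics and '-', the variable name must start with a letter, and anything between the first and last dot is treated as a subsection that may hold any character except a newline. It approximates the behavior described above and is not the code in config.c.

```c
#include <assert.h>
#include <ctype.h>
#include <string.h>

/* Characters assumed to be allowed in section and variable names. */
static int is_key_char(char c)
{
	return isalnum((unsigned char)c) || c == '-';
}

/*
 * Return 1 if key has the shape "section[.subsection].variable",
 * 0 otherwise. Nothing is printed, so callers can probe silently.
 */
int config_key_looks_valid(const char *key)
{
	const char *first_dot, *last_dot, *p;

	assert(key != NULL);
	first_dot = strchr(key, '.');
	last_dot = strrchr(key, '.');

	if (!last_dot || last_dot == key)
		return 0; /* no section name */
	if (!last_dot[1])
		return 0; /* no variable name */

	for (p = key; p < first_dot; p++)
		if (!is_key_char(*p))
			return 0;
	for (p = first_dot; p < last_dot; p++)
		if (*p == '\n')
			return 0;
	if (!isalpha((unsigned char)last_dot[1]))
		return 0;
	for (p = last_dot + 1; *p; p++)
		if (!is_key_char(*p))
			return 0;
	return 1;
}
```

A caller in the pager or alias code would build "pager.<cmd>" or "alias.<cmd>" and look it up only when this check passes, so a command such as "foo_bar" produces no "invalid key" noise.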
  4. 20 Aug, 2015 1 commit
  5. 11 Aug, 2015 5 commits
    • J
      memoize common git-path "constant" files · f932729c
      Jeff King committed
      One of the most common uses of git_path() is to pass a
      constant, like git_path("MERGE_MSG"). This has two
      drawbacks:
      
        1. The return value is a static buffer, and the lifetime
           is dependent on other calls to git_path, etc.
      
        2. There's no compile-time checking of the pathname. This
           is OK for a one-off (after all, we have to spell it
           correctly at least once), but many of these constant
           strings appear throughout the code.
      
      This patch introduces a series of functions to "memoize"
      these strings, which are essentially globals for the
      lifetime of the program. We compute the value once, take
      ownership of the buffer, and return the cached value for
      subsequent calls.  cache.h provides a helper macro for
      defining these functions as one-liners, and defines a few
      common ones for global use.
      
      Using a macro is a little bit gross, but it does nicely
      document the purpose of the functions. If we need to touch
      them all later (e.g., because we learned how to change the
      git_dir variable at runtime, and need to invalidate all of
      the stored values), it will be much easier to have the
      complete list.
      
      Note that the shared-global functions have separate, manual
      declarations. We could do something clever with the macros
      (e.g., expand it to a declaration in some places, and a
      declaration _and_ a definition in path.c). But there aren't
      that many, and it's probably better to stay away from
      too-magical macros.
      
      Likewise, if we abandon the C preprocessor in favor of
      generating these with a script, we could get much fancier.
      E.g., normalizing "FOO/BAR-BAZ" into "git_path_foo_bar_baz".
      But the small amount of saved typing is probably not worth
      the resulting confusion to readers who want to grep for the
      function's definition.
      Signed-off-by: Jeff King <peff@peff.net>
      Signed-off-by: Junio C Hamano <gitster@pobox.com>
      f932729c
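The memoizing one-liner macro described above could be sketched like this. The macro name `MEMO_PATH_FUNC`, the helper `make_git_path_sketch`, the generated function names, and the fixed ".git" directory are hypothetical; the real macro in cache.h may differ in name and detail.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Build "<git dir>/<file>" in freshly allocated memory (".git" assumed). */
static char *make_git_path_sketch(const char *file)
{
	static const char git_dir[] = ".git";
	size_t len;
	char *out;

	assert(file != NULL);
	len = strlen(git_dir) + 1 + strlen(file) + 1;
	out = malloc(len);
	if (!out)
		abort();
	snprintf(out, len, "%s/%s", git_dir, file);
	return out;
}

/*
 * Define a function that computes its path on the first call, keeps
 * ownership of the buffer, and returns the cached value afterwards.
 */
#define MEMO_PATH_FUNC(func, filename) \
	const char *func(void) \
	{ \
		static char *cached; \
		if (!cached) \
			cached = make_git_path_sketch(filename); \
		return cached; \
	}

MEMO_PATH_FUNC(path_merge_msg_sketch, "MERGE_MSG")
MEMO_PATH_FUNC(path_merge_head_sketch, "MERGE_HEAD")
```

A misspelled call such as `path_merge_mgs_sketch()` now fails at compile time, and the returned pointer stays valid no matter how many other path helpers run in between.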
    • J
      path.c: drop git_path_submodule · 07e3070d
      Jeff King committed
      There are no callers of the slightly-dangerous static-buffer
      git_path_submodule left. Let's drop it.
      Signed-off-by: Jeff King <peff@peff.net>
      Signed-off-by: Junio C Hamano <gitster@pobox.com>
      07e3070d
    • J
      cache.h: complete set of git_path_submodule helpers · f5895fd3
      Jeff King committed
      The git_path function has "git_pathdup" and
      "strbuf_git_path" variants, but git_path_submodule only
      comes in the dangerous, static-buffer variant. That makes
      refactoring callers to use the safer functions hard (since
      they don't exist).
      
      Since we're already using a strbuf behind the scenes, it's
      easy to expose all three of these interfaces with thin
      wrappers.
      Signed-off-by: Jeff King <peff@peff.net>
      Signed-off-by: Junio C Hamano <gitster@pobox.com>
      f5895fd3
    • J
      cache.h: clarify documentation for git_path, et al · 69ddd231
      Jeff King committed
      The comment above these functions actually describes
      sha1_file_name, and comes from the very first revision of
      git. Commit 723c31fe (Add "git_path()" and "head_ref()"
      helper functions., 2005-07-05) added git_path, pushing the
      comment away from the function it describes; later commits
      added more functions in this block.
      
      Let's fix the comment to describe these related functions in
      more detail. Let's also make sure to point out their safer
      alternatives (and move those alternatives below, which makes
      more sense when reading the file).
      
      Note that we do not need to move the existing comment to
      sha1_file_name.  Commit d40d535b (sha1_file.c: document a
      bunch of functions defined in the file, 2014-02-21) already
      added a much more descriptive comment to it.
      Signed-off-by: Jeff King <peff@peff.net>
      Signed-off-by: Junio C Hamano <gitster@pobox.com>
      69ddd231
    • J
      sha1_file.c: rename move_temp_to_file() to finalize_object_file() · cb5add58
      Junio C Hamano committed
      Since 5a688fe4 ("core.sharedrepository = 0mode" should set, not
      loosen, 2009-03-25), we kept reminding ourselves:
      
          NEEDSWORK: this should be renamed to finalize_temp_file() as
          "moving" is only a part of what it does, when no patch between
          master to pu changes the call sites of this function.
      
      without doing anything about it.  Let's do so.
      
      The purpose of this function was not to move but to finalize.  The
      detail of the primary implementation of finalizing was to link the
      temporary file to its final name and then to unlink, which wasn't
      even "moving".  The alternative implementation did "move" by calling
      rename(2), which is a fun tangent.
      Signed-off-by: Junio C Hamano <gitster@pobox.com>
      cb5add58
  6. 30 Jun, 2015 2 commits
    • J
      introduce "format" date-mode · aa1462cc
      Jeff King committed
      This feeds the format directly to strftime. Besides being a
      little more flexible, the main advantage is that your system
      strftime may know more about your locale's preferred format
      (e.g., how to spell the days of the week).
      Signed-off-by: Jeff King <peff@peff.net>
      Signed-off-by: Junio C Hamano <gitster@pobox.com>
      aa1462cc
    • J
      convert "enum date_mode" into a struct · a5481a6c
      Jeff King committed
      In preparation for adding date modes that may carry extra
      information beyond the mode itself, this patch converts the
      date_mode enum into a struct.
      
      Most of the conversion is fairly straightforward; we pass
      the struct as a pointer and dereference the type field where
      necessary. Locations that declare a date_mode can use a "{}"
      constructor.  However, the tricky case is where we use the
      enum labels as constants, like:
      
        show_date(t, tz, DATE_NORMAL);
      
      Ideally we could say:
      
        show_date(t, tz, &{ DATE_NORMAL });
      
      but of course C does not allow that. Likewise, we cannot
      cast the constant to a struct, because we need to pass an
      actual address. Our options are basically:
      
        1. Manually add a "struct date_mode d = { DATE_NORMAL }"
           definition to each caller, and pass "&d". This makes
           the callers uglier, because they sometimes do not even
           have their own scope (e.g., they are inside a switch
           statement).
      
        2. Provide a pre-made global "date_normal" struct that can
           be passed by address. We'd also need "date_rfc2822",
           "date_iso8601", and so forth. But at least the ugliness
           is defined in one place.
      
        3. Provide a wrapper that generates the correct struct on
           the fly. The big downside is that we end up pointing to
           a single global, which makes our wrapper non-reentrant.
           But show_date is already not reentrant, so it does not
           matter.
      
      This patch implements 3, along with a minor macro to keep
      the size of the callers sane.
      Signed-off-by: Jeff King <peff@peff.net>
      Signed-off-by: Junio C Hamano <gitster@pobox.com>
      a5481a6c
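Option 3 above (a wrapper that fills in a single static struct, plus a small macro) can be sketched as follows. The identifiers `date_mode_from_type`, `DATE_MODE`, the enum labels other than DATE_NORMAL, and the `strftime_fmt` field are assumptions made for illustration; only the overall shape follows the description.

```c
#include <assert.h>
#include <stddef.h>

/* Enum labels; only DATE_NORMAL is named in the message above. */
enum date_mode_type {
	DATE_NORMAL = 0,
	DATE_RFC2822,
	DATE_ISO8601
};

/* The former enum, now a struct that can carry extra information. */
struct date_mode {
	enum date_mode_type type;
	const char *strftime_fmt; /* hypothetical extra field */
};

/*
 * Fill in one shared static struct and return its address.
 * Not reentrant: every call overwrites the previous result.
 */
struct date_mode *date_mode_from_type(enum date_mode_type type)
{
	static struct date_mode mode;

	assert((int)type >= DATE_NORMAL && (int)type <= DATE_ISO8601);
	mode.type = type;
	mode.strftime_fmt = NULL;
	return &mode;
}

/* Keeps call sites short: show_date(t, tz, DATE_MODE(NORMAL)). */
#define DATE_MODE(t) date_mode_from_type(DATE_##t)
```

The non-reentrancy is visible if two results are held at once: both pointers refer to the same storage, so the second call changes what the first pointer sees. That is acceptable only because the consumer is not reentrant either, as the message notes.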
  7. 23 Jun, 2015 2 commits
  8. 16 Jun, 2015 1 commit
  9. 13 Jun, 2015 1 commit
    • M
      Allow to control where the replace refs are looked for · 58d121b2
      Mike Hommey committed
      It can be useful to have grafts or replace refs for specific use-cases while
      keeping the default "view" of the repository pristine (or with a different
      set of grafts/replace refs).
      
      It is possible to use a different graft file with GIT_GRAFT_FILE, but while
      replace refs are more powerful, they don't have an equivalent override.
      
      Add a GIT_REPLACE_REF_BASE environment variable to control where git is
      going to look for replace refs.
      Signed-off-by: Mike Hommey <mh@glandium.org>
      Signed-off-by: Junio C Hamano <gitster@pobox.com>
      58d121b2
  10. 10 Jun, 2015 2 commits
    • E
      setup: add gentle version of read_gitfile · a93bedad
      Erik Elfström committed
      read_gitfile will die on most error cases. This makes it unsuitable
      for speculative calls. Extract the core logic and provide a gentle
      version that returns NULL on failure.
      
      The first usecase of the new gentle version will be to probe for
      submodules during git clean.
      Helped-by: Junio C Hamano <gitster@pobox.com>
      Helped-by: Jeff King <peff@peff.net>
      Signed-off-by: Erik Elfström <erik.elfstrom@gmail.com>
      Signed-off-by: Junio C Hamano <gitster@pobox.com>
      a93bedad
    • J
      index-pack: avoid excessive re-reading of pack directory · 0eeb077b
      Jeff King committed
      Since 45e8a748 (has_sha1_file: re-check pack directory before
      giving up, 2013-08-30), we spend extra effort for
      has_sha1_file to give the right answer when somebody else is
      repacking. Usually this effort does not matter, because
      after finding that the object does not exist, the next step
      is usually to die().
      
      However, some code paths make a large number of
      has_sha1_file checks which are _not_ expected to return 1.
      The collision test in index-pack.c is such a case. On a
      local system, this can cause a performance slowdown of
      around 5%. But on a system with high-latency system calls
      (like NFS), it can be much worse.
      
      This patch introduces a "quick" flag to has_sha1_file which
      callers can use when they would prefer high performance at
      the cost of false negatives during repacks. There may be
      other code paths that can use this, but the index-pack one
      is the most obviously critical, so we'll start with
      switching that one.
      Signed-off-by: Jeff King <peff@peff.net>
      Signed-off-by: Junio C Hamano <gitster@pobox.com>
      0eeb077b
  11. 06 Jun, 2015 1 commit
  12. 21 May, 2015 1 commit
  13. 20 May, 2015 1 commit
    • J
      copy.c: make copy_fd() report its status silently · 00b7cbfc
      Junio C Hamano committed
      When copy_fd() function encounters errors, it emits error messages
      itself, which makes it impossible for callers to take responsibility
      for reporting errors, especially when they want to ignore certain
      errors.
      
      Move the error reporting to its callers in preparation.
      
       - copy_file() and copy_file_with_time() by indirection get their
         own calls to error().
      
       - hold_lock_file_for_append(), when told to die on error, used to
         exit(128) relying on the error message from copy_fd(), but now it
         does its own die() instead.  Note that the callers that do not
         pass LOCK_DIE_ON_ERROR need to be adjusted for this change, but
         fortunately there is none ;-)
      
       - filter_buffer_or_fd() has its own error() already, in addition to
         the message from copy_fd(), so this will change the output but
         arguably in a better way.
      Signed-off-by: Junio C Hamano <gitster@pobox.com>
      00b7cbfc
  14. 07 May, 2015 3 commits
    • K
      sha1_file: support reading from a loose object of unknown type · 46f03448
      Karthik Nayak committed
      Update sha1_loose_object_info() to optionally allow it to read
      from a loose object file of unknown/bogus type; as the function
      usually returns the type of the object it read in the form of enum
      for known types, add an optional "typename" field to receive the
      name of the type in textual form and a flag to indicate the reading
      of a loose object file of unknown/bogus type.
      
      Add parse_sha1_header_extended() which acts as a wrapper around
      parse_sha1_header() allowing more information to be obtained.
      
      Add unpack_sha1_header_to_strbuf() to unpack sha1 headers of
      unknown/corrupt objects which have an unknown sha1 header size to
      a strbuf structure. This was written by Junio C Hamano but tested
      by me.
      Helped-by: Junio C Hamano <gitster@pobox.com>
      Helped-by: Eric Sunshine <sunshine@sunshineco.com>
      Helped-by: Ramsay Jones <ramsay@ramsay1.demon.co.uk>
      Helped-by: Jeff King <peff@peff.net>
      Signed-off-by: Karthik Nayak <karthik.188@gmail.com>
      Signed-off-by: Junio C Hamano <gitster@pobox.com>
      46f03448
    • P
      path.c: remove home_config_paths() · 846e5dfb
      Paul Tan committed
      home_config_paths() combines distinct functionality already implemented
      by expand_user_path() and xdg_config_home(), and it also hard-codes the
      path ~/.gitconfig, which makes it unsuitable to use for other home
      config file paths. Since its use will just add unnecessary complexity to
      the code, remove it.
      Signed-off-by: Paul Tan <pyokagan@gmail.com>
      Signed-off-by: Junio C Hamano <gitster@pobox.com>
      846e5dfb
    • P
      path.c: implement xdg_config_home() · ea19289b
      Paul Tan committed
      The XDG base dir spec[1] specifies that configuration files be stored in
      a subdirectory in $XDG_CONFIG_HOME. To construct such a configuration
      file path, home_config_paths() can be used. However, home_config_paths()
      combines distinct functionality:
      
      1. Retrieve the home git config file path ~/.gitconfig
      
      2. Construct the XDG config path of the file specified by `file`.
      
      This function was introduced in commit 21cf3227 ("read (but not write)
      from $XDG_CONFIG_HOME/git/config file").  While the intention of the
      function was to allow the home directory configuration file path and the
      xdg directory configuration file path to be retrieved with one function
      call, the hard-coding of the path ~/.gitconfig prevents it from being
      used for other configuration files. Furthermore, retrieving a file path
      relative to the user's home directory can be done with
      expand_user_path(). Hence, it can be seen that home_config_paths()
      introduces unnecessary complexity, especially if a user just wants to
      retrieve the xdg config file path.
      
      As such, implement a simpler function xdg_config_home() for constructing
      the XDG base dir spec configuration file path. This function, together
      with expand_user_path(), can replace all uses of home_config_paths().
      
      [1] http://standards.freedesktop.org/basedir-spec/basedir-spec-0.7.html
      
      Helped-by: Eric Sunshine <sunshine@sunshineco.com>
      Signed-off-by: Paul Tan <pyokagan@gmail.com>
      Signed-off-by: Junio C Hamano <gitster@pobox.com>
      ea19289b
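A rough sketch of how a path under the XDG configuration directory can be constructed, in the spirit of the simpler function described above. The names `xdg_config_path_from` and `xdg_config_path_sketch` and their signatures are hypothetical. The fallback to $HOME/.config when $XDG_CONFIG_HOME is unset or empty follows the XDG base directory specification, and the "git" subdirectory is taken from the $XDG_CONFIG_HOME/git/config path mentioned above.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/*
 * Build "<xdg_config_home>/git/<filename>", falling back to
 * "<home>/.config/git/<filename>" when xdg_config_home is unset or
 * empty. Returns a malloc'd string, or NULL if neither base is usable.
 */
char *xdg_config_path_from(const char *xdg_config_home, const char *home,
			   const char *filename)
{
	size_t len;
	char *out;

	assert(filename != NULL);
	if (xdg_config_home && *xdg_config_home) {
		len = strlen(xdg_config_home) + strlen("/git/") +
		      strlen(filename) + 1;
		out = malloc(len);
		if (out)
			snprintf(out, len, "%s/git/%s",
				 xdg_config_home, filename);
		return out;
	}
	if (home && *home) {
		len = strlen(home) + strlen("/.config/git/") +
		      strlen(filename) + 1;
		out = malloc(len);
		if (out)
			snprintf(out, len, "%s/.config/git/%s",
				 home, filename);
		return out;
	}
	return NULL;
}

/* Convenience wrapper that reads the process environment. */
char *xdg_config_path_sketch(const char *filename)
{
	return xdg_config_path_from(getenv("XDG_CONFIG_HOME"),
				    getenv("HOME"), filename);
}
```

Taking the two base directories as parameters keeps the construction logic independent of the environment; only the thin wrapper touches getenv().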
  15. 06 May, 2015 1 commit
    • E
      hash-object --literally: fix buffer overrun with extra-long object type · 0c3db67c
      Eric Sunshine committed
      "hash-object" learned in 5ba9a93b (hash-object: add --literally
      option, 2014-09-11) to allow crafting a corrupt/broken object of
      unknown type.
      
      When the user-provided type is particularly long, however, it can
      overflow the relatively small stack-based character array handed to
      write_sha1_file_prepare() by hash_sha1_file() and write_sha1_file(),
      leading to stack corruption (and crash).  Introduce a custom helper
      to allow arbitrarily long typenames just for "hash-object --literally".
      
      [jc: Eric's original used a strbuf in the more common codepaths, and
      I rewrote it to avoid penalizing the non-literally code. Bugs are mine]
      Signed-off-by: Eric Sunshine <sunshine@sunshineco.com>
      Signed-off-by: Junio C Hamano <gitster@pobox.com>
      0c3db67c
  16. 21 Apr, 2015 2 commits
    • J
      sha1_file: only freshen packs once per run · ee1c6c34
      Jeff King committed
      Since 33d4221c (write_sha1_file: freshen existing objects,
      2014-10-15), we update the mtime of existing objects that we
      would have written out (had they not existed). For the
      common case in which many objects are packed, we may update
      the mtime on a single packfile repeatedly. This can result
      in a noticeable performance problem if calling utime() is
      expensive (e.g., because your storage is on NFS).
      
      We can fix this by keeping a per-pack flag that lets us
      freshen only once per program invocation.
      
      An alternative would be to keep the packed_git.mtime flag up
      to date as we freshen, and freshen only once every N
      seconds. In practice, it's not worth the complexity. We are
      racing against prune expiration times here, which inherently
      must be set to accommodate reasonable program running times
      (because they really care about the time between an object
      being written and it becoming referenced, and the latter is
      typically the last step a program takes).
      Signed-off-by: Jeff King <peff@peff.net>
      Signed-off-by: Junio C Hamano <gitster@pobox.com>
      ee1c6c34
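The per-pack flag described above amounts to a one-bit guard around the expensive call. The sketch below is illustrative only: `struct pack_sketch`, `freshen_pack_once`, and the counting stub that stands in for utime() are hypothetical names, not the actual packed_git code.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, cut-down stand-in for a pack descriptor. */
struct pack_sketch {
	const char *path;
	unsigned freshened : 1; /* set once the mtime has been updated */
};

/* Callback that updates the file's mtime; returns 0 on success. */
typedef int (*touch_fn)(const char *path);

/* Stub standing in for utime(); it only counts how often it runs. */
int touch_calls_sketch;
int counting_touch_sketch(const char *path)
{
	(void)path;
	touch_calls_sketch++;
	return 0;
}

/*
 * Freshen the pack at most once per program run.
 * Returns 1 if the pack is (now) fresh, 0 if touching it failed.
 */
int freshen_pack_once(struct pack_sketch *pack, touch_fn touch)
{
	assert(pack != NULL && touch != NULL);
	if (pack->freshened)
		return 1; /* already done; skip the expensive call */
	if (touch(pack->path) != 0)
		return 0;
	pack->freshened = 1;
	return 1;
}
```

However many objects resolve to the same pack, the touch callback runs only on the first request, which is what removes the repeated utime() cost on slow storage.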
    • J
      reachable: only mark local objects as recent · 1385bb7b
      Jeff King committed
      When pruning and repacking a repository that has an
      alternate object store configured, we may traverse a large
      number of objects in the alternate. This serves no purpose,
      and may be expensive to do. A longer explanation is below.
      
      Commits d3038d22 and abcb8655 taught prune and pack-objects
      (respectively) to treat "recent" objects as tips for
      reachability, so that we keep whole chunks of history. They
      built on the object traversal in 660c889e (sha1_file: add
      for_each iterators for loose and packed objects,
      2014-10-15), which covers both local and alternate objects.
      
      In both cases, covering alternate objects is unnecessary, as
      both commands can only drop objects from the local
      repository. In the case of prune, we traverse only the local
      object directory. And in the case of repacking, while we may
      or may not include local objects in our pack, we will never
      reach into the alternate with "repack -d". The "-l" option
      is only a question of whether we are migrating objects from
      the alternate into our repository, or leaving them
      untouched.
      
      It is possible that we may drop an object that is depended
      upon by another object in the alternate. For example,
      imagine two repositories, A and B, with A pointing to B as
      an alternate. Now imagine a commit that is in B which
      references a tree that is only in A. Traversing from recent
      objects in B might prevent A from dropping that tree. But
      this case isn't worth covering. Repo B should take
      responsibility for its own objects. It would never have had
      the commit in the first place if it did not also have the
      tree, and assuming it is using the same "keep recent chunks
      of history" scheme, then it would itself keep the tree, as
      well.
      
      So checking the alternate objects is not worth doing, and
      comes with a significant performance impact. In both cases,
      we skip any recent objects that have already been marked
      SEEN (i.e., that we know are already reachable for prune, or
      included in the pack for a repack). So there is a slight
      waste of time in opening the alternate packs at all, only to
      notice that we have already considered each object. But much
      worse, the alternate repository may have a large number of
      objects that are not reachable from the local repository at
      all, and we end up adding them to the traversal.
      
      We can fix this by considering only local unseen objects.
      Signed-off-by: Jeff King <peff@peff.net>
      Signed-off-by: Junio C Hamano <gitster@pobox.com>
      1385bb7b
  17. 25 Mar, 2015 1 commit
    • J
      report_path_error(): move to dir.c · 777c55a6
      Junio C Hamano committed
      The expected call sequence is for the caller to use match_pathspec()
      repeatedly on a set of pathspecs, accumulating the "hits" in a
      separate array, and then call this function to diagnose a pathspec
      that never matched anything, as that can indicate a typo from the
      command line, e.g. "git commit Maekfile".
      
      Many builtin commands use this function from builtin/ls-files.c,
      which is not a very healthy arrangement.  ls-files might have been
      the first command to feel the need for such a helper, but the need
      is shared by everybody who uses the "match and then report" pattern.
      
      Move it to dir.c where match_pathspec() is defined.
      Signed-off-by: Junio C Hamano <gitster@pobox.com>
      777c55a6
  18. 21 Mar, 2015 1 commit
    • J
      refs: introduce a "ref paranoia" flag · 49672f26
      Jeff King committed
      Most operations that iterate over refs are happy to ignore
      broken cruft. However, some operations should be performed
      with knowledge of these broken refs, because it is better
      for the operation to choke on a missing object than it is to
      silently pretend that the ref did not exist (e.g., if we are
      computing the set of reachable tips in order to prune
      objects).
      
      These processes could just call for_each_rawref, except that
      ref iteration is often hidden behind other interfaces. For
      instance, for a destructive "repack -ad", we would have to
      inform "pack-objects" that we are destructive, and then it
      would in turn have to tell the revision code that our
      "--all" should include broken refs.
      
      It's much simpler to just set a global for "dangerous"
      operations that includes broken refs in all iterations.
      Signed-off-by: Jeff King <peff@peff.net>
      Signed-off-by: Junio C Hamano <gitster@pobox.com>
      49672f26
  19. 14 Mar, 2015 2 commits
    • B
      define utility functions for object IDs · aa1c6fdf
      brian m. carlson committed
      There are several utility functions (hashcmp and friends) that are used
      for comparing object IDs (SHA-1 values).  Using these functions, which
      take pointers to unsigned char, with struct object_id requires tiresome
      access to the sha1 member, which bloats code and violates the desired
      encapsulation.  Provide wrappers around these functions for struct
      object_id for neater, more maintainable code.  Use the new constants to
      avoid the hard-coded 20s and 40s throughout the original functions.
      
      These functions simply call the underlying pointer-to-unsigned-char
      versions to ensure that any performance improvements will be passed
      through to the new functions.
      Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
      Signed-off-by: Junio C Hamano <gitster@pobox.com>
      aa1c6fdf
    • B
      define a structure for object IDs · 5f7817c8
      brian m. carlson committed
      Many places throughout the code use "unsigned char [20]" to store object IDs
      (SHA-1 values).  This leads to lots of hardcoded numbers throughout the
      codebase.  It also leads to confusion about the purposes of a buffer.
      
      Introduce a structure for object IDs.  This allows us to obtain the benefits
      of compile-time checking for misuse.  The structure is expected to remain
      the same size and have the same alignment requirements on all known
      platforms, compared to the array of unsigned char, although this is not
      required for correctness.
      Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
      Signed-off-by: Junio C Hamano <gitster@pobox.com>
      5f7817c8
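The structure and the thin wrappers from the two commits above can be pictured with the following sketch. All identifiers carry a `_sketch` or `SKETCH_` marker because they are illustrative; in particular the member name `hash` and the constant names are assumptions, not quotations from the patches.

```c
#include <assert.h>
#include <string.h>

/* Named constants replace the hard-coded 20s and 40s. */
#define SKETCH_RAWSZ 20
#define SKETCH_HEXSZ (2 * SKETCH_RAWSZ)

/* A distinct type lets the compiler catch misuse of a plain buffer. */
struct object_id_sketch {
	unsigned char hash[SKETCH_RAWSZ];
};

/* Underlying pointer-to-unsigned-char comparison. */
int hashcmp_sketch(const unsigned char *a, const unsigned char *b)
{
	assert(a != NULL && b != NULL);
	return memcmp(a, b, SKETCH_RAWSZ);
}

/* Wrapper: callers no longer reach into the struct member themselves. */
int oidcmp_sketch(const struct object_id_sketch *a,
		  const struct object_id_sketch *b)
{
	return hashcmp_sketch(a->hash, b->hash);
}

/* Copy one object ID into another. */
void oidcpy_sketch(struct object_id_sketch *dst,
		   const struct object_id_sketch *src)
{
	memcpy(dst->hash, src->hash, SKETCH_RAWSZ);
}

/* Return 1 if every byte of the ID is zero. */
int is_null_oid_sketch(const struct object_id_sketch *oid)
{
	static const unsigned char null_hash[SKETCH_RAWSZ];

	return hashcmp_sketch(oid->hash, null_hash) == 0;
}
```

Because each wrapper simply forwards to the pointer-based version, any later speedup of the underlying comparison is inherited automatically, which is the design point made in the message.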
  20. 13 Mar, 2015 3 commits
  21. 18 Feb, 2015 1 commit
  22. 10 Feb, 2015 1 commit
  23. 06 Feb, 2015 1 commit
    • J
      decimal_width: avoid integer overflow · d306f3d3
      Jeff King committed
      The decimal_width function originally appeared in blame.c as
      "lineno_width", and was designed for calculating the
      print-width of small-ish integer values (line numbers in
      text files). In ec7ff5ba, it was made into a reusable
      function, and in dc801e71, we started using it to align
      diffstats.
      
      Binary files in a diffstat show byte counts rather than line
      numbers, meaning they can be quite large (e.g., consider
      adding or removing a 2GB file). decimal_width is not up to
      the challenge for two reasons:
      
        1. It takes the value as an "int", whereas large files may
           easily surpass this. The value may be truncated, in
           which case we will produce an incorrect value.
      
        2. It counts "up" by repeatedly multiplying another
           integer by 10 until it surpasses the value.  This can
           cause an infinite loop when the value is close to the
           largest representable integer.
      
           For example, consider using a 32-bit signed integer,
           and a value of 2,140,000,000 (just shy of 2^31-1).
           We will count up and eventually see that 1,000,000,000
           is smaller than our value. The next step would be to
           multiply by 10 and see that 10,000,000,000 is too
           large, ending the loop. But we can't represent that
           value, and we have signed overflow.
      
           This is technically undefined behavior, but a common
           behavior is to lose the high bits, in which case our
           iterator will certainly be less than the number. So
           we'll keep multiplying, overflow again, and so on.
      
      This patch changes the argument to a uintmax_t (the same
      type we use to store the diffstat information for binary
      files), and counts "down" by repeatedly dividing our value
      by 10.
      Signed-off-by: Jeff King <peff@peff.net>
      Signed-off-by: Junio C Hamano <gitster@pobox.com>
      d306f3d3
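The "count down by dividing" approach described above fits in a few lines. This is a sketch written from the description; the name `decimal_width_sketch` is hypothetical and the body may differ from the patched function.

```c
#include <stdint.h>

/*
 * Number of decimal digits needed to print number.
 * Dividing the value can never overflow, unlike multiplying a
 * counter by 10 until it exceeds the value.
 */
int decimal_width_sketch(uintmax_t number)
{
	int width;

	for (width = 1; number >= 10; width++)
		number /= 10;
	return width;
}
```

The loop stops once the remaining value is a single digit, so it terminates for every input, including values near the top of the type's range.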
  24. 18 Dec, 2014 3 commits
    • J
      read-cache: optionally disallow NTFS .git variants · 2b4c6efc
      Johannes Schindelin committed
      The point of disallowing ".git" in the index is that we
      would never want to accidentally overwrite files in the
      repository directory. But this means we need to respect the
      filesystem's idea of when two paths are equal. The prior
      commit added a helper to make such a comparison for NTFS
      and FAT32; let's use it in verify_path().
      
      We make this check optional for two reasons:
      
        1. It restricts the set of allowable filenames, which is
           unnecessary for people who are on neither NTFS nor FAT32.
           In practice this probably doesn't matter, though, as
           the restricted names are rather obscure and almost
           certainly would never come up in practice.
      
        2. It has a minor performance penalty for every path we
           insert into the index.
      
      This patch ties the check to the core.protectNTFS config
      option. Though this is expected to be most useful on Windows,
      we allow it to be set everywhere, as NTFS may be mounted on
      other platforms. The variable does default to on for Windows,
      though.
      Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
      Signed-off-by: Junio C Hamano <gitster@pobox.com>
      2b4c6efc
    • J
      path: add is_ntfs_dotgit() helper · 1d1d69bc
      Johannes Schindelin committed
      We do not allow paths with a ".git" component to be added to
      the index, as that would mean repository contents could
      overwrite our repository files. However, asking "is this
      path the same as .git" is not as simple as strcmp() on some
      filesystems.
      
      On NTFS (and FAT32), there exist so-called "short names" for
      backwards-compatibility: 8.3 compliant names that refer to the same files
      as their long names. As ".git" is not an 8.3 compliant name, a short name
      is generated automatically, typically "git~1".
      
      Depending on the Windows version, any combination of trailing spaces and
      periods is ignored, too, so that both "git~1." and ".git." still refer
      to the Git directory. The reason is that 8.3 stores file names shorter
      than 8 characters with trailing spaces. So for the short name it
      literally does not matter whether it is padded with spaces or is
      shorter than 8 characters: both are considered exactly the same name.
      
      The period is the separator between file name and file extension, and
      again, an empty extension consists just of spaces in 8.3 format. So
      technically, we would only need to handle the equivalent of this
      regex:
              (\.git {0,4}|git~1 {0,3})\. {0,3}
      
      However, there are indications that at least some Windows versions might
      be more lenient and accept arbitrary combinations of trailing spaces and
      periods and strip them out. So we're playing it real safe here. Besides,
      there can be little doubt about the intention behind using file names
      matching even the more lenient pattern specified above, therefore we
      should be fine with disallowing such patterns.
      
      Extra care is taken to catch names such as '.\\.git\\booh' because the
      backslash is marked as a directory separator only on Windows, and we want
      to use this new helper function also in fsck on other platforms.
      
      A big thank you goes to Ed Thomson and an unnamed Microsoft engineer for
      the detailed analysis performed to come up with the corresponding fixes
      for libgit2.
      
      This commit adds a function to detect whether a given file name can refer
      to the Git directory by mistake.
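
The lenient matching described above can be sketched roughly as follows. This is not the function added by the commit: `looks_like_ntfs_dotgit` and `only_spaces_and_periods` are hypothetical names, the sketch assumes that every component of the path should be checked, and it assumes case-insensitive comparison because NTFS and FAT32 are normally case-insensitive.

```c
#include <stddef.h>
#include <strings.h>   /* strncasecmp() (POSIX) */

/* Return 1 if name[skip..len) holds nothing but spaces and periods. */
static int only_spaces_and_periods(const char *name, size_t len, size_t skip)
{
	size_t i;

	if (len < skip)
		return 0;
	for (i = skip; i < len; i++)
		if (name[i] != ' ' && name[i] != '.')
			return 0;
	return 1;
}

/*
 * Hypothetical sketch: return 1 if any component of `path` could
 * refer to ".git" on NTFS or FAT32. Both '/' and '\\' are treated
 * as separators so that names like ".\\.git\\booh" are caught on
 * every platform.
 */
int looks_like_ntfs_dotgit(const char *path)
{
	size_t len;

	for (;;) {
		/* Find the end of the current component. */
		for (len = 0; path[len] && path[len] != '/' &&
			      path[len] != '\\'; len++)
			; /* nothing */

		/* ".git" or "git~1" followed only by spaces and periods. */
		if (only_spaces_and_periods(path, len, 4) &&
		    !strncasecmp(path, ".git", 4))
			return 1;
		if (only_spaces_and_periods(path, len, 5) &&
		    !strncasecmp(path, "git~1", 5))
			return 1;

		if (!path[len])
			return 0;
		path += len + 1;	/* skip the separator */
	}
}
```

Under these assumptions, ".git. . ", "GIT~1." and ".\\.git\\booh" are all flagged, while ".gitignore" is not, since its suffix contains characters other than spaces and periods.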
      Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
      Signed-off-by: Junio C Hamano <gitster@pobox.com>
      1d1d69bc
    • J
      read-cache: optionally disallow HFS+ .git variants · a42643aa
      Jeff King committed
      The point of disallowing ".git" in the index is that we
      would never want to accidentally overwrite files in the
      repository directory. But this means we need to respect the
      filesystem's idea of when two paths are equal. The prior
      commit added a helper to make such a comparison for HFS+;
      let's use it in verify_path.
      
      We make this check optional for two reasons:
      
        1. It restricts the set of allowable filenames, which is
           unnecessary for people who are not on HFS+. This
           probably doesn't matter, though, as the restricted
           names are rather obscure and would almost certainly
           never come up in practice.
      
        2. It has a minor performance penalty for every path we
           insert into the index.
      
      This patch ties the check to the core.protectHFS config
      option. Though this is expected to be most useful on OS X,
      we allow it to be set everywhere, as HFS+ may be mounted on
      other platforms. The variable does default to on for OS X,
      though.
      Signed-off-by: Jeff King <peff@peff.net>
      Signed-off-by: Junio C Hamano <gitster@pobox.com>
      a42643aa
  25. 05 Dec, 2014 1 commit
    • D
      compat: convert modes to use portable file type values · d543d9c0
      David Michael committed
      This adds simple wrapper functions around calls to stat(), fstat(),
      and lstat() that translate the operating system's native file type
      bits to those used by most operating systems.  It also rewrites the
      S_IF* macros to the common values, so all file type processing is
      performed using the translated modes.  This makes projects portable
      across operating systems that use different file type definitions.
      
      Only the file type bits may be affected by these compatibility
      functions; the file permission bits are assumed to be 07777 and are
      passed through unchanged.
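
A wrapper of this kind might look like the sketch below. The names `mode_native_to_portable`, `portable_stat` and the `PORTABLE_IF*` constants are hypothetical; the octal values are the file type values used by most Unix-like systems, and only stat() is wrapped here for brevity.

```c
#include <sys/types.h>
#include <sys/stat.h>

/* Common file type values, spelled out explicitly (hypothetical names). */
#define PORTABLE_IFMT   0170000
#define PORTABLE_IFREG  0100000
#define PORTABLE_IFDIR  0040000
#define PORTABLE_IFLNK  0120000
#define PORTABLE_IFBLK  0060000
#define PORTABLE_IFCHR  0020000
#define PORTABLE_IFIFO  0010000
#define PORTABLE_IFSOCK 0140000

/* Translate the native file type bits; pass permission bits through. */
mode_t mode_native_to_portable(mode_t native)
{
	mode_t perm = native & 07777;

	if (S_ISREG(native))
		return PORTABLE_IFREG | perm;
	if (S_ISDIR(native))
		return PORTABLE_IFDIR | perm;
#ifdef S_ISLNK
	if (S_ISLNK(native))
		return PORTABLE_IFLNK | perm;
#endif
	if (S_ISBLK(native))
		return PORTABLE_IFBLK | perm;
	if (S_ISCHR(native))
		return PORTABLE_IFCHR | perm;
	if (S_ISFIFO(native))
		return PORTABLE_IFIFO | perm;
#ifdef S_ISSOCK
	if (S_ISSOCK(native))
		return PORTABLE_IFSOCK | perm;
#endif
	return perm;	/* unknown type: keep only the permission bits */
}

/* Call stat() and rewrite st_mode to use the common values. */
int portable_stat(const char *path, struct stat *st)
{
	int rc = stat(path, st);

	if (rc == 0)
		st->st_mode = mode_native_to_portable(st->st_mode);
	return rc;
}
```

Callers would then compare `st_mode & PORTABLE_IFMT` against the common constants instead of relying on the native `S_IF*` definitions.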
      Signed-off-by: David Michael <fedora.dm0@gmail.com>
      Signed-off-by: Junio C Hamano <gitster@pobox.com>
      d543d9c0