- 22 Jun 2017, 3 commits
-
-
By Jonathan Tan
In a subsequent patch, packed_object_info() will be modified to use the delta base cache, so move the relevant code to before packed_object_info(). Signed-off-by: Jonathan Tan <jonathantanmy@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
-
By Jonathan Tan
The LOOKUP_REPLACE_OBJECT flag controls whether the lookup_replace_object() function is invoked by sha1_object_info_extended(), read_sha1_file_extended(), and lookup_replace_object_extended(), but it is not immediately clear which functions accept that flag. Therefore restrict this flag to only sha1_object_info_extended(), renaming it appropriately to OBJECT_INFO_LOOKUP_REPLACE and adding some documentation. Update read_sha1_file_extended() to have a boolean parameter instead, and delete lookup_replace_object_extended(). parse_sha1_header() also passes this flag to parse_sha1_header_extended() since commit 46f03448 ("sha1_file: support reading from a loose object of unknown type", 2015-05-03), but that has had no effect since that commit. Therefore this patch also removes this flag from that invocation. Signed-off-by: Jonathan Tan <jonathantanmy@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
-
By Jonathan Tan
The LOOKUP_UNKNOWN_OBJECT flag was introduced in commit 46f03448 ("sha1_file: support reading from a loose object of unknown type", 2015-05-03) in order to support a feature in cat-file subsequently introduced in commit 39e4ae38 ("cat-file: teach cat-file a '--allow-unknown-type' option", 2015-05-03). Despite its name and location in cache.h, this flag is used neither in read_sha1_file_extended() nor in any of the lookup functions, but only in sha1_object_info_extended(). Therefore rename this flag to OBJECT_INFO_ALLOW_UNKNOWN_TYPE, taking the name of the cat-file flag that invokes this feature, and move it closer to the declaration of sha1_object_info_extended(). Also add documentation for this flag. OBJECT_INFO_ALLOW_UNKNOWN_TYPE is defined to 2, not 1, to avoid conflicting with LOOKUP_REPLACE_OBJECT. Avoidance of this conflict is necessary because sha1_object_info_extended() supports both flags. Signed-off-by: Jonathan Tan <jonathantanmy@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
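A hedged sketch of how the two renamed flags might sit next to the declaration they apply to (the bit values come straight from the message above; the exact comments and placement in cache.h are assumptions):

```c
/*
 * Flags for sha1_object_info_extended() only; they are distinct bits
 * because that function accepts both at the same time.
 */
#define OBJECT_INFO_LOOKUP_REPLACE     1  /* resolve object replacements first */
#define OBJECT_INFO_ALLOW_UNKNOWN_TYPE 2  /* tolerate a type name we do not recognize */

extern int sha1_object_info_extended(const unsigned char *sha1,
                                     struct object_info *oi, unsigned flags);
```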
-
- 16 Jun 2017, 1 commit
-
-
By Jonathan Tan
In commit 46f03448 ("sha1_file: support reading from a loose object of unknown type", 2015-05-06), "struct object_info" gained a "typename" field that could represent a type name from a loose object file, whether valid or invalid, as opposed to the existing "typep" which could only represent valid types. Some relatively complex manipulations were added to avoid breaking packed_object_info() without modifying it, but it is much easier to just teach packed_object_info() about the new field. Therefore, teach packed_object_info() as described above. Signed-off-by: Jonathan Tan <jonathantanmy@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
-
- 17 Apr 2017, 1 commit
-
-
By Sebastian Schuberth
Signed-off-by: Sebastian Schuberth <sschuberth@gmail.com> Acked-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
-
- 14 Apr 2017, 1 commit
-
-
By Jonathan Nieder
Most callers of this function already require that they are in a git repository, but there is an exception: "git apply" uses has_sha1_file to avoid work if the result of applying a binary patch is already present in the repository. When run outside any repository, this produces an error: "fatal: BUG: setup_git_env called without repository". Signed-off-by: Jonathan Nieder <jrnieder@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
-
- 02 Apr 2017, 1 commit
-
-
By Jeff King
When sha1_loose_object_info() finds that a loose object file cannot be stat(2)ed or mmap(2)ed, it returns -1 to signal an error to the caller. However, if it finds that the loose object file is corrupt and the object data cannot be used from it, it stuffs OBJ_BAD into the "type" field of the object_info, but returns zero (i.e., success), which can confuse callers. This is due to 052fe5ea (sha1_loose_object_info: make type lookup optional, 2013-07-12), which switched the return to a strict success/error, rather than returning the type (but botched the return). Callers of regular sha1_object_info() don't notice the difference, as that function returns the type (which is OBJ_BAD in this case). However, direct callers of sha1_object_info_extended() see the function return success, but without setting any meaningful values in the object_info struct, leading them to access potentially uninitialized memory. The easiest way to see the bug is via "cat-file -s", which will happily ignore the corruption and report whatever value happened to be in the "size" variable. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
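A minimal sketch of the shape of such a fix (the condition and variable names are illustrative, not the actual code): the point is that a corrupt loose object must fail the call through the return value, not only through a sentinel type.

```c
/* sketch: error path inside sha1_loose_object_info() */
if (object_is_corrupt) {                /* hypothetical condition */
        if (oi->typep)
                *oi->typep = OBJ_BAD;   /* sentinel kept for sha1_object_info() callers */
        return -1;                      /* previously "return 0", which read as success */
}
```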
-
- 31 Mar 2017, 1 commit
-
-
By Jeff King
These calls to snprintf should always succeed, because their input is small and fixed. Let's use xsnprintf to make sure this is the case (and to make auditing for actual truncation easier).

These could be candidates for turning into heap buffers, but they fall into a few broad categories that make it not worth doing:

- formatting single numbers is simple enough that we can see the result should fit

- the size of a sha1 is likewise well-known, and I didn't want to cause unnecessary conflicts with the ongoing process to convert these constants to GIT_MAX_HEXSZ

- the interface for curl_errorstr is dictated by curl

Signed-off-by: Jeff King <peff@peff.net>
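For illustration, a hedged sketch of the conversion pattern (the buffer and format here are examples, not the actual call sites): xsnprintf() dies on truncation instead of silently clipping the result.

```c
char hex[GIT_SHA1_HEXSZ + 1];

/* before: truncation would pass silently if the buffer were ever too small */
snprintf(hex, sizeof(hex), "%s", sha1_to_hex(sha1));

/* after: xsnprintf() reports a BUG if the formatted result does not fit */
xsnprintf(hex, sizeof(hex), "%s", sha1_to_hex(sha1));
```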
-
- 27 Mar 2017, 2 commits
-
-
By brian m. carlson
Since we will likely be introducing a new hash function at some point, and that hash function might be longer than 20 bytes, use the constant GIT_MAX_RAWSZ, which is designed to be suitable for allocations, instead of GIT_SHA1_RAWSZ. This will ease the transition down the line by distinguishing between places where we need to allocate memory suitable for the largest hash from those where we need to handle the current hash. Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
-
By brian m. carlson
Since we will likely be introducing a new hash function at some point, and that hash function might be longer than 40 hex characters, use the constant GIT_MAX_HEXSZ, which is designed to be suitable for allocations, instead of GIT_SHA1_HEXSZ. This will ease the transition down the line by distinguishing between places where we need to allocate memory suitable for the largest hash from those where we need to handle the current hash. Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
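A small sketch of the distinction these two commits draw (variable names assumed): allocation sites use the MAX constant, while logic tied to the hash currently in use keeps the SHA-1 constant.

```c
/* allocation: must be able to hold the largest supported hash */
char hex[GIT_MAX_HEXSZ + 1];

/* logic bound to the hash currently in use stays on the SHA-1 constant */
memcpy(hex, sha1_to_hex(sha1), GIT_SHA1_HEXSZ + 1);
```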
-
- 17 Mar 2017, 1 commit
-
-
By Jeff King
We provide sha1_pack_name() and sha1_pack_index_name(), but the more generic form (which takes its own strbuf and an arbitrary extension) is only used to implement the other two. Let's make it available, but clean up a few things:

1. Name it odb_pack_name(), as the original sha1_get_pack_name() is long but not all that descriptive.

2. Switch the strbuf argument to the beginning, so that it matches similar path-building functions like git_path_buf().

3. Clean up the out-dated docstring and move it to the public declaration.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
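A hedged sketch of the resulting interface and how the two existing helpers could be expressed on top of it (the exact prototypes are assumptions based on the description above):

```c
/* resets buf, formats "<objdir>/pack/pack-<sha1>.<ext>" into it, returns buf->buf */
extern char *odb_pack_name(struct strbuf *buf, const unsigned char *sha1, const char *ext);

char *sha1_pack_name(const unsigned char *sha1)
{
        static struct strbuf buf = STRBUF_INIT;
        return odb_pack_name(&buf, sha1, "pack");
}

char *sha1_pack_index_name(const unsigned char *sha1)
{
        static struct strbuf buf = STRBUF_INIT;
        return odb_pack_name(&buf, sha1, "idx");
}
```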
-
- 02 Mar 2017, 1 commit
-
-
By Christian Couder
This function will be used in an upcoming commit, so let's make it available globally. Signed-off-by: Christian Couder <chriscool@tuxfamily.org> Signed-off-by: Junio C Hamano <gitster@pobox.com>
-
- 28 Feb 2017, 1 commit
-
-
By René Scharfe
If a pack entry that's used as a delta base is corrupt, unpack_entry() marks it as unusable and then searches for the object again in the hope that it can be found in another pack or in a loose file. The memory for this external base object is never released. Free it after use. Signed-off-by: Rene Scharfe <l.s.r@web.de> Reviewed-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
-
- 23 Feb 2017, 2 commits
-
-
By brian m. carlson
Convert each_loose_object_fn and each_packed_object_fn to take a pointer to struct object_id. Update the various callbacks. Convert several 40-based constants to use GIT_SHA1_HEXSZ. Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
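A hedged sketch of what the converted callback signatures would look like (the parameter lists are assumptions; the point is the switch from a raw `const unsigned char *` hash to `const struct object_id *`):

```c
typedef int each_loose_object_fn(const struct object_id *oid,
                                 const char *path,
                                 void *data);

typedef int each_packed_object_fn(const struct object_id *oid,
                                  struct packed_git *pack,
                                  uint32_t pos,
                                  void *data);
```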
-
By brian m. carlson
There are places in the code where we would like to provide a struct object_id *, yet read the hash directly from the pack. Provide an nth_packed_object_oid function that is similar to the nth_packed_object_sha1 function. In order to avoid a potentially invalid cast, nth_packed_object_oid provides a variable into which to store the value, which it returns on success; on error, it returns NULL, as nth_packed_object_sha1 does. Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
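A hedged sketch of the described interface (body assumed): the caller supplies the storage, and the function either fills it and returns it, or returns NULL on error, mirroring nth_packed_object_sha1().

```c
const struct object_id *nth_packed_object_oid(struct object_id *oid,
                                              struct packed_git *p,
                                              uint32_t n)
{
        const unsigned char *hash = nth_packed_object_sha1(p, n);
        if (!hash)
                return NULL;
        hashcpy(oid->hash, hash);       /* copy into caller-provided storage */
        return oid;
}
```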
-
- 20 Jan 2017, 1 commit
-
-
By Jeff King
On Thu, Jan 19, 2017 at 03:03:46PM +0100, Ulrich Spörlein wrote:

> > I suspect the patch below may fix things for you. It works around it by walking over the lru list (either is fine, as they both contain all entries, and since we're clearing everything, we don't care about the order).
>
> Confirmed. With the patch applied, I can import the whole 55G in one go without any crashes or aborts. Thanks much!

Thanks. Here it is rolled up with a commit message.

-- >8 --
Subject: clear_delta_base_cache(): don't modify hashmap while iterating

Removing entries while iterating causes fast-import to access an already-freed `struct packed_git`, leading to various confusing errors.

What happens is that clear_delta_base_cache() drops the whole contents of the cache by iterating over the hashmap, calling release_delta_base_cache() on each entry. That function removes the item from the hashmap. The hashmap code may then shrink the table, but the hashmap_iter struct retains an offset from the old table. As a result, the next call to hashmap_iter_next() may claim that the iteration is done, even though some items haven't been visited.

The only caller of clear_delta_base_cache() is fast-import, which wants to clear the cache because it is discarding the packed_git struct for its temporary pack. So by failing to remove all of the entries, we still have references to the freed packed_git.

To make things even more confusing, this doesn't seem to trigger with the test suite, because it depends on complexities like the size of the hash table, which entries got cleared, whether we try to access them before they're evicted from the cache, etc. So I've been able to identify the problem with large imports like freebsd's svn import, or a fast-export of linux.git. But nothing that would be reasonable to run as part of the normal test suite.

We can fix this easily by iterating over the lru linked list instead of the hashmap. They both contain the same entries, and we can use the "safe" variant of the list iterator, which exists for exactly this case.

Let's also add a warning to the hashmap API documentation to reduce the chances of getting bit by this again.

Reported-by: Ulrich Spörlein <uqs@freebsd.org>
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
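A hedged sketch of the safer iteration the message describes, using the "safe" list-walking variant so entries can be released while walking (list head and field names are assumptions):

```c
static void clear_delta_base_cache(void)
{
        struct list_head *lru, *tmp;

        /* walk the lru list, not the hashmap, so removal cannot confuse the iterator */
        list_for_each_safe(lru, tmp, &delta_base_cache_lru) {
                struct delta_base_cache_entry *entry =
                        list_entry(lru, struct delta_base_cache_entry, lru);
                release_delta_base_cache(entry);
        }
}
```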
-
- 16 Jan 2017, 3 commits
-
-
By Jeff King
When a loose tree or commit is read by fsck (or any git program), unpack_sha1_rest() checks whether there is extra cruft at the end of the object file, after the zlib data. Blobs that are streamed, however, do not have this check. For normal git operations, it's not a big deal. We know the sha1 and size checked out, so we have the object bytes we wanted. The trailing garbage doesn't affect what we're trying to do. But since the point of fsck is to find corruption or other problems, it should be more thorough. This patch teaches its loose-sha1 reader to detect extra bytes after the zlib stream and complain. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
-
By Jeff King
It's surprisingly hard to ask the sha1_file code to open a _specific_ incarnation of a loose object. Most of the functions take a sha1, and loop over the various object types (packed versus loose) and locations (local versus alternates) at a low level. However, some tools like fsck need to look at a specific file. This patch gives them a function they can use to open the loose object at a given path. The implementation unfortunately ends up repeating bits of related functions, but there's not a good way around it without some major refactoring of the whole sha1_file stack. We need to mmap the specific file, then partially read the zlib stream to know whether we're streaming or not, and then finally either stream it or copy the data to a buffer. We can do that by assembling some of the more arcane internal sha1_file functions, but we end up having to essentially reimplement unpack_sha1_file(), along with the streaming bits of check_sha1_signature(). Still, most of the ugliness is contained in the new function, and the interface is clean enough that it may be reusable (though it seems unlikely anything but git-fsck would care about opening a specific file). Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
-
By Jeff King
When we fail to open a corrupt loose object, we report an error and mention the filename via sha1_file_name(). However, that function will always give us a path in the local repository, whereas the corrupt object may have come from an alternate. The result is a very misleading error message. Teach the open_sha1_file() and stat_sha1_file() helpers to pass back the path they found, so that we can report it correctly. Note that the pointers we return go to static storage (e.g., from sha1_file_name()), which is slightly dangerous. However, these helpers are static local helpers, and the names are used for immediately generating error messages. The simplicity is an acceptable tradeoff for the danger. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
-
- 08 Jan 2017, 3 commits
-
-
By Michael Haggerty
Add a function that tries to create a file and any containing directories in a way that is robust against races with other processes that might be cleaning up empty directories at the same time. The actual file creation is done by a callback function, which, if it fails, should set errno to EISDIR or ENOENT according to the convention of open(). raceproof_create_file() detects such failures, and respectively either tries to delete empty directories that might be in the way of the file or tries to create the containing directories. Then it retries the callback function. This function is not yet used. Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu> Reviewed-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
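A hedged sketch of the calling convention described above (the callback name, its exact signature, and the paths are illustrative): the callback maps failures onto open(2)-style errno values so raceproof_create_file() knows whether to remove an empty directory in the way or to create missing parent directories before retrying.

```c
#include <fcntl.h>

/* hypothetical callback: follows open(2)'s errno convention (ENOENT, EISDIR, ...) */
static int create_my_file(const char *path, void *cb_data)
{
        int *fd = cb_data;
        *fd = open(path, O_WRONLY | O_CREAT | O_EXCL, 0666);
        return *fd < 0 ? -1 : 0;
}

/* usage sketch: directories are cleaned up or created as needed, then the callback retried */
static int hold_new_file(const char *path)
{
        int fd = -1;
        if (raceproof_create_file(path, create_my_file, &fd) < 0)
                die_errno("unable to create '%s'", path);
        return fd;
}
```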
-
By Michael Haggerty
The exit path for SCLD_EXISTS wasn't setting errno, which some callers use to generate error messages for the user. Fix the problem and document that the function sets errno correctly to help avoid similar regressions in the future. Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu> Reviewed-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
-
By Michael Haggerty
Some implementations of free() change errno (even though they shouldn't): https://sourceware.org/bugzilla/show_bug.cgi?id=17924 So preserve the errno from safe_create_leading_directories() across the call to free(). Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu> Reviewed-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
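The pattern is the standard save/restore dance; a minimal sketch of how the wrapper could look (exact function body assumed):

```c
int safe_create_leading_directories_const(const char *path)
{
        char *buf = xstrdup(path);
        int result = safe_create_leading_directories(buf);
        int saved_errno = errno;        /* free() may clobber errno on some platforms */

        free(buf);
        errno = saved_errno;            /* callers can still report the real failure */
        return result;
}
```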
-
- 13 Dec 2016, 2 commits
-
-
By Brandon Williams
Migrate callers of real_path() who duplicate the return value to use real_pathdup or strbuf_realpath. Signed-off-by: Brandon Williams <bmwill@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
-
By Jeff King
We read lists of alternates from objects/info/alternates files (delimited by newline), as well as from the GIT_ALTERNATE_OBJECT_DIRECTORIES environment variable (delimited by colon or semi-colon, depending on the platform). There's no mechanism for quoting the delimiters, so it's impossible to specify an alternate path that contains a colon in the environment, or one that contains a newline in a file. We've lived with that restriction for ages because both alternates and filenames with colons are relatively rare, and it's only a problem when the two meet. But since 722ff7f8 (receive-pack: quarantine objects until pre-receive accepts, 2016-10-03), which builds on the alternates system, every push causes the receiver to set GIT_ALTERNATE_OBJECT_DIRECTORIES internally.

It would be convenient to have some way to quote the delimiter so that we can represent arbitrary paths. The simplest thing would be an escape character before a quoted delimiter (e.g., "\:" as a literal colon). But that creates a backwards compatibility problem: any path which uses that escape character is now broken, and we've just shifted the problem. We could choose an unlikely escape character (e.g., something from the non-printable ASCII range), but that's awkward to use.

Instead, let's treat names as unquoted unless they begin with a double-quote, in which case they are interpreted via our usual C-style quoting rules. This also breaks backwards-compatibility, but in a smaller way: it only matters if your file has a double-quote as the very _first_ character in the path (whereas an escape character is a problem anywhere in the path). It's also consistent with many other parts of git, which accept either a bare pathname or a double-quoted one, and the sender can choose to quote or not as required.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
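A hedged sketch of the dispatch rule described above, reusing git's existing C-style unquoting helper (the wrapper function is illustrative; the assumption is that quote.c's unquote_c_style() is the mechanism behind "our usual C-style quoting rules"):

```c
/* sketch: turn one alternates entry into a usable path */
static int parse_alt_entry(struct strbuf *out, const char *entry)
{
        if (*entry == '"')
                /* C-style quoted: sequences like \n, \t, \042 are interpreted */
                return unquote_c_style(out, entry, NULL);
        strbuf_addstr(out, entry);      /* unquoted: taken literally */
        return 0;
}
```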
-
- 09 Nov 2016, 1 commit
-
-
By Jeff King
Commit 670c359d (link_alt_odb_entry: handle normalize_path errors, 2016-10-03) regressed the handling of relative paths in the GIT_ALTERNATE_OBJECT_DIRECTORIES variable. It's not entirely clear this was ever meant to work, but it _has_ worked for several years, so this commit restores the original behavior.

When we get a path in GIT_ALTERNATE_OBJECT_DIRECTORIES, we add the path to the list of alternate object directories as if it were found in objects/info/alternates, but with one difference: we do not provide the link_alt_odb_entry() function with a base for relative paths. That function doesn't turn it into an absolute path, and we end up feeding the relative path to the strbuf_normalize_path() function. Most relative paths break out of the top-level directory (e.g., "../foo.git/objects"), and thus normalizing fails. Prior to 670c359d, we simply ignored the error, and due to the way normalize_path_copy() was implemented it happened to return the original path in this case. We then accessed the alternate objects using this relative path.

By storing the relative path in the alt_odb list, the path is relative to wherever we happen to be at the time we do an object lookup. That means we look from $GIT_DIR in a bare repository, and from the top of the worktree in a non-bare repository. If this were being designed from scratch, it would make sense to pick a stable location (probably $GIT_DIR, or even the object directory) and use that as the relative base, turning the result into an absolute path. However, given the history, at this point the minimal fix is to match the pre-670c359d behavior.

We can do this simply by ignoring the error when we have no relative base and using the original value (which we now reliably have, thanks to strbuf_normalize_path()). That still leaves us with a relative path that foils our duplicate detection, and may act strangely if we ever chdir() later in the process. We could solve that by storing an absolute path based on getcwd(). That may be a good future direction; for now we'll do just the minimum to fix the regression.

The new t5615 script demonstrates the fix in its final three tests. Since we didn't have any tests of the alternates environment variable at all, it also adds some tests of absolute paths.

Reported-by: Bryan Turner <bturner@atlassian.com>
Signed-off-by: Jeff King <peff@peff.net>
-
- 03 Nov 2016, 2 commits
-
-
By Junio C Hamano
When we open object files, we try to do so with O_NOATIME. This dates back to 144bde78 (Use O_NOATIME when opening the sha1 files., 2005-04-23), which is an optimization to avoid creating a bunch of dirty inodes when we're accessing many objects. But a few things have changed since then:

1. In June 2005, git learned about packfiles, which means we would do a lot fewer atime updates (rather than one per object access, we'd generally get one per packfile).

2. In late 2006, Linux learned about "relatime", which is generally the default on modern installs. So performance around atime updates is a non-issue there these days.

All the world isn't Linux, but as it turns out, Linux is the only platform to implement O_NOATIME in the first place. So it's very unlikely that this code is helping anybody these days.

Helped-by: Jeff King <peff@peff.net> [jc: took idea and log message from peff]
Signed-off-by: Junio C Hamano <gitster@pobox.com>
-
By Junio C Hamano
A platform might not support open(2) with O_CLOEXEC but may support flipping the FD_CLOEXEC bit on an already open file descriptor with fcntl(2). It is a fallback that is inherently racy and this may not be worth doing, though. Suggested-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Junio C Hamano <gitster@pobox.com>
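The fallback described is the usual fcntl(2) dance; a minimal sketch (the helper name is an assumption):

```c
#include <fcntl.h>

/* best-effort, inherently racy fallback when open(2) lacks O_CLOEXEC support */
static void set_cloexec(int fd)
{
        int flags = fcntl(fd, F_GETFD);
        if (flags >= 0)
                fcntl(fd, F_SETFD, flags | FD_CLOEXEC);
}
```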
-
- 28 Oct 2016, 1 commit
-
-
By Junio C Hamano
The way we structured the fallback/retry mechanism for opening with O_NOATIME and O_CLOEXEC meant that if we failed due to lack of support for opening the file with the O_NOATIME option (i.e. EINVAL), we would still try to drop O_CLOEXEC first and retry, and then drop O_NOATIME. A platform on which O_NOATIME is defined in the header without support from the kernel wouldn't have a chance to open with the O_CLOEXEC option due to this code structure. Arguably, O_CLOEXEC is more important than O_NOATIME, as the latter is mostly about performance, while the former can affect correctness. Instead use O_CLOEXEC to open the file, and then use fcntl(2) to set O_NOATIME on the resulting file descriptor. open(2) itself does not cause atime to be updated according to Linus [*1*]. The helper to do the former can be usable in the codepath in ce_compare_data() that was recently added to open a file descriptor with O_CLOEXEC; use it while we are at it. *1* <CA+55aFw83E+zOd+z5h-CA-3NhrLjVr-anL6pubrSWttYx3zu8g@mail.gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
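A hedged sketch of the restructured open path (helper name assumed): O_CLOEXEC is requested at open time because it affects correctness, and O_NOATIME is flipped on afterwards as a best-effort performance tweak.

```c
#include <fcntl.h>

static int git_open_cloexec_noatime(const char *name)
{
        int fd = open(name, O_RDONLY | O_CLOEXEC);

        if (fd >= 0) {
#ifdef O_NOATIME
                /* best effort: failing to set O_NOATIME only costs an atime update */
                int flags = fcntl(fd, F_GETFL);
                if (flags >= 0)
                        fcntl(fd, F_SETFL, flags | O_NOATIME);
#endif
        }
        return fd;
}
```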
-
- 26 Oct 2016, 2 commits
-
-
By Lars Schneider
All processes that the Git main process spawns inherit the open file descriptors of the main process. These leaked file descriptors can cause problems. Use the O_CLOEXEC flag, similar to 05d1ed61, to fix the leaked file descriptors. Signed-off-by: Lars Schneider <larsxschneider@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
-
By Lars Schneider
This function is meant to be used when reading from files in the object store, and the original objective was to avoid smudging the atime of loose object files too often, hence its name. Because we'll be extending its role in the next commit to also arrange for the file descriptors it returns to be auto-closed in child processes, rename it to lose the "noatime" part that is too specific. Signed-off-by: Lars Schneider <larsxschneider@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
-
- 15 Oct 2016, 1 commit
-
-
By Jeff King
When we auto-follow tags in a fetch, we look at all of the tags advertised by the remote and fetch ones where we don't already have the tag, but we do have the object it peels to. This involves a lot of calls to has_sha1_file(), some of which we can reasonably expect to fail. Since 45e8a748 (has_sha1_file: re-check pack directory before giving up, 2013-08-30), this may cause many calls to reprepare_packed_git(), which is potentially expensive.

This has gone unnoticed for several years because it requires a fairly unique setup to matter:

1. You need to have a lot of packs on the client side to make reprepare_packed_git() expensive (the most expensive part is finding duplicates in an unsorted list, which is currently quadratic).

2. You need a large number of tag refs on the server side that are candidates for auto-following (i.e., that the client doesn't have). Each one triggers a re-read of the pack directory.

3. Under normal circumstances, the client would auto-follow those tags and after one large fetch, (2) would no longer be true. But if those tags point to history which is disconnected from what the client otherwise fetches, then it will never auto-follow, and those candidates will impact it on every fetch.

So when all three are true, each fetch pays an extra O(nr_tags * nr_packs^2) cost, mostly in string comparisons on the pack names. This was exacerbated by 47bf4b0f (prepare_packed_git_one: refactor duplicate-pack check, 2014-06-30) which uses a slightly more expensive string check, under the assumption that the duplicate check doesn't happen very often (and it shouldn't; the real problem here is how often we are calling reprepare_packed_git()).

This patch teaches fetch to use HAS_SHA1_QUICK to sacrifice accuracy for speed, in cases where we might be racy with a simultaneous repack. This is similar to the fix in 0eeb077b (index-pack: avoid excessive re-reading of pack directory, 2015-06-09). As with that case, it's OK for has_sha1_file() to occasionally say "no I don't have it" when we do, because the worst case is not a corruption, but simply that we may fail to auto-follow a tag that points to it.

Here are results from the included perf script, which sets up a situation similar to the one described above:

Test            HEAD^               HEAD
----------------------------------------------------------
5550.4: fetch   11.21(10.42+0.78)   0.08(0.04+0.02) -99.3%

Reported-by: Vegard Nossum <vegard.nossum@oracle.com>
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
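A hedged sketch of the kind of call-site change this implies in the tag auto-following check (the surrounding fetch code and the `sha1` variable, which stands for the peeled tag target, are assumptions; the flag and helper names are taken from the message and from similar call sites):

```c
/* before: every miss may trigger an expensive reprepare_packed_git() */
if (!has_sha1_file(sha1))
        continue;

/* after: HAS_SHA1_QUICK skips the pack-directory re-scan, trading accuracy for speed */
if (!has_sha1_file_with_flags(sha1, HAS_SHA1_QUICK))
        continue;
```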
-
- 11 Oct 2016, 9 commits
-
-
By Jeff King
On a case-insensitive filesystem, we should realize that "a/objects" and "A/objects" are the same path. We already use fspathcmp() to check against the main object directory, but until recently we couldn't use it for comparing against other alternates (because their paths were not NUL-terminated strings). But now we can, so let's do so. Note that we also need to adjust count-objects to load the config, so that it can see the setting of core.ignorecase (this is required by the test, but is also a general bugfix for users of count-objects). Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
-
By Jeff King
We recursively expand alternates repositories, so that if A borrows from B which borrows from C, A can see all objects. For the root object database, we allow relative paths, so A can point to B as "../B/objects". However, we currently do not allow relative paths when recursing, so B must use an absolute path to reach C. That is an ancient protection from c2f493a4 (Transitively read alternatives, 2006-05-07) that tries to avoid adding the same alternate through two different paths.

Since 5bdf0a84 (sha1_file: normalize alt_odb path before comparing and storing, 2011-09-07), we use a normalized absolute path for each alt_odb entry. This means that in most cases the protection is no longer necessary; we will detect the duplicate no matter how we got there (but see below). And it's a good idea to get rid of it, as it creates an unnecessary complication when setting up recursive alternates (B has to know that A is going to borrow from it and make sure to use an absolute path).

Note that our normalization doesn't actually look at the filesystem, so it can still be fooled by crossing symbolic links. But that's also true of absolute paths, so it's not a good reason to disallow only relative paths (it's potentially a reason to switch to real_path(), but that's a separate and non-trivial change).

We adjust the test script here to demonstrate that this now works, and add new tests to show that the normalization does indeed suppress duplicates.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
-
By Jeff King
It's currently the responsibility of the caller to give fill_sha1_file() enough bytes to write into, leading them to manually compute the required lengths. Instead, let's just write into a strbuf so that it's impossible to get this wrong. The alt_odb caller already has a strbuf, so this makes things strictly simpler. The other caller, sha1_file_name(), uses a static PATH_MAX buffer and dies when it would overflow. We can convert this to a static strbuf, which means our allocation cost is amortized (and as a bonus, we no longer have to worry about PATH_MAX being too short for normal use). This does introduce some small overhead in fill_sha1_file(), as each strbuf_addchar() will check whether it needs to grow. However, between the optimization in fec501da (strbuf_addch: avoid calling strbuf_grow, 2015-04-16) and the fact that this is not generally called in a tight loop (after all, the next step is typically to access the file!) this probably doesn't matter. And even if it did, the right place to micro-optimize is inside fill_sha1_file(), by calling a single strbuf_grow() there. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
-
By Jeff King
We pre-size the scratch buffer to hold a loose object filename of the form "xx/yyyy...", which leads to allocation code that is hard to verify. We have to use some magic numbers during the initial allocation, and then writers must blindly assume that the buffer is big enough. Using a strbuf makes it more clear that we cannot overflow. Unfortunately, we do still need some magic numbers to grow our strbuf before calling fill_sha1_path(), but the strbuf growth is much closer to the point of use. This makes it easier to see that it's correct, and opens the possibility of pushing it even further down if fill_sha1_path() learns to work on strbufs. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
-
By Jeff King
This function forms a sha1 as "xx/yyyy...", but skips over the slot for the slash rather than writing it, leaving it to the caller to do so. It also does not bother to put in a trailing NUL, even though every caller would want it (we're forming a path which by definition is not a directory, so the only thing to do with it is feed it to a system call). Let's make the lives of our callers easier by just writing out the internal "/" and the NUL. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
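A hedged sketch of the helper after such a change (the exact body is an assumption): it writes the full "xx/yyyy..." form, including the directory slash and the terminating NUL, so callers no longer patch those in themselves.

```c
static void fill_sha1_path(char *pathbuf, const unsigned char *sha1)
{
        static const char hex[] = "0123456789abcdef";
        char *pos = pathbuf;
        int i;

        for (i = 0; i < GIT_SHA1_RAWSZ; i++) {
                unsigned int val = sha1[i];
                *pos++ = hex[val >> 4];
                *pos++ = hex[val & 0xf];
                if (!i)
                        *pos++ = '/';   /* directory fan-out after the first byte */
        }
        *pos = '\0';                    /* callers hand this straight to a syscall */
}
```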
-
By Jeff King
The alternate_object_database struct uses a single buffer both for storing the path to the alternate, and as a scratch buffer for forming object names. This is efficient (since otherwise we'd end up storing the path twice), but it makes life hard for callers who just want to know the path to the alternate. They have to remember to stop reading after "alt->name - alt->base" bytes, and to subtract one for the trailing '/'. It would be much simpler if they could simply access a NUL-terminated path string. We could encapsulate this in a function which puts a NUL in the scratch buffer and returns the string, but that opens up questions about the lifetime of the result. The first time another caller uses the alternate, the scratch buffer may get other data tacked onto it. Let's instead just store the root path separately from the scratch buffer. There aren't enough alternates being stored for the duplicated data to matter for performance, and this keeps things simple and safe for the callers. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
-
By Jeff King
The alternate_object_database struct holds a path to the alternate objects, but we also use that buffer as scratch space for forming loose object filenames. Let's pull that logic into a helper function so that we can more easily modify it. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
-
By Jeff King
Allocating a struct alternate_object_database is tricky, as we must over-allocate the buffer to provide scratch space, and then put in particular '/' and NUL markers. Let's encapsulate this in a function so that the complexity doesn't leak into callers (and so that we can modify it later). Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
-
By Jeff King
The submodule code wants to temporarily add an alternate object store to our in-memory alt_odb list, but does it manually. Let's provide a helper so it can reuse the code in link_alt_odb_entry(). While we're adding our new add_to_alternates_memory(), let's document add_to_alternates_file(), as the two are related. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
-