    merge-recursive: new function for better colliding conflict resolutions · 37b65ce3
    Elijah Newren committed
    There are three conflict types that represent two (possibly entirely
    unrelated) files colliding at the same location:
      * add/add
      * rename/add
      * rename/rename(2to1)
    
    These three conflict types already share more similarity than might be
    immediately apparent from their description: (1) the handling of the
    rename variants already involves removing any entries from the index
    corresponding to the original file names[*], thus only leaving entries
    in the index for the colliding path; (2) likewise, any trace of the
    original file name in the working tree is also removed.  So, in all
    three cases we're left with how to represent two colliding files in both
    the index and the working copy.
    
    [*] Technically, this isn't quite true because rename/rename(2to1)
    conflicts in the recursive (o->call_depth > 0) case have done an
    "unrename" for about seven years.  But even in that case, Junio felt
    compelled to explain that my decision to "unrename" wasn't necessarily
    the only or right answer -- search for "Comment from Junio" in t6036 for
    details.
    
    My initial motivation for looking at these three conflict types was that
    if the handling of these three conflict types is the same, at least in
    the limited set of cases where a renamed file is unmodified on the side
    of history where the file is not renamed, then a significant performance
    improvement for rename detection during merges is possible.  However,
    while that served as motivation to look at these three types of
    conflicts, the actual goal of this new function is to try to improve the
    handling for all three cases, not to merely make them the same as each
    other in that special circumstance.
    
    === Handling the working tree ===
    
    The previous behavior for these conflict types with regard to the
    working tree (assuming the file collision occurs at 'foo') was:
      * add/add does a two-way merge of the two files and records it as 'foo'.
      * rename/rename(2to1) records the two different files into two new
        uniquely named files (foo~HEAD and foo~$MERGE), while removing 'foo'
        from the working tree.
      * rename/add records the two different files into two different
        locations, recording the add at foo~$SIDE and, oddly, recording
        the rename at foo (why is the rename more important than the add?)
    
    So, the question for what to write to the working tree boils down to
    whether the two colliding files should be two-way merged and recorded in
    place, or recorded into separate files.  As per discussion on the git
    mailing list, two-way merging was deemed to always be preferred, as that
    makes these cases all more like content conflicts that users can handle
    from within their favorite editor, IDE, or merge tool.  Note that since
    renames already involve a content merge, rename/add and
    rename/rename(2to1) conflicts could result in nested conflict markers.
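
    The nesting mentioned above can be illustrated with a rough sketch.
    The helper below is hypothetical and is not the code in
    merge-recursive.c: it assumes the two files share no lines and simply
    wraps each whole file in conflict markers, whereas a real two-way
    merge would keep lines common to both sides outside the markers.  The
    labels passed in (for example "HEAD" and "MERGE") are likewise
    assumptions.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Return "\n" if s is non-empty and lacks a trailing newline, else "". */
static const char *missing_newline(const char *s)
{
    size_t n = strlen(s);
    return (n == 0 || s[n - 1] == '\n') ? "" : "\n";
}

/*
 * Simplified stand-in for a two-way merge of two colliding files.
 * Both files are treated as entirely different, so each whole file
 * ends up on its own side of a single conflict region.
 * Returns the number of bytes written (excluding the NUL), or -1 if
 * the output buffer is too small.
 */
static int two_way_conflict(const char *ours_label, const char *ours,
                            const char *theirs_label, const char *theirs,
                            char *out, size_t outsz)
{
    int n;

    assert(ours_label && ours && theirs_label && theirs && out);
    n = snprintf(out, outsz,
                 "<<<<<<< %s\n%s%s=======\n%s%s>>>>>>> %s\n",
                 ours_label, ours, missing_newline(ours),
                 theirs, missing_newline(theirs), theirs_label);
    return (n < 0 || (size_t)n >= outsz) ? -1 : n;
}
```

    If the content passed as one side already contains conflict markers
    (as the result of a rename's content merge might), the output holds
    one conflict region inside another, which is the nested case.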
    
    === Handling of the index ===
    
    For a typical rename, unpack_trees() would set up the index in the
    following fashion:
               old_path  new_path
       stage1: 5ca1ab1e  00000000
       stage2: f005ba11  00000000
       stage3: 00000000  b0a710ad
    And merge-recursive would rewrite this to
               new_path
       stage1: 5ca1ab1e
       stage2: f005ba11
       stage3: b0a710ad
    Removing old_path from the index means the user won't have to `git rm
    old_path` manually every time a renamed path has a content conflict.
    It also means they can use `git checkout [--ours|--theirs|--conflict|-m]
    new_path`, `git diff [--ours|--theirs]` and various other commands that
    would be difficult otherwise.
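
    As a toy model of the rewrite shown in the two tables above, the
    sketch below folds the stages of old_path into new_path.  The names
    `struct stages`, `fold_rename` and `NULL_ID` are hypothetical and do
    not correspond to git's actual index structures; the abbreviated ids
    are the ones from the tables.

```c
#include <assert.h>
#include <string.h>

#define NULL_ID "00000000"

/*
 * Toy model of the three conflict stages of one index path:
 * id[0] = stage1 (merge base), id[1] = stage2 (ours), id[2] = stage3 (theirs).
 */
struct stages {
    char id[3][9];
};

/*
 * Fold old_path into new_path: every stage that is empty at new_path
 * takes the id recorded at old_path.  old_path is then cleared, which
 * models its removal from the index.
 */
static void fold_rename(struct stages *old_path, struct stages *new_path)
{
    assert(old_path && new_path && old_path != new_path);
    for (int s = 0; s < 3; s++) {
        if (strcmp(new_path->id[s], NULL_ID) == 0)
            strcpy(new_path->id[s], old_path->id[s]);
        strcpy(old_path->id[s], NULL_ID);
    }
}
```

    In this model a stage that is already occupied at new_path keeps its
    id, and the corresponding old_path id is dropped.  For a typical
    rename that never happens, since each stage is populated at only one
    of the two paths.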
    
    This strategy becomes a problem when we have a rename/add or
    rename/rename(2to1) conflict, however, because then we have only three
    slots to store blob sha1s and we need either four or six.  Previously,
    this was handled by continuing to delete old_path from the index, and
    just outright ignoring any blob sha1s from old_path.  That had the
    downside of deleting any trace of changes made to old_path on the other
    side of history.  This function instead does a three-way content merge of
    the renamed file, and stores the blob sha1 for that at either stage2 or
    stage3 for new_path (depending on which side the rename came from).  That
    has the advantage of bringing information about changes on both sides and
    still allows for easy resolution (no need to `git rm old_path`, etc.), but
    does have the downside that if the content merge had conflict markers,
    then what we store in the index is the sha1 of a blob with conflict
    markers.  While that is a downside, it seems less problematic than the
    downsides of any obvious alternatives, and certainly makes more sense
    than the previous handling.  Further, it has a precedent in that when we
    do recursive merges, we may accept a file with conflict markers as the
    resolution for the merge of the merge-bases, which will then show up in
    the index of the outer merge at stage 1 if a conflict exists at the outer
    level.
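
    The index handling described above for a rename/add collision can be
    sketched roughly as follows.  This is an illustration under stated
    assumptions, not the function added by this commit: `struct stages`,
    `record_rename_add` and `NULL_ID` are hypothetical names, and the id
    of the three-way content merge is passed in by the caller rather than
    computed.  Leaving stage1 of new_path untouched is also an assumption
    of the sketch.

```c
#include <assert.h>
#include <string.h>

#define NULL_ID "00000000"

/* id[0] = stage1 (merge base), id[1] = stage2 (ours), id[2] = stage3 (theirs). */
struct stages {
    char id[3][9];
};

/*
 * Record a rename/add collision at new_path.
 *   rename_side: 2 if the rename came from our side, 3 if from theirs.
 *   merged_id:   abbreviated id of the blob produced by the three-way
 *                content merge of the renamed file (may be a blob that
 *                contains conflict markers).
 * The stage belonging to the renaming side receives merged_id, the
 * other side's stage keeps the blob of the added file, and old_path is
 * cleared to model its removal from the index.
 */
static void record_rename_add(struct stages *old_path,
                              struct stages *new_path,
                              int rename_side, const char *merged_id)
{
    assert(old_path && new_path && merged_id);
    assert(rename_side == 2 || rename_side == 3);
    assert(strlen(merged_id) == 8);

    strcpy(new_path->id[rename_side - 1], merged_id);
    for (int s = 0; s < 3; s++)
        strcpy(old_path->id[s], NULL_ID);
}
```

    A rename/rename(2to1) collision would, in this model, apply the same
    step once per side, each with its own old_path and merged blob.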
    
    Signed-off-by: Elijah Newren <newren@gmail.com>
    Signed-off-by: Junio C Hamano <gitster@pobox.com>
merge-recursive.c 109.6 KB