Re: git: uh-oh

From Magnus Hagander
Subject Re: git: uh-oh
Date
Msg-id AANLkTinB+a2de0sxuwcJ7UhP+_MAA4My2kCW1xYgNq7v@mail.gmail.com
In response to Re: git: uh-oh  (Michael Haggerty <mhagger@alum.mit.edu>)
List pgsql-hackers
On Wed, Aug 18, 2010 at 11:01, Michael Haggerty <mhagger@alum.mit.edu> wrote:
> Martijn van Oosterhout wrote:
>> On Wed, Aug 18, 2010 at 08:25:45AM +0200, Michael Haggerty wrote:
>>> So let's take the simplest example: a branch BRANCH1 is created from
>>> trunk commit T1, then some time later another FILE1 from trunk commit T3
>>> is added to BRANCH1 in commit B4.  How should this series of events be
>>> represented in a git repository?
>>
>> <snip>
>>
>>> The "exclusive" possibility is to ignore the fact that some of the
>>> content of B4 came from trunk and to pretend that FILE1 just appeared
>>> out of nowhere in commit B4 independent of the FILE1 in TRUNK:
>>>
>>> T0 -- T1 -- T2 -------- T3 -- T4        TRUNK
>>>        \
>>>         B1 -- B2 -- B3 -- B4            BRANCH1
>>>
>>> This is also wrong, because it doesn't reflect the true lineage of FILE1.
>>
>> But the "true lineage" is not stored anywhere in CVS so I don't see why
>> you need to fabricate it for git. Sure, it would be really nice if you
>> could, but if you can't do it reliably, you may as well not do it at
>> all. What's the loss?
>
> CVS does record (albeit somewhat ambiguously) the branch from which a
> new branch sprouted.  The history above might result from commands like
>
> cvs update -A
> cvs tag -b BRANCH1
> <hack hack>                   cvs update -r BRANCH1
> cvs commit -m T2              <hack hack>
> touch FILE1                   cvs commit -m B1
> cvs add FILE1                 <hack hack>
> cvs commit -m T3              cvs commit -m B2
>                              <hack hack>
>                              cvs commit -m B3
> cvs tag -b BRANCH1 FILE1
>
> or the last step might have been an explicit merge into BRANCH1:
>
>                              cvs update -j T1 -j T3
>                              cvs commit -m B4
>
> Either way, the CVS history relatively clearly indicates that content
> was ported from TRUNK to BRANCH1.  There is no way to distinguish
> whether it was a cherry-pick (not recordable in git's history) vs. a
> full merge without more information or more intelligence.

Well, in *our* case we know that it was a "cherry-pick", because we've
done no full merges ;) So if there's a way for us to short-circuit the
tool, that'd be great.


> Magnus Hagander wrote:
>> Our requirements are simple: our cvs history is linear, the git
>> history should be linear. It is *not* the same commit that's on head
>> and the branch. They are two different commits, that happen to have
>> the same commit message and mostly the same content.
>
> I don't think this is at all an issue of cvs2svn merging commits that
> happen to have the same commit message and/or commit time.  The merge
> commits are all manufactured by cvs2svn to do two things:
>
> 1. Add content that needs to be on the branch, because a file was added
> to the branch after the branch's creation.  This *needs* to be done to
> ensure that the branch has the correct content.

Ok.


> 2. Indicate the origin of the new branch content.  This goal is debatable.

I agree this is debatable. We've kind of debated it already (though
not in exactly this context) and decided we'd rather have it appear as
brand new content on this branch and not as a merge.


>> Bottom line is, we want zero merge commits in the git repository. We
>> may start using that sometime in the future (but for now, we've
>> decided we don't want that even in the future), but we most
>> *definitely* don't want it in the past. We don't care about
>> "representing the proper heritage of FILE1" in git, because we never
>> did in cvs.
>>
>> Is there some way to make cvs2git work this way, and just not bother
>> even trying to create merge commits, or is that fundamentally
>> impossible and we need to look at another tool?
>
> A merge is just a special case of content being taken from one branch
> and added to another.  Logically, the same thing happens when a branch
> is created, and some of the same problems can occur in that situation.
> A branch can be created using content from multiple source branches,
> which cvs2git currently also represents as a merge.

Can be, yes. AFAIK, we don't ever do that (though I can't swear to
that, since there have been some funky things in our cvs repository
earlier).


> Assuming that you don't want to discard all record of where a branch
> sprouted from, it is therefore necessary to choose a single parent
> branch for each branch creation.  To be sure, this choice can be
> incorrect the same way as the merge commits discussed above are
> incorrect.  But one reasonable "mostly-exclusive" approach would be to
> choose the most likely parent as the source of the branch and ignore all
> others.

Yes, I believe that is what we'd prefer, as it's what most closely
matches how *we*'ve been using CVS.
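
A minimal sketch of what such a "most likely parent" choice could look
like, assuming we already know, for each file on the new branch, which
branch it was sprouted from. The function name `choose_parent`, the input
mapping, and the simple majority-vote rule are all hypothetical and only
for illustration; this is not how cvs2git does it.

```python
from collections import Counter
from typing import Dict


def choose_parent(file_sources: Dict[str, str]) -> str:
    """Pick a single parent branch for a newly created branch.

    file_sources maps each file on the new branch to the branch it was
    sprouted from (hypothetical input, assumed to be known already).
    The branch that contributed the most files wins; all other source
    branches are ignored, so no merge commit is needed.
    """
    if not file_sources:
        raise ValueError("no files, cannot choose a parent branch")
    counts = Counter(file_sources.values())
    parent, _count = counts.most_common(1)[0]
    return parent
```

With almost every file coming from TRUNK and only a stray file coming from
somewhere else, this picks TRUNK and the stray file simply shows up as new
content on the branch.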


> cvs2git doesn't currently have this option.  I'm not sure how much work
> it would be to implement; probably a few days'.  Alternatively, you

Would this be something you'd consider doing, since it might be of
interest to others? I'm sure if it's a few days' work for you, it'd be
weeks for one of us, given no knowledge of the codebase :-)

Obviously I'm not saying it needs to be done in two days or anything.
Now that we've postponed the migration this time, we're not on as tight
a schedule anymore.


> could write a tool that would rewrite the ancestry information in the
> repository *after* the cvs2git conversion using .git/info/grafts (see
> git-filter-branch(1)).  Such rewriting would have to occur before the
> repository is published, because the rewriting will change the hashes of
> most commits.

That could definitely be done.

Um, I don't see an "info/grafts" though (our repo is a bare one, could
that be why?)

Do you have any more specifics, or a reference, as to how you'd
suggest we look at that?
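
A minimal sketch of how the grafts route might look, under a few
assumptions. A grafts file has one line per commit, "<commit> <parent>...",
which overrides that commit's recorded parents. The helper below takes the
text printed by `git rev-list --parents --all` and, for every commit with
more than one parent, emits a graft line that keeps only the first parent.
The function name `single_parent_grafts` is made up for illustration, and
it assumes the first parent is always the branch's own previous commit,
which would need to be checked against what cvs2git actually emits.

```python
from typing import List


def single_parent_grafts(rev_list_output: str) -> List[str]:
    """Build grafts-file lines that turn merge commits into plain commits.

    rev_list_output is the text from `git rev-list --parents --all`:
    one line per commit, "<commit> <parent1> <parent2> ...".
    """
    grafts = []
    for line in rev_list_output.splitlines():
        fields = line.split()
        # One field is a root commit, two is an ordinary commit;
        # more than two means the commit has several parents (a merge).
        if len(fields) > 2:
            commit, first_parent = fields[0], fields[1]
            # Assumption: the first parent is the one to keep.
            grafts.append(f"{commit} {first_parent}")
    return grafts
```

The resulting lines would go into `info/grafts` inside the repository
directory (`.git/info/grafts` in a non-bare repo); the file does not exist
until you create it. After checking the result with `git log --graph`,
running `git filter-branch -- --all` should make the grafted ancestry
permanent. Newer Git releases deprecate the grafts file in favor of
`git replace --graft`, so treat this as a sketch of the idea, not a recipe.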


--
 Magnus Hagander
 Me: http://www.hagander.net/
 Work: http://www.redpill-linpro.com/

