Thread: git diff --patience

git diff --patience

From: "Kevin Grittner"
Date:
I just discovered the --patience flag on the git diff command, and
I'd like to suggest that we encourage people to use it when possible
for building patches.  I just looked at output with and without it
(and for good measure, before and after filterdiff --format=context
for both), and the results were much better with this switch.

Here's a reference to the algorithm:

http://bramcohen.livejournal.com/73318.html

I think that page understates the benefits, though.

-Kevin


Re: git diff --patience

From: Bruce Momjian
Date:
Kevin Grittner wrote:
> I just discovered the --patience flag on the git diff command, and
> I'd like to suggest that we encourage people to use it when possible
> for building patches.  I just looked at output with and without it
> (and for good measure, before and after filterdiff --format=context
> for both), and the results were much better with this switch.
>  
> Here's a reference to the algorithm:
>  
> http://bramcohen.livejournal.com/73318.html
>  
> I think that page understates the benefits, though.

I have seen the bracket example shown and the --patience output is
clearly nicer.

--
  Bruce Momjian  <bruce@momjian.us>        http://momjian.us
  EnterpriseDB                             http://enterprisedb.com

  + It's impossible for everything to be true. +


Re: git diff --patience

From: Peter Eisentraut
Date:
On Wed, 2010-09-15 at 12:58 -0500, Kevin Grittner wrote:
> I just discovered the --patience flag on the git diff command, and
> I'd like to suggest that we encourage people to use it when possible
> for building patches.  I just looked at output with and without it
> (and for good measure, before and after filterdiff --format=context
> for both), and the results were much better with this switch.

I have tried this switch various times now and haven't seen any
difference at all in the output.  Do you have an existing commit where
you see a difference so I can try it and see if there is some other
problem that my local configuration has?



Re: git diff --patience

From: "Kevin Grittner"
Date:
Peter Eisentraut <peter_e@gmx.net> wrote:

> I have tried this switch various times now and haven't seen any
> difference at all in the output.  Do you have an existing commit
> where you see a difference so I can try it and see if there is
> some other problem that my local configuration has?

Having looked at it more, I find that the output with the switch is
usually the same as without; but when they differ, I always have
preferred the version with it on.  Attached is the diff which caused
me to see if there was a way to make the diff output smarter, and
the result of adding the --patience flag.

This is the unified form that git puts out by default, but the
benefit is there after filterdiff --format=context, too.

-Kevin



Attachments

Re: git diff --patience

From: "Kevin Grittner"
Date:
Peter Eisentraut <peter_e@gmx.net> wrote:

> Do you have an existing commit where you see a difference so I can
> try it and see if there is some other problem that my local
> configuration has?

Random poking around in the postgresql.git commits didn't turn up
any where it mattered, so here's before and after files for the
example diff files already posted.  If you create a branch, commit the
before copy, and copy in the after copy, you should be able to
replicate the results I posted.

-Kevin

Attachments

Re: git diff --patience

From: Gurjeet Singh
Date:
Attached are two versions of the same patch, with and without --patience.

The with-patience version has only two hunks: the removal of a big comment block and the addition of a big block of code.

The without-patience version is riddled with a mix of the two hunks, spread out up to line 120.

--patience is a clear winner here.

Regards,

On Wed, Sep 29, 2010 at 5:10 PM, Kevin Grittner <Kevin.Grittner@wicourts.gov> wrote:
> Peter Eisentraut <peter_e@gmx.net> wrote:
>
> > Do you have an existing commit where you see a difference so I can
> > try it and see if there is some other problem that my local
> > configuration has?
>
> Random poking around in the postgresql.git commits didn't turn up
> any where it mattered, so here's before and after files for the
> example diff files already posted.  If you create a branch, commit the
> before copy, and copy in the after copy, you should be able to
> replicate the results I posted.
>
> -Kevin


--
gurjeet.singh
@ EnterpriseDB - The Enterprise Postgres Company
http://www.EnterpriseDB.com

singh.gurjeet@{ gmail | yahoo }.com
Twitter/Skype: singh_gurjeet

Mail sent from my BlackLaptop device
Attachments

Re: git diff --patience

From: "Kevin Grittner"
Date:
Gurjeet Singh <singh.gurjeet@gmail.com> wrote:
> The with-patience version has only two hunks, removal of a big
> block of comment and addition of a big block of code.
> 
> The without-patience version is riddled with a mix of the two
> hunks, spread out up to line 120.
> 
> --patience is a clear winner here.

When I read the description of the algorithm, I can't imagine a
situation where --patience would make the diff *worse*.  I was
somewhat afraid (based on the name) that it would be slow; but
if it is slower, it hasn't been by enough for me to notice it.

-Kevin


Re: git diff --patience

From: Alvaro Herrera
Date:
Excerpts from Kevin Grittner's message of Thu Sep 30 16:38:11 -0400 2010:

> When I read the description of the algorithm, I can't imagine a
> situation where --patience would make the diff *worse*.  I was
> somewhat afraid (based on the name) that it would be slow; but
> if it is slower, it hasn't been by enough for me to notice it.

There is a very simple example posted on some of the blog posts that
goes something like

aaaaaaaa
aaaaaaaa
aaaaaaaa
bbbbbbbb
bbbbbbbb
bbbbbbbb
xyz

and the "xyz" is moved to the front.  In this corner case, the patience
diff is a lot worse.

-- 
Álvaro Herrera <alvherre@commandprompt.com>
The PostgreSQL Company - Command Prompt, Inc.
PostgreSQL Replication, Consulting, Custom Development, 24x7 support
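
To make the corner case above concrete, here is a minimal sketch in Python. It uses difflib.SequenceMatcher from the standard library, which is Python's own matcher and not the algorithm git uses by default, so treat it only as an illustration of what a longest-match diff does with this input. The names old, new, ops and changed are made up for the example.

```python
import difflib

# The example file before and after "xyz" is moved to the front.
old = ["aaaaaaaa"] * 3 + ["bbbbbbbb"] * 3 + ["xyz"]
new = ["xyz"] + ["aaaaaaaa"] * 3 + ["bbbbbbbb"] * 3

# SequenceMatcher keeps the longest run of matching lines intact.
# (Assumption: this stands in for a conventional diff; it is not git's code.)
matcher = difflib.SequenceMatcher(a=old, b=new, autojunk=False)

# Summarize each opcode as (tag, lines on the old side, lines on the new side).
ops = [(tag, i2 - i1, j2 - j1) for tag, i1, i2, j1, j2 in matcher.get_opcodes()]

# Count the lines that would be flagged with "+" or "-".
changed = sum(n_old + n_new for tag, n_old, n_new in ops if tag != "equal")

for op in ops:
    print(op)
print("flagged lines:", changed)
```

The six repeated lines stay as context and only the "xyz" line is flagged, once as an insertion and once as a deletion.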


Re: git diff --patience

From: Gurjeet Singh
Date:
On Fri, Oct 1, 2010 at 9:38 AM, Alvaro Herrera <alvherre@commandprompt.com> wrote:
> Excerpts from Kevin Grittner's message of Thu Sep 30 16:38:11 -0400 2010:
>
> > When I read the description of the algorithm, I can't imagine a
> > situation where --patience would make the diff *worse*.  I was
> > somewhat afraid (based on the name) that it would be slow; but
> > if it is slower, it hasn't been by enough for me to notice it.
>
> There is a very simple example posted on some of the blog posts that
> goes something like
>
> aaaaaaaa
> aaaaaaaa
> aaaaaaaa
> bbbbbbbb
> bbbbbbbb
> bbbbbbbb
> xyz
>
> and the "xyz" is moved to the front.  In this corner case, the patience
> diff is a lot worse.

Sorry, but that example didn't make much sense to me. Can you please elaborate, or maybe share those blog posts you are referring to?

Regards,
-- 
gurjeet.singh
@ EnterpriseDB - The Enterprise Postgres Company
http://www.EnterpriseDB.com

singh.gurjeet@{ gmail | yahoo }.com
Twitter/Skype: singh_gurjeet

Mail sent from my BlackLaptop device

Re: git diff --patience

From: "Kevin Grittner"
Date:
Gurjeet Singh <singh.gurjeet@gmail.com> wrote:
> Alvaro Herrera <alvherre@commandprompt.com> wrote:
>> There is a very simple example posted on some of the blog posts
>> that goes something like
>>
>> aaaaaaaa
>> aaaaaaaa
>> aaaaaaaa
>> bbbbbbbb
>> bbbbbbbb
>> bbbbbbbb
>> xyz
>>
>> and the "xyz" is moved to the front.  In this corner case, the
>> patience diff is a lot worse.
>>
> 
> Sorry, but that example didn't make much sense to me. Can you
> please elaborate, or maybe share those blog posts you are referring
> to.

I tried it out.  Here are the results:

git diff --color
diff --git a/a1 b/a1
index bd0586b..32736d1 100644
--- a/a1
+++ b/a1
@@ -1,7 +1,7 @@
+xyz
 aaaaaaaa
 aaaaaaaa
 aaaaaaaa
 bbbbbbbb
 bbbbbbbb
 bbbbbbbb
-xyz

git diff --color --patience
diff --git a/a1 b/a1
index bd0586b..32736d1 100644
--- a/a1
+++ b/a1
@@ -1,7 +1,7 @@
-aaaaaaaa
-aaaaaaaa
-aaaaaaaa
-bbbbbbbb
-bbbbbbbb
-bbbbbbbb
 xyz
+aaaaaaaa
+aaaaaaaa
+aaaaaaaa
+bbbbbbbb
+bbbbbbbb
+bbbbbbbb

This is because lines which only occur once in a file are the
"anchors" around which lines which occur multiple times move -- 
after keeping intact any leading and trailing lines which match
between the files.  An interesting exercise is to think about what
real-life lines you could have which would have multiple occurrences
in this pattern, and think about whether you would then prefer the
--patience output, especially if this were part of a larger file. 
Even in this supposed "worst case" example, I'm not at all sure I
wouldn't prefer --patience, personally, even though more lines are
flagged.
-Kevin
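
The anchoring rule described above can be sketched in a few lines of Python. This is only a rough sketch that follows the prose description (set aside matching leading and trailing lines, keep lines that occur exactly once on both sides, then take the longest run of them that appears in the same order). It is not git's implementation, it finds only the top-level anchors without recursing into the gaps between them, and the function name patience_anchors is made up for the example.

```python
from bisect import bisect_left
from collections import Counter


def patience_anchors(old, new):
    """Return (old_index, new_index) pairs for the top-level anchor lines."""
    # Step 1: set aside leading and trailing lines that already match.
    start = 0
    while start < len(old) and start < len(new) and old[start] == new[start]:
        start += 1
    end_old, end_new = len(old), len(new)
    while (end_old > start and end_new > start
           and old[end_old - 1] == new[end_new - 1]):
        end_old -= 1
        end_new -= 1

    # Step 2: keep only lines that occur exactly once on each side.
    count_old = Counter(old[start:end_old])
    count_new = Counter(new[start:end_new])
    pos_new = {line: j
               for j, line in enumerate(new[start:end_new], start)
               if count_new[line] == 1}
    pairs = [(i, pos_new[line])
             for i, line in enumerate(old[start:end_old], start)
             if count_old[line] == 1 and line in pos_new]

    # Step 3: longest increasing subsequence of the new-side positions,
    # built with patience sorting (one "pile" per subsequence length).
    pile_tops = []   # index into pairs of the top entry of each pile
    top_values = []  # new-side position of each pile top, kept sorted
    prev = [None] * len(pairs)
    for n, (_, j) in enumerate(pairs):
        k = bisect_left(top_values, j)
        prev[n] = pile_tops[k - 1] if k > 0 else None
        if k == len(pile_tops):
            pile_tops.append(n)
            top_values.append(j)
        else:
            pile_tops[k] = n
            top_values[k] = j

    # Walk the back-pointers from the last pile to recover the anchors.
    anchors = []
    n = pile_tops[-1] if pile_tops else None
    while n is not None:
        anchors.append(pairs[n])
        n = prev[n]
    return anchors[::-1]


old = ["aaaaaaaa"] * 3 + ["bbbbbbbb"] * 3 + ["xyz"]
new = ["xyz"] + ["aaaaaaaa"] * 3 + ["bbbbbbbb"] * 3
print(patience_anchors(old, new))  # [(6, 0)]
```

In the example file, "xyz" is the only line that occurs once on both sides, so it becomes the single anchor. Everything before it in the old file is reported as removed and everything after it in the new file as added, which is the shape of the --patience output shown above.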


Re: git diff --patience

From: Greg Stark
Date:
On Fri, Oct 1, 2010 at 7:15 AM, Kevin Grittner
<Kevin.Grittner@wicourts.gov> wrote:
> An interesting exercise is to think about what
> real-life lines you could have which would have multiple occurrences
> in this pattern, and think about whether you would then prefer the
> --patience output, especially if this were part of a larger file.

The linux-kernel mailing list had examples of this occurring in real
life too. In real C programs, function signatures usually end up being
the unique lines, which is what you want, but it can happen that
surprising lines are unique. Even braces can be unique if a given
indentation level is only used once.

The discussion basically convinced me that using uniqueness alone is a
bad idea, but that the basic idea of trying to identify the important
lines is a fine one. It's just that uniqueness turns out to be a
relatively weak signal for interesting lines. Linus suggested
line length, but it's pretty debatable which is better.

-- 
greg
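
The point about braces can be checked with a small Python sketch. The C fragment is invented for illustration, and ranking by stripped line length is only one crude reading of the line-length suggestion mentioned above, not a description of any actual proposal or implementation.

```python
from collections import Counter

# Hypothetical C fragment, invented for illustration only.
source = [
    "static int",
    "add(int a, int b)",
    "{",
    "    return a + b;",
    "}",
    "static int",
    "clamp(int v, int lo)",
    "{",
    "    if (v < lo)",
    "    {",
    "        v = lo;",
    "    }",
    "    return v;",
    "}",
]

# Lines that occur exactly once are the candidates for anchors.
counts = Counter(source)
unique = [line for line in source if counts[line] == 1]

# Assumption: approximate "interesting" by stripped length, longest first.
ranked = sorted(unique, key=lambda line: len(line.strip()), reverse=True)

for line in ranked:
    print(repr(line))
```

The function signatures are unique, as hoped, but so are the indented "{" and "}" lines, because that indentation level is used only once. Ranking by length pushes those two to the bottom of the list.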