Обсуждение: Rewritten rows on unchanged values

Поиск
Список
Период
Сортировка

Rewritten rows on unchanged values

От
Bertrand Janin
Дата:
I noticed how rows were re-written to a different location (new ctid) even
without changes to the values. This illustrate what I mean:

    CREATE TABLE demo (id serial, value text);

    -- generate a few pages of dummy data
    INSERT INTO demo (value)
    SELECT md5(s.a::text)
    FROM generate_series(1, 1000) AS s(a);

    -- ctid = (0,1)
    SELECT id, xmin, ctid, value
    FROM demo
    WHERE id = 1;

    UPDATE demo
    SET value = value
    WHERE id = 1;

    -- ctid = (8,41)
    SELECT id, xmin, ctid, value
    FROM demo
    WHERE id = 1;

I'm curious as to what would prevent keeping the row where it is and maybe
change xmin in place?

Thanks,
-b


Re: Rewritten rows on unchanged values

От
Adrian Klaver
Дата:
On 03/22/2013 05:32 AM, Bertrand Janin wrote:
> I noticed how rows were re-written to a different location (new ctid) even
> without changes to the values. This illustrate what I mean:
>
>      CREATE TABLE demo (id serial, value text);
>
>      -- generate a few pages of dummy data
>      INSERT INTO demo (value)
>      SELECT md5(s.a::text)
>      FROM generate_series(1, 1000) AS s(a);
>
>      -- ctid = (0,1)
>      SELECT id, xmin, ctid, value
>      FROM demo
>      WHERE id = 1;
>
>      UPDATE demo
>      SET value = value
>      WHERE id = 1;
>
>      -- ctid = (8,41)
>      SELECT id, xmin, ctid, value
>      FROM demo
>      WHERE id = 1;
>
> I'm curious as to what would prevent keeping the row where it is and maybe
> change xmin in place?

Because Postgres uses MVCC:

http://www.postgresql.org/docs/9.2/static/mvcc-intro.html

So an update is a delete and an insert and you are really seeing a new row.

>
> Thanks,
> -b
>
>


--
Adrian Klaver
adrian.klaver@gmail.com


Re: Rewritten rows on unchanged values

От
Albe Laurenz
Дата:
Adrian Klaver wrote:
> On 03/22/2013 05:32 AM, Bertrand Janin wrote:
>> I noticed how rows were re-written to a different location (new ctid) even
>> without changes to the values. This illustrate what I mean:

>>      -- ctid = (0,1)
>>      SELECT id, xmin, ctid, value
>>      FROM demo
>>      WHERE id = 1;
>>
>>      UPDATE demo
>>      SET value = value
>>      WHERE id = 1;
>>
>>      -- ctid = (8,41)
>>      SELECT id, xmin, ctid, value
>>      FROM demo
>>      WHERE id = 1;
>>
>> I'm curious as to what would prevent keeping the row where it is and maybe
>> change xmin in place?
> 
> Because Postgres uses MVCC:
> 
> http://www.postgresql.org/docs/9.2/static/mvcc-intro.html
> 
> So an update is a delete and an insert and you are really seeing a new row.

You could use a BEFORE UPDATE trigger that returns
NULL if all fields are unchanged, but I'm not sure if
that is what the OP is looking for.

Yours,
Laurenz Albe

Re: Rewritten rows on unchanged values

От
Ryan Kelly
Дата:
On Fri, Mar 03/22/13, 2013 at 06:16:11AM -0700, Adrian Klaver wrote:
> On 03/22/2013 05:32 AM, Bertrand Janin wrote:
> >I noticed how rows were re-written to a different location (new ctid) even
> >without changes to the values. This illustrate what I mean:
> >
> >     CREATE TABLE demo (id serial, value text);
> >
> >     -- generate a few pages of dummy data
> >     INSERT INTO demo (value)
> >     SELECT md5(s.a::text)
> >     FROM generate_series(1, 1000) AS s(a);
> >
> >     -- ctid = (0,1)
> >     SELECT id, xmin, ctid, value
> >     FROM demo
> >     WHERE id = 1;
> >
> >     UPDATE demo
> >     SET value = value
> >     WHERE id = 1;
> >
> >     -- ctid = (8,41)
> >     SELECT id, xmin, ctid, value
> >     FROM demo
> >     WHERE id = 1;
> >
> >I'm curious as to what would prevent keeping the row where it is and maybe
> >change xmin in place?
>
> Because Postgres uses MVCC:
>
> http://www.postgresql.org/docs/9.2/static/mvcc-intro.html
>
> So an update is a delete and an insert and you are really seeing a new row.
>

I'm having trouble understanding why it is necessary to generate a new
tuple even when nothing has changed. It seems that the OP understands
that MVCC is at work, but is questioning why this exact behavior occurs.
I too have the same question.

Perhaps you could provide an example where an replacing the tuple would
be required in the presence of multiple transactions?

-Ryan Kelly



Re: Rewritten rows on unchanged values

От
Tom Lane
Дата:
Ryan Kelly <rpkelly22@gmail.com> writes:
> I'm having trouble understanding why it is necessary to generate a new
> tuple even when nothing has changed. It seems that the OP understands
> that MVCC is at work, but is questioning why this exact behavior occurs.
> I too have the same question.

It's not *necessary* to do so.  However, avoiding it would require
sitting there and comparing the old and new tuples, which would be a
waste of cycles for most applications, which don't do useless updates.
We'd be trading off a small loss on every update for a moderate gain
on a few updates, with all of the benefit going to poorly-coded
applications.  This has come up before and we've always judged that
it would be a net loss not gain of performance to check for useless
updates.

            regards, tom lane


Re: Rewritten rows on unchanged values

От
Adrian Klaver
Дата:
On 03/22/2013 06:41 AM, Ryan Kelly wrote:
> On Fri, Mar 03/22/13, 2013 at 06:16:11AM -0700, Adrian Klaver wrote:
>> On 03/22/2013 05:32 AM, Bertrand Janin wrote:
>>> I noticed how rows were re-written to a different location (new ctid) even
>>> without changes to the values. This illustrate what I mean:
>>>
>>>      CREATE TABLE demo (id serial, value text);
>>>
>>>      -- generate a few pages of dummy data
>>>      INSERT INTO demo (value)
>>>      SELECT md5(s.a::text)
>>>      FROM generate_series(1, 1000) AS s(a);
>>>
>>>      -- ctid = (0,1)
>>>      SELECT id, xmin, ctid, value
>>>      FROM demo
>>>      WHERE id = 1;
>>>
>>>      UPDATE demo
>>>      SET value = value
>>>      WHERE id = 1;
>>>
>>>      -- ctid = (8,41)
>>>      SELECT id, xmin, ctid, value
>>>      FROM demo
>>>      WHERE id = 1;
>>>
>>> I'm curious as to what would prevent keeping the row where it is and maybe
>>> change xmin in place?
>>
>> Because Postgres uses MVCC:
>>
>> http://www.postgresql.org/docs/9.2/static/mvcc-intro.html
>>
>> So an update is a delete and an insert and you are really seeing a new row.
>>
>
> I'm having trouble understanding why it is necessary to generate a new
> tuple even when nothing has changed. It seems that the OP understands
> that MVCC is at work, but is questioning why this exact behavior occurs.
> I too have the same question.


I don't have the answer but Tom Lane does, from:

http://postgresql.1045698.n5.nabble.com/GENERAL-Update-on-tables-when-the-row-doesn-t-change-td1844002.html

"Because testing for this would almost surely be a net loss for the vast
majority of applications.  Checking to see if the new row value exactly
equals the old is hardly a zero-cost operation; if you pay that on every
update, that's a lot of overhead that you are hoping to make back by
sometimes avoiding the physical store of the new tuple.  In most
applications I think the "sometimes" isn't going to be often enough
to justify doing it.

If you have a particular table in a particular app where it is worth it,
I'd recommend writing a BEFORE UPDATE trigger to make the comparisons
and suppress the update when NEW and OLD are equal. "

>
> Perhaps you could provide an example where an replacing the tuple would
> be required in the presence of multiple transactions?

I am not sure what you are asking above?

>
> -Ryan Kelly
>
>
>


--
Adrian Klaver
adrian.klaver@gmail.com


Re: Rewritten rows on unchanged values

От
Kevin Grittner
Дата:
Ryan Kelly <rpkelly22@gmail.com> wrote:

> I'm having trouble understanding why it is necessary to generate a new
> tuple even when nothing has changed. It seems that the OP understands
> that MVCC is at work, but is questioning why this exact behavior occurs.
> I too have the same question.
>
> Perhaps you could provide an example where an replacing the tuple would
> be required in the presence of multiple transactions?

Well, someone might use such an update for concurrency control
purposes, since the SELECT FOR UPDATE in PostgreSQL isn't as strong
as it is in some other products.  An actual UPDATE will generate a
write conflict in some circumstances where SELECT FOR UPDATE will
not.  Aside from that, I think it's mainly an issue of performance
-- it is slower to process an UPDATE command and check whether any
of the columns in the SET list are actually being set to a new
value than to not generate the UPDATE in the first place.  There is
a trigger function to do this extra work for those who need it;
take a look at the suppress_redundant_updates_trigger() function:

http://www.postgresql.org/docs/current/interactive/functions-trigger.html

--
Kevin Grittner
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


Re: Rewritten rows on unchanged values

От
Hannes Erven
Дата:
Hi folks,


On 2013-03-22 13:32, Bertrand Janin wrote:
 >    UPDATE demo
 >    SET value = value
 >    WHERE id = 1;


On 2013-03-22 14:55, Tom Lane wrote:
 >
> It's not *necessary* to do so.  However, avoiding it would require
> sitting there and comparing the old and new tuples,


But in this case, no tuples would need to be compared: just by looking
at the query is becomes clear that it is a no-op.
Isn't this something the planner is or could be aware of?

Betrand, out of curiosity: is this a synthentic test case or do you
really have an application that generates queries like this?


Best regards,

    -hannes


Re: Rewritten rows on unchanged values

От
Bertrand Janin
Дата:
Hannes Erven wrote :
> Hi folks,
>
>
> On 2013-03-22 13:32, Bertrand Janin wrote:
> >    UPDATE demo
> >    SET value = value
> >    WHERE id = 1;
>
>
> On 2013-03-22 14:55, Tom Lane wrote:
> >
> >It's not *necessary* to do so.  However, avoiding it would require
> >sitting there and comparing the old and new tuples,
>
>
> But in this case, no tuples would need to be compared: just by
> looking at the query is becomes clear that it is a no-op.
> Isn't this something the planner is or could be aware of?
>
> Betrand, out of curiosity: is this a synthentic test case or do you
> really have an application that generates queries like this?

I used the value=value to emphasize the example, in my use case it would be
value='whatever is already there'.

Tom is right, this would be an optimization for a corner case, I noticed this
when running a generated script for a batch update that wasn't given a ton of
attention. The BEFORE UPDATE trigger will work great.

-b


Re: Rewritten rows on unchanged values

От
Tom Lane
Дата:
Hannes Erven <hannes@erven.at> writes:
> On 2013-03-22 13:32, Bertrand Janin wrote:
>>> UPDATE demo
>>> SET value = value
>>> WHERE id = 1;

> On 2013-03-22 14:55, Tom Lane wrote:
>> It's not *necessary* to do so.  However, avoiding it would require
>> sitting there and comparing the old and new tuples,

> But in this case, no tuples would need to be compared: just by looking
> at the query is becomes clear that it is a no-op.

That's not clear at all; for example, there might be a BEFORE UPDATE
trigger that changes the row some more (perhaps by updating a timestamp
column, for instance).  To avoid changing the behavior of triggers,
any optimization like this would have to be performed at a very low
level when the new tuple is just about to be written to disk.

Another factor in all this is that if you have an application that's in
the habit of issuing useless updates often enough for the situation to
be worth optimizing, it'd be far better to fix the application to not
issue such commands in the first place.  No amount of short-circuiting
in the planner or executor can buy back the costs of transmitting and
parsing such queries.

            regards, tom lane


Re: Rewritten rows on unchanged values

От
Jasen Betts
Дата:
On 2013-03-22, Ryan Kelly <rpkelly22@gmail.com> wrote:
> On Fri, Mar 03/22/13, 2013 at 06:16:11AM -0700, Adrian Klaver wrote:
>> On 03/22/2013 05:32 AM, Bertrand Janin wrote:
>> >I noticed how rows were re-written to a different location (new ctid) even
>> >without changes to the values. This illustrate what I mean:
>> >
>> >     CREATE TABLE demo (id serial, value text);
>> >
>> >     -- generate a few pages of dummy data
>> >     INSERT INTO demo (value)
>> >     SELECT md5(s.a::text)
>> >     FROM generate_series(1, 1000) AS s(a);
>> >
>> >     -- ctid = (0,1)
>> >     SELECT id, xmin, ctid, value
>> >     FROM demo
>> >     WHERE id = 1;
>> >
>> >     UPDATE demo
>> >     SET value = value
>> >     WHERE id = 1;
>> >
>> >     -- ctid = (8,41)
>> >     SELECT id, xmin, ctid, value
>> >     FROM demo
>> >     WHERE id = 1;
>> >
>> >I'm curious as to what would prevent keeping the row where it is and maybe
>> >change xmin in place?
>>
>> Because Postgres uses MVCC:
>>
>> http://www.postgresql.org/docs/9.2/static/mvcc-intro.html
>>
>> So an update is a delete and an insert and you are really seeing a new row.
>>
>
> I'm having trouble understanding why it is necessary to generate a new
> tuple even when nothing has changed. It seems that the OP understands
> that MVCC is at work, but is questioning why this exact behavior occurs.
> I too have the same question.
>
> Perhaps you could provide an example where an replacing the tuple would
> be required in the presence of multiple transactions?

   create temp table foo as select 1::integer as x ;
   create temp table bar as select 1::integer as y ;
   create temp table goo as select 1::integer as z ;


   A                          B
   begin transaction;        begin transaction;
   update foo set x=1;
                             update bar set y=3;
                             update foo set x=2;
   update goo set z=3;
   commit;
                             commit;

   One of those has to fail.
   if you discard the "update foo set x=1;"
   bot can succeed and you end up with an inconsistant
   state.


--
⚂⚃ 100% natural

Re: Rewritten rows on unchanged values

От
Chris Angelico
Дата:
On Sat, Mar 23, 2013 at 1:50 AM, Bertrand Janin <b@janin.com> wrote:
> Tom is right, this would be an optimization for a corner case, I noticed this
> when running a generated script for a batch update that wasn't given a ton of
> attention. The BEFORE UPDATE trigger will work great.

If you know the app, just tweak its one query:

    UPDATE demo
    SET value = some_value
    WHERE id = 1 AND value != some_value;

ChrisA