Обсуждение: Rewritten rows on unchanged values
I noticed how rows were re-written to a different location (new ctid) even without changes to the values. This illustrate what I mean: CREATE TABLE demo (id serial, value text); -- generate a few pages of dummy data INSERT INTO demo (value) SELECT md5(s.a::text) FROM generate_series(1, 1000) AS s(a); -- ctid = (0,1) SELECT id, xmin, ctid, value FROM demo WHERE id = 1; UPDATE demo SET value = value WHERE id = 1; -- ctid = (8,41) SELECT id, xmin, ctid, value FROM demo WHERE id = 1; I'm curious as to what would prevent keeping the row where it is and maybe change xmin in place? Thanks, -b
On 03/22/2013 05:32 AM, Bertrand Janin wrote: > I noticed how rows were re-written to a different location (new ctid) even > without changes to the values. This illustrate what I mean: > > CREATE TABLE demo (id serial, value text); > > -- generate a few pages of dummy data > INSERT INTO demo (value) > SELECT md5(s.a::text) > FROM generate_series(1, 1000) AS s(a); > > -- ctid = (0,1) > SELECT id, xmin, ctid, value > FROM demo > WHERE id = 1; > > UPDATE demo > SET value = value > WHERE id = 1; > > -- ctid = (8,41) > SELECT id, xmin, ctid, value > FROM demo > WHERE id = 1; > > I'm curious as to what would prevent keeping the row where it is and maybe > change xmin in place? Because Postgres uses MVCC: http://www.postgresql.org/docs/9.2/static/mvcc-intro.html So an update is a delete and an insert and you are really seeing a new row. > > Thanks, > -b > > -- Adrian Klaver adrian.klaver@gmail.com
Adrian Klaver wrote: > On 03/22/2013 05:32 AM, Bertrand Janin wrote: >> I noticed how rows were re-written to a different location (new ctid) even >> without changes to the values. This illustrate what I mean: >> -- ctid = (0,1) >> SELECT id, xmin, ctid, value >> FROM demo >> WHERE id = 1; >> >> UPDATE demo >> SET value = value >> WHERE id = 1; >> >> -- ctid = (8,41) >> SELECT id, xmin, ctid, value >> FROM demo >> WHERE id = 1; >> >> I'm curious as to what would prevent keeping the row where it is and maybe >> change xmin in place? > > Because Postgres uses MVCC: > > http://www.postgresql.org/docs/9.2/static/mvcc-intro.html > > So an update is a delete and an insert and you are really seeing a new row. You could use a BEFORE UPDATE trigger that returns NULL if all fields are unchanged, but I'm not sure if that is what the OP is looking for. Yours, Laurenz Albe
On Fri, Mar 03/22/13, 2013 at 06:16:11AM -0700, Adrian Klaver wrote: > On 03/22/2013 05:32 AM, Bertrand Janin wrote: > >I noticed how rows were re-written to a different location (new ctid) even > >without changes to the values. This illustrate what I mean: > > > > CREATE TABLE demo (id serial, value text); > > > > -- generate a few pages of dummy data > > INSERT INTO demo (value) > > SELECT md5(s.a::text) > > FROM generate_series(1, 1000) AS s(a); > > > > -- ctid = (0,1) > > SELECT id, xmin, ctid, value > > FROM demo > > WHERE id = 1; > > > > UPDATE demo > > SET value = value > > WHERE id = 1; > > > > -- ctid = (8,41) > > SELECT id, xmin, ctid, value > > FROM demo > > WHERE id = 1; > > > >I'm curious as to what would prevent keeping the row where it is and maybe > >change xmin in place? > > Because Postgres uses MVCC: > > http://www.postgresql.org/docs/9.2/static/mvcc-intro.html > > So an update is a delete and an insert and you are really seeing a new row. > I'm having trouble understanding why it is necessary to generate a new tuple even when nothing has changed. It seems that the OP understands that MVCC is at work, but is questioning why this exact behavior occurs. I too have the same question. Perhaps you could provide an example where an replacing the tuple would be required in the presence of multiple transactions? -Ryan Kelly
Ryan Kelly <rpkelly22@gmail.com> writes: > I'm having trouble understanding why it is necessary to generate a new > tuple even when nothing has changed. It seems that the OP understands > that MVCC is at work, but is questioning why this exact behavior occurs. > I too have the same question. It's not *necessary* to do so. However, avoiding it would require sitting there and comparing the old and new tuples, which would be a waste of cycles for most applications, which don't do useless updates. We'd be trading off a small loss on every update for a moderate gain on a few updates, with all of the benefit going to poorly-coded applications. This has come up before and we've always judged that it would be a net loss not gain of performance to check for useless updates. regards, tom lane
On 03/22/2013 06:41 AM, Ryan Kelly wrote: > On Fri, Mar 03/22/13, 2013 at 06:16:11AM -0700, Adrian Klaver wrote: >> On 03/22/2013 05:32 AM, Bertrand Janin wrote: >>> I noticed how rows were re-written to a different location (new ctid) even >>> without changes to the values. This illustrate what I mean: >>> >>> CREATE TABLE demo (id serial, value text); >>> >>> -- generate a few pages of dummy data >>> INSERT INTO demo (value) >>> SELECT md5(s.a::text) >>> FROM generate_series(1, 1000) AS s(a); >>> >>> -- ctid = (0,1) >>> SELECT id, xmin, ctid, value >>> FROM demo >>> WHERE id = 1; >>> >>> UPDATE demo >>> SET value = value >>> WHERE id = 1; >>> >>> -- ctid = (8,41) >>> SELECT id, xmin, ctid, value >>> FROM demo >>> WHERE id = 1; >>> >>> I'm curious as to what would prevent keeping the row where it is and maybe >>> change xmin in place? >> >> Because Postgres uses MVCC: >> >> http://www.postgresql.org/docs/9.2/static/mvcc-intro.html >> >> So an update is a delete and an insert and you are really seeing a new row. >> > > I'm having trouble understanding why it is necessary to generate a new > tuple even when nothing has changed. It seems that the OP understands > that MVCC is at work, but is questioning why this exact behavior occurs. > I too have the same question. I don't have the answer but Tom Lane does, from: http://postgresql.1045698.n5.nabble.com/GENERAL-Update-on-tables-when-the-row-doesn-t-change-td1844002.html "Because testing for this would almost surely be a net loss for the vast majority of applications. Checking to see if the new row value exactly equals the old is hardly a zero-cost operation; if you pay that on every update, that's a lot of overhead that you are hoping to make back by sometimes avoiding the physical store of the new tuple. In most applications I think the "sometimes" isn't going to be often enough to justify doing it. If you have a particular table in a particular app where it is worth it, I'd recommend writing a BEFORE UPDATE trigger to make the comparisons and suppress the update when NEW and OLD are equal. " > > Perhaps you could provide an example where an replacing the tuple would > be required in the presence of multiple transactions? I am not sure what you are asking above? > > -Ryan Kelly > > > -- Adrian Klaver adrian.klaver@gmail.com
Ryan Kelly <rpkelly22@gmail.com> wrote: > I'm having trouble understanding why it is necessary to generate a new > tuple even when nothing has changed. It seems that the OP understands > that MVCC is at work, but is questioning why this exact behavior occurs. > I too have the same question. > > Perhaps you could provide an example where an replacing the tuple would > be required in the presence of multiple transactions? Well, someone might use such an update for concurrency control purposes, since the SELECT FOR UPDATE in PostgreSQL isn't as strong as it is in some other products. An actual UPDATE will generate a write conflict in some circumstances where SELECT FOR UPDATE will not. Aside from that, I think it's mainly an issue of performance -- it is slower to process an UPDATE command and check whether any of the columns in the SET list are actually being set to a new value than to not generate the UPDATE in the first place. There is a trigger function to do this extra work for those who need it; take a look at the suppress_redundant_updates_trigger() function: http://www.postgresql.org/docs/current/interactive/functions-trigger.html -- Kevin Grittner EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Hi folks, On 2013-03-22 13:32, Bertrand Janin wrote: > UPDATE demo > SET value = value > WHERE id = 1; On 2013-03-22 14:55, Tom Lane wrote: > > It's not *necessary* to do so. However, avoiding it would require > sitting there and comparing the old and new tuples, But in this case, no tuples would need to be compared: just by looking at the query is becomes clear that it is a no-op. Isn't this something the planner is or could be aware of? Betrand, out of curiosity: is this a synthentic test case or do you really have an application that generates queries like this? Best regards, -hannes
Hannes Erven wrote : > Hi folks, > > > On 2013-03-22 13:32, Bertrand Janin wrote: > > UPDATE demo > > SET value = value > > WHERE id = 1; > > > On 2013-03-22 14:55, Tom Lane wrote: > > > >It's not *necessary* to do so. However, avoiding it would require > >sitting there and comparing the old and new tuples, > > > But in this case, no tuples would need to be compared: just by > looking at the query is becomes clear that it is a no-op. > Isn't this something the planner is or could be aware of? > > Betrand, out of curiosity: is this a synthentic test case or do you > really have an application that generates queries like this? I used the value=value to emphasize the example, in my use case it would be value='whatever is already there'. Tom is right, this would be an optimization for a corner case, I noticed this when running a generated script for a batch update that wasn't given a ton of attention. The BEFORE UPDATE trigger will work great. -b
Hannes Erven <hannes@erven.at> writes: > On 2013-03-22 13:32, Bertrand Janin wrote: >>> UPDATE demo >>> SET value = value >>> WHERE id = 1; > On 2013-03-22 14:55, Tom Lane wrote: >> It's not *necessary* to do so. However, avoiding it would require >> sitting there and comparing the old and new tuples, > But in this case, no tuples would need to be compared: just by looking > at the query is becomes clear that it is a no-op. That's not clear at all; for example, there might be a BEFORE UPDATE trigger that changes the row some more (perhaps by updating a timestamp column, for instance). To avoid changing the behavior of triggers, any optimization like this would have to be performed at a very low level when the new tuple is just about to be written to disk. Another factor in all this is that if you have an application that's in the habit of issuing useless updates often enough for the situation to be worth optimizing, it'd be far better to fix the application to not issue such commands in the first place. No amount of short-circuiting in the planner or executor can buy back the costs of transmitting and parsing such queries. regards, tom lane
On 2013-03-22, Ryan Kelly <rpkelly22@gmail.com> wrote: > On Fri, Mar 03/22/13, 2013 at 06:16:11AM -0700, Adrian Klaver wrote: >> On 03/22/2013 05:32 AM, Bertrand Janin wrote: >> >I noticed how rows were re-written to a different location (new ctid) even >> >without changes to the values. This illustrate what I mean: >> > >> > CREATE TABLE demo (id serial, value text); >> > >> > -- generate a few pages of dummy data >> > INSERT INTO demo (value) >> > SELECT md5(s.a::text) >> > FROM generate_series(1, 1000) AS s(a); >> > >> > -- ctid = (0,1) >> > SELECT id, xmin, ctid, value >> > FROM demo >> > WHERE id = 1; >> > >> > UPDATE demo >> > SET value = value >> > WHERE id = 1; >> > >> > -- ctid = (8,41) >> > SELECT id, xmin, ctid, value >> > FROM demo >> > WHERE id = 1; >> > >> >I'm curious as to what would prevent keeping the row where it is and maybe >> >change xmin in place? >> >> Because Postgres uses MVCC: >> >> http://www.postgresql.org/docs/9.2/static/mvcc-intro.html >> >> So an update is a delete and an insert and you are really seeing a new row. >> > > I'm having trouble understanding why it is necessary to generate a new > tuple even when nothing has changed. It seems that the OP understands > that MVCC is at work, but is questioning why this exact behavior occurs. > I too have the same question. > > Perhaps you could provide an example where an replacing the tuple would > be required in the presence of multiple transactions? create temp table foo as select 1::integer as x ; create temp table bar as select 1::integer as y ; create temp table goo as select 1::integer as z ; A B begin transaction; begin transaction; update foo set x=1; update bar set y=3; update foo set x=2; update goo set z=3; commit; commit; One of those has to fail. if you discard the "update foo set x=1;" bot can succeed and you end up with an inconsistant state. -- ⚂⚃ 100% natural
On Sat, Mar 23, 2013 at 1:50 AM, Bertrand Janin <b@janin.com> wrote: > Tom is right, this would be an optimization for a corner case, I noticed this > when running a generated script for a batch update that wasn't given a ton of > attention. The BEFORE UPDATE trigger will work great. If you know the app, just tweak its one query: UPDATE demo SET value = some_value WHERE id = 1 AND value != some_value; ChrisA