Re: [EXTERNAL]: Re: [EXTERNAL]: Re: UPSERT in Postgres

Поиск
Список
Период
Сортировка
От Louis Tian
Тема Re: [EXTERNAL]: Re: [EXTERNAL]: Re: UPSERT in Postgres
Дата
Msg-id SY4P282MB1052B56685C1A94EAEB3682DA6949@SY4P282MB1052.AUSP282.PROD.OUTLOOK.COM
обсуждение исходный текст
Ответ на Re: [EXTERNAL]: Re: UPSERT in Postgres  (Adrian Klaver <adrian.klaver@aklaver.com>)
Список pgsql-general
Hi Adrian, 

> No it is not as Israel Brewster pointed out.
I think we need make a distinction between an operation and a statement here
The examples Israel presented and yours are both non-idempotent statements because the different value being upserted
eachtime the statement is executed. Not because upsert as an operation is not idempotent.  
Likewise, In HTTP, PUT method is defined as idempotent. No one would expect different PUT requests with different
payloadto be idempotent.  

Also, I don't think I am not the only one who think upsert is logically idempotent. Just a few seconds of searching on
google.
- "The upsert action is a combination of insert and update. It allows us to write idempotent statements".
(https://jonmeyers.io/blog/use-on-conflict-to-upsert-in-postgresql.) 
-"One common example of an idempotent operation is an upsert (update or insert) statement in a database."
(https://www.arecadata.com/core-data-engineering-concepts-idempotency)
- "A POST request means that the request is not idempotent, but an "upsert" request is idempotent."
(https://softwareengineering.stackexchange.com/questions/426225/rest-how-to-upsert-a-resource-without-an-specific-resource-url#:~:text=A%20POST%20request%20means%20that,%22upsert%22%20request%20is%20idempotent.)
While no as direct as the article above, but the issue reported in those places are essentially stem from the same
understanding. 
- https://stackoverflow.com/questions/48816629/on-conflict-do-nothing-in-postgres-with-a-not-null-constraint
- https://www.postgresql.org/message-id/flat/1996438.1604952901%40sss.pgh.pa.us#ace8adc1354265aca5672028faea0c0f
So at least I am not only one imaging things?
- "a ready made statement for idempotent data like Mongo's upsert"
(https://www.compose.com/articles/the-potency-of-idempotent-with-rabbitmq-and-mongodb-upsert/)

> And therein lies your problem, you are imagining something that does not
> exist and more to the point will most likely not exist as it would break all code that depends on above behavior.
I never argue the way postgresql does for insert ... on conflict is wrong and need to change.
I said "I think it is doing what it reads quite literally" just like in your words "It does what is advertised on the
tin".So there is no disagreement here. So I am not saying "insert on conflict" need to change.  
What I am point out here is, the "insert...on conflict do update" is not a true "upsert".
I am saying in my opinion it would be nice see UPSERT statement to be added to PostgreSQL so upsert can work "out of
box"even when there're not null constraints on that table. 

It might never happen judging from the replies I am getting but that's fine. The solution you have showed me is good
enoughas a workaround (to not depreciate it's value, just not as easy/bullet proof as it could be in my opinion).  
It was just a question just to confirm my understanding so I got what I need, so thank you all for that.


Cheers,
Louis Tian


From: Adrian Klaver <adrian.klaver@aklaver.com>
Sent: Sunday, April 9, 2023 7:51 AM
To: Louis Tian <louis.tian@aquamonix.com.au>; Peter Geoghegan <pg@bowt.ie>
Cc: pgsql-general@lists.postgresql.org <pgsql-general@lists.postgresql.org>
Subject: [EXTERNAL]: Re: [EXTERNAL]: Re: UPSERT in Postgres
 
CAUTION: This email originated from outside of Envirada. Do not click links or open attachments unless you recognize
thesender and know the content is safe. 


On 4/6/23 17:49, Louis Tian wrote:
> Hi Peter,
>
> Thanks for your reply. Appreciate the help and discussion.
>
>>   In general UPSERT (or any definition of it that I can think of) does
>> not imply idempotency.
>
> "Idempotence is the property of certain operations in mathematics and computer science whereby they can be applied
multipletimes without changing the result beyond the initial application." from Wikipedia. 
> the concept of Idempotence when applies to HTTP is consistent with the above.
https://developer.mozilla.org/en-US/docs/Glossary/Idempotent.Or are you going by a different defintion that I am not
awareof? 
> If you execute the same upsert multiple times, the state of the database remains the same as if only execute once.
> If a row already exists, the first statement will update the row so does any subsequent statements. executing the
sameupdate multiple time is the same as executing it only once. 
> If the row doesn't exist, the first statement will insert that row and any subsequent will try to update, but the
updatehas no real effect since it the value is exactly the same as the insert. 
> So by defintion, upsert is idempotent.

No it is not as Israel Brewster pointed out.

To his example I would add:

alter some_table add column ts_upsert_update timestamptz;

insert into some_table values('foo', 'bar') on conflict(tbl_id) do
update set foo_fld = excluded.foo_fld, bar_fld = some_table.bar_fld,
ts_upsert_update = now();

You are substituting whatever definition you have in your head for the
definition as it actually exists.

>
>> It could just be a unique index or a unique constraint. So you can
>> upsert on any individual unique constraint/index, or the primary key.
>> Of course there might be several on a given table, but you can only
>> use one as the "conflict arbiter" per statement.
>
> Understand that I can use any unique constraint with on conflict.
> But semantically the only correct one is the primary key, since that's what identifies a row logically.
> In that sense, any unique column(s) is a potential candidate for primary key.
> It's more of a pedantic point rather than pragmatic one.
> It's less of a problem for PostgreSQL where the semantic importance of primary key is not manifested at
implementationlevel, since all index points to the tuple directly 
> Whereas it is more import for Databaes like MySQL where the secondary index points to the primary key index.

Again you are dealing with the imagined instead of the reality. Along
that line you left out that a 'exclusion constraint violation error' can
also trigger the ON CONFLICT.


> Use some pesudo code might be helpful here to explain the difference.
>
> How on conflict works at the moment.
>
> try {
>   insert row
> } catch (duplicated key error) {
>    update row
> }
>
> How I think it upsert should work

And therein lies your problem, you are imagining something that does not
exist and more to the point will most likely not exist as it would break
all code that depends on above behavior.

>
> if (new.id exists) {
>   update row
> } else {
>   insert row
> }
>

> I am not expecting an error here. The problem is with no conflict it always go down the insert path first and results
ina not null constraint error. 

To be expected, as after all the command is:

INSERT INTO <some_table> ...

> While I am expecting the insert is never executed in the first place when that row already exist (as identified by it
primarykey). So the update execute without error. 
> I hope the pesudo code above is enough to clarify the difference?
>

> Cheers,
> Louis Tian
>
>

--
Adrian Klaver
adrian.klaver@aklaver.com

--
Message  protected by MailGuard: e-mail anti-virus, anti-spam and content filtering.https://www.mailguard.com.au/mg
Click here to report this message as spam:
https://console.mailguard.com.au/ras/25am4N38I1/7rIyiJl23oANcTXYht5xW0/0



В списке pgsql-general по дате отправления:

Предыдущее
От: Louis Tian
Дата:
Сообщение: Re: [EXTERNAL]: Re: [EXTERNAL]: Re: UPSERT in Postgres
Следующее
От: Alban Hertroys
Дата:
Сообщение: Re: [EXTERNAL]: Re: UPSERT in Postgres