Обсуждение: BUG #14126: INSERT ON CONFLICT DO NOTHING auto increments serial primary key when no insert happens.
BUG #14126: INSERT ON CONFLICT DO NOTHING auto increments serial primary key when no insert happens.
От
cwire4@gmail.com
Дата:
The following bug has been logged on the website: Bug reference: 14126 Logged by: Casey Wireman Email address: cwire4@gmail.com PostgreSQL version: 9.5.2 Operating system: Windows 7 Description: It's unclear to me if this is desired behavior or not, but when doing the following: insert into table(column) values (value) on conflict DO NOTHING the first time a new entry is inserted it works as expected and autoincrements the serial primary key as normal, however if I try to insert this again the on conflict correctly detects that the value already exists, but internally seems to autoincrement the serial key as many times as you try to do the failed insert since on the next insert of a new value the key value is (previous successful key value + number of failed inserts) It seems to me that intuitively the key should only increment on a successful insert, not just an insert statement being executed. For one, this unnecessarily reduces the keyspace available for the column.
Re: BUG #14126: INSERT ON CONFLICT DO NOTHING auto increments serial primary key when no insert happens.
От
Francisco Olarte
Дата:
On Fri, May 6, 2016 at 8:55 AM, <cwire4@gmail.com> wrote: > It's unclear to me if this is desired behavior or not, but when doing the > following: .... > (previous successful key value + number of failed inserts) > It seems to me that intuitively the key should only increment on a > successful insert, not just an insert statement being executed. For one, > this unnecessarily reduces the keyspace available for the column. I suspect this is "working as designed". The same things happen if you do a lot of inserts and then rollback a transaction. This is because serial use sequences, which are not mean to generate exact correlative values, but to a mean to generate unique keys with very high concurrency. What sequences do is grab a chunk of values ( may be of 1 or more, it depends ) per backend needing them and each time you ask for one they give you a unique value and burn it. This is great for concurrency, and it is fast. The fact they normally use correlative numbers is normally a by-product of being the easier way of generating different ids, but their purpose is not to do it exactly. Think of it, if they needed to generate exact correlative amounts anytime someone needed a number they would need to be locked until the operation using it commits or rolls back, and informs everyone. It can be done this way, but is much slower and normally not needed. Francisco Olarte.
On 05/06/16 13:39, Francisco Olarte wrote: > On Fri, May 6, 2016 at 8:55 AM, <cwire4@gmail.com> wrote: >> It's unclear to me if this is desired behavior or not, but when doing the >> following: > .... >> (previous successful key value + number of failed inserts) >> It seems to me that intuitively the key should only increment on a >> successful insert, not just an insert statement being executed. For one, >> this unnecessarily reduces the keyspace available for the column. > > I suspect this is "working as designed". The same things happen if you > do a lot of inserts and then rollback a transaction. This is because > serial use sequences, which are not mean to generate exact correlative > values, but to a mean to generate unique keys with very high > concurrency. > > What sequences do is grab a chunk of values ( may be of 1 or more, it > depends ) per backend needing them and each time you ask for one they > give you a unique value and burn it. This is great for concurrency, > and it is fast. The fact they normally use correlative numbers is > normally a by-product of being the easier way of generating different > ids, but their purpose is not to do it exactly. > > Think of it, if they needed to generate exact correlative amounts > anytime someone needed a number they would need to be locked until the > operation using it commits or rolls back, and informs everyone. It can > be done this way, but is much slower and normally not needed. > > Francisco Olarte. A bit more explanation: default value for serial columns are gotten from a sequence not at saving tuple in a storage, but _before_ tuple tries to pass triggers, checks (constraints, PKs, uniqueness, not nulls etc.) and inserts it to a table. And, as Francisco mentioned, sequence being increased doesn't reverts if transaction rolls back or insert does nothing. That's why its value monotonically increased even if number of real insertions is not changed. -- Best regards, Vitaly Burovoy
On Fri, May 6, 2016 at 6:39 AM, Francisco Olarte <folarte@peoplecall.com> wrote: >> (previous successful key value + number of failed inserts) >> It seems to me that intuitively the key should only increment on a >> successful insert, not just an insert statement being executed. For one, >> this unnecessarily reduces the keyspace available for the column. > > I suspect this is "working as designed". The same things happen if you > do a lot of inserts and then rollback a transaction. This is because > serial use sequences, which are not mean to generate exact correlative > values, but to a mean to generate unique keys with very high > concurrency. It is. There have been several -bugs threads on this already. e.g., the following two: http://www.postgresql.org/message-id/flat/CAM3SWZQMfR6Zfe3A0Nr4ddko8xZrijAuQQ=EcGjGeJSs2piAXA@mail.gmail.com#CAM3SWZQMfR6Zfe3A0Nr4ddko8xZrijAuQQ=EcGjGeJSs2piAXA@mail.gmail.com http://www.postgresql.org/message-id/flat/CAM3SWZS2jd1Ci2TR2bjBXqZgK3JdoFin9yGbC5TQdYdF12RBuA@mail.gmail.com#CAM3SWZS2jd1Ci2TR2bjBXqZgK3JdoFin9yGbC5TQdYdF12RBuA@mail.gmail.com -- Peter Geoghegan
On 2016-05-06 11:18:24 -0700, Peter Geoghegan wrote: > On Fri, May 6, 2016 at 6:39 AM, Francisco Olarte <folarte@peoplecall.com> wrote: > >> (previous successful key value + number of failed inserts) > >> It seems to me that intuitively the key should only increment on a > >> successful insert, not just an insert statement being executed. For one, > >> this unnecessarily reduces the keyspace available for the column. > > > > I suspect this is "working as designed". The same things happen if you > > do a lot of inserts and then rollback a transaction. This is because > > serial use sequences, which are not mean to generate exact correlative > > values, but to a mean to generate unique keys with very high > > concurrency. > > It is. There have been several -bugs threads on this already. e.g., > the following two: > > http://www.postgresql.org/message-id/flat/CAM3SWZQMfR6Zfe3A0Nr4ddko8xZrijAuQQ=EcGjGeJSs2piAXA@mail.gmail.com#CAM3SWZQMfR6Zfe3A0Nr4ddko8xZrijAuQQ=EcGjGeJSs2piAXA@mail.gmail.com > > http://www.postgresql.org/message-id/flat/CAM3SWZS2jd1Ci2TR2bjBXqZgK3JdoFin9yGbC5TQdYdF12RBuA@mail.gmail.com#CAM3SWZS2jd1Ci2TR2bjBXqZgK3JdoFin9yGbC5TQdYdF12RBuA@mail.gmail.com Maybe we should add a note to the docs then? Andres
On Fri, May 6, 2016 at 11:20 AM, Andres Freund <andres@anarazel.de> wrote: > Maybe we should add a note to the docs then? I was just about to suggest that. It's getting out of hand. I'll look into it. -- Peter Geoghegan
Re: BUG #14126: INSERT ON CONFLICT DO NOTHING auto increments serial primary key when no insert happens.
От
Francisco Olarte
Дата:
On Fri, May 6, 2016 at 8:20 PM, Andres Freund <andres@anarazel.de> wrote: > Maybe we should add a note to the docs then? Well, all I now about sequences comes from there, so I think they are clear. But I've read the docs and noticed the warning: "Important: To avoid blocking concurrent transactions that obtain numbers from the same sequence, a nextval operation is never rolled back; that is, once a value has been fetched it is considered used, even if the transaction that did the nextval later aborts. This means that aborted transactions might leave unused "holes" in the sequence of assigned values." Which may lead people to think only rollbacks 'burn' numbers. So I think some rewording indicating: 1.- once fetched a value is considered used even if the using command does not complete, which may be caused by rollbacks, aborts, triggers, insert_or_update and/or any other things. 2.- IIRC using cache>1 can lead to obtain sequence numbers out of order is using the same sequence in two different sessions. I assumed this is normal and do not have a problem with it, but people may get surprised by it, so I think a note there could help. I do not trust my english to suggest an exact rewording, but I think this could lead to less reports AND make many of them solvable by redirecting at the manual. Francisco Olarte.
Francisco Olarte <folarte@peoplecall.com> writes: > But I've read the docs and noticed the warning: > "Important: To avoid blocking concurrent transactions that obtain > numbers from the same sequence, a nextval operation is never rolled > back; that is, once a value has been fetched it is considered used, > even if the transaction that did the nextval later aborts. This means > that aborted transactions might leave unused "holes" in the sequence > of assigned values." > Which may lead people to think only rollbacks 'burn' numbers. Yeah, good point. I'll do some wordsmithing on that. > 2.- IIRC using cache>1 can lead to obtain sequence numbers out of > order is using the same sequence in two different sessions. I assumed > this is normal and do not have a problem with it, but people may get > surprised by it, so I think a note there could help. I think that's explained already, no? regards, tom lane
Re: BUG #14126: INSERT ON CONFLICT DO NOTHING auto increments serial primary key when no insert happens.
От
Francisco Olarte
Дата:
On Sat, May 7, 2016 at 5:07 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote: > Francisco Olarte <folarte@peoplecall.com> writes: >> 2.- IIRC using cache>1 can lead to obtain sequence numbers out of >> order is using the same sequence in two different sessions. ...... >> so I think a note there could help. > I think that's explained already, no? Yes, it is. May be a little hard to find but it certainly is, my fault. I've repeated my docs reading ( I did on the first reading ). First I went to 8.1.3 datatypes/serial, which pointed me "see nextval() in section 9.16 for details.". There I found "with default parameters, successive nextval calls will return successive values beginning with 1. Other behaviors can be obtained by using special parameters in the CREATE SEQUENCE command; see its command reference page for more information.". Then I rescaned that page AND missed the Notes. My fault. A link to that note or to create sequence in the nextval() page may be useful, but maybe I lost it because being familiar with the behaviour I scanned too fast. Francisco Olarte.
On Fri, May 6, 2016 at 11:21 AM, Peter Geoghegan <pg@heroku.com> wrote: > I'll look into it. Looks like Tom took care of it. -- Peter Geoghegan