Обсуждение: [GENERAL] Serializable isolation -- are predicate locks still held across alldatabases?
[GENERAL] Serializable isolation -- are predicate locks still held across alldatabases?
От
"Karl O. Pinc"
Дата:
Hi, I forget all the details, but some time ago I found that I had to increase max_pred_locs_per_transaction. What I recall about the reason for this is that I'm using the serializable transaction isolation, and that I've a test database which occasionally has extremely long running transactions. The PG serializable snapshot isolation implementation at the time (9.1?) was holding predicate locks across all databases during transactions. This even though databases are independent of each other. The long transaction times in the test database lead to predicate lock exhaustion in production databases -- only a single transaction would be executing in the test database but many would occur in the production databases. (I don't know if there was potential for other bad effects due to the production transactions "hanging around" until the transaction in the test db finished.) My question is whether this has changed. Does PG now pay attention to database in it's SSI implementation? Thanks for the help and apologies if I'm not framing the question perfectly. It's not often I think about this. Regards, Karl <kop@meme.com> Free Software: "You don't pay back, you pay forward." -- Robert A. Heinlein
Re: [GENERAL] Serializable isolation -- are predicate locks stillheld across all databases?
От
Kevin Grittner
Дата:
On Thu, May 18, 2017 at 11:07 AM, Karl O. Pinc <kop@meme.com> wrote: > I forget all the details, but some time ago I found > that I had to increase max_pred_locs_per_transaction. > What I recall about the reason for this is that I'm > using the serializable transaction isolation, and that > I've a test database which occasionally has extremely > long running transactions. The PG serializable > snapshot isolation implementation at the time (9.1?) > was holding predicate locks across all databases > during transactions. This even though databases > are independent of each other. The long transaction > times in the test database lead to predicate lock > exhaustion in production databases -- only a single > transaction would be executing in the test database > but many would occur in the production databases. > (I don't know if there was potential for other bad effects > due to the production transactions "hanging around" until the > transaction in the test db finished.) > > My question is whether this has changed. Does PG > now pay attention to database in it's SSI implementation? Well, it pays attention as far as the scope of each lock, but there is only one variable to track how far back the oldest transaction ID for a running serializable transaction goes, which is used in cleanup of old locks. I see your point, and it might be feasible to change that to a list or map that tracks it by database; but I don't even have a gut feel estimate for the scale of such work without investigating it. Just out of curiosity, what is the reason you don't move the production and test databases to separate instances? If nothing else, extremely long-running transaction in one database can lead to bloat in others. > Thanks for the help and apologies if I'm not framing > the question perfectly. It's not often I think about > this. No sweat -- your concern/question is perfectly clear. It's the first time I've heard of someone with this particular issue, so at this point I'm inclined to recommend the workaround of using a separate cluster; but if we get other reports it might be worth adding to the list of enhancements that SSI could use. Thanks! -- Kevin Grittner VMware vCenter Server https://www.vmware.com/
Re: [GENERAL] Serializable isolation -- are predicate locks stillheld across all databases?
От
"Karl O. Pinc"
Дата:
On Thu, 18 May 2017 12:04:42 -0500 Kevin Grittner <kgrittn@gmail.com> wrote: > On Thu, May 18, 2017 at 11:07 AM, Karl O. Pinc <kop@meme.com> wrote: > > > ... Does PG > > now pay attention to database in it's SSI implementation? > > Well, it pays attention as far as the scope of each lock, but there > is only one variable to track how far back the oldest transaction ID > for a running serializable transaction goes, which is used in > cleanup of old locks. I see your point, and it might be feasible to > change that to a list or map that tracks it by database; but I don't > even have a gut feel estimate for the scale of such work without > investigating it. Just out of curiosity, what is the reason you > don't move the production and test databases to separate instances? > If nothing else, extremely long-running transaction in one database > can lead to bloat in others. Ultimately it was easier to change the transaction isolation level to repeatable read (or lower) for the transactions known to take a long time. Any concurrency issues (which have never arisen) are handled at the human level. > > Thanks for the help and apologies if I'm not framing > > the question perfectly. It's not often I think about > > this. > > No sweat -- your concern/question is perfectly clear. It's the > first time I've heard of someone with this particular issue, so at > this point I'm inclined to recommend the workaround of using a > separate cluster; but if we get other reports it might be worth > adding to the list of enhancements that SSI could use. Understood. To give you an idea of the use-case, we're using Chado (http://gmod.org/wiki/Chado) a PG database design which stores genetic information. The datasets being what they are, they are big and take a long time to load. This is especially true because the Chado designers are enamored of ontologies and knowledge representation and so there's a lot of tables where, instead of having separate columns for different types of data there's simply 2 columns "type" and "data". The type is an oncology entry and tells you want the data is. This makes for ugly queries in the process of loading data (and ugly SQL in general). So loading genetic data sets is slow. Not really an issue as there's no anticipation of loading a data set more than every 6 months or a year. (Although non-genetic data is loaded frequently.) The workflow is to load data first into the test db, possibly multiple times until satisfied. Then load the data into production. It is very handy, especially in production, to load all related data in a single transaction in the event something goes wrong. There are many non-optimal elements, not the least of which is that it's not clear how much utility there is in storing genetic datasets in a relational db along side our non-genetic data. (We are finding out.) Thanks for the help. Karl <kop@meme.com> Free Software: "You don't pay back, you pay forward." -- Robert A. Heinlein
Re: [GENERAL] Serializable isolation -- are predicate locks stillheld across all databases?
От
"Karl O. Pinc"
Дата:
On Fri, 19 May 2017 01:52:00 -0500 "Karl O. Pinc" <kop@meme.com> wrote: > On Thu, 18 May 2017 12:04:42 -0500 > Kevin Grittner <kgrittn@gmail.com> wrote: > > > On Thu, May 18, 2017 at 11:07 AM, Karl O. Pinc <kop@meme.com> wrote: > > > > > ... Does PG > > > now pay attention to database in it's SSI implementation? > > > > Well, it pays attention as far as the scope of each lock, but there > > is only one variable to track how far back the oldest transaction ID > > for a running serializable transaction goes, which is used in > > cleanup of old locks. > > ... It's the > > first time I've heard of someone with this particular issue, so at > > this point I'm inclined to recommend the workaround of using a > > separate cluster I think if I was to make an argument for doing something it would be based on reliability -- how many users can you give their own database before somebody leaves an open transaction hanging? Karl <kop@meme.com> Free Software: "You don't pay back, you pay forward." -- Robert A. Heinlein
Re: [GENERAL] Serializable isolation -- are predicate locks stillheld across all databases?
От
Kevin Grittner
Дата:
On Fri, May 19, 2017 at 6:56 AM, Karl O. Pinc <kop@meme.com> wrote: > I think if I was to make an argument for doing something it would > be based on reliability -- how many users can you give their > own database before somebody leaves an open transaction hanging? Yeah, I guess it's worth having on the list, where it will compete with other possible enhancements on a cost/benefit basis. Thanks for raising the issue! -- Kevin Grittner VMware vCenter Server https://www.vmware.com/