Обсуждение: Default gin operator class of jsonb failing with index row size maximum reached
Default gin operator class of jsonb failing with index row size maximum reached
От
Michael Paquier
Дата:
Hi all, While doing some tests with jsonb, I found a failure as told in $subject: =# create table data_jsonb (data jsonb); CREATE TABLE =# insert into data_jsonb ... tuple in the script attached INSERT 1 =# create index data_index on data_jsonb using gin(data); ERROR: 54000: index row size 1808 exceeds maximum 1352 for index "data_index" LOCATION: GinFormTuple, ginentrypage.c:110 =# create index data_index2 on data_jsonb using gin (data jsonb_hash_ops); CREATE INDEX The data creating the failure is a tuple in a dump of geonames (http://www.geonames.org/export/), listing some geographical data, and it is caused by some arabic characters it seems used to provide translations for a given geographical location. Encoding of the database on which I have done the tests is UTF-8. Japanese, Chinese equivalents were working fine btw with this operator. Documentation of jsonb tells that jsonb documents should be kept at a reasonable size to reduce lock contention, but there is no mention of size limitation for indexes: http://www.postgresql.org/docs/devel/static/datatype-json.html A test case is attached, note as well that only the default gin operator class is failing, jsonb_hash_ops worked well. Regards, -- Michael
Вложения
Re: Default gin operator class of jsonb failing with index row size maximum reached
От
Peter Geoghegan
Дата:
On Mon, Apr 7, 2014 at 8:29 PM, Michael Paquier <michael.paquier@gmail.com> wrote: > Documentation of jsonb tells that jsonb documents should be kept at a > reasonable size to reduce lock contention, but there is no mention of > size limitation for indexes: > http://www.postgresql.org/docs/devel/static/datatype-json.html It seems like your complaint is that this warrants special consideration from the jsonb docs, due to this general limitation being particularly likely to affect jsonb users. Is that accurate? The structure of the JSON in your test case is quite atypical, since there is one huge string containing each of the translations, rather than a bunch of individual array elements (one per translation) or a bunch of object pairs. As it happens, just this morning I read that MongoDB's new version 2.6 now comes with stricter enforcement of key length: http://docs.mongodb.org/master/release-notes/2.6-compatibility/#index-key-length-incompatibility . While previous versions just silently failed to index values that were inserted, there is now a 1024 limit imposed on the total size of indexed values. -- Peter Geoghegan
Re: Default gin operator class of jsonb failing with index row size maximum reached
От
Oleg Bartunov
Дата:
We are working to avoid this limitation. On Tue, Apr 8, 2014 at 10:54 PM, Peter Geoghegan <pg@heroku.com> wrote: > On Mon, Apr 7, 2014 at 8:29 PM, Michael Paquier > <michael.paquier@gmail.com> wrote: >> Documentation of jsonb tells that jsonb documents should be kept at a >> reasonable size to reduce lock contention, but there is no mention of >> size limitation for indexes: >> http://www.postgresql.org/docs/devel/static/datatype-json.html > > It seems like your complaint is that this warrants special > consideration from the jsonb docs, due to this general limitation > being particularly likely to affect jsonb users. Is that accurate? > > The structure of the JSON in your test case is quite atypical, since > there is one huge string containing each of the translations, rather > than a bunch of individual array elements (one per translation) or a > bunch of object pairs. > > As it happens, just this morning I read that MongoDB's new version 2.6 > now comes with stricter enforcement of key length: > http://docs.mongodb.org/master/release-notes/2.6-compatibility/#index-key-length-incompatibility > . While previous versions just silently failed to index values that > were inserted, there is now a 1024 limit imposed on the total size of > indexed values. > > -- > Peter Geoghegan > > > -- > Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) > To make changes to your subscription: > http://www.postgresql.org/mailpref/pgsql-hackers
Oleg Bartunov <obartunov@gmail.com> writes: > We are working to avoid this limitation. What do you mean by that ... do you see it as something that could be fixed quickly, or is this a long-term improvement project? regards, tom lane
Re: Default gin operator class of jsonb failing with index row size maximum reached
От
Oleg Bartunov
Дата:
On Wed, Apr 9, 2014 at 1:48 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote: > Oleg Bartunov <obartunov@gmail.com> writes: >> We are working to avoid this limitation. > > What do you mean by that ... do you see it as something that could be > fixed quickly, or is this a long-term improvement project? Unfortunately, It's long-term project VODKA, we hope to use spgist (also needed some adjustment) for keys+values instead of btree. Hopefully, we'll show something at PGCon. > > regards, tom lane
Re: Default gin operator class of jsonb failing with index row size maximum reached
От
Michael Paquier
Дата:
On Wed, Apr 9, 2014 at 6:48 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote: > Oleg Bartunov <obartunov@gmail.com> writes: >> We are working to avoid this limitation. > > What do you mean by that ... do you see it as something that could be > fixed quickly, or is this a long-term improvement project? If this is a known limitation and no fix is planned for 9.4, could it be possible to document it appropriately for this release? This would surprise users. -- Michael
Re: Default gin operator class of jsonb failing with index row size maximum reached
От
Peter Geoghegan
Дата:
On Tue, Apr 8, 2014 at 4:07 PM, Michael Paquier <michael.paquier@gmail.com> wrote: > If this is a known limitation and no fix is planned for 9.4, could it > be possible to document it appropriately for this release? This would > surprise users. It looks like the default GIN opclass will be changed, so it becomes a matter of opting in to the particular set of trade-offs that the current default represents. Presumably that includes the size limitation, which isn't separately documented at present. -- Peter Geoghegan