Обсуждение: Default gin operator class of jsonb failing with index row size maximum reached

Поиск
Список
Период
Сортировка

Default gin operator class of jsonb failing with index row size maximum reached

От
Michael Paquier
Дата:
Hi all,

While doing some tests with jsonb, I found a failure as told in $subject:
=# create table data_jsonb (data jsonb);
CREATE TABLE
=# insert into data_jsonb ... tuple in the script attached
INSERT 1
=# create index data_index on data_jsonb using gin(data);
ERROR:  54000: index row size 1808 exceeds maximum 1352 for index "data_index"
LOCATION:  GinFormTuple, ginentrypage.c:110
=# create index data_index2 on data_jsonb using gin (data jsonb_hash_ops);
CREATE INDEX

The data creating the failure is a tuple in a dump of geonames
(http://www.geonames.org/export/), listing some geographical data, and
it is caused by some arabic characters it seems used to provide
translations for a given geographical location. Encoding of the
database on which I have done the tests is UTF-8. Japanese, Chinese
equivalents were working fine btw with this operator.

Documentation of jsonb tells that jsonb documents should be kept at a
reasonable size to reduce lock contention, but there is no mention of
size limitation for indexes:
http://www.postgresql.org/docs/devel/static/datatype-json.html

A test case is attached, note as well that only the default gin
operator class is failing, jsonb_hash_ops worked well.
Regards,
--
Michael

Вложения

Re: Default gin operator class of jsonb failing with index row size maximum reached

От
Peter Geoghegan
Дата:
On Mon, Apr 7, 2014 at 8:29 PM, Michael Paquier
<michael.paquier@gmail.com> wrote:
> Documentation of jsonb tells that jsonb documents should be kept at a
> reasonable size to reduce lock contention, but there is no mention of
> size limitation for indexes:
> http://www.postgresql.org/docs/devel/static/datatype-json.html

It seems like your complaint is that this warrants special
consideration from the jsonb docs, due to this general limitation
being particularly likely to affect jsonb users. Is that accurate?

The structure of the JSON in your test case is quite atypical, since
there is one huge string containing each of the translations, rather
than a bunch of individual array elements (one per translation) or a
bunch of object pairs.

As it happens, just this morning I read that MongoDB's new version 2.6
now comes with stricter enforcement of key length:
http://docs.mongodb.org/master/release-notes/2.6-compatibility/#index-key-length-incompatibility
. While previous versions just silently failed to index values that
were inserted, there is now a 1024 limit imposed on the total size of
indexed values.

-- 
Peter Geoghegan



Re: Default gin operator class of jsonb failing with index row size maximum reached

От
Oleg Bartunov
Дата:
We are working to avoid this limitation.

On Tue, Apr 8, 2014 at 10:54 PM, Peter Geoghegan <pg@heroku.com> wrote:
> On Mon, Apr 7, 2014 at 8:29 PM, Michael Paquier
> <michael.paquier@gmail.com> wrote:
>> Documentation of jsonb tells that jsonb documents should be kept at a
>> reasonable size to reduce lock contention, but there is no mention of
>> size limitation for indexes:
>> http://www.postgresql.org/docs/devel/static/datatype-json.html
>
> It seems like your complaint is that this warrants special
> consideration from the jsonb docs, due to this general limitation
> being particularly likely to affect jsonb users. Is that accurate?
>
> The structure of the JSON in your test case is quite atypical, since
> there is one huge string containing each of the translations, rather
> than a bunch of individual array elements (one per translation) or a
> bunch of object pairs.
>
> As it happens, just this morning I read that MongoDB's new version 2.6
> now comes with stricter enforcement of key length:
> http://docs.mongodb.org/master/release-notes/2.6-compatibility/#index-key-length-incompatibility
> . While previous versions just silently failed to index values that
> were inserted, there is now a 1024 limit imposed on the total size of
> indexed values.
>
> --
> Peter Geoghegan
>
>
> --
> Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
> To make changes to your subscription:
> http://www.postgresql.org/mailpref/pgsql-hackers



Re: Default gin operator class of jsonb failing with index row size maximum reached

От
Tom Lane
Дата:
Oleg Bartunov <obartunov@gmail.com> writes:
> We are working to avoid this limitation.

What do you mean by that ... do you see it as something that could be
fixed quickly, or is this a long-term improvement project?
        regards, tom lane



Re: Default gin operator class of jsonb failing with index row size maximum reached

От
Oleg Bartunov
Дата:
On Wed, Apr 9, 2014 at 1:48 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Oleg Bartunov <obartunov@gmail.com> writes:
>> We are working to avoid this limitation.
>
> What do you mean by that ... do you see it as something that could be
> fixed quickly, or is this a long-term improvement project?

Unfortunately, It's long-term project VODKA, we hope to use spgist
(also needed some adjustment) for keys+values instead of btree.
Hopefully, we'll show something at PGCon.

>
>                         regards, tom lane



Re: Default gin operator class of jsonb failing with index row size maximum reached

От
Michael Paquier
Дата:
On Wed, Apr 9, 2014 at 6:48 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Oleg Bartunov <obartunov@gmail.com> writes:
>> We are working to avoid this limitation.
>
> What do you mean by that ... do you see it as something that could be
> fixed quickly, or is this a long-term improvement project?
If this is a known limitation and no fix is planned for 9.4, could it
be possible to document it appropriately for this release? This would
surprise users.
-- 
Michael



Re: Default gin operator class of jsonb failing with index row size maximum reached

От
Peter Geoghegan
Дата:
On Tue, Apr 8, 2014 at 4:07 PM, Michael Paquier
<michael.paquier@gmail.com> wrote:
> If this is a known limitation and no fix is planned for 9.4, could it
> be possible to document it appropriately for this release? This would
> surprise users.

It looks like the default GIN opclass will be changed, so it becomes a
matter of opting in to the particular set of trade-offs that the
current default represents. Presumably that includes the size
limitation, which isn't separately documented at present.


-- 
Peter Geoghegan