Обсуждение: hstore - Implementation and performance issues around its operators
Hi, We did a benchmark comparing a Key-Value-Pairs stored as EAV db schema versus hstore. The results are promising in favor of hstore but there are some question which remain. 1. Obviously the '@>' has to be used in order to let use the GiST index. Why is the '->' operator not supported by GiST ('->' is actually mentioned in all examples of the doc.)? 2. Currently the hstore elements are stored in order as they are coming from the insert statement / constructor. Why are the elements not ordered i.e. why is the hstore not cached in all hstore functions (like hstore_fetchval etc.)? 3. In the source code 'hstore_io.c' one finds the following enigmatic note: "... very large hstore values can't be output. this could be fixed, but many other data types probably have the same issue." What is the max. length of a hstore (i.e. the max. length of the sum of all elements in text representation)? 4. Last, I don't fully understand the following note in the hstore doc. (http://www.postgresql.org/docs/current/interactive/hstore.html ): > Notice that the old names are reversed from the convention > formerly followed by the core geometric data types! Why names? Why not rather 'operators' or 'functions'? What does this "reversed from the convention" mean concretely? Yours, Stefan
On Sun, Jun 19, 2011 at 2:59 PM, Stefan Keller <sfkeller@gmail.com> wrote: > 1. Obviously the '@>' has to be used in order to let use the GiST index. > Why is the '->' operator not supported by GiST ('->' is actually > mentioned in all examples of the doc.)? Because it's not a comparison operator. > 2. Currently the hstore elements are stored in order as they are > coming from the insert statement / constructor. > Why are the elements not ordered i.e. why is the hstore not cached in > all hstore functions (like hstore_fetchval etc.)? Putting the elements in order wouldn't really help, would it? I mean, you'd need some kind of an index inside the hstore... which there isn't. > 3. In the source code 'hstore_io.c' one finds the following enigmatic > note: "... very large hstore values can't be output. this could be > fixed, but many other data types probably have the same issue." > What is the max. length of a hstore (i.e. the max. length of the sum > of all elements in text representation)? I think that anything of half a gigabyte or more is at risk of falling down there. But probably it's not smart to use such big hstores anyway. > 4. Last, I don't fully understand the following note in the hstore > doc. (http://www.postgresql.org/docs/current/interactive/hstore.html > ): >> Notice that the old names are reversed from the convention >> formerly followed by the core geometric data types! > > Why names? Why not rather 'operators' or 'functions'? It's referring to the operator names. > What does this "reversed from the convention" mean concretely? That comment could be a little more clear, but I think what it's saying is that hstore's old @ is like the core geometic types old ~, and visca versa. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Hi Robert Many thanks for your answers. 2011/7/19 Robert Haas <robertmhaas@gmail.com>: > Putting the elements in order wouldn't really help, would it? I mean, > you'd need some kind of an index inside the hstore... which there > isn't. Sorry for my inprecise question. In fact elements of a hstore are stored in order of (keylength,key) with the key comparison done bytewise (not locale-dependent). See e.g. function hstoreUniquePairs in http://doxygen.postgresql.org/ . This ordered property is being used by some hstore functions but not all - and I'm still wondering why. Yours, Stefan 2011/7/19 Robert Haas <robertmhaas@gmail.com>: > On Sun, Jun 19, 2011 at 2:59 PM, Stefan Keller <sfkeller@gmail.com> wrote: >> 1. Obviously the '@>' has to be used in order to let use the GiST index. >> Why is the '->' operator not supported by GiST ('->' is actually >> mentioned in all examples of the doc.)? > > Because it's not a comparison operator. > >> 2. Currently the hstore elements are stored in order as they are >> coming from the insert statement / constructor. >> Why are the elements not ordered i.e. why is the hstore not cached in >> all hstore functions (like hstore_fetchval etc.)? > > Putting the elements in order wouldn't really help, would it? I mean, > you'd need some kind of an index inside the hstore... which there > isn't. > >> 3. In the source code 'hstore_io.c' one finds the following enigmatic >> note: "... very large hstore values can't be output. this could be >> fixed, but many other data types probably have the same issue." >> What is the max. length of a hstore (i.e. the max. length of the sum >> of all elements in text representation)? > > I think that anything of half a gigabyte or more is at risk of falling > down there. But probably it's not smart to use such big hstores > anyway. > >> 4. Last, I don't fully understand the following note in the hstore >> doc. (http://www.postgresql.org/docs/current/interactive/hstore.html >> ): >>> Notice that the old names are reversed from the convention >>> formerly followed by the core geometric data types! >> >> Why names? Why not rather 'operators' or 'functions'? > > It's referring to the operator names. > >> What does this "reversed from the convention" mean concretely? > > That comment could be a little more clear, but I think what it's > saying is that hstore's old @ is like the core geometic types old ~, > and visca versa. > > -- > Robert Haas > EnterpriseDB: http://www.enterprisedb.com > The Enterprise PostgreSQL Company >
On Tue, Jul 19, 2011 at 5:06 PM, Stefan Keller <sfkeller@gmail.com> wrote: > 2011/7/19 Robert Haas <robertmhaas@gmail.com>: >> Putting the elements in order wouldn't really help, would it? I mean, >> you'd need some kind of an index inside the hstore... which there >> isn't. > > Sorry for my inprecise question. In fact elements of a hstore are > stored in order of (keylength,key) with the key comparison done > bytewise (not locale-dependent). See e.g. function hstoreUniquePairs > in http://doxygen.postgresql.org/ . This ordered property is being > used by some hstore functions but not all - and I'm still wondering > why. Not sure, honestly. Is there some place where it would permit an optimization we're not currently doing? -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company