Re: Storage for multiple variable-length attributes in a single row
От | Julien Rouhaud |
---|---|
Тема | Re: Storage for multiple variable-length attributes in a single row |
Дата | |
Msg-id | 20220207171948.i7v4hn5zv4c6gr5f@jrouhaud обсуждение исходный текст |
Ответ на | Re: Storage for multiple variable-length attributes in a single row ("David G. Johnston" <david.g.johnston@gmail.com>) |
Список | pgsql-hackers |
On Mon, Feb 07, 2022 at 10:10:53AM -0700, David G. Johnston wrote: > On Mon, Feb 7, 2022 at 9:58 AM Esteban Zimanyi <estebanzimanyi@gmail.com> > wrote: > > > > > As suggested by David, this goes beyond the "traditional" usage of > > PostgreSQL. Therefore my questions are > > * What is the suggested strategy to splitting these 2K attributes into > > vertically partitioned tables where the tables are linked by the primary > > key (e.g. trip number in the example above). Are there any limitations/best > > practices in the number/size of TOASTED attributes that a table should > > contain. > > * In each partitioned table containing N TOASTED attributes, given the > > above requirements, are there any limitations/best practices in storing > > them using extended storage or an alternative one such as external. > > > > > Frankly, the best practice is "don't have that many columns". Since you > do, I posit that you are just going to have to make decisions (possibly > with experimentation) on your own. Or maybe ask around on a MobilityDB > forum what people using that tool and having these kinds of data structures > do. From a core PostgreSQL perspective you've already deviated from the > model structures that it was designed with in mind. > I'm really confused that you'd want the data value itself to contain a > timestamp that, on a per-row basis, should be the same timestamp that every > other value on the row has. Extracting the timestamp to it own column and > using simpler and atomic data types is how core PostgreSQL and the > relational model normalization recommend dealing with this situation. Then > you just break up the attributes of a similar nature into their own tables > based upon their shared nature. In almost all cases relying on "main" > storage. Actually looking at the original example: > CREATE TYPE tint ( > internallength = variable, > [...] > storage = extended, > alignment = double, > [...] > ); I'm wondering if it's just some miscommunication here. If the tint data type only needs to hold a timestamp and an int, I don't see why it would be varlerna at all. So if a single tint can internally hold thousands of (int, timestamptz), a bit like pgpointcloud, then having it by default external (so both possibly out-of-line and compressed) seems like a good idea, as you can definitely hit the 8k boundary, it should compress nicely and you also avoid some quite high tuple header overhead.
В списке pgsql-hackers по дате отправления: