Re: Fixed length data types issue
From | Mark Dilger |
---|---|
Subject | Re: Fixed length data types issue |
Date | |
Msg-id | 450477CF.4020401@markdilger.com |
In reply to | Re: Fixed length data types issue (Martijn van Oosterhout <kleptog@svana.org>) |
Responses | Re: Fixed length data types issue |
List | pgsql-hackers |
Martijn van Oosterhout wrote:
> On Sun, Sep 10, 2006 at 11:55:35AM -0700, Mark Dilger wrote:
>>> Well, it is unless you are willing to give up support of non-Intel CPUs;
>>> most other popular chips are strict about alignment, and will fail an
>>> attempt to do a nonaligned fetch.
>> Intel CPUs are detectable at compile time, right? Do we use less
>> padding in the layout for tables on Intel-based servers? If not, could we?
>
> Intel CPUs may not complain about unaligned reads, they're still
> inefficient. Internally it does two aligned reads and rearranges the
> bytes. On other architectures the OS can emulate that but postgres
> doesn't use that for obvious reasons.

This gets back to the CPU vs. I/O bound issue, right? Might not some people (with heavily taxed disks but lightly taxed CPU) prefer that trade-off?

>> For the example schema which started this thread, a contrib extension
>> for ascii fields could be written, with types like ascii1, ascii2,
>> ascii3, and ascii4, each with implicit upcasts to text. A contrib for
>> int1 and uint1 could be written to store single byte integers in a
>> single byte, performing math on them correctly, etc.
>
> The problem is that for each of those ascii types, to actually use them
> they would have to be converted, which would amount to allocating some
> memory, copying and adding a length header. At some point you have to
> wonder whether you're actually saving anything.
>
> Have a nice day,

I'm not sure what you mean by "actually use them". The types could have their own comparison operators, so you could use them for sorting and indexing, and in WHERE clauses with these comparisons, without any conversion to/from text. I mentioned implicit upcasts to text merely to handle other cases, such as using them in a LIKE or ILIKE, or concatenation, etc., where the work of providing this functionality for each contrib datatype would not really be justified.
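To make the distinction concrete, here is a rough sketch in Python rather than PostgreSQL's C type interface. The class name `Ascii3` and the method `to_text` are hypothetical and chosen only for illustration. Same-type comparison works directly on the stored bytes, and the upcast to text is taken only for operations the type does not provide itself.

```python
from functools import total_ordering


@total_ordering
class Ascii3:
    """Hypothetical fixed-width type: exactly 3 ASCII bytes, no length header."""

    WIDTH = 3

    def __init__(self, value: str):
        raw = value.encode("ascii")  # raises UnicodeEncodeError on non-ASCII input
        if len(raw) != self.WIDTH:
            raise ValueError(f"expected {self.WIDTH} bytes, got {len(raw)}")
        self.raw = raw  # the stored form is just the bytes

    # Same-type comparisons operate on the stored bytes; nothing is converted.
    def __eq__(self, other):
        if not isinstance(other, Ascii3):
            return NotImplemented
        return self.raw == other.raw

    def __lt__(self, other):
        if not isinstance(other, Ascii3):
            return NotImplemented
        return self.raw < other.raw

    def __hash__(self):
        return hash(self.raw)

    # The "upcast to text" path, needed only for operations the type lacks.
    def to_text(self) -> str:
        return self.raw.decode("ascii")


codes = [Ascii3("usd"), Ascii3("eur"), Ascii3("gbp")]

# Sorting and equality filters use only Ascii3's own comparisons.
ordered = sorted(codes)
exact = [c for c in codes if c == Ascii3("eur")]

# A LIKE 'u%' style match has no Ascii3 operator, so it goes through text.
matches = [c for c in codes if c.to_text().startswith("u")]
```

In this sketch the allocation and copy that Martijn describes happen only on the `to_text` path, not in sorting or equality tests.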
I'm not personally as interested in the aforementioned ascii types as I am in the int1 and int3 types, but the argument in favor of each is about the same. If a person has a large table made of small data, it seems really nuts to have 150% to 400% bloat on that table, when such a small amount of work is needed to write the contrib datatypes necessary to store the data compactly.

The argument made upthread that a quadratic number of conversion operators is necessitated doesn't seem right to me, given that each type could upcast to the canonical built-in type (int1 => smallint, int3 => integer, ascii1 => text, ascii2 => text, ascii3 => text, etc.). Operations on data of differing types can be done in the canonical type, but the common case for many users would be operations between data of the same type, for which no conversion is required.

Am I missing something that would prevent this approach from working? I am seriously considering writing these contrib datatypes for use either on pgFoundry or in the contrib/ subdirectory for the 8.3 release, but am looking for advice if I am really off-base.

Thanks,

mark
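The counting argument above can be sketched as follows. This is only an illustration in Python: the `CANONICAL` table repeats the mappings listed in the message, while the function names are hypothetical and nothing here reflects how PostgreSQL actually resolves casts.

```python
# Each compact type names the built-in type it upcasts to (from the list above).
CANONICAL = {
    "int1": "smallint",
    "int3": "integer",
    "ascii1": "text",
    "ascii2": "text",
    "ascii3": "text",
}


def casts_if_pairwise(types):
    """Directed casts needed if every type converts to every other type."""
    n = len(types)
    return n * (n - 1)


def casts_if_upcast(types):
    """Casts needed if each type only upcasts to its canonical type."""
    return len(types)


def operand_types(left, right):
    """Types in which `left OP right` would be evaluated in this sketch."""
    if left == right:
        return (left, right)  # common case: same type, no conversion
    # Otherwise each side is upcast; any remaining mismatch (for example
    # smallint vs integer) is left to the built-in types' existing casts.
    return (CANONICAL.get(left, left), CANONICAL.get(right, right))


pairwise = casts_if_pairwise(CANONICAL)
upcasts = casts_if_upcast(CANONICAL)
```

With the five types listed, the pairwise scheme needs 20 casts while the upcast scheme needs 5, and the gap widens as more types are added.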