Efficient slicing/substring of TOAST values (for comment)
From | John Gray |
---|---|
Subject | Efficient slicing/substring of TOAST values (for comment) |
Date | |
Msg-id | 1002549159.23074.40.camel@adzuki |
Responses | Re: Efficient slicing/substring of TOAST values (for comment); Re: Efficient slicing/substring of TOAST values (for comment) |
List | pgsql-patches |
Hi all,

I attach a patch which adds access routines for efficient extraction of parts of TOAST values. The principal additions are two routines in tuptoaster.c: heap_tuple_untoast_attr_slice and toast_fetch_datum_slice. The latter uses extra index scankeys to retrieve only the TOAST chunks which contain the requested substring. This will provide a performance benefit if you repeatedly extract small portions (e.g. file headers) from TOASTed values, as only one or two chunks will need to be fetched. This function is only invoked for external, uncompressed storage. The public access routine (heap_tuple_untoast_attr_slice) does take care of slicing values that are stored compressed or inline, but doesn't provide any performance benefit in those cases.

The access macros are in the same vein as the existing ones: PG_GETARG_TEXT_P_SLICE(n,start,length), for example.

What I haven't done:

1. Documentation. If this patch is appropriate or acceptable, I'll add documentation.

2. Changed e.g. textsubstr and byteasubstr to use this method. textsubstr is complicated by the multibyte support: the fast method is only applicable in a non-multibyte environment. Also, the SQL negative offset rule is not embodied in what I've added, and the subscripts are zero-based. This was on the assumption that if the data was binary (e.g. JPEG/JFIF data) and the user's intent was to extract the header, it would be clearer to use zero-based offsets.

3. Added any facility to force a column to have attstorage 'e'. At present it appears to be defaulted from typstorage, but I couldn't see any problem with changing it after table creation. Would a keyword to CREATE TABLE to override the default attstorage be useful? It would help especially if the user knew that the data for a column would not be very compressible (there would be a performance gain in not trying to compress it, and just storing it externally uncompressed).
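To make the chunk selection concrete, here is a minimal standalone sketch of the arithmetic that a routine like toast_fetch_datum_slice would need in order to decide which chunks cover a requested slice. It is not taken from the patch: the names SliceRange and slice_chunk_range are hypothetical, and CHUNK_SIZE is a placeholder value rather than the real TOAST chunk size. Offsets are zero-based, as in the message above.

```c
#include <assert.h>
#include <stddef.h>

/* Placeholder chunk size for illustration only; the real TOAST chunk
 * size is a build-time constant and is assumed, not quoted, here. */
#define CHUNK_SIZE 2000

/* Hypothetical result type: which chunks to fetch and where the slice
 * begins inside the first of them. */
typedef struct
{
    int first_chunk;   /* sequence number of the first chunk needed */
    int last_chunk;    /* sequence number of the last chunk needed */
    int first_offset;  /* byte offset of the slice within first_chunk */
    int nbytes;        /* slice length after clamping to the value's end */
} SliceRange;

/*
 * Compute the chunk range covering bytes [start, start + length) of a
 * value that is total_size bytes long. start is zero-based.
 * Returns 0 on success, or -1 if the request selects no bytes.
 */
static int
slice_chunk_range(int total_size, int start, int length, SliceRange *out)
{
    assert(out != NULL);

    if (start < 0 || length <= 0 || start >= total_size)
        return -1;

    /* Clamp a slice that would run past the end of the value. */
    if (length > total_size - start)
        length = total_size - start;

    out->first_chunk = start / CHUNK_SIZE;
    out->last_chunk = (start + length - 1) / CHUNK_SIZE;
    out->first_offset = start % CHUNK_SIZE;
    out->nbytes = length;
    return 0;
}
```

With this placeholder chunk size, a 16-byte header read at offset 0 of a 10000-byte value touches only chunk 0, while a 20-byte read starting at offset 1990 straddles chunks 0 and 1. In a real implementation the first_chunk and last_chunk bounds would presumably become the extra index scankeys on the chunk sequence number.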
Of course, this may all just be useless feature bloat, or not up to scratch coding-wise (and please say so if it is), but please let me know if it's worth my documenting this or adding any more to it.

(Diffs are against the versions current in CVS as of twenty minutes ago or so.)

Regards

John
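On the point that the SQL negative offset rule is not embodied in the zero-based slice routines: a caller such as textsubstr would need to translate SQL SUBSTRING arguments before asking for a slice. The following is a sketch of one such translation, assuming the SQL92 behaviour in which positions are one-based and a start position of zero or less consumes part of the requested length. The function name sql_substr_to_slice is hypothetical and does not appear in the patch; the sketch counts bytes, so it only applies in a non-multibyte environment, and it does not guard against integer overflow.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Translate SUBSTRING(value FROM sql_start FOR sql_len), with one-based
 * positions and a possibly zero or negative sql_start, into a zero-based
 * (offset, length) pair suitable for a slice routine.
 *
 * Returns 0 on success (length may be 0 for an empty result), or -1 if
 * sql_len or value_len is negative.
 */
static int
sql_substr_to_slice(long value_len, long sql_start, long sql_len,
                    long *offset, long *length)
{
    long end;  /* one past the last requested position, one-based */
    long s1;   /* start clamped to the first character */
    long e1;   /* end clamped to one past the last character */

    assert(offset != NULL && length != NULL);

    if (sql_len < 0 || value_len < 0)
        return -1;

    end = sql_start + sql_len;
    s1 = (sql_start < 1) ? 1 : sql_start;
    e1 = (end > value_len + 1) ? value_len + 1 : end;

    if (s1 >= e1)
    {
        /* The requested range lies wholly outside the value. */
        *offset = 0;
        *length = 0;
        return 0;
    }

    *offset = s1 - 1;   /* convert to zero-based */
    *length = e1 - s1;
    return 0;
}
```

For a 6-byte value, FROM -1 FOR 4 maps to offset 0 and length 2, because positions -1 and 0 use up two of the four requested characters. FROM 2 FOR 3 maps to offset 1 and length 3.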