Re: copy vs. C function
From | Jon Nelson |
---|---|
Subject | Re: copy vs. C function |
Date | |
Msg-id | CAKuK5J3VsY-1_4wzRZiYR_ExWVGhnMHYmkBZBxnvBxkMfqsL5w@mail.gmail.com |
In response to | Re: copy vs. C function (Tom Lane <tgl@sss.pgh.pa.us>) |
Responses | Re: copy vs. C function |
List | pgsql-performance |
On Wed, Dec 14, 2011 at 12:18 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Jon Nelson <jnelson+pgsql@jamponi.net> writes:
>> The only thing I have left are these statements:
>
>> get_call_result_type
>> TupleDescGetAttInMetadata
>> BuildTupleFromCStrings
>> HeapTupleGetDatum
>> and finally PG_RETURN_DATUM
>
>> It turns out that:
>> get_call_result_type adds 43 seconds [total: 54],
>> TupleDescGetAttInMetadata adds 19 seconds [total: 73],
>> BuildTupleFromCStrings accounts for 43 seconds [total: 116].
>
>> So those three functions account for 90% of the total time spent.
>> What alternatives exist? Do I have to call get_call_result_type /every
>> time/ through the function?
>
> Well, if you're concerned about performance then I think you're going
> about this in entirely the wrong way, because as far as I can tell from
> this you're converting all the field values to text and back again.
> You should be trying to keep the values in Datum format and then
> invoking heap_form_tuple. And yeah, you probably could cache the
> type information across calls.

The parsing/conversion (except BuildTupleFromCStrings) is only a small
fraction of the overall time spent in the function and could probably
be made slightly faster. It's the overhead that's killing me.

Remember: I'm not converting multiple field values to text and back
again, I'm turning a *single* TEXT into 8 columns of varying types
(INET, INTEGER, and one INTEGER array, among others). I'll re-write
the code to use Tuples, but given that 53% of the time is spent in just
two functions (the two I'd like to cache), I'm not sure how much of a
gain it's likely to be.

Regarding caching, I tried caching it across calls by making the
TupleDesc static and only initializing it once. When I tried that, I got:

ERROR: number of columns (6769856) exceeds limit (1664)

I tried to find some documentation or examples that cache the
information, but couldn't find any.

--
Jon
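
The nonsensical column count in that error is consistent with a cached pointer outliving the memory it points to. That is a likely explanation rather than something confirmed in this thread. The sketch below is a standalone C model of the usual remedy: do the expensive lookup once, copy the result into memory that lives as long as the call site, and keep the pointer in a per-call-site slot instead of a bare static. `RowDesc`, `CallSite`, `lookup_row_desc` and `get_cached_desc` are hypothetical names invented for this illustration and are not PostgreSQL APIs. In a real PostgreSQL C function the corresponding pieces would, as far as I know, be `fcinfo->flinfo->fn_extra` for the slot, `fcinfo->flinfo->fn_mcxt` for the long-lived memory context, and `CreateTupleDescCopy` for the copy.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for a row descriptor (NOT PostgreSQL's TupleDesc). */
typedef struct RowDesc {
    int natts;                      /* number of columns */
} RowDesc;

/* Hypothetical stand-in for per-call-site state that outlives each call. */
typedef struct CallSite {
    void *extra;                    /* cache slot, NULL until the first call */
} CallSite;

static int lookups = 0;             /* counts how often the expensive path runs */

/*
 * Stand-in for the expensive type lookup. The result lives in "per-call"
 * scratch memory that the caller releases before returning.
 */
static RowDesc *lookup_row_desc(void)
{
    RowDesc *d = malloc(sizeof *d);
    if (d == NULL)
        abort();
    d->natts = 8;                   /* the post describes 8 output columns */
    lookups++;
    return d;
}

/*
 * Return the cached descriptor, running the lookup only on the first call.
 * The descriptor is copied into long-lived memory before the scratch copy
 * is freed, so the cached pointer never dangles.
 */
static const RowDesc *get_cached_desc(CallSite *site)
{
    if (site->extra == NULL) {
        RowDesc *scratch = lookup_row_desc();
        RowDesc *longlived = malloc(sizeof *longlived);
        if (longlived == NULL)
            abort();
        memcpy(longlived, scratch, sizeof *longlived);
        free(scratch);              /* per-call memory goes away here */
        site->extra = longlived;    /* the copy survives across calls */
    }
    return site->extra;
}
```

A caller would create one `CallSite` initialized to `{ NULL }`, call `get_cached_desc(&site)` on every row, and free `site.extra` when the call site itself is torn down. Caching the raw pointer returned by the lookup, without the copy, is the variant that would leave a dangling pointer once the scratch memory is released.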