Re: Help with bulk read performance

From: Jim Nasby
Subject: Re: Help with bulk read performance
Msg-id: 73690052-2204-49D8-B5BA-4F52940502D0@nasby.net
In response to: Help with bulk read performance  (Dan Schaffer)
List: pgsql-performance

BTW, have you tried prepared statements? bytea is most likely faster (in part) due to less parsing in the backend.
Prepared statements would eliminate that parsing step.
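
For illustration, here is a minimal sketch of what packing 300 floats into a bytea value (variant c in Nick's summary quoted below) amounts to on the client side. It is written in Python rather than JDBC purely to keep it self-contained and does not touch a database. The single-precision, big-endian encoding, the helper names pack_floats and unpack_floats, and the sample data are all assumptions made for the example, not details from the thread.

```python
import struct

N_FLOATS = 300  # from the thread: 300 float values per row


def pack_floats(values):
    """Pack floats as 4-byte big-endian IEEE 754 singles (assumed encoding)."""
    return struct.pack(f">{len(values)}f", *values)


def unpack_floats(payload):
    """Inverse of pack_floats: recover the floats from the raw bytes."""
    count = len(payload) // 4
    return list(struct.unpack(f">{count}f", payload))


# Made-up sample data; multiples of 0.25 are exactly representable as singles.
values = [i * 0.25 for i in range(N_FLOATS)]
payload = pack_floats(values)  # this is what would be stored in the bytea column

print(len(payload))  # 1200
assert unpack_floats(payload) == values
```

Under that assumption the payload is 4 bytes per value, so 1200 bytes for 300 values; with double precision it would be 8 bytes per value. The client has to agree with whatever wrote the column on both the width and the byte order, since the database treats the bytes as opaque.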

On Dec 14, 2010, at 10:07 AM, Nick Matheson wrote:

> Hey all-
>
> Glad to know you are still interested... ;)
>
> Didn't mean to leave you hanging, the holiday and all have put some bumps in the road.
>
> Dan, my co-worker, might be able to post some more detailed information here, but here is a brief summary of what I am
> aware of:
>
> 1. We have not tested any stored procedure/SPI based solutions to date.
> 2. The COPY API has been the best of the possible solutions explored to date.
> 3. We were able to get rates on the order of 35 MB/s with the original problem this way.
> 4. Another variant of the problem we were working on included some metadata fields and 300 float values (for this we
>    tried three variants):
>   a. 300 float values as columns
>   b. 300 float in a float array column
>   c. 300 floats packed into a bytea column
> Long story short: of these three variants, a and b largely performed the same. Variant c was the winner and seems to
> have improved the throughput on multiple counts. 1. It reduces the data transmitted over the wire by a factor of two
> (float columns and float arrays have a 2x overhead over the raw data requirement). 2. This reduction seems to have
> reduced the CPU burden on the server side, thus producing a better than expected 2x speedup. I think the final
> numbers left us somewhere in the 80-90 MB/s range.
>
> Thanks again for all the input. If you have any other questions, let us know. Also, if we get results for the stored
> procedure/SPI route we will try to post them, but the improvements via standard JDBC are such that we aren't really
> pressed at this point in time to get more throughput, so it may not happen.
>
> Cheers,
>
> Nick
>> On 12/14/2010 9:41 AM, Jim Nasby wrote:
>>> On Dec 14, 2010, at 9:27 AM, Andy Colson wrote:
>>>> Is this the same thing Nick is working on?  How'd he get along?
>>>>
>>>> http://archives.postgresql.org/message-id/
>>>
>>> So it is. The one I replied to stood out because no one had replied to it; I didn't see the earlier email.
>>> --
>>> Jim C. Nasby, Database Architect                   
>>> 512.569.9461 (cell)                         http://jim.nasby.net
>>>
>>>
>>>
>>
>> Oh.. I didn't even notice the date... I thought it was a new post.
>>
>> But still... (and I'll cc Nick on this)  I'd love to hear an update on how this worked out.
>>
>> Did you get it to go fast?  What'd you use?  Did the project go over budget and did you all get fired?  COME ON MAN!
>> We need to know! :-)
>>
>> -Andy
>

--
Jim C. Nasby, Database Architect                   
512.569.9461 (cell)                         http://jim.nasby.net