On Fri, 4 Feb 2011, Vitalii Tymchyshyn wrote:
> 04.02.11 16:33, Kenneth Marshall ???????(??):
>>
>> In addition, the streaming ANALYZE can provide better statistics at
>> any time during the load and it will be complete immediately. As far
>> as passing the entire table through the ANALYZE process, a simple
>> counter can be used to only send the required samples based on the
>> statistics target. Where this would seem to help the most is in
>> temporary tables which currently do not work with autovacuum but it
>> would streamline their use for more complicated queries that need
>> an analyze to perform well.
>>
> Actually for me the main "con" with streaming analyze is that it adds
> significant CPU burden to already not too fast load process. Especially if
> it's automatically done for any load operation performed (and I can't see how
> it can be enabled on some threshold).
two thoughts
1. if it's a large enough load, itsn't it I/O bound?
2. this chould be done in a separate process/thread than the load itself,
that way the overhead of the load is just copying the data in memory to
the other process.
with a multi-threaded load, this would eat up some cpu that could be used
for the load, but cores/chip are still climbing rapidly so I expect that
it's still pretty easy to end up with enough CPU to handle the extra load.
David Lang
> And you can't start after some threshold of data passed by since you may
> loose significant information (like minimal values).
>
> Best regards, Vitalii Tymchyshyn
>