Re: INSERT times - same storage space but more fields -> much slower inserts

От: Stephen Frost
Тема: Re: INSERT times - same storage space but more fields -> much slower inserts
Дата: ,
Msg-id: 20090414155614.GL8123@tamriel.snowman.net
(см: обсуждение, исходный текст)
Ответ на: INSERT times - same storage space but more fields -> much slower inserts  (Craig Ringer)
Ответы: Re: INSERT times - same storage space but more fields -> much slower inserts  (Matthew Wakeling)
Список: pgsql-performance


Craig,

* Craig Ringer () wrote:
> I've been doing some testing for the Bacula project, which uses
> PostgreSQL as one of the databases in which it stores backup catalogs.

We also use Bacula with a PostgreSQL backend.

> I've been evaluating a schema change for Bacula that takes a field
> that's currently stored as a gruesome-to-work-with base64-encoded
> representation of a binary blob, and expands it into a set of integer
> fields that can be searched, indexed, etc.

This would be extremely nice.

> The table size of the expanded form is marginally smaller than for the
> base64-encoded string version. However, INSERT times are *CONSIDERABLY*
> greater for the version with more fields. It takes 1011 seconds to
> insert the base64 version, vs 1290 seconds for the expanded-fields
> version. That's a difference of 279 seconds, or 27%.
>
> Despite that, the final table sizes are the same.
>
> If I use tab-separated input and COPY, the original-format file is
> 1300MB and the expanded-structure format is 1618MB. The performance hit
> on COPY-based insert is not as bad, at 161s vs 182s (13%), but still
> quite significant.
>
> Any ideas about what I might be able to do to improve the efficiency of
> inserting records with many integer fields?

Bacula should be using COPY for the batch data loads, so hopefully won't
suffer too much from having the fields split out.  I think it would be
interesting to try doing PQexecPrepared with binary-format data instead
of using COPY though.  I'd be happy to help you implement a test setup
for doing that, if you'd like.

        Thanks,

            Stephen

Вложения

В списке pgsql-performance по дате сообщения:

От: Stephen Frost
Дата:
Сообщение: Re: INSERT times - same storage space but more fields -> much slower inserts
От: Craig Ringer
Дата:
Сообщение: Re: INSERT times - same storage space but more fields -> much slower inserts