Re: Simple (hopefully) throughput question?

From Samuel Gendler
Subject Re: Simple (hopefully) throughput question?
Date
Msg-id AANLkTikL2=SydpMrUbDD6zLQ6DDTdqeCOWZzff7bbWNv@mail.gmail.com
In response to Re: Simple (hopefully) throughput question?  (Vitalii Tymchyshyn <tivv00@gmail.com>)
List pgsql-performance
On Thu, Nov 4, 2010 at 8:07 AM, Vitalii Tymchyshyn <tivv00@gmail.com> wrote:
04.11.10 16:31, Nick Matheson wrote:

Heikki-

Try COPY, i.e. "COPY bulk_performance.counts TO STDOUT BINARY".

Thanks for the suggestion. A preliminary test shows an improvement closer to our expected 35 MB/s.

Are you familiar with any Java libraries for decoding the COPY format? The spec is clear and we could clearly write our own, but figured I would ask. ;)
The JDBC driver has some COPY support, but I don't remember the details. You'd better ask on the JDBC list.



The JDBC driver support works fine.  You can pass a Reader or an InputStream.  If I recall correctly, the InputStream path is more efficient, or maybe the Reader path was buggy.  Regardless, I wound up using an InputStream with the driver, which I then wrap in a Reader in order to get the data line by line.

You can write a COPY statement that sends standard CSV format; take a look at the Postgres docs for the COPY statement to see the full syntax.  I then have a subclass of BufferedReader which parses each line of CSV and does something interesting with it.  I've had it working very reliably for many months now, processing about 500 million rows per day.  (I'm actually COPYing out rather than in, but the concept is the same regardless: my OutputStream is wrapped in a Writer, which reformats data on the fly.)
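
For illustration, here is a minimal sketch of what a BufferedReader subclass that parses CSV lines might look like. This is not the code described above: the class name CsvRecordReader and its methods are hypothetical, and the sample input is made up. It assumes the default CSV quoting (double quote as the quote character, doubled to escape, comma as delimiter), and it makes two simplifications: it does not handle quoted fields that contain embedded newlines, and it does not distinguish NULL (an unquoted empty field in COPY's CSV output) from an empty string. In real use the Reader would wrap whatever stream carries the COPY data rather than a StringReader.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper: a BufferedReader that returns one parsed CSV record per call.
class CsvRecordReader extends BufferedReader {

    CsvRecordReader(Reader in) {
        super(in);
    }

    // Reads the next line and splits it into fields; returns null at end of stream.
    List<String> readRecord() throws IOException {
        String line = readLine();
        return line == null ? null : parseLine(line);
    }

    // Splits a single CSV line, honoring double-quoted fields and "" escapes.
    static List<String> parseLine(String line) {
        List<String> fields = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        boolean inQuotes = false;
        for (int i = 0; i < line.length(); i++) {
            char c = line.charAt(i);
            if (inQuotes) {
                if (c == '"') {
                    if (i + 1 < line.length() && line.charAt(i + 1) == '"') {
                        current.append('"'); // doubled quote is a literal quote
                        i++;
                    } else {
                        inQuotes = false;    // closing quote
                    }
                } else {
                    current.append(c);       // delimiters inside quotes are data
                }
            } else if (c == '"') {
                inQuotes = true;
            } else if (c == ',') {
                fields.add(current.toString());
                current.setLength(0);
            } else {
                current.append(c);
            }
        }
        fields.add(current.toString());      // last field has no trailing comma
        return fields;
    }

    public static void main(String[] args) throws IOException {
        // Made-up sample data standing in for a COPY ... CSV stream.
        String sample = "1,\"a,b\",plain\n2,\"say \"\"hi\"\"\",\n";
        try (CsvRecordReader reader = new CsvRecordReader(new StringReader(sample))) {
            List<String> record;
            while ((record = reader.readRecord()) != null) {
                System.out.println(record.size() + " fields: " + record);
            }
        }
    }
}
```

Parsing line by line keeps memory use flat regardless of how many rows stream through, which matters at volumes like the one mentioned above. If the data can contain newlines inside quoted fields, readRecord would need to keep reading lines until the quotes balance.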

