Re: Re: logical changeset generation v4 - Heikki's thoughts about the patch state

From Andres Freund
Subject Re: Re: logical changeset generation v4 - Heikki's thoughts about the patch state
Msg-id 20130128113117.GA22401@awork2.anarazel.de
In reply to Re: logical changeset generation v4 - Heikki's thoughts about the patch state  (Heikki Linnakangas <hlinnakangas@vmware.com>)
List pgsql-hackers
On 2013-01-28 11:59:52 +0200, Heikki Linnakangas wrote:
> On 24.01.2013 00:30, Andres Freund wrote:
> >Also, while the apply side surely isn't benchmarkable without any being
> >submitted, the changeset generation can very well be benchmarked.
> >
> >A very, very adhoc benchmark:
> >  -c max_wal_senders=10
> >  -c max_logical_slots=10 --disabled for anything but logical
> >  -c wal_level=logical --hot_standby for anything but logical
> >  -c checkpoint_segments=100
> >  -c log_checkpoints=on
> >  -c shared_buffers=512MB
> >  -c autovacuum=on
> >  -c log_min_messages=notice
> >  -c log_line_prefix='[%p %t] '
> >  -c wal_keep_segments=100
> >  -c fsync=off
> >  -c synchronous_commit=off
> >
> >pgbench -p 5440 -h /tmp -n -M prepared -c 16 -j 16 -T 30
> >
> >pgbench upstream:
> >tps: 22275.941409
> >space overhead: 0%
> >pgbench logical-submitted
> >tps: 16274.603046
> >space overhead: 2.1%
> >pgbench logical-HEAD (will submit updated version tomorrow or so):
> >tps: 20853.341551
> >space overhead: 2.3%
> >pgbench single plpgsql trigger (INSERT INTO log(data) VALUES(NEW::text))
> >tps: 14101.349535
> >space overhead: 369%
> >
> >Note that in the single trigger case nobody consumed the queue while the
> >logical version streamed the changes out and stored them to disk.
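
To put the tps figures quoted above in relation to the upstream run, here is a small calculation. The numbers are copied from the benchmark; the dictionary keys and the function name are ad hoc labels for this sketch, not anything from the patch or from pgbench.

```python
# tps figures copied from the quoted benchmark; the keys are ad hoc labels.
TPS = {
    "upstream": 22275.941409,
    "logical-submitted": 16274.603046,
    "logical-HEAD": 20853.341551,
    "plpgsql-trigger": 14101.349535,
}


def slowdown_percent(tps, baseline="upstream"):
    """Return the throughput loss of each run relative to the baseline, in percent."""
    base = tps[baseline]
    return {
        name: (1.0 - value / base) * 100.0
        for name, value in tps.items()
        if name != baseline
    }


if __name__ == "__main__":
    for name, pct in slowdown_percent(TPS).items():
        print(f"{name}: {pct:.1f}% slower than upstream")
```

This only restates the single 30-second runs listed above, so it says nothing about run-to-run variance.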
> 
> That makes the space overhead comparison completely worthless, no? I would
> expect the trigger-based approach to generate roughly 100% more WAL, not
> close to 400%. As long as the queue is drained constantly, there should be
> no big difference in the disk space used, except for the WAL.

IMO it's a valid comparison, as all such queues can only be drained in a
rather imperfect manner. I think these days all solutions use multiple
(two) queue tables, switch between them, and truncate the non-active
one, as vacuuming them works far too unreliably.
And those tables have to be plain logged ones, so they matter in
checkpoints et al.
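
The two-table rotation described above can be pictured with a toy in-memory model. This is only a sketch under stated assumptions: the class and method names are hypothetical, Python lists stand in for the queue tables, and it is not the implementation of any particular replication tool.

```python
class RotatingQueue:
    """Toy model of two queue tables; not any real tool's implementation."""

    def __init__(self):
        self.tables = [[], []]  # stand-ins for the two queue tables
        self.active = 0         # index of the table that triggers write to

    def enqueue(self, change):
        # What a capture trigger would do: append to the active table.
        self.tables[self.active].append(change)

    def rotate(self):
        """Point writers at the other table, then drain and empty the old one."""
        old = self.active
        self.active = 1 - old
        drained = list(self.tables[old])
        self.tables[old].clear()  # stands in for TRUNCATE instead of DELETE + VACUUM
        return drained


if __name__ == "__main__":
    q = RotatingQueue()
    q.enqueue("INSERT 1")
    q.enqueue("INSERT 2")
    print(q.rotate())
```

A real implementation also has to wait until no open transaction still reads or writes the old table before truncating it; the sketch ignores that, which is part of why draining is imperfect in practice.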

> >Adding a default NOW() or similar to the tables immediately makes
> >logical decoding faster by a factor of about 3 in comparison to the
> >above trivial trigger.
> 
> Hmm, is that because of the conversion to text? I believe slony also
> converts all the values to text in the trigger, because that's simple and
> flexible, but if we're trying to compare the performance of logical
> changeset generation vs. trigger-based replication in general, we should
> choose the most efficient trigger-based scheme to compare with. That means,
> don't convert to text. And write the trigger in C.

IMO it's basically impossible for the current queue-based solutions not
to convert to text, because they otherwise would need to queue all the
conversion information as well. And the test_decoding plugin also
converts everything to text, so that's a fair comparison from that
POV. In fact the test_decoding plugin does noticeably more, as it also
outputs table, column and type name.

I agree on the C argument. I really doubt it's going to make that much
of a difference, but we should try it.
In my experience a plpgsql trigger that just does a straight conversion
via cast is still noticeably faster than any of the "real" replication
triggers out there though, so I wouldn't expect much there.


Greetings,

Andres Freund

-- 
 Andres Freund                     http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


