Re: pglz performance

Поиск
Список
Период
Сортировка
От Petr Jelinek
Тема Re: pglz performance
Дата
Msg-id a5cbd707-e4a4-70e7-e970-75f1cc68c20f@2ndquadrant.com
обсуждение исходный текст
Ответ на Re: pglz performance  (Tomas Vondra <tomas.vondra@2ndquadrant.com>)
Ответы Re: pglz performance  (Tomas Vondra <tomas.vondra@2ndquadrant.com>)
Re: pglz performance  (Andres Freund <andres@anarazel.de>)
Список pgsql-hackers
Hi,

On 04/08/2019 13:52, Tomas Vondra wrote:
> On Sun, Aug 04, 2019 at 02:41:24AM +0200, Petr Jelinek wrote:
>> Hi,
>>
>> On 02/08/2019 21:48, Tomas Vondra wrote:
>>> On Fri, Aug 02, 2019 at 11:20:03AM -0700, Andres Freund wrote:
>>>
>>>>
>>>>> Another question is whether we'd actually want to include the code in
>>>>> core directly, or use system libraries (and if some packagers might
>>>>> decide to disable that, for whatever reason).
>>>>
>>>> I'd personally say we should have an included version, and a
>>>> --with-system-... flag that uses the system one.
>>>>
>>>
>>> OK. I'd say to require a system library, but that's a minor detail.
>>>
>>
>> Same here.
>>
>> Just so that we don't idly talk, what do you think about the attached?
>> It:
>> - adds new GUC compression_algorithm with possible values of pglz 
>> (default) and lz4 (if lz4 is compiled in), requires SIGHUP
>> - adds --with-lz4 configure option (default yes, so the configure 
>> option is actually --without-lz4) that enables the lz4, it's using 
>> system library
>> - uses the compression_algorithm for both TOAST and WAL compression 
>> (if on)
>> - supports slicing for lz4 as well (pglz was already supported)
>> - supports reading old TOAST values
>> - adds 1 byte header to the compressed data where we currently store 
>> the algorithm kind, that leaves us with 254 more to add :) (that's an 
>> extra overhead compared to the current state)
>> - changes the rawsize in TOAST header to 31 bits via bit packing
>> - uses the extra bit to differentiate between old and new format
>> - supports reading from table which has different rows stored with 
>> different algorithm (so that the GUC itself can be freely changed)
>>
> 
> Cool.
> 
>> Simple docs and a TAP test included.
>>
>> I did some basic performance testing (it's not really my thing though, 
>> so I would appreciate if somebody did more).
>> I get about 7x perf improvement on data load with lz4 compared to pglz 
>> on my dataset but strangely only tiny decompression improvement. 
>> Perhaps more importantly I also did before patch and after patch tests 
>> with pglz and the performance difference with my data set was <1%.
>>
>> Note that this will just link against lz4, it does not add lz4 into 
>> PostgreSQL code-base.
>>
> 
> WFM, although I think Andres wanted to do both (link against system and
> add lz4 code as a fallback). I think the question is what happens when
> you run with lz4 for a while, and then switch to binaries without lz4
> support. Or when you try to replicate from lz4-enabled instance to an
> instance without it. Especially for physical replication, but I suppose
> it may affect logical replication with binary protocol?
> 

I generally prefer having system library, we don't include for example 
ICU either.

> 
> A very brief review:
> 
> 
> 1) I wonder what "runstatedir" is about.
> 

No idea, that stuff is generated by autoconf from configure.in.

> 
> 2) This seems rather suspicious, and obviously the comment is now
> entirely bogus:
> 
> /* Check that off_t can represent 2**63 - 1 correctly.
>     We can't simply define LARGE_OFF_T to be 9223372036854775807,
>     since some C++ compilers masquerading as C compilers
>     incorrectly reject 9223372036854775807.  */
> -#define LARGE_OFF_T (((off_t) 1 << 62) - 1 + ((off_t) 1 << 62))
> +#define LARGE_OFF_T ((((off_t) 1 << 31) << 31) - 1 + (((off_t) 1 << 31) 
> << 31))
> 

Same as above. TBH I am not sure why we even include configure in git 
repo given that different autoconf versions can build different outputs.

> 
> 3) I can't really build without lz4:
> 
> config.status: linking src/makefiles/Makefile.linux to src/Makefile.port
> pg_lzcompress.c: In function ‘pg_compress_bound’:
> pg_lzcompress.c:892:22: error: ‘SIZEOF_PG_COMPRESS_HEADER’ undeclared 
> (first use in this function)
>     return slen + 4 + SIZEOF_PG_COMPRESS_HEADER;
>                       ^~~~~~~~~~~~~~~~~~~~~~~~~
> pg_lzcompress.c:892:22: note: each undeclared identifier is reported 
> only once for each function it appears in
> 

Okay, that's just problem of SIZEOF_PG_COMPRESS_HEADER being defined 
inside the HAVE_LZ4 ifdef while it should be defined above ifdef.

> 
> 4) I did a simple test with physical replication, with lz4 enabled on
> both sides (well, can't build without lz4 anyway, per previous point).
> It immediately failed like this:
> 
> FATAL:  failed to restore block image
> CONTEXT:  WAL redo at 0/5000A40 for Btree/INSERT_LEAF: off 138
> LOG:  startup process (PID 15937) exited with exit code 1
> 
> This is a simple UPDATE on a trivial table:
> 
> create table t (a int primary key);
> insert into t select i from generate_series(1,1000) s(i);
> update t set a = a - 100000 where random () < 0.1;
> 
> with some checkpoints to force FPW (and wal_compression=on, of course).
> 
> I haven't tried `make check-world` but I suppose some of the TAP tests
> should fail because of this. And if not, we need to improve coverage.
> 

FWIW I did run check-world without problems, will have to look into this.

> 
> 5) I wonder why compression_algorithm is defined as PGC_SIGHUP. Why not
> to allow users to set it per session? I suppose we might have a separate
> option for WAL compression_algorithm.
>

Yeah I was thinking we might want to change wal_compression to enum as 
well. Although that complicates the code quite a bit (the caller has to 
decide algorithm instead compression system doing it).

> 
> 6) It seems a bit strange that pg_compress/pg_decompress are now defined
> in pglz_compress.{c,h}. Maybe we should invent src/common/compress.c?
> 

It does not seem worth it to invent new module for like 20 lines of 
wrapper code.

-- 
Petr Jelinek
2ndQuadrant - PostgreSQL Solutions for the Enterprise
https://www.2ndQuadrant.com/



В списке pgsql-hackers по дате отправления:

Предыдущее
От: Petr Jelinek
Дата:
Сообщение: Re: pglz performance
Следующее
От: Tom Lane
Дата:
Сообщение: Re: A couple of random BF failures in kerberosCheck