Re: libpq compression
От | Robbie Harwood |
---|---|
Тема | Re: libpq compression |
Дата | |
Msg-id | jlgo9g5qjst.fsf@redhat.com обсуждение исходный текст |
Ответ на | Re: libpq compression (Konstantin Knizhnik <k.knizhnik@postgrespro.ru>) |
Ответы |
Re: libpq compression
(Konstantin Knizhnik <k.knizhnik@postgrespro.ru>)
|
Список | pgsql-hackers |
Konstantin Knizhnik <k.knizhnik@postgrespro.ru> writes: > On 20.06.2018 00:04, Robbie Harwood wrote: >> Konstantin Knizhnik <k.knizhnik@postgrespro.ru> writes: >>> On 18.06.2018 23:34, Robbie Harwood wrote: >>> >>>> I also don't like that you've injected into the *startup* path - >>>> before authentication takes place. Fundamentally, authentication >>>> (if it happens) consists of exchanging some combination of short >>>> strings (e.g., usernames) and binary blobs (e.g., keys). None of >>>> this will compress well, so I don't see it as worth performing this >>>> negotiation there - it can wait. It's also another message in >>>> every startup. I'd leave it to connection parameters, personally, >>>> but up to you. >>> >>> From my point of view compression of libpq traffic is similar with >>> SSL and should be toggled at the same stage. >> >> But that's not what you're doing. This isn't where TLS gets toggled. >> >> TLS negotiation happens as the very first packet: after completing >> the TCP handshake, the client will send a TLS negotiation request. >> If it doesn't happen there, it doesn't happen at all. >> >> (You *could* configure it where TLS is toggled. This is, I think, >> not a good idea. TLS encryption is a probe: the server can reject >> it, at which point the client tears everything down and connects >> without TLS. So if you do the same with compression, that's another >> point of tearing down an starting over. The scaling on it isn't good >> either: if we add another encryption method into the mix, you've >> doubled the number of teardowns.) > > Yes, you are right. There is special message for enabling TLS > procotol. But I do not think that the same think is needed for > compression. This is why I prefer to specify compression in > connectoin options. So compression may be enabled straight after > processing of startup package. Frankly speaking I still do no see > reasons to postpone enabling compression till some later moment. I'm arguing for connection option only (with no additional negotiation round-trip). See below. >>> Definitely authentication parameter are not so large to be >>> efficiently compressed, by compression (may be in future password >>> protected) can somehow protect this data. In any case I do not >>> think that compression of authentication data may have any influence >>> on negotiation speed. So I am not 100% sure that toggling >>> compression just after receiving startup package is the only right >>> solution. But I am not also convinced in that there is some better >>> place where compressor should be configured. Do you have some >>> concrete suggestions for it? In InitPostgres just after >>> PerformAuthentication ? >> >> Hmm, let me try to explain this differently. >> >> pq_configure() (as you've called it) shouldn't send a packet. At its >> callsite, we *already know* whether we want to use compression - >> that's what the port->use_compression option says. So there's no >> point in having a negotiation there - it's already happened. > > My idea was the following: client want to use compression. But server > may reject this attempt (for any reasons: it doesn't support it, has > no proper compression library, do not want to spend CPU for > decompression,...) Right now compression algorithm is hardcoded. But > in future client and server may negotiate to choose proper compression > protocol. This is why I prefer to perform negotiation between client > and server to enable compression. Well, for negotiation you could put the name of the algorithm you want in the startup. It doesn't have to be a boolean for compression, and then you don't need an additional round-trip. >>>> I don't like where you've put the entry points to the compression >>>> logic: it's a layering violation. A couple people have expressed >>>> similar reservations I think, so let me see if I can explain using >>>> `pqsecure_read()` as an example. In pseudocode, `pqsecure_read()` >>>> looks like this: >>>> >>>> if conn->is_tls: >>>> n = tls_read(conn, ptr, len) >>>> else: >>>> n = pqsecure_raw_read(conn, ptr, len) >>>> return n >>>> >>>> I want to see this extended by your code to something like: >>>> >>>> if conn->is_tls: >>>> n = tls_read(conn, ptr, len) >>>> else: >>>> n = pqsecure_raw_read(conn, ptr, len) >>>> >>>> if conn->is_compressed: >>>> n = decompress(ptr, n) >>>> >>>> return n >>>> >>>> In conjunction with the above change, this should also >>>> significantly reduce the size of the patch (I think). >>> >>> Yes, it will simplify patch. But make libpq compression completely >>> useless (see my explanation above). We need to use streaming >>> compression, and to be able to use streaming compression I have to >>> pass function for fetching more data to compression library. >> >> I don't think you need that, even with the streaming API. >> >> To make this very concrete, let's talk about ZSTD_decompressStream (I'm >> pulling information from >> https://facebook.github.io/zstd/zstd_manual.html#Chapter7 ). >> >> Using the pseudocode I'm asking for above, the decompress() function >> would look vaguely like this: >> >> decompress(ptr, n) >> ZSTD_inBuffer in = {0} >> ZpSTD_outBuffer out = {0} >> >> in.src = ptr >> in.size = n >> >> while in.pos < in.size: >> ret = ZSTD_decompressStream(out, in) >> if ZSTD_isError(ret): >> give_up() >> >> memcpy(ptr, out.dst, out.pos) >> return out.pos >> >> (and compress() would follow a similar pattern, if we were to talk >> about it). > > It will not work in this way. We do not know how much input data we > need to be able to decompress message. Well, that's a design decision you've made. You could put lengths on chunks that are sent out - then you'd know exactly how much is needed. (For instance, 4 bytes of network-order length followed by a complete payload.) Then you'd absolutely know whether you have enough to decompress or not. > So loop should be something like this: > > decompress(ptr, n) > ZSTD_inBuffer in = {0} > ZSTD_outBuffer out = {0} > > in.src = ptr > in.size = n > > while true > ret = ZSTD_decompressStream(out, in) > if ZSTD_isError(ret): > give_up() > if out.pos != 0 > // if we deomcpress soemthing > return out.pos; > read_input(in); The last line is what I'm taking issue with. The interface we have already in postgres's network code has a notion of partial reads, or that reads might not return data. (Same with writing and partial writes.) So you'd buffer what compressed data you have and return - this is the preferred idiom so that we don't block or spin on a nonblocking socket. This is how the TLS code works already. Look at, for instance, pgtls_read(). If we get back SSL_ERROR_WANT_READ (i.e., not enough data to decrypt), we return no data and wait until the socket becomes readable again. Thanks, --Robbie
Вложения
В списке pgsql-hackers по дате отправления:
Предыдущее
От: Tom LaneДата:
Сообщение: Re: Avoiding Tablespace path collision for primary and standby
Следующее
От: Bruce MomjianДата:
Сообщение: Re: [Proposal] Table-level Transparent Data Encryption (TDE) and KeyManagement Service (KMS)