Re: [HACKERS] Custom compression methods
From: Tomas Vondra
Subject: Re: [HACKERS] Custom compression methods
Msg-id: e02b7f01-ad9a-8b4c-609e-092a37d75926@2ndquadrant.com
In response to: Re: [HACKERS] Custom compression methods (Chris Travers <chris.travers@adjust.com>)
List: pgsql-hackers
On 3/19/19 4:44 PM, Chris Travers wrote:
>
> On Tue, Mar 19, 2019 at 12:19 PM Tomas Vondra
> <tomas.vondra@2ndquadrant.com <mailto:tomas.vondra@2ndquadrant.com>> wrote:
>
> > On 3/19/19 10:59 AM, Chris Travers wrote:
> > >
> > > Not discussing whether any particular committer should pick this up but
> > > I want to discuss an important use case we have at Adjust for this sort
> > > of patch.
> > >
> > > The PostgreSQL compression strategy is something we find inadequate for
> > > at least one of our large deployments (a large debug log spanning
> > > 10PB+). Our current solution is to set storage so that it does not
> > > compress and then run on ZFS to get compression speedups on spinning
> > > disks.
> > >
> > > But running PostgreSQL on ZFS has some annoying costs because we have
> > > copy-on-write on copy-on-write, and when you add file fragmentation... I
> > > would really like to be able to get away from having to do ZFS as an
> > > underlying filesystem. While we have good write throughput, read
> > > throughput is not as good as I would like.
> > >
> > > An approach that would give us better row-level compression would allow
> > > us to ditch the COW filesystem under PostgreSQL approach.
> > >
> > > So I think the benefits are actually quite high particularly for those
> > > dealing with volume/variety problems where things like JSONB might be a
> > > go-to solution. Similarly I could totally see having systems which
> > > handle large amounts of specialized text having extensions for dealing
> > > with these.
> >
> > Sure, I don't disagree - the proposed compression approach may be a big
> > win for some deployments further down the road, no doubt about it. But
> > as I said, it's unclear when we get there (or if the interesting stuff
> > will be in some sort of extension, which I don't oppose in principle).
>
> I would assume that if extensions are particularly stable and useful
> they could be moved into core.
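[Editor's note: the "set storage so that it does not compress" workaround Chris describes above can be expressed with standard PostgreSQL syntax; the table and column names in this sketch are made up for illustration.]

```sql
-- Store the column out-of-line but uncompressed, so the filesystem
-- (e.g. ZFS) handles compression instead of TOAST's built-in PGLZ.
-- "debug_log" and "payload" are hypothetical names, not from the thread.
ALTER TABLE debug_log ALTER COLUMN payload SET STORAGE EXTERNAL;
```

EXTERNAL permits out-of-line storage while disabling compression, which is why it pairs naturally with a compressing filesystem underneath.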
> But I would also assume that at first, this area would be sufficiently
> experimental that folks (like us) would write our own extensions for it.
>
> > > > But hey, I think there are committers working for postgrespro, who
> > > > might have the motivation to get this over the line. Of course,
> > > > assuming that there are no serious objections to having this
> > > > functionality or how it's implemented ... But I don't think that
> > > > was the case.
> > >
> > > While I am not currently able to speak for questions of how it is
> > > implemented, I can say with very little doubt that we would almost
> > > certainly use this functionality if it were there and I could see
> > > plenty of other cases where this would be a very appropriate direction
> > > for some other projects as well.
> >
> > Well, I guess the best thing you can do to move this patch forward is to
> > actually try that on your real-world use case, and report your results
> > and possibly do a review of the patch.
>
> Yeah, I expect to do this within the next month or two.
>
> > IIRC there was an extension [1] leveraging this custom compression
> > interface for better jsonb compression, so perhaps that would work for
> > you (not sure if it's up to date with the current patch, though).
> >
> > [1]
> > https://www.postgresql.org/message-id/20171130182009.1b492eb2%40wp.localdomain
>
> Yeah I will be looking at a couple different approaches here and
> reporting back. I don't expect it will be a full production workload but
> I do expect to be able to report on benchmarks in both storage and
> performance.

FWIW I was a bit curious how that jsonb compression would affect the data
set I'm using for testing jsonpath patches, so I spent a bit of time
getting it to work with master. The attached patch gets it to compile, but
unfortunately then it fails like this:

    ERROR:  jsonbd: worker has detached

It seems there's some bug in how shm_mq is used, but I don't have time to
investigate that further.
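[Editor's note: for readers following the thread, the custom compression interface discussed here would, as proposed in the patch set, look roughly like the sketch below. The exact syntax varied between patch versions and was never committed in this form; the handler, table, and column names are illustrative, not taken from the patch.]

```sql
-- Sketch of the proposed (uncommitted) custom compression interface;
-- syntax is approximate and all names here are hypothetical.
CREATE COMPRESSION METHOD jsonbd HANDLER jsonbd_compression_handler;
ALTER TABLE logs ALTER COLUMN payload SET COMPRESSION jsonbd;
```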
regards

-- 
Tomas Vondra                  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Attachments
In the pgsql-hackers list, by message date:

Previous
From: Robert Haas
Message: Re: Libpq support to connect to standby server as priority

Next
From: Robert Haas
Message: Re: Transaction commits VS Transaction commits (with parallel) VS query mean time