On Fri, Mar 03, 2023 at 10:32:53AM -0800, Jacob Champion wrote:
> On Sat, Feb 25, 2023 at 5:22 PM Justin Pryzby <pryzby@telsasoft.com> wrote:
> > This resolves cfbot warnings: windows and cppcheck.
> > And refactors zstd routines.
> > And updates docs.
> > And includes some fixes for earlier patches that these patches conflicts
> > with/depends on.
>
> This'll need a rebase (cfbot took a while to catch up).
Soon.
> The patchset includes basebackup modifications, which are part of a
> different CF entry; was that intended?
Yes, it's intentional - if zstd:long mode were to be merged first, then
this patch should include long mode from the start.
Or, if pgdump+zstd were merged first, then long mode could be added to
both places.
> I tried this on a local, 3.5GB, mostly-text table (from the UK Price
Thanks for looking. If your zstd library is compiled with thread
support, could you also try with :workers=N ? I believe this is working
correctly, but I'm going to ask for help verifying that...
It'd be especially useful to test under windows, where pgdump/restore
use threads instead of forking... If you have a windows environment but
not set up for development, I think it's possible to get cirrusci to
compile a patch for you and then retrieve the binaries provided as an
"artifact" (credit/blame for this idea should be directed to Thomas
Munro).
> With this particular dataset, I don't see much improvement with
> zstd:long.
Yeah. I this could be because either 1) you already got very good
comprssion without looking at more data; and/or 2) the neighboring data
is already very similar, maybe equally or more similar, than the further
data, from which there's nothing to gain.
> (At nearly double the CPU time, I get a <1% improvement in
> compression size.) I assume it's heavily data dependent, but from the
> notes on --long [2] it seems like they expect you to play around with
> the window size to further tailor it to your data. Does it make sense
> to provide the long option without the windowLog parameter?
I don't want to start exposing lots of fine-granined parameters at this
point. In the immediate case, it looks like it may require more than
just adding another parameter:
Note: If windowLog is set to larger than 27,
--long=windowLog or --memory=windowSize needs to be passed to the
decompressor.
--
Justin