Thread: initdb and share/postgresql.conf.sample


initdb and share/postgresql.conf.sample

From:
Jeff Janes
Date:

In some of my git branches I have editorialized src/backend/utils/misc/postgresql.conf.sample to contain my configuration preferences for whatever that branch is meant to test.  This file then gets copied to share/postgresql.conf.sample during install, and from there to data/postgresql.conf during initdb, so I don't need to remember to go make the necessary changes by hand.

Am I insane to be doing this?  Is there a better way to handle these branch-specific configuration needs?

Anyway, I was recently astonished to discover that the contents of share/postgresql.conf.sample at initdb time affected the performance of the server, even when the conf file was replaced with something else before the server was started up.  To make a very long story short, if share/postgresql.conf.sample is set up for archiving, then somewhere in the initdb process some bootstrap process pre-creates a bunch of extra xlog files.
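One quick way to see how many pre-created segments are sitting in pg_xlog is to count files matching the standard 24-hex-digit WAL segment naming.  This is just an illustrative sketch; a throwaway directory with fake segment names stands in for a real freshly-initdb'd cluster here:

```shell
# Illustrative only: count files named like WAL segments in a pg_xlog
# directory.  A scratch directory with fake segment names substitutes
# for a real cluster's pg_xlog.
dir=$(mktemp -d)
for n in 1 2 3; do
    touch "$dir/$(printf '0000000100000000000000%02X' "$n")"
done
# WAL segment names are exactly 24 uppercase hex digits
ls "$dir" | grep -c '^[0-9A-F]\{24\}$'
```

Pointing the same `ls | grep -c` at a real data directory's pg_xlog before and after initdb makes the archiving-dependent pre-creation visible directly.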

Is this alarming?  It looks like initdb takes some pains, in at least one place, to make an empty config file rather than using postgresql.conf.sample, but it seems like a sub-process is not doing that.

Those extra log files then give the newly started server a boost (whether it is started in archive mode or not) because it doesn't have to create them itself.  It isn't so much a boost, as the absence of a new-server penalty.  I want to remove that penalty to get better numbers from benchmarking.  What I am doing now is this, between the initdb and the pg_ctl start:

for g in `perl -e 'printf("0000000100000000000000%02X\n",$_) foreach 2..120'`; do
    cp -i /tmp/data/pg_xlog/000000010000000000000001 "/tmp/data/pg_xlog/$g" < /dev/null
done

The "120" comes from 2 * checkpoint_segments.  That's mighty ugly; is there a better trick?

You could say that benchmarks should run long enough to average out such changes, but needing to run a benchmark that long can make some kinds of work (like git bisect) unrealistic rather than merely tedious.

Cheers,

Jeff

Re: initdb and share/postgresql.conf.sample

From:
Greg Stark
Date:
On Sun, Dec 23, 2012 at 11:11 PM, Jeff Janes <jeff.janes@gmail.com> wrote:
> You could say that benchmarks should run long enough to average out such
> changes

I would say any benchmark needs to be run long enough to reach a
steady state before the measurements are taken. The usual practice is
to run a series of groups and observe the aggregate measurements for
each group. For instance, do 10 runs, with each run consisting of 1000
repetitions of the transaction. Then you can observe at which point
the averages for individual groups begin to behave consistently. If
the first three are outliers but the remaining seven are stable, then
discard the first three and take the average (or often median) of the
remaining seven.
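That grouping-and-median procedure can be sketched in shell.  The per-group tps values below are invented for illustration (in practice they would come from the benchmark client, one figure per run of 1000 transactions):

```shell
# Per-group throughput figures, one per run (invented values; the first
# three are the warm-up outliers to be discarded)
tps="312 405 488 521 518 524 519 517 522 520"
# Drop the first three groups, sort the remaining seven numerically,
# and take the middle (4th) value as the median
echo $tps | tr ' ' '\n' | tail -n +4 | sort -n | awk 'NR==4 { print $1 }'
```

Here that prints 520, the median of the seven steady-state groups; the warm-up groups never contaminate the figure at all, which is the point of discarding rather than averaging them in.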

If you include the early runs which are affected by non-steady-state
conditions such as cache effects or file fragmentation then it can
take a very long time for those effects to be erased by averaging with
later results. Worse, it's very difficult to tell whether you've
waited long enough.


-- 
greg