Discussion: poor performance when recreating constraints on large tables

poor performance when recreating constraints on large tables

From:
Mike Broers
Date:
I originally posted this on the admin list, but it was suggested that I post it to the performance list, so here goes:

I am in the process of implementing cascade-on-delete constraints retroactively on rather large tables so I can cleanly remove deprecated data.  The problem is that recreating some foreign key constraints on tables of 55 million rows or more was taking much longer than the maintenance window I had, and now I am looking for tricks to speed up the process; hopefully there is something obvious I am overlooking.

Here is the SQL I am running (sorry, I'm obfuscating the object names a little):

BEGIN;
ALTER TABLE ONLY t1 DROP CONSTRAINT fk_t1_t2_id;
ALTER TABLE ONLY t1 ADD CONSTRAINT fk_t1_t2_id FOREIGN KEY(id) REFERENCES t2(id) 
ON DELETE CASCADE 
DEFERRABLE INITIALLY DEFERRED;
COMMIT;

t1 has 55 million rows
t2 has 72 million rows
the id columns are integer types
postgres version 8.3.8
there are nightly vacuum/analyze commands, and auto vacuum is enabled.

I have tried SET CONSTRAINTS both DEFERRED and IMMEDIATE, and the id column on t2 is indexed; it's the primary key.  Nothing really seems to impact the time it takes to recreate the constraint.  There may be memory settings to tweak; I was able to get it to run on a faster test server with local storage in about 10 minutes, but it was running for over an hour in our production environment.  We took down the application and I verified it wasn't waiting for an exclusive lock on the table or anything; it was running the ALTER TABLE command for that entire duration.
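
For reference, a check along these lines against the 8.3-era system views will show whether a backend is blocked waiting on a lock; this is a generic sketch rather than the exact query used above, and the column names differ in newer releases (pid, query, wait_event):

-- Is the backend running the ALTER TABLE marked as waiting on a lock?
SELECT procpid, waiting, current_query
FROM pg_stat_activity
WHERE current_query LIKE 'ALTER TABLE%';

-- Are there any ungranted lock requests at all?
SELECT locktype, relation::regclass AS relation, mode, granted
FROM pg_locks
WHERE NOT granted;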

Let me know if there is anything else I can supply that will help the review, thanks!

One additional question - is there any way to check how long postgres is estimating an operation will take to complete while it is running?

Thanks again,
Mike

Re: poor performance when recreating constraints on large tables

From:
Tom Lane
Date:
Mike Broers <mbroers@gmail.com> writes:
> I am in the process of implementing cascade on delete constraints
> retroactively on rather large tables so I can cleanly remove deprecated
> data.  The problem is recreating some foreign key constraints on tables of
> 55 million rows+ was taking much longer than the maintenance window I had,
> and now I am looking for tricks to speed up the process, hopefully there is
> something obvious i am overlooking.

maintenance_work_mem?

            regards, tom lane

Re: poor performance when recreating constraints on large tables

From:
Mike Broers
Date:
Thanks for the suggestion.  maintenance_work_mem is set to the default of 16MB on the host that was taking over an hour, as well as on the host that was taking less than 10 minutes.  I tried setting it to 1GB on the faster test server and it reduced the time from around 6-7 minutes to about 3:30.  This is a good start; if there are any other suggestions please let me know.  Also, is there any query to check the estimated time remaining on a long-running transaction?
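
For what it's worth, a per-session override along these lines avoids touching the server configuration at all; the 1GB figure is just the value from the test above, not a general recommendation:

BEGIN;
SET LOCAL maintenance_work_mem = '1GB';  -- applies only for the rest of this transaction
ALTER TABLE ONLY t1 DROP CONSTRAINT fk_t1_t2_id;
ALTER TABLE ONLY t1 ADD CONSTRAINT fk_t1_t2_id FOREIGN KEY (id) REFERENCES t2 (id)
    ON DELETE CASCADE
    DEFERRABLE INITIALLY DEFERRED;
COMMIT;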



On Mon, Jun 6, 2011 at 3:37 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
Mike Broers <mbroers@gmail.com> writes:
> I am in the process of implementing cascade on delete constraints
> retroactively on rather large tables so I can cleanly remove deprecated
> data.  The problem is recreating some foreign key constraints on tables of
> 55 million rows+ was taking much longer than the maintenance window I had,
> and now I am looking for tricks to speed up the process, hopefully there is
> something obvious i am overlooking.

maintenance_work_mem?

                       regards, tom lane

Re: poor performance when recreating constraints on large tables

From:
Robert Haas
Date:
On Mon, Jun 6, 2011 at 6:10 PM, Mike Broers <mbroers@gmail.com> wrote:
> Thanks for the suggestion, maintenance_work_mem is set to the default of
> 16MB on the host that was taking over an hour as well as on the host that
> was taking less than 10 minutes.  I tried setting it to 1GB on the faster
> test server and it reduced the time from around 6-7 minutes to about 3:30.
>  this is a good start, if there are any other suggestions please let me know
> - is there any query to check estimated time remaining on long running
> transactions?

Sadly, no.  I suspect that coming up with a good algorithm for that is
a suitable topic for a PhD thesis.  :-(

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

Re: poor performance when recreating constraints on large tables

From:
Samuel Gendler
Date:


On Wed, Jun 8, 2011 at 12:28 PM, Robert Haas <robertmhaas@gmail.com> wrote:
On Mon, Jun 6, 2011 at 6:10 PM, Mike Broers <mbroers@gmail.com> wrote:
> Thanks for the suggestion, maintenance_work_mem is set to the default of
> 16MB on the host that was taking over an hour as well as on the host that
> was taking less than 10 minutes.  I tried setting it to 1GB on the faster
> test server and it reduced the time from around 6-7 minutes to about 3:30.
>  this is a good start, if there are any other suggestions please let me know
> - is there any query to check estimated time remaining on long running
> transactions?

Sadly, no.  I suspect that coming up with a good algorithm for that is
a suitable topic for a PhD thesis.  :-(


The planner knows how many rows are expected for each step of the query plan, so it would be theoretically possible to compute how far along it is in processing a query based on those estimates, wouldn't it?  Combine percentage complete with time elapsed and you could get somewhat close if the stats are accurate, couldn't you?  Of course, I have no clue about the internals of the planner and query executor, which might or might not make such tracking of query execution possible.


Re: poor performance when recreating constraints on large tables

From:
"Kevin Grittner"
Date:
Samuel Gendler <sgendler@ideasculptor.com> wrote:

> The planner knows how many rows are expected for each step of the
> query plan, so it would be theoretically possible to compute how
> far along it is in processing a query based on those estimates,
> wouldn't it?

And it is sometimes off by orders of magnitude.  How much remaining
time do you report when the number of rows actually processed so far
is five times the estimated rows that the step would process?  How
about after it chugs on from there to 20 times the estimated row
count?  Of course, on your next query it might finish after
processing only 5% of the estimated rows....

-Kevin

Re: poor performance when recreating constraints on large tables

From:
Samuel Gendler
Date:


On Wed, Jun 8, 2011 at 12:53 PM, Kevin Grittner <Kevin.Grittner@wicourts.gov> wrote:
Samuel Gendler <sgendler@ideasculptor.com> wrote:

> The planner knows how many rows are expected for each step of the
> query plan, so it would be theoretically possible to compute how
> far along it is in processing a query based on those estimates,
> wouldn't it?

And it is sometimes off by orders of magnitude.  How much remaining
time do you report when the number of rows actually processed so far
is five times the estimated rows that the step would process?  How
about after it chugs on from there to 20 times the estimated row
count?  Of course, on your next query it might finish after
processing only 5% of the estimated rows....

Sure, but if it is a query that is slow enough for a time estimate to be useful, odds are good that stats that are that far out of whack would actually be interesting to whoever is looking at the time estimate, so showing some kind of 'N/A' response once things have gotten out of whack wouldn't be unwarranted.  Not that I'm suggesting that any of this is a particularly useful exercise.  I'm just playing with the original thought experiment suggestion.
 


Re: poor performance when recreating constraints on large tables

From:
Claudio Freire
Date:
---------- Forwarded message ----------
From: Claudio Freire <klaussfreire@gmail.com>
Date: Wed, Jun 8, 2011 at 11:57 PM
Subject: Re: [PERFORM] poor performance when recreating constraints on
large tables
To: Samuel Gendler <sgendler@ideasculptor.com>


On Wed, Jun 8, 2011 at 9:57 PM, Samuel Gendler
<sgendler@ideasculptor.com> wrote:
> Sure, but if it is a query that is slow enough for a time estimate to be
> useful, odds are good that stats that are that far out of whack would
> actually be interesting to whoever is looking at the time estimate, so
> showing some kind of 'N/A' response once things have gotten out of whack
> wouldn't be unwarranted.  Not that I'm suggesting that any of this is a
> particularly useful exercise.  I'm just playing with the original thought
> experiment suggestion.

There's a trick to get exactly that:

Do an EXPLAIN and fetch the expected row count for the result set,
then add a dummy sequence and a dummy field to the result set:
"nextval(...) AS progress".

Now, you probably won't get to read the progress column itself, but
that doesn't matter.  Open up another transaction and query the
sequence there.  Sequences are nontransactional.

All the smarts about figuring out the expected result set's size
remain in the application, which is fine by me.
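
A rough sketch of that trick in SQL (the table, filter, and sequence names are made up for illustration):

-- 1. Get the planner's expected row count for the query.
EXPLAIN SELECT * FROM big_table WHERE some_condition;  -- note "rows=NNN" on the top plan node

-- 2. Create a throwaway sequence and run the real query with a progress column.
CREATE SEQUENCE progress_seq;
SELECT nextval('progress_seq') AS progress, t.*
FROM big_table t
WHERE some_condition;

-- 3. Meanwhile, from a second session, poll the sequence; sequences ignore
--    transaction boundaries, so this shows how many rows have been counted so far.
SELECT last_value FROM progress_seq;

-- 4. Clean up when done.
DROP SEQUENCE progress_seq;

Dividing last_value by the estimated row count gives a crude progress fraction, with the caveat that plans which buffer rows (a top-level sort, for instance) only start advancing the counter once rows begin streaming out.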

Re: poor performance when recreating constraints on large tables

From:
Greg Smith
Date:
Samuel Gendler wrote:
> Sure, but if it is a query that is slow enough for a time estimate to
> be useful, odds are good that stats that are that far out of whack
> would actually be interesting to whoever is looking at the time
> estimate, so showing some kind of 'N/A' response once things have
> gotten out of whack wouldn't be unwarranted.

The next question is what are you then going to do with that information?

The ability to track some measure of "progress" relative to expectations
is mainly proposed as something helpful when a query has gone out of
control.  When that's happened, the progress meter normally turns out to
be fundamentally broken; the plan isn't happening at all as expected.
So, as you say, you will get an "N/A" response that says the query is
out of control, precisely in the cases where this sort of thing is
expected to be the most useful.

At that point, you have two choices.  You can let the query keep running
and see how long it really takes.  You have no idea how long that will
be, all you can do is wait and see because the estimation is trashed.
Or you can decide to kill it.  And the broken progress meter won't help
with that decision.  So why put it there at all?

What I try to do as a force of habit is run just about everything that
might take a while with "\timing" on, and try to keep statement_timeout
to a reasonable value at all times.  Do that enough, and you get a feel
for what reasonable and unreasonable time scales look like better than
the query executor can be expected to figure them out for you.  It would
be nice to provide a better UI here for tracking progress, but it would
really work only in the simplest of cases--which are of course the ones
you need it the least for.
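
A minimal example of that habit in a psql session (the timeout value here is arbitrary, purely for illustration):

\timing
SET statement_timeout = '30min';  -- abort anything that runs far longer than expected

ALTER TABLE ONLY t1 ADD CONSTRAINT fk_t1_t2_id FOREIGN KEY (id) REFERENCES t2 (id)
    ON DELETE CASCADE DEFERRABLE INITIALLY DEFERRED;

SET statement_timeout = 0;        -- remove the limit again afterwards

With \timing toggled on, psql reports how long each statement took, which over time builds exactly the sense of reasonable and unreasonable runtimes described above.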

--
Greg Smith   2ndQuadrant US    greg@2ndQuadrant.com   Baltimore, MD
PostgreSQL Training, Services, and 24x7 Support  www.2ndQuadrant.us
"PostgreSQL 9.0 High Performance": http://www.2ndQuadrant.com/books


Re: poor performance when recreating constraints on large tables

From:
Samuel Gendler
Date:


On Wed, Jun 8, 2011 at 10:57 PM, Greg Smith <greg@2ndquadrant.com> wrote:
Samuel Gendler wrote:
Sure, but if it is a query that is slow enough for a time estimate to be useful, odds are good that stats that are that far out of whack would actually be interesting to whoever is looking at the time estimate, so showing some kind of 'N/A' response once things have gotten out of whack wouldn't be unwarranted.

The next question is what are you then going to do with that information?

The ability to track some measure of "progress" relative to expectations is mainly proposed as something helpful when a query has gone out of control.  When that's happened, the progress meter normally turns out to be fundamentally broken; the plan isn't happening at all as expected.  So, as you say, you will get an "N/A" response that says the query is out of control, when in the cases where this sort of thing is expected to be the most useful.

Well, in my case, the use I'd put it to is a query that is necessarily long-running (aggregations over large quantities of data that take a minute or two to complete), and the stats are accurate enough that it would potentially let me show a progress meter of some kind in the few places where such queries are run interactively rather than on a schedule.  Not that I'm really thinking seriously about doing so, but there are places in code I maintain where such a thing could prove useful if its accuracy is reasonable for the queries in question.  Enough to at least toy with the suggested sequence method and see what happens when I've got some spare time to play.