Обсуждение: Hung Query with No Locking Issues

Поиск
Список
Период
Сортировка

Hung Query with No Locking Issues

От
"Michael P. McDonnell"
Дата:
Hey team - 
I have 2 stored procedures that need to run back to back. It could convert to a single one - but it's easier from a maintenance perspective to keep them separated.

The first procedure effectively is 
INSERT INTO table_b () SELECT ____ FROM _table_a_; 
COMMIT;

Total execution time - about 180s. Nothing in the pg_locks table and nothing in the pg_stat_activity table suggests anything is hung over.

The second procedure mutates table_b data into table_b_collapsed
INSERT INTO table_c () SELECT _____ FROM _table_b_ JOIN _table_b as b1_ JOIN _table_b as b2_ JOIN _table_b as b3_, etc...;
COMMIT;

The first time I run my second stored procedure - it hangs for up to 8 hours.
If I immediately cancel and re-run the second stored procedure it runs in 2 seconds.
If I let the query hang and rerun THE EXACT SAME QUERY in another tab of dbeaver - it runs in 2 seconds.

During the hang - there are no ungranted locks in pg_locks, nothing unusual in the stats_activity table that I know how to look for. CPU is locked at 100% for the postgres process during the duration of the hung query as well.

Speaking of cpu, this is running on a VMware VM with 8 CPUs and 192 GB of RAM as a solo VM on a 256 GB / 16 CPU host with SSDs backing the DB locally on the server (SATA 3 speeds I'm sure) and storage is sitting comfortably at 20% used and 1% inode utilization.

Not really sure what I'm missing here; I was having these problems even when they weren't a stored procedure. I was hoping pushing the commit immediately after the query would solve my problem. Any advice, pointers, would be appreciated.


Re: Hung Query with No Locking Issues

От
Tom Lane
Дата:
"Michael P. McDonnell" <bzaks1424@gmail.com> writes:
> I have 2 stored procedures that need to run back to back. It could
> convert to a single one - but it's easier from a maintenance perspective to
> keep them separated.

> The first procedure effectively is
> INSERT INTO table_b () SELECT ____ FROM _table_a_;
> COMMIT;

> Total execution time - about 180s. Nothing in the pg_locks table and
> nothing in the pg_stat_activity table suggests anything is hung over.

> The second procedure mutates table_b data into table_b_collapsed
> INSERT INTO table_c () SELECT _____ FROM _table_b_ JOIN _table_b as b1_
> JOIN _table_b as b2_ JOIN _table_b as b3_, etc...;
> COMMIT;

> The first time I run my second stored procedure - it hangs for up to 8
> hours.
> If I immediately cancel and re-run the second stored procedure it runs in 2
> seconds.

Perhaps an "ANALYZE table_b" in between would help.

            regards, tom lane



Re: Hung Query with No Locking Issues

От
"Michael P. McDonnell"
Дата:
Okay - that worked.
How did you know that would work? That's incredible.

On Sun, May 7, 2023 at 4:25 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:
"Michael P. McDonnell" <bzaks1424@gmail.com> writes:
> I have 2 stored procedures that need to run back to back. It could
> convert to a single one - but it's easier from a maintenance perspective to
> keep them separated.

> The first procedure effectively is
> INSERT INTO table_b () SELECT ____ FROM _table_a_;
> COMMIT;

> Total execution time - about 180s. Nothing in the pg_locks table and
> nothing in the pg_stat_activity table suggests anything is hung over.

> The second procedure mutates table_b data into table_b_collapsed
> INSERT INTO table_c () SELECT _____ FROM _table_b_ JOIN _table_b as b1_
> JOIN _table_b as b2_ JOIN _table_b as b3_, etc...;
> COMMIT;

> The first time I run my second stored procedure - it hangs for up to 8
> hours.
> If I immediately cancel and re-run the second stored procedure it runs in 2
> seconds.

Perhaps an "ANALYZE table_b" in between would help.

                        regards, tom lane

Re: Hung Query with No Locking Issues

От
Tom Lane
Дата:
"Michael P. McDonnell" <bzaks1424@gmail.com> writes:
> On Sun, May 7, 2023 at 4:25 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> Perhaps an "ANALYZE table_b" in between would help.

> Okay - that worked.
> How did you know that would work? That's incredible.

Well, I guessed ;-) ... but it was a well-educated guess.

The symptoms sounded like you were getting a bad plan choice for
the second query, which is unsurprising if you populate a table
from empty and don't give auto-analyze enough time to collect
some statistics behind your back.

As a solution going forward, I'd recommend adding the ANALYZE
as the last step in the table-populating procedure.

            regards, tom lane