Thread: speed up pg_upgrade with large number of tables


speed up pg_upgrade with large number of tables

From:
"杨伯宇(长堂)"
Date:
Hello postgres hackers:

I have recently been working on speeding up pg_upgrade for databases with over
a million tables, and I would like to share some findings that may be
optimizable or otherwise interesting.

1: Skip Compatibility Check In "pg_upgrade"
=============================================
In short, we have several databases, each with a million-plus tables. Running
the compatibility check before pg_dump can take roughly half an hour. If I have
already performed an online check before the actual upgrade, repeating it seems
unnecessary and only adds to the downtime in many situations.

So, I'm thinking, why not add a "--skip-check" option in pg_upgrade to skip it?
See "1-Skip_Compatibility_Check_v1.patch".


2: Accelerate "FastPathTransferRelationLocks"
===============================================
In this scenario, pg_restore takes much more time than pg_dump. By profiling
the "postgres" backend with perf, I found that much of the time is spent in
"LWLockAcquire" and "LWLockRelease". Digging deeper, I think I found the
reason:

When we create an index (very common during pg_restore), a "ShareLock" on the
relation must be acquired first. This is a "strong" lock, so before we modify
the shared lock hash table we must walk every proc and transfer any fast-path
locks it holds on the relation. The issue arises here (in
FastPathTransferRelationLocks): we acquire "fpInfoLock" before reading
"proc->databaseId", so we have to acquire and release that LWLock
"MaxBackends" times for every index. The reason is recorded in the comment:
```
/*
* proc->databaseId is set at backend startup time and never changes
* thereafter, so it might be safe to perform this test before
* acquiring &proc->fpInfoLock. In particular, it's certainly safe to
* assume that if the target backend holds any fast-path locks, it
* must have performed a memory-fencing operation (in particular, an
* LWLock acquisition) since setting proc->databaseId. However, it's
* less clear that our backend is certain to have performed a memory
* fencing operation since the other backend set proc->databaseId. So
* for now, we test it after acquiring the LWLock just to be safe.
*/
```

I agree with that reasoning, but it seems OK to replace the LWLockAcquire with
a memory barrier when reading "proc->databaseId", which saves some time.
See "2-Accelerate_FastPathTransferRelationLocks_v1.patch".


3: Optimize Toast Index Creation
====================================
While tracing the cause described in point "2", I noticed an interesting
behavior in toast index creation. In the function "create_toast_table":

```
/* ShareLock is not really needed here, but take it anyway */
toast_rel = table_open(toast_relid, ShareLock);
/* some operation */
index_create(xxxx)
```

Indeed, ShareLock is not really needed here, since this is the only transaction
to which the toast relation is visible. But by design (see "relation_open"),
NoLock is only supposed to be used when the caller already holds a suitable
lock. So I wonder whether it is still OK to use NoLock in a scenario where the
relation was created by the current transaction.
See "3-Optimize_Toast_Index_Creating_v1.patch".


That's what I've got. Any response is appreciated.

Best regards,
Yang Boyu
Attachments

Re: speed up pg_upgrade with large number of tables

From:
Daniel Gustafsson
Date:
> On 5 Jul 2024, at 09:12, 杨伯宇(长堂) <yangboyu.yby@alibaba-inc.com> wrote:

> 1: Skip Compatibility Check In "pg_upgrade"
> =============================================
> In short, we have several databases, each with a million-plus tables. Running
> the compatibility check before pg_dump can take roughly half an hour. If I have
> already performed an online check before the actual upgrade, repeating it seems
> unnecessary and only adds to the downtime in many situations.
>
> So, I'm thinking, why not add a "--skip-check" option in pg_upgrade to skip it?
> See "1-Skip_Compatibility_Check_v1.patch".

How would a user know that nothing has changed in the cluster between running
the check and running the upgrade with a skipped check?  Considering how
complicated it is to understand exactly what pg_upgrade does it seems like
quite a large caliber footgun.

I would be much more interested in making the check phase go faster, and indeed
there is ongoing work in this area.  Since it sounds like you have a dev and
test environment with a big workload, testing those patches would be helpful.
https://commitfest.postgresql.org/48/4995/ is one that comes to mind.

--
Daniel Gustafsson




Re: speed up pg_upgrade with large number of tables

From:
"杨伯宇(长堂)"
Date:
> > So, I'm thinking, why not add a "--skip-check" option in pg_upgrade to skip it?
> > See "1-Skip_Compatibility_Check_v1.patch".
>
> How would a user know that nothing has changed in the cluster between running
> the check and running the upgrade with a skipped check? Considering how
> complicated it is to understand exactly what pg_upgrade does it seems like
> quite a large caliber footgun.
Indeed, it's not feasible to run a quick check that guarantees nothing has
changed. However, in many cases a cluster consistently executes the same SQL
commands, so if we verified that the cluster was compatible 30 minutes earlier,
there is a strong likelihood that it remains compatible now. Therefore, adding
such a 'trust-me' option may still be beneficial.

> I would be much more interested in making the check phase go faster, and indeed
> there is ongoing work in this area. Since it sounds like you have a dev and
> test environment with a big workload, testing those patches would be helpful.
> https://commitfest.postgresql.org/48/4995/ is one that comes to mind.
Very meaningful work! I will try it.

--
Best regards,
Yang Boyu

Re: speed up pg_upgrade with large number of tables

From:
Nathan Bossart
Date:
On Fri, Jul 05, 2024 at 05:24:42PM +0800, 杨伯宇(长堂) wrote:
>> > So, I'm thinking, why not add a "--skip-check" option in pg_upgrade to skip it?
>> > See "1-Skip_Compatibility_Check_v1.patch".
>> 
>> How would a user know that nothing has changed in the cluster between running
>> the check and running the upgrade with a skipped check? Considering how
>> complicated it is to understand exactly what pg_upgrade does it seems like
>> quite a large caliber footgun.

I am also -1 on this one for the same reasons as Daniel.

>> I would be much more interested in making the check phase go faster, and indeed
>> there is ongoing work in this area. Since it sounds like you have a dev and
>> test environment with a big workload, testing those patches would be helpful.
>> https://commitfest.postgresql.org/48/4995/ is one that comes to mind.
> Very meaningful work! I will try it.

Thanks!  Since you mentioned that you have multiple databases with 1M+
tables, you might also be interested in commit 2329cad.  That should
speed up the pg_dump step quite a bit.

-- 
nathan



Re: speed up pg_upgrade with large number of tables

From:
"杨伯宇(长堂)"
Date:
> Thanks! Since you mentioned that you have multiple databases with 1M+
> tables, you might also be interested in commit 2329cad. That should
> speed up the pg_dump step quite a bit.
Wow, I noticed this commit (2329cad) when it appeared in the commitfest. It has
doubled the speed of pg_dump in this scenario. Thank you for your effort!

Besides, https://commitfest.postgresql.org/48/4995/ seems insufficient for this
situation. Some time-consuming functions, such as check_for_data_types_usage,
cannot yet run in parallel. But these patches could be a great starting point
for a more efficient parallel implementation. Maybe we can do that later.

Re: speed up pg_upgrade with large number of tables

From:
Nathan Bossart
Date:
On Mon, Jul 08, 2024 at 03:22:36PM +0800, 杨伯宇(长堂) wrote:
> Besides, https://commitfest.postgresql.org/48/4995/ seems insufficient for this
> situation. Some time-consuming functions, such as check_for_data_types_usage,
> cannot yet run in parallel. But these patches could be a great starting point
> for a more efficient parallel implementation. Maybe we can do that later.

I actually just wrote up the first version of the patch for parallelizing
the data type checks over the weekend.  I'll post it shortly.

-- 
nathan