Обсуждение: [HACKERS] SET NOT NULL [NOT VALID / CONCURRENTLY]?

Поиск
Список
Период
Сортировка

[HACKERS] SET NOT NULL [NOT VALID / CONCURRENTLY]?

От
Joel Jacobson
Дата:
Hi hackers,

I would be good if it would be possible to quickly set NOT NULL for an
existing column in a table
that have no rows where the column IS NULL and where there is a full
index on the column
allowing the logics to quickly understand there are no NULL values,
and just have to take a quick
lock on the table to prevent any modifications during the short time
when the NOT NULL
is set for the column.

Currently if you want to set NOT NULL for a column in a huge table
that's not doable without blocking all writes to the table for quite
some time.
Setting NOT NULL for a 100 million row table took 28 seconds locally
on my machine.

Is anyone working on fixing this for PostgreSQL 10?

Joel Jacobson
Trustly



Re: [HACKERS] SET NOT NULL [NOT VALID / CONCURRENTLY]?

От
Craig Ringer
Дата:
On 21 December 2016 at 16:48, Joel Jacobson <joel@trustly.com> wrote:
> Hi hackers,
>
> I would be good if it would be possible to quickly set NOT NULL for an
> existing column in a table
> that have no rows where the column IS NULL and where there is a full
> index on the column
> allowing the logics to quickly understand there are no NULL values,
> and just have to take a quick
> lock on the table to prevent any modifications during the short time
> when the NOT NULL
> is set for the column.
>
> Currently if you want to set NOT NULL for a column in a huge table
> that's not doable without blocking all writes to the table for quite
> some time.
> Setting NOT NULL for a 100 million row table took 28 seconds locally
> on my machine.
>
> Is anyone working on fixing this for PostgreSQL 10?

Not as far as I know.

IMO this and other similar cases should all be handled the same way:
create the constraint NOT VALID, then VALIDATE it while holding a weak
lock that only blocks concurrent schema changes.



-- Craig Ringer                   http://www.2ndQuadrant.com/PostgreSQL Development, 24x7 Support, Training & Services



Re: [HACKERS] SET NOT NULL [NOT VALID / CONCURRENTLY]?

От
Joel Jacobson
Дата:
On Wed, Dec 21, 2016 at 4:24 PM, Craig Ringer <craig@2ndquadrant.com> wrote:
>> Is anyone working on fixing this for PostgreSQL 10?
>
> Not as far as I know.
>
> IMO this and other similar cases should all be handled the same way:
> create the constraint NOT VALID, then VALIDATE it while holding a weak
> lock that only blocks concurrent schema changes.

Sounds like a good approach.

Similar to what we (Trustly) did when we sponsored the FOR KEY LOCK
feature to improve concurrency,
we would be very interested in also sponsoring this feature, as it
would mean a great lot to us.
I don't know if this is the right forum trying to find someone/some
company to sign up for the task,
please let me know if I should mail to some other list. Thanks.

Joel Jacobson
Trustly



Re: [HACKERS] SET NOT NULL [NOT VALID / CONCURRENTLY]?

От
Craig Ringer
Дата:
On 21 December 2016 at 19:01, Joel Jacobson <joel@trustly.com> wrote:

> Similar to what we (Trustly) did when we sponsored the FOR KEY LOCK
> feature to improve concurrency,
> we would be very interested in also sponsoring this feature, as it
> would mean a great lot to us.
> I don't know if this is the right forum trying to find someone/some
> company to sign up for the task,
> please let me know if I should mail to some other list. Thanks.

You'll probably get mail off list.

For what it's worth, there's a bit of a complexity here. PostgreSQL
doesn't model NOT NULL as a true CONSTRAINT. Instead it's a column
attribute. I suspect we would need to change that in order to allow a
NOT VALID NOT NULL constraint to be created.

That's at least partly why the docs say that "option NOT VALID [...]
is currently only allowed for foreign key and CHECK constraints".

Note that "[VALIDATE] acquires only a SHARE UPDATE EXCLUSIVE lock on
the table being altered" so it's already suitable for what you need.
The challenge is making it possible to create a NOT VALID constraint
for NOT NULL.

-- Craig Ringer                   http://www.2ndQuadrant.com/PostgreSQL Development, 24x7 Support, Training & Services



Re: [HACKERS] SET NOT NULL [NOT VALID / CONCURRENTLY]?

От
Joel Jacobson
Дата:
If you are fully confident you have no NULL values,
e.g. if you have all your logics in db functions and you validate all
INSERTs to a table won't pass any NULL values,
and you have checked all the rows in a table are NOT NULL for the column,
would it be completely crazy to just set pg_attribute.attnotnull to
TRUE for the column?

Is anything else happening "under the hood" than just locking all rows
and verifying there are no NULL rows, and then setting attnotnull to
TRUE?


On Wed, Dec 21, 2016 at 6:37 PM, Craig Ringer <craig@2ndquadrant.com> wrote:
> On 21 December 2016 at 19:01, Joel Jacobson <joel@trustly.com> wrote:
>
>> Similar to what we (Trustly) did when we sponsored the FOR KEY LOCK
>> feature to improve concurrency,
>> we would be very interested in also sponsoring this feature, as it
>> would mean a great lot to us.
>> I don't know if this is the right forum trying to find someone/some
>> company to sign up for the task,
>> please let me know if I should mail to some other list. Thanks.
>
> You'll probably get mail off list.
>
> For what it's worth, there's a bit of a complexity here. PostgreSQL
> doesn't model NOT NULL as a true CONSTRAINT. Instead it's a column
> attribute. I suspect we would need to change that in order to allow a
> NOT VALID NOT NULL constraint to be created.
>
> That's at least partly why the docs say that "option NOT VALID [...]
> is currently only allowed for foreign key and CHECK constraints".
>
> Note that "[VALIDATE] acquires only a SHARE UPDATE EXCLUSIVE lock on
> the table being altered" so it's already suitable for what you need.
> The challenge is making it possible to create a NOT VALID constraint
> for NOT NULL.
>
> --
>  Craig Ringer                   http://www.2ndQuadrant.com/
>  PostgreSQL Development, 24x7 Support, Training & Services



-- 
Joel Jacobson

Mobile: +46703603801
Trustly.com | Newsroom | LinkedIn | Twitter



Re: [HACKERS] SET NOT NULL [NOT VALID / CONCURRENTLY]?

От
Joel Jacobson
Дата:
Attached is the function SET_NOT_NULL(_Schema name, _Table name, _Column name) which does the following:

1. LOCK TABLE %I.%I IN ACCESS EXCLUSIVE MODE
just like the normal DDL commands would do

2. SELECT EXISTS (SELECT 1 FROM %I.%I WHERE %I IS NULL)
which is fast if there is an index on the column

3. UPDATE pg_catalog.pg_attribute SET attnotnull = TRUE
    WHERE attrelid = %L::oid
    AND   attname  = %L

Pragmatically, would this be a safe approach?
 

On Wed, Dec 21, 2016 at 6:53 PM, Joel Jacobson <joel@trustly.com> wrote:
If you are fully confident you have no NULL values,
e.g. if you have all your logics in db functions and you validate all
INSERTs to a table won't pass any NULL values,
and you have checked all the rows in a table are NOT NULL for the column,
would it be completely crazy to just set pg_attribute.attnotnull to
TRUE for the column?

Is anything else happening "under the hood" than just locking all rows
and verifying there are no NULL rows, and then setting attnotnull to
TRUE?


On Wed, Dec 21, 2016 at 6:37 PM, Craig Ringer <craig@2ndquadrant.com> wrote:
> On 21 December 2016 at 19:01, Joel Jacobson <joel@trustly.com> wrote:
>
>> Similar to what we (Trustly) did when we sponsored the FOR KEY LOCK
>> feature to improve concurrency,
>> we would be very interested in also sponsoring this feature, as it
>> would mean a great lot to us.
>> I don't know if this is the right forum trying to find someone/some
>> company to sign up for the task,
>> please let me know if I should mail to some other list. Thanks.
>
> You'll probably get mail off list.
>
> For what it's worth, there's a bit of a complexity here. PostgreSQL
> doesn't model NOT NULL as a true CONSTRAINT. Instead it's a column
> attribute. I suspect we would need to change that in order to allow a
> NOT VALID NOT NULL constraint to be created.
>
> That's at least partly why the docs say that "option NOT VALID [...]
> is currently only allowed for foreign key and CHECK constraints".
>
> Note that "[VALIDATE] acquires only a SHARE UPDATE EXCLUSIVE lock on
> the table being altered" so it's already suitable for what you need.
> The challenge is making it possible to create a NOT VALID constraint
> for NOT NULL.
>
> --
>  Craig Ringer                   http://www.2ndQuadrant.com/
>  PostgreSQL Development, 24x7 Support, Training & Services



--
Joel Jacobson

Mobile: +46703603801
Trustly.com | Newsroom | LinkedIn | Twitter



--
Joel Jacobson

Mobile: +46703603801
Вложения

Re: [HACKERS] SET NOT NULL [NOT VALID / CONCURRENTLY]?

От
Robert Haas
Дата:
On Wed, Dec 21, 2016 at 7:55 PM, Joel Jacobson <joel@trustly.com> wrote:
Attached is the function SET_NOT_NULL(_Schema name, _Table name, _Column name) which does the following:

1. LOCK TABLE %I.%I IN ACCESS EXCLUSIVE MODE
just like the normal DDL commands would do

2. SELECT EXISTS (SELECT 1 FROM %I.%I WHERE %I IS NULL)
which is fast if there is an index on the column

3. UPDATE pg_catalog.pg_attribute SET attnotnull = TRUE
    WHERE attrelid = %L::oid
    AND   attname  = %L

Pragmatically, would this be a safe approach?

Hmm, I don't see a problem with it.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company