Re: ALTER TABLE ADD COLUMN fast default

From: Tomas Vondra
Subject: Re: ALTER TABLE ADD COLUMN fast default
Date:
Msg-id: e198660e-ca2d-a2ea-6210-5af376a345fe@2ndquadrant.com
In response to: Re: ALTER TABLE ADD COLUMN fast default  (Andres Freund <andres@anarazel.de>)
List: pgsql-hackers
On 02/01/2018 02:54 PM, Andres Freund wrote:
> Hi,
> 
> On 2018-01-26 10:53:12 +1030, Andrew Dunstan wrote:
>> Yeah, thanks. revised patch attached
> 
> Given that this patch touches code that's a huge bottleneck in a lot of
> cases, I think this needs benchmarks that heavily exercises tuple
> deforming.
> 

That's a reasonable request, I guess, and I tried to collect such data
today. I measured two cases:

1) Table with 1000 columns and 64 rows, where no columns were added by
ALTER TABLE with 'fast' defaults. This is meant to measure the best
case, with minimal impact from the patch. See the create.sql script
attached to this message, and query.sql, which was used for the tests
with pgbench like this:

    pgbench -n -f query.sql -T 15 test
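
Since the scripts are attachments, here is roughly what they boil down
to (a minimal sketch; the exact column types, defaults, and query are
illustrative, the attached files are authoritative):

    -- create.sql (sketch): a 1000-column table created up front, so no
    -- fast defaults are involved, then filled with 64 rows
    SELECT 'CREATE TABLE t (' ||
           string_agg('c' || i || ' int DEFAULT 1', ', ' ORDER BY i) || ')'
      FROM generate_series(1, 1000) s(i) \gexec

    INSERT INTO t (c1) SELECT 1 FROM generate_series(1, 64);

    -- query.sql (sketch): force deforming all the way to the last column
    SELECT count(c1000) FROM t;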

After 100 runs, the results (in tps) look like this:

                 min      max      median
   --------------------------------------
    master      1827     1873        1860
   patched      2023     2066        2056

That is, the patch apparently improves the performance by about 10%
(according to perf profiles this is due to slot_deform_tuple getting
cheaper).

So this case seems fine.


2) Table with 64 rows and 1000 columns, all of them added by ALTER TABLE
with a fast default, i.e. without a table rewrite. See the attached
create-alter.sql.
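
The script does roughly the following (again a sketch, the attachment is
authoritative): create a narrow table, insert the 64 rows first, and only
then add the remaining columns with a default, so they are stored as
"missing" attributes instead of being written into the heap tuples:

    -- create-alter.sql (sketch)
    CREATE TABLE t (c1 int DEFAULT 1);
    INSERT INTO t SELECT 1 FROM generate_series(1, 64);

    SELECT 'ALTER TABLE t ADD COLUMN c' || i || ' int DEFAULT 1'
      FROM generate_series(2, 1000) s(i) \gexec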

Using the same query.sql as before, this shows a significant drop, to
only about 40 tps (from ~2000 tps on master). The profiles look something
like this:

    +   98.87%    98.87%  postgres            [.] slot_getmissingattrs
    +   98.77%     0.00%  postgres            [.] PortalRun
    +   98.77%     0.00%  postgres            [.] ExecAgg
    +   98.74%     0.01%  postgres            [.] ExecInterpExpr

which is kinda understandable, although the drop from 2000 tps to 40 tps
seems pretty significant. But then again, this case is constructed as a
fairly extreme corner case.

However, there seems to be some sort of bug. When I did VACUUM FULL, I
expected it to replace the "missing" default values with actual values
stored in the heap rows, eliminating the performance impact. Instead,
the default values got lost and replaced by NULL values, which seems
like a clear data-loss scenario.

I'm not quite sure what's wrong, but try this:

    \i create-alter.sql

    -- this returns 64, which is correct
    SELECT COUNT(*) FROM t;

    -- this actually returns 64 rows with values "1"
    SELECT c1000 FROM t;

    -- this returns 63, which is incorrect (should be 64)
    SELECT count(c1000) FROM t;

    VACUUM FULL t;

    -- suddenly we only get NULL values for all 64 rows
    SELECT c1000 FROM t;

    -- and now we get 0 (instead of 64)
    SELECT count(c1000) FROM t;
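
One way to narrow this down might be to compare the catalog state before
and after the VACUUM FULL. Assuming the patch tracks the fast defaults
in pg_attribute as atthasmissing/attmissingval (the column names are my
assumption about the current patch version), something like:

    -- check whether the rewrite cleared the missing-value metadata
    -- without actually materializing the defaults in the new heap
    SELECT attname, atthasmissing, attmissingval
      FROM pg_attribute
     WHERE attrelid = 't'::regclass AND attnum > 0
     ORDER BY attnum DESC
     LIMIT 3;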


regards

-- 
Tomas Vondra                  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

Attachments
