Re: [Patch] Temporary tables that do not bloat pg_catalog (a.k.a fast temp tables)

Поиск
Список
Период
Сортировка
От Pavel Stehule
Тема Re: [Patch] Temporary tables that do not bloat pg_catalog (a.k.a fast temp tables)
Дата
Msg-id CAFj8pRAv+vPxzTQMviCsYqQrfpYei5bTvX80zRUp1Q3xAbTWyg@mail.gmail.com
обсуждение исходный текст
Ответ на Re: [Patch] Temporary tables that do not bloat pg_catalog (a.k.a fast temp tables)  (Tomas Vondra <tomas.vondra@2ndquadrant.com>)
Ответы Re: [Patch] Temporary tables that do not bloat pg_catalog (a.k.a fast temp tables)  (Tomas Vondra <tomas.vondra@2ndquadrant.com>)
Re: [Patch] Temporary tables that do not bloat pg_catalog (a.k.a fast temp tables)  (Tom Lane <tgl@sss.pgh.pa.us>)
Список pgsql-hackers


2016-07-30 1:46 GMT+02:00 Tomas Vondra <tomas.vondra@2ndquadrant.com>:
Hi,

On 07/29/2016 01:15 PM, Aleksander Alekseev wrote:
Hello

Some time ago we discussed an idea of "fast temporary tables":

https://www.postgresql.org/message-id/20160301182500.2c81c3dc%40fujitsu

In two words the idea is following.

<The Idea>

PostgreSQL stores information about all relations in pg_catalog. Some
applications create and delete a lot of temporary tables. It causes a
bloating of pg_catalog and running auto vacuum on it. It's quite an
expensive operation which affects entire database performance.

We could introduce a new type of temporary tables. Information about
these tables is stored not in a catalog but in backend's memory. This
way user can solve a pg_catalog bloating problem and improve overall
database performance.

</The Idea>

Great! Thanks for the patch, this is definitely an annoying issue worth fixing. I've spent a bit of time looking at the patch today, comments below ...

Yes, it some what we need long time


I took me a few months but eventually I made it work. Attached patch
has some flaws. I decided not to invest a lot of time in documenting
it or pgindent'ing all files yet. In my experience it will be rewritten
entirely 3 or 4 times before merging anyway :) But it _works_ and
passes all tests I could think of, including non-trivial cases like
index-only or bitmap scans of catalog tables.


Well, jokes aside, that's a pretty lousy excuse for not writing any docs, and you're pretty much asking the reviewers to reverse-engineer your reasoning. So I doubt you'll get many serious reviews without fixing this gap.

Usage example:

```
CREATE FAST TEMP TABLE fasttab_test1(x int, s text);

INSERT INTO fasttab_test1 VALUES (1, 'aaa'), (2, 'bbb'), (3, 'ccc');

UPDATE fasttab_test1 SET s = 'ddd' WHERE x = 2;

DELETE FROM fasttab_test1 WHERE x = 3;

SELECT * FROM fasttab_test1 ORDER BY x;

DROP TABLE fasttab_test1;
```

More sophisticated examples could be find in regression tests:

./src/test/regress/sql/fast_temp.sql

Any feedback on this patch will be much appreciated!



1) I wonder whether the FAST makes sense - does this really change the performance significantly? IMHO you only move the catalog rows to memory, so why should the tables be any faster? I also believe this conflicts with SQL standard specification of CREATE TABLE.

Probably has zero value to have slow and fast temp tables (from catalogue cost perspective). So the FAST implementation should be used everywhere. But there are some patterns used with work with temp tables,that should not working, and we would to decide if we prepare workaround or not.

-- problematic pattern (old code)
IF NOT EXISTS(SELECT * FROM pg_class WHERE ....) THEN
  CREATE TEMP TABLE xxx()
ELSE
  TRUNCATE TABLE xxx;
END IF;

-- modern patter (new code)
BEGIN
  TRUNCATE TABLE xxx;
EXCEPTION WHEN ..... THEN
  CREATE TEMP TABLE(...)
END;

In this case we can use GUC, because visible behave should be same.

The benefit of zero catalogue cost temp tables is significant - and for some larger applications the temp tables did hard performance issues.
 

2) Why do we need the new relpersistence value? ISTM we could easily got with just RELPERSISTENCE_TEMP, which would got right away of many chances as the steps are exactly the same.

IMHO if this patch gets in, we should use it as the only temp table implementation (Or can you think of cases where keeping rows in pg_class has advantages?). That'd also eliminate the need for FAST keyword in the CREATE TABLE command.

The one thin I'm not sure about is that our handling of temporary tables is not standard compliant - we require each session to create it's own private temporary table. Moving the rows from pg_class into backend memory seems to go in the opposite direction, but as no one was planning to fix this, I don't think it matters much.

3) I think the heapam/indexam/xact and various other places needs a major rework. You've mostly randomly sprinkled the code with function calls to make the patch work - that's fine for an initial version, but a more principled approach is needed.

4) I'm getting failures in the regression suite - apparently the patch somehow affects costing of index only scans, so that a several queries switch from index only scans to bitmap index scans etc. I haven't investigated this more closely, but it seems quite consistent (and I don't see it without the patch). It seems the patch delays building of visibility map, or something like that.

Some other random notes:

1. With this code should not be hard to implement global temp tables - shared persistent structure, temp local data - significant help for any who have to migrate from Oracle.

2. This should to work on slaves - it is one of ToDo

3. I didn't see support for memory store for column's statistics. Some separate questions is about production statistics - pg_stat_user_table, ..

Great and important work, thank you

Pavel


 

regards

--
Tomas Vondra                  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

В списке pgsql-hackers по дате отправления:

Предыдущее
От: "David G. Johnston"
Дата:
Сообщение: Re: [BUGS] BUG #14244: wrong suffix for pg_size_pretty()
Следующее
От: Pavel Stehule
Дата:
Сообщение: Re: [BUGS] BUG #14244: wrong suffix for pg_size_pretty()