On Thu, Jun 04, 2020 at 10:30:47AM -0700, Andres Freund wrote:
> On 2020-05-08 02:25:45 -0500, Justin Pryzby wrote:
> > Seems to me it should, at least conditionally. At least if there's a function
> > scan or a relation or ..
>
> Well, the problem is that this can cause very very significant
> regressions. As in 10x slower or more. The ringbuffer can cause constant
> XLogFlush() calls (due to the lsn interlock), and the eviction from
> shared_buffers (regardless of actual available) will mean future vacuums
> etc will be much slower. I think this is likely to cause pretty
> widespread regressions on upgrades.
>
> Now, it sucks that we have this problem in the general facility that's
> supposed to be used for this kind of bulk operation. But I don't really
> see it realistic as expanding use of bulk insert strategies unless we
> have some more fundamental fixes.
I made this conditional on BEGIN BULK/SET bulk, so I'll solicit comments on that.
postgres=# \t on \\ \set QUIET \\ VACUUM FULL t; \dt+ t \\ begin ; \timing on \\ INSERT INTO t SELECT * FROM t;
rollback;SELECT COUNT(1), usagecount FROM pg_buffercache GROUP BY 2 ORDER BY 2;
| public | t | table | pryzbyj | 35 MB |
|Time: 9497.318 ms (00:09.497)
| 33 | 1
| 3 | 2
| 18 | 3
| 5 | 4
| 4655 | 5
| 11670 |
vs
postgres=# \t on \\ \set QUIET \\ VACUUM FULL t; \dt+ t \\ begin BULK ; \timing on \\ INSERT INTO t SELECT * FROM t;
rollback;SELECT COUNT(1), usagecount FROM pg_buffercache GROUP BY 2 ORDER BY 2;
| public | t | table | pryzbyj | 35 MB |
|Time: 8268.780 ms (00:08.269)
| 2080 | 1
| 3 | 2
| 19 | 4
| 234 | 5
| 14048 |
And:
postgres=# begin ; \x \\ \t \\ SELECT statement_timestamp(); \o /dev/null \\ SELECT 'INSERT INTO t VALUES(0)' FROM
generate_series(1,999999);\set ECHO errors \\ \set QUIET on \\ \o \\ \gexec \\ SELECT statement_timestamp(); abort; \x
\\SELECT COUNT(1), usagecount FROM pg_buffercache GROUP BY 2 ORDER BY 2; a
|statement_timestamp | 2020-07-12 20:31:43.717328-05
|statement_timestamp | 2020-07-12 20:36:16.692469-05
|
| 52 | 1
| 24 | 2
| 17 | 3
| 6 | 4
| 4531 | 5
| 11754 |
vs
postgres=# begin BULK ; \x \\ \t \\ SELECT statement_timestamp(); \o /dev/null \\ SELECT 'INSERT INTO t VALUES(0)' FROM
generate_series(1,999999);\set ECHO errors \\ \set QUIET on \\ \o \\ \gexec \\ SELECT statement_timestamp(); abort; \x
\\SELECT COUNT(1), usagecount FROM pg_buffercache GROUP BY 2 ORDER BY 2; a
|statement_timestamp | 2020-07-12 20:43:47.089538-05
|statement_timestamp | 2020-07-12 20:48:04.798138-05
|
| 4456 | 1
| 22 | 2
| 1 | 3
| 7 | 4
| 79 | 5
| 11819 |
--
Justin