executor: release large memory allocations sooner
From:    Jeff Davis
Subject: executor: release large memory allocations sooner
Date:
Msg-id:  b9f761fef848eb9820d0a01304baa2fdbb1fa807.camel@j-davis.com
List:    pgsql-hackers
Currently, an executor node can hold onto large data structures until ExecEndNode(). For complex plans, that means we can use a lot more memory than we need to. For instance, see the SQL at the end of this email.

(Note: this work is not intended to implement any kind of query-level work_mem enforcement. It reduces the maximum number of times that work_mem is used in certain kinds of complex plans, which may help with that problem, but enforcing that is not my goal right now.)

The attached proof-of-concept patch (0001) makes the Hash node call ExecShutdownNode() on the subplan when it's exhausted, and extends ExecShutdownHash() to release memory. I added some code to measure the peak memory usage (0002), and it went from about 200MB to about 150MB. The savings roughly correspond to the memory used by the first Hash node.

I'm not 100% sure that ExecShutdownNode() is the right place to do this, but it seems better than introducing yet another executor API. Thoughts?

We'd need to figure out what to do about rescan. One option is to just say that if EXEC_FLAG_REWIND is used, then we would never free the resources eagerly. But if the node is under a Memoize, it's unlikely to be rescanned with the same parameters again, so whatever state it already has might be useless anyway.

Also, are there any major challenges in making this work with parallel query?

Regards,
    Jeff Davis

Example:

CREATE TABLE t1(id1 int8 primary key);
CREATE TABLE t2(id2 int8 primary key, id1 int8);
CREATE TABLE t3(id3 int8, id2 int8, n NUMERIC);

INSERT INTO t1 SELECT g FROM generate_series(1, 1000000) g;
INSERT INTO t2 SELECT g+1000000000, g FROM generate_series(1, 1000000) g;
INSERT INTO t3 SELECT g+2000000000, g+1000000000, 3.14159
  FROM generate_series(1, 1000000) g;
INSERT INTO t3 SELECT random(1,1000000) + 3000000000,
                      random(1,1000000) + 3000000000, -1
  FROM generate_series(1, 10000000) g;
VACUUM ANALYZE;

SET work_mem = '1GB';
SET from_collapse_limit = 1;
SET enable_sort = off;
SET enable_nestloop = off;
SET max_parallel_workers = 0;
SET max_parallel_workers_per_gather = 0;

EXPLAIN (COSTS OFF)
  SELECT id1, COUNT(*)
    FROM t3,
         (SELECT t1.id1, t2.id2 FROM t1, t2 WHERE t2.id1 = t1.id1) s
   WHERE t3.id2 = s.id2
   GROUP BY id1;

                    QUERY PLAN
--------------------------------------------------
 HashAggregate
   Group Key: t1.id1
   ->  Hash Join
         Hash Cond: (t3.id2 = t2.id2)
         ->  Seq Scan on t3
         ->  Hash
               ->  Hash Join
                     Hash Cond: (t2.id1 = t1.id1)
                     ->  Seq Scan on t2
                     ->  Hash
                           ->  Seq Scan on t1
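[Editor's note: for context only, here is a minimal sketch of where changes along these lines could land in nodeHash.c. This is not the attached 0001/0002 patches: the helper name hash_subplan_exhausted() is hypothetical, function bodies are abbreviated, and coordination with the owning HashJoin's reference to the hash table is glossed over.]

/*
 * Sketch only (not the attached patch); assumes ExecShutdownNode() is the
 * chosen hook for eager release, as discussed in the message above.
 */
#include "postgres.h"

#include "executor/executor.h"
#include "executor/hashjoin.h"
#include "executor/nodeHash.h"

/*
 * Hypothetical helper: once the Hash node's subplan has returned its last
 * tuple, the subtree below it is no longer needed, so it could be shut down
 * eagerly.  Per the rescan question above, this should probably be skipped
 * when the node may be rewound (EXEC_FLAG_REWIND).
 */
static void
hash_subplan_exhausted(HashState *node)
{
	ExecShutdownNode(outerPlanState(node));
}

/*
 * ExecShutdownHash() already exists (it saves instrumentation for EXPLAIN);
 * the idea would be to extend it so that shutting down an exhausted subtree
 * also frees its largest allocation, the hash table itself.
 */
void
ExecShutdownHash(HashState *node)
{
	/* ... existing instrumentation handling ... */

	if (node->hashtable)
	{
		ExecHashTableDestroy(node->hashtable);
		node->hashtable = NULL;
		/* the owning HashJoin's hj_HashTable reference needs clearing too */
	}
}

Since ExecShutdownNode() already walks the plan tree and dispatches to per-node shutdown routines, hooking in there is what avoids introducing another executor entry point, per the message above.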
Attachments