Обсуждение: Effect of the WindowAgg on the Nested Loop
Greetings.
I've been playing with a small query that I've been asked to optimize
and noticed a strange (for me) effect.
Query uses this table:
Table "clc06_tiles"
Column | Type |
Modifiers
------------+-----------------------+-----------------------------------------------------------
geometry | geometry |
code_06 | character varying(3) |
gid | bigint | not null default
nextval('clc06_tiles_gid_seq'::regclass)
Indexes:
"clc06_tiles_pkey" PRIMARY KEY, btree (gid)
"i_clc06_tiles_geometry" gist (geometry)
Check constraints:
"enforce_dims_geom" CHECK (st_ndims(geometry) = 2)
"enforce_geotype_geom" CHECK (geometrytype(geometry) =
'MULTIPOLYGON'::text OR geometrytype(geometry) = 'POLYGON'::text OR
geometry IS NULL)
"enforce_srid_geom" CHECK (st_srid(geometry) = 3035)
and this function:
CREATE OR REPLACE FUNCTION my_trans(x1 float8, y1 float8, x2 float8,
y2 float8) RETURNS geometry AS $my_trans$
SELECT st_Transform(
st_GeomFromText('LINESTRING('||x1::text||' '||y1::text||
', '||x2::text||' '||y2::text||')',4326),3035);
$my_trans$ LANGUAGE sql IMMUTABLE STRICT;
and these constants:
\set x1 4.56
\set y1 52.54
\set x2 5.08
\set y2 53.34
Original query looks like this ( http://explain.depesz.com/s/pzv ):
SELECT n, i.*, st_NumGeometries(i.geom)
FROM (
SELECT a.code_06 as code_06,
st_Multi(st_Intersection(a.geometry,
my_trans(:x1,:y1,:x2,:y2))) as geom
FROM clc06_tiles a
WHERE st_Intersects(a.geometry, my_trans(:x1,:y1,:x2,:y2))) i
JOIN generate_series(1,10) n ON n <= st_NumGeometries(i.geom);
After a while I added row_number() to the inner part (
http://explain.depesz.com/s/hfs ):
SELECT n, i.*, st_NumGeometries(i.geom)
FROM (
SELECT row_number() OVER () AS rn, a.code_06 as code_06,
st_Multi(st_Intersection(a.geometry,
my_trans(:x1,:y1,:x2,:y2))) as geom
FROM clc06_tiles a
WHERE st_Intersects(a.geometry, my_trans(:x1,:y1,:x2,:y2))) i
JOIN generate_series(1,10) n ON n <= st_NumGeometries(i.geom);
It was really surprising to see a "side" effect of 8x performance boost.
The only difference I can see is an extra WindowAgg step in the second variant.
Could you kindly explain how WindowAgg node affects the overall
performance, please?
PostgreSQL 9.2.1 on x86_64-unknown-linux-gnu, compiled by gcc
(Ubuntu/Linaro 4.6.1-9ubuntu3) 4.6.1, 64-bit
archive_command | (disabled) | configuration file
bgwriter_delay | 100ms | configuration file
bgwriter_lru_maxpages | 200 | configuration file
checkpoint_segments | 30 | configuration file
default_text_search_config | pg_catalog.english | configuration file
effective_cache_size | 3GB | configuration file
listen_addresses | * | configuration file
log_checkpoints | on | configuration file
log_connections | on | configuration file
log_destination | csvlog | configuration file
log_disconnections | on | configuration file
log_lock_waits | on | configuration file
log_min_duration_statement | 100ms | configuration file
log_rotation_age | 1d | configuration file
log_temp_files | 20MB | configuration file
log_timezone | UTC | configuration file
logging_collector | on | configuration file
maintenance_work_mem | 64MB | configuration file
max_connections | 100 | configuration file
max_stack_depth | 2MB | environment variable
max_wal_senders | 2 | configuration file
port | 5432 | configuration file
shared_buffers | 768MB | configuration file
temp_buffers | 32MB | configuration file
TimeZone | UTC | configuration file
wal_level | hot_standby | configuration file
work_mem | 8MB | configuration file
--
Victor Y. Yegorov
On Tue, Jan 22, 2013 at 3:57 PM, Виктор Егоров <vyegorov@gmail.com> wrote:
> Greetings.
>
> I've been playing with a small query that I've been asked to optimize
> and noticed a strange (for me) effect.
> Query uses this table:
>
> Table "clc06_tiles"
> Column | Type |
> Modifiers
> ------------+-----------------------+-----------------------------------------------------------
> geometry | geometry |
> code_06 | character varying(3) |
> gid | bigint | not null default
> nextval('clc06_tiles_gid_seq'::regclass)
> Indexes:
> "clc06_tiles_pkey" PRIMARY KEY, btree (gid)
> "i_clc06_tiles_geometry" gist (geometry)
> Check constraints:
> "enforce_dims_geom" CHECK (st_ndims(geometry) = 2)
> "enforce_geotype_geom" CHECK (geometrytype(geometry) =
> 'MULTIPOLYGON'::text OR geometrytype(geometry) = 'POLYGON'::text OR
> geometry IS NULL)
> "enforce_srid_geom" CHECK (st_srid(geometry) = 3035)
>
> and this function:
> CREATE OR REPLACE FUNCTION my_trans(x1 float8, y1 float8, x2 float8,
> y2 float8) RETURNS geometry AS $my_trans$
> SELECT st_Transform(
> st_GeomFromText('LINESTRING('||x1::text||' '||y1::text||
> ', '||x2::text||' '||y2::text||')',4326),3035);
> $my_trans$ LANGUAGE sql IMMUTABLE STRICT;
>
> and these constants:
> \set x1 4.56
> \set y1 52.54
> \set x2 5.08
> \set y2 53.34
>
>
> Original query looks like this ( http://explain.depesz.com/s/pzv ):
>
> SELECT n, i.*, st_NumGeometries(i.geom)
> FROM (
> SELECT a.code_06 as code_06,
> st_Multi(st_Intersection(a.geometry,
> my_trans(:x1,:y1,:x2,:y2))) as geom
> FROM clc06_tiles a
> WHERE st_Intersects(a.geometry, my_trans(:x1,:y1,:x2,:y2))) i
> JOIN generate_series(1,10) n ON n <= st_NumGeometries(i.geom);
>
>
> After a while I added row_number() to the inner part (
> http://explain.depesz.com/s/hfs ):
>
> SELECT n, i.*, st_NumGeometries(i.geom)
> FROM (
> SELECT row_number() OVER () AS rn, a.code_06 as code_06,
> st_Multi(st_Intersection(a.geometry,
> my_trans(:x1,:y1,:x2,:y2))) as geom
> FROM clc06_tiles a
> WHERE st_Intersects(a.geometry, my_trans(:x1,:y1,:x2,:y2))) i
> JOIN generate_series(1,10) n ON n <= st_NumGeometries(i.geom);
>
>
> It was really surprising to see a "side" effect of 8x performance boost.
> The only difference I can see is an extra WindowAgg step in the second variant.
>
> Could you kindly explain how WindowAgg node affects the overall
> performance, please?
Apologies for resurrecting an old thread, but I just came across this
post while doing some research and I don't see any responses.
This seems like a mighty interesting example. I'm not sure what's
going on here, but let me guess. I think that the WindowAgg is
forcing some operation - detoasting, maybe? - to happen under the
materialize node. As a result, it only gets done once. But in the
other plan, the detoast happens at the nested loop level, above the
materialize node, and therefore it happens 10x instead of 1x.
--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
2013/5/15 Robert Haas <robertmhaas@gmail.com>
I was playing with the query a while ago and put it aside since then,
--
Victor Y. Yegorov
Apologies for resurrecting an old thread, but I just came across this> Original query looks like this ( http://explain.depesz.com/s/pzv ):
>
> After a while I added row_number() to the inner part (
> http://explain.depesz.com/s/hfs ):
>
> It was really surprising to see a "side" effect of 8x performance boost.
> The only difference I can see is an extra WindowAgg step in the second variant.
post while doing some research and I don't see any responses.
This seems like a mighty interesting example. I'm not sure what's
going on here, but let me guess. I think that the WindowAgg is
forcing some operation - detoasting, maybe? - to happen under the
materialize node. As a result, it only gets done once. But in the
other plan, the detoast happens at the nested loop level, above the
materialize node, and therefore it happens 10x instead of 1x.
need time to come back to this thing.
I will try to put together a testcase for this example, I'd like to achieve
the same behavior on a non-GIS data set.
Victor Y. Yegorov