Re: Potential "AIO / io workers" inter-worker locking issue in PG18?
От | Marco Boeringa |
---|---|
Тема | Re: Potential "AIO / io workers" inter-worker locking issue in PG18? |
Дата | |
Msg-id | 6cf37645-9f1e-4ed6-9397-23ba5de04931@boeringa.demon.nl обсуждение исходный текст |
Ответ на | Re: Potential "AIO / io workers" inter-worker locking issue in PG18? (Andres Freund <andres@anarazel.de>) |
Список | pgsql-bugs |
Hi Andres, I have been doing a bit more investigation. As I explained before, the problematic multi-threaded geoprocessing step is not some stand-alone query that can be easily reduced to small easily portable reproducible case with attached data. In fact, this geoprocessing step is part of a large custom build Python geoprocessing workflow, with total code probably in the 25k code lines range. However, based on the apparent poor query plan in PG18 / PostGIS 3.6.0, I now reviewed the exact code once more. I noticed that just before entering the multi-threaded code that emits the queries as seen below, I am actually adding the primary key field 'objectid' as "GENERATED BY DEFAULT AS IDENTITY" to the 'osm.landcover_scrubs_small_scale_2_ply' table. Now I also noticed I did not run ANALYZE after that against the same table. I have now added this to the code. Although it is still preliminary, first tests seem to indicate that this resolves the issue, and prevents the stalls or better said apparent hugely inefficient query plan (remember: a < 10 sec process was turned into multi-hour). I still need to do more thorough testing to be sure though. However, this raises a couple of question: - While ANALYZE is of course hugely important for proper statistics and query planning, I have wondered if PostgreSQL shouldn't automatically have updated the statistics for the addition of the primary key with IDENTITY? It seems to me that based on the definition of the primary key column and IDENTITY and table size, the actual distribution of values is essentially already known even before any sampling of ANALYZE to update statistics? - Am I right to assume that only the statistics on the objectid field play any role in this issue? As you can see, the WHERE clause does not involve any other fields than the two objectid fields of the main table and the chunk table specifying the job. All other values computed are just derived straight from the geometry column. - Were there any hints in the all the other data I supplied as to where PG18's query planning without the updated statistics of the new ANALYZE step added, is going wrong? And why this was never an issue in <= PG17? I also did some preliminary test with the old PG17.6 / PostgGIS 3.6.0 cluster with the same Italy extract data. I still need to do more thorough testing, both with and without the extra ANALYZE step, to fully exclude that there isn't something related to the upgrade to PostGIS 3.6.0, but first indications are as I already saw with the PG17.6 / PostgGIS 3.5.3, that there are no issue with <= PG17 / PostGIS combination as regards this apparent planner issue. Marco Op 8-10-2025 om 21:08 schreef Andres Freund: > In [1] you showed a query. Reformated that looks like this: > > UPDATE osm.landcover_scrubs_small_scale_2_ply AS t1 > SET area_geo = t2.area_geo, > perim_geo = t2.perim_geo, > compact_geo = CASE WHEN t2.area_geo > 0 THEN ((power(t2.perim_geo,2) / t2.area_geo) / (4 * pi())) ELSE 0 END, > npoints_geo = t2.npoints_geo, > comp_npoints_geo = CASE WHEN t2.npoints_geo > 0 THEN (CASE WHEN t2.area_geo > 0 THEN ((power(t2.perim_geo,2) / t2.area_geo)/ (4 * pi())) ELSE 0 END / t2.npoints_geo) ELSE 0 END, > convex_ratio_geo = CASE WHEN ST_Area(ST_ConvexHull(way)::geography,true) > 0 THEN (t2.area_geo / ST_Area(ST_ConvexHull(way)::geography,true))ELSE 1 END > FROM ( > SELECT > objectid, > ST_Area(way::geography,true) AS area_geo, > ST_Perimeter(way::geography,true) AS perim_geo, > ST_NPoints(way) AS npoints_geo > FROM osm.landcover_scrubs_small_scale_2_ply) AS t2 > WHERE (t2.objectid = t1.objectid) > AND t1.objectid IN (SELECT t3.objectid FROM mini_test.osm.osm_tmp_28128_ch5 AS t3) > > > Which certainly fits with two nested loops, although I don't think I can infer > which order it the joins are in.
В списке pgsql-bugs по дате отправления: