Re: Potential "AIO / io workers" inter-worker locking issue in PG18?

From Marco Boeringa
Subject Re: Potential "AIO / io workers" inter-worker locking issue in PG18?
Date
Msg-id 1342ae2a-6261-49ac-a106-cf0a4a2d173a@boeringa.demon.nl
In response to Re: Potential "AIO / io workers" inter-worker locking issue in PG18?  (Thom Brown <thom@linux.com>)
List pgsql-bugs

Hi Thom,

As an extension to what I already wrote: since the processing gets stuck during UPDATEs, I realized the pg_aios view is likely not involved, as the current AIO implementation in PG18 only covers read operations such as sequential and bitmap heap scans.

So not seeing anything listed in the pg_aios view might be normal? That said, I have attempted to view and refresh the view during other stages of the processing, while pgAdmin appeared to show read operations, but still no records were displayed in pg_aios. Maybe I am just hitting the "refresh" button at the wrong time though...
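In case it helps anyone trying to reproduce this: instead of relying on pgAdmin's refresh button, the view can also be polled directly from psql, roughly like this (just a sketch; \watch simply re-runs the query every second, which should catch any in-flight I/Os):

-- Poll the new PG18 AIO view; any in-flight asynchronous I/Os should show up here.
SELECT * FROM pg_aios;
\watch 1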

But maybe the whole new AIO machinery isn't involved in this at all, and it is another issue in PG18. Just to summarize my observations once again:

- Multi-threaded processing implemented in Python using pyodbc and concurrent.futures apparently gets stuck waiting for PostgreSQL to return. The processing step involved should return in ***less than 10 seconds*** for the small Italy extract, but can take >1 h (up to >6 h) when it randomly gets stuck (some runs complete without delay, others do not).

- pgAdmin showing all sessions associated with the threads as 'Active', with no wait events or blocking PIDs, during the whole time the processing appears stuck in PostgreSQL (a query to cross-check this directly is sketched after this list).

- No other sessions (like VACUUM) visible in pgAdmin during the time the processing appears stuck, except the main 'postgres' user session.

- All locks as shown in pg_locks are granted, and most if not all are fastpath, with only AccessShareLock and RowExclusiveLock on the table and its indexes involved, plus a couple of ExclusiveLocks on virtualxid and transactionid (see the second sketch after this list).

- 'top' in Ubuntu showing multiple 'postgres' backend processes continuously at high core usage, one for each thread (each Python thread of course uses its own connection).

- The pg_aios view is empty, but the processing consists of UPDATEs, so that is probably no surprise.

- The processing *DOES* eventually continue after this particular anomaly, with no further consequences and the expected results at the end of the total processing flow, so it is not a true deadlock.

- I have noticed it gets stuck when processing OpenStreetMap scrub or grassland from the Geofabrik Italy extract. However, as written above, some processing runs are fine on the same data, while others get stuck and delayed. Considering this, and the fact that the processing step getting stuck involves PostGIS functions, the issue may or may not involve PostGIS.

- In pgAdmin, the SQL statements generated by my geoprocessing workflow, as being processed by PostgreSQL while the processing is stuck, look like this:

UPDATE osm.landcover_scrubs_small_scale_2_ply AS t1
SET area_geo = t2.area_geo,
    perim_geo = t2.perim_geo,
    compact_geo = CASE WHEN t2.area_geo > 0
                       THEN ((power(t2.perim_geo,2) / t2.area_geo) / (4 * pi()))
                       ELSE 0 END,
    npoints_geo = t2.npoints_geo,
    comp_npoints_geo = CASE WHEN t2.npoints_geo > 0
                            THEN (CASE WHEN t2.area_geo > 0
                                       THEN ((power(t2.perim_geo,2) / t2.area_geo) / (4 * pi()))
                                       ELSE 0 END / t2.npoints_geo)
                            ELSE 0 END,
    convex_ratio_geo = CASE WHEN ST_Area(ST_ConvexHull(way)::geography,true) > 0
                            THEN (t2.area_geo / ST_Area(ST_ConvexHull(way)::geography,true))
                            ELSE 1 END
FROM (SELECT objectid,
             ST_Area(way::geography,true) AS area_geo,
             ST_Perimeter(way::geography,true) AS perim_geo,
             ST_NPoints(way) AS npoints_geo
      FROM osm.landcover_scrubs_small_scale_2_ply) AS t2
WHERE (t2.objectid = t1.objectid)
  AND t1.objectid IN (SELECT t3.objectid FROM mini_test.osm.osm_tmp_28128_ch5 AS t3)

- All of this worked fine in PG <= 17.
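For reference, the "Active with no wait events" observation from pgAdmin can be cross-checked directly against pg_stat_activity. Something along these lines should do (only a rough sketch; the filters may need adjusting):

-- Show state, wait events and runtime of all active client backends.
SELECT pid,
       state,
       wait_event_type,
       wait_event,
       now() - query_start AS running_for,
       left(query, 60)     AS query_text
FROM pg_stat_activity
WHERE backend_type = 'client backend'
  AND state = 'active';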
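Similarly, the pg_locks observation can be reproduced with a query roughly like the one below (again just a sketch), resolving the relation name via pg_class:

-- Summarize the locks held by the active backends, with relation names resolved.
SELECT l.pid,
       l.locktype,
       l.mode,
       l.granted,
       l.fastpath,
       c.relname
FROM pg_locks AS l
LEFT JOIN pg_class AS c ON c.oid = l.relation
WHERE l.pid IN (SELECT pid FROM pg_stat_activity WHERE state = 'active')
ORDER BY l.pid, l.locktype;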

Marco


Thom Brown wrote:

> So, to confirm, you get the issue with as little as 3 io_workers?
>
> Also, what is pg_aios telling you during this time?
>
> Thom
