Re: Potential "AIO / io workers" inter-worker locking issue in PG18?
From | Marco Boeringa |
---|---|
Subject | Re: Potential "AIO / io workers" inter-worker locking issue in PG18? |
Date | |
Msg-id | bc4c57ee-eb69-4de8-b3ac-81f3e1ae9030@boeringa.demon.nl |
In reply to | Re: Potential "AIO / io workers" inter-worker locking issue in PG18? (Andres Freund <andres@anarazel.de>) |
Responses | Re: Potential "AIO / io workers" inter-worker locking issue in PG18? |
List | pgsql-bugs |
Hi Andres,

I should have phrased it better. The high processor and core activity is not unexpected: the code is designed to saturate the processor and maximize throughput through careful design of the Python threading.

The issue is that all the jobs sent to PostgreSQL via ODBC for this specific processing step, with the small Italy extract, should return in less than 10 seconds (which they do in the lucky runs where I do not observe the issue), but some of them don't. For example, 30 jobs return within 10 seconds, then the remaining 14 unexpectedly get stuck for 2 hours before returning, all the while staying at the same high core usage they were initiated with.

So some of the PostgreSQL database sessions, which as I already explained show up in pgAdmin as 'active' with no wait events or blocking pids, simply take an excessive amount of time, but will ultimately return.

The CPU time, as witnessed with 'top' in Ubuntu, is really spent in PostgreSQL and the database sessions, not in Python, which runs on Windows and doesn't show high CPU usage in the Windows Task Manager.

This doesn't always happen; it is kind of random. One run with the Italy data will be OK, the next not.

Marco

On 6-10-2025 at 16:34, Andres Freund wrote:
> Hi,
>
> On 2025-10-05 10:55:01 +0200, Marco Boeringa wrote:
>> This has worked really well in previous versions of PostgreSQL (tested up to
>> PG17). However, in PG18, during the multi-threaded processing, I see some of
>> my submitted jobs that in this case were run against a small OpenStreetMap
>> Italy extract of Geofabrik, all of a sudden take > 1 hour to finish (up to 6
>> hours for this small extract), even though similar jobs from the same
>> processing step, finish in less than 10 seconds (and the other jobs should
>> as well). This seems to happen kind of "random". Many multi-threading tasks
>> before and after the affected processing steps, do finish normally.
>>
>> When this happens, I observe the following things:
>>
>> - High processor activity, even though the jobs that should finish in
>> seconds, take hours, all the while showing the high core usage.
>>
>> - PgAdmin shows all sessions created by the Python threads as 'active', with
>> *no* wait events attached.
> I think we need CPU profiles of these tasks. If something is continually
> taking a lot more CPU than expected, that seems like an issue worth
> investigating.
>
> Greetings,
>
> Andres Freund
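
The symptom described above (sessions that stay 'active' with no wait event far beyond the expected 10 seconds) could be picked out with a query against pg_stat_activity, so that the matching backend pids can then be profiled. What follows is only a minimal sketch, not code from this thread: the names STUCK_SESSIONS_SQL and find_stuck_sessions are hypothetical, the 10-second threshold is taken from the report, and the Python helper simply applies the same filter to rows that have already been fetched as dictionaries.

```python
from datetime import datetime, timedelta, timezone

# Hypothetical helper query: the columns used here exist in pg_stat_activity,
# the 10-second threshold is an assumption based on the report above.
STUCK_SESSIONS_SQL = """
SELECT pid, state, wait_event_type, wait_event,
       now() - query_start AS runtime
FROM pg_stat_activity
WHERE backend_type = 'client backend'
  AND state = 'active'
  AND wait_event IS NULL
  AND now() - query_start > interval '10 seconds'
ORDER BY runtime DESC;
"""


def find_stuck_sessions(rows, now, threshold=timedelta(seconds=10)):
    """Return pids of sessions that are active, not waiting, and over the threshold.

    `rows` is a list of dicts with the keys pid, state, wait_event and
    query_start (a timezone-aware datetime), e.g. as fetched from
    pg_stat_activity by whatever database driver is in use.
    """
    return [
        row["pid"]
        for row in rows
        if row["state"] == "active"
        and row["wait_event"] is None
        and now - row["query_start"] > threshold
    ]


if __name__ == "__main__":
    now = datetime.now(timezone.utc)
    sample = [
        {"pid": 101, "state": "active", "wait_event": None,
         "query_start": now - timedelta(hours=2)},
        {"pid": 102, "state": "active", "wait_event": None,
         "query_start": now - timedelta(seconds=3)},
    ]
    print(find_stuck_sessions(sample, now))
```

The pids returned this way are the backends one would attach a CPU profiler to on the Ubuntu server, for example with `perf top -p <pid>`.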