Re: Bug report - pg_upgrade tool seems to have a race condition when trying to delete a pg_wal file
From | Waka Ranai |
---|---|
Subject | Re: Bug report - pg_upgrade tool seems to have a race condition when trying to delete a pg_wal file |
Date | |
Msg-id | CAP8Vo=9ib4wxrYt3NdwwL8t8bPG4=LafoiZCSa+chZRzB=30TA@mail.gmail.com |
In response to | Re: Bug report - pg_upgrade tool seems to have a race condition when trying to delete a pg_wal file (Waka Ranai <wakadotranai@gmail.com>) |
Responses | Re: Bug report - pg_upgrade tool seems to have a race condition when trying to delete a pg_wal file |
List | pgsql-bugs |
Hi again, I eventually found out that Cortex XDR was also installed on the system, but even after uninstalling it I still face the same issue. I tried to monitor which processes might hold a handle on the file, but the only ones shown are from postgres (one from postgres.exe and one from pg_resetwal). I did the monitoring with the Resource Monitor bundled with Windows. Do you have any recommendations for another monitoring tool, perhaps one with automatic scanning?
How could I make sure that the issue is not due to an internal postgres process?
Did you consider not failing the upgrade if the file cannot be deleted? What would be the problems, if any, in that case?
Thanks in advance
On Fri, 31 May 2024 at 14:01, Waka Ranai <wakadotranai@gmail.com> wrote:
Hello again, I tested after disabling the Microsoft antivirus entirely and it worked the first time. I then completely uninstalled the new Postgres I'm upgrading to (Postgres 15; I made sure to delete the data folder) and reinstalled it to try the upgrade a second and a third time, but both attempts failed, always on the same step, with the same error message. I also tested on one of the other machines where the upgrade had never succeeded, after entirely disabling the antivirus, and still got the error.
I agree that some other process must be holding the file, so that readdir() still finds it, and then releasing it before unlink() runs, but I could not find which process (apart from the postgres ones) was using the WAL file. I was wondering whether it would be a suitable solution/workaround not to fail when trying to delete a file that is no longer there?
I will continue looking for the process that could be reading the newly modified/created file, but I'm a bit out of luck for now.
On Wed, 29 May 2024 at 09:51, Laurenz Albe <laurenz.albe@cybertec.at> wrote:
On Tue, 2024-05-28 at 16:14 +0200, Waka Ranai wrote:
> We tested on the aforementioned computer after adding an exception on the pg_wal
> folder for the Microsoft default antivirus with
> Add-MpPreference -ExclusionPath "C:\Program Files\PostgreSQL\15\data\pg_wal"
> but we still faced the same issue, I included the pg_upgrade logs
Thanks. I see
command: "C:/Program Files/PostgreSQL/15/bin/pg_resetwal" -f -u 536 "C:/Program Files/PostgreSQL/15/data" >> "C:/Program Files/PostgreSQL/15/data/pg_upgrade_output.d/202405>
Write-ahead log reset
command: "C:/Program Files/PostgreSQL/15/bin/pg_resetwal" -f -x 3466214 "C:/Program Files/PostgreSQL/15/data" >> "C:/Program Files/PostgreSQL/15/data/pg_upgrade_output.d/20>
pg_resetwal: error: could not delete file "pg_wal/000000010000000000000001": No such file or directory
So it is failing in KillExistingXLOG(): readdir() finds the file,
but by the time unlink() is executed, the file is already gone.
The file in question is the WAL segment written by WriteEmptyXLOG() in the
previous "pg_resetwal" execution.
But the previous "pg_resetwal" has exited by the time the next one is started,
so it should not be at fault.
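For illustration only, the readdir()/unlink() window described above, and the workaround suggested in this thread of not failing when the file is already gone, could be sketched roughly as follows. This is not the pg_resetwal source: the helper names unlink_if_exists and remove_matching_files are hypothetical, the code uses plain POSIX calls rather than PostgreSQL's own portability wrappers, and whether ignoring ENOENT would actually be safe in pg_resetwal is exactly the open question here.

```c
#include <assert.h>
#include <dirent.h>
#include <errno.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

/*
 * Hypothetical helper (not PostgreSQL code): remove a file, but treat
 * "file is already gone" (ENOENT) as success instead of an error.
 * Returns 0 on success, -1 on any other failure (errno is left set).
 */
static int
unlink_if_exists(const char *path)
{
    assert(path != NULL);

    if (unlink(path) == 0)
        return 0;
    if (errno == ENOENT)
        return 0;       /* vanished between readdir() and unlink() */
    return -1;
}

/*
 * Hypothetical helper: scan "dir" and remove every entry whose name starts
 * with "prefix". Returns the number of matching entries handled, or -1 if
 * the directory cannot be opened or a removal fails for a reason other
 * than ENOENT.
 */
static int
remove_matching_files(const char *dir, const char *prefix)
{
    DIR        *d;
    struct dirent *de;
    char        path[4096];
    size_t      plen = strlen(prefix);
    int         count = 0;

    d = opendir(dir);
    if (d == NULL)
        return -1;

    while ((de = readdir(d)) != NULL)
    {
        /* never try to unlink the directory links themselves */
        if (strcmp(de->d_name, ".") == 0 || strcmp(de->d_name, "..") == 0)
            continue;
        if (strncmp(de->d_name, prefix, plen) != 0)
            continue;

        snprintf(path, sizeof(path), "%s/%s", dir, de->d_name);

        /* readdir() saw the entry; it may no longer exist by now */
        if (unlink_if_exists(path) != 0)
        {
            closedir(d);
            return -1;
        }
        count++;
    }

    closedir(d);
    return count;
}
```

A caller would use something like remove_matching_files("pg_wal", "0000") and treat only a -1 result as fatal. The sketch tolerates the symptom but does not explain which process is touching the file in between.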
I found this similar thread:
https://postgr.es/m/20090910094211.166C5753FB7%40cvs.postgresql.org
The symptoms are the same.
I wonder if something like commit 4e2d5efc6a45b1f9f96df42629f6d1c7740e657e
would be useful here too. But it cannot be a PostgreSQL process that is
holding the file open - the creating process has already exited, and no
other PostgreSQL process would read the file.
So the fact remains that there is something *outside of PostgreSQL* that
opens newly created files. You say you disabled the virus scanner, but can
you think of any other software on your system that would do that?
Perhaps you can try disabling the virus scanner completely and check if
that gets rid of the problem.
Yours,
Laurenz Albe