Re: Better HINT message for "unexpected data beyond EOF"
| From | Andres Freund |
|---|---|
| Subject | Re: Better HINT message for "unexpected data beyond EOF" |
| Date | |
| Msg-id | gluttro6ro2lsn7mvs6i6ihdhi4futxpgljyhslcvguci2a5rd@xteikqd6ftos |
| In response to | Re: Better HINT message for "unexpected data beyond EOF" (Jakub Wartak <jakub.wartak@enterprisedb.com>) |
| Responses | Re: Better HINT message for "unexpected data beyond EOF" |
| List | pgsql-hackers |
Hi,

On 2025-03-27 10:25:50 +0100, Jakub Wartak wrote:
> On Wed, Mar 26, 2025 at 4:01 PM Robert Haas <robertmhaas@gmail.com> wrote:
> [..]
> > > so how about:
> > > -HINT: This has been seen to occur with buggy kernels; consider
> > > updating your system.
> > > +HINT: This has been observed with files being overwritten, buggy
> > > kernels and potentially other external file system influence.
> >
> > I agree that we should emphasize the possibility of files being
> > overwritten.
>
> > I'm not sure we should even mention buggy kernels -- is
> > there any evidence that's still a thing on still-running hardware?
>
> No, I do not have any, other than comments in source code from Tom.

FWIW, I'm not sure how much that was ever true. We certainly had our own bugs
that could lead to the error occurring.

> E.g. I've tracked down that e.g. Pavan fixed something in 2ndQ
> fast_redo/pg_xlog_prefetch extension in 2016, where some concurrency
> bug in that extension was causing similiar problem back then on at
> least one occasion: ```...issue was caused because the prefetch worker
> process reading back blocks that are being concurrently dropped by the
> startup process (as a result of truncate operation). When the startup
> process later tries to extend the relation, it finds an existing valid
> block in the shared buffers and panics. ``` (sounds like it is related
> with data beyond EOF).

FWIW that's more generally broken than just this error. You can't just read in
data without holding a lock on a relation, that will cause breakage in all
kinds of ways.

> Proposals:
> 1. HINT: This has been observed with files being overwritten.
> 2. HINT: This has been observed with files being overwritten, old
> (2.6.x) buggy Linux kernels .
> 3. HINT: This has been observed with files being overwritten, old
> (2.6.x) buggy Linux kernels, corruption or other non-core PostgreSQL
> bugs.
> 4. HINT: This has been observed with files being overwritten, buggy
> kernels and potentially other external file system influence.

FWIW, I think we should just drop the HINT. We really have no clue what caused
it and a HINT should imo have at least some value other than "*Shrug*", which
is imo pretty much what these HINTs amount to, if they were a bit more blunt.

Greetings,

Andres Freund