Re: backend stuck in DataFileExtend

From: Thomas Munro
Subject: Re: backend stuck in DataFileExtend
Msg-id: CA+hUKG+QczhLLktfiXd9a-OMDRLnqAaz8g6JJGRBnLszrh5Fog@mail.gmail.com
In reply to: Re: backend stuck in DataFileExtend  (Justin Pryzby <pryzby@telsasoft.com>)
Responses: Re: backend stuck in DataFileExtend
List: pgsql-hackers
On Tue, May 7, 2024 at 6:21 AM Justin Pryzby <pryzby@telsasoft.com> wrote:
> FWIW: both are running zfs-2.2.3 RPMs from zfsonlinux.org.
...
> Yes, they're running centos7 with the indicated kernels.

So far we've got:

* spurious EIO when opening a file (your previous report)
* hanging with CPU spinning (?) inside pwritev()
* old kernel, bleeding edge ZFS

From an (uninformed) peek at the ZFS code, if it really is spinning
there, it seems like a pretty low-level problem: it has finished the
write, and now is just trying to release (something like our unpin) and
unlock the buffers, which involves various code paths that might touch
various mutexes and spinlocks, and to get stuck like that I guess it has
either corrupted itself or is deadlocking against something else,
but what?  Do you see any other processes (including kernel threads)
with any stuck stacks that might be a deadlock partner?
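
For what it's worth, one way to hunt for a deadlock partner is to dump
the kernel stack of every thread that is running (R) or in
uninterruptible sleep (D).  The following is only a rough sketch, not
something from the original report: it assumes Linux, root access, and a
kernel that exposes /proc/<pid>/task/<tid>/stack, and the helper names
(parse_stat, suspicious_tasks) are made up for illustration.

```python
#!/usr/bin/env python3
"""List threads in R or D state with their kernel stacks (Linux, run as root)."""
import os


def parse_stat(text):
    """Return (pid, comm, state) from the contents of a /proc stat file."""
    # comm is wrapped in parentheses and may itself contain spaces or ')',
    # so split on the first '(' and the last ')'.
    lparen = text.index("(")
    rparen = text.rindex(")")
    pid = int(text[:lparen])
    comm = text[lparen + 1:rparen]
    state = text[rparen + 1:].split()[0]
    return pid, comm, state


def read_file(path):
    """Read a small /proc file, returning None if it cannot be read."""
    try:
        with open(path) as f:
            return f.read()
    except OSError:
        return None  # thread exited, or insufficient permission


def suspicious_tasks(states=("D", "R"), proc="/proc"):
    """Yield (tid, comm, state, stack) for every thread in one of `states`."""
    for pid in filter(str.isdigit, os.listdir(proc)):
        task_dir = os.path.join(proc, pid, "task")
        try:
            tids = os.listdir(task_dir)
        except OSError:
            continue  # process went away while we were scanning
        for tid in tids:
            stat = read_file(os.path.join(task_dir, tid, "stat"))
            if not stat:
                continue
            _, comm, state = parse_stat(stat)
            if state in states:
                stack = read_file(os.path.join(task_dir, tid, "stack"))
                yield int(tid), comm, state, stack or "(stack unavailable)"


if __name__ == "__main__":
    for tid, comm, state, stack in suspicious_tasks():
        print(f"--- {tid} {comm} [{state}]\n{stack}")
```

Running it a few times in a row and comparing the output should show
which stacks never change; the R-state list will also include whatever
happens to be on CPU at that instant, including the script itself.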

While looking around for reported issues I found your abandoned report
against an older ZFS version from a few years ago, same old Linux
version:

https://github.com/openzfs/zfs/issues/11641

I don't know enough to say anything useful about that but it certainly
smells similar...

I see you've been busy reporting lots of issues, which seem to
involve big data, big "recordsize" (= ZFS block sizes), compression
and PostgreSQL:

https://github.com/openzfs/zfs/issues?q=is%3Aissue+author%3Ajustinpryzby


