Re: "soft lockup" in kernel

From	Dennis Jenkins
Subject	Re: "soft lockup" in kernel
Date	15 July 2013, 22:40:36
Msg-id	CAAEzAp9uqkzA1ztKsfDa1CND5fGTwLqU+hdeAVAYWfEg+ahSXg@mail.gmail.com
In reply to	Re: "soft lockup" in kernel (Dennis Jenkins <dennis.jenkins.75@gmail.com>)
List	pgsql-general

Stuart,

I'm simply curious - did you resolve your issue? What NAS (vendor/model/config) are you using?

On Fri, Jul 5, 2013 at 11:31 AM, Dennis Jenkins <dennis.jenkins.75@gmail.com> wrote:

On Fri, Jul 5, 2013 at 8:58 AM, Stuart Ford <stuart.ford@glide.uk.com> wrote:
On Fri, Jul 5, 2013 at 7:00 AM, Dennis Jenkins wrote:

No. iSCSI traffic between the VMWare hosts and the SAN uses completely
separate NICs and different switches to the "production" LAN.

I've had a look at the task activity in VCenter and found these two events
at almost the same time as the kernel messages. In both cases the start
time (the first time below) is 5-6 seconds after the kernel message, and
I've seen that the clocks on the Postgres VM and the VCenter server, at
least, are in sync (it may not, of course, be the VCenter server's clock
that these logs get the time from).

Remove snapshot | GLIBMPDB001_replica | Completed | GLIDE\Svc_vcenter | 05/07/2013 11:58:41 | 05/07/2013 11:58:41 | 05/07/2013 11:59:03

Remove snapshot | GLIBMPDB001_replica | Completed | GLIDE\Svc_vcenter | 05/07/2013 10:11:10 | 05/07/2013 10:11:10 | 05/07/2013 10:11:23
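
To put numbers on the two events above, here is a minimal sketch that computes how long each "Remove snapshot" task ran. It assumes the first timestamp of each event is the start time (as stated above) and the last one is the completion time (the log excerpt does not label it), and that dates are day/month/year. The names `TASKS` and `duration_seconds` are made up for illustration.

```python
from datetime import datetime

# Assumed format: day/month/year, 24-hour clock.
FMT = "%d/%m/%Y %H:%M:%S"

# (first timestamp, last timestamp) of each event quoted above;
# the last one is assumed to be the completion time.
TASKS = [
    ("05/07/2013 11:58:41", "05/07/2013 11:59:03"),
    ("05/07/2013 10:11:10", "05/07/2013 10:11:23"),
]


def duration_seconds(start, end):
    """Return the elapsed seconds between two timestamp strings."""
    delta = datetime.strptime(end, FMT) - datetime.strptime(start, FMT)
    return delta.total_seconds()


durations = [duration_seconds(start, end) for start, end in TASKS]

for (start, end), seconds in zip(TASKS, durations):
    print(f"{start} -> {end}: {seconds:.0f} s")
```

Under those assumptions the two removals took 22 and 13 seconds respectively.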

I would not blame Veeam.

I suspect that when a snapshot is deleted, all iSCSI activity either halts or slows SIGNIFICANTLY. This depends on your NAS.

I've seen an Oracle 7320 ZFS Storage Appliance, misconfigured to use RAID-Z2 (RAID 6) to store terabytes of essentially random-access data, pause for minutes when deleting a snapshot containing a few dozen gigabytes (the snapshot-deletion kernel threads get I/O priority over "nfsd" file I/O). This added enough latency for VMWare (over NFS) that VMWare gave up on the I/O and returned a generic SCSI error to the guests. Linux guests will semi-panic and remount their file systems read-only. FreeBSD will just freak out, panic and reboot. The flaw here was using the wrong RAID type (it has since been replaced with triple-parity raid-10 and is working great).

What NAS are you using?

How busy are its disks when deleting a snapshot?

What is the RAID type under the hood?
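
If the NAS itself offers no easy way to answer the "how busy" question, one rough proxy is to watch disk utilisation from a Linux machine while a snapshot is being deleted. The sketch below is only an illustration and assumes a Linux host that exposes `/proc/diskstats`; it will not run on a Solaris-based appliance, and from inside a guest it shows the virtual disk, not the physical spindles. The function names are hypothetical.

```python
import time


def parse_io_ms(diskstats_text):
    """Map device name -> cumulative milliseconds spent doing I/O.

    In /proc/diskstats the first three columns are major, minor and
    device name; the 10th statistic after them is time spent doing I/O.
    """
    result = {}
    for line in diskstats_text.splitlines():
        fields = line.split()
        if len(fields) >= 13:
            result[fields[2]] = int(fields[12])
    return result


def busy_percent(before, after, interval_s):
    """Percentage of the interval each device spent with I/O in flight."""
    return {
        dev: 100.0 * (after[dev] - before[dev]) / (interval_s * 1000.0)
        for dev in after
        if dev in before
    }


def sample(interval_s=1.0, path="/proc/diskstats"):
    """Take two readings interval_s apart and return per-device busy %."""
    with open(path) as f:
        before = parse_io_ms(f.read())
    time.sleep(interval_s)
    with open(path) as f:
        after = parse_io_ms(f.read())
    return busy_percent(before, after, interval_s)
```

Calling `sample()` in a loop while a snapshot is removed, and comparing against a quiet period, would show whether the device sits near 100% busy during the deletion.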
