Re: intermittent issue with windows 7 service manager not able to correctly determine or control postgresql 9.4

From: George Neuner
Subject: Re: intermittent issue with windows 7 service manager not able to correctly determine or control postgresql 9.4
Msg-id: b8abib17d6h45jdis6gchian866j9i1lsh@4ax.com
In reply to: intermittent issue with windows 7 service manager not able to correctly determine or control postgresql 9.4  (Tom Hodder <tom@limepepper.co.uk>)
List: pgsql-general


Disclaimer: My comments here are generic to Windows services.
I don't run Postgresql on Windows and I have no idea how it is
implemented.

On Sun, 1 May 2016 03:35:44 +0100, Tom Hodder <tom@limepepper.co.uk>
wrote:

>I've got several machines running windows 7 which have postgresql 9.4
>installed as a service, and configured to start automatically on boot. I am
>monitoring these services with zabbix and several times a week I get a
>notification that the postgresql-x64-9.4 service has stopped.
>
>When I login to the machine, the service does appear to be stopped;
>?
>However when I check the database, I can query it ok;

Windows services have a time limit to respond to commands or status
inquiries.  The service manager periodically queries the status of all
running services - if they don't respond quickly enough, the manager
thinks they are hosed.  That may or may not be true.

But IME unresponsive services rarely appear "stopped" - usually they
show as "started" in the service manager, or, if you run SC from the
command line their state is shown as "running".


>If I try to start the service from the service manager, I see the following
>error in the logs;
>
>*2016-04-30 05:03:13 BST FATAL:  lock file "postmaster.pid" already exists
>2016-04-30 05:03:13 BST HINT:  Is another postmaster (PID 2556)
>running in data directory "C:/Program Files/PostgreSQL/9.4/data"?*
>
>The pg_ctl tool seems to correctly query the state of the service and
>return the correct PID;
>
>*C:\Program Files\PostgreSQL\9.4>bin\pg_ctl.exe -D "C:\Program
>Files\PostgreSQL\9.4\data" status
>pg_ctl: server is running (PID: 2556)*

Which suggests the service either is not responding to the manager's
status inquiries, or is responding too late.
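
To confirm that kind of mismatch, the two views can be compared side by
side.  The following is only a rough Python sketch: it assumes the usual
"STATE : <number> <NAME>" line in "sc query" output and the pg_ctl
message quoted above, and the function names are made up for
illustration.

```python
import re
import subprocess

def parse_sc_state(sc_output):
    """Return the state name from `sc query <service>` output, or None."""
    # The relevant line looks like: "        STATE              : 4  RUNNING"
    match = re.search(r"^\s*STATE\s*:\s*\d+\s+(\w+)", sc_output, re.MULTILINE)
    return match.group(1) if match else None

def pg_ctl_says_running(pg_ctl_output):
    """True if `pg_ctl status` output reports a running server."""
    return "server is running" in pg_ctl_output

def compare_views(sc_output, pg_ctl_output):
    """Describe whether the service manager and pg_ctl agree."""
    state = parse_sc_state(sc_output)
    running = pg_ctl_says_running(pg_ctl_output)
    if state == "STOPPED" and running:
        return "mismatch: manager says STOPPED, but postmaster is running"
    if state == "RUNNING" and not running:
        return "mismatch: manager says RUNNING, but no postmaster found"
    return "consistent: manager=%s, postmaster running=%s" % (state, running)

def capture(args):
    """Run a command and return its stdout as text."""
    return subprocess.run(args, capture_output=True, text=True).stdout

def check_live():
    """Query the real service; only meaningful on the Windows machine itself."""
    service = "postgresql-x64-9.4"
    base = r"C:\Program Files\PostgreSQL\9.4"
    sc_out = capture(["sc", "query", service])
    pg_out = capture([base + r"\bin\pg_ctl.exe", "-D", base + r"\data", "status"])
    return compare_views(sc_out, pg_out)
```

Calling check_live() from the monitoring job when the alert fires would
show whether the manager and the postmaster disagree at that moment.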


>The other thing that seems to happen is the pgadmin3 tool seems to
>have lost the ability to control the service as all the options for
>start/stop are greyed out;
>[image: Inline images 2]

This is likely because the service manager believes the service is
unresponsive.  The programming API communicates with the manager.


>The only option to get the control back is to kill the processes in
>the task manager or reboot the machine.

You could try "sc stop <service>" from the command line.
The SC tool is separate from the shell "net" command and it sometimes
will work when "net stop <service>" does not.

You also could try using recovery options in the service manager to
automatically restart the service.  But if the service is showing as
"stopped" when it really is running, this is unlikely to work.
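
As a rough illustration of both suggestions, here is a small Python
sketch.  The helper names are hypothetical, the reset period and restart
delay are arbitrary example values, and an exit code of 0 from
"sc stop" only means the stop request was accepted, not that the
service has actually stopped.

```python
import subprocess

def stop_service(name, run=subprocess.run):
    """Try `sc stop`, then `net stop`; return the tool that succeeded, or None."""
    for command in (["sc", "stop", name], ["net", "stop", name]):
        result = run(command, capture_output=True, text=True)
        if result.returncode == 0:
            return command[0]
    return None

def recovery_command(name, reset_s=86400, restart_delay_ms=60000):
    """Build an `sc failure` command that restarts the service after a failure.

    The numbers are example values, not recommendations.
    """
    return ["sc", "failure", name,
            "reset=", str(reset_s),
            "actions=", "restart/%d" % restart_delay_ms]
```

On the affected machine this would be called as
stop_service("postgresql-x64-9.4") from an elevated prompt.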


>Any suggestions on what might be causing this?

Services are tricky to get right: there are a number of rules the
control interface has to obey that are at odds with doing real work.

A single threaded service must periodically send "busy" status to the
manager during lengthy processing.  Failure to do that in a timely
manner will cause problems.
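
A toy model of that pattern in Python, not tied to any real service
framework: report_busy is a hypothetical callback standing in for the
call that tells the manager "still working" (in the Win32 API that
would be SetServiceStatus with a pending state, an incremented
dwCheckPoint and a dwWaitHint).

```python
def process_in_chunks(chunks, handle, report_busy, wait_hint_ms=3000):
    """Process chunks one at a time, reporting progress before each one.

    report_busy(checkpoint, wait_hint_ms) is a stand-in for the real
    status call; wait_hint_ms is an arbitrary example value.
    """
    checkpoint = 0
    for chunk in chunks:
        checkpoint += 1
        # Promise the manager another report within wait_hint_ms.
        report_busy(checkpoint, wait_hint_ms)
        # Each chunk has to finish well inside the wait hint.
        handle(chunk)
    return checkpoint
```

The point is that the work has to be cut into pieces small enough that
the next report always goes out before the previous hint expires.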

A multi-threaded service that separates processing from control must
be able to suspend or halt the processing when directed and send
"busy" status if it can't.
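
The multi-threaded case might be sketched like this, again as a toy
model rather than a real service: the Worker class and the report
callback are invented for illustration, and the state names simply
mirror the ones the service manager uses.

```python
import threading

class Worker:
    """Processing thread that a separate control path can ask to stop."""

    def __init__(self, unit_of_work):
        self._unit_of_work = unit_of_work
        self._stop_requested = threading.Event()
        self._thread = threading.Thread(target=self._run, daemon=True)

    def start(self):
        self._thread.start()

    def _run(self):
        # Check the stop flag between units of work so a stop is honored promptly.
        while not self._stop_requested.is_set():
            self._unit_of_work()

    def stop(self, report, poll_s=0.1):
        """Request a stop; report STOP_PENDING until the thread exits, then STOPPED."""
        self._stop_requested.set()
        checkpoint = 0
        while self._thread.is_alive():
            checkpoint += 1
            report("STOP_PENDING", checkpoint)  # the "busy" status
            self._thread.join(timeout=poll_s)
        report("STOPPED", 0)
```

The control path never blocks silently: while the worker is finishing
its current unit, it keeps sending the pending status instead of going
quiet.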

There is a way to launch arbitrary programs as services so they can
run at startup and in the background, but programs that weren't
written explicitly to BE services don't obey the service manager and
their displayed status usually is bogus (provided by the launcher).

George
