Thread: RHEL 7 (systemd) reboot

RHEL 7 (systemd) reboot

From
Bryce Pepper
Date:

I am running three instances (under different users) on a RHEL 7 server to support a vendor product.

 

In the defined services, the start & stop scripts work fine when invoked with systemctl {start|stop} whatever.service  but we have automated monthly patching which does a reboot.

 

Looking in /var/log/messages, I see that the stop scripts do not get invoked on reboot, so I created a new shutdown service as described here <https://unix.stackexchange.com/questions/211924/effect-of-reboot-signal-on-systemd-service-state>.

It appears that PostgreSQL is receiving a signal from somewhere prior to my script running…

 

Oct 05 14:18:56 kccontrolmt01 NetworkManager[787]: <info>  [1538767136.0967] manager: NetworkManager state is now DISCONNECTED

Oct 05 14:18:56 kccontrolmt01 dbus[740]: [system] Activating via systemd: service name='org.freedesktop.nm_dispatcher' unit='dbus-org.freedesktop.nm-dispa

Oct 05 14:18:56 kccontrolmt01 dbus[740]: [system] Activation via systemd failed for unit 'dbus-org.freedesktop.nm-dispatcher.service': Refusing activation

Oct 05 14:18:56 kccontrolmt01 network[29310]: Shutting down interface eth0:  Device 'eth0' successfully disconnected.

Oct 05 14:18:56 kccontrolmt01 network[29310]: [  OK  ]

Oct 05 14:18:56 kccontrolmt01 stop_ctmlinux_server.sh[29185]: ------------------------

Oct 05 14:18:56 kccontrolmt01 stop_ctmlinux_server.sh[29185]: Shutting down CONTROL-M.

Oct 05 14:18:56 kccontrolmt01 stop_ctmlinux_server.sh[29185]: ------------------------

Oct 05 14:18:56 kccontrolmt01 stop_ctmlinux_server.sh[29185]: Waiting ...

Oct 05 14:18:56 kccontrolmt01 stop_ctmlinux_server.sh[29185]: psql action failed. cannot perform sql command in /data00/ctmlinux/ctm_server/tmp/upd_CMS_SY

Oct 05 14:18:56 kccontrolmt01 stop_ctmlinux_server.sh[29185]: db_execute_sql failed while processing /data00/ctmlinux/ctm_server/tmp/upd_CMS_SYSPRM_29448.

Oct 05 14:18:56 kccontrolmt01 stop_ctmlinux_server.sh[29185]: Failed to update CMS_SYSPRM table.

Oct 05 14:18:56 kccontrolmt01 stop_ctmlinux_server.sh[29185]: Be aware that the Configuration Agent might start the CONTROL-M/Server

 

The database must be available for the product to shut down in a consistent state.

 

I am open to suggestions.

 

Thanks,

Bryce

 

Bryce Pepper

Sr. Unix Applications Systems Engineer

The Kansas City Southern Railway Company

114 West 11th Street  |  Kansas City,  MO 64105

Office:  816.983.1512 

Email:  bpepper@kcsouthern.com  

 

Re: RHEL 7 (systemd) reboot

From
Adrian Klaver
Date:
On 10/9/18 11:06 AM, Bryce Pepper wrote:
> I am running three instances (under different users) on a RHEL 7 server 
> to support a vendor product.
> 
> In the defined services, the start & stop scripts work fine when invoked 
> with systemctl {start|stop} whatever.service  but we have automated 
> monthly patching which does a reboot.
> 
> Looking in /var/log/messages and the stop scripts do not get invoked on 
> reboot, therefore I created a new shutdown service as described here 
> <https://unix.stackexchange.com/questions/211924/effect-of-reboot-signal-on-systemd-service-state>.
> 
> It appears that PostGreSQL is receiving a signal from somewhere prior to 
> my script running…
> 

> 
> The database must be available for the product to shut down in a 
> consistent state.
> 
> I am open to suggestions.

What is the following doing, and where is it coming from?

db_execute_sql failed while processing 
/data00/ctmlinux/ctm_server/tmp/upd_CMS_SYSPRM_29448.

> 
> Thanks,
> 
> Bryce
> 
> *Bryce Pepper*
> 
> Sr. Unix Applications Systems Engineer
> 
> *The Kansas City Southern Railway Company *
> 
> 114 West 11^th Street  |  Kansas City,  MO 64105
> 
> Office:  816.983.1512
> 
> Email: bpepper@kcsouthern.com <mailto:bpepper@kcsouthern.com>
> 


-- 
Adrian Klaver
adrian.klaver@aklaver.com


RE: RHEL 7 (systemd) reboot

From
Bryce Pepper
Date:
Adrian,
Thanks for the inquiry. The function (db_execute_sql) is coming from a vendor (BMC) product called Control-M. It is a scheduling product.
The tmp file is deleted before I can see its contents, but I believe it is trying to update some columns in the CMS_SYSPRM table.
I also think the postgresql instance is already stopped, which is why the db_execute fails. I will try to modify the vendor function to save off the contents of the query.

Bryce

p.s. Do you know of any verbose logging that could be turned on to catch when pgsql is being terminated?



RE: RHEL 7 (systemd) reboot

From
Bryce Pepper
Date:
Here are the contents of the query and the error:
[root@kccontrolmt01 tmp]# cat ctm.Xf9pQkg2
update CMS_SYSPRM set CURRENT_STATE='STOPPING',DESIRED_STATE='Down' where DESIRED_STATE <> 'Ignored'
;
psql: could not connect to server: Connection refused
        Is the server running on host "kccontrolmt01" (10.1.32.53) and accepting
        TCP/IP connections on port 5433?


Re: RHEL 7 (systemd) reboot

From
Adrian Klaver
Date:
On 10/10/18 5:32 AM, Bryce Pepper wrote:
> Adrian,
> Thanks for the inquiry.  The function (db_execute_sql) is coming from a vendor (BMC) product called Control-M. It is a scheduling product.
> The tmp file is deleted before I can see its contents but I believe it is trying to update some columns in the CMS_SYSPRM table.
> I also think the postgresql instance is already stopped and hence why the db_execute fails.  I will try to modify the vendor function to save off the contents of the query.

Alright, I'm confused. In your earlier post you said the stop script is 
not running. Yet here it is, just not at the right time. I think a more 
detailed explanation is needed:

1) The stop script you are concerned about is a systemd  script, one 
that you created or system provided?

2) What is the shutdown service you refer to?

3) Is there a separate shutdown script for the Control-M product?

4) What do you expect to happen vs what is happening?

> 
> Bryce
> 
> p.s. Do you know of any verbose logging that could be turned on to catch when pgsql is being terminated?
> 
> 


-- 
Adrian Klaver
adrian.klaver@aklaver.com


RE: RHEL 7 (systemd) reboot

From
Bryce Pepper
Date:
Sorry, I wasn't clear in the prior posts.   

The stop script is running during reboot. The problem is the database is not reachable when the stop script runs. The ctmdist server shut down is as follows:
 
   Stop control-m application
   Stop control-m configuration agent
   Stop database

As you can see the intent is for the database to be shut down after the product. 

But as you noticed from /var/log/messages, the stop_ctmlinux_server.sh script is running but unable to execute the update query.

I created the following service definition and the scripts that follow -- note there are 2 datacenters (ctmdist, ctmlinux) that have comparable scripts, so I have only included one set:
 

[root@kccontrolmt01 ~]# cat ControlM_Shutdown.service
[Unit]
Description=Run mycommand at shutdown
Requires=network.target CTM_Postgre.service
DefaultDependencies=no
Before=shutdown.target reboot.target

[Service]
Type=oneshot
RemainAfterExit=true
ExecStart=/bin/true
ExecStop=/root/scripts/control-m_shutdown.sh

[Install]
WantedBy=multi-user.target


[root@kccontrolmt01 ~]# cat /root/scripts/control-m_shutdown.sh
#!/bin/sh
  # Shutdown any running Control-M services
    STATUS=$(/usr/bin/systemctl is-active CTMLinux_Server.service)
    if [ ${STATUS} == "active" ]; then
      /usr/bin/systemctl stop CTMLinux_Server.service 
    fi

    STATUS=$(/usr/bin/systemctl is-active CTMDist_Server.service)
    if [ ${STATUS} == "active" ]; then
      /usr/bin/systemctl stop CTMDist_Server.service 
    fi

    STATUS=$(/usr/bin/systemctl is-active EnterpriseManager.service)
    if [ ${STATUS} == "active" ]; then
      /usr/bin/systemctl stop EnterpriseManager.service
    fi
exit 0


#!/bin/bash

# stop CONTROL-M
if [ -f /data00/ctmlinux/ctm_server/scripts/shut_ctm ]; then
  echo "Stopping CONTROL-M application"
  /data00/ctmlinux/ctm_server/scripts/shut_ctm
fi

# stop CONTROL-M Configuration Agent
if [ -f /data00/ctmlinux/ctm_server/scripts/shut_ca ]; then
  echo "Stopping CONTROL-M Server Configuration Agent"
  /data00/ctmlinux/ctm_server/scripts/shut_ca
fi

# stop database
/data00/ctmlinux/ctm_server/scripts/dbversion
if [ $? -ne 0 ] ; then
  echo "SQL Server is already stopped "
else
  if [ -f /data00/ctmlinux/ctm_server/scripts/shutdb ]; then
    echo "Stopping SQL server for CONTROL-M"
    /data00/ctmlinux/ctm_server/scripts/shutdb
  fi
fi

exit 0
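
The three near-identical stanzas in control-m_shutdown.sh could be collapsed into a loop. The following is only a sketch, not the script that was run in this thread. The unit names are the ones shown above; the SYSTEMCTL variable and the stop_if_active and stop_all functions are hypothetical names added for illustration. It relies on `systemctl is-active --quiet`, which reports the state through its exit status, and so avoids the `==` comparison, a bash extension that a strict /bin/sh may reject.

```shell
#!/bin/sh
# Sketch only: stop each Control-M unit if systemd reports it as active.
# SYSTEMCTL is a hypothetical override (for example, for a dry run).
SYSTEMCTL=${SYSTEMCTL:-/usr/bin/systemctl}

stop_if_active() {
    # "is-active --quiet" exits 0 only when the unit is active
    if "$SYSTEMCTL" is-active --quiet "$1"; then
        echo "Stopping $1"
        "$SYSTEMCTL" stop "$1"
    else
        echo "Skipping $1 (not active)"
    fi
}

stop_all() {
    # Same units and same order as in control-m_shutdown.sh above
    for unit in CTMLinux_Server.service CTMDist_Server.service EnterpriseManager.service; do
        stop_if_active "$unit"
    done
}
```

Calling stop_all at the end of the script would reproduce the original behaviour. This does not change when systemd runs the script, so it would not by itself address the ordering problem discussed in this thread.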


Re: RHEL 7 (systemd) reboot

From
Adrian Klaver
Date:
On 10/10/18 7:37 AM, Bryce Pepper wrote:
> Sorry, I wasn't clear in the prior posts.
> 
> The stop script is running during reboot. The problem is the database is not reachable when the stop script runs. The ctmdist server shut down is as follows:
>     Stop control-m application
>     Stop control-m configuration agent
>     Stop database

Several things:

1) In your OP there was this:

Oct 05 14:18:56 kccontrolmt01 network[29310]: Shutting down interface eth0:  Device 'eth0' successfully disconnected.

Oct 05 14:18:56 kccontrolmt01 network[29310]: [  OK  ]

Oct 05 14:18:56 kccontrolmt01 stop_ctmlinux_server.sh[29185]: ------------------------

Oct 05 14:18:56 kccontrolmt01 stop_ctmlinux_server.sh[29185]: Shutting down CONTROL-M.

So is your Postgres instance running on the same machine as the CTM 
instance or does the eth0 need to be up to reach the database?

2) In the above there is:
"Shutting down CONTROL-M."

Yet in script below there is:
"Stopping CONTROL-M application"

Is this because there are sub-scripts involved or the "Stopping ..." is 
embedded in the script?

3) I am by no means a shell script expert and I will admit to not fully 
understanding what control-m_shutdown.sh does. Still here it goes:

a) Are there actually two shebangs in one file or are there two files 
involved?

b) What is:

# stop database
/data00/ctmlinux/ctm_server/scripts/dbversion
if [ $? -ne 0 ] ; then
   echo "SQL Server is already stopped "
else
   if [ -f /data00/ctmlinux/ctm_server/scripts/shutdb ]; then
     echo "Stopping SQL server for CONTROL-M"
     /data00/ctmlinux/ctm_server/scripts/shutdb
   fi

actually doing?

I ask because from what I can see there are a set of parallel processes 
initiated and it is possible that the database server is winning. It 
comes down to what 'if [ $? -ne 0 ]' is testing.



> 



-- 
Adrian Klaver
adrian.klaver@aklaver.com


RE: RHEL 7 (systemd) reboot

From
Bryce Pepper
Date:
Adrian,

Thanks for being willing to dig into this.  

You are correct: there are other scripts being called from mine (delivered by BMC with their software). In order to stay in support and work with their updates, I use the vendor-supplied scripts/programs.

The Control-M product is installed on this single server and is broken down into the following parts:
Enterprise server with dedicated postgresql instance
Distributed datacenter with agent and dedicated postgresql instance
Linux datacenter with agent and dedicated postgresql instance

To cut down on the noise, my post only focused on the "Distributed" side and shutdown process -- although the ControlM_Shutdown.service unit stop script manages all of the above components.

In the ControlM_Shutdown.service there is a Requires statement identifying that the network must be available while this systemd unit runs.

You noticed that eth0 disconnected in /var/log/messages. I showed that to highlight that the unit was not executing in the order I had intended; again, refer to the Requires statement.

The second shebang is from one of the invoked subscripts (stop_ctmdist_server.sh) and is the "main" shutdown sequence for the Distributed datacenter (I think the "SQL server" echo from BMC is because it can be configured with other databases and they use it as a generic term, not meaning SQL Server from Microsoft).

The dbversion check is being used to verify that the pgsql instance for this datacenter is running; it returns a non-zero return code if the instance is unreachable (I could use pg_isready or pg_ctl but would diverge further from the BMC-supported technique).
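
For comparison, a pg_isready-based probe might look like the sketch below. It is not the BMC-supported dbversion check. The function names db_reachable and check_db and the PG_ISREADY override are hypothetical; the host and port in the usage line are taken from the psql error earlier in the thread. pg_isready exits 0 when the server is accepting connections and non-zero otherwise.

```shell
#!/bin/sh
# Sketch only: report whether a PostgreSQL instance accepts connections.
# PG_ISREADY is a hypothetical override for the path to pg_isready.
PG_ISREADY=${PG_ISREADY:-pg_isready}

db_reachable() {
    host=$1
    port=$2
    # -t 5: give up after five seconds
    "$PG_ISREADY" -h "$host" -p "$port" -t 5 >/dev/null 2>&1
}

check_db() {
    if db_reachable "$1" "$2"; then
        echo "database on $1:$2 is accepting connections"
    else
        echo "database on $1:$2 is NOT reachable"
        return 1
    fi
}
```

A call such as `check_db kccontrolmt01 5433` at the top of the stop script would at least record in the journal whether the instance was still up when the script started.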

You probably also noticed in the earlier posted shutdown service a Requires of CTM_Postgre.service. This was one of my attempts to ensure the instance was available by actually starting the instance outside of the BMC routines (if it is already running the BMC routines will not start it -- the dbversion check is on the start side also). I thought if I managed the postgresql instance outside of the product I could ensure it was running. Unfortunately that didn't work, as the instance shut down on its own; presumably a resource (perhaps network) was terminated and postgresql shut down.

So to restate the original post...   It appears the postgresql instance is unavailable when the stop script runs.  

Thanks,
Bryce

[root@kccontrolmt01 ~]# systemctl --full cat ControlM_Shutdown.service
# /etc/systemd/system/ControlM_Shutdown.service
[Unit]
Description=Run ControlM shutdown process
Requires=graphical.target multi-user.target network.target network.service sockets.target
DefaultDependencies=no
Before=shutdown.target reboot.target halt.target poweroff.target kexec.target

[Service]
Type=oneshot
RemainAfterExit=true
ExecStart=/bin/true
ExecStop=/bin/bash /root/scripts/control-m_shutdown.sh
TimeoutStopSec=4min

[Install]
WantedBy=multi-user.target
[root@kccontrolmt01 ~]#


Re: RHEL 7 (systemd) reboot

From
Adrian Klaver
Date:
On 10/11/18 6:33 AM, Bryce Pepper wrote:
> [root@kccontrolmt01 ~]# systemctl --full cat ControlM_Shutdown.service
> # /etc/systemd/system/ControlM_Shutdown.service
> [Unit]
> Description=Run ControlM shutdown process
> Requires=graphical.target multi-user.target network.target network.service sockets.target
> DefaultDependencies=no
> Before=shutdown.target reboot.target halt.target poweroff.target kexec.target

Again I am not a systemd expert, but I believe the Before line above is 
the opposite of what you want:

https://serverfault.com/questions/812584/in-systemd-whats-the-difference-between-after-and-requires#812589

The above quotes the man page
(https://www.freedesktop.org/software/systemd/man/systemd.unit.html):

"... Note that when two units with an ordering dependency between them 
are shut down, the inverse of the start-up order is applied. i.e. if a 
unit is configured with After= on another unit, the former is stopped 
before the latter if both are shut down. ..."
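
To make the quoted rule concrete: if the Control-M shutdown unit were ordered After= the database unit, systemd would start it after the database and, on shutdown, stop it (that is, run its ExecStop=) before stopping the database. The sketch below writes such an ordering as a drop-in file. It is an illustration under assumptions, not a confirmed fix for this thread: the unit names come from the posts above, while the function name write_ordering_dropin, the file name ordering.conf and the directory argument are hypothetical.

```shell
#!/bin/sh
# Sketch only: write a drop-in that orders one unit after others.
# Usage: write_ordering_dropin DIR UNIT...
write_ordering_dropin() {
    dir=$1
    shift
    mkdir -p "$dir" || return 1
    # After= affects ordering only. On shutdown the order is reversed,
    # so the unit owning this drop-in is stopped before the units listed.
    printf '[Unit]\nAfter=%s\n' "$*" > "$dir/ordering.conf"
}
```

For a real unit the directory would be the unit's drop-in directory, for example `write_ordering_dropin /etc/systemd/system/ControlM_Shutdown.service.d CTM_Postgre.service network.target`, followed by `systemctl daemon-reload`. The ordering only helps if the database process is actually tracked by the unit named in After=.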


> 
> [Service]
> Type=oneshot
> RemainAfterExit=true
> ExecStart=/bin/true
> ExecStop=/bin/bash /root/scripts/control-m_shutdown.sh
> TimeoutStopSec=4min
> 
> [Install]
> WantedBy=multi-user.target
> [root@kccontrolmt01 ~]#
> 


-- 
Adrian Klaver
adrian.klaver@aklaver.com


RE: RHEL 7 (systemd) reboot

From
Bryce Pepper
Date:
Adrian,

I tried changing the Before to After, but the postgresql instance was still shut down too early.

I appreciate all of the help, but I think I'm going to ask the patching group to ensure they stop the control-m services prior to reboot.

Bryce

Oct 11 09:19:57 kccontrolmt01 su[9816]: pam_unix(su-l:session): session opened for user sa_ctmlinux_uat by (uid=0)
Oct 11 09:19:57 kccontrolmt01 systemd[1]: Started Restore /run/initramfs.
Oct 11 09:19:57 kccontrolmt01 stop_ctmdist_agent.sh[9671]: setenv: Too many arguments.
Oct 11 09:19:57 kccontrolmt01 stop_ctmlinux_agent.sh[9672]: setenv: Too many arguments.
Oct 11 09:19:57 kccontrolmt01 stop_ctmdist_agent.sh[9671]: Killing Control-M/Agent Listener pid:5595
Oct 11 09:19:57 kccontrolmt01 stop_ctmlinux_agent.sh[9672]: Killing Control-M/Agent Listener pid:5977
Oct 11 09:19:58 kccontrolmt01 stop_ctmdist_agent.sh[9671]: 2018-10-11 09:19:58 Listener process stopped
Oct 11 09:19:58 kccontrolmt01 stop_ctmlinux_agent.sh[9672]: 2018-10-11 09:19:58 Listener process stopped
Oct 11 09:19:58 kccontrolmt01 stop_ctmlinux_agent.sh[9672]: Killing Control-M/Agent Tracker pid:6199
Oct 11 09:19:58 kccontrolmt01 stop_ctmdist_agent.sh[9671]: Killing Control-M/Agent Tracker pid:6172
Oct 11 09:19:58 kccontrolmt01 systemd[1]: Stopped Dynamic System Tuning Daemon.
Oct 11 09:19:59 kccontrolmt01 stop_ctmlinux_agent.sh[9672]: 2018-10-11 09:19:59 Tracker process stopped
Oct 11 09:19:59 kccontrolmt01 stop_ctmdist_agent.sh[9671]: 2018-10-11 09:19:59 Tracker process stopped
Oct 11 09:19:59 kccontrolmt01 systemd[1]: Stopped Eracent EUA Service.
Oct 11 09:19:59 kccontrolmt01 su[9815]: pam_unix(su-l:session): session closed for user sa_ctmdist_uat
Oct 11 09:19:59 kccontrolmt01 su[9816]: pam_unix(su-l:session): session closed for user sa_ctmlinux_uat
Oct 11 09:19:59 kccontrolmt01 systemd[1]: Stopped Control-M CTM Dist Agent.
Oct 11 09:19:59 kccontrolmt01 systemd[1]: Stopping Control-M CTM Dist Server...
Oct 11 09:19:59 kccontrolmt01 systemd[1]: Stopped Control-M CTM Linux Agent.
Oct 11 09:19:59 kccontrolmt01 systemd[1]: Stopping Control-M CTM Linux Server...
Oct 11 09:19:59 kccontrolmt01 su[10319]: (to sa_ctmdist_uat) root on none
Oct 11 09:19:59 kccontrolmt01 su[10320]: (to sa_ctmlinux_uat) root on none
Oct 11 09:19:59 kccontrolmt01 systemd[1]: Requested transaction contradicts existing jobs: Transaction is destructive.
Oct 11 09:19:59 kccontrolmt01 systemd-logind[777]: Failed to start session scope session-c12.scope: Transaction is destructive.
Oct 11 09:19:59 kccontrolmt01 su[10319]: pam_systemd(su-l:session): Failed to create session: Resource deadlock avoided
Oct 11 09:19:59 kccontrolmt01 su[10319]: pam_unix(su-l:session): session opened for user sa_ctmdist_uat by (uid=0)
Oct 11 09:19:59 kccontrolmt01 systemd[1]: Requested transaction contradicts existing jobs: Transaction is destructive.
Oct 11 09:19:59 kccontrolmt01 systemd-logind[777]: Failed to start session scope session-c13.scope: Transaction is destructive.
Oct 11 09:19:59 kccontrolmt01 su[10320]: pam_systemd(su-l:session): Failed to create session: Resource deadlock avoided
Oct 11 09:19:59 kccontrolmt01 su[10320]: pam_unix(su-l:session): session opened for user sa_ctmlinux_uat by (uid=0)
Oct 11 09:19:59 kccontrolmt01 systemd[1]: Stopped Eracent EPA Service.
Oct 11 09:19:59 kccontrolmt01 systemd[1]: Stopped target Network.
Oct 11 09:19:59 kccontrolmt01 systemd[1]: Stopping Network.
Oct 11 09:19:59 kccontrolmt01 systemd[1]: Stopping LSB: Bring up/down networking...
Oct 11 09:20:00 kccontrolmt01 stop_ctmdist_server.sh[10316]: setenv: Too many arguments.
Oct 11 09:20:00 kccontrolmt01 stop_ctmdist_server.sh[10316]: Stopping CONTROL-M application
Oct 11 09:20:00 kccontrolmt01 stop_ctmdist_server.sh[10316]: SQL Server is not running.
Oct 11 09:20:00 kccontrolmt01 stop_ctmlinux_server.sh[10318]: setenv: Too many arguments.
Oct 11 09:20:00 kccontrolmt01 stop_ctmlinux_server.sh[10318]: Stopping CONTROL-M application
Oct 11 09:20:00 kccontrolmt01 stop_ctmlinux_server.sh[10318]: SQL Server is not running.
Oct 11 09:20:00 kccontrolmt01 stop_ctmdist_server.sh[10316]: ------------------------
Oct 11 09:20:00 kccontrolmt01 stop_ctmdist_server.sh[10316]: Shutting down CONTROL-M.
Oct 11 09:20:00 kccontrolmt01 stop_ctmdist_server.sh[10316]: ------------------------
Oct 11 09:20:00 kccontrolmt01 stop_ctmdist_server.sh[10316]: Waiting ...
Oct 11 09:20:00 kccontrolmt01 stop_ctmdist_server.sh[10316]: psql action failed. cannot perform sql command in /data00/ctmdist/ctm_server/tmp/upd_CMS_SYSP
Oct 11 09:20:00 kccontrolmt01 stop_ctmdist_server.sh[10316]: db_execute_sql failed while processing /data00/ctmdist/ctm_server/tmp/upd_CMS_SYSPRM_10512.sq
Oct 11 09:20:00 kccontrolmt01 stop_ctmdist_server.sh[10316]: Failed to update CMS_SYSPRM table.
Oct 11 09:20:00 kccontrolmt01 stop_ctmdist_server.sh[10316]: Be aware that the Configuration Agent might start the CONTROL-M/Server
Oct 11 09:20:00 kccontrolmt01 stop_ctmlinux_server.sh[10318]: ------------------------
Oct 11 09:20:00 kccontrolmt01 stop_ctmlinux_server.sh[10318]: Shutting down CONTROL-M.
Oct 11 09:20:00 kccontrolmt01 stop_ctmlinux_server.sh[10318]: ------------------------
Oct 11 09:20:00 kccontrolmt01 stop_ctmlinux_server.sh[10318]: Waiting ...
Oct 11 09:20:00 kccontrolmt01 stop_ctmlinux_server.sh[10318]: psql action failed. cannot perform sql command in /data00/ctmlinux/ctm_server/tmp/upd_CMS_SY
Oct 11 09:20:00 kccontrolmt01 stop_ctmlinux_server.sh[10318]: db_execute_sql failed while processing /data00/ctmlinux/ctm_server/tmp/upd_CMS_SYSPRM_10571.
Oct 11 09:20:00 kccontrolmt01 stop_ctmlinux_server.sh[10318]: Failed to update CMS_SYSPRM table.
Oct 11 09:20:00 kccontrolmt01 stop_ctmlinux_server.sh[10318]: Be aware that the Configuration Agent might start the CONTROL-M/Server
Oct 11 09:20:00 kccontrolmt01 NetworkManager[789]: <info>  [1539267600.3979] device (eth0): state change: activated -> deactivating (reason 'user-requeste
Oct 11 09:20:00 kccontrolmt01 NetworkManager[789]: <info>  [1539267600.4062] manager: NetworkManager state is now DISCONNECTING
Oct 11 09:20:00 kccontrolmt01 dbus[748]: [system] Activating via systemd: service name='org.freedesktop.nm_dispatcher' unit='dbus-org.freedesktop.nm-dispa
Oct 11 09:20:00 kccontrolmt01 dbus[748]: [system] Activation via systemd failed for unit 'dbus-org.freedesktop.nm-dispatcher.service': Refusing activation
Oct 11 09:20:00 kccontrolmt01 NetworkManager[789]: <info>  [1539267600.4228] audit: op="device-disconnect" interface="eth0" ifindex=2 pid=10883 uid=0 resu
Oct 11 09:20:00 kccontrolmt01 NetworkManager[789]: <info>  [1539267600.4240] device (eth0): state change: deactivating -> disconnected (reason 'user-reque
Oct 11 09:20:00 kccontrolmt01 NetworkManager[789]: <warn>  [1539267600.4319] platform-linux: do-change-link[2]: failure changing link: failure 97 (Address
Oct 11 09:20:00 kccontrolmt01 NetworkManager[789]: <warn>  [1539267600.4325] device (eth0): failed to enable userspace IPv6LL address handling (unspecifie
Oct 11 09:20:00 kccontrolmt01 NetworkManager[789]: <info>  [1539267600.4509] manager: NetworkManager state is now DISCONNECTED
Oct 11 09:20:00 kccontrolmt01 dbus[748]: [system] Activating via systemd: service name='org.freedesktop.nm_dispatcher' unit='dbus-org.freedesktop.nm-dispa
Oct 11 09:20:00 kccontrolmt01 dbus[748]: [system] Activation via systemd failed for unit 'dbus-org.freedesktop.nm-dispatcher.service': Refusing activation
Oct 11 09:20:00 kccontrolmt01 network[10323]: Shutting down interface eth0:  Device 'eth0' successfully disconnected.

Re: RHEL 7 (systemd) reboot

From
Adrian Klaver
Date:
On 10/11/18 7:53 AM, Bryce Pepper wrote:
> Adrian,
> 
> I tried changing the Before to After but the postgresql instance was still shut down too early.

In an earlier post you had:

cat ControlM_Shutdown.service
[Unit]
Description=Run mycommand at shutdown
Requires=network.target CTM_Postgre.service

Did you add CTM_Postgre.service to After= ?

My suspicion is that CTM_Postgre.service is being stopped before systemd
gets to ControlM_Shutdown.service, unless of course CTM_Postgre.service
does not exist anymore.
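
For context on why After= matters here: systemd stops units in the inverse of their start order, and only After=/Before= create that order; Requires= on its own does not. Below is a minimal Python sketch of that rule. It is a toy model, not systemd code. The After= edges are assumptions for illustration (the unit quoted above lists only Requires=), and `stop_order` is a hypothetical helper name.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+


def stop_order(after):
    """Return a stop order for a dict mapping unit -> units named in its After=."""
    # static_order() yields every unit after the units it is ordered After=,
    # which models the start order; shutdown applies the inverse.
    start = list(TopologicalSorter(after).static_order())
    return list(reversed(start))


# Assumed After= edges, for illustration only (not taken from the real units).
after = {
    "ControlM_Shutdown.service": {"network.target", "CTM_Postgre.service"},
    "CTM_Postgre.service": {"network.target"},
    "network.target": set(),
}

order = stop_order(after)
print(order)
```

With those edges, ControlM_Shutdown.service is stopped first, so its stop action would run while Postgres is still up. Without an After= edge between the two, systemd is free to stop them at the same time.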

Then there is this:

Oct 11 09:20:00 kccontrolmt01 stop_ctmdist_server.sh[10316]: setenv: Too many arguments.
Oct 11 09:20:00 kccontrolmt01 stop_ctmdist_server.sh[10316]: Stopping CONTROL-M application
Oct 11 09:20:00 kccontrolmt01 stop_ctmdist_server.sh[10316]: SQL Server is not running.
Oct 11 09:20:00 kccontrolmt01 stop_ctmlinux_server.sh[10318]: setenv: Too many arguments.
Oct 11 09:20:00 kccontrolmt01 stop_ctmlinux_server.sh[10318]: Stopping CONTROL-M application
Oct 11 09:20:00 kccontrolmt01 stop_ctmlinux_server.sh[10318]: SQL Server is not running.

which to me looks like the script is running twice.

> 
> I appreciate all of the help but think I'm going to ask the patching group to ensure they stop the control-m services prior to reboot.

Yeah, there seem to be hidden dependencies at play.
> 
> Bryce
> 



-- 
Adrian Klaver
adrian.klaver@aklaver.com


RE: RHEL 7 (systemd) reboot

From
Bryce Pepper
Date:
I disabled and removed the CTM_Postgre.service as it didn't help (and I didn't want too many moving parts left out there).

I did find a post, https://superuser.com/questions/1016827/how-do-i-run-a-script-before-everything-else-on-shutdown-with-systemd, that I think is getting me closer.

I tried RequiresMountsFor=/data00, which starts the script much sooner, but unfortunately the postgresql instance is unreachable by the time the script gets there.

These are two unique datacenter shutdowns (ctmdist and ctmlinux), not one script running twice:

Oct 11 09:20:00 kccontrolmt01 stop_ctmdist_server.sh[10316]: setenv: Too many arguments.
Oct 11 09:20:00 kccontrolmt01 stop_ctmdist_server.sh[10316]: Stopping CONTROL-M application
Oct 11 09:20:00 kccontrolmt01 stop_ctmdist_server.sh[10316]: SQL Server is not running.

Oct 11 09:20:00 kccontrolmt01 stop_ctmlinux_server.sh[10318]: setenv: Too many arguments.
Oct 11 09:20:00 kccontrolmt01 stop_ctmlinux_server.sh[10318]: Stopping CONTROL-M application
Oct 11 09:20:00 kccontrolmt01 stop_ctmlinux_server.sh[10318]: SQL Server is not running.
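
One way to check that the lines above come from two separate scripts rather than one script running twice is to group them by process name and PID. A minimal Python sketch follows, assuming the usual "timestamp host name[pid]: message" syslog layout seen in this thread. `group_by_process` and the `TAG` pattern are hypothetical names; the sample lines are the six posted above.

```python
import re
from collections import defaultdict

# Matches "Mon DD HH:MM:SS host name[pid]: message".
TAG = re.compile(r"^\w{3} +\d+ [\d:]{8} \S+ ([^\s\[]+)\[(\d+)\]: (.*)$")


def group_by_process(lines):
    """Group syslog messages by (process name, pid)."""
    groups = defaultdict(list)
    for line in lines:
        match = TAG.match(line)
        if match:  # lines that do not fit the layout are ignored
            name, pid, message = match.groups()
            groups[(name, int(pid))].append(message)
    return dict(groups)


sample = [
    "Oct 11 09:20:00 kccontrolmt01 stop_ctmdist_server.sh[10316]: setenv: Too many arguments.",
    "Oct 11 09:20:00 kccontrolmt01 stop_ctmdist_server.sh[10316]: Stopping CONTROL-M application",
    "Oct 11 09:20:00 kccontrolmt01 stop_ctmdist_server.sh[10316]: SQL Server is not running.",
    "Oct 11 09:20:00 kccontrolmt01 stop_ctmlinux_server.sh[10318]: setenv: Too many arguments.",
    "Oct 11 09:20:00 kccontrolmt01 stop_ctmlinux_server.sh[10318]: Stopping CONTROL-M application",
    "Oct 11 09:20:00 kccontrolmt01 stop_ctmlinux_server.sh[10318]: SQL Server is not running.",
]

groups = group_by_process(sample)
for (name, pid), messages in groups.items():
    print(name, pid, len(messages))
```

Two distinct (name, pid) pairs with the same three messages each point to two scripts that each ran once.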
 


Re: RHEL 7 (systemd) reboot

From
Adrian Klaver
Date:
On 10/11/18 10:43 AM, Bryce Pepper wrote:
> I disabled and removed the CTM_Postgre.service as it didn't help (and I didn't want too many moving parts left out there).
> 
> I did find a post, https://superuser.com/questions/1016827/how-do-i-run-a-script-before-everything-else-on-shutdown-with-systemd, that I think is getting me closer.
> 
> I tried RequiresMountsFor=/data00, which starts the script much sooner, but unfortunately the postgresql instance is unreachable by the time the script gets there.

Seems to me the first priority is finding what is shutting down Postgres.

Does the system log show anything?

If not, find the shutdown time in the Postgres log and correlate that 
with the system log.
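
A rough way to do that correlation, sketched in Python. The Postgres shutdown time used here (09:19:59) is a placeholder, not a value read from the Postgres log. The year is assumed to be 2018, since syslog timestamps omit it. `parse_syslog_time` and `lines_near` are hypothetical helper names; the sample lines are copied from the logs posted in this thread.

```python
from datetime import datetime, timedelta


def parse_syslog_time(line, year):
    """Parse the leading 'Mon DD HH:MM:SS' stamp; syslog omits the year."""
    return datetime.strptime(f"{year} {line[:15]}", "%Y %b %d %H:%M:%S")


def lines_near(lines, when, window_seconds=5, year=2018):
    """Return the syslog lines stamped within window_seconds of `when`."""
    window = timedelta(seconds=window_seconds)
    hits = []
    for line in lines:
        try:
            stamp = parse_syslog_time(line, year)
        except ValueError:
            continue  # wrapped continuation lines carry no timestamp
        if abs(stamp - when) <= window:
            hits.append(line)
    return hits


system_log = [
    "Oct 05 14:18:56 kccontrolmt01 stop_ctmlinux_server.sh[29185]: Waiting ...",
    "Oct 11 09:19:59 kccontrolmt01 systemd[1]: Stopping Control-M CTM Linux Server...",
    "destructive.",
    "Oct 11 09:19:59 kccontrolmt01 systemd[1]: Stopped target Network.",
    "Oct 11 09:20:00 kccontrolmt01 stop_ctmdist_server.sh[10316]: SQL Server is not running.",
]

# Placeholder: substitute the shutdown time found in the Postgres log.
postgres_shutdown = datetime(2018, 10, 11, 9, 19, 59)

hits = lines_near(system_log, postgres_shutdown, window_seconds=1)
for line in hits:
    print(line)
```

Whatever systemd reports stopping in that window is the first candidate for what took Postgres down.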

> 
> These are two unique datacenter shutdowns: ctmdist  & ctmlinux
> 
> Oct 11 09:20:00 kccontrolmt01 stop_ctmdist_server.sh[10316]: setenv: Too many arguments.
> Oct 11 09:20:00 kccontrolmt01 stop_ctmdist_server.sh[10316]: Stopping CONTROL-M application
> Oct 11 09:20:00 kccontrolmt01 stop_ctmdist_server.sh[10316]: SQL Server is not running.
>
> Oct 11 09:20:00 kccontrolmt01 stop_ctmlinux_server.sh[10318]: setenv: Too many arguments.
> Oct 11 09:20:00 kccontrolmt01 stop_ctmlinux_server.sh[10318]: Stopping CONTROL-M application
> Oct 11 09:20:00 kccontrolmt01 stop_ctmlinux_server.sh[10318]: SQL Server is not running.
> 


-- 
Adrian Klaver
adrian.klaver@aklaver.com