Re: [External] Re: pgadmin--pgagent---the process hang by unknow reasons

Поиск
Список
Период
Сортировка
От Neel Patel
Тема Re: [External] Re: pgadmin--pgagent---the process hang by unknow reasons
Дата
Msg-id CACCA4P3uvnQ4TSkjU3qL4+ZGy_oD8g6pC9KJ-EKBeHOhFe-1-w@mail.gmail.com
обсуждение исходный текст
Ответ на Re: [External] Re: pgadmin--pgagent---the process hang by unknow reasons  (Neel Patel <neel.patel@enterprisedb.com>)
Список pgadmin-support
Hi,

Is it possible for you to share the below details ? We want to understand what kind of jobs ( sql, batch script etc. ) it is and at what interval all the jobs are running ?

SELECT * FROM pgagent.pga_job;
SELECT * FROM pgagent.pga_joblog;
SELECT * FROM pgagent.pga_jobstep;
SELECT * FROM pgagent.pga_schedule;
SELECT * FROM pgagent.pga_jobsteplog;
SELECT * FROM pgagent.pga_exception;


Thanks,
Neel Patel

On Wed, Dec 9, 2020 at 5:13 PM Neel Patel <neel.patel@enterprisedb.com> wrote:
Sure Dave. I will check and update.

On Tue, Dec 8, 2020 at 4:39 PM Dave Page <dpage@pgadmin.org> wrote:
Neel, can you help with this please? I'm not overly familiar with the updated Boost code.

On Tue, Dec 8, 2020 at 10:23 AM Zhiyu ZY13 Xu <xuzy13@lenovo.com> wrote:

Hi Dave

 

Thanks for your help. I have upgrade to pgagent 4.0 on 12/7 with your recommendation.

CREATE EXTENSION pgagent FROM unpackaged;

 

Owing to my PostgreSQL server didn’t install by yum. So I copy those file to my extension directory.

Then create extension command work.

cp /usr/pgsql-10/share/extension/pgagent* /data/postgres/share/extension/

 

After I upgrade pgagent to 4.0. I test some application. It’s work normal. Pgagent 4.0 work as expected.

But when it running almost 1 hour. It experience the problem which same with pgagent 3.4.

I attach the logs and stack for your reference. The pgagent stop on this function “in __lll_lock_wait ()”

The log on the attachment for your reference.

Env: PG 10.5 and OS is Centos 6.9

 

postgres@sltfjfrauxq ~]$ pstack 21492

Thread 4 (Thread 0x7ff8dffff700 (LWP 24059)):

#0  0x00007ff8ea665334 in __lll_lock_wait () from /lib64/libpthread.so.0

#1  0x00007ff8ea6605d8 in _L_lock_854 () from /lib64/libpthread.so.0

#2  0x00007ff8ea6604a7 in pthread_mutex_lock () from /lib64/libpthread.so.0

#3  0x00007ff8ebfbbc01 in DBconn::Return() ()

#4  0x00007ff8ebfc7f32 in Job::Execute() ()

#5  0x00007ff8ebfcac58 in JobThread::operator()() ()

#6  0x00007ff8eb221d47 in thread_proxy () from /usr/lib64/libboost_thread-mt.so.5

#7  0x00007ff8ea65eaa1 in start_thread () from /lib64/libpthread.so.0

#8  0x00007ff8ea3abc4d in clone () from /lib64/libc.so.6

Thread 3 (Thread 0x7ff8debfd700 (LWP 28530)):

#0  0x00007ff8ea3a2403 in poll () from /lib64/libc.so.6

#1  0x00007ff8ebb5828f in ?? () from /usr/lib64/libpq.so.5

#2  0x00007ff8ebb58310 in ?? () from /usr/lib64/libpq.so.5

#3  0x00007ff8ebb538e2 in ?? () from /usr/lib64/libpq.so.5

#4  0x00007ff8ebb5465f in PQconnectdb () from /usr/lib64/libpq.so.5

#5  0x00007ff8ebfba6b5 in DBconn::Connect(std::basic_string<wchar_t, std::char_traits<wchar_t>, std::allocator<wchar_t> > const&) ()

#6  0x00007ff8ebfba7c4 in DBconn::DBconn(std::basic_string<wchar_t, std::char_traits<wchar_t>, std::allocator<wchar_t> > const&) ()

#7  0x00007ff8ebfbc9da in DBconn::Get(std::basic_string<wchar_t, std::char_traits<wchar_t>, std::allocator<wchar_t> > const&, std::basic_string<wchar_t, std::char_traits<wchar_t>, std::allocator<wchar_t> > const&) ()

#8  0x00007ff8ebfc7d54 in Job::Execute() ()

#9  0x00007ff8ebfcac58 in JobThread::operator()() ()

#10 0x00007ff8eb221d47 in thread_proxy () from /usr/lib64/libboost_thread-mt.so.5

#11 0x00007ff8ea65eaa1 in start_thread () from /lib64/libpthread.so.0

#12 0x00007ff8ea3abc4d in clone () from /lib64/libc.so.6

Thread 2 (Thread 0x7ff8de1fc700 (LWP 28531)):

#0  0x00007ff8ea665334 in __lll_lock_wait () from /lib64/libpthread.so.0

#1  0x00007ff8ea6605d8 in _L_lock_854 () from /lib64/libpthread.so.0

#2  0x00007ff8ea6604a7 in pthread_mutex_lock () from /lib64/libpthread.so.0

#3  0x00007ff8ebfbc9a7 in DBconn::Get(std::basic_string<wchar_t, std::char_traits<wchar_t>, std::allocator<wchar_t> > const&, std::basic_string<wchar_t, std::char_traits<wchar_t>, std::allocator<wchar_t> > const&) ()

#4  0x00007ff8ebfc7d54 in Job::Execute() ()

#5  0x00007ff8ebfcac58 in JobThread::operator()() ()

#6  0x00007ff8eb221d47 in thread_proxy () from /usr/lib64/libboost_thread-mt.so.5

#7  0x00007ff8ea65eaa1 in start_thread () from /lib64/libpthread.so.0

#8  0x00007ff8ea3abc4d in clone () from /lib64/libc.so.6

Thread 1 (Thread 0x7ff8ec1e07e0 (LWP 21492)):

#0  0x00007ff8ea665334 in __lll_lock_wait () from /lib64/libpthread.so.0

#1  0x00007ff8ea6605d8 in _L_lock_854 () from /lib64/libpthread.so.0

#2  0x00007ff8ea6604a7 in pthread_mutex_lock () from /lib64/libpthread.so.0

#3  0x00007ff8ebfbc197 in DBconn::ClearConnections(bool) ()

#4  0x00007ff8ebfd8909 in MainRestartLoop(DBconn*) ()

#5  0x00007ff8ebfd9993 in MainLoop() ()

#6  0x00007ff8ebfdd872 in main ()

[postgres@sltfjfrauxq ~]$ kill -9 21492

[postgres@sltfjfrauxq ~]$ pgagent_10  -s ./pg_agent_12_8_after_upgrade_version_upgrade.log -l 2 hostaddr=127.0.0.1 port=5432 dbname=postgres user=postgres password=abcd-1234

[postgres@sltfjfrauxq ~]$ psql

psql (10.5)

 

 

I have a questions about pgagent tables. There are many tables which used by pgagent.

I couldn’t find a document to describe those pgagent system tables. I want to know more detail about the tables and every column.

I want to deploy a monitor script to monitor pgagent state and every job execution result.

Some job failed with “couldn’t get connection with database” . Most of jobs could execute successfully.

 

Which sql statement to query result to display on the pgadmin page ? I want to use same sql to query latest pgagent jobs status.

 

 

 

 

 

 

 

徐志宇(Jack

Database Engineer

    

DB Team,ITS. Lenovo China

Phone: 86-18910860709

Email:xuzy13@lenovo.com

No.6 Shangdi West Road, Haidian District Beijing, China, 100085

 

发件人: Dave Page <dpage@pgadmin.org>
发送时间: 2020124 17:39
收件人: Zhiyu ZY13 Xu <xuzy13@lenovo.com>
抄送: pgadmin-support@postgresql.org
主题: Re: [External] Re: pgadmin--pgagent---the process hang by unknow reasons

 

Hi

 

On Fri, Dec 4, 2020 at 9:00 AM Zhiyu ZY13 Xu <xuzy13@lenovo.com> wrote:

Hi Dave 

 

I am appreciate you help on this problem.

The production system installed by my teammate. He told me that. He experience some problem when build and compile pgagent 3.4.0 file.

So he install it by yum resource.  Then create pgagent tables by execute a script.  Look like he didnt create pgagent extension.

On the production system. I could find those tables which used by pgagent_10 3.4.

 

Ah, OK. Try running the following (assuming *only* 3.4 is installed):

 

CREATE EXTENSION pgagent FROM unpackaged;

 

That should use the pgagent--unpackaged--3.4.sql script to convert the existing 3.4 installation into an extension. 

 

Then, you should be able to do the 4.0 upgrade by installing the package, updating the extension, and then restarting pgAgent.

 

 

 

   But I cant list pgagent extension.

 

Currently. There are over 30 jobs running by pgagent. Most of time pgagent could running well.

I dont want to re-create those jobs.  I want to  try to find a solution to upgrade pgagent on this env.

  If this env have unknown problem when deploy. Do you have any suggestion to upgrade 4.0 without re-create all pgagent jobs ? Thanks

   

 

徐志宇(Jack

Database Engineer

    

DB Team,ITS. Lenovo China

Phone: 86-18910860709

Email:xuzy13@lenovo.com

No.6 Shangdi West Road, Haidian District Beijing, China, 100085

 

发件人: Dave Page <dpage@pgadmin.org>
发送时间: 2020122 21:16
收件人: Zhiyu ZY13 Xu <xuzy13@lenovo.com>
抄送: pgadmin-support@postgresql.org
主题: Re: [External] Re: pgadmin--pgagent---the process hang by unknow reasons

 

Hi

 

On Wed, Dec 2, 2020 at 6:50 AM Zhiyu ZY13 Xu <xuzy13@lenovo.com> wrote:

Hi Dave 

 

As a reference point, I installed a fresh CentOS 6.10 VM (which is somewhat less easy than it used to be, since it went EOL 2 days ago and the repos and ISOs have all been moved). I then installed postgresql10-server, ran initdb and started the server. Once that was up and running, I installed pgagent_10-3.4.0. I was able to login to Postgres and run "CREATE EXTENSION pgagent;" without any errors. I then started pgAgent to ensure it ran correctly, which it did.

 

Then, I ran a "yum upgrade pgagent_10", which installed the 4.0.0 release. In PostgreSQL, I ran "ALTER EXTENSION pgagent UPDATE;", which successfully upgraded the schema. I restarted pgAgent and everything looked good.

 

So, this doesn't appear to be a general problem, but something specific to your system. Is there anything in the process above that doesn't match with what you've tried to do?

 

More comments below....

 

 

   Thanks for your update. I collect logs for your reference.  

   By the way. Is there any upgrade document which mentioned this scenario ?  upgrade pgagent_10 3.4 to pgagent_10 4.0.  

5: start pgagent process with new version pgagent.

With what error? Anything in the pgAgent log or /var/log/messages etc? 

There is no errors.The pgagent process disappear after execute the command.

On pgagent_10 3.4 version. There is a pgagent_10 process running.  

 

 

If you don't include the "-f" command line option, pgAgent will detach from the foreground and run as a background process, returning you to a shell prompt. I would suggest running ps to check you don't have a number of pgAgent processes running in the background (and if you do, kill them).

 

Also note that any command line options *must* come before the connection string, not after it. That would explain why the -s and -l options you've used above don't seem to do anything.

 

LINE 1: CREATE EXTENSION pgagent UPDATE;

                                 ^

postgres=# alter EXTENSION pgagent UPDATE;

ERROR:  extension "pgagent" does not exist

 

That implies that pgAgent isn't installed properly. Are you using the PostgreSQL packages from yum.postgresql.org? If not, how did you install PostgreSQL? The pgagent_10 package is designed to work with the PostgreSQL 10 RPMs from the same yum.postgresql.org repo.

 

 

 

 

 

徐志宇(Jack

Database Engineer

    

DB Team,ITS. Lenovo China

Phone: 86-18910860709

Email:xuzy13@lenovo.com

No.6 Shangdi West Road, Haidian District Beijing, China, 100085

 

发件人: Dave Page <dpage@pgadmin.org>
发送时间: 2020121 22:27
收件人: Zhiyu ZY13 Xu <xuzy13@lenovo.com>
抄送: pgadmin-support@postgresql.org
主题: Re: [External] Re: pgadmin--pgagent---the process hang by unknow reasons

 

Hi

 

On Tue, Dec 1, 2020 at 12:38 PM Zhiyu ZY13 Xu <xuzy13@lenovo.com> wrote:

Hi Dave

 

   The command output is :

 

My test system. All commands executed on this system.

Before upgrade pgagent. I execute follow in steps.

 

1: backup and restore production database.

2: install pgagent by yum command .

Yum install pgagent_10

Yum downgrade pgagent_10.  (Then pgagent backup to 3.4 )

 

3: start pgagent process. Make sure pgagent running on normal state.

 

4: execute yum upgrade command to pgagent 4.0

 

5: start pgagent process with new version pgagent.

Failed.

 

With what error? Anything in the pgAgent log or /var/log/messages etc?

 

 

6: try to upgrade by the upgrade script.

   Failed.

 

With what error?

 

 

7: Build and compile pgagent 4.2.

Try to upgrade pgagent. Failed.

  

[root@slbwcbnos2 ~]# ls -al /usr/pgsql-10/share/extension

total 40

drwxr-xr-x 2 root root   118 Dec  1 14:59 .

drwxr-xr-x 3 root root    23 Dec  1 13:55 ..

-rw-r--r-- 1 root root   538 Jul 12  2018 pgagent--3.4--4.0.sql

-rw-r--r-- 1 root root 27657 Apr  5  2020 pgagent--4.0.sql

-rw-r--r-- 1 root root   139 Apr  5  2020 pgagent.control

-rw-r--r-- 1 root root  2603 Jul 12  2018 pgagent--unpackaged--4.0.sql

 

 

That indicates that pgAgent 4.0 is installed, not 4.2. I would suggest not trying to build it yourself at this stage.

 

 

 

Our production system.

 

[root@sltfjfrauxq ~]# ls -al /usr/pgsql-10/share/extension

total 44

drwxr-xr-x 2 root root  4096 Nov 30 23:46 .

drwxr-xr-x 3 root root  4096 Nov 22  2018 ..

-rw-r--r-- 1 root root 27665 Oct 18  2017 pgagent--3.4.sql

-rw-r--r-- 1 root root   139 Oct 18  2017 pgagent.control

-rw-r--r-- 1 root root  2358 Feb 26  2014 pgagent--unpackaged--3.4.sql

[root@sltfjfrauxq ~]#

 

 

Right - that looks like 3.4 is installed, as expected.

 

 

徐志宇(Jack

Database Engineer

    

DB Team,ITS. Lenovo China

Phone: 86-18910860709

Email:xuzy13@lenovo.com

No.6 Shangdi West Road, Haidian District Beijing, China, 100085

 

发件人: Dave Page <dpage@pgadmin.org>
发送时间: 2020121 18:33
收件人: Zhiyu ZY13 Xu <xuzy13@lenovo.com>
抄送: pgadmin-support@postgresql.org
主题: Re: [External] Re: pgadmin--pgagent---the process hang by unknow reasons

 

Hi

 

It sounds like the extension hasn't been installed properly at the operating system level - what is the output from:

 

ls -al /usr/pgsql-10/share/extension

 

On Tue, Dec 1, 2020 at 10:00 AM Zhiyu ZY13 Xu <xuzy13@lenovo.com> wrote:

Hi Dave

 

  Thanks for your quick response.

 

I try this command on 4.0 and 4.2. I got a warning that the extension not exist.

 

 

 

 

 

徐志宇(Jack

Database Engineer

    

DB Team,ITS. Lenovo China

Phone: 86-18910860709

Email:xuzy13@lenovo.com

No.6 Shangdi West Road, Haidian District Beijing, China, 100085

 

发件人: Dave Page <dpage@pgadmin.org>
发送时间: 2020121 17:48
收件人: Zhiyu ZY13 Xu <xuzy13@lenovo.com>
抄送: pgadmin-support@postgresql.org
主题: Re: [External] Re: pgadmin--pgagent---the process hang by unknow reasons

 

Hi

 

On Tue, Dec 1, 2020 at 9:17 AM Zhiyu ZY13 Xu <xuzy13@lenovo.com> wrote:

Hi Dave

 

   Thanks for your update. I try to upgrade pgagent_10-3.4.0. But failed.

   Would you like to help me narrow down this upgrade issue Thanks

   

upgrade path

version upgrade

status

yum install pgagent_10

\i pgagent--3.4--4.0.sql

create extension pgagent;

3.4 4.0

failed

compile and build. (install cmake and Boost)

 

\i pgagent--3.4--4.2.sql

create extension pgagent;

3.4 4.2

failed.

 

 Currently pgagent_10 3.4.0 env  installed by my teammate. He told me that he install it by yum command and execute a script.

Then those pgagent tables created by this script. There is no extension exist. But those pgagent table exist.

 

 

 Then I try upgrade pgagent_10 3.40 to 4.0 by yum command.  Its successfully upgrade by yum command.

But the pgagent_10 process unable start.  

 

[root@sltfjfrauxq ~]# yum upgrade pgagent_10

 

Total download size: 138 k

Is this ok [y/N]: y

Downloading Packages:

pgagent_10-4.0.0-4.rhel6.x86_64.rpm                                                                       | 138 kB     00:08

Running rpm_check_debug

Running Transaction Test

Transaction Test Succeeded

Running Transaction

  Updating   : pgagent_10-4.0.0-4.rhel6.x86_64                                                                               1/2

  Cleanup    : pgagent_10-3.4.0-10.rhel6.x86_64                                                                              2/2

  Verifying  : pgagent_10-4.0.0-4.rhel6.x86_64                                                                               1/2

  Verifying  : pgagent_10-3.4.0-10.rhel6.x86_64                                                                              2/2

 

Updated:

  pgagent_10.x86_64 0:4.0.0-4.rhel6

 

Complete!

Version: 4.0.0

 

 

 

The 4.0 pgagent start logs.

Tue Dec 1 13:57:53 2020 DEBUG: Creating primary connection

Tue Dec 1 13:57:53 2020 DEBUG: Parsing connection information...

Tue Dec 1 13:57:53 2020 DEBUG: user: postgres

Tue Dec 1 13:57:53 2020 DEBUG: password: *****

Tue Dec 1 13:57:53 2020 DEBUG: dbname: postgres

Tue Dec 1 13:57:53 2020 DEBUG: hostaddr: 127.0.0.1

Tue Dec 1 13:57:53 2020 DEBUG: port: 5432

Tue Dec 1 13:57:53 2020 DEBUG: Creating DB connection: user=postgres password=abcd-1234  hostaddr=127.0.0.1 port=5432 dbname=postgres

Tue Dec 1 13:57:53 2020 DEBUG: Database sanity check

 

I try to upgrade it by sql. But still on failed. 

 

[root@slbwcbnos2 sql]# pwd

/data/postgres/new_package/pgAgent-4.0.0-Source/sql

[root@slbwcbnos2 sql]# ls -al

total 36

drwxr-xr-x 2 501 games    90 Dec  1 14:25 .

drwxr-xr-x 6 501 games   298 Jul 12  2018 ..

-rw-r--r-- 1 501 games   538 Jul 12  2018 pgagent--3.4--4.0.sql

-rw-r--r-- 1 501 games 27772 Jul 12  2018 pgagent.sql

-rw-r--r-- 1 501 games  2603 Jul 12  2018 pgagent--unpackaged--4.0.sql

 

postgres=# \i pgagent--3.4--4.0.sql

Use "CREATE EXTENSION pgagent UPDATE" to load this file.

postgres=# CREATE EXTENSION pgagent UPDATE;

ERROR:  syntax error at or near "UPDATE"

LINE 1: CREATE EXTENSION pgagent UPDATE;

                                 ^

postgres=# CREATE EXTENSION pgagent UPDATE;

ERROR:  syntax error at or near "UPDATE"

LINE 1: CREATE EXTENSION pgagent UPDATE;

                                 ^

postgres=# CREATE EXTENSION pgagent ;

ERROR:  extension "pgagent" has no installation script nor update path for version "3.4"

postgres=#

 

After failed. I try to build and compile the latest version 4.2.

 Build and install successfully. But I still on unable upgrade current 3.4 version.   

 I could create extension successfully on a new database.

 

 

postgres=# \i pgagent--3.4--4.2.sql

Use "CREATE EXTENSION pgagent UPDATE" to load this file.

postgres=# CREATE EXTENSION pgagent UPDATE;

ERROR:  syntax error at or near "UPDATE"

LINE 1: CREATE EXTENSION pgagent UPDATE;

                                 ^

postgres=# CREATE EXTENSION pgagent ;

ERROR:  relation "pga_jobagent" already exists

postgres=#

 

 

 

 

I think the message is wrong (which I'll fix if you can confirm) - it should be:

 

 ALTER EXTENSION pgagent UPDATE; 

 

--

Dave Page
Blog: http://pgsnake.blogspot.com
Twitter: @pgsnake

EDB: http://www.enterprisedb.com


 

--

Dave Page
Blog: http://pgsnake.blogspot.com
Twitter: @pgsnake

EDB: http://www.enterprisedb.com


 

--

Dave Page
Blog: http://pgsnake.blogspot.com
Twitter: @pgsnake

EDB: http://www.enterprisedb.com


 

--

Dave Page
Blog: http://pgsnake.blogspot.com
Twitter: @pgsnake

EDB: http://www.enterprisedb.com


 

--

Dave Page
Blog: http://pgsnake.blogspot.com
Twitter: @pgsnake

EDB: http://www.enterprisedb.com



--
Dave Page
Blog: http://pgsnake.blogspot.com
Twitter: @pgsnake

EDB: http://www.enterprisedb.com

Вложения

В списке pgadmin-support по дате отправления:

Предыдущее
От: Neel Patel
Дата:
Сообщение: Re: PgAgent Jobs randomly failing to start
Следующее
От: Akshay Joshi
Дата:
Сообщение: ERDTool alpha version for trial