RE: [Proposal] Add foreign-server health checks infrastructure

From: kuroda.hayato@fujitsu.com
Subject: RE: [Proposal] Add foreign-server health checks infrastructure
Msg-id: TYAPR01MB58662CD4FD98AA475B3D10F9F59B9@TYAPR01MB5866.jpnprd01.prod.outlook.com
In reply to: Re: [Proposal] Add foreign-server health checks infrastructure  (Shinya Kato <Shinya11.Kato@oss.nttdata.com>)
Responses: Re: [Proposal] Add foreign-server health checks infrastructure  (Shinya Kato <Shinya11.Kato@oss.nttdata.com>)
List: pgsql-hackers

Dear Kato-san,

Thank you for your interest!

> > I also want you to review the postgres_fdw part,
> > but I think it should not be attached because cfbot cannot understand
> > such a dependency
> > and will throw build error. Do you know how to deal with them in this
> > case?
>
> I don't know how to deal with them, but I hope you will attach the PoC,
> as it may be easier to review.

OK, I have attached the PoC along with the patches it depends on. Please see the zip file.
The add_helth_check_... patch was written by me; the other two patches are
simply copied from [1].
In the new callback function, ConnectionHash is scanned sequentially and
WaitEventSetWait() is called to wait for the WL_SOCKET_CLOSED socket event.
That event is added by the dependent patches.

===
How to use
===

I'll briefly explain how to use it.

1. Boot two postmaster processes. One is the coordinator, and the other is the worker.
2. Set remote_servers_connection_check_interval to a non-zero value on the coordinator.
3. Create tables on the worker DB cluster.
4. Create a foreign server, user mapping, and foreign table on the coordinator.
5. Connect to the coordinator via psql.
6. Open a transaction and access the foreign table.
7. Run "pg_ctl stop" against the worker DB cluster.
8. Execute some commands that do not access a foreign table.
9. Finally, you will get the following output:

ERROR:  Postgres foreign server XXX might be down.

===
Example for some of the steps
===

3. at worker

```
postgres=# \d
        List of relations
 Schema |  Name  | Type  | Owner
--------+--------+-------+--------
 public | remote | table | hayato
(1 row)
```

4. at coordinator

```
postgres=# select * from pg_foreign_server ;
  oid  | srvname | srvowner | srvfdw | srvtype | srvversion | srvacl |         srvoptions
-------+---------+----------+--------+---------+------------+--------+-----------------------------
 16406 | remote  |       10 |  16402 |         |            |        | {port=5433,dbname=postgres}
(1 row)

postgres=# select * from pg_user_mapping ;
  oid  | umuser | umserver |   umoptions
-------+--------+----------+---------------
 16407 |     10 |    16406 | {user=hayato}
(1 row)

postgres=# \d
            List of relations
 Schema |  Name  |     Type      | Owner
--------+--------+---------------+--------
 public | local  | table         | hayato
 public | remote | foreign table | hayato
(2 rows)
```

6-9. at coordinator

```
postgres=# begin;
BEGIN
postgres=*# select * from remote ;
 id
----
  1
(1 row)

postgres=*# select * from local ;
ERROR:  Postgres foreign server remote might be down.
postgres=!#
```

Note that some keepalive settings are needed
if you want to detect events such as a disconnected network cable.
As far as I understand, the following parameters need to be set as server options:

* keepalives_idle
* keepalives_count
* keepalives_interval
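
To make the role of these three parameters concrete, the sketch below sets
the TCP socket options that they correspond to on Linux (TCP_KEEPIDLE,
TCP_KEEPCNT and TCP_KEEPINTVL). It is only an illustration under that
assumption: enable_keepalive() is a hypothetical helper, not part of libpq or
of the patches, and other platforms use different option names.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE
#endif
#include <netinet/in.h>
#include <netinet/tcp.h>
#include <sys/socket.h>

/*
 * Hypothetical helper, for illustration only (Linux option names).
 *
 * idle_secs:     idle time before the first probe   (cf. keepalives_idle)
 * interval_secs: time between unanswered probes     (cf. keepalives_interval)
 * count:         probes lost before giving up       (cf. keepalives_count)
 *
 * Returns 0 on success, -1 if any setsockopt() call fails.
 */
int
enable_keepalive(int fd, int idle_secs, int interval_secs, int count)
{
    int         on = 1;

    if (setsockopt(fd, SOL_SOCKET, SO_KEEPALIVE, &on, sizeof(on)) < 0)
        return -1;
    if (setsockopt(fd, IPPROTO_TCP, TCP_KEEPIDLE,
                   &idle_secs, sizeof(idle_secs)) < 0)
        return -1;
    if (setsockopt(fd, IPPROTO_TCP, TCP_KEEPINTVL,
                   &interval_secs, sizeof(interval_secs)) < 0)
        return -1;
    if (setsockopt(fd, IPPROTO_TCP, TCP_KEEPCNT,
                   &count, sizeof(count)) < 0)
        return -1;

    return 0;
}
```

With such settings, a silent peer is declared dead after roughly
idle_secs + interval_secs * count seconds, at which point the socket reports
an error and a check like the one described above can notice it.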

[1]: https://commitfest.postgresql.org/35/3098/

Best Regards,
Hayato Kuroda
FUJITSU LIMITED

