Dear Kato-san,
Thank you for your interest!
> > I also want you to review the postgres_fdw part,
> > but I think it should not be attached because cfbot cannot understand
> > such a dependency
> > and will throw build error. Do you know how to deal with them in this
> > case?
>
> I don't know how to deal with them, but I hope you will attach the PoC,
> as it may be easier to review.
OK, I attached the PoC along with the dependent patches. Please see the zip file.
The add_helth_check_... patch is written by me, and the other two patches are
simply copied from [1].
In the new callback function, ConnectionHash is scanned sequentially and
WaitEventSetWait() is called to check for the WL_SOCKET_CLOSED socket event.
This event is added by the dependent patches.
===
How to use
===
I'll briefly explain how to use it.
1. Boot two postmaster processes. One is the coordinator, and the other is the worker.
2. Set remote_servers_connection_check_interval to a non-zero value on the coordinator.
3. Create tables on the worker DB cluster.
4. Create a foreign server, user mapping, and foreign table on the coordinator.
5. Connect to the coordinator via psql.
6. Open a transaction and access the foreign tables.
7. Run the "pg_ctl stop" command against the worker DB cluster.
8. Execute some commands that do not access a foreign table.
9. Finally, the following output will be shown:
ERROR: Postgres foreign server XXX might be down.
===
Examples for some steps
===
3. at worker
```
postgres=# \d
        List of relations
 Schema |  Name  | Type  | Owner
--------+--------+-------+--------
 public | remote | table | hayato
(1 row)
```
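The worker-side table could be created with something like the following minimal sketch. Only the table name "remote" comes from the listing above; the integer column and the inserted row are assumptions inferred from the query output later in this mail.

```sql
-- Run on the worker.
-- The column definition is an assumption; only the table name is given above.
CREATE TABLE remote (id integer);

-- One row, so that a SELECT through the foreign table returns something.
INSERT INTO remote VALUES (1);
```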
4. at coordinator
```
postgres=# select * from pg_foreign_server ;
  oid  | srvname | srvowner | srvfdw | srvtype | srvversion | srvacl |         srvoptions
-------+---------+----------+--------+---------+------------+--------+-----------------------------
 16406 | remote  |       10 |  16402 |         |            |        | {port=5433,dbname=postgres}
(1 row)

postgres=# select * from pg_user_mapping ;
  oid  | umuser | umserver |   umoptions
-------+--------+----------+---------------
 16407 |     10 |    16406 | {user=hayato}
(1 row)

postgres=# \d
           List of relations
 Schema |  Name  |     Type      | Owner
--------+--------+---------------+--------
 public | local  | table         | hayato
 public | remote | foreign table | hayato
(2 rows)
```
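For reference, catalog contents like the above could be produced by statements along the following lines. This is only a sketch: the option values are taken from the catalog output, while the column definitions of both tables are assumptions.

```sql
-- Run on the coordinator.
CREATE EXTENSION IF NOT EXISTS postgres_fdw;

-- Options match the srvoptions column shown above.
CREATE SERVER remote FOREIGN DATA WRAPPER postgres_fdw
    OPTIONS (port '5433', dbname 'postgres');

-- Assumes the session user is the role that should own the mapping.
CREATE USER MAPPING FOR CURRENT_USER SERVER remote
    OPTIONS (user 'hayato');

-- The column list is an assumption; it must match the table on the worker.
CREATE FOREIGN TABLE remote (id integer) SERVER remote;

-- A plain local table, used to run a command that does not touch the foreign server.
CREATE TABLE local (id integer);
```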
6-9. at coordinator
```
postgres=# begin;
BEGIN
postgres=*# select * from remote ;
 id
----
  1
(1 row)
postgres=*# select * from local ;
ERROR: Postgres foreign server remote might be down.
postgres=!#
```
Note that some keepalive settings are needed
if you want to detect events such as a broken network cable.
As far as I understand, the following parameters need to be set as server options:
* keepalives_idle
* keepalives_count
* keepalives_interval
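
A sketch of setting these options on the foreign server from the example above. The numeric values are arbitrary placeholders chosen for illustration, not recommendations.

```sql
-- Run on the coordinator; "remote" is the foreign server from the example.
-- Values are in seconds (idle, interval) and probe count; all three are placeholders.
ALTER SERVER remote OPTIONS (
    ADD keepalives_idle '5',
    ADD keepalives_interval '2',
    ADD keepalives_count '3'
);
```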
[1]: https://commitfest.postgresql.org/35/3098/
Best Regards,
Hayato Kuroda
FUJITSU LIMITED