Re: Use-after-free issue in postgres_fdw

Поиск
Список
Период
Сортировка
От Matheus Alcantara
Тема Re: Use-after-free issue in postgres_fdw
Дата
Msg-id 65f59349-2ef6-4ddc-82a0-691ca30743e6@gmail.com
обсуждение
Ответ на Re: Use-after-free issue in postgres_fdw  (Chao Li <li.evan.chao@gmail.com>)
Ответы Re: Use-after-free issue in postgres_fdw
Список pgsql-hackers
On 20/03/26 05:27, Chao Li wrote:
> I can reproduce the server crash following your procedure, and I traced the problem.
> 
> The issue is that, during select * from ft1, pgfdw_reject_incomplete_xact_state_change() calls
disconnect_pg_server(),which destroys conn and sets ConnCacheEntry->conn = NULL, but does not update
PgFdwScanState->conn.As a result, when "close c1" is executed later, PgFdwScanState->conn points to stale memory with
randomcontents.
 
> 
> I am not sure we should still allow further commands to run after select * from ft1, given that it has already
raised:"ERROR:  connection to server "loopback" was lost”. Maybe we should not keep going as if the connection were
stillthere.
 
> 
> I am not very familiar with the FDW code, so I am not ready to suggest a concrete fix. But it seems wrong to let
laterpaths keep using PgFdwScanState->conn after select * from ft1 has already failed with connection loss. My guess is
thatwe either need to invalidate all dependent state when disconnect_pg_server() runs, or otherwise prevent later
cleanuppaths from touching the cached PGconn *.
 
> 

Although I agree with this I think that it will be a quite invasive 
change to fix this issue, considering that it should be back-ported.

I see two ways to implement this:

1. ForeignScanState->conn points to a ConnCacheEntry and it access the 
PGConn via ConnCacheEntry->conn, so it can check if != NULL before using.

2. Make pgfdw_reject_incomplete_xact_state_change() or 
disconnect_pg_server() receive a PgFdwConnState and add a new field on 
this state to represent that the connection is closed and them check 
this field on the proper code paths.

With the patch proposed on the previous email the server don't crash 
and any use of PgFdwScanState->conn will make the command to fail with 
something like this:

    ERROR:  08006: no connection to the server
    CONTEXT:  remote SQL command: CLOSE c1
    LOCATION:  pgfdw_report_internal, connection.c:1059


So I don't think that this change would be worth, or I'm missing 
something? What do you think?

--
Matheus Alcantara
EDB: https://www.enterprisedb.com



В списке pgsql-hackers по дате отправления: