Discussion: libpq v17 PQsocketPoll timeout is not granular enough
Hi. I've noticed [that new libpq API in v17 beta1][1], and I wanted to use it to replace an existing Boost.ASIO-based async polling of the connection's socket, waiting for notifications. The use case is PostgreSQL LISTEN/NOTIFY as a simple message queue; the code needs to be cross-platform on Windows and Linux, and my goal is to eliminate the Boost.ASIO dependency for that and use just libpq.

PQsocketPoll() being based on time_t, it has only second resolution, AFAIK, despite the [underlying implementation in fe-misc.c][2] supporting at least milliseconds.

My use case: clients post "requests" to the "queue" (i.e. a PostgreSQL table), which triggers NOTIFY messages waited on by "servers"; those servers then inform clients, via further notifications on per-request channels, about the processing status of their requests. Second resolution on the long-lived servers is OK. But on the client side, waiting 1s or more to learn whether a server picked up a request is too long/slow. I'd need millisecond timeouts for that.

The doc for PQsocketPoll() mentions a different use case for that API, but I think it would be a pity if this unreleased API couldn't be made to accommodate sub-second timeouts and more use cases like the above, especially given that libpq v17 isn't out yet. I may come late to the game, but hopefully it is not too late.

Thoughts? Thanks, --DD

[1]: https://www.postgresql.org/docs/17/libpq-connect.html#LIBPQ-PQSOCKETPOLL
[2]: https://github.com/postgres/postgres/blob/master/src/interfaces/libpq/fe-misc.c#L1086
Dominique Devienne <ddevienne@gmail.com> writes:

> PQsocketPoll() being based on time_t, it has only second resolution, AFAIK.
> Despite the [underlying implementation in fe-misc.c][2] supporting at
> least milliseconds.
> ...
> But I think it would be a pity if that unreleased API couldn't be made to
> accommodate sub-second timeouts and more use-cases, like above.
> Especially given that libpq v17 isn't out yet. I may come late to the
> game, but hopefully it is not too late.

This is an interesting suggestion, but I think yes it's too late. We're already past beta1, and this would require some fairly fundamental rethinking, since there's no easy substitute for type time_t that has millisecond resolution. (The callers do want to specify an end time, not a timeout interval, since some of them loop around PQsocketPoll and don't want the end time to slip.)

I guess conceivably we could use gettimeofday() and struct timeval instead of time() and time_t, but it'd touch a lot of places in libpq and it'd make some of the calculations a lot more complex. Maybe a better idea would be to convert to using our src/include/portability/instr_time.h abstraction? But that would be problematic for outside callers.

In any case this doesn't seem like a sane thing to be redesigning post-beta. A few months ago maybe we'd have done it, but ...

regards, tom lane
Bummer… I didn't presume to suggest an API before, but simply adding an extra int carrying a milliseconds offset from the time_t is simple, and trivial to plug into the implementation I saw. Callers who don't care can simply pass zero, while I could pass a computed time_t and ms offset using <chrono>. No need for fancy types, IMHO.

Aren't betas precisely for the purpose of exposing APIs to those like myself to vet them? And this is beta1, i.e. the first one. My €0.02
On Mon, Jun 10, 2024 at 6:18 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Dominique Devienne <ddevienne@gmail.com> writes:
>> PQsocketPoll() being based on time_t, it has only second resolution, AFAIK.
>> Despite the [underlying implementation in fe-misc.c][2] supporting at
>> least milliseconds.
>> ...
>> But I think it would be a pity if that unreleased API couldn't be made to
>> accommodate sub-second timeouts and more use-cases, like above.
>> Especially given that libpq v17 isn't out yet. I may come late to the
>> game, but hopefully it is not too late.
>
> This is an interesting suggestion, but I think yes it's too late.
> We're already past beta1 and this'd require some fairly fundamental
> rethinking, since there's no easy substitute for type time_t that has
> millisecond resolution. (The callers do want to specify an end time
> not a timeout interval, since some of them loop around PQsocketPoll
> and don't want the end time to slip.)
>
> I guess conceivably we could use gettimeofday() and struct timeval
> instead of time() and time_t, but it'd touch a lot of places in
> libpq and it'd make some of the calculations a lot more complex.
> Maybe a better idea would be to convert to using our
> src/include/portability/instr_time.h abstraction? But that
> would be problematic for outside callers.
>
> In any case this doesn't seem like a sane thing to be redesigning
> post-beta. A few months ago maybe we'd have done it, but ...
>
> regards, tom lane
On 6/10/24 11:43, Dominique Devienne wrote:
> Bummer… I didn't presume to suggest an API before, but simply adding an
> extra int with the milliseconds offset from the time_t is simple, and
> trivial to plug into the implementation I saw. Callers who don't care
> can simply pass zero, while I could pass a computed time_t and ms offset
> using <chrono>. No need for fancy types imho. Aren't betas precisely for

https://www.postgresql.org/developer/beta/

"PostgreSQL beta and release candidate releases are pre-release testing versions before the community makes a new release generally available. They are feature-frozen (i.e. no new features are added), and we release these to the public for testing before our final release. PostgreSQL beta and release candidate releases are not meant for use in production systems."

> the purpose of exposing APIs to those like myself to vet them? And this is
> beta1, i.e. the first one. My €0.02

--
Adrian Klaver
adrian.klaver@aklaver.com
On Tue, Jun 11, 2024 at 2:36 AM Dominique Devienne <ddevienne@gmail.com> wrote:
> Hi. I've noticed [that libpq API in v17 beta1][1], and wanted to use
> it to replace an existing Boost.ASIO-based async polling of the
> connection's socket, waiting for notifications. The use case being
> using PostgreSQL LISTEN/NOTIFY for a simple message queue. The code
> needs to be cross-platform Windows and Linux. My goal would be to
> eliminate that Boost.ASIO dependency for that, to use just libpq.

One idea I have wondered about is why you wouldn't just use poll() directly, if you want a facility that works on all known operating systems without extra portability libraries. Windows has WSAPoll(), which AFAIK was designed to be a Unix-compatible drop-in replacement, requiring just a rename but otherwise having the same macros, struct, etc.

For some period of time, people who had to deal with socket connection events (that includes us) avoided it, with the Curl guys' blog being the most often cited public explanation for why:

https://daniel.haxx.se/blog/2012/10/10/wsapoll-is-broken/

However, as far as I know, that was fixed ~4 years ago:

https://learn.microsoft.com/en-us/windows/win32/api/winsock2/nf-winsock2-wsapoll

"Note As of Windows 10 version 2004, when a TCP socket fails to connect, (POLLHUP | POLLERR | POLLWRNORM) is indicated."

I wonder if that means that it's now completely usable on all reasonable versions of the OS. I think so? I don't use Windows myself; my interest in this topic is that I have a slow-moving background project to figure out how and when to remove all remaining uses of select() from our tree, and this one is on my hit list.

> PQsocketPoll() being based on time_t, it has only second resolution, AFAIK.
> Despite the [underlying implementation in fe-misc.c][2] supporting at
> least milliseconds.

Yeah, that is not nice and your complaint is very reasonable, and we should probably do something like what Tom suggested.
Hmm, but if what I speculated above is true, I wonder if the extern function is even worth its bits... but I don't know how to determine that completely.
Thomas Munro <thomas.munro@gmail.com> writes:
> On Tue, Jun 11, 2024 at 2:36 AM Dominique Devienne <ddevienne@gmail.com> wrote:
>> PQsocketPoll() being based on time_t, it has only second resolution, AFAIK.
>> Despite the [underlying implementation in fe-misc.c][2] supporting at
>> least milliseconds.
> Yeah, that is not nice and your complaint is very reasonable, and we
> should probably do something like what Tom suggested.
> Hmm, but if what I speculated above is true, I wonder if the extern
> function is even worth its bits... but I don't know how to determine
> that completely.

I think if we're going to change anything at all here, we should define the external API in microseconds not milliseconds. The lesson we need to be taking from this is that system calls come and go, but libpq API is forever ;-). Somebody could conceivably want sub-millisecond wait resolution within the lifespan of libpq.

regards, tom lane
Tom:

On Tue, 11 Jun 2024 at 01:49, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> I think if we're going to change anything at all here, we should
> define the external API in microseconds not milliseconds. The lesson
> we need to be taking from this is that system calls come and go,
> but libpq API is forever ;-). Somebody could conceivably want
> sub-millisecond wait resolution within the lifespan of libpq.

I was thinking about that, but since the sub-second field needs at least 20 bits for microseconds, so a 32-bit int in practice, I would suggest nanoseconds, which fit in 32 bits too. Sure, nanos seem like too much for current needs, but they push the future problem further down the road: nearly forever for comms between different machines, until someone develops FTL networks.

Francisco Olarte.