On Sun, Oct 08, 2023 at 10:00:03PM -0700, David G. Johnston wrote:
> On Sun, Oct 8, 2023 at 9:10 PM Noah Misch <noah@leadboat.com> wrote:
> > I didn't think of any phrasing that clearly explained things without the
> > reader consulting the code. I considered these:
I'm leaning toward:
"socket file descriptor out of range for select(): %d" [style guide violation]
A true style guide purist might bury it in a detail message:
ERROR: socket file descriptor out of range: %d
DETAIL: select() accepts file descriptors from 0 to 1023, inclusive, in this build.
HINT: Try fewer concurrent database clients.
> > "socket file descriptor out of range: %d" [what range?]
> >
> Quick drive-by...but it seems that < 0 is a distinctly different problem
> than exceeding FD_SETSIZE. I'm unsure what would cause the former but the
> error for the later seems like:
>
> error: "Requested socket file descriptor %d exceeds connection limit of
> %d", fd, FD_SETSIZE-1
> hint: Reduce the requested number of concurrent connections
>
> In short, the concept of range doesn't seem applicable here. There is an
> error state at the max, and some invalid system state condition where the
> position within a set is somehow negative. These should be separated -
> with the < 0 check happening first.
I view it as: the range of select()-able file descriptors is [0,FD_SETSIZE).
Negative is out of range.
> And apparently this condition isn't applicable when dealing with jobs in
> connect_slot?
For both pgbench.c and parallel_slot.c, there are sufficient negative-FD
checks elsewhere in the file. Ideally, either both files would have redundant
checks, or neither file would. I didn't feel the need to mess with that part
of the status quo.
> Also, I see that for connections we immediately issue FD_SET
> so this check on the boundary of the file descriptor makes sense. But the
> remaining code in connect_slot doesn't involve FD_SET so the test for the
> file descriptor falling within its maximum seems to be coming out of
> nowhere. Likely this is all good, and the lack of symmetry is just due to
> the natural progressive development of the code, but it stands out to the
> uninitiated looking at this patch.
True. The approach in parallel_slot.c is to check the FD number each time it
opens a socket, after which its loop with FD_SET() doesn't recheck. That's a
bit more efficient than the pgbench.c way, because each file may call FD_SET()
many times per socket. Again, I didn't mess with that part of the status quo.