Andres Freund wrote:
> On 2014-12-31 18:35:38 +0530, Amit Kapila wrote:
> > + PQsetnonblocking(connSlot[0].connection, 1);
> > +
> > + for (i = 1; i < concurrentCons; i++)
> > + {
> > + connSlot[i].connection = connectDatabase(dbname, host, port, username,
> > + prompt_password, progname, false);
> > +
> > + PQsetnonblocking(connSlot[i].connection, 1);
> > + connSlot[i].isFree = true;
> > + connSlot[i].sock = PQsocket(connSlot[i].connection);
> > + }
>
> Are you sure about this global PQsetnonblocking()? This means that you
> might not be able to send queries... And you don't seem to be waiting
> for sockets waiting for writes in the select loop - which means you
> might end up being stuck waiting for reads when you haven't submitted
> the query.
>
> I think you might need a more complex select() loop. On nonfree
> connections also wait for writes if PQflush() returns != 0.
I removed the PQsetnonblocking() calls. They were a misunderstanding, I
think.
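
For the archives, had the connections stayed nonblocking, the loop
Andres describes would need roughly the following. This is only a
sketch, not what was committed: SlotSketch, wantWrite, and
build_fd_sets are hypothetical names (only sock and isFree come from
the quoted patch), and wantWrite is assumed to be set whenever
PQflush() returned 1 for that connection.

```c
#include <stdbool.h>
#include <sys/select.h>

/* Hypothetical slot layout; only sock and isFree appear in the patch. */
typedef struct
{
    int  sock;       /* socket descriptor, as returned by PQsocket() */
    bool isFree;     /* true if no command is in progress on the slot */
    bool wantWrite;  /* assumed: set when PQflush() returned 1 */
} SlotSketch;

/*
 * Fill the read and write sets for every busy slot.  Busy slots are
 * always watched for reads; they are also watched for writes when
 * unsent data is still queued.  Returns the highest descriptor added,
 * or -1 if no slot is busy.
 */
static int
build_fd_sets(const SlotSketch *slots, int nslots,
              fd_set *rset, fd_set *wset)
{
    int maxfd = -1;

    FD_ZERO(rset);
    FD_ZERO(wset);

    for (int i = 0; i < nslots; i++)
    {
        if (slots[i].isFree)
            continue;           /* nothing pending on this connection */

        FD_SET(slots[i].sock, rset);
        if (slots[i].wantWrite)
            FD_SET(slots[i].sock, wset);

        if (slots[i].sock > maxfd)
            maxfd = slots[i].sock;
    }

    return maxfd;
}
```

The caller would then run select(maxfd + 1, &rset, &wset, NULL, NULL),
call PQflush() again on each writable socket, and PQconsumeInput() on
each readable one.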
> > +/*
> > + * GetIdleSlot
> > + * Process the slot list, if any free slot is available then return
> > + * the slotid else perform the select on all the socket's and wait
> > + * until atleast one slot becomes available.
> > + */
> > +static int
> > +GetIdleSlot(ParallelSlot *pSlot, int max_slot, const char *dbname,
> > + const char *progname, bool completedb)
> > +{
> > + int i;
> > + fd_set slotset;
>
>
> Hm, you probably need to limit -j to FD_SETSIZE - 1 or so.
Without the check, I tried running with 1500 connections, and select()
didn't even blink: everything worked fine, vacuuming 1500 tables in
parallel on a set of 2000 tables. I'm not sure what the actual limit
is, but my FD_SETSIZE says 1024, so I was clearly over it. (I tried to
run it with -j2000, but the server wouldn't start with that many
connections. I didn't try any intermediate numbers.) Anyway, I added
the check.
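
The check amounts to something like the sketch below. The helper name
jobs_fit_in_fd_set is hypothetical and this is not necessarily the
code as pushed. Note that POSIX leaves FD_SET() undefined for
descriptors at or above FD_SETSIZE, so a run that appears to work
beyond that limit is not evidence that it is safe.

```c
#include <stdbool.h>
#include <sys/select.h>

/*
 * Hypothetical helper: report whether `jobs` parallel connections can
 * all be tracked in a single fd_set.  The upper bound follows the
 * "FD_SETSIZE - 1 or so" suggested in the review.
 */
static bool
jobs_fit_in_fd_set(int jobs)
{
    return jobs >= 1 && jobs <= FD_SETSIZE - 1;
}
```

A caller would reject the -j value up front, before opening any
connections, when this returns false.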
I fixed some more minor issues and pushed.
--
Álvaro Herrera http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services