Re: Parallel Seq Scan
From | Robert Haas |
---|---|
Subject | Re: Parallel Seq Scan |
Date | |
Msg-id | CA+TgmobBZ=0n=JcS28hBxVBaSXeZHBQCnxVzCTUSPMe1zsuGdw@mail.gmail.com |
In reply to | Re: Parallel Seq Scan (Amit Kapila <amit.kapila16@gmail.com>) |
Responses | Re: Parallel Seq Scan (Amit Kapila <amit.kapila16@gmail.com>) |
List | pgsql-hackers |

On Thu, Jan 8, 2015 at 6:42 AM, Amit Kapila <amit.kapila16@gmail.com> wrote:
> Are we sure that in such cases we will consume work_mem during
> execution? In cases of parallel_workers we are sure to an extent
> that if we reserve the workers then we will use it during execution.
> Nonetheless, I have proceeded and integrated the parallel_seq scan
> patch with v0.3 of parallel_mode patch posted by you at below link:
> http://www.postgresql.org/message-id/CA+TgmoYmp_=XcJEhvJZt9P8drBgW-pDpjHxBhZA79+M4o-CZQA@mail.gmail.com

That depends on the costing model. It makes no sense to do a parallel
sequential scan on a small relation, because the user backend can scan
the whole thing itself faster than the workers can start up. I suspect
it may also be true that the useful amount of parallelism increases the
larger the relation gets (but maybe not).

> 2. To enable two types of shared memory queues (error queue and
> tuple queue), we need to ensure that we switch to appropriate queue
> during communication of various messages from parallel worker
> to master backend. There are two ways to do it
> a. Save the information about error queue during startup of parallel
> worker (ParallelMain()) and then during error, set the same (switch
> to error queue in errstart() and switch back to tuple queue in
> errfinish() and errstart() in case errstart() doesn't need to
> propagate error).
> b. Do something similar as (a) for tuple queue in printtup or other
> place if any for non-error messages.
> I think approach (a) is slightly better as compared to approach (b) as
> we need to switch many times for tuple queue (for each tuple) and
> there could be multiple places where we need to do the same. For now,
> I have used approach (a) in Patch which needs some more work if we
> agree on the same.

I don't think you should be "switching" queues. The tuples should be
sent to the tuple queue, and errors and notices to the error queue.

> 3. As per current implementation of Parallel_seqscan, it needs to use
> some information from parallel.c which was not exposed, so I have
> exposed the same by moving it to parallel.h. Information that is required
> is as follows:
> ParallelWorkerNumber, FixedParallelState and shm keys -
> This is used to decide the blocks that needs to be scanned.
> We might change it in future the way parallel scan/work distribution
> is done, but I don't see any harm in exposing this information.

Hmm. I can see why ParallelWorkerNumber might need to be exposed, but
the other stuff seems like it shouldn't be.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
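
To illustrate the point above about not "switching" queues, here is a minimal sketch in Python. It is not PostgreSQL code and does not use the shared memory queue API. The names WorkerChannels, send_tuple, send_message and scan_worker are hypothetical and exist only for this example. The idea is that each worker holds both queues for its whole lifetime and every message is routed by its kind, so there is no global "current queue" to set and restore.

```python
from collections import deque
from dataclasses import dataclass, field


@dataclass
class WorkerChannels:
    """Hypothetical stand-in for one worker's pair of queues."""
    tuple_queue: deque = field(default_factory=deque)
    error_queue: deque = field(default_factory=deque)

    def send_tuple(self, row):
        # Data rows always go to the tuple queue.
        self.tuple_queue.append(row)

    def send_message(self, severity, text):
        # Errors and notices always go to the error queue.
        self.error_queue.append((severity, text))


def scan_worker(channels, rows):
    """Toy scan loop: emits rows, and a notice for each missing row."""
    for row in rows:
        if row is None:
            channels.send_message("NOTICE", "skipping empty row")
            continue
        channels.send_tuple(row)


channels = WorkerChannels()
scan_worker(channels, [(1, "a"), None, (2, "b")])
```

Because the destination is chosen at each call site, a notice raised in the middle of a scan cannot end up interleaved with tuple data, and nothing has to be switched back afterwards.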