Thread: Segmentation fault when max_parallel degree is very High

Segmentation fault when max_parallel degree is very High

From
Dilip Kumar
Date:
When the parallel degree is set very high, say 70000, there is a segmentation fault in the parallel code,
and that is because a type cast is missing in the code.

Take a look at the test code below:

create table abd(n int) with (parallel_degree=70000);
insert into abd values (generate_series(1,1000000)); analyze abd; vacuum abd;
set max_parallel_degree=70000;
explain analyze verbose select * from abd where n<=1;

server closed the connection unexpectedly
This probably means the server terminated abnormally
before or while processing the request.
The connection to the server was lost. Attempting reset: LOG:  server process (PID 41906) was terminated by signal 11: Segmentation fault
DETAIL:  Failed process was running: explain analyze verbose select * from abd where n<=1;


This crashes in the ExecParallelSetupTupleQueues function:

for (i = 0; i < pcxt->nworkers; ++i)
{
    ...
    mq = shm_mq_create(tqueuespace + i * PARALLEL_TUPLE_QUEUE_SIZE, (Size)PARALLEL_TUPLE_QUEUE_SIZE);
    ...
}

Here i is an int, but the argument to shm_mq_create is a Size. Once the worker index goes beyond 32767, i * 65536 overflows the int range, so the code accesses illegal memory and either crashes or corrupts some memory. The fix is to add a type cast:

i * PARALLEL_TUPLE_QUEUE_SIZE  -->  (Size)i * PARALLEL_TUPLE_QUEUE_SIZE
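As a rough illustration of the arithmetic described above, the following C sketch checks where the offset stops fitting in a 32-bit signed int and computes the same offset in size_t. The function names are hypothetical and not taken from the PostgreSQL source; QUEUE_SIZE stands in for PARALLEL_TUPLE_QUEUE_SIZE and is assumed to be 65536, as the arithmetic in this message implies.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed stand-in for PARALLEL_TUPLE_QUEUE_SIZE (65536 per the message). */
#define QUEUE_SIZE 65536

/*
 * Hypothetical helper: true if i * QUEUE_SIZE does not fit in a 32-bit
 * signed int. The product is formed in 64 bits so the check itself
 * never overflows.
 */
static bool
offset_overflows_int32(int i)
{
    int64_t wide = (int64_t) i * QUEUE_SIZE;

    return wide > INT32_MAX;
}

/*
 * Hypothetical helper: the offset computed in size_t, which is what the
 * proposed (Size) cast amounts to. Unsigned arithmetic is well defined,
 * but it can still wrap if size_t is too narrow for the product.
 */
static size_t
offset_as_size(int i)
{
    return (size_t) i * QUEUE_SIZE;
}
```

Under these assumptions, index 32767 still fits in an int (32767 * 65536 = 2147418112), while index 32768 gives 2147483648, which is one past INT32_MAX.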

The attached patch fixes this issue. Apart from this spot, I have added type casts in other places wherever they are needed.
The casts in those other places also fix another issue (ERROR:  requested shared memory size overflows size_t),
described in the mail thread below:

http://www.postgresql.org/message-id/570BACFC.6020305@enterprisedb.com


--
Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com

Re: Segmentation fault when max_parallel degree is very High

From
Tom Lane
Date:
Dilip Kumar <dilipbalaut@gmail.com> writes:
> When parallel degree is set to very high say 70000, there is a segmentation
> fault in parallel code,
> and that is because type casting is missing in the code..

I'd say the cause is not having a sane range limit on the GUC.

> or corrupt some memory. Need to typecast
> i * PARALLEL_TUPLE_QUEUE_SIZE  --> (Size)i * PARALLEL_TUPLE_QUEUE_SIZE and
> this will fix

That might "fix" it on 64-bit machines, but not 32-bit.
        regards, tom lane
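
To make the 32-bit versus 64-bit point concrete, here is a small C sketch that models a 32-bit Size with uint32_t and a 64-bit Size with uint64_t. The function names are hypothetical, and QUEUE_SIZE is an assumed stand-in for PARALLEL_TUPLE_QUEUE_SIZE with the value 65536 implied earlier in the thread.

```c
#include <stdint.h>

/* Assumed stand-in for PARALLEL_TUPLE_QUEUE_SIZE. */
#define QUEUE_SIZE 65536

/*
 * Hypothetical model of the offset on a platform with a 32-bit Size:
 * the result is reduced modulo 2^32, so large indexes silently wrap.
 */
static uint32_t
offset_with_32bit_size(uint32_t i)
{
    return i * (uint32_t) QUEUE_SIZE;
}

/* The same offset with a 64-bit Size: the full product is kept. */
static uint64_t
offset_with_64bit_size(uint32_t i)
{
    return (uint64_t) i * QUEUE_SIZE;
}
```

With 70000 workers the last index is 69999. In this model the 64-bit result is 4587454464, while the 32-bit result wraps to 292487168, an offset that points at the wrong place rather than raising any error.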



Re: Segmentation fault when max_parallel degree is very High

From
Amit Kapila
Date:
On Wed, May 4, 2016 at 8:31 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>
> Dilip Kumar <dilipbalaut@gmail.com> writes:
> > When parallel degree is set to very high say 70000, there is a segmentation
> > fault in parallel code,
> > and that is because type casting is missing in the code..
>
> I'd say the cause is not having a sane range limit on the GUC.
>

I think it might not be advisable to have this value more than the number of CPU cores, so how about limiting it to 512 or 1024?
 

With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com

Re: Segmentation fault when max_parallel degree is very High

From
Robert Haas
Date:
On Wed, May 4, 2016 at 11:01 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Dilip Kumar <dilipbalaut@gmail.com> writes:
>> When parallel degree is set to very high say 70000, there is a segmentation
>> fault in parallel code,
>> and that is because type casting is missing in the code..
>
> I'd say the cause is not having a sane range limit on the GUC.
>
>> or corrupt some memory. Need to typecast
>> i * PARALLEL_TUPLE_QUEUE_SIZE  --> (Size)i * PARALLEL_TUPLE_QUEUE_SIZE and
>> this will fix
>
> That might "fix" it on 64-bit machines, but not 32-bit.

Yeah, I think what we should do here is use mul_size(), which will
error out instead of crashing.
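
The idea of an overflow-checked multiply can be sketched in standalone C as follows. This is not PostgreSQL's mul_size(): checked_mul_size is a hypothetical name, and it signals overflow through its return value, whereas the message above describes mul_size() as erroring out.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper: multiply two sizes, detecting overflow of size_t.
 * Returns true and stores the product in *result on success; returns
 * false and leaves *result untouched if the product would not fit.
 */
static bool
checked_mul_size(size_t a, size_t b, size_t *result)
{
    /* a * b overflows exactly when a > SIZE_MAX / b (for b != 0). */
    if (b != 0 && a > SIZE_MAX / b)
        return false;

    *result = a * b;
    return true;
}
```

A caller would test the return value and report an error instead of using a wrapped offset. Because the check is written against SIZE_MAX, it behaves the same way whether size_t is 32 or 64 bits wide.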

Putting a range limit on the GUC is a good idea, too, but I like
having overflow checks built into these code paths as a backstop, in
case a value that we think is a safe upper limit turns out to be less
safe than we think ... especially on 32-bit platforms.

I'll go do that, and also limit the maximum parallel degree to 1024,
which ought to be enough for anyone (see what I did there?).
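
A quick way to see why a limit of 1024 is comfortable is to check the worst-case product at compile time. The sketch below is only an illustration: MAX_DEGREE, QUEUE_SIZE and degree_in_range are hypothetical names, QUEUE_SIZE assumes the 65536 value implied earlier in the thread, and this is not how PostgreSQL declares GUC limits.

```c
#include <stdbool.h>
#include <stdint.h>

#define MAX_DEGREE 1024   /* upper limit proposed in this message */
#define QUEUE_SIZE 65536  /* assumed stand-in for PARALLEL_TUPLE_QUEUE_SIZE */

/* C11 compile-time check: the worst-case total still fits in a 32-bit int. */
_Static_assert((int64_t) MAX_DEGREE * QUEUE_SIZE <= INT32_MAX,
               "worst-case queue space must fit in a 32-bit int");

/* Hypothetical range check of the kind a GUC limit would enforce. */
static bool
degree_in_range(int requested)
{
    return requested >= 0 && requested <= MAX_DEGREE;
}
```

At the limit, 1024 * 65536 is 67108864 bytes (64 MiB), well inside the int range, while a request such as 70000 is rejected before any offset arithmetic happens.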

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company