Thread: [HACKERS] Parallel Bitmap Heap Scans segfaults due to (tbm->dsa==NULL) on PostgreSQL 10
[HACKERS] Parallel Bitmap Heap Scans segfaults due to (tbm->dsa==NULL) on PostgreSQL 10
From
Tomas Vondra
Date:
Hi,

It seems that Q19 from TPC-H is consistently failing with segfaults due
to calling tbm_prepare_shared_iterate() with (tbm->dsa==NULL).

I'm not very familiar with how the dsa is initialized and passed around,
but I only see the failures when the bitmap is constructed by a mix of
BitmapAnd and BitmapOr operations.

Another interesting observation is that setting force_parallel_mode=on
may not be enough - there really need to be multiple parallel workers,
which is why the simple query does cpu_tuple_cost=1.

Attached is a bunch of files:

1) details for "full" query:

    * query.sql
    * plan.txt
    * backtrace.txt

2) details for the "minimal" query triggering the issue:

    * query-minimal.sql
    * plan-minimal.txt
    * backtrace-minimal.txt

regards

--
Tomas Vondra                  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
Attachments
On Thu, Oct 12, 2017 at 4:31 PM, Tomas Vondra <tomas.vondra@2ndquadrant.com> wrote:
> Hi,
>
> It seems that Q19 from TPC-H is consistently failing with segfaults due
> to calling tbm_prepare_shared_iterate() with (tbm->dsa==NULL).
>
> I'm not very familiar with how the dsa is initialized and passed around,
> but I only see the failures when the bitmap is constructed by a mix of
> BitmapAnd and BitmapOr operations.
>

I think I have found the issue: bitmap_subplan_mark_shared is not
properly pushing the isshared flag to the lower-level bitmap index node,
and because of that tbm_create is passed a NULL dsa while creating the
tidbitmap. So this problem occurs only in a very specific combination of
BitmapOr and BitmapAnd, when BitmapAnd is the first subplan for the
BitmapOr. If BitmapIndex is the first subplan under BitmapOr then
there is no problem, because the BitmapOr node will create the tbm by
itself and isshared is set for BitmapOr.

The attached patch fixes the issue for me. I will thoroughly test this
patch with other scenarios as well. Thanks for reporting.

--
Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com
Attachments
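Editorial illustration of the diagnosis above: a minimal toy model in C, not the PostgreSQL source and not the attached patch (which is not reproduced in this archive). The type PlanNode and the functions mark_shared_shallow, mark_shared_deep and leftmost_leaf are hypothetical names invented for this sketch. It assumes, following the description in the message, that the marking routine sets isshared on a BitmapOr but does not continue into its first subplan, so an index scan below a BitmapAnd below that BitmapOr is left unmarked.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified stand-ins for the bitmap plan node kinds. */
typedef enum { BITMAP_INDEX_SCAN, BITMAP_AND, BITMAP_OR } NodeKind;

typedef struct PlanNode {
    NodeKind kind;
    bool isshared;                 /* build the bitmap in shared memory? */
    struct PlanNode *first_child;  /* only the first subplan is modeled;
                                    * NULL for an index scan */
} PlanNode;

/*
 * Assumed failure mode: descend through BitmapAnd, but stop as soon as
 * a BitmapOr (or an index scan) has been marked.
 */
void mark_shared_shallow(PlanNode *plan)
{
    if (plan->kind == BITMAP_AND)
        mark_shared_shallow(plan->first_child);
    else
        plan->isshared = true;
}

/*
 * Sketch of the repaired behavior: mark the BitmapOr and keep
 * descending into its first subplan until an index scan is reached.
 */
void mark_shared_deep(PlanNode *plan)
{
    if (plan->kind == BITMAP_AND)
        mark_shared_deep(plan->first_child);
    else if (plan->kind == BITMAP_OR) {
        plan->isshared = true;
        mark_shared_deep(plan->first_child);
    } else
        plan->isshared = true;
}

/* Follow first subplans down to the index scan at the bottom. */
const PlanNode *leftmost_leaf(const PlanNode *plan)
{
    while (plan->first_child != NULL)
        plan = plan->first_child;
    return plan;
}
```

For a tree of the shape BitmapOr -> BitmapAnd -> index scan, mark_shared_shallow leaves the index scan with isshared unset, which corresponds to the case where the tidbitmap would be created without a dsa; mark_shared_deep sets the flag on that index scan as well.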
On 10/12/2017 02:40 PM, Dilip Kumar wrote:
> On Thu, Oct 12, 2017 at 4:31 PM, Tomas Vondra
> <tomas.vondra@2ndquadrant.com> wrote:
>> Hi,
>>
>> It seems that Q19 from TPC-H is consistently failing with segfaults due
>> to calling tbm_prepare_shared_iterate() with (tbm->dsa==NULL).
>>
>> I'm not very familiar with how the dsa is initialized and passed around,
>> but I only see the failures when the bitmap is constructed by a mix of
>> BitmapAnd and BitmapOr operations.
>>
> I think I have found the issue: bitmap_subplan_mark_shared is not
> properly pushing the isshared flag to the lower-level bitmap index node,
> and because of that tbm_create is passed a NULL dsa while creating the
> tidbitmap. So this problem occurs only in a very specific combination of
> BitmapOr and BitmapAnd, when BitmapAnd is the first subplan for the
> BitmapOr. If BitmapIndex is the first subplan under BitmapOr then
> there is no problem, because the BitmapOr node will create the tbm by
> itself and isshared is set for BitmapOr.
>
> The attached patch fixes the issue for me. I will thoroughly test this
> patch with other scenarios as well. Thanks for reporting.
>

Yep, this fixes the failures for me.

regards

--
Tomas Vondra                  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
On Thu, Oct 12, 2017 at 6:37 PM, Tomas Vondra <tomas.vondra@2ndquadrant.com> wrote:
>
>
> On 10/12/2017 02:40 PM, Dilip Kumar wrote:
>> On Thu, Oct 12, 2017 at 4:31 PM, Tomas Vondra
>> <tomas.vondra@2ndquadrant.com> wrote:
>>> Hi,
>>>
>>> It seems that Q19 from TPC-H is consistently failing with segfaults due
>>> to calling tbm_prepare_shared_iterate() with (tbm->dsa==NULL).
>>>
>>> I'm not very familiar with how the dsa is initialized and passed around,
>>> but I only see the failures when the bitmap is constructed by a mix of
>>> BitmapAnd and BitmapOr operations.
>>>
>> I think I have found the issue: bitmap_subplan_mark_shared is not
>> properly pushing the isshared flag to the lower-level bitmap index node,
>> and because of that tbm_create is passed a NULL dsa while creating the
>> tidbitmap. So this problem occurs only in a very specific combination of
>> BitmapOr and BitmapAnd, when BitmapAnd is the first subplan for the
>> BitmapOr. If BitmapIndex is the first subplan under BitmapOr then
>> there is no problem, because the BitmapOr node will create the tbm by
>> itself and isshared is set for BitmapOr.
>>
>> The attached patch fixes the issue for me. I will thoroughly test this
>> patch with other scenarios as well. Thanks for reporting.
>>
>
> Yep, this fixes the failures for me.
>

Thanks for confirming.

--
Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com
On Thu, Oct 12, 2017 at 9:14 AM, Dilip Kumar <dilipbalaut@gmail.com> wrote:
>> Yep, this fixes the failures for me.
>>
> Thanks for confirming.

Committed and back-patched to v10.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company