BUG #15102: Performance problem when doing join, index are not used

Поиск
Список
Период
Сортировка
От PG Bug reporting form
Тема BUG #15102: Performance problem when doing join, index are not used
Дата
Msg-id 152052183493.4972.11853150375334373737@wrigleys.postgresql.org
обсуждение исходный текст
Ответы Re: BUG #15102: Performance problem when doing join, index are not used
Список pgsql-bugs
The following bug has been logged on the website:

Bug reference:      15102
Logged by:          Mehdi Rahman
Email address:      mehdirahbvd@gmail.com
PostgreSQL version: 10.3
Operating system:   Windows 2012 r2 std 64 bit
Description:

I did write a query finding the evolution of company global ultimate owner
during years 2007 to 2016. For that, I join a table with ids of all
companies we have to tables with the id of the guo for each year.


The computer is a dedicated server windows 2012 r2 std 64 bit with 64 gb
RAM, 64 bit, 8 cpu x64 and SCSI disks (6 To). As it is dedicated? I did
authorize postgress to use 16 Go RAM and 1 Go per request.

Tables are huge (million of rows) and have indexes on id (queries to create
table and index at end). When doing an explain (see below), the query do not
use my indexes, and runs for half an hour using 1 CPU and barely doing IO.

Here is a sample of data:
Data in tables public.links_xxx_guo is like this. links_2016_guo contains
63075671 rows without duplicates.
"US411568618"    "US174295875L"
"US411652566"    "US542185193"
"US411655168"    "US411849203"
"US410952918"    "US411504721"
"US4-5891774"    "USLEI77366"
"US4-5895366"    "US4-5927464"
"US4-5895406"    "US4-5927464"
"US4-5895441"    "US4-5927464"
"US4-5895468"    "US4-5927464"
"US4-5895661"    "US4-5927464"
"US4-5906622"    "USIA80173288"
"US4-5907020"    "US133257110L"
"US4-5907665"    "US4-0383146"

Data in allsubjects is like this. it contains 124730186 rows without
duplicates.
"AU083774798"
"AU083774798-B0002"
"AU083775357"
"AU083775357-B0001"
"AU083775357-B0002"
"AU083775473"
"AU083775473-B0001"
"AU083775795"
"AU083775795-B0001"
"AU083775795-B0002"
"AU083775795-B0003"
"AU083775795-B0004"
"AU083775795-B0006"
"AU083775928"
"AU083775928-B0003"
"AU083775928-B0004"
"AU083775982"

Here is the query:
select allsubjects."Subsidiary BvD ID" as
SubjectID,links_2007_guo."Shareholder BvD ID" as
GUO_2007,links_2008_guo."Shareholder BvD ID" as
GUO_2008,links_2009_guo."Shareholder BvD ID" as
GUO_2009,links_2010_guo."Shareholder BvD ID" as
GUO_2010,links_2011_guo."Shareholder BvD ID" as
GUO_2011,links_2012_guo."Shareholder BvD ID" as
GUO_2012,links_2013_guo."Shareholder BvD ID" as
GUO_2013,links_2014_guo."Shareholder BvD ID" as
GUO_2014,links_2015_guo."Shareholder BvD ID" as
GUO_2015,links_2016_guo."Shareholder BvD ID" as GUO_2016 into guo50 from
allsubjects left outer join links_2007_guo on allsubjects."Subsidiary BvD
ID" = links_2007_guo."Subsidiary BvD ID" left outer join links_2008_guo on
allsubjects."Subsidiary BvD ID" = links_2008_guo."Subsidiary BvD ID" left
outer join links_2009_guo on allsubjects."Subsidiary BvD ID" =
links_2009_guo."Subsidiary BvD ID" left outer join links_2010_guo on
allsubjects."Subsidiary BvD ID" = links_2010_guo."Subsidiary BvD ID" left
outer join links_2011_guo on allsubjects."Subsidiary BvD ID" =
links_2011_guo."Subsidiary BvD ID" left outer join links_2012_guo on
allsubjects."Subsidiary BvD ID" = links_2012_guo."Subsidiary BvD ID" left
outer join links_2013_guo on allsubjects."Subsidiary BvD ID" =
links_2013_guo."Subsidiary BvD ID" left outer join links_2014_guo on
allsubjects."Subsidiary BvD ID" = links_2014_guo."Subsidiary BvD ID" left
outer join links_2015_guo on allsubjects."Subsidiary BvD ID" =
links_2015_guo."Subsidiary BvD ID" left outer join links_2016_guo on
allsubjects."Subsidiary BvD ID" = links_2016_guo."Subsidiary BvD ID" 

Here is the explain results with real data:
"Merge Left Join  (cost=64186781.61..72528875.02 rows=124730184
width=148)"
"  Merge Cond: ((allsubjects."Subsidiary BvD ID")::text =
(links_2016_guo."Subsidiary BvD ID")::text)"
"  ->  Merge Left Join  (cost=54054997.55..60981441.25 rows=124730184
width=134)"
"        Merge Cond: ((allsubjects."Subsidiary BvD ID")::text =
(links_2015_guo."Subsidiary BvD ID")::text)"
"        ->  Merge Left Join  (cost=45929338.09..51650216.35 rows=124730184
width=120)"
"              Merge Cond: ((allsubjects."Subsidiary BvD ID")::text =
(links_2014_guo."Subsidiary BvD ID")::text)"
"              ->  Merge Left Join  (cost=39700486.68..44416349.85
rows=124730184 width=106)"
"                    Merge Cond: ((allsubjects."Subsidiary BvD ID")::text =
(links_2013_guo."Subsidiary BvD ID")::text)"
"                    ->  Merge Left Join  (cost=34095001.72..37872283.29
rows=124730184 width=92)"
"                          Merge Cond: ((allsubjects."Subsidiary BvD
ID")::text = (links_2012_guo."Subsidiary BvD ID")::text)"
"                          ->  Merge Left Join
(cost=29447210.87..32388567.11 rows=124730184 width=78)"
"                                Merge Cond: ((allsubjects."Subsidiary BvD
ID")::text = (links_2011_guo."Subsidiary BvD ID")::text)"
"                                ->  Merge Left Join
(cost=26186284.29..28442070.80 rows=124730184 width=65)"
"                                      Merge Cond: ((allsubjects."Subsidiary
BvD ID")::text = (links_2010_guo."Subsidiary BvD ID")::text)"
"                                      ->  Merge Left Join
(cost=24091276.15..25805109.28 rows=124730184 width=52)"
"                                            Merge Cond:
((allsubjects."Subsidiary BvD ID")::text = (links_2009_guo."Subsidiary BvD
ID")::text)"
"                                            ->  Merge Left Join
(cost=22434390.29..23652173.77 rows=124730184 width=39)"
"                                                  Merge Cond:
((allsubjects."Subsidiary BvD ID")::text = (links_2008_guo."Subsidiary BvD
ID")::text)"
"                                                  ->  Merge Left Join
(cost=21061837.72..21813642.52 rows=124730184 width=26)"
"                                                        Merge Cond:
((allsubjects."Subsidiary BvD ID")::text = (links_2007_guo."Subsidiary BvD
ID")::text)"
"                                                        ->  Sort
(cost=19932318.48..20244143.94 rows=124730184 width=13)"
"                                                              Sort Key:
allsubjects."Subsidiary BvD ID""
"                                                              ->  Seq Scan
on allsubjects  (cost=0.00..1941633.84 rows=124730184 width=13)"
"                                                        ->  Sort
(cost=1129519.24..1150878.22 rows=8543592 width=25)"
"                                                              Sort Key:
links_2007_guo."Subsidiary BvD ID""
"                                                              ->  Seq Scan
on links_2007_guo  (cost=0.00..145877.92 rows=8543592 width=25)"
"                                                  ->  Sort
(cost=1372552.57..1398244.78 rows=10276881 width=25)"
"                                                        Sort Key:
links_2008_guo."Subsidiary BvD ID""
"                                                        ->  Seq Scan on
links_2008_guo  (cost=0.00..175660.81 rows=10276881 width=25)"
"                                            ->  Sort
(cost=1656885.86..1687589.89 rows=12281613 width=26)"
"                                                  Sort Key:
links_2009_guo."Subsidiary BvD ID""
"                                                  ->  Seq Scan on
links_2009_guo  (cost=0.00..210726.13 rows=12281613 width=26)"
"                                      ->  Sort
(cost=2095008.14..2133362.79 rows=15341861 width=26)"
"                                            Sort Key:
links_2010_guo."Subsidiary BvD ID""
"                                            ->  Seq Scan on links_2010_guo
(cost=0.00..263882.61 rows=15341861 width=26)"
"                                ->  Materialize
(cost=3260926.57..3367710.65 rows=21356816 width=26)"
"                                      ->  Sort
(cost=3260926.57..3314318.61 rows=21356816 width=26)"
"                                            Sort Key:
links_2011_guo."Subsidiary BvD ID""
"                                            ->  Seq Scan on links_2011_guo
(cost=0.00..368939.16 rows=21356816 width=26)"
"                          ->  Materialize  (cost=4647790.85..4797533.67
rows=29948564 width=27)"
"                                ->  Sort  (cost=4647790.85..4722662.26
rows=29948564 width=27)"
"                                      Sort Key: links_2012_guo."Subsidiary
BvD ID""
"                                      ->  Seq Scan on links_2012_guo
(cost=0.00..519326.64 rows=29948564 width=27)"
"                    ->  Materialize  (cost=5605484.97..5784558.15
rows=35814636 width=27)"
"                          ->  Sort  (cost=5605484.97..5695021.56
rows=35814636 width=27)"
"                                Sort Key: links_2013_guo."Subsidiary BvD
ID""
"                                ->  Seq Scan on links_2013_guo
(cost=0.00..622160.36 rows=35814636 width=27)"
"              ->  Materialize  (cost=6228851.41..6426905.59 rows=39610836
width=27)"
"                    ->  Sort  (cost=6228851.41..6327878.50 rows=39610836
width=27)"
"                          Sort Key: links_2014_guo."Subsidiary BvD ID""
"                          ->  Seq Scan on links_2014_guo
(cost=0.00..688528.36 rows=39610836 width=27)"
"        ->  Materialize  (cost=8125659.46..8381013.74 rows=51070856
width=28)"
"              ->  Sort  (cost=8125659.46..8253336.60 rows=51070856
width=28)"
"                    Sort Key: links_2015_guo."Subsidiary BvD ID""
"                    ->  Seq Scan on links_2015_guo  (cost=0.00..888822.56
rows=51070856 width=28)"
"  ->  Materialize  (cost=10131784.05..10447162.41 rows=63075672
width=28)"
"        ->  Sort  (cost=10131784.05..10289473.23 rows=63075672 width=28)"
"              Sort Key: links_2016_guo."Subsidiary BvD ID""
"              ->  Seq Scan on links_2016_guo  (cost=0.00..1097783.72
rows=63075672 width=28)"

Here are the queries to create used table and indexes to reproduce the
problem:
CREATE TABLE public.allsubjects
(
    "Subsidiary BvD ID" character varying(100) COLLATE
pg_catalog."default"
)

CREATE TABLE public.links_2007_guo
(
    "Subsidiary BvD ID" character varying(1500) COLLATE
pg_catalog."default",
    "Shareholder BvD ID" character varying(1500) COLLATE
pg_catalog."default"
)

CREATE TABLE public.links_2008_guo
(
    "Subsidiary BvD ID" character varying(1500) COLLATE
pg_catalog."default",
    "Shareholder BvD ID" character varying(1500) COLLATE
pg_catalog."default"
)

CREATE TABLE public.links_2009_guo
(
    "Subsidiary BvD ID" character varying(1500) COLLATE
pg_catalog."default",
    "Shareholder BvD ID" character varying(1500) COLLATE
pg_catalog."default"
)

CREATE TABLE public.links_2010_guo
(
    "Subsidiary BvD ID" character varying(1500) COLLATE
pg_catalog."default",
    "Shareholder BvD ID" character varying(1500) COLLATE
pg_catalog."default"
)

CREATE TABLE public.links_2011_guo
(
    "Subsidiary BvD ID" character varying(1500) COLLATE
pg_catalog."default",
    "Shareholder BvD ID" character varying(1500) COLLATE
pg_catalog."default"
)

CREATE TABLE public.links_2012_guo
(
    "Subsidiary BvD ID" character varying(1500) COLLATE
pg_catalog."default",
    "Shareholder BvD ID" character varying(1500) COLLATE
pg_catalog."default"
)

CREATE TABLE public.links_2013_guo
(
    "Subsidiary BvD ID" character varying(1500) COLLATE
pg_catalog."default",
    "Shareholder BvD ID" character varying(1500) COLLATE
pg_catalog."default"
)

CREATE TABLE public.links_2014_guo
(
    "Subsidiary BvD ID" character varying(1500) COLLATE
pg_catalog."default",
    "Shareholder BvD ID" character varying(1500) COLLATE
pg_catalog."default"
)

CREATE TABLE public.links_2015_guo
(
    "Subsidiary BvD ID" character varying(1500) COLLATE
pg_catalog."default",
    "Shareholder BvD ID" character varying(1500) COLLATE
pg_catalog."default"
)

CREATE TABLE public.links_2016_guo
(
    "Subsidiary BvD ID" character varying(1500) COLLATE
pg_catalog."default",
    "Shareholder BvD ID" character varying(1500) COLLATE
pg_catalog."default"
)


CREATE INDEX "idx_AllSubjects_SubBvDID"
    ON public.allsubjects USING hash
    ("Subsidiary BvD ID" COLLATE pg_catalog."default")
    TABLESPACE pg_default;

CREATE INDEX idx_links_2007_guo_subbvdid
    ON public.links_2007_guo USING btree
    ("Subsidiary BvD ID" COLLATE pg_catalog."default", "Shareholder BvD ID"
COLLATE pg_catalog."default")
    TABLESPACE pg_default;

CREATE INDEX idx_links_2008_guo_subbvdid
    ON public.links_2008_guo USING btree
    ("Subsidiary BvD ID" COLLATE pg_catalog."default", "Shareholder BvD ID"
COLLATE pg_catalog."default")
    TABLESPACE pg_default;

CREATE INDEX idx_links_2009_guo_subbvdid
    ON public.links_2009_guo USING btree
    ("Subsidiary BvD ID" COLLATE pg_catalog."default", "Shareholder BvD ID"
COLLATE pg_catalog."default")
    TABLESPACE pg_default;

CREATE INDEX idx_links_2010_guo_subbvdid
    ON public.links_2010_guo USING btree
    ("Subsidiary BvD ID" COLLATE pg_catalog."default", "Shareholder BvD ID"
COLLATE pg_catalog."default")
    TABLESPACE pg_default;

CREATE INDEX idx_links_2011_guo_subbvdid
    ON public.links_2011_guo USING btree
    ("Subsidiary BvD ID" COLLATE pg_catalog."default", "Shareholder BvD ID"
COLLATE pg_catalog."default")
    TABLESPACE pg_default;

CREATE INDEX idx_links_2012_guo_subbvdid
    ON public.links_2012_guo USING btree
    ("Subsidiary BvD ID" COLLATE pg_catalog."default", "Shareholder BvD ID"
COLLATE pg_catalog."default")
    TABLESPACE pg_default;

CREATE INDEX idx_links_2013_guo_subbvdid
    ON public.links_2013_guo USING btree
    ("Subsidiary BvD ID" COLLATE pg_catalog."default", "Shareholder BvD ID"
COLLATE pg_catalog."default")
    TABLESPACE pg_default;

CREATE INDEX idx_links_2014_guo_subbvdid
    ON public.links_2014_guo USING btree
    ("Subsidiary BvD ID" COLLATE pg_catalog."default", "Shareholder BvD ID"
COLLATE pg_catalog."default")
    TABLESPACE pg_default;

CREATE INDEX idx_links_2015_guo_subbvdid
    ON public.links_2015_guo USING btree
    ("Subsidiary BvD ID" COLLATE pg_catalog."default", "Shareholder BvD ID"
COLLATE pg_catalog."default")
    TABLESPACE pg_default;

CREATE INDEX idx_links_2016_guo_subbvdid
    ON public.links_2016_guo USING btree
    ("Subsidiary BvD ID" COLLATE pg_catalog."default", "Shareholder BvD ID"
COLLATE pg_catalog."default")
    TABLESPACE pg_default;



В списке pgsql-bugs по дате отправления:

Предыдущее
От: David Rowley
Дата:
Сообщение: Re: 9.6.7 -> 9.6.8 analyze worker behaviour changed?
Следующее
От: Tom Lane
Дата:
Сообщение: Re: BUG #15102: Performance problem when doing join, index are not used