Re: [PERFORM] Slow query with 3 table joins

Поиск
Список
Период
Сортировка
От Alessandro Ferrucci
Тема Re: [PERFORM] Slow query with 3 table joins
Дата
Msg-id CAMgh6pLFO+Vk6X1wkiK_uRfpNkMgFnCjuwapMcvyc7C5=F2FiA@mail.gmail.com
обсуждение исходный текст
Ответ на Re: [PERFORM] Slow query with 3 table joins  (David Rowley <david.rowley@2ndquadrant.com>)
Список pgsql-performance
Dave -

I had re-ran ANALYZE and VACUUM on all the tables and that did not seem to have fixed the issue (the query still took a long time, however I did not let it finish to produce a full EXPLAIN plan.

However - after creating an index on FIELD(FIELD_NAME,UNIT_ID) and now the query runs very fast ( I changed the FIELD_NAME clause so I would not run into
any caching ). 

The new query is (notice the new FIELD_NAME value to avoid caching).

EXPLAIN (ANALYZE,BUFFERS) SELECT
   UNIT.ID AS UNIT_ID,
   UNIT.UNIT_ID AS UNIT_UNIT_ID,
   UNIT.BATCH_ID AS UNIT_BATCH_ID,
   UNIT.CREATE_DATE AS UNIT_CREATE_DATE,
   UNIT.UPDATE_DATE AS UNIT_UPDATE_DATE
FROM
   UNIT, FIELD, ANSWER
WHERE
   UNIT.ID=FIELD.UNIT_ID AND
   FIELD.ID=ANSWER.FIELD_ID AND
   FIELD.FIELD_NAME='RESP_PH_PREFIX_ID' AND
   ANSWER.ANS='2';

You can find the EXPLAIN plan here:


I believe this fixes the issue as far as I can see.  I'm going to play around with it more and see how it goes.

I wanted to also answer your question as to how many different values there are for the FIELD_NAME (and how many rows of each value there are)

here is it:

SELECT FIELD_NAME,COUNT(*) FROM FIELD GROUP BY FIELD_NAME ORDER BY COUNT;

"PAGE_SERIAL";10315
"SHEETS_PRESENT";10315
"RESP_PH_AREA_ID";10556
"RESP_PH_PREFIX_ID";10559
"RESP_PH_SUFFIX_ID";10560
"H_LOC_ADD_NO_IND";10587
"H_TENURE_RENTED_IND";11162
"H_TENURE_OWNED_MORT_IND";11199
"H_TENURE_OWNED_FREE_IND";11208
"PAPER_JIC_2_TEXT";11413
"PAPER_JIC_1_TEXT";11413
"H_TENURE_OCC_NOPAY_IND";11478
"H_LOC_CHILDREN_IND";11496
"H_LOC_RELATIVES_IND";11496
"H_LOC_TEMPORARY_IND";11500
"H_LOC_NONRELATIVES_IND";11510
"PSEUDO_FIELD_MARGINALIA";87744
"H_SIZE_STATED_INT";207918
"P_REL_NO_IND";825240
"P_REL_YES_IND";825240
"P_REL_OTHER_NONREL_IND";1239894
"P_REL_CHILD_ADOPTED_IND";1239894
"P_REL_CHILD_BIO_IND";1239894
"P_REL_CHILD_FOSTER_IND";1239894
"P_REL_CHILD_STEP_IND";1239894
"P_REL_GRANDCHILD_IND";1239894
"P_REL_HOUSEROOMMATE_IND";1239894
"P_REL_INLAW_CHILD_IND";1239894
"P_REL_INLAW_PARENT_IND";1239894
"P_REL_OTHER_REL_IND";1239894
"P_REL_PARENT_IND";1239894
"P_REL_PARTNER_OPP_IND";1239894
"P_REL_PARTNER_SAME_IND";1239894
"P_REL_SIBLING_IND";1239894
"P_REL_SPOUSE_OPP_IND";1239894
"P_REL_SPOUSE_SAME_IND";1239894
"P_TRBSHR_CORP_NAME";1446204
"P_TRBSHR_YES_IND";1446204
"P_TRBSHR_NO_IND";1446204
"P_LOC_ELSE_COLLEGE_IND";1446204
"P_LOC_ELSE_JAIL_IND";1446204
"P_LOC_ELSE_JOB_IND";1446204
"P_LOC_ELSE_MILITARY_IND";1446204
"P_LOC_ELSE_NO_IND";1446204
"P_TRBENR_YES_IND";1446204
"P_LOC_ELSE_SEASONAL_IND";1446204
"P_LOC_ELSE_NURSINGHOME_IND";1446204
"P_TRBENR_TRIBE_NAME";1446204
"P_TRBENR_NO_IND";1446204
"P_LOC_ELSE_RELATIVES_IND";1446204
"P_LOC_ELSE_OTHER_IND";1446204
"P_RACE_WHITE_IND";1447812
"P_RACE2_TONGAN_IND";1447812
"P_RACE2_AFAM_IND";1447812
"P_RACE2_AIAN_TEXT";1447812
"P_RACE2_ASIANINDIAN_IND";1447812
"P_RACE2_ASIAN_TEXT";1447812
"P_RACE2_BLACK_TEXT";1447812
"P_RACE2_CHAMORRO_IND";1447812
"P_RACE2_CHINESE_IND";1447812
"P_RACE2_COLOMBIAN_IND";1447812
"P_RACE2_CUBAN_IND";1447812
"P_RACE2_DOMINICAN_IND";1447812
"P_RACE2_EGYPTIAN_IND";1447812
"P_RACE2_ENGLISH_IND";1447812
"P_RACE2_ETHIOPIAN_IND";1447812
"P_RACE2_FIJIAN_IND";1447812
"P_RACE2_FILIPINO_IND";1447812
"P_RACE2_FRENCH_IND";1447812
"P_RACE2_GERMAN_IND";1447812
"P_RACE2_HAITIAN_IND";1447812
"P_RACE2_HISP_TEXT";1447812
"P_RACE2_IRANIAN_IND";1447812
"P_RACE2_IRISH_IND";1447812
"P_RACE2_ISRAELI_IND";1447812
"P_RACE2_ITALIAN_IND";1447812
"P_RACE2_JAMAICAN_IND";1447812
"P_RACE2_JAPANESE_IND";1447812
"P_RACE2_KOREAN_IND";1447812
"P_RACE2_LEBANESE_IND";1447812
"P_RACE2_MARSHALLESE_IND";1447812
"P_RACE2_MENA_TEXT";1447812
"P_RACE2_MEXICAN_IND";1447812
"P_RACE2_MOROCCAN_IND";1447812
"P_RACE2_NATHAWAIIAN_IND";1447812
"P_RACE2_NHPI_TEXT";1447812
"P_RACE2_NIGERIAN_IND";1447812
"P_RACE2_POLISH_IND";1447812
"P_RACE2_PUERTORICAN_IND";1447812
"P_RACE2_SALVADORAN_IND";1447812
"P_RACE2_SAMOAN_IND";1447812
"P_RACE2_SOMALI_IND";1447812
"P_RACE2_SOR_TEXT";1447812
"P_RACE2_SYRIAN_IND";1447812
"P_RACE2_VIETNAMESE_IND";1447812
"P_RACE2_WHITE_TEXT";1447812
"P_RACE_AIAN_IND";1447812
"P_RACE_ASIAN_IND";1447812
"P_RACE_BLACK_IND";1447812
"P_RACE_HISP_IND";1447812
"P_RACE_MENA_IND";1447812
"P_RACE_NHPI_IND";1447812
"P_RACE_SOR_IND";1447812
"P_SEX_MALE_IND";2273052
"P_SEX_FEMALE_IND";2273052
"P_MIDDLE_NAME";2273052
"P_LAST_NAME";2273052
"P_FIRST_NAME";2273052
"P_BIRTH_YEAR_INT";2273052
"P_BIRTH_MONTH_INT";2273052
"P_BIRTH_DAY_INT";2273052
"P_AGE_INT";2273052


I want to give a HUGE thanks to everyone who took the time to look at my issue and provide insight and assistance, you folks are truly awesome!



On Wed, Apr 26, 2017 at 12:12 AM, David Rowley <david.rowley@2ndquadrant.com> wrote:
On 26 April 2017 at 15:19, Alessandro Ferrucci
<alessandroferrucci@gmail.com> wrote:
> After about 40 inutes the slow query finally finished and the result of the
> EXPLAIN plan can be found here:
>
> https://explain.depesz.com/s/BX22

> Index Scan using field_unit_id_idx on field  (cost=0.00..8746678.52 rows=850149 width=8) (actual time=0.030..2414345.998 rows=10315 loops=1)"

This estimate seems a long way off. Are the stats up-to-date on the
table? Try again after running: ANALYZE field;

It might also be a good idea to ANALYZE all the tables. Is auto-vacuum
switched on?

The plan in question would work better if you create an index on field
(field_name, unit_id);

but I think if you update the stats the plan will switch.

A HashJoin, hashing "unit"  and index scanning on field_field_name_idx
would have been a much smarter plan choice for the planner to make.

Also how many distinct field_names are there? SELECT COUNT(DISTINCT
field_name) FROM field;

You may want to increase the histogram buckets on that columns if
there are more than 100 field names, and the number of rows with each
field name is highly variable. ALTER TABLE field ALTER COLUMN
field_name SET STATISTICS <n buckets>; 100 is the default, and 10000
is the maximum.


--
 David Rowley                   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services



--
Signed,
Alessandro Ferrucci

В списке pgsql-performance по дате отправления:

Предыдущее
От: Gerardo Herzig
Дата:
Сообщение: Re: [PERFORM] Slow query with 3 table joins
Следующее
От: Matthew Bellew
Дата:
Сообщение: Re: [PERFORM] Slow query with 3 table joins