Discussion: Slow query and wrong row estimates for CTE

Slow query and wrong row estimates for CTE

From: Dane Foster
Date:
Hello,

I'm seeking guidance on how to improve the performance of a slow query, and I'd like some other sets of eyes to confirm that what I wrote does what I intend.

According to the PostgreSQL wiki there is a set of metadata that I should provide to help you help me. So let's begin there.

PostgreSQL version: PostgreSQL 12.5 on x86_64-pc-linux-gnu, compiled by gcc (GCC) 4.8.5 20150623 (Red Hat 4.8.5-11), 64-bit

Full table and index schema:
CREATE TABLE attempt_scores (
id bigint GENERATED ALWAYS AS IDENTITY PRIMARY KEY,
attempt_report_id bigint NOT NULL,
score_value double precision NOT NULL,
created_at timestamptz NOT NULL DEFAULT now(),
attempt_report_updated_at timestamptz NOT NULL,
student_id int NOT NULL,
course_id int NOT NULL,
assignment_id int NOT NULL,
score_name citext NOT NULL CHECK (length(trim(score_name)) > 0),
attempted_by citext NOT NULL CHECK (length(trim(attempted_by)) > 0),
CONSTRAINT for_upsert UNIQUE (attempt_report_id, score_name)
);
CREATE INDEX ON attempt_scores (attempt_report_updated_at);
COMMENT ON TABLE attempt_scores IS
$$The collection of assignment scores extracted from the LMS database.$$;
COMMENT ON COLUMN attempt_scores.attempt_report_id IS
$$Each assignment attempt has an associated attempt report (attempt_reports)
where the scores of the attempt are recorded. This column is the pk value from
that table.$$;
COMMENT ON COLUMN attempt_scores.score_value IS $$The score's value.$$;
COMMENT ON COLUMN attempt_scores.created_at IS $$The timestamp the record was created.$$;
COMMENT ON COLUMN attempt_scores.student_id IS $$The student's ID.$$;
COMMENT ON COLUMN attempt_scores.course_id IS $$The course's primary key in the LMS database.$$;
COMMENT ON COLUMN attempt_scores.assignment_id IS $$The assignment's primary key in the LMS database.$$;
COMMENT ON COLUMN attempt_scores.score_name IS $$The source/name of the score.$$;
COMMENT ON COLUMN attempt_scores.attempted_by IS 'The users.role column in LMS';
COMMENT ON COLUMN attempt_scores.attempt_report_updated_at IS
$$The timestamp value of attempt_reports.updated_at on the LMS side. We use it
to find new rows added since the last time we exported to Salesforce.$$;

Table metadata:
relname     | relpages | reltuples | relallvisible | relkind | relnatts | relhassubclass | reloptions | pg_table_size
----------------+----------+-----------+---------------+---------+----------+----------------+------------+---------------
 attempt_scores |   130235 |   9352640 |             0 | r       |       10 | f              | [NULL]     |    1067180032

Other context: The PostgreSQL database is an Amazon RDS instance.

Next up is the query and a description of what it's supposed to do.

-- What this query is supposed to do is compute averages for a set of scoring/learning metrics, but it's not so
-- straightforward. There is an umbrella metric that summarises the others, called the student performance index (SPI),
-- and the folks who want this info want the averages to be driven by the SPI. So the basic algorithm is that for each
-- student/assignment pair, find the assignment that has the highest SPI then use that to collect and average the
-- component metrics.
EXPLAIN (ANALYZE, BUFFERS)
WITH max_spi AS (
  SELECT student_id, assignment_id, max(score_value) spi
  FROM attempt_scores
  WHERE score_name = 'student_performance_index'
  GROUP BY student_id, assignment_id
  HAVING max(score_value) > 0
), reports AS (
  SELECT max(attempt_report_id) attempt_report_id, max(score_value) spi
  FROM max_spi m NATURAL JOIN attempt_scores
  WHERE score_value = m.spi
  GROUP BY student_id, assignment_id
)
SELECT
  avg(spi) spi,
  avg(CASE score_name WHEN 'digital_clinical_experience' THEN score_value END) dce,
  avg(CASE score_name WHEN 'tier1_subjective_data_collection' THEN score_value END) sdc
FROM reports NATURAL JOIN attempt_scores;

Finally, the EXPLAIN output and some links.
QUERY PLAN                                                                                                                        
---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
 Aggregate  (cost=672426.02..672426.03 rows=1 width=24) (actual time=903359.923..903368.957 rows=1 loops=1)
   Buffers: shared hit=6167172 read=4199539, temp read=99551 written=99678
   I/O Timings: read=839121.853
   ->  Nested Loop  (cost=672389.80..672425.91 rows=8 width=37) (actual time=36633.920..885232.956 rows=7034196 loops=1)
         Buffers: shared hit=6167172 read=4199539, temp read=99551 written=99678
         I/O Timings: read=839121.853
         ->  GroupAggregate  (cost=672389.37..672389.39 rows=1 width=24) (actual time=36628.945..41432.502 rows=938244 loops=1)
               Group Key: attempt_scores_2.student_id, attempt_scores_2.assignment_id
               Buffers: shared hit=191072 read=329960, temp read=99551 written=99678
               I/O Timings: read=18210.866
               ->  Sort  (cost=672389.37..672389.37 rows=1 width=24) (actual time=36628.111..39562.421 rows=2500309 loops=1)
                     Sort Key: attempt_scores_2.student_id, attempt_scores_2.assignment_id
                     Sort Method: external merge  Disk: 83232kB
                     Buffers: shared hit=191072 read=329960, temp read=99551 written=99678
                     I/O Timings: read=18210.866
                     ->  Gather  (cost=425676.58..672389.36 rows=1 width=24) (actual time=25405.650..34716.694 rows=2500309 loops=1)
                           Workers Planned: 2
                           Workers Launched: 2
                           Buffers: shared hit=191072 read=329960, temp read=78197 written=78260
                           I/O Timings: read=18210.866
                           ->  Hash Join  (cost=424676.58..671389.26 rows=1 width=24) (actual time=25169.930..34121.825 rows=833436 loops=3)
                                 Hash Cond: ((attempt_scores_1.student_id = attempt_scores_2.student_id) AND (attempt_scores_1.assignment_id = attempt_scores_2.assignment_id) AND (attempt_scores_1.score_value = (max(attempt_scores_2.score_value))))
                                 Buffers: shared hit=191072 read=329960, temp read=78197 written=78260
                                 I/O Timings: read=18210.866
                                 ->  Parallel Seq Scan on attempt_scores attempt_scores_1  (cost=0.00..169204.33 rows=3896933 width=24) (actual time=0.013..5775.887 rows=3118127 loops=3)
                                       Buffers: shared hit=41594 read=88641
                                       I/O Timings: read=14598.128
                                 ->  Hash  (cost=419397.94..419397.94 rows=235808 width=16) (actual time=25160.038..25160.555 rows=938244 loops=3)
                                       Buckets: 131072 (originally 131072)  Batches: 16 (originally 4)  Memory Usage: 3786kB
                                       Buffers: shared hit=149408 read=241311, temp read=15801 written=27438
                                       I/O Timings: read=3610.261
                                       ->  GroupAggregate  (cost=392800.28..417039.86 rows=235808 width=16) (actual time=23175.268..24589.121 rows=938244 loops=3)
                                             Group Key: attempt_scores_2.student_id, attempt_scores_2.assignment_id
                                             Filter: (max(attempt_scores_2.score_value) > '0'::double precision)
                                             Rows Removed by Filter: 2908
                                             Buffers: shared hit=149408 read=241311, temp read=15801 written=15864
                                             I/O Timings: read=3610.261
                                             ->  Sort  (cost=392800.28..395879.64 rows=1231743 width=16) (actual time=23174.917..23654.979 rows=1206355 loops=3)
                                                   Sort Key: attempt_scores_2.student_id, attempt_scores_2.assignment_id
                                                   Sort Method: external merge  Disk: 30760kB
                                                   Worker 0:  Sort Method: external merge  Disk: 30760kB
                                                   Worker 1:  Sort Method: external merge  Disk: 30760kB
                                                   Buffers: shared hit=149408 read=241311, temp read=15801 written=15864
                                                   I/O Timings: read=3610.261
                                                   ->  Seq Scan on attempt_scores attempt_scores_2  (cost=0.00..247143.00 rows=1231743 width=16) (actual time=16980.832..21585.313 rows=1206355 loops=3)
                                                         Filter: (score_name = 'student_performance_index'::citext)
                                                         Rows Removed by Filter: 8148027
                                                         Buffers: shared hit=149394 read=241311
                                                         I/O Timings: read=3610.261
         ->  Index Scan using for_upsert on attempt_scores  (cost=0.43..36.42 rows=8 width=37) (actual time=0.394..0.896 rows=7 loops=938244)
               Index Cond: (attempt_report_id = (max(attempt_scores_1.attempt_report_id)))
               Buffers: shared hit=5976100 read=3869579
               I/O Timings: read=820910.987
 Planning Time: 14.357 ms
 Execution Time: 903426.284 ms
(55 rows)

Time: 903571.689 ms (15:03.572)
The explain output visualized on explain.depesz.com: https://explain.depesz.com/s/onPZ

--

Thank you for your time and consideration,


Dane

Re: Slow query and wrong row estimates for CTE

From: Justin Pryzby
Date:
On Mon, Feb 15, 2021 at 12:49:29PM -0500, Dane Foster wrote:
> PostgreSQL version: PostgreSQL 12.5 on x86_64-pc-linux-gnu, compiled by gcc
> (GCC) 4.8.5 20150623 (Red Hat 4.8.5-11), 64-bit

> EXPLAIN (ANALYZE, BUFFERS)
> WITH max_spi AS (

Since v12, CTEs are usually inlined by default.
I suspect it doesn't help, but as an experiment you could try
WITH .. AS MATERIALIZED.

You could try instead: CREATE TEMPORARY TABLE + ANALYZE, which will use
statistics that "WITH" CTE's don't have (like the rowcount after GROUPing).
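
For example, a rough, untested sketch of that temp-table variant, reusing your
CTE definitions as-is:

CREATE TEMPORARY TABLE max_spi AS
SELECT student_id, assignment_id, max(score_value) spi
FROM attempt_scores
WHERE score_name = 'student_performance_index'
GROUP BY student_id, assignment_id
HAVING max(score_value) > 0;
ANALYZE max_spi;

CREATE TEMPORARY TABLE reports AS
SELECT max(attempt_report_id) attempt_report_id, max(score_value) spi
FROM max_spi m NATURAL JOIN attempt_scores
WHERE score_value = m.spi
GROUP BY student_id, assignment_id;
ANALYZE reports;

SELECT
  avg(spi) spi,
  avg(CASE score_name WHEN 'digital_clinical_experience' THEN score_value END) dce,
  avg(CASE score_name WHEN 'tier1_subjective_data_collection' THEN score_value END) sdc
FROM reports NATURAL JOIN attempt_scores;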

>  Aggregate  (cost=672426.02..672426.03 rows=1 width=24) (actual time=903359.923..903368.957 rows=1 loops=1)
>    Buffers: shared hit=6167172 read=4199539, temp read=99551 written=99678
>    I/O Timings: read=839121.853

This shows that most of the time is spent in I/O (839s/903s).

>   ->  Nested Loop  (cost=672389.80..672425.91 rows=8 width=37) (actual time=36633.920..885232.956 rows=7034196 loops=1)
>         Buffers: shared hit=6167172 read=4199539, temp read=99551 written=99678
...
>                            ->  Hash Join  (cost=424676.58..671389.26 rows=1 width=24) (actual time=25169.930..34121.825 rows=833436 loops=3)
>                                  Hash Cond: ((attempt_scores_1.student_id = attempt_scores_2.student_id) AND (attempt_scores_1.assignment_id = attempt_scores_2.assignment_id) AND (attempt_scores_1.score_value = (max(attempt_scores_2.score_value))))

This shows that it estimated 1 row but got 833k, so the plan may be no good.
As another quick experiment, you could try SET enable_nestloop=off.

>         ->  Index Scan using for_upsert on attempt_scores  (cost=0.43..36.42 rows=8 width=37) (actual time=0.394..0.896 rows=7 loops=938244)
>               Index Cond: (attempt_report_id = (max(attempt_scores_1.attempt_report_id)))
>               Buffers: shared hit=5976100 read=3869579
>               I/O Timings: read=820910.987

This shows where most of your I/O time is from.
I think you could maybe improve this by clustering the table on for_upsert and
analyzing.  Very possibly your "id" and "time" columns are all correlated.
They might already/automatically be correlated - you can check the correlation
stat:
https://wiki.postgresql.org/wiki/Slow_Query_Questions#Statistics:_n_distinct.2C_MCV.2C_histogram
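
For instance (hypothetical statements, assuming the table/index names above):

-- Check physical-order correlation; values near 1.0 or -1.0 mean the heap is
-- already roughly in index order for that column.
SELECT attname, n_distinct, correlation
FROM pg_stats
WHERE tablename = 'attempt_scores'
  AND attname IN ('attempt_report_id', 'created_at', 'attempt_report_updated_at');

-- If correlation on attempt_report_id is poor, rewriting the heap in index
-- order may cut the random reads on the for_upsert index scan.
-- (CLUSTER takes an exclusive lock while it rewrites the table.)
CLUSTER attempt_scores USING for_upsert;
ANALYZE attempt_scores;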

Without looking closely, an index might help: student_id,assignment_id
That'd avoid the sort, and maybe change the shape of the whole plan.
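Something like this, untested:

CREATE INDEX ON attempt_scores (student_id, assignment_id);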

-- 
Justin



Re: Slow query and wrong row estimates for CTE

From: Michael Lewis
Date:
                                                   Sort Method: external merge  Disk: 30760kB
                                                   Worker 0:  Sort Method: external merge  Disk: 30760kB
                                                   Worker 1:  Sort Method: external merge  Disk: 30760kB

If you can increase work_mem, even setting it temporarily higher for the session or transaction, that may dramatically change the plan. The advice given by Justin, particularly about row estimates, would be wise to pursue. I'd wonder how selective the condition score_name = 'student_performance_index' is in filtering out many of the 9.3 million tuples in that table, and whether an index with that as the leading column, or just an index on that column, would be helpful. You'd need to look at pg_stats for the table and see how many distinct values there are, and whether student_performance_index is relatively high or low (or not present) in the MCVs list.
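
For example (hypothetical statements; pick a work_mem that fits your memory budget):

-- Raise work_mem only for this session so the sorts/hashes can stay in memory.
SET work_mem = '256MB';

-- How many distinct score_name values there are, and whether
-- 'student_performance_index' appears in the most-common-values list.
SELECT n_distinct, most_common_vals, most_common_freqs
FROM pg_stats
WHERE tablename = 'attempt_scores' AND attname = 'score_name';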

I am not sure if your query does what you want it to do, as I admit I didn't follow your explanation of the desired behavior. My hunch is that you want to make use of a window function and get rid of one of the CTEs.
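
Roughly the shape I have in mind, as an untested sketch only; it picks each
student/assignment's best SPI row directly, which may or may not match your
intended semantics:

WITH ranked AS (
  SELECT attempt_report_id, score_value AS spi,
         row_number() OVER (PARTITION BY student_id, assignment_id
                            ORDER BY score_value DESC, attempt_report_id DESC) AS rn
  FROM attempt_scores
  WHERE score_name = 'student_performance_index' AND score_value > 0
)
SELECT
  avg(r.spi) spi,
  avg(CASE s.score_name WHEN 'digital_clinical_experience' THEN s.score_value END) dce,
  avg(CASE s.score_name WHEN 'tier1_subjective_data_collection' THEN s.score_value END) sdc
FROM ranked r
JOIN attempt_scores s ON s.attempt_report_id = r.attempt_report_id
WHERE r.rn = 1;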

Re: Slow query and wrong row estimates for CTE

From: Dane Foster
Date:
On Mon, Feb 15, 2021 at 5:32 PM Justin Pryzby <pryzby@telsasoft.com> wrote:
...

Without looking closely, an index might help: student_id,assignment_id
That'd avoid the sort, and maybe change the shape of the whole plan.
I tried that prior to posting on the forum and it didn't make a difference. 🙁

I'll try your other suggestions later today or tomorrow. I will keep you posted.


Thanks,

Dane

Re: Slow query and wrong row estimates for CTE

From: Dane Foster
Date:

On Tue, Feb 16, 2021 at 10:13 AM Michael Lewis <mlewis@entrata.com> wrote:
                                                   Sort Method: external merge  Disk: 30760kB
                                                   Worker 0:  Sort Method: external merge  Disk: 30760kB
                                                   Worker 1:  Sort Method: external merge  Disk: 30760kB

If you can increase work_mem, even setting it temporarily higher for the session or transaction, that may dramatically change the plan.
I will try increasing work_mem for the session later today.
The advice given by Justin particularly about row estimates would be wise to pursue.
 
I'd wonder how selective that condition of score_name = 'student_performance_index' is in filtering out many of the 9.3 million tuples in that table and if an index with that as the leading column, or just an index on that column would be helpful.
There are 1,206,355 rows where score_name='student_performance_index'.
You'd need to look at pg_stats for the table and see how many distinct values, and if student_performance_index is relatively high or low (or not present) in the MCVs list.
I will look into that.
 
I am not sure if your query does what you want it to do as I admit I didn't follow your explanation of the desired behavior. My hunch is that you want to make use of a window function and get rid of one of the CTEs.
If you could tell me what part(s) are unclear I would appreciate it so that I can write a better comment.

Thank you sooo much for all the feedback. It is greatly appreciated!
Sincerely,

Dane

Re: Slow query and wrong row estimates for CTE

From: Dane Foster
Date:
Short conclusion:
Switching from CTEs to temporary tables and analyzing reduced the runtime from 15 minutes to about 1.5 minutes.


Longer conclusion:
  • I experimented w/ materializing the CTEs and it helped at the margins but did not significantly contribute to a reduction in runtime.
  • No clustering was required because once I switched to temporary tables the new plan no longer used the for_upsert index.
  • Increasing work_mem to 100MB (up from 4MB) helped at the margins (i.e., some 100's of millisecond improvement) but did not represent a significant reduction in the runtime.
  • It wasn't obvious to me which window function would be appropriate for the problem I was trying to solve therefore I didn't experiment w/ that approach.
  • The selectivity of score_name='student_performance_index' was not enough for the planner to choose an index over doing a FTS.
Finally, thank you both for helping me bring this poor performing query to heel. Your insights were helpful and greatly appreciated.

Sincerely,

Dane



Re: Slow query and wrong row estimates for CTE

From: Dane Foster
Date:

A small update (see below/inline).


On Tue, Feb 16, 2021 at 2:11 PM Dane Foster <studdugie@gmail.com> wrote:
Short conclusion:
Switching from CTEs to temporary tables and analyzing reduced the runtime from 15 minutes to about 1.5 minutes.


Longer conclusion:
  • I experimented w/ materializing the CTEs and it helped at the margins but did not significantly contribute to a reduction in runtime.
  • No clustering was required because once I switched to temporary tables the new plan no longer used the for_upsert index.
  • Increasing work_mem to 100MB (up from 4MB) helped at the margins (i.e., some 100's of millisecond improvement) but did not represent a significant reduction in the runtime.
  • It wasn't obvious to me which window function would be appropriate for the problem I was trying to solve therefore I didn't experiment w/ that approach.
I want to update/correct this statement: 
  • The selectivity of score_name='student_performance_index' was not enough for the planner to choose an index over doing a FTS.
I added a partial index (WHERE score_name='student_performance_index'::citext) and that had a dramatic impact. That part of the query went from ~12 seconds to ~1 second.
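Roughly along these lines (the key columns here are approximate; the predicate is the part that mattered):

CREATE INDEX ON attempt_scores (student_id, assignment_id, score_value)
  WHERE score_name = 'student_performance_index'::citext;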
Finally, thank you both for helping me bring this poor performing query to heel. Your insights were helpful and greatly appreciated.

Sincerely,

Dane



Re: Slow query and wrong row estimates for CTE

From: Yoan SULTAN
Date:
Hi all, 

This is my first post on this mailing list; I really enjoy it.
I wanted to add some details and answers to this discussion.

On Feb 17, 2021 at 17:52, Dane Foster <studdugie@gmail.com> wrote:

A small update (see below/inline).


On Tue, Feb 16, 2021 at 2:11 PM Dane Foster <studdugie@gmail.com> wrote:
Short conclusion:
Switching from CTEs to temporary tables and analyzing reduced the runtime from 15 minutes to about 1.5 minutes.

The attempt_scores table is pretty big and it is scanned 3 times; try to rewrite the query to reduce the number of passes over this table. For example:
EXPLAIN (ANALYZE, BUFFERS)
WITH reports AS (
  SELECT student_id, assignment_id, max(score_value) FILTER (WHERE score_name = 'student_performance_index'),
                                    max(attempt_report_id) maxid,
                                    max(score_value) spi
  FROM attempt_scores
  GROUP BY student_id, assignment_id
  HAVING max(score_value) > 0 AND max(score_value) FILTER (WHERE score_name = 'student_performance_index') = max(score_value)
)
SELECT
  avg(spi) spi,
  avg(CASE score_name WHEN 'digital_clinical_experience' THEN score_value END) dce,
  avg(CASE score_name WHEN 'tier1_subjective_data_collection' THEN score_value END) sdc
FROM attempt_scores JOIN reports ON reports.maxid=attempt_scores.attempt_report_id;

Also, I would continue to increase work_mem to 200MB until the external merge is not required.
SET WORK_MEM='200MB'; -- to change only at session level

Longer conclusion:
  • I experimented w/ materializing the CTEs and it helped at the margins but did not significantly contribute to a reduction in runtime.
  • No clustering was required because once I switched to temporary tables the new plan no longer used the for_upsert index.
  • Increasing work_mem to 100MB (up from 4MB) helped at the margins (i.e., some 100's of millisecond improvement) but did not represent a significant reduction in the runtime.
  • It wasn't obvious to me which window function would be appropriate for the problem I was trying to solve therefore I didn't experiment w/ that approach.
I want to update/correct this statement: 
  • The selectivity of score_name='student_performance_index' was not enough for the planner to choose an index over doing a FTS.
I added a partial index (WHERE score_name='student_performance_index'::citext) and that had a dramatic impact. That part of the query went from ~12 seconds to ~1 second.

Another way to generate perf. gains on this query: CREATE INDEX ON attempt_scores USING hash (score_name); -- since score_name doesn't seem to have a big cardinality

Finally, thank you both for helping me bring this poor performing query to heel. Your insights were helpful and greatly appreciated.

Sincerely,

Dane





--
Regards,
Yo.

Re: Slow query and wrong row estimates for CTE

From: Dane Foster
Date:
On Wed, Feb 17, 2021 at 1:37 PM Yoan SULTAN <yeyeyo@gmail.com> wrote:
Hi all, 

This is my first post on this mailing list; I really enjoy it.
I wanted to add some details and answers to this discussion.
I'm happy you've decided to join the conversation, and that you've opened up an entirely new avenue for me to investigate and learn from. I feel like I'm about to level up my SQL-fu! 😊
 
On Feb 17, 2021 at 17:52, Dane Foster <studdugie@gmail.com> wrote:

A small update (see below/inline).


On Tue, Feb 16, 2021 at 2:11 PM Dane Foster <studdugie@gmail.com> wrote:
Short conclusion:
Switching from CTEs to temporary tables and analyzing reduced the runtime from 15 minutes to about 1.5 minutes.

The attempt_scores table is pretty big and it is scanned 3 times; try to rewrite the query to reduce the number of passes over this table. For example:
EXPLAIN (ANALYZE, BUFFERS)
WITH reports AS (
  SELECT student_id, assignment_id, max(score_value) FILTER (WHERE score_name = 'student_performance_index'),
                                    max(attempt_report_id) maxid,
                                    max(score_value) spi
  FROM attempt_scores
  GROUP BY student_id, assignment_id
  HAVING max(score_value) > 0 AND max(score_value) FILTER (WHERE score_name = 'student_performance_index') = max(score_value)
)
SELECT
  avg(spi) spi,
  avg(CASE score_name WHEN 'digital_clinical_experience' THEN score_value END) dce,
  avg(CASE score_name WHEN 'tier1_subjective_data_collection' THEN score_value END) sdc
FROM attempt_scores JOIN reports ON reports.maxid=attempt_scores.attempt_report_id;
Given: HAVING max(score_value) > 0 AND max(score_value) FILTER (WHERE score_name = 'student_performance_index') = max(score_value)

Why: max(score_value) FILTER (WHERE score_name = 'student_performance_index') but no FILTER clause on: max(attempt_report_id)?

Some context for my question. I'm new to aggregate expressions, so I don't have a strong mental model for what's happening. So let me tell you what I think is happening and you can correct me.

The new HAVING clause that you've added ensures that for each student/assignment pair/group we are selecting the max SPI value (i.e., score_name = 'student_performance_index'). Therefore, isn't the FILTER clause in the SELECT section redundant? And if it's not redundant, then why isn't it necessary for max(attempt_report_id)?


Also, I would continue to increase work_mem to 200MB until the external merge is not required.
SET WORK_MEM='200MB'; -- to change only at session level

Longer conclusion:
  • I experimented w/ materializing the CTEs and it helped at the margins but did not significantly contribute to a reduction in runtime.
  • No clustering was required because once I switched to temporary tables the new plan no longer used the for_upsert index.
  • Increasing work_mem to 100MB (up from 4MB) helped at the margins (i.e., some 100's of millisecond improvement) but did not represent a significant reduction in the runtime.
  • It wasn't obvious to me which window function would be appropriate for the problem I was trying to solve therefore I didn't experiment w/ that approach.
I want to update/correct this statement: 
  • The selectivity of score_name='student_performance_index' was not enough for the planner to choose an index over doing a FTS.
I added a partial index (WHERE score_name='student_performance_index'::citext) and that had a dramatic impact. That part of the query went from ~12 seconds to ~1 second.

Another way to generate perf. gains on this query: CREATE INDEX ON attempt_scores USING hash (score_name); -- since score_name doesn't seem to have a big cardinality

Finally, thank you both for helping me bring this poor performing query to heel. Your insights were helpful and greatly appreciated.

Sincerely,

Dane


Again, thanks for joining the conversation. I look forward to hearing from you.

Sincerely,

Dane 

Re: Slow query and wrong row estimates for CTE

From: Yoan SULTAN
Date:

You are totally right, the  max(score_value) FILTER (WHERE score_name = 'student_performance_index')  in the SELECT clause is redundant.
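
So the earlier sketch can be simplified to something like this (still untested):

EXPLAIN (ANALYZE, BUFFERS)
WITH reports AS (
  SELECT student_id, assignment_id,
         max(attempt_report_id) maxid,
         max(score_value) spi
  FROM attempt_scores
  GROUP BY student_id, assignment_id
  HAVING max(score_value) > 0 AND max(score_value) FILTER (WHERE score_name = 'student_performance_index') = max(score_value)
)
SELECT
  avg(spi) spi,
  avg(CASE score_name WHEN 'digital_clinical_experience' THEN score_value END) dce,
  avg(CASE score_name WHEN 'tier1_subjective_data_collection' THEN score_value END) sdc
FROM attempt_scores JOIN reports ON reports.maxid=attempt_scores.attempt_report_id;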



--
Regards,
Yo.