Re: Random Page Cost and Planner

Поиск
Список
Период
Сортировка
От David Jarvis
Тема Re: Random Page Cost and Planner
Дата
Msg-id AANLkTinexztvMHO-IKfEtM2ni371vgX51sV5sRcHMWG0@mail.gmail.com
обсуждение исходный текст
Ответ на Re: Random Page Cost and Planner  (Rob Wultsch <wultsch@gmail.com>)
Ответы Re: Random Page Cost and Planner
Список pgsql-performance
Hi, Rob.

I tried bumping the effective_cache_size. It made no difference.

My latest attempt at forcing PostgreSQL to use the indexes involved two loops: one to loop over the stations, the other to extract the station data from the measurement table. The outer loop executes in 1.5 seconds. The inner loop does a full table scan for each record in the outer loop:

  FOR station IN
    SELECT
      sc.station_id,
      sc.taken_start,
      sc.taken_end
    FROM
      climate.city c,
      climate.station s,
      climate.station_category sc
    WHERE
      c.id = city_id AND
      earth_distance(
        ll_to_earth(c.latitude_decimal,c.longitude_decimal),
        ll_to_earth(s.latitude_decimal,s.longitude_decimal)) / 1000 <= radius AND
      s.elevation BETWEEN elevation1 AND elevation2 AND
      s.applicable AND
      sc.station_id = s.id AND
      sc.category_id = category_id AND
      extract(YEAR FROM sc.taken_start) >= year1 AND
      extract(YEAR FROM sc.taken_end) <= year2
    ORDER BY
      sc.station_id
  LOOP
    RAISE NOTICE 'B.1. % % %', station.station_id, station.taken_start, station.taken_end;
   
    FOR measure IN
      SELECT
        extract(YEAR FROM m.taken) AS year,
        avg(m.amount) AS amount
      FROM
        climate.measurement m
      WHERE
        m.station_id = station.station_id AND
        m.taken BETWEEN station.taken_start AND station.taken_end AND
        m.category_id = category_id
      GROUP BY
        extract(YEAR FROM m.taken)
    LOOP
      RAISE NOTICE '  B.2. % %', measure.year, measure.amount;
    END LOOP;
  END LOOP;

I thought that the bold lines would have evoked index use. The values used for the inner query:

NOTICE:  B.1. 754 1980-08-01 2001-11-30

When I run the query manually, using constants, it executes in ~25 milliseconds:

SELECT
  extract(YEAR FROM m.taken) AS year,
  avg(m.amount) AS amount
FROM
  climate.measurement m
WHERE
  m.station_id = 754 AND
  m.taken BETWEEN '1980-08-01'::date AND '2001-11-30'::date AND
  m.category_id = 7
GROUP BY
  extract(YEAR FROM m.taken)

With 106 rows it should execute in ~2.65 seconds, which is better than the 5 seconds I get when everything is cached and a tremendous improvement over the ~85 seconds from cold.

I do not understand why the below query uses a full table scan (executes in ~13 seconds):

SELECT
  extract(YEAR FROM m.taken) AS year,
  avg(m.amount) AS amount
FROM
  climate.measurement m
WHERE
  m.station_id = station.station_id AND
  m.taken BETWEEN station.taken_start AND station.taken_end AND
  m.category_id = category_id
GROUP BY
  extract(YEAR FROM m.taken)

Moreover, what can I do to solve the problem?

Thanks again!

Dave

В списке pgsql-performance по дате отправления:

Предыдущее
От: Robert Haas
Дата:
Сообщение: Re: Performance issues when the number of records are around 10 Million
Следующее
От: Alexey Klyukin
Дата:
Сообщение: Re: Random Page Cost and Planner