Re: Indexes on expressions with multiple columns and operators

Поиск
Список
Период
Сортировка
От Jehan-Guillaume de Rorthais
Тема Re: Indexes on expressions with multiple columns and operators
Дата
Msg-id 20250918172628.44301672@karst
обсуждение исходный текст
Ответ на Re: Indexes on expressions with multiple columns and operators  (Tom Lane <tgl@sss.pgh.pa.us>)
Ответы Re: Indexes on expressions with multiple columns and operators
Список pgsql-performance
Hi there,

I think this discussion has a nice solution, thank you!

However, while poking around this issue yesterday, we also found something
surprising between estimated rows and costs when using a function. Bellow the
scenario to apply on top of Frederic's one to quickly expose the weirdness:

  CREATE OR REPLACE FUNCTION s(crit text, ackid int) 
  RETURNS bool LANGUAGE plpgsql IMMUTABLE AS $$
  BEGIN RETURN crit = 'WARNING' AND ackid IS NULL;
  END $$;

  CREATE INDEX foo_s_idx ON foo (s(crit, ackid));

  ANALYZE foo ;

  EXPLAIN SELECT * FROM foo WHERE s(crit, ackid);

  EXPLAIN SELECT * FROM foo
  WHERE (ackid IS NULL AND crit = 'WARNING') is true;

  SELECT most_common_vals, most_common_freqs
  FROM pg_stats WHERE tablename = 'foo_s_idx';

On a fresh instance from HEAD with its default configuration, it shows:

  Index Scan using foo_s_idx on foo  (cost=0.29..8.39 rows=33333 width=13)
    Index Cond: (s(crit, ackid) = true)

  Index Scan using foo_expr_idx on foo  (cost=0.29..8.39 rows=5 width=13)
    Index Cond: (((ackid IS NULL) AND (crit = 'WARNING'::text)) = true)

   most_common_vals | most_common_freqs 
  ------------------+-------------------
   {f,t}            | {0.99995,5e-05}

It seems statistics shown in "pg_stats" view for function "s()" are good. The
query itself even have the same costs than the query using the syntax tips you
provide before.

However, the estimated row number seems wrong in regard with the costs shown
and statistics. It looks like a default hardcoded 33% has been applied. As
Frederic said earlier, this bad row estimation drives the upper join to the
wrong method.

Is this a known planer behavior?

(again, the ju-jitsu syntax provided by Laurenz and you is definitely on
point here, I just hijack the thread to discuss this weirdness.)

Regards,



В списке pgsql-performance по дате отправления: