Re: [GENERAL] query planner placement of sort/limit w.r.t. joins

Поиск
Список
Период
Сортировка
От David Rowley
Тема Re: [GENERAL] query planner placement of sort/limit w.r.t. joins
Дата
Msg-id CAKJS1f_Y8OqN2G_nhjqDXFNMPsC=Qc6KZMANJq5ct_TkPz5udg@mail.gmail.com
обсуждение исходный текст
Ответ на Re: [GENERAL] query planner placement of sort/limit w.r.t. joins  ("David G. Johnston" <david.g.johnston@gmail.com>)
Список pgsql-general
On 29 April 2017 at 11:37, David G. Johnston <david.g.johnston@gmail.com> wrote:
>> > Perhaps there are reasons why this optimization is not safe that I
>> > haven't
>> > thought about?
>>
>> Yeah, I think so. What happens if an A row cannot find a match in B or
>> C? This version of the query will end up returning fewer rows due to
>> that, but the original version would consider other rows with a higher
>> rank.
>>
>> We've danced around a bit with using foreign keys as proofs that rows
>> will exist for other optimisations in the past, but it's tricky ground
>> since foreign keys are not updated immediately, so there are windows
>> where they may not actually hold true to their word.
>
>
> I read this query as having a relation cardinality of one-to-one mandatory -
> which precludes the scenario described.

What makes you say so?

It's pretty easy to show how the queries are not the same.

create table a (
  id int primary key,
  b_id int not null,
  val int not null,
  rank int not null
);

create table b (
  id int primary key,
  c_id int not null,
  val int not null
);

create table c (
  id int primary key,
  val int not null
);
insert into a select x,x,x,x from generate_series(1,150) x;
insert into b select x,x,x from generate_series(51,150) x;
insert into c select x,x from generate_series(51,150) x;

SELECT A.val, B.val, C.val FROM A
   JOIN B ON A.b_id = B.id
   JOIN C ON B.c_id = C.id
   ORDER BY A.rank
   LIMIT 100; -- returns 100 rows

 SELECT D.val, B.val, C.val FROM
   (SELECT * FROM A ORDER BY A.rank LIMIT 100) AS D
   JOIN B ON D.b_id = B.id
   JOIN C ON B.c_id = C.id
   LIMIT 100; -- returns 50 rows


> Is the above saying that, today, there is no planning benefit to setting up
> two deferrable references constraints to enforce the non-optional
> requirement?

There is no place in the planner where a foreign key is used as a
proof that a joined row must exist, with the exception of row
estimations for statistics.

--
 David Rowley                   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


В списке pgsql-general по дате отправления:

Предыдущее
От: Tomas Vondra
Дата:
Сообщение: Re: [GENERAL] [PERFORM] Questionaire: Common WAL write rates on busy servers.
Следующее
От: Thomas Güttler
Дата:
Сообщение: [GENERAL] Can PG replace redis, amqp, s3 in the future?