Re: I'd like to discuss scaleout at PGCon

From Simon Riggs
Subject Re: I'd like to discuss scaleout at PGCon
Msg-id CANP8+jJ_e6xxhvx1i0jWZJmmMBbx2SzW5nG-eC2W2A77ywZkwA@mail.gmail.com
In reply to Re: I'd like to discuss scaleout at PGCon  (Ashutosh Bapat <ashutosh.bapat@enterprisedb.com>)
Responses Re: I'd like to discuss scaleout at PGCon  ("MauMau" <maumau307@gmail.com>)
List pgsql-hackers
On 1 June 2018 at 15:44, Ashutosh Bapat <ashutosh.bapat@enterprisedb.com> wrote:
> On Thu, May 31, 2018 at 11:00 PM, MauMau <maumau307@gmail.com> wrote:
>> 2018-05-31 22:44 GMT+09:00, Robert Haas <robertmhaas@gmail.com>:
>>> On Thu, May 31, 2018 at 8:12 AM, MauMau <maumau307@gmail.com> wrote:
>>>> Oh, I didn't know you support FDW approach mainly for analytics.  I
>>>> guessed the first target was OLTP read-write scalability.
>>>
>>> That seems like a harder target to me, because you will have an extra
>>> hop involved -- SQL from the client to the first server, then via SQL
>>> to a second server.  The work of parsing and planning also has to be
>>> done twice, once for the foreign table and again for the table.  For
>>> longer-running queries this overhead doesn't matter as much, but for
>>> short-running queries it is significant.
>>
>> Yes, that extra hop and double parsing/planning were the killer for
>> our performance goal when we tried to meet our customer's scaleout
>> needs with XL.  The application executes 82 DML statements in one
>> transaction.  Those DMLs consist of INSERT, UPDATE and SELECT that
>> only accesses one row with a primary key.  The target tables are only
>> a few, so the application PREPAREs a few statements and EXECUTEs them
>> repeatedly.  We placed the coordinator node of XL on the same host as
>> the application, and data nodes and GTM on other individual nodes.
>>
>
> I agree that there's double parsing happening, but I am hesitant to
> agree with the double planning claim. We do plan, let's say a join
> between two foreign tables, on the local server, but that's only to
> decide whether it's efficient to join locally or on the foreign
> server. That means we create foreign paths for scan on the foreign
> tables, maybe as many parameterized plans as the number of join
> conditions, and one path for the join pushdown; that's it. We then
> create local join paths but we need those to decide whether it's
> efficient to join locally and if yes, which way. But we don't create
> paths as to how the foreign server would plan that join. That's not
> double planning since we do not create same paths locally and on the
> foreign server.
>
> In order to avoid double parsing, we might want to find a way to pass
> a "normalized" parse tree down to the foreign server. We need to
> normalize the OIDs in the parse tree since those may be different
> across the nodes.

Passing detailed info between servers is exactly what XL does.

It requires us to define a cluster, exactly as XL does.

And yes, it's a good idea to replicate some tables to all nodes, as XL does.

So it seems we have at last some agreement that some of the things XL
does are the correct approaches.

-- 
Simon Riggs                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

