On 11/25/20 11:18 AM, Tom Lane wrote:
> Mats Julian Olsen <mats@duneanalytics.com> writes:
>> I've got some more numbers here:
>> ...
>> To me this does indicate some sort of networking issue, but I'm
>> wondering if INSERTs are treated differently than SELECTs in
>> postgres_fdw? The only feasibly explanation I have is that postgres_fdw
>> does many more network calls for INSERT than for SELECT, e.g. something
>> like 1 for SELECT and `n` for INSERT?
>
> I don't have the code in front of me, but from memory, postgres_fdw
> will issue an INSERT statement to the remote for each row it has to
> insert. Maybe you are indeed just dealing with spectacularly bad
> network round trip times.
>
> You could try turning on log_statement and/or log_duration on the
> remote to see if that sheds any more light about how much time is
> spent executing each insertion vs. the network delay.
>
> There's been some recent discussion about teaching postgres_fdw to
> batch insertions, which would likely be helpful in your situation.
> I don't know how close that is to committable, but in any case
> it couldn't see the light of day earlier than v14. In the meantime,
> if you're sufficiently desperate maybe you could switch to using
> dblink with manually-issued multi-row INSERTs. (This needn't
> preclude continuing to access the table with postgres_fdw when
> that's suitable.)
So even if Mats where to break this query:
INSERT INTO foreign.labels (address, labels)
SELECT address_id, ARRAY_AGG(name) AS labels
FROM labels
GROUP BY 1
LIMIT 100;
down into something like this:
INSERT INTO foreign.labels (address, labels)
VALUES (), (), (), ();
postgres_fdw would send it as individual INSERTs?
>
> regards, tom lane
>
--
Adrian Klaver
adrian.klaver@aklaver.com