On 4/30/22 11:28, Alvaro Herrera wrote:
> On 2022-Apr-28, Tomas Vondra wrote:
>
>> Attached is a patch doing the same thing in tablesync. The overall idea
>> is to generate a COPY statement with CASE expressions, applying filters to
>> individual columns. For Alvaro's example, this generates something like
>>
>> SELECT
>> (CASE WHEN (a < 0) OR (a > 0) THEN a ELSE NULL END) AS a,
>> (CASE WHEN (a > 0) THEN b ELSE NULL END) AS b,
>> (CASE WHEN (a < 0) THEN c ELSE NULL END) AS c
>> FROM uno WHERE (a < 0) OR (a > 0)
>
> I've been reading the tablesync.c code you propose and the idea seems
> correct. (I was distracted by wondering if a different data structure
> would be more appropriate, because what's there looks slightly
> uncomfortable to work with. But after playing around I can't find
> anything that feels better in an obvious way.)
>
> (I confess I'm a bit bothered by the fact that there are now three
> different data structures in our code called PublicationInfo).
>
True. I haven't really thought about naming of the data structures, so
maybe we should name them differently.

> I propose some comment changes in the attached patch, and my
> interpretation (untested) of the idea of optimizing for a single
> publication. (In there I also rename logicalrep_relmap_free_entry
> because it's confusing. That should be a separate patch but I didn't
> split it before posting, apologies.)
>
>> There are a couple of options for optimizing this in common cases.
>> For example, if there's just a single publication, there's no need to
>> generate the CASE expressions - the WHERE filter will do the trick.
>
> Right.
>
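To make sure we mean the same thing, here is a rough sketch of the query
construction, including the single-publication shortcut. This is Python for
brevity, not the C code from the patch, and the names (Publication,
build_copy_query) are made up for this example. It assumes every publication
has an explicit column list, and it does not quote identifiers or handle a
table with no published columns.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Publication:
    """Hypothetical stand-in for one publication's settings for a table."""
    columns: List[str]                # published column list
    row_filter: Optional[str] = None  # row filter expression, None = all rows


def build_copy_query(table: str, all_columns: List[str],
                     pubs: List[Publication]) -> str:
    # Single publication: no CASE needed, the WHERE filter does the trick.
    if len(pubs) == 1:
        pub = pubs[0]
        query = f"SELECT {', '.join(pub.columns)} FROM {table}"
        if pub.row_filter is not None:
            query += f" WHERE ({pub.row_filter})"
        return query

    targets = []
    for col in all_columns:
        # Publications that include this column.
        pubs_with_col = [p for p in pubs if col in p.columns]
        if not pubs_with_col:
            continue  # column is not published at all
        filters = [p.row_filter for p in pubs_with_col]
        if any(f is None for f in filters):
            # Some publication sends this column for every row.
            targets.append(col)
        else:
            cond = " OR ".join(f"({f})" for f in filters)
            targets.append(
                f"(CASE WHEN {cond} THEN {col} ELSE NULL END) AS {col}")

    query = f"SELECT {', '.join(targets)} FROM {table}"
    # Restrict rows only if every publication has a row filter.
    if all(p.row_filter is not None for p in pubs):
        where = " OR ".join(f"({p.row_filter})" for p in pubs)
        query += f" WHERE {where}"
    return query
```

If I guess the publications in your example as (a, c) WHERE a < 0 and
(a, b) WHERE a > 0, this produces the same query as above, modulo whitespace.
With only the second publication it produces a plain
SELECT a, b FROM uno WHERE (a > 0).
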
OK, now that we agree on the approach in general, I'll look into these
optimizations (and the comments from your patch).

regards
--
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company