RE: CDC/ETL system on top of logical replication with pgoutput, custom client

From: José Neves
Subject: RE: CDC/ETL system on top of logical replication with pgoutput, custom client
Msg-id: PR3P193MB049193D131031CB4BD03288F890CA@PR3P193MB0491.EURP193.PROD.OUTLOOK.COM
In reply to: Re: CDC/ETL system on top of logical replication with pgoutput, custom client  (Amit Kapila <amit.kapila16@gmail.com>)
Responses: Re: CDC/ETL system on top of logical replication with pgoutput, custom client
List: pgsql-hackers

Hi Amit.

Hmm, that's... challenging. I ran into some issues after "the fix" because I had a couple of transactions with 25k updates, and I had to split them to be able to push them to our event messaging system, as our maximum message size is 10MB. Relying on commit time would mean that all operations in a transaction have the same timestamp. If something goes wrong while my worker is pushing that transaction's data chunks, I will duplicate some data in the next run, so a shared timestamp wouldn't let me detect and discard the duplicates.
Do you see any other way to deal with it?

Right now I see only one option, which is to store all processed LSNs on the other side of the ETL. I'm trying to avoid that overhead.
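
To make the trade-off concrete, here is a rough sketch of one possible middle ground, not something confirmed in this thread: tag each change with the pair (commit LSN, position inside the transaction) and have the consumer remember only the highest pair it has applied. It assumes transactions arrive in commit order, are not streamed in progress, and replay in the same order after a restart. The names `Change`, `chunk_transaction`, and `Sink` are hypothetical and are not part of pgoutput.

```python
from dataclasses import dataclass
from typing import Iterable, Iterator, List, Optional, Tuple


@dataclass(frozen=True)
class Change:
    commit_lsn: int  # commit LSN of the transaction, shared by all of its changes
    seq: int         # hypothetical 0-based position of the change in the transaction
    payload: bytes


def chunk_transaction(
    commit_lsn: int, payloads: Iterable[bytes], max_bytes: int
) -> Iterator[List[Change]]:
    """Tag each change with (commit_lsn, seq) and group changes into chunks
    whose payloads total at most max_bytes. A single payload larger than
    max_bytes still goes out alone in its own chunk."""
    chunk: List[Change] = []
    size = 0
    for seq, payload in enumerate(payloads):
        if chunk and size + len(payload) > max_bytes:
            yield chunk
            chunk, size = [], 0
        chunk.append(Change(commit_lsn, seq, payload))
        size += len(payload)
    if chunk:
        yield chunk


class Sink:
    """Consumer side: keeps one high-water mark instead of every processed LSN."""

    def __init__(self) -> None:
        self.high_water: Optional[Tuple[int, int]] = None
        self.applied: List[bytes] = []

    def apply_chunk(self, chunk: List[Change]) -> None:
        for change in chunk:
            key = (change.commit_lsn, change.seq)
            if self.high_water is not None and key <= self.high_water:
                continue  # already applied in an earlier run, skip the duplicate
            self.applied.append(change.payload)
            self.high_water = key


# A worker pushes the first chunk, crashes, then resends the whole transaction.
payloads = [b"a" * 4, b"b" * 4, b"c" * 4]
chunks = list(chunk_transaction(commit_lsn=100, payloads=payloads, max_bytes=8))
sink = Sink()
sink.apply_chunk(chunks[0])
for chunk in chunks:  # retry after the simulated failure
    sink.apply_chunk(chunk)
```

In a real system the high-water mark would have to be persisted atomically with the applied data for this to hold, which is an assumption about the target store. The overhead on the other side is then one pair per stream rather than one entry per processed LSN.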

Thanks.
Regards,
José Neves

From: Amit Kapila <amit.kapila16@gmail.com>
Sent: August 7, 2023 05:59
To: José Neves <rafaneves3@msn.com>
Cc: Andres Freund <andres@anarazel.de>; pgsql-hackers@postgresql.org <pgsql-hackers@postgresql.org>
Subject: Re: CDC/ETL system on top of logical replication with pgoutput, custom client
 
On Sun, Aug 6, 2023 at 7:54 PM José Neves <rafaneves3@msn.com> wrote:
>
> A follow-up on this. Indeed, a new commit-based approach solved my missing data issues.
> But, getting back to the previous examples, how are server times expected to be logged for the xlogs containing these records?
>

I think it should be based on commit_time because as far as I see we
can only get that on the client.

--
With Regards,
Amit Kapila.
