Re: Referential Integrity Checks with Statement-level Triggers

Поиск
Список
Период
Сортировка
От Pavel Stehule
Тема Re: Referential Integrity Checks with Statement-level Triggers
Дата
Msg-id CAFj8pRCQe9=tX5GmC-S_Dmms9tdU4OWGD2xaHZFm-yaMS1zmoQ@mail.gmail.com
обсуждение исходный текст
Ответ на Referential Integrity Checks with Statement-level Triggers  (Corey Huinker <corey.huinker@gmail.com>)
Ответы Re: Referential Integrity Checks with Statement-level Triggers  (Alvaro Herrera <alvherre@2ndquadrant.com>)
Список pgsql-hackers


po 17. 12. 2018 v 15:32 odesílatel Corey Huinker <corey.huinker@gmail.com> napsal:
Back when Pg added statement-level triggers, I was interested in the potential promise of moving referential integrity checks to statement-level triggers.

The initial conversation, along with Kevin Grittner's POC script (in SQL) that showed a potential for a 98% reduction in time spent doing RI checks. The original thread is here:


I dug around in the code, and was rather surprised at how close we already are to implementing this. The function RI_Initial_Check() already does a left-join query via SPI to look for any invalid data, so if we could just replace the near table with the transition table for inserted rows, we'd be home free. The function SPI_register_trigger_data() makes the transition tables visible to SPI, so I started to wonder why this hadn't be done already.

I approached Kevin and Thomas Munro seeking feedback on my approach. I also made it into a session at the PgConf.ASIA un-conference, and then later with Michael Paquier at that same conference, and the coalesced feedback was this:

- the overhead of registering the transition tables probably makes it unprofitable for single row inserts
- the single row overhead is itself significant, so maybe the transition tables aren't worse
- there has been talk of replacing transition tables with an in-memory data structure that would be closer to "free" from a startup perspective and might even coalesce the transition tables of multiple statements in the same transaction
- because no declarative code changes, it's trivial to switch from row level to statement level triggering via pg_upgrade
- assuming that transition tables are an overhead that only pays off when > N rows have been updated, does it make sense to enforce RI with something that isn't actually a trigger?
- there was also some mention that parallel query uses a queue mechanism that might be leveraged to do row-level triggers for updates of <= N rows and statement level for > N

That's what I have so far. I'm going to be working on a POC patch so that I can benchmark a pure-statement-level solution, which if nothing else will let us know the approximate value of N.

All suggestions are appreciated.

It is great. I though little bit about it - just theoretically. 

ROW trigger call RI check too often, and statement trigger too less. I think so ideal design can be call RI check every 10K rows. I think so can be unfriendly if somebody does very long import and it fails on the end. I don't think so there should not be any performance difference, if RI check is called per 1000 or more rows.

Regards

Pavel

В списке pgsql-hackers по дате отправления:

Предыдущее
От: Antonin Houska
Дата:
Сообщение: Re: [HACKERS] WIP: Aggregation push-down
Следующее
От: Alexey Kondratov
Дата:
Сообщение: Re: PATCH: logical_work_mem and logical streaming of largein-progress transactions