Re: pg_walinspect - a new extension to get raw WAL data and WAL stats

Поиск
Список
Период
Сортировка
От Jeremy Schneider
Тема Re: pg_walinspect - a new extension to get raw WAL data and WAL stats
Дата
Msg-id 9186c1cd-04b6-3dc9-40c5-5977dd854e36@ardentperf.com
обсуждение исходный текст
Ответ на Re: pg_walinspect - a new extension to get raw WAL data and WAL stats  (Bruce Momjian <bruce@momjian.us>)
Ответы Re: pg_walinspect - a new extension to get raw WAL data and WAL stats  (Bruce Momjian <bruce@momjian.us>)
Re: pg_walinspect - a new extension to get raw WAL data and WAL stats  (Alvaro Herrera <alvherre@alvh.no-ip.org>)
Re: pg_walinspect - a new extension to get raw WAL data and WAL stats  (Bharath Rupireddy <bharath.rupireddyforpostgres@gmail.com>)
Список pgsql-hackers
On 10/5/21 17:43, Bruce Momjian wrote:
> On Tue, Oct  5, 2021 at 03:30:07PM -0700, Jeremy Schneider wrote:
>> Specifically exposing pg_waldump functionality in SQL could speed up
>> finding bugs in the PG logical replication code. We found and fixed a
>> few over this past year, but there are more lurking out there.
> 
> Uh, why is pg_waldump more important than other command line tool
> information?

Going down the list of all other utilities in src/bin:

* pg_amcheck, pg_config, pg_controldata: already available in SQL
* psql, pgbench, pg_dump: already available as client-side apps
* initdb, pg_archivecleanup, pg_basebackup, pg_checksums, pg_ctl,
pg_resetwal, pg_rewind, pg_upgrade, pg_verifybackup: can't think of any
possible use case outside server OS access, most of these are too low
level and don't even make sense in SQL
* pg_test_fsync, pg_test_timing: marginally interesting ideas in SQL,
don't feel any deep interest myself

Speaking selfishly, there are a few reasons I would be specifically
interested in pg_waldump (the only remaining one on the list).

.

First, to better support existing features around logical replication
and decoding.

In particular, it seems inconsistent to me that all the replication
management SQL functions take LSNs as arguments - and yet there's no
SQL-based way to find the LSNs that you are supposed to pass into these
functions.

https://www.postgresql.org/docs/current/functions-admin.html#FUNCTIONS-REPLICATION

Over the past few years I've been pulled in to help several large PG
users who ran into these bugs, and it's very painful - because the only
real remediation is to drop and recreate the replication slot, which
means either re-copying all the data to the downstream system or
figuring out a way to resync it. Some PG users have 3rd party tools like
HVR which can do row-by-row resync (IIUC), but no matter how you slice
it, we're talking about a lot of pain for people replicating large data
sets between multiple systems. In most cases, the only/best option even
with very large tables is to just make a fresh copy of the data - which
can translate to a business outage of hours or even days.

My favorite example is the SQL function pg_replication_slot_advance() -
this could really help PG users find less painful solutions to broken
decoding, however it's not really possible to /know/ an LSN to advance
to without inspecting WAL. ISTM there's a strong use case here for a SQL
interface on WAL inspection.

.

Second: debugging and troubleshooting logical replication and decoding bugs.

I helped track down a few logical replication bugs and get fixed into
code at postgresql.org this past year. (But I give credit to others who
are much better at C than I am, and who did a lot more work than I did
on these bugs!)

Logical decoding bugs are some of the hardest to fix - because all you
have is a WAL stream, but you don't know the SQL or workload patterns or
PG code paths which created that WAL stream.

Dumping the WAL - knowing which objects and which types of operations
are involved and stats like number of changes or number of
subtransactions - this helps identify which transaction and what SQL in
the application triggered the failure, which can help find short-term
workarounds. Businesses need those short-term workarounds so they can
keep going while we work on finding and fixing bugs, which can take some
time. This is another place where I think a SQL interface to WAL would
be helpful to PG users. Especially the ability to filter and trace a
single transaction through a large number of WAL files in the directory.

.

Third: statistics on WAL - especially full page writes. Giving users the
full power of SQL allows much more sophisticated analysis of the WAL
records. Personally, I'd probably find myself importing all the WAL
stats into the DB anyway because SQL is my preferred data manipulation
language.


>> Having the extension in core is actually the only way to avoid
>> duplicated effort, by having shared code which the pg_dump binary and
>> the extension both wrap or call.
> 
> Uh, aren't you duplicating code by having pg_waldump as a command-line
> tool and an extension?

Well this whole conversation is just theoretical anyway until the code
is shared.  :)  But if Bharath is writing functions to decode WAL, then
wouldn't we just have pg_waldump use these same functions in order to
avoid duplicating code?

Bharath - was some code already posted and I just missed it?

Looking at the proposed API from the initial email, I like that there's
both stats functionality and WAL record inspection functionality
(similar to pg_waldump). I like that there's the ability to pull the
records as raw bytea data, however I think we're also going to want an
ability in v1 of the patch to decode it (similar to pageinspect
heap_page_item_attrs, etc).

Another feature that might be interesting down the road would be the
ability to provide filtering of WAL records for security purposes. For
example, allowing a user to only dump raw WAL records for one particular
database, or maybe excluding WAL records that change system catalogs or
the like. But I probably wouldn't start here, personally.

Now then.... as Blaise Pascal said in 1657 (and as was also said by
Winston Churchill, Mark Twain, etc).... "I'm sorry I wrote you such a
long letter; I didn't have time to write a short one."

-Jeremy


-- 
http://about.me/jeremy_schneider



В списке pgsql-hackers по дате отправления:

Предыдущее
От: Bruce Momjian
Дата:
Сообщение: Re: storing an explicit nonce
Следующее
От: Suraj Khamkar
Дата:
Сообщение: Re: [PATCH] Added TRANSFORM FOR for COMMENT tab completion