Discussion: Detecting File Damage & Inconsistencies
I would like to propose a few points that will help us detect file damage, inconsistencies in files and track actions of users.

Optionally, we could add these fields onto the WAL commit record:
* 8-byte session start time (from MyStartTime)
* 2-byte pid (from MyProcPid)
* 4-byte user oid
Identified by a new flag bit, XACT_XINFO_HAS_USERINFO. That allows us to match up transactions with the server log %c option. Another option might allow a text username to be added to each commit as well.

XLogLongPageHeaderData has 4 bytes of unused data because of alignment. We could use that space for 1) the Xid Epoch value or 2) a CRC value - since only WAL records are covered by a CRC, not page or file headers. Perhaps we should add both?

There are also 2 bytes unused in the XLogRecord header (with 8 byte alignment). We could optionally use that space for the pid that wrote the record, but that's not compelling. What can we use those 2 bytes for?

REINDEX VERIFY
After the new index is created, but before we drop the old index:
Check whether the two indexes match:
* checks whether the previous index had pointers to row versions that don't exist
* checks whether the heap has rows that were not in the old index
This approach piggybacks on existing operations. AccessShareLock is held on both indexes before the old one is dropped.

Other ideas are welcome. Thoughts?

--
Simon Riggs
http://www.EnterpriseDB.com/
On 11/11/20 21:56, Simon Riggs wrote:
> [snip]
>
> REINDEX VERIFY
> After the new index is created, but before we drop the old index:
> Check whether the two indexes match:
> * checks whether the previous index had pointers to row versions that
> don't exist
> * checks whether the heap has rows that were not in the old index
> This approach piggybacks on existing operations. AccessShareLock is
> held on both indexes before the old one is dropped.

FWIW, as long as it's optional (due to the added runtime), it'd be a welcome feature. Maybe something along the lines of:

REINDEX (verify yes) ....

> Other ideas are welcome.

I still have nightmares from a specific customer case w/ shared storage (using VxFS) between two postmaster instances ---supposedly they could never be active concurrently, though I'm not completely sure that it didn't actually happen--- and the corruption that we found there. I seem to remember that they even had scripts to remove the locking when switching over and back :S

I don't think Postgres can do much about this, but maybe someone can come up with a suitable countermeasure.

Just my .02€

Thanks,

/ J.L.
From: Simon Riggs <simon@2ndquadrant.com>
> I would like to propose a few points that will help us detect file
> damage, inconsistencies in files and track actions of users.

Hello, Simon san. Long time no see. I'm happy to see you back here recently.

What kind of improvement do you expect? What problems would this make detectable?

> * 2-byte pid (from MyProcPid)

pid is 4 bytes on Windows. Isn't it also 4 bytes on Linux when a certain kernel parameter is set?

Regards
Takayuki Tsunakawa
On Thu, 12 Nov 2020 at 06:42, tsunakawa.takay@fujitsu.com
<tsunakawa.takay@fujitsu.com> wrote:
>
> From: Simon Riggs <simon@2ndquadrant.com>
> > I would like to propose a few points that will help us detect file
> > damage, inconsistencies in files and track actions of users.
>
> Hello, Simon san. Long time no see. I'm happy to see you back here recently.

Thank you, happy to be back. It's good to have the time to contribute again.

> What kind of improvement do you expect? What problems would this make detectable?

If a rogue user/process is suspected, this would allow you to identify more easily the changes made by specific sessions/users.

> > * 2-byte pid (from MyProcPid)
>
> pid is 4 bytes on Windows. Isn't it also 4 bytes on Linux when a certain kernel parameter is set?

4 bytes is no problem, thanks for pointing that out.

--
Simon Riggs
http://www.EnterpriseDB.com/
From: Simon Riggs <simon@2ndquadrant.com>
> If a rogue user/process is suspected, this would allow you to identify
> more easily the changes made by specific sessions/users.

Isn't that kind of auditing a job of pgAudit or log_statement = mod? Or, does "more easily" mean that you find pgAudit complex to use and/or log_statement's overhead is big?

Regards
Takayuki Tsunakawa
On Fri, 13 Nov 2020 at 00:50, tsunakawa.takay@fujitsu.com
<tsunakawa.takay@fujitsu.com> wrote:
>
> From: Simon Riggs <simon@2ndquadrant.com>
> > If a rogue user/process is suspected, this would allow you to identify
> > more easily the changes made by specific sessions/users.
>
> Isn't that kind of auditing a job of pgAudit or log_statement = mod? Or, does "more easily" mean that you find pgAudit complex to use and/or log_statement's overhead is big?

Well, I designed pgaudit, so yes, I think pgaudit is useful.

However, pgaudit works at the statement level, not the data level. So using pgaudit to locate data rows that have changed is fairly hard.

What I'm proposing is an option to add 16 bytes onto each COMMIT record, which is considerably less than turning on full auditing in pgaudit. This option would allow identifying data at the row level, so you could for example find all rows changed by specific sessions. Also, because it is stored in WAL it will show updates that might no longer exist in the database because the changed row versions might have been vacuumed away. So pgaudit will tell you that happened, but having the extra info in WAL is important also.

So thank you for the question because it has allowed me to explain why it is useful and important.

--
Simon Riggs
http://www.EnterpriseDB.com/
On Fri, 13 Nov 2020 at 11:24, Simon Riggs <simon@2ndquadrant.com> wrote:
>
> On Fri, 13 Nov 2020 at 00:50, tsunakawa.takay@fujitsu.com
> <tsunakawa.takay@fujitsu.com> wrote:
> >
> > From: Simon Riggs <simon@2ndquadrant.com>
> > > If a rogue user/process is suspected, this would allow you to identify
> > > more easily the changes made by specific sessions/users.
> >
> > Isn't that kind of auditing a job of pgAudit or log_statement = mod? Or, does "more easily" mean that you find pgAudit complex to use and/or log_statement's overhead is big?
>
> Well, I designed pgaudit, so yes, I think pgaudit is useful.
>
> However, pgaudit works at the statement level, not the data level. So
> using pgaudit to locate data rows that have changed is fairly hard.
>
> What I'm proposing is an option to add 16 bytes onto each COMMIT
> record, which is considerably less than turning on full auditing in
> pgaudit. This option would allow identifying data at the row level, so
> you could for example find all rows changed by specific sessions.
> Also, because it is stored in WAL it will show updates that might no
> longer exist in the database because the changed row versions might
> have been vacuumed away. So pgaudit will tell you that happened, but
> having extra info in WAL is important also.
>
> So thank you for the question because it has allowed me to explain why
> it is useful and important.

Patch attached to implement "wal_sessioninfo" option.

--
Simon Riggs
http://www.EnterpriseDB.com/
Attachments
On Fri, Nov 13, 2020 at 7:24 PM Simon Riggs <simon@2ndquadrant.com> wrote:
> What I'm proposing is an option to add 16 bytes onto each COMMIT
> record

Would it make sense to write this at the time we write a topxid assignment to WAL instead?

Otherwise it won't be accessible to streaming-mode logical decoding.
On Wed, 18 Nov 2020 at 06:42, Craig Ringer <craig.ringer@enterprisedb.com> wrote:
>
> On Fri, Nov 13, 2020 at 7:24 PM Simon Riggs <simon@2ndquadrant.com> wrote:
>>
>> What I'm proposing is an option to add 16 bytes onto each COMMIT
>> record
>
> Would it make sense to write this at the time we write a topxid assignment to WAL instead?
>
> Otherwise it won't be accessible to streaming-mode logical decoding.

Do you mean extend the xl_xact_assignment record? My understanding is that is not sent in all cases, so not sure what you mean by "instead".

--
Simon Riggs
http://www.EnterpriseDB.com/
Hi Simon,

On Wed, Nov 18, 2020 at 2:14 AM Simon Riggs <simon@2ndquadrant.com> wrote:
>
> On Fri, 13 Nov 2020 at 11:24, Simon Riggs <simon@2ndquadrant.com> wrote:
> >
> > On Fri, 13 Nov 2020 at 00:50, tsunakawa.takay@fujitsu.com
> > <tsunakawa.takay@fujitsu.com> wrote:
> > >
> > > From: Simon Riggs <simon@2ndquadrant.com>
> > > > If a rogue user/process is suspected, this would allow you to identify
> > > > more easily the changes made by specific sessions/users.
> > >
> > > Isn't that kind of auditing a job of pgAudit or log_statement = mod? Or, does "more easily" mean that you find pgAudit complex to use and/or log_statement's overhead is big?
> >
> > Well, I designed pgaudit, so yes, I think pgaudit is useful.
> >
> > However, pgaudit works at the statement level, not the data level. So
> > using pgaudit to locate data rows that have changed is fairly hard.
> >
> > What I'm proposing is an option to add 16 bytes onto each COMMIT
> > record, which is considerably less than turning on full auditing in
> > pgaudit. This option would allow identifying data at the row level, so
> > you could for example find all rows changed by specific sessions.
> > Also, because it is stored in WAL it will show updates that might no
> > longer exist in the database because the changed row versions might
> > have been vacuumed away. So pgaudit will tell you that happened, but
> > having extra info in WAL is important also.
> >
> > So thank you for the question because it has allowed me to explain why
> > it is useful and important.
>
> Patch attached to implement "wal_sessioninfo" option.

You sent in your patch, wal_sessioninfo.v2.patch, to pgsql-hackers on Nov 18, but you did not post it to the next CommitFest[1]. If this was intentional, then you need to take no action. However, if you want your patch to be reviewed as part of the upcoming CommitFest, then you need to add it yourself before 2021-01-01 AoE[2]. Thanks for your contributions.

Regards,

[1] https://commitfest.postgresql.org/31/
[2] https://en.wikipedia.org/wiki/Anywhere_on_Earth

--
Masahiko Sawada
EnterpriseDB: https://www.enterprisedb.com/
On Mon, 15 Mar 2021 at 21:01, David Steele <david@pgmasters.net> wrote:
On 11/18/20 5:23 AM, Simon Riggs wrote:
> On Wed, 18 Nov 2020 at 06:42, Craig Ringer
> <craig.ringer@enterprisedb.com> wrote:
>>
>> On Fri, Nov 13, 2020 at 7:24 PM Simon Riggs <simon@2ndquadrant.com> wrote:
>>>
>>>
>>> What I'm proposing is an option to add 16 bytes onto each COMMIT
>>> record
>>
>>
>> Would it make sense to write this at the time we write a topxid assignment to WAL instead?
>>
>> Otherwise it won't be accessible to streaming-mode logical decoding.
>
> Do you mean extend the xl_xact_assignment record? My understanding is
> that is not sent in all cases, so not sure what you mean by "instead".
Craig, can you clarify?
Right. Or write a separate WAL record when the feature is enabled. But it's probably sufficient to write it as an optional chunk on xl_xact_assignment records. We often defer writing them so we can optimise away xacts that never actually wrote anything, but IIRC we still write one before we write any WAL that references the xid. That'd be fine, since we don't need the info any sooner than that during decoding. I'd have to double check that we write it in all cases and won't get to that too soon, but I'm pretty sure we do...
On Thu, Mar 18, 2021 at 6:20 AM Craig Ringer <craig.ringer@enterprisedb.com> wrote:
>
> On Mon, 15 Mar 2021 at 21:01, David Steele <david@pgmasters.net> wrote:
>>
>> On 11/18/20 5:23 AM, Simon Riggs wrote:
>> > On Wed, 18 Nov 2020 at 06:42, Craig Ringer
>> > <craig.ringer@enterprisedb.com> wrote:
>> >>
>> >> On Fri, Nov 13, 2020 at 7:24 PM Simon Riggs <simon@2ndquadrant.com> wrote:
>> >>>
>> >>> What I'm proposing is an option to add 16 bytes onto each COMMIT
>> >>> record
>> >>
>> >> Would it make sense to write this at the time we write a topxid assignment to WAL instead?
>> >>
>> >> Otherwise it won't be accessible to streaming-mode logical decoding.
>> >
>> > Do you mean extend the xl_xact_assignment record? My understanding is
>> > that is not sent in all cases, so not sure what you mean by "instead".
>>
>> Craig, can you clarify?
>
> Right. Or write a separate WAL record when the feature is enabled. But it's probably sufficient to write it as an optional chunk on xl_xact_assignment records. We often defer writing them so we can optimise away xacts that never actually wrote anything, but IIRC we still write one before we write any WAL that references the xid. That'd be fine, since we don't need the info any sooner than that during decoding. I'd have to double check that we write it in all cases and won't get to that too soon, but I'm pretty sure we do...

The commit record is optimized away if no xid is assigned, though is still present if we didn't write any WAL records. But if a commit record exists in the WAL stream, we want to know where it came from.

A later patch will add PITR capability based on this information so attaching it directly to the commit record is fairly important, IMHO.

--
Simon Riggs
http://www.EnterpriseDB.com/
On Tue, 22 Jun 2021 at 00:24, Simon Riggs <simon.riggs@enterprisedb.com> wrote:
On Thu, Mar 18, 2021 at 6:20 AM Craig Ringer
<craig.ringer@enterprisedb.com> wrote:
>
> On Mon, 15 Mar 2021 at 21:01, David Steele <david@pgmasters.net> wrote:
>>
>> On 11/18/20 5:23 AM, Simon Riggs wrote:
>> > On Wed, 18 Nov 2020 at 06:42, Craig Ringer
>> > <craig.ringer@enterprisedb.com> wrote:
>> >>
>> >> On Fri, Nov 13, 2020 at 7:24 PM Simon Riggs <simon@2ndquadrant.com> wrote:
>> >>>
>> >>>
>> >>> What I'm proposing is an option to add 16 bytes onto each COMMIT
>> >>> record
>> >>
>> >>
>> >> Would it make sense to write this at the time we write a topxid assignment to WAL instead?
>> >>
>> >> Otherwise it won't be accessible to streaming-mode logical decoding.
>> >
>> > Do you mean extend the xl_xact_assignment record? My understanding is
>> > that is not sent in all cases, so not sure what you mean by "instead".
>>
>> Craig, can you clarify?
>
>
> Right. Or write a separate WAL record when the feature is enabled. But it's probably sufficient to write it as an optional chunk on xl_xact_assignment records. We often defer writing them so we can optimise away xacts that never actually wrote anything, but IIRC we still write one before we write any WAL that references the xid. That'd be fine, since we don't need the info any sooner than that during decoding. I'd have to double check that we write it in all cases and won't get to that too soon, but I'm pretty sure we do...
The commit record is optimized away if no xid is assigned, though is
still present if we didn't write any WAL records.
But if a commit record exists in the WAL stream, we want to know where
it came from.
A later patch will add PITR capability based on this information so
attaching it directly to the commit record is fairly important, IMHO.
Why?
All the proposed info:
* 8-byte session start time (from MyStartTime)
* 2-byte pid (from MyProcPid)
* 4-byte user oid
are available at topxid assignment time. If we defer writing them until commit, we lose the ability to use this information during streaming logical decoding. That's something I believe you've wanted for other functionality in the past, such as logical decoding based audit functionality.
IIRC the restart_lsn horizon already ensures that we can't miss the xl_xact_assignment at the start of a txn. We would ensure that the desired info is available throughout decoding of the txn, including at commit record processing time, by adding it to the toplevel ReorderBufferTxn.
The only advantage I can see to annotating the commit record instead is that we don't have to spend a few bytes per reorder-buffered topxid to track this info between start of decoding for the tx and processing of the commit record. I don't think that's worth caring about. The advantages that having it earlier would give us are much more significant.
A few examples:
* Skip reorder buffering of non-target transactions early, so we can decode the WAL stream to find the target transactions much faster using less memory and I/O;
* Read the database change stream and use the session info to stream info into an intrusion detection system and/or audit engine in real time, using txn streaming to avoid the need to create huge reorder buffers;
* Re-decode the WAL stream to identify a target txn you know was aborted, and commit it instead, so you can recover data from aborted txns from the WAL stream using logical decoding. (Only possible if the catalog_xmin hasn't advanced past that point already though)
So yeah. I think it'd be better to log the info you want at start-of-txn unless there's a compelling reason not to, and I don't see one yet.
On Tue, Jun 22, 2021 at 6:32 AM Craig Ringer <craig.ringer@enterprisedb.com> wrote:
> IIRC the restart_lsn horizon already ensures that we can't miss the xl_xact_assignment at the start of a txn. We would ensure that the desired info is available throughout decoding of the txn, including at commit record processing time, by adding it to the toplevel ReorderBufferTxn.
>
> The only advantage I can see to annotating the commit record instead is that we don't have to spend a few bytes per reorder-buffered topxid to track this info between start of decoding for the tx and processing of the commit record. I don't think that's worth caring about. The advantages that having it earlier would give us are much more significant.
>
> A few examples:
>
> * Skip reorder buffering of non-target transactions early, so we can decode the WAL stream to find the target transactions much faster using less memory and I/O;
>
> * Read the database change stream and use the session info to stream info into an intrusion detection system and/or audit engine in real time, using txn streaming to avoid the need to create huge reorder buffers;
>
> * Re-decode the WAL stream to identify a target txn you know was aborted, and commit it instead, so you can recover data from aborted txns from the WAL stream using logical decoding. (Only possible if the catalog_xmin hasn't advanced past that point already though)
>
> So yeah. I think it'd be better to log the info you want at start-of-txn unless there's a compelling reason not to, and I don't see one yet.

AFAIK, XLOG_XACT_ASSIGNMENT does not occur for normal top-level transactions, only for subxids. I don't really want to add an extra record just for this because it will slow down applications and it won't get turned on as often.

Thoughts?

--
Simon Riggs
http://www.EnterpriseDB.com/
On Fri, 2 Jul 2021 at 00:19, Simon Riggs <simon.riggs@enterprisedb.com> wrote:
> So yeah. I think it'd be better to log the info you want at start-of-txn unless there's a compelling reason not to, and I don't see one yet.
AFAIK, XLOG_XACT_ASSIGNMENT does not occur for normal top-level
transactions, only for subxids.
I don't really want to add an extra record just for this because it
will slow down applications and it won't get turned on as often.
OK, that makes sense - I was indeed operating on an incorrect assumption.
I wouldn't want to add a new record either. I thought we could piggyback on XLOG_XACT_ASSIGNMENT with a new chunk that's only added when the feature is required, much like we do for replication origin info on commit records.
Is it worth considering forcing XLOG_XACT_ASSIGNMENT to be logged if this feature is enabled?
If you don't think the sorts of use cases I presented are worth the trouble that's fair enough. I'm not against adding it on the commit record. It's just that with logical decoding moving toward a streaming model I suspect only having it at commit time may cause us some pain later.
On Fri, Jul 2, 2021 at 5:34 AM Craig Ringer <craig.ringer@enterprisedb.com> wrote:
>
> On Fri, 2 Jul 2021 at 00:19, Simon Riggs <simon.riggs@enterprisedb.com> wrote:
>>
>> > So yeah. I think it'd be better to log the info you want at start-of-txn unless there's a compelling reason not to, and I don't see one yet.
>>
>> AFAIK, XLOG_XACT_ASSIGNMENT does not occur for normal top-level
>> transactions, only for subxids.
>>
>> I don't really want to add an extra record just for this because it
>> will slow down applications and it won't get turned on as often.
>
> OK, that makes sense - I was indeed operating on an incorrect assumption.
>
> I wouldn't want to add a new record either. I thought we could piggyback on XLOG_XACT_ASSIGNMENT with a new chunk that's only added when the feature is required, much like we do for replication origin info on commit records.
>
> Is it worth considering forcing XLOG_XACT_ASSIGNMENT to be logged if this feature is enabled?

My feeling is that the drop in performance would lead to it being turned off most of the time, reducing the value of the feature. Does anyone else disagree?

> If you don't think the sorts of use cases I presented are worth the trouble that's fair enough. I'm not against adding it on the commit record. It's just that with logical decoding moving toward a streaming model I suspect only having it at commit time may cause us some pain later.

I think you have some good ideas about how to handle larger transactions with streaming. As a separate patch it might be worth keeping track of transaction size and logging something when a transaction gets too large.

--
Simon Riggs
http://www.EnterpriseDB.com/
On Fri, Jul 2, 2021 at 8:29 PM Simon Riggs <simon.riggs@enterprisedb.com> wrote:
>
> On Fri, Jul 2, 2021 at 5:34 AM Craig Ringer
> <craig.ringer@enterprisedb.com> wrote:
> >
> > If you don't think the sorts of use cases I presented are worth the trouble that's fair enough. I'm not against adding it on the commit record. It's just that with logical decoding moving toward a streaming model I suspect only having it at commit time may cause us some pain later.
>
> I think you have some good ideas about how to handle larger
> transactions with streaming. As a separate patch it might be worth
> keeping track of transaction size and logging something when a
> transaction gets too large.

If we want this additional information for streaming mode in logical replication then can't we piggyback it on the very first record written for a transaction when this info is required? Currently, we do something similar for logging top_level_xid for subtransactions in XLogRecordAssemble().

--
With Regards,
Amit Kapila.
On Tue, Jul 6, 2021 at 4:21 AM Amit Kapila <amit.kapila16@gmail.com> wrote:
>
> On Fri, Jul 2, 2021 at 8:29 PM Simon Riggs <simon.riggs@enterprisedb.com> wrote:
> >
> > On Fri, Jul 2, 2021 at 5:34 AM Craig Ringer
> > <craig.ringer@enterprisedb.com> wrote:
> > >
> > > If you don't think the sorts of use cases I presented are worth the trouble that's fair enough. I'm not against adding it on the commit record. It's just that with logical decoding moving toward a streaming model I suspect only having it at commit time may cause us some pain later.
> >
> > I think you have some good ideas about how to handle larger
> > transactions with streaming. As a separate patch it might be worth
> > keeping track of transaction size and logging something when a
> > transaction gets too large.
>
> If we want this additional information for streaming mode in logical
> replication then can't we piggyback it on the very first record
> written for a transaction when this info is required? Currently, we do
> something similar for logging top_level_xid for subtransactions in
> XLogRecordAssemble().

It's possible, but I'm struggling to believe anybody would accept that as an approach because it breaks simplicity, modularity and makes it harder to search for this info in the WAL.

I was imagining that we'd keep track of the amount of WAL written by a transaction and when it reaches a certain size generate a "streaming info" record as an early warning that we have a big transaction coming down the pipe.

I'm feeling that a simple patch is expanding well beyond its original scope and timeline. How can we do this simply?

--
Simon Riggs
http://www.EnterpriseDB.com/
On Tue, Jul 13, 2021 at 8:29 PM Simon Riggs <simon.riggs@enterprisedb.com> wrote:
>
> On Tue, Jul 6, 2021 at 4:21 AM Amit Kapila <amit.kapila16@gmail.com> wrote:
> >
> > > > If you don't think the sorts of use cases I presented are worth the trouble that's fair enough. I'm not against adding it on the commit record. It's just that with logical decoding moving toward a streaming model I suspect only having it at commit time may cause us some pain later.
> > >
> > > I think you have some good ideas about how to handle larger
> > > transactions with streaming. As a separate patch it might be worth
> > > keeping track of transaction size and logging something when a
> > > transaction gets too large.
> >
> > If we want this additional information for streaming mode in logical
> > replication then can't we piggyback it on the very first record
> > written for a transaction when this info is required? Currently, we do
> > something similar for logging top_level_xid for subtransactions in
> > XLogRecordAssemble().
>
> It's possible, but I'm struggling to believe anybody would accept that
> as an approach because it breaks simplicity, modularity and makes it
> harder to search for this info in the WAL.
>
> I was imagining that we'd keep track of the amount of WAL written by a
> transaction and when it reaches a certain size generate a "streaming
> info" record as an early warning that we have a big transaction coming
> down the pipe.

I am not sure if that satisfies Craig's requirement of making it available during the streaming of in-progress xacts during logical replication. It is quite possible that by the time we decide to start streaming a transaction this information won't be logged yet.

> I'm feeling that a simple patch is expanding well beyond its original
> scope and timeline. How can we do this simply?

The patch is simple but its use doesn't seem to be very clear. You have mentioned its use for future PITR patches and Craig mentioned some use cases in logical decoding, and it appears to me that to support the use cases mentioned by Craig, it is important to LOG this earlier than at commit time. As there are no details about how it will be used for PITR patches and whether such patch ideas are accepted, it makes it harder to judge the value of this patch. I think if we had patches (even at WIP/POC stage) for the ideas you and Craig have in mind, it would have been much easier to see the value of this patch.

--
With Regards,
Amit Kapila.
On Wed, 14 Jul 2021 at 05:01, Amit Kapila <amit.kapila16@gmail.com> wrote:
> The patch is simple but its use doesn't seem to be very clear. You
> have mentioned its use for future PITR patches and Craig mentioned
> some use cases in logical decoding and it appears to me that to
> support the use cases mentioned by Craig, it is important to LOG this
> earlier than at commit time. As there are no details about how it will
> be used for PITR patches and whether such patch ideas are accepted, it
> makes it harder to judge the value of this patch.
> I think if we would have patches (even at WIP/POC stage) for the ideas
> you and Craig have in mind, it would have been much easier to see the
> value of this patch.

Fair question. This is one of a series of 4 independent patches I have planned to provide enhanced information and/or recovery tools. (The second one is already in this CF.) This is an area I know lots about and nobody else is working on, so I thought I would contribute. I've not discussed this off-list with anyone else.

So this is PITR as recovery in a broad sense, not just that specific feature. For this patch, the idea is to be able to go in either direction:

Data damage <--> User info

So if you know a user was an intruder, you can detect the damage they caused. Or, if you detect damage, you can work out who caused it, work out if they were an intruder and, if so, detect what else they did.

The most important thing is to have the info available in WAL; nothing is possible until that is available. We already added an option to add this same info to log_line_prefix, yet nobody said it wasn't useful there, or needed other uses to allow the feature. The two sources of info are designed to be used in combination.

My experience of recovery scenarios is that you often have to build custom search tools to make them work. It's hard to say whether you'll want to track the user, the specific session, or even specific transactions.

But I do understand the overall request, so I propose adding:
* pg_waldump output for wal_sessioninfo data, if it exists
* pg_waldump --user=USERNAME as a filter on username
to demonstrate the use of this.

--
Simon Riggs
http://www.EnterpriseDB.com/
On Thu, Jul 22, 2021 at 7:10 PM Simon Riggs <simon.riggs@enterprisedb.com> wrote:
>
> On Wed, 14 Jul 2021 at 05:01, Amit Kapila <amit.kapila16@gmail.com> wrote:
>
> But I do understand the overall request, so I propose adding
> * pg_waldump output for wal_sessioninfo data, if it exists
> * pg_waldump --user=USERNAME as a filter on username
> to demonstrate the use of this
>
This makes sense, but I am still thinking of some more concrete way. Can we think of providing a way to filter WAL from a user/process (like filter_by_origin) for logical decoding? If so, then we can have an example to show via test_decoding.

--
With Regards,
Amit Kapila.