Обсуждение: Dubugging an intermittent foreign key insert error with csvlog
Hi all,
I've been debugging an intermittent foreign key insert error on our
single database / multi-tenant server. To help isolate traffic by
tenant, I've switched to using csvlog and for the duration and have set
log_min_duration_statement to 0 to get *everything*. Fortunately, daily
80G csvlogs compress nicely.
For lack of a readable way to paste in the 12 lines of relevant csvlog
into an e-mail, I've uploaded a very small 3K csv file to my web server at
https://openvistas.net/hansens_error.csv
The bare bones of the issue involve inserting a row into a table named
load_det, then getting the serial sequence of that table, getting the
last_value of that sequence and then inserting into another table named
cargo_det using that retrieved last_value as the foreign key that ties
this row to the load_det table. The vast majority of these succeed
without issue but not all.
The csvlog snip shows what I believe are 2 simultaneous but separate
sessions and the session that attempts to insert into the cargo_det
table is not the same session that inserted into the load_det table.
That's what my hunch is but what is unclear to me is if those separate
sessions are also in separate transactions.
csvlog has a couple of columns that I'm unclear about:
session_start_time, virtual_transaction_id. Is session_start_time the
time inside a transaction block, as in beginning with a begin but before
a commit or rollback? Or is it maybe just how long this pgbouncer
session has been connected? virtual_transaction_id is defined in the
docs as backendID/localXID--do separate backendIDs also represent
separate transactions? Is there a better way to determine separate
transactions within csvlog?
Also, the app code that does this is legacy perl using DBD::Pg but the
original code was written for Informix. We've been in the process of
moving off informix for a while now and should be done within the month.
I intend to re-work this to use returning id (available in postgres
since 8.2!) instead of the serial sequence / last_value hack but not
quite there yet.
Thanks,
Jeff Ross
On 9/16/19 1:46 PM, Jeff Ross wrote: > Hi all, > > I've been debugging an intermittent foreign key insert error on our > single database / multi-tenant server. To help isolate traffic by > tenant, I've switched to using csvlog and for the duration and have set > log_min_duration_statement to 0 to get *everything*. Fortunately, daily > 80G csvlogs compress nicely. > > For lack of a readable way to paste in the 12 lines of relevant csvlog > into an e-mail, I've uploaded a very small 3K csv file to my web server at > > https://openvistas.net/hansens_error.csv > > The bare bones of the issue involve inserting a row into a table named > load_det, then getting the serial sequence of that table, getting the > last_value of that sequence and then inserting into another table named > cargo_det using that retrieved last_value as the foreign key that ties > this row to the load_det table. The vast majority of these succeed > without issue but not all. > > The csvlog snip shows what I believe are 2 simultaneous but separate > sessions and the session that attempts to insert into the cargo_det > table is not the same session that inserted into the load_det table. > That's what my hunch is but what is unclear to me is if those separate > sessions are also in separate transactions. To me it looks like the INSERT into load_det and into cargo_det are occurring in the same transaction(934281062). The part that would concern me is that: select last_value from load_det_id_seq occurs in different transactions and sessions. From here: https://www.postgresql.org/docs/11/sql-createsequence.html "Also, last_value will reflect the latest value reserved by any session, whether or not it has yet been returned by nextval." Especially as the error is coming from a different transaction(934281063) and session then the INSERTs. I'm guessing that there is cross talk on the sequence number fetch and application to cargo_det. > > csvlog has a couple of columns that I'm unclear about: > session_start_time, virtual_transaction_id. Is session_start_time the > time inside a transaction block, as in beginning with a begin but before > a commit or rollback? Or is it maybe just how long this pgbouncer > session has been connected? virtual_transaction_id is defined in the > docs as backendID/localXID--do separate backendIDs also represent > separate transactions? Is there a better way to determine separate > transactions within csvlog? > > Also, the app code that does this is legacy perl using DBD::Pg but the > original code was written for Informix. We've been in the process of > moving off informix for a while now and should be done within the month. > I intend to re-work this to use returning id (available in postgres > since 8.2!) instead of the serial sequence / last_value hack but not > quite there yet. > > Thanks, > > Jeff Ross > > -- Adrian Klaver adrian.klaver@aklaver.com
On 9/16/19 4:07 PM, Adrian Klaver wrote: > On 9/16/19 1:46 PM, Jeff Ross wrote: >> >> The csvlog snip shows what I believe are 2 simultaneous but separate >> sessions and the session that attempts to insert into the cargo_det >> table is not the same session that inserted into the load_det table. >> That's what my hunch is but what is unclear to me is if those separate >> sessions are also in separate transactions. > > To me it looks like the INSERT into load_det and into cargo_det are > occurring in the same transaction(934281062). The part that would > concern me is that: > > select last_value from load_det_id_seq > > occurs in different transactions and sessions. From here: > > https://www.postgresql.org/docs/11/sql-createsequence.html > > "Also, last_value will reflect the latest value reserved by any session, > whether or not it has yet been returned by nextval." > > Especially as the error is coming from a different > transaction(934281063) and session then the INSERTs. I'm guessing that > there is cross talk on the sequence number fetch and application to > cargo_det. > Thank you Adrian--I think my hunch was basically correct then. Now all I need to do is figure out why we have multiple sessions and transactions. Or it might be time to skip ahead and get rid of the last_value query. Jeff
On 9/16/19 3:16 PM, Jeff Ross wrote: > On 9/16/19 4:07 PM, Adrian Klaver wrote: >> On 9/16/19 1:46 PM, Jeff Ross wrote: > >>> >>> The csvlog snip shows what I believe are 2 simultaneous but separate >>> sessions and the session that attempts to insert into the cargo_det >>> table is not the same session that inserted into the load_det table. >>> That's what my hunch is but what is unclear to me is if those >>> separate sessions are also in separate transactions. >> >> To me it looks like the INSERT into load_det and into cargo_det are >> occurring in the same transaction(934281062). The part that would >> concern me is that: >> >> select last_value from load_det_id_seq >> >> occurs in different transactions and sessions. From here: >> >> https://www.postgresql.org/docs/11/sql-createsequence.html >> >> "Also, last_value will reflect the latest value reserved by any >> session, whether or not it has yet been returned by nextval." >> >> Especially as the error is coming from a different >> transaction(934281063) and session then the INSERTs. I'm guessing that >> there is cross talk on the sequence number fetch and application to >> cargo_det. >> > > Thank you Adrian--I think my hunch was basically correct then. Now all > I need to do is figure out why we have multiple sessions and > transactions. Or it might be time to skip ahead and get rid of the > last_value query. Yes, RETURNING makes this sort of thing so much easier it pays to invest the time in making it part of the query. > > Jeff > > -- Adrian Klaver adrian.klaver@aklaver.com