Обсуждение: Dubugging an intermittent foreign key insert error with csvlog
Hi all, I've been debugging an intermittent foreign key insert error on our single database / multi-tenant server. To help isolate traffic by tenant, I've switched to using csvlog and for the duration and have set log_min_duration_statement to 0 to get *everything*. Fortunately, daily 80G csvlogs compress nicely. For lack of a readable way to paste in the 12 lines of relevant csvlog into an e-mail, I've uploaded a very small 3K csv file to my web server at https://openvistas.net/hansens_error.csv The bare bones of the issue involve inserting a row into a table named load_det, then getting the serial sequence of that table, getting the last_value of that sequence and then inserting into another table named cargo_det using that retrieved last_value as the foreign key that ties this row to the load_det table. The vast majority of these succeed without issue but not all. The csvlog snip shows what I believe are 2 simultaneous but separate sessions and the session that attempts to insert into the cargo_det table is not the same session that inserted into the load_det table. That's what my hunch is but what is unclear to me is if those separate sessions are also in separate transactions. csvlog has a couple of columns that I'm unclear about: session_start_time, virtual_transaction_id. Is session_start_time the time inside a transaction block, as in beginning with a begin but before a commit or rollback? Or is it maybe just how long this pgbouncer session has been connected? virtual_transaction_id is defined in the docs as backendID/localXID--do separate backendIDs also represent separate transactions? Is there a better way to determine separate transactions within csvlog? Also, the app code that does this is legacy perl using DBD::Pg but the original code was written for Informix. We've been in the process of moving off informix for a while now and should be done within the month. I intend to re-work this to use returning id (available in postgres since 8.2!) instead of the serial sequence / last_value hack but not quite there yet. Thanks, Jeff Ross
On 9/16/19 1:46 PM, Jeff Ross wrote: > Hi all, > > I've been debugging an intermittent foreign key insert error on our > single database / multi-tenant server. To help isolate traffic by > tenant, I've switched to using csvlog and for the duration and have set > log_min_duration_statement to 0 to get *everything*. Fortunately, daily > 80G csvlogs compress nicely. > > For lack of a readable way to paste in the 12 lines of relevant csvlog > into an e-mail, I've uploaded a very small 3K csv file to my web server at > > https://openvistas.net/hansens_error.csv > > The bare bones of the issue involve inserting a row into a table named > load_det, then getting the serial sequence of that table, getting the > last_value of that sequence and then inserting into another table named > cargo_det using that retrieved last_value as the foreign key that ties > this row to the load_det table. The vast majority of these succeed > without issue but not all. > > The csvlog snip shows what I believe are 2 simultaneous but separate > sessions and the session that attempts to insert into the cargo_det > table is not the same session that inserted into the load_det table. > That's what my hunch is but what is unclear to me is if those separate > sessions are also in separate transactions. To me it looks like the INSERT into load_det and into cargo_det are occurring in the same transaction(934281062). The part that would concern me is that: select last_value from load_det_id_seq occurs in different transactions and sessions. From here: https://www.postgresql.org/docs/11/sql-createsequence.html "Also, last_value will reflect the latest value reserved by any session, whether or not it has yet been returned by nextval." Especially as the error is coming from a different transaction(934281063) and session then the INSERTs. I'm guessing that there is cross talk on the sequence number fetch and application to cargo_det. > > csvlog has a couple of columns that I'm unclear about: > session_start_time, virtual_transaction_id. Is session_start_time the > time inside a transaction block, as in beginning with a begin but before > a commit or rollback? Or is it maybe just how long this pgbouncer > session has been connected? virtual_transaction_id is defined in the > docs as backendID/localXID--do separate backendIDs also represent > separate transactions? Is there a better way to determine separate > transactions within csvlog? > > Also, the app code that does this is legacy perl using DBD::Pg but the > original code was written for Informix. We've been in the process of > moving off informix for a while now and should be done within the month. > I intend to re-work this to use returning id (available in postgres > since 8.2!) instead of the serial sequence / last_value hack but not > quite there yet. > > Thanks, > > Jeff Ross > > -- Adrian Klaver adrian.klaver@aklaver.com
On 9/16/19 4:07 PM, Adrian Klaver wrote: > On 9/16/19 1:46 PM, Jeff Ross wrote: >> >> The csvlog snip shows what I believe are 2 simultaneous but separate >> sessions and the session that attempts to insert into the cargo_det >> table is not the same session that inserted into the load_det table. >> That's what my hunch is but what is unclear to me is if those separate >> sessions are also in separate transactions. > > To me it looks like the INSERT into load_det and into cargo_det are > occurring in the same transaction(934281062). The part that would > concern me is that: > > select last_value from load_det_id_seq > > occurs in different transactions and sessions. From here: > > https://www.postgresql.org/docs/11/sql-createsequence.html > > "Also, last_value will reflect the latest value reserved by any session, > whether or not it has yet been returned by nextval." > > Especially as the error is coming from a different > transaction(934281063) and session then the INSERTs. I'm guessing that > there is cross talk on the sequence number fetch and application to > cargo_det. > Thank you Adrian--I think my hunch was basically correct then. Now all I need to do is figure out why we have multiple sessions and transactions. Or it might be time to skip ahead and get rid of the last_value query. Jeff
On 9/16/19 3:16 PM, Jeff Ross wrote: > On 9/16/19 4:07 PM, Adrian Klaver wrote: >> On 9/16/19 1:46 PM, Jeff Ross wrote: > >>> >>> The csvlog snip shows what I believe are 2 simultaneous but separate >>> sessions and the session that attempts to insert into the cargo_det >>> table is not the same session that inserted into the load_det table. >>> That's what my hunch is but what is unclear to me is if those >>> separate sessions are also in separate transactions. >> >> To me it looks like the INSERT into load_det and into cargo_det are >> occurring in the same transaction(934281062). The part that would >> concern me is that: >> >> select last_value from load_det_id_seq >> >> occurs in different transactions and sessions. From here: >> >> https://www.postgresql.org/docs/11/sql-createsequence.html >> >> "Also, last_value will reflect the latest value reserved by any >> session, whether or not it has yet been returned by nextval." >> >> Especially as the error is coming from a different >> transaction(934281063) and session then the INSERTs. I'm guessing that >> there is cross talk on the sequence number fetch and application to >> cargo_det. >> > > Thank you Adrian--I think my hunch was basically correct then. Now all > I need to do is figure out why we have multiple sessions and > transactions. Or it might be time to skip ahead and get rid of the > last_value query. Yes, RETURNING makes this sort of thing so much easier it pays to invest the time in making it part of the query. > > Jeff > > -- Adrian Klaver adrian.klaver@aklaver.com