Обсуждение: Followup: vacuum'ing toast
Thanks folks for the quick replies. 1. There is one transaction, connected from the JVM, that is showing "IDLE in transaction" .... this appears to be a leftover from Hibernate looking at the schema metadata. It's Apache Jackrabbit, not our own code: hyper9test_1_6=# select c.relname, l.* from pg_class c, pg_locks l where c.relfilenode=l.relation and l.pid in (select procpid from pg_stat_activity where current_query='<IDLE> in transaction'); relname | locktype | database | relation | page | tuple | virtualxid | transactionid | classid | objid | objsubid | virtualtransaction | pid | mode | granted ----------------------------+----------+----------+----------+------+-------+------------+---------------+---------+-------+----------+--------------------+------+-----------------+--------- pg_class_oid_index | relation | 280066 | 2662 | | | | | | | | 3/18 | 8069 | AccessShareLock | t pg_class_relname_nsp_index | relation | 280066 | 2663 | | | | | | | | 3/18 | 8069 | AccessShareLock | t pg_description_o_c_o_index | relation | 280066 | 2675 | | | | | | | | 3/18 | 8069 | AccessShareLock | t pg_namespace_nspname_index | relation | 280066 | 2684 | | | | | | | | 3/18 | 8069 | AccessShareLock | t pg_namespace_oid_index | relation | 280066 | 2685 | | | | | | | | 3/18 | 8069 | AccessShareLock | t pg_class | relation | 280066 | 1259 | | | | | | | | 3/18 | 8069 | AccessShareLock | t pg_description | relation | 280066 | 2609 | | | | | | | | 3/18 | 8069 | AccessShareLock | t pg_namespace | relation | 280066 | 2615 | | | | | | | | 3/18 | 8069 | AccessShareLock | t version_node | relation | 280066 | 493309 | | | | | | | | 3/18 | 8069 | AccessShareLock | t version_node_idx | relation | 280066 | 493315 | | | | | | | | 3/18 | 8069 | AccessShareLock | t (10 rows) Since the Jackrabbit tables are in the same namespace / user / schema as ours, am I right in thinking that this is effectively blocking the entire auto-vaccum system from doing anything at all? Cheers Dave
Dave Crooke wrote: > Since the Jackrabbit tables are in the same namespace / user / schema > as ours, am I right in thinking that this is effectively blocking the > entire auto-vaccum system from doing anything at all? > Yes, but the problem is actually broader than that: it wouldn't matter if it was a different user or namespace, the impact would still be the same. PostgreSQL gets rid of needing to hold a bunch of table/row locks by using an approach called MVCC: http://www.postgresql.org/docs/8.4/static/mvcc-intro.html The biggest downside of that approach is that if you have an old client lingering around, things that happened in the database after it started can't be cleaned up. That client might still be referring to the old copy of that data, so that anything it looks at will be a consistent snapshot that includes the earlier version of the rows, the database is paranoid about letting VACUUM clean the things you've deleted up. In 8.4 this situation is improved for some common use cases. In the 8.3 you're using, an old transaction will block any VACUUM attempt from moving past that point in time forever. You have to figure out how to get Hibernate to close the transaction it's leaving open for VACUUM to work. -- Greg Smith greg@2ndquadrant.com Baltimore, MD
Greg Smith wrote: > The biggest downside of [MVCC] is that if you have an old client > lingering around, things that happened in the database after it started > can't be cleaned up. Just to clarify for readers: Idle clients aren't generally an issue. It's only clients that are idle with an open transaction that tend to cause issues. > In 8.4 this situation is improved for some common use cases. In the 8.3 > you're using, an old transaction will block any VACUUM attempt from > moving past that point in time forever. You have to figure out how to > get Hibernate to close the transaction it's leaving open for VACUUM to > work. Hibernate is pretty well behaved with transaction management. In fact, it's downright nuts about keeping transactions open for as short a period of time as possible. It even implements its own row-versioning based optimistic locking scheme (oplock) rather than relying on holding a transaction open with row locks in the database. If you have connections left idle in transaction by a Hibernate-based Java app, the problem is probably: 1) Unclosed sessions / EntityManagers or explicit transactions in your own app code. Check particularly for places where the app may open a transaction without a finally clause on a try block to ensure the transaction (and the Session / EntityManager) are closed when the block is exited. 2) Connections being returned to the connection pool with open transactions ( probably due to #1 ). The connection pool should take care of that, but reports suggest that some don't. 3) Autocommit being disabled. At least when using Hibernate via JPA, that'll cause a major mess and would easily explain the issues you're seeing. Hibernate manages transactions explicitly when required, and expects autocommit to be off. 3) Your connection pool software doing something crazy like intentionally keeping idle connections with transactions open. The connection pool (c3p0 or whatever) that you use is separate from Hibernate. I'd be surprised to see this except if autocommit was disabled and the pooling software expected/assumed it'd be enabled. -- Craig Ringe
> >3) Autocommit being disabled. At least when using Hibernate via JPA, >that'll cause a major mess and would easily explain the issues you're >seeing. Hibernate manages transactions explicitly when required, and >expects autocommit to be off. Excuse me but i don't understand this point. You say that the problem happens if Autocommit is disabled but that hibernatesexpects that Autocommit is disabled for a correct work. What's better then? Autocommit off for Hibernate or AutocommitOn for the original poster problem. Can you explain it more? Thanks >-- >Craig Ringe -------------------------------- Eduardo Morrás González Dept. I+D+i e-Crime Vigilancia Digital S21sec Labs Tlf: +34 902 222 521 Móvil: +34 555 555 555 www.s21sec.com, blog.s21sec.com Salvo que se indique lo contrario, esta información es CONFIDENCIAL y contiene datos de carácter personal que han de ser tratados conforme a la legislación vigente en materia de protección de datos. Si usted no es destinatario original de este mensaje, le comunicamos que no está autorizado a revisar, reenviar, distribuir, copiar o imprimir la información en él contenida y le rogamos que proceda a borrarlo de sus sistemas. Kontrakoa adierazi ezean, posta elektroniko honen barruan doana ISILPEKO informazioa da eta izaera pertsonaleko datuak dituenez, indarrean dagoen datu pertsonalak babesteko legediaren arabera tratatu beharrekoa. Posta honen hartzaile ez zaren kasuan, jakinarazten dizugu baimenik ez duzula bertan dagoen informazioa aztertu, igorri, banatu, kopiatu edo inprimatzeko. Hortaz, erregutzen dizugu posta hau zure sistemetatik berehala ezabatzea. Antes de imprimir este mensaje valora si verdaderamente es necesario. De esta forma contribuimos a la preservación del Medio Ambiente.
Eduardo Morras wrote: >> 3) Autocommit being disabled. At least when using Hibernate via JPA, >> that'll cause a major mess and would easily explain the issues you're >> seeing. Hibernate manages transactions explicitly when required, and >> expects autocommit to be off. > > Excuse me but i don't understand this point. You say that the problem happens if Autocommit is disabled but that hibernatesexpects that Autocommit is disabled for a correct work. What's better then? Autocommit off for Hibernate or AutocommitOn for the original poster problem. Can you explain it more? Argh! I'm really sorry. I meant that Hibernate expects autocommit to be _enabled_. However, when I want back to the documentation to double-check my understanding: http://docs.jboss.org/hibernate/core/3.3/reference/en/html/session-configuration.html it reads: "hibernate.connection.autocommit: Enables autocommit for JDBC pooled connections (it is not recommended). e.g. true | false" ... so now I'm confused too. I *know* I had issues initially when I disabled autocommit explicitly (particularly with explicit transaction management), and that everything works well in my app with autocommit off via the hibernate.connection.autocommit param, but that appears to conflict with the documentation. Maybe it's different when hibernate is used via the JPA APIs as I'm using it? Thanks for checking that. I guess now I'm confused too, but that's better than "knowing" something wrong. -- Craig Ringer
Craig Ringer wrote: > Eduardo Morras wrote: >>> 3) Autocommit being disabled. At least when using Hibernate via JPA, >>> that'll cause a major mess and would easily explain the issues you're >>> seeing. Hibernate manages transactions explicitly when required, and >>> expects autocommit to be off. >> Excuse me but i don't understand this point. You say that the problem happens if Autocommit is disabled but that hibernatesexpects that Autocommit is disabled for a correct work. What's better then? Autocommit off for Hibernate or AutocommitOn for the original poster problem. Can you explain it more? > > Argh! > > I'm really sorry. I meant that Hibernate expects autocommit to be > _enabled_. Some searching suggests that, indeed, the issue is that when using Hibernate via JPA (Hibernate EntityManager) autocommit needs to be left enabled. For some reason there doesn't seem to be any explicit reference to autocommit in the Hibernate EntityManager docs or the EJB3 spec. I can't find anything but (numerous) forum posts and the like on this, so don't take it as definitive. -- Craig Ringer
Craig Ringer wrote: > Hibernate is pretty well behaved with transaction management. In fact, > it's downright nuts about keeping transactions open for as short a > period of time as possible. It even implements its own row-versioning > based optimistic locking scheme (oplock) rather than relying on holding > a transaction open with row locks in the database. > It's probably more nuts than it needs to be with PostgreSQL as the backing store, since MVCC prevents some of the common sources of row locks from being needed. But since Hibernate is database-agnostic and it worried about locally cached copies of things too, it ends up needing to do this extra work regardless. > 3) Autocommit being disabled. At least when using Hibernate via JPA, > that'll cause a major mess and would easily explain the issues you're > seeing. Hibernate manages transactions explicitly when required, and > expects autocommit to be off. > Downthread it suggests there's still some confusion here, but everyone should be clear about one thing: turning autocommit on is the first step down a road that usually leads to bad batch performance. If your problems go away by enabling it, which they sometimes do, that is a sign there's a problem to be investigated, not a true solution. One day you're going to find yourself wanting transactions to be explicitly committed only when required, both for atomicity and performance reasons, and you won't be able to rely on autocommit as a crutch at that point. Better to never get used to be there in the first place. -- Greg Smith greg@2ndquadrant.com Baltimore, MD
On 5/11/2009 10:28 PM, Greg Smith wrote: > Craig Ringer wrote: >> Hibernate is pretty well behaved with transaction management. In fact, >> it's downright nuts about keeping transactions open for as short a >> period of time as possible. It even implements its own row-versioning >> based optimistic locking scheme (oplock) rather than relying on holding >> a transaction open with row locks in the database. >> > It's probably more nuts than it needs to be with PostgreSQL as the > backing store, since MVCC prevents some of the common sources of row > locks from being needed. I'm not sure about that personally. Much of the work it does is to avoid holding an update lock on a row during "user think time". Instead of stopping another transaction from jumping in between reading a record and writing an updated copy, it detects when another transaction has got in the way and aborts the loser of the race, which will usually retry in some way. This issue applies just as much to PostgreSQL as any other database, and is very hard to avoid if your problem forces you to write code that reads a record, updates it in memory, then writes it back to the DB instead of doing an in-place read-and-update. That means that, as in SERIALIZABLE transactions, UPDATEs with hibernate can fail and may need to be retried. On the other hand, it means that transactions aren't blocked by a lock held by another transaction during long periods of user inactivity. It's the difference between: BEGIN; SELECT val1, val2 FROM blah WHERE id = 1 FOR UPDATE; -- User ponders for half an hour before applying a change -- Meanwhile, another transaction that has to update the same record -- is blocked, and can't continue on to do other work. As it also holds -- update locks on other records, if you're unlucky or the app's data -- is highly interdependent then half the app lands up waiting for the -- user to get back from lunch. UPDATE blah SET val1 = something, val2 = somethingelse WHERE id = 1; COMMIT; and: BEGIN; SELECT val1, val2, version FROM blah WHERE id = 1; COMMIT; -- User ponders for half an hour before applying a change. Meanwhile, -- someone else who hasn't gone for lunch updates the record, -- incrementing the `version' field as well as tweaking the data fields. BEGIN; UPDATE blah SET val1 = something, val2 = somethingelse WHERE id = 1, version = oldversion; -- As rows matched, Hibernate knows the record has been deleted -- or someone else updated it in the mean time. It aborts the -- change by the until recently out-to-lunch user and the app informs -- the user that -- someone else has altered the record, so they'll have to check -- if they still need to make their changes and possibly re-apply them. -- (Or, if appropriate, the app it merges the two change sets and -- auto-retries). ROLLBACK; Getting these two strategies to play well together in a DB used by "optimistic locking" row-versioned users like Hibernate as well as apps using conventional SQL DB locking isn't hard, by the way. I wrote something up on it recently: http://wiki.postgresql.org/wiki/Hibernate_oplocks > Downthread it suggests there's still some confusion here, but everyone > should be clear about one thing: turning autocommit on is the first > step down a road that usually leads to bad batch performance. Normally I'd be in complete agreement with you. Batching things into transactions not only improves performance, but it's necessary for correctness unless much of what you're doing is pretty trivial. The distinction here is that the ORM framework expects to manage autocommit settings on the JDBC connection its self. In the case of use of Hibernate via JPA, Hibernate will almost always have autocommit disabled when doing work. It's just that the JPA implementation appears to expect to receive connections with autocommit initially enabled, and gets somewhat confused if that's not the case. -- Craig Ringer