On Mon, May 27, 2013 at 5:59 AM, Christopher Browne <cbbrowne@gmail.com> wrote:
> This situation falls from a problem that we noticed a mighty long time ago
> in Slony, where the set of XIDs outstanding gets very large, and, attendant
> to that, the set of "action id" values by which tuples are being filtered,
> gets correspondingly large.
>
> It happens when there is a long pause in application of replication data,
> and is commonly the consequence of setting up replication on a very large
> data table that takes a long time for the initial data copy.
>
> At the time, Neil Conway observed this query breakage with a query that was
> roughly 640K in size, from whence fell jokes to the effect, "who would ever
> need a query larger than 640K?"
>
> The resolution that I introduced at the time was to write a little parser
> that would recognize sequences of adjacent values and merge them into
> "BETWEEN A and B" clauses, which would bring the query size back to a
> reasonable size.
PgQ uses simpler optimization to keep IN list size down -
it aggressively enlarges the main xid range and later processes
rows with txid_is_visible_in_snapshot():
https://github.com/markokr/skytools/blob/master/sql/pgq/functions/pgq.batch_event_sql.sql
IOW - it assumes the open-xid distribution is not uniformly random.
This additional optimization was ignored when pgq long-tx
approach was first imported to slony:
http://lists.slony.info/pipermail/slony1-general/2007-July/006238.html
I guess the reason was to have minimal patch.
You might want to play with that now.
--
marko