> The partial index is highly leveraged. If every tuple in the
> table is updated once, that amounts to every tuple in the index
> being updated 25,000 times.
>
> How so? That sounds like O(n_2) behaviour.
>
> If the table has 5 million rows while the index has 200 (active) rows at
> any given time, then to update every row in the table to null and back
> again would be 100% turn over of the table. But each such change would
> lead to an addition and then a deletion from the index. So 100%
> turnover of the table would be a 5 million / 200 = 25,000 fold turn of
> the index.
Sorry, I was being dense. I misread that as:
"every time a single tuple in the table is updated, the entire index
(every row) is updated".
Yes, of course your explanation makes sense.
>
> There is some code that allows a btree index entry to get killed (and so
> the slot to be reused) without any vacuum, if a scan follows that entry
> and finds the corresponding tuple in the table no longer visible to
> anyone. I have not examined this code, and don't know whether it is
> doing its job but just isn't enough to prevent the bloat, or if for some
> reason it is not applicable to your situation.
>
It looks like my solution is going to be a REINDEX invoked from cron, or
maybe just every 100k inserts.
In terms of trying to improve this behaviour for other PG users in the
future, are there any more diagnostics I can do for you? Having found a
special case, I'd like to help permanently resolve it if I can.
Thanks very much again.
Best wishes,
Richard