Re: Process local hint bit cache
| From | Merlin Moncure |
|---|---|
| Subject | Re: Process local hint bit cache |
| Date | |
| Msg-id | AANLkTimpUbs=bCHGN9oCV9oug55b5DUw4DaCuAXwXivM@mail.gmail.com |
| In reply to | Re: Process local hint bit cache (Merlin Moncure <mmoncure@gmail.com>) |
| Responses | Re: Process local hint bit cache (Merlin Moncure <mmoncure@gmail.com>) |
| List | pgsql-hackers |
On Thu, Mar 31, 2011 at 5:33 PM, Merlin Moncure <mmoncure@gmail.com> wrote:
> On Wed, Mar 30, 2011 at 2:35 PM, Merlin Moncure <mmoncure@gmail.com> wrote:
>> On Wed, Mar 30, 2011 at 11:23 AM, Heikki Linnakangas
>> <heikki.linnakangas@enterprisedb.com> wrote:
>>> On 30.03.2011 18:02, Robert Haas wrote:
>>>> On Wed, Mar 30, 2011 at 10:40 AM, Greg Stark <gsstark@mit.edu> wrote:
>>>>> But one way or another the hint bits have to get set sometime. The
>>>>> sooner that happens the less clog i/o has to happen in the meantime.
>>>>
>>>> I talked about this with Merlin a bit yesterday. I think that his
>>>> thought is that most transactions will access a small enough number of
>>>> distinct CLOG pages, and the cache accesses might be fast enough, that
>>>> we really wouldn't need to get the hint bits set, or at least that
>>>> vacuum/freeze time would be soon enough. I'm not sure if that's
>>>> actually true, though - I think the overhead of the cache might be
>>>> higher than he's imagining. However, there's a sure-fire way to find
>>>> out... code it up and see how it plays.
>>>
>>> I did a little experiment: I hacked SetHintBits() to do nothing, so that
>>> hint bits are never set. I then created a table with 100000 rows in it:
>>>
>>> postgres=# \d foo
>>>      Table "public.foo"
>>>  Column |  Type   | Modifiers
>>> --------+---------+-----------
>>>  a      | integer |
>>>
>>> postgres=# INSERT INTO foo SELECT generate_series(1, 100000);
>>> INSERT 0 100000
>>>
>>> And ran queries on the table:
>>>
>>> postgres=# do $$
>>> declare
>>>   i int4;
>>> begin
>>>   loop
>>>     perform COUNT(*) FROM foo;
>>>   end loop;
>>> end;
>>> $$;
>>>
>>> This test case is designed to exercise the visibility tests as much as
>>> possible. However, all the tuples are loaded in one transaction, so the
>>> one-element cache in TransactionLogFetch is 100% effective.
>>>
>>> I ran oprofile on that. It shows that about 15% of the time is spent in
>>> HeapTupleSatisfiesMVCC and its subroutines.
>>> 6.6% is spent in
>>> HeapTupleSatisfiesMVCC itself. Here's the breakdown of that:
>>>
>>> $ opreport -c --global-percent
>>>
>>> CPU: Intel Architectural Perfmon, speed 2266 MHz (estimated)
>>> Counted CPU_CLK_UNHALTED events (Clock cycles when not halted) with a unit
>>> mask of 0x00 (No unit mask) count 100000
>>> samples  %        app name  symbol name
>>> ...
>>> -------------------------------------------------------------------------------
>>>   2143   0.4419  postgres  postgres  heapgettup_pagemode
>>>  73277  15.1091  postgres  postgres  heapgetpage
>>>  31858   6.5688  postgres  postgres  HeapTupleSatisfiesMVCC
>>>    31858   6.5688  postgres  postgres  HeapTupleSatisfiesMVCC [self]
>>>    12809   2.6411  postgres  postgres  TransactionIdIsInProgress
>>>    12091   2.4931  postgres  postgres  XidInMVCCSnapshot
>>>     7150   1.4743  postgres  postgres  TransactionIdIsCurrentTransactionId
>>>     7056   1.4549  postgres  postgres  TransactionIdDidCommit
>>>     1839   0.3792  postgres  postgres  TransactionIdPrecedes
>>>     1467   0.3025  postgres  postgres  SetHintBits
>>>     1155   0.2382  postgres  postgres  TransactionLogFetch
>>> -------------------------------------------------------------------------------
>>> ...
>>>
>>> I then ran the same test again with an unpatched version, to set the hint
>>> bits. After the hint bits were set, I ran oprofile again:
>>>
>>> -------------------------------------------------------------------------------
>>>   275  0.4986  postgres  heapgettup_pagemode
>>>  4459  8.0851  postgres  heapgetpage
>>>  3005  5.4487  postgres  HeapTupleSatisfiesMVCC
>>>    3005  5.4487  postgres  HeapTupleSatisfiesMVCC [self]
>>>    1620  2.9374  postgres  XidInMVCCSnapshot
>>>     110  0.1995  postgres  TransactionIdPrecedes
>>> -------------------------------------------------------------------------------
>>>
>>> So with hint bits set, HeapTupleSatisfiesMVCC accounts for 8% of the total
>>> CPU time, and without hint bits, 15%.
>>>
>>> Even if clog access was free, hint bits still give a significant speedup
>>> thanks to skipping all the other overhead like TransactionIdIsInProgress()
>>> and TransactionIdIsCurrentTransactionId(). Speeding up clog access is
>>> important; when the one-element cache isn't saving you, the clog access
>>> becomes a dominant factor. But you have to address that other overhead too
>>> before you can get rid of hint bits.
>>
>> Here is a patch demonstrating the caching action, but without the
>> cache table, which isn't done yet -- it only works using the very last
>> transaction id fetched. I used macros so I could keep the changes
>> quite localized.
>
> While playing with the data structure to implement an xid cache inside
> TransactionLogFetch, I learned a nasty lesson.
> HeapTupleSatisfiesMVCC is super sensitive to any changes, and even
> jumping through a couple of calls to TransactionLogFetch was
> measurable in a couple of performance test cases I came up with. I
> hauntingly recalled Tom's warnings to beware unexpected performance
> downsides.
>
> In short, regardless of implementation, I unfortunately don't think
> any caching at or under the clog abstraction is going to work without
> impacting at least some cases.
>
> OTOH, maybe some micro-optimization of HeapTupleSatisfiesMVCC could
> provide the same benefit I'm going after here. If you save off the
> last xid that came back committed in a static variable, you can
> check it at the same level as if it were a hint bit. From my
> testing, this is well and truly 'free' in the sense that it can
> defer/eliminate clog checks and hint bit i/o.
>
> The patch attached gives all the advantages I was after in the

hm, this isn't quite right -- if you get a 'last xid cache hit', the
tuple's hint bits should be set if they are not already (but the page
should not be dirtied). Working on expanding the cache to # xid > 1.

merlin