Discussion: Reduce the memcpy call from SearchCatCache
Hi,

Here are the oprofile results of pgbench.

CPU: P4 / Xeon with 2 hyper-threads, speed 2793.55 MHz (estimated)
Counted GLOBAL_POWER_EVENTS events with a unit mask of 0x01 (mandatory) count 100000

samples   %        app name        symbol name
134521    6.8312   ipmi_si         (no symbols)
 94515    4.7996   vmlinux         schedule
 52609    2.6716   postgres        AllocSetAlloc
 39659    2.0140   postgres        base_yyparse
 34605    1.7573   vmlinux         mwait_idle
 33234    1.6877   vmlinux         _spin_lock
 31353    1.5922   libc-2.3.4.so   memcpy

I think performance would improve if the call frequency of memcpy were
reduced, so I measured where postgres uses memcpy. (The test program is
pgbench -t 4000.)

 count    total-size   avg-size   caller
------------------------------------------------------------------------
636185    111968560        176    catcache.c:1129
 68236     18436197        270    xlog.c:947
  3909     13822874       3536    xlog.c:940
 20003      3520528        176    catcache.c:1376
 56010      2071477         36    pgstat.c:2288
125524      1902864         15    dynahash.c:948
 20001      1760088         88    setrefs.c:205

catcache.c:1129 is the memcpy in SearchCatCache, and catcache.c:1376 is
the memcpy in SearchCatCacheList:

    memcpy(cur_skey, cache->cc_skey, sizeof(cur_skey));

The attached patch reduces the memcpy calls from SearchCatCache and
SearchCatCacheList by using cache->cc_skey directly when searching the
hash table.

Here is the effect of the patch.

original:
Counted GLOBAL_POWER_EVENTS events
samples   %        app name        symbol name
 31353    1.5922   libc-2.3.4.so   memcpy

patched:
samples   %        app name        symbol name
 20629    1.0684   libc-2.3.4.so   memcpy

---
Atsushi Ogawa

*** ./src/backend/utils/cache/catcache.c.orig  2009-07-06 22:06:52.000000000 +0900
--- ./src/backend/utils/cache/catcache.c       2009-07-06 13:51:48.000000000 +0900
***************
*** 1124,1140 ****
      /*
       * initialize the search key information
       */
!     memcpy(cur_skey, cache->cc_skey, sizeof(cur_skey));
!     cur_skey[0].sk_argument = v1;
!     cur_skey[1].sk_argument = v2;
!     cur_skey[2].sk_argument = v3;
!     cur_skey[3].sk_argument = v4;

      /*
       * find the hash bucket in which to look for the tuple
       */
!     hashValue = CatalogCacheComputeHashValue(cache, cache->cc_nkeys, cur_skey);
      hashIndex = HASH_INDEX(hashValue, cache->cc_nbuckets);

      /*
--- 1124,1141 ----
      /*
       * initialize the search key information
+      * use cache->cc_skey directly in looking for hash table
       */
!     cache->cc_skey[0].sk_argument = v1;
!     cache->cc_skey[1].sk_argument = v2;
!     cache->cc_skey[2].sk_argument = v3;
!     cache->cc_skey[3].sk_argument = v4;

      /*
       * find the hash bucket in which to look for the tuple
       */
!     hashValue = CatalogCacheComputeHashValue(cache, cache->cc_nkeys,
!                                              cache->cc_skey);
      hashIndex = HASH_INDEX(hashValue, cache->cc_nbuckets);

      /*
***************
*** 1160,1166 ****
              HeapKeyTest(&ct->tuple,
                          cache->cc_tupdesc,
                          cache->cc_nkeys,
!                         cur_skey,
                          res);
              if (!res)
                  continue;
--- 1161,1167 ----
              HeapKeyTest(&ct->tuple,
                          cache->cc_tupdesc,
                          cache->cc_nkeys,
!                         cache->cc_skey,
                          res);
              if (!res)
                  continue;
***************
*** 1222,1227 ****
--- 1223,1234 ----
       */
      relation = heap_open(cache->cc_reloid, AccessShareLock);

+     /*
+      * We need to copy the ScanKey data, because systable_beginscan
+      * changes the ScanKey data.
+      */
+     memcpy(cur_skey, cache->cc_skey, sizeof(cur_skey));
+
      scandesc = systable_beginscan(relation,
                                    cache->cc_indexoid,
                                    IndexScanOK(cache, cur_skey),
***************
*** 1371,1389 ****
      /*
       * initialize the search key information
       */
!     memcpy(cur_skey, cache->cc_skey, sizeof(cur_skey));
!     cur_skey[0].sk_argument = v1;
!     cur_skey[1].sk_argument = v2;
!     cur_skey[2].sk_argument = v3;
!     cur_skey[3].sk_argument = v4;

      /*
       * compute a hash value of the given keys for faster search.  We don't
       * presently divide the CatCList items into buckets, but this still lets
       * us skip non-matching items quickly most of the time.
       */
!     lHashValue = CatalogCacheComputeHashValue(cache, nkeys, cur_skey);

      /*
       * scan the items until we find a match or exhaust our list
--- 1378,1396 ----
      /*
       * initialize the search key information
+      * use cache->cc_skey directly in looking for hash table
       */
!     cache->cc_skey[0].sk_argument = v1;
!     cache->cc_skey[1].sk_argument = v2;
!     cache->cc_skey[2].sk_argument = v3;
!     cache->cc_skey[3].sk_argument = v4;

      /*
       * compute a hash value of the given keys for faster search.  We don't
       * presently divide the CatCList items into buckets, but this still lets
       * us skip non-matching items quickly most of the time.
       */
!     lHashValue = CatalogCacheComputeHashValue(cache, nkeys, cache->cc_skey);

      /*
       * scan the items until we find a match or exhaust our list
***************
*** 1410,1416 ****
              HeapKeyTest(&cl->tuple,
                          cache->cc_tupdesc,
                          nkeys,
!                         cur_skey,
                          res);
              if (!res)
                  continue;
--- 1417,1423 ----
              HeapKeyTest(&cl->tuple,
                          cache->cc_tupdesc,
                          nkeys,
!                         cache->cc_skey,
                          res);
              if (!res)
                  continue;
***************
*** 1460,1465 ****
--- 1467,1478 ----
      relation = heap_open(cache->cc_reloid, AccessShareLock);

+     /*
+      * We need to copy the ScanKey data, because systable_beginscan
+      * changes the ScanKey data.
+      */
+     memcpy(cur_skey, cache->cc_skey, sizeof(cur_skey));
+
      scandesc = systable_beginscan(relation,
                                    cache->cc_indexoid,
                                    true,
Atsushi Ogawa <a_ogawa@hi-ho.ne.jp> writes:
> The attached patch reduces the memcpy calls from SearchCatCache
> and SearchCatCacheList by using cache->cc_skey directly when
> searching the hash table.

How much did you test this patch?  I'm fairly sure it will break
things.  There are cases where cache lookups happen recursively.

			regards, tom lane
Tom Lane writes:
> Atsushi Ogawa <a_ogawa@hi-ho.ne.jp> writes:
>> The attached patch reduces the memcpy calls from SearchCatCache
>> and SearchCatCacheList by using cache->cc_skey directly when
>> searching the hash table.
>
> How much did you test this patch?  I'm fairly sure it will break
> things.  There are cases where cache lookups happen recursively.

I ran the regression tests and pgbench, but I did not consider the
recursive case. I have revised the patch to be safe for recursive
calls; however, I cannot find a test case in which a recursive call
happens.

In my understanding, a recursive call in SearchCatCache does not happen
while searching the hash table; it happens while reading the relation.
If cache->cc_skey is copied before the relation is read, I think it is
safe.

best regards,

---
Atsushi Ogawa

*** ./src/backend/utils/cache/catcache.c.orig  2009-07-07 15:19:56.000000000 +0900
--- ./src/backend/utils/cache/catcache.c       2009-07-07 15:19:46.000000000 +0900
***************
*** 1124,1140 ****
      /*
       * initialize the search key information
       */
!     memcpy(cur_skey, cache->cc_skey, sizeof(cur_skey));
!     cur_skey[0].sk_argument = v1;
!     cur_skey[1].sk_argument = v2;
!     cur_skey[2].sk_argument = v3;
!     cur_skey[3].sk_argument = v4;

      /*
       * find the hash bucket in which to look for the tuple
       */
!     hashValue = CatalogCacheComputeHashValue(cache, cache->cc_nkeys, cur_skey);
      hashIndex = HASH_INDEX(hashValue, cache->cc_nbuckets);

      /*
--- 1124,1141 ----
      /*
       * initialize the search key information
+      * directly use cache->cc_skey while looking for hash table
       */
!     cache->cc_skey[0].sk_argument = v1;
!     cache->cc_skey[1].sk_argument = v2;
!     cache->cc_skey[2].sk_argument = v3;
!     cache->cc_skey[3].sk_argument = v4;

      /*
       * find the hash bucket in which to look for the tuple
       */
!     hashValue = CatalogCacheComputeHashValue(cache, cache->cc_nkeys,
!                                              cache->cc_skey);
      hashIndex = HASH_INDEX(hashValue, cache->cc_nbuckets);

      /*
***************
*** 1160,1166 ****
              HeapKeyTest(&ct->tuple,
                          cache->cc_tupdesc,
                          cache->cc_nkeys,
!                         cur_skey,
                          res);
              if (!res)
                  continue;
--- 1161,1167 ----
              HeapKeyTest(&ct->tuple,
                          cache->cc_tupdesc,
                          cache->cc_nkeys,
!                         cache->cc_skey,
                          res);
              if (!res)
                  continue;
***************
*** 1206,1211 ****
--- 1207,1218 ----
      }

+     /*
+      * We need to copy the ScanKey data, because a recursive cache
+      * lookup is possible.
+      */
+     memcpy(cur_skey, cache->cc_skey, sizeof(cur_skey));
+
      /*
       * Tuple was not found in cache, so we have to try to retrieve it directly
       * from the relation.  If found, we will add it to the cache; if not
       * found, we will add a negative cache entry instead.
***************
*** 1371,1389 ****
      /*
       * initialize the search key information
       */
!     memcpy(cur_skey, cache->cc_skey, sizeof(cur_skey));
!     cur_skey[0].sk_argument = v1;
!     cur_skey[1].sk_argument = v2;
!     cur_skey[2].sk_argument = v3;
!     cur_skey[3].sk_argument = v4;

      /*
       * compute a hash value of the given keys for faster search.  We don't
       * presently divide the CatCList items into buckets, but this still lets
       * us skip non-matching items quickly most of the time.
       */
!     lHashValue = CatalogCacheComputeHashValue(cache, nkeys, cur_skey);

      /*
       * scan the items until we find a match or exhaust our list
--- 1378,1396 ----
      /*
       * initialize the search key information
+      * directly use cache->cc_skey while looking for hash table
       */
!     cache->cc_skey[0].sk_argument = v1;
!     cache->cc_skey[1].sk_argument = v2;
!     cache->cc_skey[2].sk_argument = v3;
!     cache->cc_skey[3].sk_argument = v4;

      /*
       * compute a hash value of the given keys for faster search.  We don't
       * presently divide the CatCList items into buckets, but this still lets
       * us skip non-matching items quickly most of the time.
       */
!     lHashValue = CatalogCacheComputeHashValue(cache, nkeys, cache->cc_skey);

      /*
       * scan the items until we find a match or exhaust our list
***************
*** 1410,1416 ****
              HeapKeyTest(&cl->tuple,
                          cache->cc_tupdesc,
                          nkeys,
!                         cur_skey,
                          res);
              if (!res)
                  continue;
--- 1417,1423 ----
              HeapKeyTest(&cl->tuple,
                          cache->cc_tupdesc,
                          nkeys,
!                         cache->cc_skey,
                          res);
              if (!res)
                  continue;
***************
*** 1440,1445 ****
--- 1447,1458 ----
      }

+     /*
+      * We need to copy the ScanKey data, because a recursive cache
+      * lookup is possible.
+      */
+     memcpy(cur_skey, cache->cc_skey, sizeof(cur_skey));
+
      /*
       * List was not found in cache, so we have to build it by reading the
       * relation.  For each matching tuple found in the relation, use an
       * existing cache entry if possible, else build a new one.
Atsushi Ogawa <a_ogawa@hi-ho.ne.jp> writes:
> Tom Lane writes:
>> There are cases where cache lookups happen recursively.

> I ran the regression tests and pgbench, but I did not consider the
> recursive case. I have revised the patch to be safe for recursive
> calls; however, I cannot find a test case in which a recursive call
> happens.

Try turning on CLOBBER_CACHE_ALWAYS or CLOBBER_CACHE_RECURSIVELY to get
a demonstration of what can happen under the right conditions.

I think the only really safe way to do what you propose would be to
refactor the ScanKey API to separate the datum values and is-null flags
from the more static parts of the data structure.  That would be a
pretty large/invasive patch, and the numbers cited here don't seem to me
to justify the work.  It's even possible that it could end up being a
net performance loss, due to having to pass around more pointers :-(

			regards, tom lane