Обсуждение: Performance impact of NULLs and variable length fields
Sometime ago somebody asked if it made a difference adn it was suggested that the effect was probably marginal. I ran a profiler over postgres doing a large query and these are the top 10 functions: % cumulative self self total time seconds seconds calls ms/call ms/call name 16.04 0.51 0.51 1676772 0.00 0.00 nocachegetattr 11.95 0.89 0.38 1427403 0.00 0.00 heapgettup 10.06 1.21 0.32 2955372 0.00 0.00 LockBuffer 6.92 1.43 0.22 3406475 0.00 0.00 ExecEvalExpr 5.03 1.59 0.16 1617018 0.00 0.00 AllocSetReset 4.40 1.73 0.14 1427403 0.00 0.00 heap_getnext 4.09 1.86 0.13 1844339 0.00 0.00 ExecEvalVar 4.09 1.99 0.13 1441330 0.00 0.00 ReleaseBuffer 4.09 2.12 0.13 1427064 0.00 0.00 SeqNext 3.46 2.23 0.11 1552338 0.00 0.00 ExecQual Note: the wall clock time of the query was about 30 seconds. The total cpu time was 3.18 seconds. Now, nocachegetattr is called mostly whenever there are NULLs or variable length strings in the tuple. Since our biggest table begins with a variable length field, every single lookup is going to be uncached. Looks like the effect is significant. Is this ever going to change? Just for interests sake i've also got the results for a plain select sum(x) from y; % cumulative self self total time seconds seconds calls ms/call ms/call name 9.12 0.60 0.60 1423541 0.00 0.00 heapgettup 8.51 1.16 0.56 1423329 0.00 0.00 nocachegetattr 6.08 1.56 0.40 2850372 0.00 0.00 AllocSetAlloc 5.47 1.92 0.36 1423308 0.00 0.00 ExecTargetList 5.02 2.25 0.33 2846619 0.00 0.00 AllocSetReset 4.10 2.52 0.27 1423308 0.00 0.00 heap_formtuple 3.80 2.77 0.25 2846614 0.00 0.00 ExecEvalVar 3.80 3.02 0.25 2 125.00 922.33 ExecAgg 3.65 3.26 0.24 1423310 0.00 0.00 ExecProcNode 3.04 3.46 0.20 1423307 0.00 0.00 advance_transition_function Seems to me that the generality of the aggregate implementation doesn't seem to really be an issue. -- Martijn van Oosterhout <kleptog@svana.org> http://svana.org/kleptog/ > It would be nice if someone came up with a certification system that > actually separated those who can barely regurgitate what they crammed over > the last few weeks from those who command secret ninja networking powers.
Martijn van Oosterhout <kleptog@svana.org> writes: > Sometime ago somebody asked if it made a difference adn it was suggested > that the effect was probably marginal. I ran a profiler over postgres doing > a large query and these are the top 10 functions: This is pretty meaningless when you didn't give one single detail about the test query or conditions. regards, tom lane
Well, Tom pointed out that without the query itself this was fairly meaningless. So here it is: select c.id, sum(d.cost) from c, l, d where c.id = l.cid and l.clid = d.clid and c.accman = '2500' and billid is null group by c.id; EXPLAIN output: Aggregate (cost=51479.55..51507.07 rows=550 width=48) (actual time=26192.07..26222.07 rows=156 loops=1) -> Group (cost=51479.55..51493.31 rows=5503 width=48) (actual time=26165.81..26186.14 rows=2029 loops=156) -> Sort (cost=51479.55..51479.55 rows=5503 width=48) (actual time=26164.90..26167.77 rows=2029 loops=1) -> Hash Join (cost=676.79..51137.62 rows=5503 width=48) (actual time=3703.82..26122.23 rows=2029 loops=1) -> Seq Scan on d (cost=0.00..50150.07 rows=32262 width=12) (actual time=67.35..24921.42 rows=41045loops=1) -> Hash (cost=671.31..671.31 rows=2193 width=36) (actual time=594.55..594.55 rows=0 loops=1) -> Merge Join (cost=260.50..671.31 rows=2193 width=36) (actual time=251.95..551.40 rows=1330loops=1) -> Index Scan using l_cid on l (cost=0.00..364.36 rows=6308 width=20) (actual time=33.95..340.52rows=6307 loops=1) -> Sort (cost=260.50..260.50 rows=1305 width=16) (actual time=142.85..145.34 rows=1337loops=1) -> Seq Scan on c (cost=0.00..192.94 rows=1305 width=16) (actual time=14.99..103.40rows=1314 loops=1) d has over 1.4 million rows, l about 10,000 and c about 2,000. NULL columns and variable length fields used extensivly. On Sat, Jul 21, 2001 at 05:25:38PM +1000, Martijn van Oosterhout wrote: > Sometime ago somebody asked if it made a difference adn it was suggested > that the effect was probably marginal. I ran a profiler over postgres doing > a large query and these are the top 10 functions: > > % cumulative self self total > time seconds seconds calls ms/call ms/call name > 16.04 0.51 0.51 1676772 0.00 0.00 nocachegetattr > 11.95 0.89 0.38 1427403 0.00 0.00 heapgettup > 10.06 1.21 0.32 2955372 0.00 0.00 LockBuffer > 6.92 1.43 0.22 3406475 0.00 0.00 ExecEvalExpr > 5.03 1.59 0.16 1617018 0.00 0.00 AllocSetReset > 4.40 1.73 0.14 1427403 0.00 0.00 heap_getnext > 4.09 1.86 0.13 1844339 0.00 0.00 ExecEvalVar > 4.09 1.99 0.13 1441330 0.00 0.00 ReleaseBuffer > 4.09 2.12 0.13 1427064 0.00 0.00 SeqNext > 3.46 2.23 0.11 1552338 0.00 0.00 ExecQual -- Martijn van Oosterhout <kleptog@svana.org> http://svana.org/kleptog/ > It would be nice if someone came up with a certification system that > actually separated those who can barely regurgitate what they crammed over > the last few weeks from those who command secret ninja networking powers.
Martijn van Oosterhout wrote: > Sometime ago somebody asked if it made a difference adn it was suggested > that the effect was probably marginal. I ran a profiler over postgres doing > a large query and these are the top 10 functions: > > % cumulative self self total > time seconds seconds calls ms/call ms/call name > 16.04 0.51 0.51 1676772 0.00 0.00 nocachegetattr > 11.95 0.89 0.38 1427403 0.00 0.00 heapgettup > 10.06 1.21 0.32 2955372 0.00 0.00 LockBuffer > 6.92 1.43 0.22 3406475 0.00 0.00 ExecEvalExpr > 5.03 1.59 0.16 1617018 0.00 0.00 AllocSetReset > 4.40 1.73 0.14 1427403 0.00 0.00 heap_getnext > 4.09 1.86 0.13 1844339 0.00 0.00 ExecEvalVar > 4.09 1.99 0.13 1441330 0.00 0.00 ReleaseBuffer > 4.09 2.12 0.13 1427064 0.00 0.00 SeqNext > 3.46 2.23 0.11 1552338 0.00 0.00 ExecQual I have added this to the TODO list: * Improve caching of attribute offsets when NULLs exist in the row I have some ideas about how offsets with NULL's could be handled in this case. -- Bruce Momjian | http://candle.pha.pa.us pgman@candle.pha.pa.us | (610) 853-3000 + If your life is a hard drive, | 830 Blythe Avenue + Christ can be your backup. | Drexel Hill, Pennsylvania 19026