B-Tree contention?

Поиск
Список
Период
Сортировка
От Alan Li
Тема B-Tree contention?
Дата
Msg-id bc34668d0906071830s2961662am68d59cb521fec209@mail.gmail.com
обсуждение исходный текст
Список pgsql-hackers
Hello,<br /><br />I have multiple files with that have very similar distributions and I'm seeing contention when
concurrentCOPY's are happening against a table with a b-tree index on the timestamp column.  Each file look something
likethe following:<br /><br /><div style="margin-left: 40px;">~4M rows with timestamp1<br />~4M rows with timestamp2<br
/>...<br/></div><br />To simulate the problem, I have a single file that has:<br /><br /><div style="margin-left:
40px;">aint, t timestamp<br /> [1-4000000],2007-01-01 00:00:01<br />[1-4000000], 2007-01-01 00:00:02<br /></div><br
/>Themachine is an eight-core system w/ 16gb of memory.  postgresql.conf is configured with:<br /><br /><div
style="margin-left:40px;">shared_buffers = 256MB     #256MB w/ one COPY, 512MB w/ two COPY's<br /> wal_buffers =
16MB<br/>checkpoint_segments = 100<br /></div><br />When I COPY a single file into the table without an index on the
timecolumn, it takes:<br /><br /><div style="margin-left: 40px;">real    0m19.628s<br />user    0m0.001s<br /> sys    
0m0.002s<br/></div><br />When I have two COPY's with the same file concurrently into the same table without an index on
thetime column, it takes:<br /><br /><div style="margin-left: 40px;">real    0m39.933s<br />user    0m0.001s<br />
sys    0m0.000s<br /><br />real    0m41.294s<br />user    0m0.000s<br />sys     0m0.003s<br /></div><br />So it takes
roughlytwice the time and we're not getting any increase in throughput for the parallel COPY's.  The top LWLock
contentionsshows that it's the WALInsertLock.<br /><br /><div style="margin-left: 40px;">postgres=# select * from
lwlocksorder by blk desc limit 20;<br />  pid  | lockid | share |  write  |  blk   <br
/>-------+--------+-------+---------+--------<br/> 14065 |      7 |     0 | 8000001 | 311884<br />  14062 |      7
|    0 | 8000001 | 289561<br /> 14062 |     45 |     0 |   89783 |   1287<br /> 14065 |     45 |     0 |   90104 |  
1207<br/> 14062 |      8 |     0 |     652 |     10<br /> 14062 |     17 |     0 |   86469 |      7<br />  14065 |   
370|     0 |       7 |      6<br /> 14062 |    370 |     0 |  251466 |      6<br /> 14065 |      8 |     0 |     660
|     4<br /> 14065 |     29 | 46010 |    5337 |      3<br /> 14065 |     17 |     0 |   86467 |      3<br />  14062
|    35 |  8830 |    5279 |      1<br /> 14062 |     40 |  6735 |    5139 |      1<br /> 14062 |     37 | 12860 |   
5304|      1<br /> 14062 |     29 | 46037 |    5387 |      1<br /> 14065 |    521 |     0 |       1 |      1<br />
 14065|     28 |  6775 |    5236 |      1<br /> 14065 |     37 | 12831 |    5057 |      1<br /> 14062 |    506 |     0
|   2043 |      1<br /> 14065 |     40 |  6763 |    5251 |      1<br />(20 rows)<br /></div><br />When I COPY a single
fileinto the table with an index on the time column, it takes:<br /><br /><div style="margin-left: 40px;">real   
0m51.486s<br/>user    0m0.000s<br />sys     0m0.002s<br /></div><br />When I have two COPY's with the same file
concurrentlyinto the same table with an index on the time column, it takes:<br /><br /><div style="margin-left:
40px;">real   2m4.414s<br />user    0m0.001s<br />sys     0m0.001s<br /><br />real    2m4.428s<br />user    0m0.000s<br
/>sys    0m0.002s<br /></div><br />So it's taking more than twice the time when the timestamp index was added.  The top
LWLockcontentions shows that there's a bit more contention on the WALInsertLock, but there's additional contention on
twobuffer pages.<br /><br /><div style="margin-left: 40px;">postgres=# select * from lwlocks order by blk desc limit
20;<br/>  pid  | lockid |  share   |  write   |  blk   <br />-------+--------+----------+----------+--------<br
/> 13888|      7 |        0 | 16027716 | 342522<br />  13889 |      7 |        0 | 16027881 | 339595<br /> 13889 |   
400|  4000000 |  4000002 |  72875<br /> 13889 |    388 |  4000001 |  4000000 |  72545<br /> 13888 |    388 |  4000001
| 4000001 |  70554<br /> 13888 |    400 |  4000000 |  4000002 |  68748<br />  13889 |     45 |        0 |    89402 |  
1340<br/> 13888 |     45 |        0 |    89259 |   1290<br /> 13889 |     33 | 12117131 |     6772 |   1102<br /> 13888
|    33 | 12118267 |     7178 |   1054<br /> 13888 |     29 |  4232749 |     7159 |    415<br />  13889 |     29 | 
4233831|     7057 |    353<br /> 13888 |     34 |  4190329 |     6907 |    318<br /> 13889 |     34 |  4191618 |    
6870|    292<br /> 13888 |     38 |  4185303 |     7004 |    283<br /> 13889 |     38 |  4186310 |     7084 |    259<br
/> 13889 |    922 |        0 |      278 |     86<br /> 13889 |    940 |        0 |      207 |     80<br /> 13889 |
102316|        0 |      346 |     72<br /> 13889 |    854 |        0 |      194 |     70<br />(20 rows)<br /></div><br
/>The two hotspots seem to represent the first BTree pages that contain the '2007-01-01 00:00:01' and '2007-01-01
00:00:02'keys.  I suspect it's due to concurrent _bt_doinsert(), which tries to acquire both a read lock (_bt_search)
anda write lock (_bt_findinsertloc) on those first BTree pages.  Does this sound like a reasonable explanation for the
contention?<br/><br />Btw, this is against a 8.4beta2 build as of yesterday, and I'm pretty sure that this happens in
REL8_3_STABLEas well.<br /><br />Thanks, Alan<br /> 

В списке pgsql-hackers по дате отправления:

Предыдущее
От: Robert Haas
Дата:
Сообщение: Re: Partial vacuum versus pg_class.reltuples
Следующее
От: Bruce Momjian
Дата:
Сообщение: Re: pg_migrator issue with contrib