Re: Parallel copy
From | vignesh C |
---|---|
Subject | Re: Parallel copy |
Date | |
Msg-id | CALDaNm2dYgE0g9n3rGyw_v=-0zucUdkR7c_9rr9=Dj=SfPx9PA@mail.gmail.com |
In reply to | Re: Parallel copy (Amit Kapila <amit.kapila16@gmail.com>) |
List | pgsql-hackers |
On Thu, Oct 15, 2020 at 2:39 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
>
> On Wed, Oct 14, 2020 at 6:51 PM vignesh C <vignesh21@gmail.com> wrote:
> >
> > On Fri, Oct 9, 2020 at 11:01 AM Amit Kapila <amit.kapila16@gmail.com> wrote:
> > >
> > > I am not able to properly parse the data but If understand the wal
> > > data for non-parallel (1116 | 0 | 3587203) and parallel (1119
> > > | 6 | 3624405) case doesn't seem to be the same. Is that
> > > right? If so, why? Please ensure that no checkpoint happens for both
> > > cases.
> > >
> >
> > I have disabled checkpoint, the results with the checkpoint disabled
> > are given below:
> >                          | wal_records | wal_fpi | wal_bytes
> > Sequential Copy          | 1116        | 0       | 3587669
> > Parallel Copy(1 worker)  | 1116        | 0       | 3587669
> > Parallel Copy(4 worker)  | 1121        | 0       | 3587668
> > I noticed that for 1 worker wal_records & wal_bytes are same as
> > sequential copy, but with different worker count I had noticed that
> > there is difference in wal_records & wal_bytes, I think the difference
> > should be ok because with more than 1 worker the order of records
> > processed will be different based on which worker picks which records
> > to process from input file. In the case of sequential copy/1 worker
> > the order in which the records will be processed is always in the same
> > order hence wal_bytes are the same.
> >
>
> Are all records of the same size in your test? If so, then why the
> order should matter? Also, even the number of wal_records has
> increased but wal_bytes are not increased, rather it is one-byte less.
> Can we identify what is going on here? I don't intend to say that it
> is a problem but we should know the reason clearly.

The earlier run that I executed was with varying record size.
The results below are from a run with the records modified so that they
are all the same size:

                          | wal_records | wal_fpi | wal_bytes
Sequential Copy           | 1307        | 0       | 4198526
Parallel Copy(1 worker)   | 1307        | 0       | 4198526
Parallel Copy(2 worker)   | 1308        | 0       | 4198836
Parallel Copy(4 worker)   | 1307        | 0       | 4199147
Parallel Copy(8 worker)   | 1312        | 0       | 4199735
Parallel Copy(16 worker)  | 1313        | 0       | 4200311

There is still some difference in wal_records & wal_bytes. I think the
difference is because of the following:

Each worker prepares 1000 tuples and then calls heap_multi_insert for
those 1000 tuples. In our case approximately 185 tuples fit in one page,
so 925 tuples are stored in 5 WAL records and the remaining 75 tuples are
stored in the next WAL record. The WAL dump looks like this:

rmgr: Heap2 len (rec/tot): 3750/ 3750, tx: 510, lsn: 0/0160EC80, prev 0/0160DDB0, desc: MULTI_INSERT+INIT 185 tuples flags 0x00, blkref #0: rel 1663/13751/16384 blk 0
rmgr: Heap2 len (rec/tot): 3750/ 3750, tx: 510, lsn: 0/0160FB28, prev 0/0160EC80, desc: MULTI_INSERT+INIT 185 tuples flags 0x00, blkref #0: rel 1663/13751/16384 blk 1
rmgr: Heap2 len (rec/tot): 3750/ 3750, tx: 510, lsn: 0/016109E8, prev 0/0160FB28, desc: MULTI_INSERT+INIT 185 tuples flags 0x00, blkref #0: rel 1663/13751/16384 blk 2
rmgr: Heap2 len (rec/tot): 3750/ 3750, tx: 510, lsn: 0/01611890, prev 0/016109E8, desc: MULTI_INSERT+INIT 185 tuples flags 0x00, blkref #0: rel 1663/13751/16384 blk 3
rmgr: Heap2 len (rec/tot): 3750/ 3750, tx: 510, lsn: 0/01612750, prev 0/01611890, desc: MULTI_INSERT+INIT 185 tuples flags 0x00, blkref #0: rel 1663/13751/16384 blk 4
rmgr: Heap2 len (rec/tot): 1550/ 1550, tx: 510, lsn: 0/016135F8, prev 0/01612750, desc: MULTI_INSERT+INIT 75 tuples flags 0x02, blkref #0: rel 1663/13751/16384 blk 5

After the first 1000 tuples are inserted, when the worker inserts the
next 1000 tuples it first uses the last page, which still has free space
for 110 more tuples:
rmgr: Heap2 len (rec/tot): 2470/ 2470, tx: 510, lsn: 0/01613C08, prev 0/016135F8, desc: MULTI_INSERT 110 tuples flags 0x00, blkref #0: rel 1663/13751/16384 blk 5
rmgr: Heap2 len (rec/tot): 3750/ 3750, tx: 510, lsn: 0/016145C8, prev 0/01613C08, desc: MULTI_INSERT+INIT 185 tuples flags 0x00, blkref #0: rel 1663/13751/16384 blk 6
rmgr: Heap2 len (rec/tot): 3750/ 3750, tx: 510, lsn: 0/01615470, prev 0/016145C8, desc: MULTI_INSERT+INIT 185 tuples flags 0x00, blkref #0: rel 1663/13751/16384 blk 7
rmgr: Heap2 len (rec/tot): 3750/ 3750, tx: 510, lsn: 0/01616330, prev 0/01615470, desc: MULTI_INSERT+INIT 185 tuples flags 0x00, blkref #0: rel 1663/13751/16384 blk 8
rmgr: Heap2 len (rec/tot): 3750/ 3750, tx: 510, lsn: 0/016171D8, prev 0/01616330, desc: MULTI_INSERT+INIT 185 tuples flags 0x00, blkref #0: rel 1663/13751/16384 blk 9
rmgr: Heap2 len (rec/tot): 3050/ 3050, tx: 510, lsn: 0/01618098, prev 0/016171D8, desc: MULTI_INSERT+INIT 150 tuples flags 0x02, blkref #0: rel 1663/13751/16384 blk 10

This behavior is the same for sequential copy and for copy with 1 worker,
because the sequence of inserts and the pages used for them are in the
same order.

These 2 reasons together result in the varying wal_bytes & wal_records
with multiple workers:
1) When more than 1 worker is involved, the order in which pages are
selected is not guaranteed, so the MULTI_INSERT tuple count varies and the
MULTI_INSERT/MULTI_INSERT+INIT description varies.
2) wal_records increases with the number of workers because, when the
tuples are split across the workers, one of the workers ends up with a few
more WAL records: the last heap_multi_insert gets split across the workers
and generates new WAL records like:

rmgr: Heap2 len (rec/tot): 600/ 600, tx: 510, lsn: 0/019F8B08, prev 0/019F7C48, desc: MULTI_INSERT 25 tuples flags 0x00, blkref #0: rel 1663/13751/16384 blk 1065

Attached is the tar of the WAL file dump that was used for the analysis.

Regards,
Vignesh
EnterpriseDB: http://www.enterprisedb.com
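
PS: the per-batch arithmetic above can be checked with a small model. This is only a rough sketch in Python, not code from the patch. The function name is made up for illustration, and it assumes fixed-size tuples, exactly 185 tuples per page, batches of 1000, a single writer filling pages in order, and one WAL record per (batch, page) pair. The real heap_multi_insert decides page fill from actual free space, so treat the numbers as an approximation of the sequential / 1 worker case only.

```python
def wal_record_tuple_counts(batches, batch_size=1000, tuples_per_page=185):
    """Return, for each batch, the tuple count of every MULTI_INSERT record.

    Hypothetical simplified model: one WAL record per page touched by a
    batch, with a single writer that fills pages strictly in order.
    """
    free_in_last_page = 0  # free tuple slots left on the most recently used page
    result = []
    for _ in range(batches):
        remaining = batch_size
        records = []
        while remaining > 0:
            if free_in_last_page == 0:
                # Last page is full: start a fresh page (MULTI_INSERT+INIT).
                free_in_last_page = tuples_per_page
            n = min(remaining, free_in_last_page)
            records.append(n)
            remaining -= n
            free_in_last_page -= n
        result.append(records)
    return result


first, second = wal_record_tuple_counts(2)
print(first)   # [185, 185, 185, 185, 185, 75]
print(second)  # [110, 185, 185, 185, 185, 150]
```

The two lists match the tuple counts in the two WAL dump excerpts above. With several workers each one runs its own batches against different pages, so the partially filled records fall in different places; that part is not modelled here.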