Proposal: speeding up GIN build with parallel workers
From | Constantin S. Pan |
---|---|
Subject | Proposal: speeding up GIN build with parallel workers |
Date | |
Msg-id | 20160116013839.57cfcb37@thought |
Replies | Re: Proposal: speeding up GIN build with parallel workers; Re: Proposal: speeding up GIN build with parallel workers; Re: [WIP] speeding up GIN build with parallel workers |
List | pgsql-hackers |
Hi, Hackers.

Building a GIN index can take a lot of time and eats 100% CPU, but we could easily make it use more than 100%, especially now that we have parallel workers in Postgres.

The process of building GIN looks like this:

1. Accumulate a batch of index records into an rbtree in maintenance work memory.
2. Dump the batch to disk.
3. Repeat.

I have a draft implementation which divides the whole process between N parallel workers; see the attached patch. Instead of a full scan of the relation, I give each worker a range of blocks to read.

This speeds up the first step N times but slows down the second one, because when multiple workers dump item pointers for the same key, each of them has to read and decode the results of the previous one. That is a huge waste, but there is an idea for how to eliminate it.

When it comes to dumping the next batch, a worker does not do it independently. Instead, it (and every other worker) sends the accumulated index records to the parent (backend) in ascending key order. The backend, which receives the records from the workers through shared memory, can merge them and dump each of them once, without the need to reread the records N-1 times.

In its current state the implementation is just a proof of concept with all the configuration hardcoded, but it already works as is, though the speedup of the build does not exceed 4x on my configuration (12 CPUs). There is also a problem with temporary tables, for which the parallel mode does not work.

Please leave your feedback.

Regards,
Constantin S. Pan
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company
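
To make the merge idea in the message above concrete, here is a small toy model in Python. It is only a sketch under stated assumptions and is not taken from the attached patch: the relation is modelled as a list of blocks, each block as a list of rows, each row as a list of keys, and the names `block_ranges`, `worker_batch` and `leader_merge` are hypothetical. Plain Python lists stand in for the shared-memory transfer between workers and the backend.

```python
import heapq
from itertools import groupby
from operator import itemgetter


def block_ranges(nblocks, nworkers):
    """Split blocks [0, nblocks) into at most nworkers contiguous ranges."""
    if nblocks <= 0:
        return []
    step = -(-nblocks // nworkers)  # ceiling division
    return [(lo, min(lo + step, nblocks)) for lo in range(0, nblocks, step)]


def worker_batch(relation, lo, hi):
    """Step 1 for one worker: accumulate key -> item pointers for blocks
    [lo, hi), then hand the batch over in ascending key order.
    A dict stands in for the rbtree; (block, offset) for an item pointer."""
    acc = {}
    for blkno in range(lo, hi):
        for offset, keys in enumerate(relation[blkno]):
            for key in keys:
                acc.setdefault(key, []).append((blkno, offset))
    return sorted(acc.items())


def leader_merge(streams):
    """Step 2 in the backend: merge the per-worker sorted streams and
    emit every key exactly once with its combined item pointers."""
    merged = heapq.merge(*streams, key=itemgetter(0))
    for key, group in groupby(merged, key=itemgetter(0)):
        tids = sorted(tid for _, posting in group for tid in posting)
        yield key, tids


# Toy relation: 3 blocks, rows are lists of keys (hypothetical data).
relation = [[["a", "b"], ["b"]], [["a"]], [["c", "b"]]]
streams = [worker_batch(relation, lo, hi) for lo, hi in block_ranges(3, 2)]
index = list(leader_merge(streams))
```

In this model the key "b" is seen by both workers, but the backend writes it once with the combined list of item pointers, which is the point of sending batches in ascending key order instead of letting every worker dump independently.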