[PATCH] LLVM tuple deforming improvements
| From | Pierre Ducroquet |
|---|---|
| Subject | [PATCH] LLVM tuple deforming improvements |
| Date | |
| Msg-id | 2033438.sjni0gc1MV@peanuts2 |
| Replies | Re: [PATCH] LLVM tuple deforming improvements |
| List | pgsql-hackers |
Hi,

As reported in the "effect of JIT tuple deform?" thread, JIT tuple deforming causes slowdowns in some cases. I've played with the generated code and with the LLVM optimizer to try to fix that issue. Here are the results of my experiments, with the corresponding patches.

All performance measurements follow the test from
https://www.postgresql.org/message-id/CAFj8pRAOcSXNnykfH=M6mNaHo+g=FaUs=DLDZsOHdJbKujRFSg@mail.gmail.com

Base measurements (optimization time in parentheses):

No JIT: 850 ms
JIT without tuple deforming: 820 ms (0.2 ms optimizing)
JIT with tuple deforming, no opt: 1650 ms (1.5 ms)
JIT with tuple deforming, -O3: 770 ms (105 ms)

1) force -O1 when deforming

This is by far the best result I managed to get. With -O1, the queries are even faster than with -O3, since the optimizer runs faster while still generating efficient code. I have tried adding the right passes to the pass manager, but it looks like the interesting ones are not available unless you enable -O1.

JIT with tuple deforming, -O1: 725 ms (54 ms)

2) improve the LLVM IR code

The code generator in llvmjit-deform.c currently relies on the LLVM optimizer to do the right thing. For instance, it can generate a lot of empty blocks containing only a jump. If we don't want to enable the LLVM optimizer for all generated code, we have to get rid of this kind of pattern. The attached patch does that. When the optimizer is not used, this gives a boost of a few cycles, nothing impressive. I have tried to get closer to the optimized bitcode, but that requires building phi nodes manually instead of using alloca, and it isn't enough to bring us to the performance level of -O1.

JIT with tuple deforming, no opt: 1560 ms (1.5 ms)

3) *experimental*: faster non-NULL handling

Currently, the generated code always looks at the tuple header bitfield to check each field's null-ness, and afterwards ANDs the result with the hasnulls bit. Checking only hasnulls improves performance when most tuples contain no nulls, but taxes performance when nulls are found. I have not yet succeeded in implementing it, but I think that, using the statistics collected for a given table, we could apply this only when we know we may benefit from it.

JIT with tuple deforming, no opt: 1520 ms (1.5 ms)
JIT with tuple deforming, -O1: 690 ms (54 ms)
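
To make the trade-off in 3) concrete, here is a minimal sketch in plain C. It is not the code from the patch and does not use PostgreSQL's real tuple layout: `ToyTuple`, `toy_deform_checked` and `toy_deform_fast` are hypothetical names, and the toy tuple holds only fixed-width `int32` columns. The only convention borrowed from PostgreSQL is that a set bit in the null bitmap means the attribute is present.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define TOY_MAX_ATTS 8          /* hypothetical limit: bitmap fits in one byte */

/* Hypothetical, simplified tuple with fixed-width int32 columns only. */
typedef struct ToyTuple
{
    bool    has_nulls;          /* stands in for the header's hasnulls bit */
    uint8_t null_bitmap;        /* bit i set => attribute i is NOT null */
    int     natts;              /* number of attributes */
    int32_t data[TOY_MAX_ATTS]; /* non-null values, packed in order */
} ToyTuple;

/*
 * Per-attribute check: read the bitmap bit for every attribute, then
 * AND the result with has_nulls, as the message describes.
 */
static void
toy_deform_checked(const ToyTuple *tup, int32_t *values, bool *isnull)
{
    int next = 0;               /* index of the next packed value */

    assert(tup->natts <= TOY_MAX_ATTS);
    for (int i = 0; i < tup->natts; i++)
    {
        bool bit_says_null = !((tup->null_bitmap >> i) & 1);

        isnull[i] = bit_says_null & tup->has_nulls;
        values[i] = isnull[i] ? 0 : tup->data[next++];
    }
}

/*
 * hasnulls-first variant: one test up front, then a loop with no
 * bitmap access when the tuple has no nulls. Tuples that do contain
 * nulls pay for the extra branch before falling back to the checked path.
 */
static void
toy_deform_fast(const ToyTuple *tup, int32_t *values, bool *isnull)
{
    assert(tup->natts <= TOY_MAX_ATTS);
    if (!tup->has_nulls)
    {
        for (int i = 0; i < tup->natts; i++)
        {
            isnull[i] = false;
            values[i] = tup->data[i];
        }
        return;
    }
    toy_deform_checked(tup, values, isnull);
}
```

Both functions produce the same `values` and `isnull` arrays for any tuple; they differ only in how much work is done per attribute, which is why the benefit depends on how often nulls actually occur in the table.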