Re: Optimizing pglz compressor

From Amit Kapila
Subject Re: Optimizing pglz compressor
Date
Msg-id 01fa01ce7272$5b359700$11a0c500$@kapila@huawei.com
In response to Re: Optimizing pglz compressor  (Heikki Linnakangas <hlinnakangas@vmware.com>)
Responses Re: Optimizing pglz compressor  (Heikki Linnakangas <hlinnakangas@vmware.com>)
List pgsql-hackers
On Wednesday, June 26, 2013 2:15 AM Heikki Linnakangas wrote:
> On 19.06.2013 14:01, Amit Kapila wrote:
> > Observations
> > --------------
> > 1. For small data, performance is always good with the patch.
> > 2. For random small/large data, performance is good.
> > 3. For medium and large text and same-byte data (3K, 5K text;
> > 10K, 100K, 500K same byte), performance is degraded.
> 
> Wow, that's strange. What platform and CPU did you test on? 

CPU - 4 core 
RAM - 24GB 
OS  - SUSE 11 SP2
Kernel version - 3.0.13

> Are you
> sure you used the same compiler flags with and without the patch?

Yes.

> Can you also try the attached patch, please? It's the same as before,
> but in this version, I didn't replace the prev and next pointers in
> PGLZ_HistEntry struct with int16s. That avoids some table lookups, at
> the expense of using more memory. It's closer to what we have without
> the patch, so maybe that helps on your system.

Yes, it helped a lot on my system.
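
To make the trade-off quoted above concrete, here is a minimal sketch of the two ways of linking history entries. It is not the code from either patch: the struct names (PtrEntry, IdxEntry), the field set, and the helper functions are all hypothetical and only illustrate "pointer links" versus "int16 index links".

```c
#include <stdint.h>
#include <stddef.h>

/* Hypothetical layout A: links stored as pointers (more memory per entry). */
typedef struct PtrEntry
{
    struct PtrEntry *next;      /* next entry in the chain */
    struct PtrEntry *prev;      /* previous entry in the chain */
    int16_t     hindex;         /* hash bucket this entry belongs to */
    const char *pos;            /* input position the entry refers to */
} PtrEntry;

/* Hypothetical layout B: links stored as int16 indexes into an entry array. */
typedef struct IdxEntry
{
    int16_t     next;           /* index of next entry in the chain */
    int16_t     prev;           /* index of previous entry in the chain */
    int16_t     hindex;         /* hash bucket this entry belongs to */
    const char *pos;            /* input position the entry refers to */
} IdxEntry;

/* Layout A: following a link is a single load of the stored pointer. */
static const PtrEntry *
ptr_next(const PtrEntry *e)
{
    return e->next;
}

/* Layout B: following a link needs an extra lookup in the entry table. */
static const IdxEntry *
idx_next(const IdxEntry *table, const IdxEntry *e)
{
    return &table[e->next];
}
```

With layout B each entry is no larger than with layout A (and usually smaller on 64-bit systems), but every chain step goes through the table base address, which is the kind of extra lookup the pointer version avoids.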

Head:

     testname      |   auto
-------------------+-----------
 5k text           |  3499.888
 512b text         |  1425.106
 256b text         |  1769.126
 1K text           |  1378.151
 3K text           |  4081.254
 2k random         |  -410.928
 100k random       |   -10.235
 500k random       |    -2.094
 512b random       |  -770.665
 256b random       | -1120.173
 1K random         |  -570.351
 10k of same byte  |  3602.610
 100k of same byte | 36187.863
 500k of same byte | 26055.472
 

After your Patch (pglz-variable-size-hash-table-2.patch)
     testname      |   auto
-------------------+-----------
 5k text           |  3524.306
 512b text         |   954.962
 256b text         |   832.527
 1K text           |  1273.970
 3K text           |  3963.329
 2k random         |  -300.516
 100k random       |    -7.538
 500k random       |    -1.525
 512b random       |  -439.726
 256b random       |  -440.154
 1K random         |  -391.070
 10k of same byte  |  3570.921
 100k of same byte | 37498.502
 500k of same byte | 26904.426
 

There was a minor problem in your patch: it crashed in one of the experiments.
The fix is not to access the 0th history entry in function pglz_find_match();
the modified patch is attached.
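
As a rough illustration of that kind of guard (not the attached patch itself), the sketch below assumes that slot 0 of the entry array is reserved to mean "no entry" and that each hash bucket stores an int16 index of its first entry. The names INVALID_ENTRY, HistEntry, chain_head() and chain_length() are hypothetical.

```c
#include <stdint.h>
#include <stddef.h>

#define INVALID_ENTRY 0         /* assumption: slot 0 never holds real data */

/* Hypothetical, simplified history entry. */
typedef struct HistEntry
{
    struct HistEntry *next;     /* next entry in the same chain, or NULL */
    const char *pos;            /* input position the entry refers to */
} HistEntry;

/*
 * Return the first entry of the chain for bucket 'h', or NULL when the
 * bucket is empty.  The index is checked before it is turned into a
 * pointer, so entries[0] is never touched.
 */
static HistEntry *
chain_head(HistEntry *entries, const int16_t *hist_start, int h)
{
    int16_t     idx = hist_start[h];

    if (idx == INVALID_ENTRY)
        return NULL;
    return &entries[idx];
}

/* Walk one chain and count its entries; an empty bucket yields 0. */
static int
chain_length(HistEntry *entries, const int16_t *hist_start, int h)
{
    int         n = 0;

    for (HistEntry *e = chain_head(entries, hist_start, h); e != NULL; e = e->next)
        n++;
    return n;
}
```

Without the check in chain_head(), an empty bucket would hand back &entries[0], and a match loop would then read whatever happens to be in the reserved slot.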

After the fix, the readings are almost the same:

     testname      |   auto
-------------------+-----------
 5k text           |  3347.961
 512b text         |   938.442
 256b text         |   834.496
 1K text           |  1243.618
 3K text           |  3790.835
 2k random         |  -306.470
 100k random       |    -7.589
 500k random       |    -1.517
 512b random       |  -442.649
 256b random       |  -438.781
 1K random         |  -392.106
 10k of same byte  |  3565.449
 100k of same byte | 37355.595
 500k of same byte | 26776.076
 



I guess some of the difference might be due to the different way of
accessing entries in pglz_hist_add().


With Regards,
Amit Kapila.
