Discussion: In-Memory Columnar Store
Hello!

I want to announce my implementation of an In-Memory Columnar Store extension for PostgreSQL:

Documentation: http://www.garret.ru/imcs/user_guide.html
Sources: http://www.garret.ru/imcs-1.01.tar.gz

Any feedback, bug reports and suggestions are welcome.

The vertical representation of the data is stored in PostgreSQL shared memory. This is why it is important to be able to utilize all available physical memory. Servers with a terabyte or more of RAM are no longer exotic, especially in the financial world. But Linux with standard 4KB pages imposes a limit on the maximal size of a mapped memory segment: 256GB. It is possible to overcome this limitation either by creating multiple segments - but that requires too many changes in the PostgreSQL memory manager - or by simply setting the MAP_HUGETLB flag (assuming that huge pages were allocated in the system).

I found several messages related to the MAP_HUGETLB flag, the most recent one from November 21:
http://www.postgresql.org/message-id/20131125032920.GA23793@toroid.org

I wonder what is the current status of this patch?
On Mon, Dec 9, 2013 at 1:40 PM, knizhnik <knizhnik@garret.ru> wrote:
> I want to announce my implementation of an In-Memory Columnar Store
> extension for PostgreSQL:
>
> Documentation: http://www.garret.ru/imcs/user_guide.html
> Sources: http://www.garret.ru/imcs-1.01.tar.gz
> [...]
> I wonder what is the current status of this patch?

I looked over your extension. I think it's a pretty amazing example of the postgres extension and type systems -- up there with postgis. Very well done. How long did this take you to write?

The MAP_HUGETLB patch was marked 'returned with feedback':
https://commitfest.postgresql.org/action/patch_view?id=1308. It seems likely to be revived, perhaps in time for 9.4.

Honestly, I think your efforts here provide one more argument for adding huge TLB support.

merlin
On Mon, Dec 09, 2013 at 11:40:41PM +0400, knizhnik wrote:
> I want to announce my implementation of an In-Memory Columnar Store
> extension for PostgreSQL:
>
> Documentation: http://www.garret.ru/imcs/user_guide.html
> Sources: http://www.garret.ru/imcs-1.01.tar.gz
> [...]

Hi,

This is very neat! The question I have, which applies to the matview support as well, is "How can we transparently substitute usage of the in-memory columnar store/matview in a SQL query?".

Regards,
Ken
Hello!

The implementation of IMCS itself took me about two months (including testing and writing documentation). But a large part of the code was previously written by me for other projects, so I reused it. Most of the time was spent on integrating this code with PostgreSQL (I was not so familiar with it before).

Certainly the columnar store implementations for Oracle (Oracle Database In-Memory Option), DB2 (BLU Acceleration), ... are more convenient for users: they can execute normal SQL queries and do not require users to learn new functions and a new approach. But that requires a complete redesign of the query engine (or providing an alternative implementation), which I was not able to do. This is why I try to provide the advantages of vertical data representation (vector operations, parallel execution, data skipping) as well as the advantages of fast access to in-memory data as a standard PostgreSQL extension. There are obviously some limitations, and queries look more complicated than their standard SQL counterparts... But on the other hand, it is possible to write queries which can hardly be expressed in standard SQL. For example, calculating split-adjusted prices cannot be done in SQL without stored procedures.

To make usage of the IMCS functions as simple as possible I defined a large number of operators for the most popular operations. For example, the Volume-Weighted Average Price can be calculated just as:

    select Volume//Close as VWAP from Quote_get();

It is even shorter than the analogous SQL statement:

    select sum(Close*Volume)/sum(Volume) as VWAP from Quote;

Concerning integration with PostgreSQL, there were several problems. Some of them seem to have no easy solution, but others are IMHO imperfections in PostgreSQL which I hope will be fixed some day:

1. Calls in PL/pgSQL are very slow - about 1-2 microseconds on my computer. Just defining a per-row insertion trigger with an empty procedure doubles the time to insert 6 million records - from 7 to 15 seconds. If the trigger procedure is not empty, then the time increases proportionally to the number of performed calls. In my case, inserting the data and propagating it to the columnar store using a trigger takes about 80 seconds. But if I first load the data without triggers into a PostgreSQL table and then insert it into the columnar store using a load function (implemented in C), the total time is 7+9=16 seconds. (A sketch of this comparison follows below.) Certainly I realize that PL/pgSQL is an interpreted language. But, for example, interpreted Python is able to do 100 times more calls per second. Unfortunately the profiler doesn't show any single bottleneck - it looks like the long call time is caused by the overhead of initializing and resetting memory contexts and of copying argument data.

2. Inefficient expansion of composite type columns using the (foo()).* clause. In this case the function foo() is invoked as many times as there are fields in the returned composite type. Even when the call is placed in the FROM list (thanks to lateral joins in 9.3), PostgreSQL still sometimes performs redundant calls, which can be avoided with the hack of adding an "OFFSET 0" clause. (Also sketched below.)

3. The 256GB limit on the size of a shared memory segment on Linux.

Concerning the last problem - I have included in the IMCS distribution a much simpler patch which just sets the MAP_HUGETLB flag when a) it is defined in the system headers and b) the requested memory size is larger than 256GB. In this case, right now, PostgreSQL would simply fail to start.
But certainly it would be more correct to control this flag through a configuration parameter, because large pages minimize MMU overhead and so can increase speed even when the used memory is less than 256GB (this is why Oracle uses them widely).

On 12/11/2013 06:33 PM, Merlin Moncure wrote:
> I looked over your extension. I think it's a pretty amazing example
> of the postgres extension and type systems -- up there with postgis.
> Very well done. How long did this take you to write?
>
> MAP_HUGETLB patch was marked 'returned with feedback'.
> https://commitfest.postgresql.org/action/patch_view?id=1308. It seems
> likely to be revived, perhaps in time for 9.4.
> [...]
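A minimal sketch of the point-1 comparison (the table, trigger and file names here are illustrative, and quote_src_load() is only assumed by analogy with the DbItem_load() call shown later in this thread):

    -- An empty per-row trigger: firing it 6 million times roughly
    -- doubles the load time.
    create table quote_src(symbol char(10), price real, volume int4);
    create function noop_trigger() returns trigger as $$
    begin
        return new;   -- does nothing, yet costs ~1-2 microseconds per call
    end;
    $$ language plpgsql;
    create trigger quote_noop before insert on quote_src
        for each row execute procedure noop_trigger();

    -- The faster path: load the heap table without the trigger, then
    -- push everything into the columnar store in one C-level call.
    alter table quote_src disable trigger quote_noop;
    copy quote_src from '/tmp/quotes.csv' with (format csv);
    select quote_src_load();   -- assumed name, cf. DbItem_load() below

And a sketch of the point-2 workarounds (foo() stands for any function returning a composite type):

    select (foo()).*;    -- foo() is re-evaluated once per output column
    -- Evaluate it once behind an optimization fence:
    select (f).* from (select foo() as f offset 0) s;
    -- Or (9.3+) place the call in the FROM list:
    select f.* from foo() f;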
Hi,

It depends on what you mean by "transparently substitute". If you want to be able to execute standard SQL queries against the columnar store, then it seems to be impossible without rewriting the executor. I provide another approach, based on calling standard functions which operate not on scalar types but on timeseries. For example, instead of the standard SQL

    select sum(ClosePrice) from Quote;

I have to write:

    select cs_sum(ClosePrice) from Quote_get();

It looks similar, but is not quite the same. And for more complex queries the difference is larger. For example, the query

    select sum(score*volenquired)/sum(volenquired) from DbItem
        group by (trader,desk,office);

can be written as:

    select agg_val,cs_cut(group_by,'c22c30c10') from
        (select (cs_project_agg(ss1.*)).* from
            (select (s1).sum/(s2).sum,(s1).groups from DbItem_get() q,
                cs_hash_sum(q.score*q.volenquired,
                            q.trader||q.desk||q.office) s1,
                cs_hash_sum(q.volenquired,
                            q.trader||q.desk||q.office) s2) ss1) ss2;

Looks too complex, doesn't it? But the first two lines are just responsible for the reverse mapping: from the vertical data representation back to normal horizontal tuples. The good thing is that this query executes more than 1000 times faster (with default PostgreSQL configuration parameters, except shared_buffers, which was set large enough to fit all data in memory).

On 12/11/2013 07:14 PM, ktm@rice.edu wrote:
> This is very neat! The question I have, which applies to the matview
> support as well, is "How can we transparently substitute usage of the
> in-memory columnar store/matview in a SQL query?".
>
> Regards,
> Ken
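A partial route to that transparency is possible today without touching the executor: hide the verbose form behind an ordinary view, so that callers see a conventional relation. A minimal sketch, reusing the query above exactly as posted (the view and output column names are assumptions):

    create view dbitem_vwap as
        select agg_val, cs_cut(group_by,'c22c30c10') as grp from
            (select (cs_project_agg(ss1.*)).* from
                (select (s1).sum/(s2).sum,(s1).groups from DbItem_get() q,
                    cs_hash_sum(q.score*q.volenquired,
                                q.trader||q.desk||q.office) s1,
                    cs_hash_sum(q.volenquired,
                                q.trader||q.desk||q.office) s2) ss1) ss2;

    select * from dbitem_vwap;   -- reads like plain SQL from here on

The query text still has to be written once per aggregate, of course; true rewriting of arbitrary SQL is exactly what would require executor changes.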
On Wed, Dec 11, 2013 at 10:08 AM, knizhnik <knizhnik@garret.ru> wrote:
> 1. Calls in PL/pgSQL are very slow - about 1-2 microseconds on my
> computer. Just defining a per-row insertion trigger with an empty
> procedure doubles the time to insert 6 million records - from 7 to 15
> seconds. [...] But if I first load the data without triggers into a
> PostgreSQL table and then insert it into the columnar store using a
> load function (implemented in C), the total time is 7+9=16 seconds.

Yeah. For this problem, we unfortunately either have to try to use standard SQL functions in such a way that supports inlining (this is mostly a black art, and fragile), or move logic out of the function and into the query via things like window functions, or just deal with the performance hit. Postgres-flavored SQL is pretty much the most productive language on the planet AFAIC, but the challenge is always performance, performance. Down the line, I am optimistic that per-call function overhead can be optimized, probably by expanding what can be inlined somehow. The problem is that this requires cooperation from the language executors, which is not currently possible through the SPI interface, so I really don't know.

> 2. Inefficient expansion of composite type columns using the (foo()).*
> clause. In this case the function foo() is invoked as many times as
> there are fields in the returned composite type. [...]

Yeah, this is a long-standing headache. LATERAL mostly deals with this, but most cases (even pre-9.3) can be worked around one way or another.

> 3. The 256GB limit on the size of a shared memory segment on Linux.

I figure this will be solved fairly soon. It's a nice problem to have.

merlin
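For reference, a sketch of the inlining trick: a single-statement LANGUAGE sql function with compatible volatility can be expanded directly into the calling query, so the per-call overhead disappears, while the PL/pgSQL twin pays the interpreter's call cost on every row. (The function bodies are illustrative; Quote's Close and Volume columns are taken from the VWAP example earlier in the thread.)

    -- Candidate for inlining: one SQL expression, declared immutable.
    create function weighted(price float8, vol float8) returns float8
        as $$ select price * vol $$ language sql immutable;

    -- The equivalent PL/pgSQL function cannot be inlined:
    create function weighted_pl(price float8, vol float8) returns float8
        as $$ begin return price * vol; end $$ language plpgsql immutable;

    -- EXPLAIN VERBOSE on this shows the bare expression, not a call:
    select sum(weighted(Close, Volume)) from Quote;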
"ktm@rice.edu" <ktm@rice.edu> wrote: > The question I have, which applies to the matview support as > well, is "How can we transparently substitute usage of the > in-memory columnar store/matview in a SQL query?". My take on that regarding matviews is: (1) It makes no sense to start work on this without a far more sophisticated concept of matview "freshness" (or "staleness", as some products prefer to call it). (2) Work on query rewrite to use sufficiently fresh matviews to optimize the execution of a query and work on "freshness" tracking are orthogonal to work on incremental maintenance. I have no plans to work on either matview freshness or rewrite, as there seems to be several years worth of work to get incremental maintenance up to a level matching other products. I welcome anyone else to take on those other projects. -- Kevin Grittner EDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
<div dir="ltr"><br /><div class="gmail_extra"><br /><br /><div class="gmail_quote">2013/12/9 knizhnik <span dir="ltr"><<ahref="mailto:knizhnik@garret.ru" target="_blank">knizhnik@garret.ru</a>></span><br /><blockquote class="gmail_quote"style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex"> Hello!<br /><br/> I want to annouce my implementation of In-Memory Columnar Store extension for PostgreSQL:<br /><br /> Documentation:<a href="http://www.garret.ru/imcs/user_guide.html" target="_blank">http://www.garret.ru/imcs/<u></u>user_guide.html</a><br/> Sources: <a href="http://www.garret.ru/imcs-1.01.tar.gz"target="_blank">http://www.garret.ru/imcs-1.<u></u>01.tar.gz</a><br /><br />Any feedbacks, bug reports and suggestions are welcome.<br /><br /> Vertical representation of data is stored in PostgreSQLshared memory.<br /> This is why it is important to be able to utilize all available physical memory.<br /> Nowservers with Tb or more RAM are not something exotic, especially in financial world.<br /> But there is limitation inLinux with standard 4kb pages for maximal size of mapped memory segment: 256Gb.<br /> It is possible to overcome thislimitation either by creating multiple segments - but it requires too much changes in PostgreSQL memory manager.<br />Or just set MAP_HUGETLB flag (assuming that huge pages were allocated in the system).<br /><br /> I found several messagesrelated with MAP_HUGETLB flag, the most recent one was from 21 of November:<br /><a href="http://www.postgresql.org/message-id/20131125032920.GA23793@toroid.org" target="_blank">http://www.postgresql.org/<u></u>message-id/20131125032920.<u></u>GA23793@toroid.org</a><br/><br /> I wonderwhat is the current status of this patch?<span class=""><font color="#888888"><br /><br /><br /><br /><br /><br /><br/> -- <br /> Sent via pgsql-hackers mailing list (<a href="mailto:pgsql-hackers@postgresql.org" target="_blank">pgsql-hackers@postgresql.org</a>)<br/> To make changes to your subscription:<br /><a href="http://www.postgresql.org/mailpref/pgsql-hackers" target="_blank">http://www.postgresql.org/<u></u>mailpref/pgsql-hackers</a><br/></font></span></blockquote></div><br /><br/></div><div class="gmail_extra">Hello,<br /></div><div class="gmail_extra"> excellent work! 
I begin to dotesting and it's very fast, by the way I found a strange case of "endless" query with CPU a 100% when the value used asfilter does not exists:<br /><br /></div><div class="gmail_extra">I am testing with postgres 9.3.1 on debian and I useddefault value for the extension except memory ( 512mb )<br /><br /></div><div class="gmail_extra">how to recreate thetest case :<br /></div><div class="gmail_extra"><br /></div><div class="gmail_extra">## create a table :<br /><br /></div><divclass="gmail_extra">create table endless ( col1 int , col2 char(30) , col3 int ) ;<br /><br /></div><div class="gmail_extra">##insert some values:<br /></div><div class="gmail_extra"><br />insert into endless values ( 1, 'ahahahaha',3);<br /><br />insert into endless values ( 2, 'ghghghghg', 4);<br /><br />## create the column store objects:<br/><br />select cs_create('endless','col1','col2');<br /> cs_create <br />-----------<br /> <br />(1 row)<br /><br/></div><div class="gmail_extra">## try and test column store :<br /></div><div class="gmail_extra"><br />select cs_avg(col3)from endless_get('ahahahaha');<br /> cs_avg <br /> --------<br /> 3<br />(1 row)<br /><br />select cs_avg(col3)from endless_get('ghghghghg');<br /> cs_avg <br />--------<br /> 4<br />(1 row)<br /><br /></div><div class="gmail_extra">##now select with a value that does not exist :<br /></div><div class="gmail_extra"><br />select cs_avg(col3)from endless_get('testing');<br /><br /></div><div class="gmail_extra"># and now start to loop on cpu and seemsto never ends , I had to terminate backend<br /></div><div class="gmail_extra"><br /></div><div class="gmail_extra">Bye<br/><br />Mat<br /></div></div>
<div class="moz-cite-prefix">Thank you very much for reporting the problem.<br /> And sorry for this bug and lack of negativetests.<br /><br /> Attempt to access unexisted value cause autoloading of data from the table to columnar store (becauseautoload property is enabled by default) <br /> and as far as this entry is not present in the table, the code fallsinto infinite recursion.<br /> Patched version of IMCS is available at <a class="moz-txt-link-freetext" href="http://www.garret.ru/imcs-1.01.tar.gz">http://www.garret.ru/imcs-1.01.tar.gz</a><br/><br /> I am going to place IMCSunder version control now. Just looking for proper place for repository...<br /><br /><br /> On 12/12/2013 04:06 AM,desmodemone wrote:<br /></div><blockquote cite="mid:CAEs9oFn920CSw_0k+TTa79cdF6zQC+TdHz3xwvbdXmXa_iEMZQ@mail.gmail.com"type="cite"><div dir="ltr"><br /><div class="gmail_extra"><br/><br /><div class="gmail_quote">2013/12/9 knizhnik <span dir="ltr"><<a href="mailto:knizhnik@garret.ru"moz-do-not-send="true" target="_blank">knizhnik@garret.ru</a>></span><br /><blockquoteclass="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">Hello!<br /><br /> I want to annouce my implementation of In-Memory Columnar Store extensionfor PostgreSQL:<br /><br /> Documentation: <a href="http://www.garret.ru/imcs/user_guide.html" moz-do-not-send="true"target="_blank">http://www.garret.ru/imcs/user_guide.html</a><br /> Sources: <a href="http://www.garret.ru/imcs-1.01.tar.gz"moz-do-not-send="true" target="_blank">http://www.garret.ru/imcs-1.01.tar.gz</a><br/><br /> Any feedbacks, bug reports and suggestions are welcome.<br/><br /> Vertical representation of data is stored in PostgreSQL shared memory.<br /> This is why it is importantto be able to utilize all available physical memory.<br /> Now servers with Tb or more RAM are not something exotic,especially in financial world.<br /> But there is limitation in Linux with standard 4kb pages for maximal size ofmapped memory segment: 256Gb.<br /> It is possible to overcome this limitation either by creating multiple segments - butit requires too much changes in PostgreSQL memory manager.<br /> Or just set MAP_HUGETLB flag (assuming that huge pageswere allocated in the system).<br /><br /> I found several messages related with MAP_HUGETLB flag, the most recent onewas from 21 of November:<br /><a href="http://www.postgresql.org/message-id/20131125032920.GA23793@toroid.org" moz-do-not-send="true"target="_blank">http://www.postgresql.org/message-id/20131125032920.GA23793@toroid.org</a><br /><br/> I wonder what is the current status of this patch?<span class=""><font color="#888888"><br /><br /><br /><br /><br/><br /><br /> -- <br /> Sent via pgsql-hackers mailing list (<a href="mailto:pgsql-hackers@postgresql.org" moz-do-not-send="true"target="_blank">pgsql-hackers@postgresql.org</a>)<br /> To make changes to your subscription:<br /><ahref="http://www.postgresql.org/mailpref/pgsql-hackers" moz-do-not-send="true" target="_blank">http://www.postgresql.org/mailpref/pgsql-hackers</a><br/></font></span></blockquote></div><br /><br /></div><divclass="gmail_extra">Hello,<br /></div><div class="gmail_extra"> excellent work! 
I begin to do testingand it's very fast, by the way I found a strange case of "endless" query with CPU a 100% when the value used as filterdoes not exists:<br /><br /></div><div class="gmail_extra">I am testing with postgres 9.3.1 on debian and I used defaultvalue for the extension except memory ( 512mb )<br /><br /></div><div class="gmail_extra">how to recreate the testcase :<br /></div><div class="gmail_extra"><br /></div><div class="gmail_extra">## create a table :<br /><br /></div><divclass="gmail_extra">create table endless ( col1 int , col2 char(30) , col3 int ) ;<br /><br /></div><div class="gmail_extra">##insert some values:<br /></div><div class="gmail_extra"><br /> insert into endless values ( 1, 'ahahahaha',3);<br /><br /> insert into endless values ( 2, 'ghghghghg', 4);<br /><br /> ## create the column store objects:<br/><br /> select cs_create('endless','col1','col2');<br /> cs_create <br /> -----------<br /> <br /> (1 row)<br/><br /></div><div class="gmail_extra">## try and test column store :<br /></div><div class="gmail_extra"><br /> selectcs_avg(col3) from endless_get('ahahahaha');<br /> cs_avg <br /> --------<br /> 3<br /> (1 row)<br /><br />select cs_avg(col3) from endless_get('ghghghghg');<br /> cs_avg <br /> --------<br /> 4<br /> (1 row)<br /><br/></div><div class="gmail_extra">## now select with a value that does not exist :<br /></div><div class="gmail_extra"><br/> select cs_avg(col3) from endless_get('testing');<br /><br /></div><div class="gmail_extra"># andnow start to loop on cpu and seems to never ends , I had to terminate backend<br /></div><div class="gmail_extra"><br/></div><div class="gmail_extra">Bye<br /><br /> Mat<br /></div></div></blockquote><br />
<p dir="ltr">it is interesting idea. For me, a significant information from comparation, so we do some significantly wrong.Memory engine should be faster naturally, but I don't tkink it can be 1000x.<p dir="ltr">Yesterday we did a some tests,that shows so for large tables (5G)a our hashing is not effective. Disabling hash join and using merge join increasedspeed 2x<br /> Dne 9. 12. 2013 20:41 "knizhnik" <<a href="mailto:knizhnik@garret.ru">knizhnik@garret.ru</a>>napsal(a):<br /> ><br /> > Hello!<br /> ><br /> >I want to annouce my implementation of In-Memory Columnar Store extension for PostgreSQL:<br /> ><br /> > Documentation:<a href="http://www.garret.ru/imcs/user_guide.html">http://www.garret.ru/imcs/user_guide.html</a><br /> > Sources: <a href="http://www.garret.ru/imcs-1.01.tar.gz">http://www.garret.ru/imcs-1.01.tar.gz</a><br /> ><br/> > Any feedbacks, bug reports and suggestions are welcome.<br /> ><br /> > Vertical representation of datais stored in PostgreSQL shared memory.<br /> > This is why it is important to be able to utilize all available physicalmemory.<br /> > Now servers with Tb or more RAM are not something exotic, especially in financial world.<br />> But there is limitation in Linux with standard 4kb pages for maximal size of mapped memory segment: 256Gb.<br />> It is possible to overcome this limitation either by creating multiple segments - but it requires too much changesin PostgreSQL memory manager.<br /> > Or just set MAP_HUGETLB flag (assuming that huge pages were allocated inthe system).<br /> ><br /> > I found several messages related with MAP_HUGETLB flag, the most recent one was from21 of November:<br /> > <a href="http://www.postgresql.org/message-id/20131125032920.GA23793@toroid.org">http://www.postgresql.org/message-id/20131125032920.GA23793@toroid.org</a><br />><br /> > I wonder what is the current status of this patch?<br /> ><br /> ><br /> ><br /> ><br /> ><br/> ><br /> > -- <br /> > Sent via pgsql-hackers mailing list (<a href="mailto:pgsql-hackers@postgresql.org">pgsql-hackers@postgresql.org</a>)<br/> > To make changes to your subscription:<br/> > <a href="http://www.postgresql.org/mailpref/pgsql-hackers">http://www.postgresql.org/mailpref/pgsql-hackers</a><br/>
<div class="moz-cite-prefix">On 12/12/2013 11:42 AM, Pavel Stehule wrote:<br /></div><blockquote cite="mid:CAFj8pRDv7dHivPVrCQFT3zz2a7-d9+nm6cwB-mQcxOWtkdrrDg@mail.gmail.com"type="cite"><p dir="ltr">it is interesting idea.For me, a significant information from comparation, so we do some significantly wrong. Memory engine should be fasternaturally, but I don't tkink it can be 1000x.</blockquote><br /> Sorry, but I didn't fabricate this results:<br />Below is just snapshot from my computer:<br /><br /><br /> postgres=# select DbItem_load();<br /> dbitem_load <br /> -------------<br/> 9999998<br /> (1 row)<br /><br /> postgres=# \timing<br /> Timing is on.<br /> postgres=# selectcs_used_memory();<br /> cs_used_memory <br /> ----------------<br /> 4441894912<br /> (1 row)<br /><br /> postgres=#select agg_val,cs_cut(group_by,'c22c30c10') from <br /> (select (cs_project_agg(ss1.*)).* from <br /> (select (s1).sum/(s2).sum,(s1).groups from DbItem_get() q, <br /> cs_hash_sum(q.score*q.volenquired,q.trader||q.desk||q.office) s1, <br /> cs_hash_sum(q.volenquired, q.trader||q.desk||q.office)s2) ss1) ss2;<br /> agg_val | cs_cut <br /> ------------------+------------------------------------------------------------<br/> 1.50028393511844 | ("John Coltrane","NewYork Corporates","New York")<br /> ....<br /> Time: <font color="#ff0000">506.125 ms</font><br /><br /> postgres=#select sum(score*volenquired)/sum(volenquired) from DbItem group by (trader,desk,office);<br /> ...<br /> Time:<font color="#ff0000">449328.645 ms</font><br /> postgres=# select sum(score*volenquired)/sum(volenquired) from DbItemgroup by (trader,desk,office);<br /> ...<br /> Time: <font color="#ff0000">441530.689 ms</font><br /><br /> Pleasenotice that time of second execution is almost the same as first, although all data can fit in cache!<br /><br /> Certainlyit was intersting to me to understand the reason of such bad performance.<br /> And find out two things:<br /><br/> 1. <br /> select sum(score*volenquired)/sum(volenquired) from DbItem group by (trader,desk,office);<br /> and<br/> select sum(score*volenquired)/sum(volenquired) from DbItem group by trader,desk,office;<br /><br /> are notthe same queries (it is hard to understand to C programmer:)<br /> And first one is executed significantly slower.<br/><br /> 2. It is not enough to increase "shared_buffers" parameter in postgresql.conf.<br /> "work_mem" is alsovery important. When I increased it to 1Gb from default 1Mb, then time of query execution is reduced to <br /> 7107.146ms. So the real difference is ten times, not 1000 times.<br /><br /><br /><br /><br /><br /><blockquote cite="mid:CAFj8pRDv7dHivPVrCQFT3zz2a7-d9+nm6cwB-mQcxOWtkdrrDg@mail.gmail.com"type="cite"><p dir="ltr">Yesterday we did asome tests, that shows so for large tables (5G)a our hashing is not effective. Disabling hash join and using merge joinincreased speed 2x<br /> Dne 9. 12. 
2013 20:41 "knizhnik" <<a href="mailto:knizhnik@garret.ru" moz-do-not-send="true">knizhnik@garret.ru</a>>napsal(a):<br /> ><br /> > Hello!<br /> ><br /> > I want toannouce my implementation of In-Memory Columnar Store extension for PostgreSQL:<br /> ><br /> > Documentation:<a href="http://www.garret.ru/imcs/user_guide.html" moz-do-not-send="true">http://www.garret.ru/imcs/user_guide.html</a><br/> > Sources: <a href="http://www.garret.ru/imcs-1.01.tar.gz"moz-do-not-send="true">http://www.garret.ru/imcs-1.01.tar.gz</a><br /> ><br/> > Any feedbacks, bug reports and suggestions are welcome.<br /> ><br /> > Vertical representation of datais stored in PostgreSQL shared memory.<br /> > This is why it is important to be able to utilize all available physicalmemory.<br /> > Now servers with Tb or more RAM are not something exotic, especially in financial world.<br />> But there is limitation in Linux with standard 4kb pages for maximal size of mapped memory segment: 256Gb.<br />> It is possible to overcome this limitation either by creating multiple segments - but it requires too much changesin PostgreSQL memory manager.<br /> > Or just set MAP_HUGETLB flag (assuming that huge pages were allocated inthe system).<br /> ><br /> > I found several messages related with MAP_HUGETLB flag, the most recent one was from21 of November:<br /> > <a href="http://www.postgresql.org/message-id/20131125032920.GA23793@toroid.org" moz-do-not-send="true">http://www.postgresql.org/message-id/20131125032920.GA23793@toroid.org</a><br/> ><br /> > Iwonder what is the current status of this patch?<br /> ><br /> ><br /> ><br /> ><br /> ><br /> ><br />> -- <br /> > Sent via pgsql-hackers mailing list (<a href="mailto:pgsql-hackers@postgresql.org" moz-do-not-send="true">pgsql-hackers@postgresql.org</a>)<br/> > To make changes to your subscription:<br /> > <a href="http://www.postgresql.org/mailpref/pgsql-hackers" moz-do-not-send="true">http://www.postgresql.org/mailpref/pgsql-hackers</a><br/></blockquote><br />
On Thu, Dec 12, 2013 at 4:02 AM, knizhnik <knizhnik@garret.ru> wrote:
> 2. It is not enough to increase the "shared_buffers" parameter in
> postgresql.conf; "work_mem" is also very important. When I increased it
> from the default 1MB to 1GB, the query execution time dropped to
> 7107.146 ms. So the real difference is ten times, not 1000 times.

Yeah. It's not fair to compare vs an implementation that is constrained to use only 1MB. For analytics work a huge work_mem is a pretty typical setting. A 10x improvement is believable considering you've removed all MVCC overhead, locking, buffer management, etc. and have a simplified data structure.

merlin
On 12/12/2013 07:03 PM, Merlin Moncure wrote:
> Yeah. It's not fair to compare vs an implementation that is constrained
> to use only 1MB. For analytics work a huge work_mem is a pretty typical
> setting. A 10x improvement is believable considering you've removed all
> MVCC overhead, locking, buffer management, etc. and have a simplified
> data structure.

I agree that it is not a fair comparison. As an excuse I can say that I am not an experienced PostgreSQL user, so I thought that setting shared_buffers was enough to avoid disk access by PostgreSQL. Only after getting such strange results did I start investigating how to properly tune PostgreSQL parameters.

IMHO it is strange to see such small default values in the PostgreSQL configuration - PostgreSQL is not an embedded database, and nowadays even mobile devices have several gigabytes of memory... Also it would be nice to have one single switch - how much physical memory PostgreSQL may use - and let PostgreSQL split it in an optimal way. For example, I have no idea how to optimally split memory between "shared_buffers", "temp_buffers", "work_mem" and "maintenance_work_mem"; PostgreSQL itself should do this work much better than an inexperienced administrator. And one of the possible values of such a parameter could be "auto": determine the available memory automatically (it is not a big deal to check the amount of available RAM in the system).

I know that vendors of big databases never try to simplify configuration and tuning of their products, just because most of their profit comes from consulting. But I think that is not true for PostgreSQL.
On Thu, Dec 12, 2013 at 12:18 PM, knizhnik <knizhnik@garret.ru> wrote:
> IMHO it is strange to see such small default values in the PostgreSQL
> configuration.

This (the low default work_mem) is because of three things:

1) Most queries do not really need a lot of work_mem.
2) work_mem stacks with each query using it -- so with your 1MB setting vs 1000 connections, you get a gigabyte. So some conservatism is justified, although this setting tended to be much more dangerous in the old days when we measured memory in megabytes.
3) Postgres does not query available physical memory for default settings, due to portability issues. So we tend to tune to the "common denominator".

merlin
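A practical pattern that follows from points 1 and 2: keep the conservative global default and raise work_mem only in the session that runs the analytic query, so the large allocation cannot stack across a thousand connections. (A sketch, reusing the DbItem query from earlier in the thread.)

    set work_mem = '1GB';   -- session-local; the global default stays small
    select sum(score*volenquired)/sum(volenquired)
        from DbItem group by trader, desk, office;
    reset work_mem;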