Discussion: Academic help for Postgres
I am giving a keynote at an IEEE database conference in Helsinki next
week (http://icde2016.fi/). (Yes, I am not attending PGCon Ottawa
because I accepted the Helsinki conference invitation before the PGCon
Ottawa date was changed from June to May).

As part of the keynote, I would like to mention areas where academia can
help us. The topics I can think of are:

        Query optimization
        Optimizer statistics
        Indexing structures
        Reducing function call overhead
        CPU locality
        Sorting
        Parallelism
        Sharding

Any others?

--
  Bruce Momjian  <bruce@momjian.us>        http://momjian.us
  EnterpriseDB                             http://enterprisedb.com

+ As you are, so once was I. As I am, so you will be. +
+                      Ancient Roman grave inscription +
On Wed, May 11, 2016 at 5:20 PM, Bruce Momjian <bruce@momjian.us> wrote:
> I am giving a keynote at an IEEE database conference in Helsinki next
> week (http://icde2016.fi/). (Yes, I am not attending PGCon Ottawa
> because I accepted the Helsinki conference invitation before the PGCon
> Ottawa date was changed from June to May).
>
> As part of the keynote, I would like to mention areas where academia can
> help us. The topics I can think of are:
>
>         Query optimization
>         Optimizer statistics
>         Indexing structures
>         Reducing function call overhead
>         CPU locality
>         Sorting
>         Parallelism
>         Sharding
>
> Any others?

machine learning for adaptive planning
distributed stuff
On 11.05.2016 17:20, Bruce Momjian wrote:
> I am giving a keynote at an IEEE database conference in Helsinki next
> week (http://icde2016.fi/). (Yes, I am not attending PGCon Ottawa
> because I accepted the Helsinki conference invitation before the PGCon
> Ottawa date was changed from June to May).
>
> As part of the keynote, I would like to mention areas where academia can
> help us. The topics I can think of are:
>
>         Query optimization
>         Optimizer statistics
>         Indexing structures
>         Reducing function call overhead
>         CPU locality
>         Sorting
>         Parallelism
>         Sharding
>
> Any others?
>

Incremental materialized views?

--
Konstantin Knizhnik
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company
On Wed, May 11, 2016 at 05:24:44PM +0300, Oleg Bartunov wrote:
> On Wed, May 11, 2016 at 5:20 PM, Bruce Momjian <bruce@momjian.us> wrote:
> > As part of the keynote, I would like to mention areas where academia can
> > help us. The topics I can think of are:
> >
> >         Query optimization
> >         Optimizer statistics
> >         Indexing structures
> >         Reducing function call overhead
> >         CPU locality
> >         Sorting
> >         Parallelism
> >         Sharding
> >
> > Any others?
>
> machine learning for adaptive planning

Do these fall in the "Query optimization" item? Does that need different
text?

> distributed stuff

Oh, yes, distributed transactions. Good.

--
  Bruce Momjian  <bruce@momjian.us>        http://momjian.us
  EnterpriseDB                             http://enterprisedb.com

+ As you are, so once was I. As I am, so you will be. +
+                      Ancient Roman grave inscription +
On Wed, May 11, 2016 at 05:31:10PM +0300, Konstantin Knizhnik wrote:
> On 11.05.2016 17:20, Bruce Momjian wrote:
> >I am giving a keynote at an IEEE database conference in Helsinki next
> >week (http://icde2016.fi/). (Yes, I am not attending PGCon Ottawa
> >because I accepted the Helsinki conference invitation before the PGCon
> >Ottawa date was changed from June to May).
> >
> >As part of the keynote, I would like to mention areas where academia can
> >help us. The topics I can think of are:
> >
> >        Query optimization
> >        Optimizer statistics
> >        Indexing structures
> >        Reducing function call overhead
> >        CPU locality
> >        Sorting
> >        Parallelism
> >        Sharding
> >
> >Any others?
> >
> Incremental materialized views?

I don't know. Is that something academics would research?

--
  Bruce Momjian  <bruce@momjian.us>        http://momjian.us
  EnterpriseDB                             http://enterprisedb.com

+ As you are, so once was I. As I am, so you will be. +
+                      Ancient Roman grave inscription +
On 11/05/16 17:32, Bruce Momjian wrote:
> On Wed, May 11, 2016 at 05:31:10PM +0300, Konstantin Knizhnik wrote:
>> On 11.05.2016 17:20, Bruce Momjian wrote:
>>> I am giving a keynote at an IEEE database conference in Helsinki next
>>> week (http://icde2016.fi/). (Yes, I am not attending PGCon Ottawa
>>> because I accepted the Helsinki conference invitation before the PGCon
>>> Ottawa date was changed from June to May).
>>>
>>> As part of the keynote, I would like to mention areas where academia can
>>> help us. The topics I can think of are:
>>>
>>>         Query optimization
>>>         Optimizer statistics
>>>         Indexing structures
>>>         Reducing function call overhead
>>>         CPU locality
>>>         Sorting
>>>         Parallelism
>>>         Sharding
>>>
>>> Any others?
>>>
>> Incremental materialized views?
>
> I don't know. Is that something academics would research?

Absolutely! There are plenty of papers on how to keep materialized views
up-to-date.

- Heikki
On 11.05.2016 17:32, Bruce Momjian wrote:
> On Wed, May 11, 2016 at 05:31:10PM +0300, Konstantin Knizhnik wrote:
>>
>> On 11.05.2016 17:20, Bruce Momjian wrote:
>>> I am giving a keynote at an IEEE database conference in Helsinki next
>>> week (http://icde2016.fi/). (Yes, I am not attending PGCon Ottawa
>>> because I accepted the Helsinki conference invitation before the PGCon
>>> Ottawa date was changed from June to May).
>>>
>>> As part of the keynote, I would like to mention areas where academia can
>>> help us. The topics I can think of are:
>>>
>>>         Query optimization
>>>         Optimizer statistics
>>>         Indexing structures
>>>         Reducing function call overhead
>>>         CPU locality
>>>         Sorting
>>>         Parallelism
>>>         Sharding
>>>
>>> Any others?
>>>
>> Incremental materialized views?
> I don't know. Is that something academics would research?
>

I am not sure. There is definitely the question of which views can be
incrementally recalculated, and which inductive extension has to be
constructed to make that possible. If you google for "incremental
materialized views phd", you will get a large number of references to
articles. But I do not know whether all questions in this area are
already closed or not...

--
Konstantin Knizhnik
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company
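[Editor's illustration] The "inductive extension" idea mentioned above can be sketched roughly as follows. AVG(x) cannot be updated from its previous value alone, but it can be derived from the pair (SUM, COUNT), and that pair can be maintained one row at a time. This is only a minimal sketch in Python, not PostgreSQL code; the class `AvgView` and its methods are hypothetical names made up for the example.

```python
from dataclasses import dataclass


@dataclass
class AvgView:
    """A toy 'materialized view' of AVG(x), stored as (sum, count).

    (sum, count) is the extension: each component can be updated from
    its old value plus the changed row, and AVG is derived from it.
    """
    total: float = 0.0
    count: int = 0

    def insert(self, x: float) -> None:
        # Apply the delta for one inserted base-table row.
        self.total += x
        self.count += 1

    def delete(self, x: float) -> None:
        # Apply the delta for one deleted base-table row.
        self.total -= x
        self.count -= 1

    def value(self):
        # Derive the view's visible value; None models AVG over no rows.
        return self.total / self.count if self.count else None


view = AvgView()
for x in (10, 20, 60):
    view.insert(x)
print(view.value())  # 30.0
view.delete(60)
print(view.value())  # 15.0
```

Not every aggregate is this easy: MIN and MAX, for example, can be maintained this way under inserts, but deleting the current minimum or maximum needs either a rescan or some auxiliary structure, which is part of what makes the general problem a research topic.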
On Wed, May 11, 2016 at 05:41:21PM +0300, Heikki Linnakangas wrote:
> On 11/05/16 17:32, Bruce Momjian wrote:
> >On Wed, May 11, 2016 at 05:31:10PM +0300, Konstantin Knizhnik wrote:
> >>On 11.05.2016 17:20, Bruce Momjian wrote:
> >>>I am giving a keynote at an IEEE database conference in Helsinki next
> >>>week (http://icde2016.fi/). (Yes, I am not attending PGCon Ottawa
> >>>because I accepted the Helsinki conference invitation before the PGCon
> >>>Ottawa date was changed from June to May).
> >>>
> >>>As part of the keynote, I would like to mention areas where academia can
> >>>help us. The topics I can think of are:
> >>>
> >>>        Query optimization
> >>>        Optimizer statistics
> >>>        Indexing structures
> >>>        Reducing function call overhead
> >>>        CPU locality
> >>>        Sorting
> >>>        Parallelism
> >>>        Sharding
> >>>
> >>>Any others?
> >>>
> >>Incremental materialized views?
> >
> >I don't know. Is that something academics would research?
>
> Absolutely! There are plenty of papers on how to keep materialized views
> up-to-date.

Oh, OK. I will add it.

--
  Bruce Momjian  <bruce@momjian.us>        http://momjian.us
  EnterpriseDB                             http://enterprisedb.com

+ As you are, so once was I. As I am, so you will be. +
+                      Ancient Roman grave inscription +
On Wed, May 11, 2016 at 9:32 AM, Bruce Momjian <bruce@momjian.us> wrote:
> On Wed, May 11, 2016 at 05:31:10PM +0300, Konstantin Knizhnik wrote:
>> Incremental materialized views?
>
> I don't know. Is that something academics would research?

One paper I have found particularly good is this:

http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.31.3208

Tantalizingly, there is mention that there is a longer version of
the paper, but I have been unable to find it. There is enough in
this paper, I think, to fill in the blanks and do a much better job
with implementation than any ad hoc approach is likely to manage,
but if there is anything that extends this work (or subsequent work
which seems to be an improvement) that would be great.

--
Kevin Grittner
EDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
2016-05-11 23:20 GMT+09:00 Bruce Momjian <bruce@momjian.us>:
> I am giving a keynote at an IEEE database conference in Helsinki next
> week (http://icde2016.fi/). (Yes, I am not attending PGCon Ottawa
> because I accepted the Helsinki conference invitation before the PGCon
> Ottawa date was changed from June to May).
>
> As part of the keynote, I would like to mention areas where academia can
> help us. The topics I can think of are:
>
>         Query optimization
>         Optimizer statistics
>         Indexing structures
>         Reducing function call overhead
>         CPU locality
>         Sorting
>         Parallelism
>         Sharding
>
> Any others?
>

How about NVRAM utilization?

--
KaiGai Kohei <kaigai@kaigai.gr.jp>
On 05/11/2016 07:54 AM, Bruce Momjian wrote:
> On Wed, May 11, 2016 at 05:41:21PM +0300, Heikki Linnakangas wrote:
>> On 11/05/16 17:32, Bruce Momjian wrote:
>>> On Wed, May 11, 2016 at 05:31:10PM +0300, Konstantin Knizhnik wrote:
>>>> On 11.05.2016 17:20, Bruce Momjian wrote:
>>>>> I am giving a keynote at an IEEE database conference in Helsinki next
>>>>> week (http://icde2016.fi/). (Yes, I am not attending PGCon Ottawa
>>>>> because I accepted the Helsinki conference invitation before the PGCon
>>>>> Ottawa date was changed from June to May).
>>>>>
>>>>> As part of the keynote, I would like to mention areas where academia can
>>>>> help us. The topics I can think of are:
>>>>>
>>>>>         Query optimization
>>>>>         Optimizer statistics
>>>>>         Indexing structures
>>>>>         Reducing function call overhead
>>>>>         CPU locality
>>>>>         Sorting
>>>>>         Parallelism
>>>>>         Sharding
>>>>>
>>>>> Any others?
>>>>>
>>>> Incremental materialized views?
>>>
>>> I don't know. Is that something academics would research?
>>
>> Absolutely! There are plenty of papers on how to keep materialized views
>> up-to-date.
>
> Oh, OK. I will add it.
>

Together with that, automated substitution of materialized views for
query clauses.

Also: optimizing for new hardware, like persistent memory.

--
Josh Berkus
Red Hat OSAS
(any opinions are my own)
On 11 May 2016 at 12:58, Josh berkus <josh@agliodbs.com> wrote:
> Together with that, automated substitution of materialized views for
> query clauses.
>
> Also: optimizing for new hardware, like persistent memory.
I recently saw some material in ACM SIGOPS on tuning filesystems to play better with some of the new sorts of storage.

One interesting article was this: <http://dl.acm.org/citation.cfm?id=2819002>

The idea was to research better ways of doing hash table updates with PCM (Phase Change Memory), which apparently may be up-and-coming but has fairly different write characteristics than we're used to. You essentially write a fairly large page at a time, and can only do a limited number of updates to any given page.
That encourages things like log-structured filesystems, but with further efforts to reduce there being "hot spots."
The paper was focused on hash tables; if the hardware turns out to be important, it'll also be important to have better variations on B-trees.
--
When confronted by a difficult problem, solve it by reducing it to the
question, "How would the Lone Ranger handle this?"
On 12/05/16 02:20, Bruce Momjian wrote:
> I am giving a keynote at an IEEE database conference in Helsinki next
> week (http://icde2016.fi/). (Yes, I am not attending PGCon Ottawa
> because I accepted the Helsinki conference invitation before the PGCon
> Ottawa date was changed from June to May).
>
> As part of the keynote, I would like to mention areas where academia can
> help us. The topics I can think of are:
>
>         Query optimization
>         Optimizer statistics
>         Indexing structures
>         Reducing function call overhead
>         CPU locality
>         Sorting
>         Parallelism
>         Sharding
>
> Any others?
>

optimization of performance under very heavy loads
    ranging from almost all reads to almost all writes/updates, & other usage profiles
    single box, and multiple boxen
    large numbers of CPU's

most efficient use of SSD's

best use of insanely large amounts of RAM

optimization of handling arrays & JSON structures

Cheers,
Gavin
On 11 May 2016 at 22:20, Bruce Momjian <bruce@momjian.us> wrote:
> I am giving a keynote at an IEEE database conference in Helsinki next
> week (http://icde2016.fi/). (Yes, I am not attending PGCon Ottawa
> because I accepted the Helsinki conference invitation before the PGCon
> Ottawa date was changed from June to May).
>
> As part of the keynote, I would like to mention areas where academia can
> help us. The topics I can think of are:

[snip]

> Any others?
When publishing work, publish source code somewhere stable that won't just vanish. And build on the latest stable release, don't build your prototype on Pg 8.0. Don't just publish a tarball with no information about what revision it's based on, publish a git tree or a patch series.
While academic prototype source is rarely usable directly, it can serve a valuable role in helping to understand the changes that were made, reproducing results, exploring further related work, etc.
Include your dummy data or data generators, setup scripts, etc.
On 11 May 2016 19:50, Bruce Momjian wrote:
>I am giving a keynote at an IEEE database conference in Helsinki next
>week (http://icde2016.fi/). (Yes, I am not attending PGCon Ottawa
>because I accepted the Helsinki conference invitation before the PGCon
>Ottawa date was changed from June to May).
>
>As part of the keynote, I would like to mention areas where academia can
>help us. The topics I can think of are:
>
>        Query optimization
>        Optimizer statistics
>        Indexing structures
>        Reducing function call overhead
>        CPU locality
>        Sorting
>        Parallelism
>        Sharding
>
>Any others?

How about:
1. Considering NUMA-aware architecture.
2. Optimizer tuning as per new hardware trends.
3. More effective versions of join algorithms (e.g. compared to the traditional "build and then probe" mechanism of Hash Join, now there is pipelining Hash Join, where probe and build both happen together).

Thanks and Regards,
Kumar Rajeev Rastogi
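[Editor's illustration] To make the contrast in item 3 concrete: a classic hash join first builds a hash table on one whole input and only then probes it with the other, so no row comes out until the build is finished. A pipelining (often called "symmetric") hash join keeps a hash table per input and does build and probe for every arriving row, so matches can be emitted as soon as both partners have been seen. The following is only a minimal in-memory sketch in Python under that assumption; `symmetric_hash_join` and its parameters are names invented for the example, not anything from PostgreSQL.

```python
from collections import defaultdict
from itertools import zip_longest


def symmetric_hash_join(left, right, key_left, key_right):
    """Yield (left_row, right_row) pairs with equal join keys.

    Rows are pulled alternately from both inputs. Each row is inserted
    into its own side's hash table (build) and immediately looked up in
    the other side's table (probe), so results stream out early.
    """
    left_table = defaultdict(list)
    right_table = defaultdict(list)
    done = object()  # sentinel marking an exhausted input

    for l_row, r_row in zip_longest(left, right, fillvalue=done):
        if l_row is not done:
            k = key_left(l_row)
            left_table[k].append(l_row)            # build on the left
            for match in right_table.get(k, ()):   # probe the right
                yield (l_row, match)
        if r_row is not done:
            k = key_right(r_row)
            right_table[k].append(r_row)           # build on the right
            for match in left_table.get(k, ()):    # probe the left
                yield (match, r_row)


orders = [(1, "a"), (2, "b"), (1, "c")]
users = [(1, "x"), (3, "y"), (2, "z")]
for pair in symmetric_hash_join(orders, users,
                                lambda o: o[0], lambda u: u[0]):
    print(pair)
```

Each matching pair is produced exactly once, at the moment the later of its two rows arrives. The price is memory: both inputs end up hashed, whereas the build-then-probe variant only hashes one side.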
On 12 May 2016 at 11:16, Rajeev rastogi <rajeev.rastogi@huawei.com> wrote:
>Any others?
GPU offload.
Some work on that already got done as part of the AXLE project, but there's still a lot more to do to get anything that can be usefully integrated into Pg.
This likely ties in with batching work, since without batching it's unlikely you can get much benefit from GPU offload.
On May 12, 2016, at 6:16 AM, Rajeev rastogi wrote:
> On 11 May 2016 19:50, Bruce Momjian wrote:
>
>> I am giving a keynote at an IEEE database conference in Helsinki next
>> week (http://icde2016.fi/). (Yes, I am not attending PGCon Ottawa
>> because I accepted the Helsinki conference invitation before the PGCon
>> Ottawa date was changed from June to May).
>>
>> As part of the keynote, I would like to mention areas where academia can
>> help us. The topics I can think of are:
>>
>>         Query optimization
>>         Optimizer statistics
>>         Indexing structures
>>         Reducing function call overhead
>>         CPU locality
>>         Sorting
>>         Parallelism
>>         Sharding
>>
>> Any others?
>
> How about:
> 1. Considering NUMA-aware architecture.
> 2. Optimizer tuning as per new hardware trends.
> 3. More effective versions of join algorithms (e.g. compared to the traditional "build and then probe" mechanism of Hash Join, now there is pipelining Hash Join, where probe and build both happen together).

Interesting article about optimal joins: http://arxiv.org/pdf/1203.1952v1.pdf
On Thu, May 12, 2016 at 08:57:34AM +0800, Craig Ringer wrote:
> On 11 May 2016 at 22:20, Bruce Momjian <bruce@momjian.us> wrote:
> > I am giving a keynote at an IEEE database conference in Helsinki next
> > week (http://icde2016.fi/). (Yes, I am not attending PGCon Ottawa
> > because I accepted the Helsinki conference invitation before the PGCon
> > Ottawa date was changed from June to May).
> >
> > As part of the keynote, I would like to mention areas where academia can
> > help us. The topics I can think of are:
> >
> > Any others?
> >
>
> When publishing work, publish source code somewhere stable that won't just
> vanish. And build on the latest stable release, don't build your prototype
> on Pg 8.0. Don't just publish a tarball with no information about what
> revision it's based on, publish a git tree or a patch series.
>
> While academic prototype source is rarely usable directly, it can serve a
> valuable role in helping to understand the changes that were made,
> reproducing results, exploring further related work, etc.
>
> Include your dummy data or data generators, setup scripts, etc.

That is all sound advice, but if they do all of the above, then they
should also make sure the source (or parts of it) is potentially usable
by the project, i.e. (joint?) PGDG copyright, if their academic
institution allows that.

Michael
On Thu, May 12, 2016 at 09:47:02AM +0200, Michael Banck wrote:
> On Thu, May 12, 2016 at 08:57:34AM +0800, Craig Ringer wrote:
> > On 11 May 2016 at 22:20, Bruce Momjian <bruce@momjian.us> wrote:
> > > I am giving a keynote at an IEEE database conference in Helsinki next
> > > week (http://icde2016.fi/). (Yes, I am not attending PGCon Ottawa
> > > because I accepted the Helsinki conference invitation before the PGCon
> > > Ottawa date was changed from June to May).
> > >
> > > As part of the keynote, I would like to mention areas where academia can
> > > help us. The topics I can think of are:
> > >
> > > Any others?
> > >
> >
> > When publishing work, publish source code somewhere stable that won't just
> > vanish. And build on the latest stable release, don't build your prototype
> > on Pg 8.0. Don't just publish a tarball with no information about what
> > revision it's based on, publish a git tree or a patch series.
> >
> > While academic prototype source is rarely usable directly, it can serve a
> > valuable role in helping to understand the changes that were made,
> > reproducing results, exploring further related work, etc.
> >
> > Include your dummy data or data generators, setup scripts, etc.
>
> That is all sound advice, but if they do all of the above, then they
> should also make sure the source (or parts of it) is potentially usable
> by the project, i.e. (joint?) PGDG copyright, if their academic
> institution allows that.

I have incorporated suggestions from this email thread into my IEEE talk
for next week:

        http://momjian.us/main/writings/pgsql/ieee.pdf

You will see most of it in the new slides toward the end. Please let me
know if it needs more additions/changes. Thanks.

--
  Bruce Momjian  <bruce@momjian.us>        http://momjian.us
  EnterpriseDB                             http://enterprisedb.com

+ As you are, so once was I. As I am, so you will be. +
+                      Ancient Roman grave inscription +
On Sat, May 14, 2016 at 7:13 AM, Bruce Momjian <bruce@momjian.us> wrote:
> I have incorporated suggestions from this email thread into my IEEE talk
> for next week:
>
>         http://momjian.us/main/writings/pgsql/ieee.pdf
>
> You will see most of it in the new slides toward the end. Please let me
> know if it needs more additions/changes. Thanks.

Maybe slide 7 (NoSQL Sacrifices) should have a bullet point for
"transaction isolation"?

--
Thomas Munro
http://www.enterprisedb.com