Discussion: [HACKERS] GSoC 2017
Alexander Korotkov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company
Hi all!

In 2016 the PostgreSQL project was not accepted into the GSoC program. In my understanding, the reasons for that are the following.

1. We submitted our application to GSoC at the last minute.
2. In 2016 the GSoC application form for mentoring organizations was changed. In particular, it required more detailed information about possible projects.

As a result, we didn't manage to make a good enough application that time. Thus, our application was declined. See [1] and [2] for details.

I think that the right way to manage this in 2017 would be to start collecting the required information in advance. According to the GSoC 2017 timeline [3], mentoring organizations can submit their applications from January 19 to February 9. Thus, now is a good time to start collecting project ideas and make a call for mentors. Also, we need to decide who will be our admin this year.

In sum, we have the following questions:

1. What project ideas we have?
2. Who are going to be mentors this year?
3. Who is going to be project admin this year?

BTW, I'm ready to be a mentor this year. I'm also open to being an admin if needed.

------
Alexander Korotkov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company
2017-01-10 14:53 GMT+05:00 Alexander Korotkov <a.korotkov@postgrespro.ru>:
> 1. What project ideas we have?

Hi!

I'd like to propose a project on sorting algorithm research. I’m ready to be a mentor on this project.

===Topic===
Sorting algorithms benchmark and implementation.

===Idea===
Currently PostgreSQL uses Hoare’s Quicksort implementation based on the work of Bentley and McIlroy [1] from 1993, while there exist some more novel algorithms [2], [3], and [4] which are actively used by highly optimized code like Java and .NET. Probably, use of an optimized sorting algorithm could yield a general system performance improvement. Also, use of non-comparison based algorithms deserves attention and benchmarking [5].

===Project details===
The project has four essential parts:
1. Implementation of a benchmark for sorting. Making sure that operations using sorting are represented proportionally to some “average” use cases.
2. Selection of benchmark algorithms. Selection can be based, for example, on scientific papers or community opinions.
3. Benchmark implementation of selected algorithms. Analysis of results, picking of winner.
4. Industrial implementation for pg_qsort(), pg_qsort_args() and gen_qsort_tuple.pl. The implemented patch is submitted to a commitfest; another patch is reviewed by the student.

[1] Bentley, Jon L., and M. Douglas McIlroy. "Engineering a sort function." Software: Practice and Experience 23.11 (1993): 1249-1265.
[2] Musser, David R. "Introspective sorting and selection algorithms." Softw., Pract. Exper. 27.8 (1997): 983-993.
[3] Auger, Nicolas, Cyril Nicaud, and Carine Pivoteau. "Merge Strategies: from Merge Sort to TimSort." (2015).
[4] Beniwal, Sonal, and Deepti Grover. "Comparison of various sorting algorithms: A review." International Journal of Emerging Research in Management & Technology 2 (2013).
[5] McIlroy, Peter M., Keith Bostic, and M. Douglas McIlroy. "Engineering radix sort." Computing systems 6.1 (1993): 5-27.

Best regards, Andrey Borodin.
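To make parts 1 and 3 of the proposal a bit more concrete, here is a minimal sketch of what a sorting benchmark harness could look like. It is written in Python purely for illustration and has nothing to do with pg_qsort() itself; the workload names, the `quicksort` stand-in and the `benchmark` function are all assumptions made up for this example, and a real benchmark would have to weight workloads by actual PostgreSQL usage.

```python
import random
import time


def insertion_sort(a):
    """Simple O(n^2) helper for tiny inputs; sorts a copy and returns it."""
    a = list(a)
    for i in range(1, len(a)):
        x = a[i]
        j = i - 1
        while j >= 0 and a[j] > x:
            a[j + 1] = a[j]
            j -= 1
        a[j + 1] = x
    return a


def quicksort(a):
    """Three-way quicksort with a median-of-three pivot; returns a new list."""
    if len(a) <= 16:
        return insertion_sort(a)
    pivot = sorted((a[0], a[len(a) // 2], a[-1]))[1]
    less = [x for x in a if x < pivot]
    equal = [x for x in a if x == pivot]
    greater = [x for x in a if x > pivot]
    return quicksort(less) + equal + quicksort(greater)


def make_inputs(n, seed=0):
    """Hypothetical workload mix: each entry is one input distribution."""
    rng = random.Random(seed)
    return {
        "random": [rng.randrange(n) for _ in range(n)],
        "sorted": list(range(n)),
        "reversed": list(range(n, 0, -1)),
        "few_distinct": [rng.randrange(8) for _ in range(n)],
    }


def benchmark(algorithms, n=2000, seed=0):
    """Return {(algorithm, workload): seconds}, verifying every result."""
    results = {}
    for workload, data in make_inputs(n, seed).items():
        expected = sorted(data)
        for name, sort_fn in algorithms.items():
            start = time.perf_counter()
            out = sort_fn(data)
            elapsed = time.perf_counter() - start
            assert out == expected, f"{name} failed on {workload}"
            results[(name, workload)] = elapsed
    return results


# Candidates under test: the stand-in quicksort and Python's built-in sort.
ALGORITHMS = {"quicksort": quicksort, "built-in sorted": sorted}

if __name__ == "__main__":
    for (name, workload), secs in sorted(benchmark(ALGORITHMS).items()):
        print(f"{name:18s} {workload:14s} {secs * 1e3:8.3f} ms")
```

The point of the shape is that every candidate runs on every input distribution and is checked against a reference result before its timing is recorded; timings from a toy like this say nothing about the C implementations.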
2017-01-10 14:53 GMT+05:00 Alexander Korotkov <a.korotkov@postgrespro.ru>:
> 1. What project ideas we have?

I have one more project of interest which I can mentor.

===Topic===
GiST API advancement

===Idea===
The GiST API was designed at the beginning of the 1990s to reduce boilerplate code around data access methods over a balanced tree. Now, after 30 years, there are some ideas on improving this API.

===Project details===
An opclass developer must specify 4 core operations to make a type GiST-indexable:
1. Split: a function to split a set of datatype instances into two parts.
2. Penalty calculation: a function to measure the penalty for unification of two keys.
3. Collision check: a function which determines whether two keys may overlap or do not intersect.
4. Unification: a function to combine two keys into one so that the combined key collides with both input keys.

Functions 2 and 3 can be improved.

For example, the insertion algorithm of the Revised R*-tree [1] cannot be expressed in terms of penalty-based algorithms. There were some attempts to bring in parts of RR*-tree insertion, but they come down to ugly hacks [2]. The current GiST API, due to its penalty-based insertion algorithm, does not allow implementing an important feature of the RR*-tree: overlap optimization. As Norbert Beckmann, author of the RR*-tree, put it in discussion: “Overlap optimization is one of the main elements, if not the main query performance tuning element of the RR*-tree. You would fall back to old R-Tree times if that would be left off.”

The collision check currently returns a binary result:
1. Query may collide with the subtree MBR
2. Query does not collide with the subtree
This result may be augmented with a third state: the subtree is totally within the query. In this case a GiST scan can scan down the subtree without key checks.

The potential effect of these improvements must be benchmarked. Probably, implementation of these two will spawn more ideas on GiST performance improvements.

Finally, GiST does not provide an API for bulk loading. Alexander Korotkov implemented buffered GiST build during GSoC 2011. This index construction is faster, but yields an index tree with virtually the same querying performance. There are different algorithms aiming to provide a better indexing tree using some knowledge of the data, e.g. [3].

[1] Beckmann, Norbert, and Bernhard Seeger. "A revised r*-tree in comparison with related index structures." Proceedings of the 2009 ACM SIGMOD International Conference on Management of data. ACM, 2009.
[2] https://www.postgresql.org/message-id/flat/CAJEAwVFMo-FXaJ6Lkj8Wtb1br0MtBY48EGMVEJBOodROEGykKg%40mail.gmail.com#CAJEAwVFMo-FXaJ6Lkj8Wtb1br0MtBY48EGMVEJBOodROEGykKg@mail.gmail.com
[3] Achakeev, Daniar, Bernhard Seeger, and Peter Widmayer. "Sort-based query-adaptive loading of r-trees." Proceedings of the 21st ACM international conference on Information and knowledge management. ACM, 2012.

Best regards, Andrey Borodin.
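The three-state collision check described above can be sketched outside of PostgreSQL as follows. This is a toy in-memory tree in Python, not the GiST API: `Match`, `Rect`, `Node`, `consistent` and `search` are hypothetical names invented for the illustration, and the query is assumed to be a rectangle-overlap search.

```python
from dataclasses import dataclass, field
from enum import Enum


class Match(Enum):
    DISJOINT = 0    # subtree cannot contain results
    OVERLAP = 1     # subtree may contain results; keys must be checked
    CONTAINED = 2   # subtree lies entirely within the query


@dataclass(frozen=True)
class Rect:
    x1: float
    y1: float
    x2: float
    y2: float


@dataclass
class Node:
    mbr: Rect                                      # bounding rectangle
    children: list = field(default_factory=list)   # inner node: child Nodes
    items: list = field(default_factory=list)      # leaf node: indexed Rects


def consistent(query, key):
    """Three-state collision check between a query and a key rectangle."""
    if (key.x2 < query.x1 or key.x1 > query.x2
            or key.y2 < query.y1 or key.y1 > query.y2):
        return Match.DISJOINT
    if (query.x1 <= key.x1 and key.x2 <= query.x2
            and query.y1 <= key.y1 and key.y2 <= query.y2):
        return Match.CONTAINED
    return Match.OVERLAP


def collect_all(node, out):
    """Emit every item of a subtree without performing any key checks."""
    out.extend(node.items)
    for child in node.children:
        collect_all(child, out)


def search(node, query, out, stats):
    """Overlap search that counts how many collision checks were needed."""
    stats["checks"] += 1
    state = consistent(query, node.mbr)
    if state is Match.DISJOINT:
        return
    if state is Match.CONTAINED:
        collect_all(node, out)      # third state: skip all checks below
        return
    for item in node.items:
        stats["checks"] += 1
        if consistent(query, item) is not Match.DISJOINT:
            out.append(item)
    for child in node.children:
        search(child, query, out, stats)
```

With only the binary result, every key in a fully covered subtree would still be checked individually; the `CONTAINED` branch is where the proposed saving would come from, and whether it pays off in practice is exactly what the benchmark would need to show.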
On 1/10/17 1:53 AM, Alexander Korotkov wrote:
> 1. What project ideas we have?

Perhaps allowing SQL-only extensions without requiring filesystem files would be a good project.

--
Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX
Experts in Analytics, Data Architecture and PostgreSQL
Data in Trouble? Get it in Treble! http://BlueTreble.com
855-TREBLE2 (855-873-2532)
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
2017-01-12 21:21 GMT+01:00 Jim Nasby <Jim.Nasby@bluetreble.com>:
> On 1/10/17 1:53 AM, Alexander Korotkov wrote:
>> 1. What project ideas we have?
>
> Perhaps allowing SQL-only extensions without requiring filesystem files would be a good project.

Implementation of safe evaluation of untrusted PL functions: evaluation under a different user in a different process.

Regards
Pavel
San Francisco, California
"Everything was beautiful, and nothing hurt."—Kurt Vonnegut
Jim Nasby wrote:
> On 1/10/17 1:53 AM, Alexander Korotkov wrote:
> > 1. What project ideas we have?
>
> Perhaps allowing SQL-only extensions without requiring filesystem files
> would be a good project.

Don't we already have that in patch form? Dimitri submitted it as I recall.

--
Álvaro Herrera https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
On 1/13/17 4:08 PM, Alvaro Herrera wrote:
> Jim Nasby wrote:
>> On 1/10/17 1:53 AM, Alexander Korotkov wrote:
>>> 1. What project ideas we have?
>>
>> Perhaps allowing SQL-only extensions without requiring filesystem files
>> would be a good project.
>
> Don't we already have that in patch form? Dimitri submitted it as I
> recall.

My recollection is that he tried to boil the ocean and also support handing compiled C libraries to the database, which was enough to sink the patch. It might be nice to support that if we could, and maybe it could be a follow-on project.

I do think complete lack of support for non-FS extensions is *seriously* hurting use of the feature thanks to environments like RDS and Heroku. As Pavel mentioned, untrusted languages are in a similar boat. So maybe the best way to address these things is to advertise them as "increase usability in cloud environments" since cloud excites people.

--
Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX
On 1/13/17 3:09 PM, Peter van Hardenberg wrote:
> A new data type, and/or a new index type could both be nicely scoped
> bits of work.

Did you have any particular data/index types in mind?

Personally I'd love something that worked like a python dictionary, but I'm not sure how that'd work without essentially supporting a variant data type. I've got code for a variant type[1], and I don't think there are any holes in it, but the casting semantics are rather ugly. IIRC that problem appeared to be solvable if there was a hook in the current casting code right before Postgres threw in the towel and said a cast was impossible.

1: https://github.com/BlueTreble/variant/

--
Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX
I'm ready to be a mentor.
-- Anastasia Lubennikova Postgres Professional: http://www.postgrespro.com The Russian Postgres Company
All,

* Alexander Korotkov (a.korotkov@postgrespro.ru) wrote:
> Also, we need to decide who would
> be our admin this year.

I don't see anyone jumping at the bit to be the admin (it's not exactly a fun and exciting job, after all), so, unless someone really wants it (or someone wishes to object), I volunteer as tribute to be the admin this year.

As such, we need to get this whole thing moving, and pretty quickly, as Alexander noted.

The first thing we need is an "Ideas" page which includes:

- Brief descriptions of projects that can be completed in about 12 weeks.
- For each project, a list of prerequisites, description of programming skills needed and estimation of difficulty level.
- A list of potential mentors.

The GSoC 2016 page was a start on this. I copied that page and updated it to be a somewhat clearer format, but it could probably use more work. Here's what Google says about the ideas page:

----------
The best pages include links to more detailed descriptions and related materials for each project. They might even include actual use cases! Keep in mind that this page is often the first view of your organization by Google and potential student applicants. A link to your bug tracker does not an Ideas Page make. Put your best foot forward.

In addition to a basic list, you might also consider providing links to relevant resources for mentors and students, particular FAQ entries, the timeline, etc. You might include a section on communication, giving specific advice on which mailing lists, channels and emails to use and how to use them. If your organization puts together an application template for students, you should include that on your page as well. Think of your Ideas Page as the GSoC portal to your organization.
----------

Would be great for folks to review what's there, maybe provide actual use-cases for the existing project suggestions, verify that the projects listed are still valid and appropriate at this point, and, please: ADD YOUR PROJECTS.

https://wiki.postgresql.org/wiki/GSoC_2017

More information about what the project definition should look like is included here:

http://write.flossmanuals.net/gsoc-mentoring/defining-a-project/

Before submitting it to Google, I'm going to either expand or nuke everything under the 'core' section, so if there's something that you are really interested in, expand it out so we can have it properly included in our application to Google.

Also, Google has said that they actually *like* "Umbrella" projects. As such, I believe we should encourage projects which are closely related to PostgreSQL to submit projects for consideration. I don't think "just uses PostgreSQL" would be reasonable, but I do think something like "Add feature XYZ to the pgconf.eu code base to help PostgreSQL-based organizations and community conferences" would be.

Let's make this year's PostgreSQL GSoC awesome!

Thanks!

Stephen
On 1/23/17 3:45 PM, Peter van Hardenberg wrote:
> A new currency type would be nice, and if kept small in scope, might be
> manageable.

I'd be rather nervous about this. My impression of community consensus on this is a currency type that doesn't somehow support conversion between different currencies is pretty useless, and supporting conversions opens a 55 gallon drum of worms. I could certainly be mistaken in my impression, but I think there'd need to be some kind of consensus on what a currency type should do before putting that up for GSoC.

But, speaking of types, I wish we had a timestamp type that stored what the original timezone was, as well as the relevant TZDATA entry that was in place for that timestamp when it was created. Since it'd be completely impractical to store TZDATA as part of the datum, there would need to be an immutable catalog table that stored the contents of TZDATA any time it changed, as well as a fast way to find the surrogate key for the current TZDATA.

--
Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX
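The timestamp idea above boils down to a small data structure: each value carries the original zone name plus a surrogate key into an append-only catalog of TZDATA snapshots. A rough Python sketch of that shape, with every name (`TzdataCatalog`, `OriginTimestamp`, `make_timestamp`) and the snapshot contents invented for illustration rather than taken from any PostgreSQL design:

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from types import MappingProxyType


class TzdataCatalog:
    """Append-only store of tzdata snapshots keyed by a surrogate id."""

    def __init__(self):
        self._snapshots = []   # list index + 1 is the surrogate key

    def register(self, version, rules):
        """Record a new snapshot and return its surrogate key."""
        frozen = MappingProxyType(dict(rules))   # read-only view of a copy
        self._snapshots.append((version, frozen))
        return len(self._snapshots)

    def current_id(self):
        """Cheap lookup of the surrogate key for the newest snapshot."""
        if not self._snapshots:
            raise LookupError("no tzdata snapshot registered")
        return len(self._snapshots)

    def lookup(self, tzdata_id):
        """Return (version, rules) for a previously issued key."""
        return self._snapshots[tzdata_id - 1]


@dataclass(frozen=True)
class OriginTimestamp:
    instant_utc: datetime   # the absolute point in time
    zone_name: str          # zone the value was originally entered in
    tzdata_id: int          # which tzdata snapshot was current at creation


def make_timestamp(catalog, instant, zone_name):
    """Build a value that remembers its zone and the tzdata in force."""
    if instant.tzinfo is None:
        raise ValueError("instant must be timezone-aware")
    return OriginTimestamp(instant.astimezone(timezone.utc),
                           zone_name, catalog.current_id())
```

A value created before a TZDATA update keeps pointing at the old snapshot, so the rules that applied at creation time stay recoverable; the datum itself stores only a small key.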
On 24 January 2017 at 03:42, Peter van Hardenberg <pvh@pvh.ca> wrote:
> The basic concept is that the value of a currency type is that it would
> allow you to operate in multiple currencies without accidentally adding
> them. You'd flatten them to a single type if when and how you wanted for any
> given operation but could work without fear of losing information.

I don't think this even needs to be tied to currencies. I've often thought this would be generally useful for any value with units. This would prevent you from accidentally adding miles to kilometers or hours to parsecs which is just as valid as preventing you from adding CAD to USD.

Then you could imagine having a few entirely optional helper functions that could automatically provide conversion factors using units.dat or currency exchange rates. But even if you don't use these helper functions they would still be useful.

--
greg
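A minimal sketch of the "value with units" behaviour described above, in Python rather than as a PostgreSQL type. The `Quantity` class, the `UnitMismatch` error and the `FACTORS` table are hypothetical names for this illustration; the conversion helper is optional, as suggested, and takes its factors from the caller.

```python
from dataclasses import dataclass
from decimal import Decimal


class UnitMismatch(TypeError):
    """Raised when two quantities with different units are added."""


@dataclass(frozen=True)
class Quantity:
    amount: Decimal
    unit: str

    def __add__(self, other):
        if not isinstance(other, Quantity):
            return NotImplemented
        if other.unit != self.unit:
            raise UnitMismatch(f"cannot add {other.unit} to {self.unit}")
        return Quantity(self.amount + other.amount, self.unit)

    def to(self, unit, factors):
        """Optional helper: convert using a caller-supplied factor table."""
        if unit == self.unit:
            return self
        return Quantity(self.amount * factors[(self.unit, unit)], unit)


# Example factor table; one international mile is exactly 1.609344 km.
FACTORS = {("mi", "km"): Decimal("1.609344")}
```

Adding `Quantity(Decimal("1"), "mi")` to `Quantity(Decimal("2"), "km")` raises `UnitMismatch`, while converting the miles with `.to("km", FACTORS)` first makes the addition legal. Nothing is converted implicitly, which is the "flatten only when and how you want" property from the quoted message.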
Greg Stark <stark@mit.edu> writes:
> On 24 January 2017 at 03:42, Peter van Hardenberg <pvh@pvh.ca> wrote:
>> The basic concept is that the value of a currency type is that it would
>> allow you to operate in multiple currencies without accidentally adding
>> them. You'd flatten them to a single type if when and how you wanted for any
>> given operation but could work without fear of losing information.

> I don't think this even needs to be tied to currencies. I've often
> thought this would be generally useful for any value with units.

There already is an extension somewhere for attaching units to numeric values, which would be a place to start from for this purpose.

The things I think are unique to the currency situation are:

* Time-varying conversion ratios.

* Conventional number of decimal places for any given currency.

* Idiosyncratic I/O formats (symbol to left or right of number, odd rules for negatives, etc). I think the space here is covered by the POSIX currency locale rules.

regards, tom lane
On January 27, 2017 07:08, Tom Lane wrote:
> ... The things I think are unique to the currency situation are: ...

Add the potential for regulatory requirements to change at any time - sort of like timezone information. So no hard coded behavior.

rounding method/accuracy
storage precision different than display precision
conversion method (multiply, divide, triangulate, other)
use of spot rates (multiple rate sources) rather than/in addition to time-varying rates

responding to the overall idea of a currency type

Numeric values with units so that you get a warning/error when you mix different units in calculations? Ability to specify rounding methods and intermediate precisions for calculations?
+1 Good ideas with lots of potential applications.

Built-in currency type?
-1 I suspect this is one of those things that seems like a good idea but really isn't.
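The point about rounding methods and intermediate precision being configurable rather than hard coded can be illustrated with Python's standard `decimal` module. This is only a sketch: the `convert` function and its parameters are made up for the example, and the amounts and rates are arbitrary numbers, not real exchange rates.

```python
from decimal import Decimal, ROUND_HALF_EVEN, ROUND_HALF_UP, localcontext


def convert(amount, rate, places, rounding, work_precision=28):
    """Multiply at an intermediate precision, then round exactly once.

    places         - decimal places of the target representation
    rounding       - one of the decimal module's rounding modes
    work_precision - significant digits kept for the intermediate product
    """
    with localcontext() as ctx:
        ctx.prec = work_precision
        raw = amount * rate
    # Decimal(1).scaleb(-2) is 0.01, i.e. the quantum for two places.
    return raw.quantize(Decimal(1).scaleb(-places), rounding=rounding)
```

For instance, `convert(Decimal("2.5"), Decimal("0.05"), 2, ROUND_HALF_EVEN)` gives `Decimal("0.12")`, while the same call with `ROUND_HALF_UP` gives `Decimal("0.13")`: the exact product 0.125 sits on a tie, so the rounding rule decides the result. Storing with more places than are displayed is then just a second call with a smaller `places`.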
Greg Stark wrote
> I don't think this even needs to be tied to currencies. I've often
> thought this would be generally useful for any value with units. This
> would prevent you from accidentally adding miles to kilometers or
> hours to parsecs which is just as valid as preventing you from adding
> CAD to USD.

There is already such a concept - not tied to currencies or units in general.

The SQL standard calls it DISTINCT types. And it can prevent comparing apples to oranges.

I don't have the exact syntax at hand, but it's something like this:

    create distinct type customer_id_type as integer;
    create distinct type order_id_type as integer;

    create table customers (id customer_id_type primary key);
    create table orders (id order_id_type primary key, customer_id customer_id_type not null);

And because those columns are defined with different types, the database will refuse to compare customers.id with orders.id (just like it would refuse to compare an integer with a date).

So an accidental join like this:

    select *
    from orders o
      join customers c using (id);

would throw an error because the data types of the IDs can not be compared.

--
View this message in context: http://postgresql.nabble.com/GSoC-2017-tp5938331p5941383.html
Sent from the PostgreSQL - hackers mailing list archive at Nabble.com.
On 1/27/17 8:17 AM, Brad DeJong wrote:
> Add the potential for regulatory requirements to change at any time - sort of like timezone information. So no hard coded behavior.

Well, I wish we had support for storing those changing requirements as well. If we had that it would greatly simplify having a timestamp type that stores the original timezone.

BTW, time itself fits in the multi-unit pattern, since months don't have a fixed conversion to days (and technically seconds don't have a fixed conversion to anything thanks to leap seconds).

--
Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX
On 27 January 2017 at 14:52, Thomas Kellerer <spam_eater@gmx.net> wrote:
>
> I don't have the exact syntax at hand, but it's something like this:
>
> create distinct type customer_id_type as integer;
> create distinct type order_id_type as integer;
>
> create table customers (id customer_id_type primary key);
> create table orders (id order_id_type primary key, customer_id
> customer_id_type not null);

That seems like a useful thing but it's not exactly the same use case.

Measurements with units and currency amounts both have the property that you are likely to want to have a single column that uses different units for different rows. You can aggregate across them without converting as long as you have an appropriate where clause or group by clause -- GROUP BY units_of(debit_amount) for example.

--
greg
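The grouping idea above (one column holding values in different units, aggregated per unit without any conversion) can be mimicked in a few lines of Python. The rows and the `sum_by_unit` helper are invented for this illustration and stand in for something like `GROUP BY units_of(debit_amount)`.

```python
from collections import defaultdict
from decimal import Decimal


def sum_by_unit(rows):
    """Sum (amount, unit) pairs per unit; values are never mixed across units."""
    totals = defaultdict(Decimal)   # Decimal() is zero
    for amount, unit in rows:
        totals[unit] += amount
    return dict(totals)


# Hypothetical debit column with mixed currencies in a single "column".
DEBITS = [
    (Decimal("10.00"), "USD"),
    (Decimal("5.50"), "CAD"),
    (Decimal("2.25"), "USD"),
]
```

Here `sum_by_unit(DEBITS)` yields one total per currency, 12.25 USD and 5.50 CAD, and no exchange rate is ever needed because amounts in different units are never added together.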
2017-01-10 12:53 GMT+03:00 Alexander Korotkov <a.korotkov@postgrespro.ru>:
> 1. What project ideas we have?
Hi!
We would like to propose a project on rewriting the PostgreSQL executor from the
traditional Volcano-style [1] to the so-called push-based architecture as implemented in
Hyper [2][3] and VitesseDB [4]. The idea is to reverse the direction of data flow
control: instead of pulling up tuples one-by-one with ExecProcNode(), we suggest
pushing them from the bottom to the top until a blocking operator (e.g. Aggregation) is
encountered. There’s a good example and a more detailed explanation of this approach in [2].
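To illustrate the difference in control flow only, here is a toy sketch of the two models in Python. It does not resemble the actual PostgreSQL executor code; all class and function names are made up for the illustration, rows are plain integers, and `None` is assumed to mark the end of input in the pull model.

```python
# Pull (Volcano-style): the consumer asks each node for one row at a time,
# so every node must save and restore its position between calls.
class Scan:
    def __init__(self, rows):
        self.rows = rows
        self.pos = 0              # state that has to survive between calls

    def next(self):
        if self.pos >= len(self.rows):
            return None           # end of input
        row = self.rows[self.pos]
        self.pos += 1
        return row


class Filter:
    def __init__(self, child, pred):
        self.child, self.pred = child, pred

    def next(self):
        while True:
            row = self.child.next()
            if row is None or self.pred(row):
                return row


def pull_sum(root):
    total = 0
    while (row := root.next()) is not None:
        total += row
    return total


# Push: the scan is entered once and drives the rows upward through
# callbacks until they reach a blocking operator (here, a sum).
def push_scan(rows, consume):
    for row in rows:              # loop state stays local to one call
        consume(row)


def push_filter(pred, consume):
    def on_row(row):
        if pred(row):
            consume(row)
    return on_row


class SumSink:
    """Blocking operator: accumulates until the input is exhausted."""

    def __init__(self):
        self.total = 0

    def __call__(self, row):
        self.total += row


def push_sum(rows, pred):
    sink = SumSink()
    push_scan(rows, push_filter(pred, sink))
    return sink.total
```

Both pipelines compute the same result; in the pull version `Scan.next()` is entered once per row and has to reload `self.pos` each time, while in the push version `push_scan` runs a single loop for the whole input.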
The advantages of this approach:
* It allows us to completely avoid loading/storing the internal state of the bottommost
(scanning) nodes, which will significantly reduce overhead. With the current pull-based model,
we call functions like heapgettup_pagemode() (and many others) number-of-tuples-to-retrieve
times, while in the push-based model we will call them only once. Currently, we have
implemented a prototype for the SeqScan node and achieved a 2x speedup on the query
“select * from lineitem”;
* The number of memory accesses is minimized; code and data locality are generally better,
and the cache is used more effectively;
* Switching to the push model also makes a good base for building an effective JIT compiler.
Currently we have a working LLVM-based JIT compiler for expressions [5], as well as a whole-query
JIT compiler [6], which speeds up TPC-H queries up to 4-5 times, but the latter required manually
re-implementing the executor logic with the LLVM API using the push model to get this speedup. JIT-compiling
the original Postgres C code didn't give a significant improvement because of the inherent inefficiency
of the Volcano-style model. After switching to the push model we expect to achieve a speedup comparable
to the stand-alone JIT, but using the same code for both the JIT and the interpreter.
Also, while working on this project, we are likely to reveal and fix other
weak places of the current query executor. The Volcano-style model is known to have
inadequate performance characteristics [7][8], e.g. function call overhead,
and we should deal with it anyway. We also plan to make relatively small patches
which will optimize the redundant reloading of the internal state in the current pull model.
Many DB systems with support for full query compilation (e.g. LegoBase [9], Hekaton [10]) implement it in a push-based manner.
Also we have seen on the mailing list that Kumar Rajeev had been investigating this idea too, and he reported that the results were impressive (unfortunately, without specifying more details):
https://www.postgresql.org/message-id/BF2827DCCE55594C8D7A8F7FFD3AB77159A9B904%40szxeml521-mbs.china.huawei.com
References
[1] Graefe G. Volcano - an extensible and parallel query evaluation system. IEEE Trans. Knowl. Data Eng., 6(1): 120–135, 1994.
[2] Efficiently Compiling Efficient Query Plans for Modern Hardware,
http://www.vldb.org/pvldb/vol4/p539-neumann.pdf
[3] Compiling Database Queries into Machine Code,
http://sites.computer.org/debull/A14mar/p3.pdf
[5] PostgreSQL with JIT compiler for expressions,
https://github.com/ispras/postgres
[6] LLVM Cauldron, slides,
http://llvm.org/devmtg/2016-09/slides/Melnik-PostgreSQLLLVM.pdf
[7] MonetDB/X100: Hyper-Pipelining Query Execution
http://cidrdb.org/cidr2005/papers/P19.pdf
[8] Vectorization vs. Compilation in Query Execution,
https://pdfs.semanticscholar.org/dcee/b1e11d3b078b0157325872a581b51402ff66.pdf
[9] http://www.vldb.org/pvldb/vol7/p853-klonatos.pdf
[10] https://www.microsoft.com/en-us/research/wp-content/uploads/2013/06/Hekaton-Sigmod2013-final.pdf
Ruben. <ruben@ispras.ru>
ISP RAS.
On 2017/02/06 20:51, Ruben Buchatskiy wrote:
> Also we have seen in the mailing list that Kumar Rajeev had been
> investigating this idea too, and he reported that the results were
> impressive (unfortunately, without specifying more details):
>
> https://www.postgresql.org/message-id/BF2827DCCE55594C8D7A8F7FFD3AB77159A9B904%40szxeml521-mbs.china.huawei.com

You might also want to take a look at some of the ongoing work in this area:

WIP: Faster Expression Processing and Tuple Deforming (including JIT)
https://www.postgresql.org/message-id/flat/20161206034955.bh33paeralxbtluv%40alap3.anarazel.de

Thanks,
Amit
Greetings,

* Amit Langote (Langote_Amit_f8@lab.ntt.co.jp) wrote:
> On 2017/02/06 20:51, Ruben Buchatskiy wrote:
> > Also we have seen in the mailing list that Kumar Rajeev had been
> > investigating this idea too, and he reported that the results were
> > impressive (unfortunately, without specifying more details):
> >
> > https://www.postgresql.org/message-id/BF2827DCCE55594C8D7A8F7FFD3AB77159A9B904%40szxeml521-mbs.china.huawei.com
>
> You might also want to take a look at some of the ongoing work in this area:
>
> WIP: Faster Expression Processing and Tuple Deforming (including JIT)
> https://www.postgresql.org/message-id/flat/20161206034955.bh33paeralxbtluv%40alap3.anarazel.de

Yes, exactly that. Please review what's been currently done and, ideally, have someone like Andres comment on your plan.

Perhaps you could arrange something with him as the mentor, since it looked like you didn't have any specific mentors listed in a quick look. That's definitely something that will be needed to include this project.

Thanks!

Stephen
* Stephen Frost (sfrost@snowman.net) wrote:
> Yes, exactly that. Please review what's been currently done and,
> ideally, have someone like Andres comment on your plan.
>
> Perhaps you could arrange something with him as the mentor, since it
> looked like you didn't have any specific mentors listed in a quick look.
> That's definitely something that will be needed to include this project.

Apologies, looks like you do have a couple of mentors listed on the wiki, so that looks good.

Thanks!

Stephen
Ruben,

* Ruben Buchatskiy (ruben@ispras.ru) wrote:
> Difficulty Level
> Moderate-level; however, microoptimizations might be hard.
> Probably it will also be hard to keep the whole architecture as clean as it is
> now.

The above difficulty level looks fine, but doesn't match what's on the wiki. What's on the wiki looks like a copy/paste from one of the SSI-related items. Please fix.

Thanks!

Stephen
On Mon, Feb 6, 2017 at 6:51 AM, Ruben Buchatskiy <ruben@ispras.ru> wrote:
> 2017-01-10 12:53 GMT+03:00 Alexander Korotkov <a.korotkov@postgrespro.ru>:
>> 1. What project ideas we have?
>
> We would like to propose a project on rewriting PostgreSQL executor from
> traditional Volcano-style [1] to so-called push-based architecture as
> implemented in Hyper [2][3] and VitesseDB [4]. The idea is to reverse the
> direction of data flow control: instead of pulling up tuples one-by-one
> with ExecProcNode(), we suggest pushing them from below to top until
> blocking operator (e.g. Aggregation) is encountered. There’s a good
> example and more detailed explanation for this approach in [2].

I think this is very possibly a good idea but extremely unlikely to be something that a college student or graduate student can complete in one summer. More like an existing expert developer and a year of doing not much else.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
Robert Haas wrote:
> I think this is very possibly a good idea but extremely unlikely to be
> something that a college student or graduate student can complete in
> one summer. More like an existing expert developer and a year of
> doing not much else.

+1
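
For readers unfamiliar with the two executor styles discussed above, here is a minimal sketch of the difference in C. It is not PostgreSQL code: apart from the idea behind ExecProcNode(), every name here (Scan, scan_next, filter_next, Consumer, scan_push, sum_pull, sum_push) is hypothetical, and tuples are reduced to plain ints. The plan is the same in both halves: scan an array, keep values >= min, and sum them. In the pull (Volcano-style) half the top node asks its child for one tuple at a time; in the push half the scan drives the loop and hands each tuple upward until it reaches the blocking operator (the aggregate).

```c
#include <assert.h>
#include <stddef.h>

/* Toy "table": an array of ints standing in for tuples (an assumption). */
typedef struct {
    const int *rows;
    size_t nrows;
    size_t pos;                 /* cursor, needed only by the pull model */
} Scan;

/* ---- Pull (Volcano-style): the parent asks the child for the next tuple ---- */

/* Returns 1 and stores a tuple in *out, or 0 when the input is exhausted. */
static int scan_next(Scan *s, int *out)
{
    if (s->pos >= s->nrows)
        return 0;
    *out = s->rows[s->pos++];
    return 1;
}

/* Filter node: keeps pulling from its child until a tuple qualifies. */
static int filter_next(Scan *child, int min, int *out)
{
    while (scan_next(child, out))
        if (*out >= min)
            return 1;
    return 0;
}

/* Aggregate node at the top of the plan drives execution by pulling. */
long sum_pull(const int *rows, size_t n, int min)
{
    assert(rows != NULL || n == 0);
    Scan s = { rows, n, 0 };
    long total = 0;
    int v;

    while (filter_next(&s, min, &v))
        total += v;
    return total;
}

/* ---- Push: the child hands each tuple to its parent's consume callback ---- */

typedef void (*Consumer)(void *state, int tuple);

typedef struct { long total; } SumState;

/* Blocking operator: accumulates everything it is given. */
static void sum_consume(void *state, int tuple)
{
    ((SumState *) state)->total += tuple;
}

typedef struct {
    int min;
    Consumer parent;            /* where qualifying tuples are pushed */
    void *parent_state;
} FilterState;

/* Filter node: forwards qualifying tuples upward, drops the rest. */
static void filter_consume(void *state, int tuple)
{
    FilterState *f = state;

    if (tuple >= f->min)
        f->parent(f->parent_state, tuple);
}

/* Scan node at the bottom of the plan drives execution by pushing. */
static void scan_push(const int *rows, size_t n, Consumer parent, void *parent_state)
{
    for (size_t i = 0; i < n; i++)
        parent(parent_state, rows[i]);
}

long sum_push(const int *rows, size_t n, int min)
{
    assert(rows != NULL || n == 0);
    SumState agg = { 0 };
    FilterState f = { min, sum_consume, &agg };

    scan_push(rows, n, filter_consume, &f);
    return agg.total;
}
```

Both functions compute the same answer; for the rows {5, 1, 8, 3, 10} and min = 4, each returns 23. The toy hides what makes the real change expensive: in an actual executor every node type, plus rescans, parameter passing, and early termination (e.g. LIMIT), would need a push-side equivalent, which is the scope concern raised in the reply above.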
On Tue, Feb 28, 2017 at 11:42 AM, Alexander Korotkov <a.korotkov@postgrespro.ru> wrote:
> Hi all!
>
> It seems that PostgreSQL has passed to GSoC mentoring organizations this
> year!
> https://summerofcode.withgoogle.com/organizations/4558465230962688/
> Congratulations!

Very cool!

By the way, that page claims that PostgreSQL runs on Irix and Tru64,
which hasn't been true for a few years.

--
Thomas Munro
http://www.enterprisedb.com
On 2/27/17 4:52 PM, Thomas Munro wrote:
> By the way, that page claims that PostgreSQL runs on Irix and Tru64,
> which hasn't been true for a few years.

There could be a GSoC project to add support for those back in... ;P

--
Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX
Experts in Analytics, Data Architecture and PostgreSQL
Data in Trouble? Get it in Treble! http://BlueTreble.com
855-TREBLE2 (855-873-2532)
On Thu, Mar 2, 2017 at 3:45 AM, Jim Nasby <Jim.Nasby@bluetreble.com> wrote:
> On 2/27/17 4:52 PM, Thomas Munro wrote:
>> By the way, that page claims that PostgreSQL runs on Irix and Tru64,
>> which hasn't been true for a few years.
>
> There could be a GSoC project to add support for those back in... ;P

Greg Stark and Tom Lane did some work to fix problems in our VAX
support a few years ago (try git log --grep=VAX), but I don't think
Greg ever got it fully working.

There could be some point to putting more effort into making
PostgreSQL scale to very small systems. We seem to run pretty well
even on very low-end hardware like a Raspberry Pi, but there's always
something lower-end, and having compile or runtime options that lower
our memory footprint would probably be useful as the natural opposite
of the scalability and parallel query work we've been doing over the
last few years. Whether it's also useful to try to support running
the system on unobtainable operating systems is less clear to me.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
Robert Haas <robertmhaas@gmail.com> writes:
> On Thu, Mar 2, 2017 at 3:45 AM, Jim Nasby <Jim.Nasby@bluetreble.com> wrote:
>> On 2/27/17 4:52 PM, Thomas Munro wrote:
>>> By the way, that page claims that PostgreSQL runs on Irix and Tru64,
>>> which hasn't been true for a few years.
>> There could be a GSoC project to add support for those back in... ;P
> ... Whether it's also
> useful to try to support running the system on unobtainable operating
> systems is less clear to me.

I seriously doubt that we'd take patches to run on non-mainstream OSes
without a concomitant promise to support buildfarm animals running such
OSes for the foreseeable future. Without that we don't know if the
patches still work even a week after they're committed. We killed the
above-mentioned OSes mainly for lack of any such animals, IIRC.

regards, tom lane