Discussion: Proposal: Pluggable Optimizer Interface
Hi All,
Tomas Kovarik and I presented at PGCon 2007 in Ottawa
our ideas about other possible optimizer algorithms to be used
in PostgreSQL.
We are quite new to the PostgreSQL project, so it took us some
time to go through the sources and explore the possibilities
for how things could be implemented.
There is a proposal attached to this mail about the interface
we would like to implement for switching between different
optimizers. Please review it and give us your feedback.
Thank you.
Regards
Julius Stroffek
Julius Stroffek wrote:
> Hi All,
>
> Tomas Kovarik and I have presented at PGCon 2007 in Ottawa
> the ideas about other possible optimizer algorithms to be used
> in PostgreSQL.
>
> We are quite new to PostgreSQL project so it took us some
> time to go through the sources and explore the possibilities
> how things could be implemented.
>
> There is a proposal attached to this mail about the interface
> we would like to implement for switching between different
> optimizers. Please review it and provide a feedback to us.
> Thank You.

hmm - how is that proposal different from what got implemented with:

http://archives.postgresql.org/pgsql-committers/2007-05/msg00315.php

Stefan
Stefan Kaltenbrunner <stefan@kaltenbrunner.cc> writes:
> Julius Stroffek wrote:
>> There is a proposal attached to this mail about the interface
>> we would like to implement for switching between different
>> optimizers. Please review it and provide a feedback to us.

> hmm - how is that proposal different from what got implemented with:
> http://archives.postgresql.org/pgsql-committers/2007-05/msg00315.php

Well, it's a very different level of abstraction. The planner_hook
would allow you to replace the *entire* planner, but if you only want to
replace GEQO (that is, only substitute some other heuristics for partial
search of a large join-order space), doing it from planner_hook will
probably require duplicating a great deal of code. A hook right at the
place where we currently choose "geqo or regular" would be a lot easier
to experiment with.

Replacing GEQO sounds like a fine area for investigation to me; I've
always been dubious about whether it's doing a good job. But I'd prefer
a simple hook function pointer designed in the same style as
planner_hook (ie, intended to be overridden by a loadable module).
The proposed addition of a system catalog and SQL-level management
commands sounds like a great way to waste a lot of effort on mere
decoration, before ever getting to the point of being able to
demonstrate that there's any value in it. Also, while we might accept
a small hook-function patch for 8.3, there's zero chance of any of that
other stuff making it into this release cycle.

regards, tom lane
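For readers who want to see the shape of such a hook, here is a minimal standalone sketch of a function-pointer hook placed at the "geqo or regular" decision point. It is a mock, not PostgreSQL source: JoinProblem, SearchKind, choose_join_search, my_join_search and the name join_search_hook are hypothetical names chosen for illustration, and the threshold test is an assumption about how the choice could be made.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the planner state of one join problem. */
typedef struct JoinProblem
{
	int			nrels;			/* number of relations to join */
	int			geqo_threshold;	/* use the heuristic search at this size */
} JoinProblem;

/* Which search strategy handled the problem (illustration only). */
typedef enum
{
	SEARCH_EXHAUSTIVE,
	SEARCH_GEQO,
	SEARCH_CUSTOM
} SearchKind;

/* Signature that any replacement join-order search has to match. */
typedef SearchKind (*join_search_hook_type) (const JoinProblem *problem);

/* NULL means no module is installed; a loadable module would set this. */
join_search_hook_type join_search_hook = NULL;

/* Placeholder for the regular exhaustive search. */
static SearchKind
exhaustive_search(const JoinProblem *problem)
{
	(void) problem;
	return SEARCH_EXHAUSTIVE;
}

/* Placeholder for the GEQO heuristic search. */
static SearchKind
geqo_search(const JoinProblem *problem)
{
	(void) problem;
	return SEARCH_GEQO;
}

/*
 * The single decision point: an installed hook takes over the whole
 * join-order search; otherwise fall back to "geqo or regular".
 */
SearchKind
choose_join_search(const JoinProblem *problem)
{
	assert(problem != NULL && problem->nrels > 0);

	if (join_search_hook != NULL)
		return join_search_hook(problem);
	if (problem->nrels >= problem->geqo_threshold)
		return geqo_search(problem);
	return exhaustive_search(problem);
}

/* Example replacement that an experimental module might install. */
SearchKind
my_join_search(const JoinProblem *problem)
{
	(void) problem;
	return SEARCH_CUSTOM;
}
```

In this sketch a module would only need to assign my_join_search to join_search_hook at load time; everything outside choose_join_search stays untouched, which is the advantage over replacing the whole planner.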
Stefan,
thanks for pointing this out. I missed this change.
We would like to place the hooks in a different place in the planner: we would like to replace just the non-deterministic algorithm that searches for the best join order and keep the rest of the planner untouched.
I am not quite sure how what got implemented is meant to be used from the user's point of view. I have read only the code of the patch. Are there more explanations somewhere else?
As I understand it, if a user creates his own implementation of the planner, stored in some external library, he has to provide a C-language function as a "hook activator" that assigns the desired value to the planner_hook variable. Both the activator function and the new planner implementation have to be located in the same dynamic library, which is loaded when the CREATE FUNCTION statement is used on the "hook activator" function.
Am I correct? Have I missed something?
If the above is the case, then it is exactly what we wanted, except that we would also like to have a hook in a different place.
There are more things in the proposal, such as a new pg_optimizer catalog and a different way of configuring the hooks. However, these things are not mandatory for the functionality; they just make it more user friendly.
Thanks
Julo
Julius Stroffek <Julius.Stroffek@Sun.COM> writes:
> I understood that if the user creates his own implementation of the
> planner which can be stored in some external library, he has to provide
> some C language function as a "hook activator" which will assign the
> desired value to the planner_hook variable. Both, the activator function
> and the new planner implementation have to be located in the same
> dynamic library which will be loaded when CREATE FUNCTION statement
> would be used on "hook activator" function.

You could do it that way if you wanted, but a minimalistic solution is
just to install the hook from the _PG_init function of a loadable
library, and then LOAD is sufficient for a user to execute the thing.
There's a small example at
http://archives.postgresql.org/pgsql-patches/2007-05/msg00421.php

Also, having the loadable module add a custom GUC variable would likely
be a preferable solution for control purposes than making specialized
functions. I attach another small hack I made recently, which simply
scales all the planner's relation size estimates by a scale_factor GUC;
this is handy for investigating how a plan will change with relation
size, without having to actually create gigabytes of test data.

> There are more things in the proposal as a new pg_optimizer catalog and
> different way of configuring the hooks. However, these things are not
> mandatory for the functionality but are more user friendly.

Granted, but at this point we are talking about infrastructure for
planner-hackers to play with, not something that's intended to be a
long-term API for end users. It may or may not happen that we ever
need a user API for this at all. I think a planner that just "does the
right thing" is far preferable to one with a lot of knobs that users
have to know how to twiddle, so I see this more as scaffolding on which
someone can build and test the replacement for GEQO; which ultimately
would go in without any user-visible API additions.
regards, tom lane

#include "postgres.h"

#include <math.h>

#include "fmgr.h"
#include "commands/explain.h"
#include "optimizer/plancat.h"
#include "optimizer/planner.h"
#include "utils/guc.h"


PG_MODULE_MAGIC;

void		_PG_init(void);
void		_PG_fini(void);

static double scale_factor = 1.0;

static void my_get_relation_info(PlannerInfo *root, Oid relationObjectId,
								 bool inhparent, RelOptInfo *rel);


/*
 * Get control during planner's get_relation_info() function, which sets up
 * a RelOptInfo struct based on the system catalog contents.  We can modify
 * the struct contents to cause the planner to work with a hypothetical
 * situation rather than what's actually in the catalogs.
 *
 * This simplistic example just scales all relation size estimates by a
 * user-settable factor.
 */
static void
my_get_relation_info(PlannerInfo *root, Oid relationObjectId,
					 bool inhparent, RelOptInfo *rel)
{
	ListCell   *ilist;

	/* Do nothing for an inheritance parent RelOptInfo */
	if (inhparent)
		return;

	rel->pages = (BlockNumber) ceil(rel->pages * scale_factor);
	rel->tuples = ceil(rel->tuples * scale_factor);

	foreach(ilist, rel->indexlist)
	{
		IndexOptInfo *ind = (IndexOptInfo *) lfirst(ilist);

		ind->pages = (BlockNumber) ceil(ind->pages * scale_factor);
		ind->tuples = ceil(ind->tuples * scale_factor);
	}
}


/*
 * _PG_init()			- library load-time initialization
 *
 * DO NOT make this static nor change its name!
 */
void
_PG_init(void)
{
	/* Get into the hooks we need to be in all the time */
	get_relation_info_hook = my_get_relation_info;
	/* Make scale_factor accessible through GUC */
	DefineCustomRealVariable("scale_factor",
							 "",
							 "",
							 &scale_factor,
							 0.0001,
							 1e9,
							 PGC_USERSET,
							 NULL,
							 NULL);
}


/*
 * _PG_fini()			- library unload-time finalization
 *
 * DO NOT make this static nor change its name!
 */
void
_PG_fini(void)
{
	/* Get out of all the hooks (just to be sure) */
	get_relation_info_hook = NULL;
}
Tom,

> Also, while we might accept
> a small hook-function patch for 8.3, there's zero chance of any of that
> other stuff making it into this release cycle.

I don't think anyone was thinking about 8.3. This is pretty much 8.4
stuff; Julius is just raising it now because they don't want to go down
the wrong path and waste everyone's time.

--
--Josh

Josh Berkus
PostgreSQL @ Sun
San Francisco
Josh Berkus <josh@agliodbs.com> writes:
> Tom,
>> Also, while we might accept
>> a small hook-function patch for 8.3, there's zero chance of any of that
>> other stuff making it into this release cycle.

> I don't think anyone was thinking about 8.3. This is pretty much 8.4
> stuff; Julius is just raising it now because they don't want to go down
> the wrong path and waste everyone's time.

Well, if they get the hook in now, then in six months or so when they
have something to play with, people would be able to play with it.
If not, there'll be zero uptake till after 8.4 is released...

regards, tom lane