Re: [PATCH] bitmap indexes
От | Alvaro Herrera |
---|---|
Тема | Re: [PATCH] bitmap indexes |
Дата | |
Msg-id | 20130925222017.GX4832@eldon.alvh.no-ip.org обсуждение исходный текст |
Ответ на | [PATCH] bitmap indexes (Abhijit Menon-Sen <ams@2ndquadrant.com>) |
Ответы |
Re: [PATCH] bitmap indexes
(Antonin Houska <antonin.houska@gmail.com>)
Re: [PATCH] bitmap indexes (Abhijit Menon-Sen <ams@2ndquadrant.com>) Re: [PATCH] bitmap indexes (Jeff Janes <jeff.janes@gmail.com>) |
Список | pgsql-hackers |
Hi, Here are some quick items while skimming this patch. I am looking at commit 6448de29d from your github repo, branch bmi. What's with the pg_bitmapindex stuff in pg_namespace.h? It doesn't seem to be used anywhere. This led me to research how these indexes are stored. I note that what we're doing here is to create another regular table and a btree index on top of it, and those are the ones that actually store the index data. This seems grotty and completely unlike the way we do things elsewhere (compare GIN indexes which have rbtrees inside them). I think this should be stored in separate forks, or separate kinds of pages intermixed in the index's main fork. I don't like that files are named bitmap.c/.h. We already have bitmap scans, so having the generic concept (bitmap) show up as a file name is confusing. To cite an example, see the name of the bmgetbitmap function (which returns a TID bitmap from a bitmap index. How is that not confusing?). I think I would be more comfortable with the files being called "bmi" or maybe "bitmapidx", or something like that. (For sure do not change the AM's name. I mean "CREATE INDEX .. USING bmi" would suck.) Not happy with (the contents of) src/include/access/bitmap.h. There's way too much stuff in a single file. I think a three-file split would work nicely: one for the AM routine declarations (bitmap.h), one for xlog stuff (bitmap_xlog.h), one for internal routines and struct declarations (bitmap_priv.h, bitmap_internals.h, or something like that). Also, I think macros and structs should be defined in a narrow scope as possible; for example, macros such as HEADER_SET_FILL_BIT_ON should be defined in bitmaputil.c, not bitmap.h (that macro is missing parens BTW). For macros that are defined in headers, it would be better to have prefixes that scope them to bitmaps; for example IS_FILL_WORD should maybe have a BM_ prefix or something similar. I don't think it's necessary to renumber relopt_kind. Just stash the new one at the end of the enum. bmoptions's DESCR entry in pg_proc.h says "btree". contrib/bmfuncs.c defines CHECK_PAGE_OFFSET_RANGE but doesn't seem to use it anywhere. same file defines CHECK_RELATION_BLOCK_RANGE using a bare { } block. Our style is to have these multi-statement macros use do {} while (false). #include lines in source files are normally alphabetically sorted. The new code fails to meet this expectation in many places. First four lines of _bitmap_init seem gratuitous .. All those #ifdef DEBUG_BMI lines sprinkled all over the place look pretty bad; they interrupt the indentation flow. I would suggest to define a macro or some macros to emit debugging messages, which are enabled or disabled in a single place depending on DEBUG_BMI. Something simple such as DO_DB() in fd.c would suffice. I don't like this comment one bit: /* misteriously, MemSet segfaults... :( */ I think that's probably a bug that should be investigated rather than papered over. I don't understand the cur_bmbuild thingy .. I think that stuff should be passed down as arguments to whatever do the index build, instead of being a magical global var that also signals failure to find hash functions for some datatypes. Above definition of BMBuildHashData there's a comment referring us to execGrouping.c but I cannot understand what it refers to. "block unfound"? In general I think it's poor style to spell out the function name in error messages. I mean, ereport() already reports the function name. Also, please don't split long error messages across multiple lines; better to leave the line to run off the screen. I'm unsure about distinguishing errors in the recovery routines that raise ERROR from those that PANIC. I mean, they would both cause the server to PANIC. The bitmap_xlog_cleanup routine talks about an "uninitialized reln". That's a large point about xlog recovery -- you don't have relations. You only have relfilenodes. I think you need to shuffle some routines so that they can work with only a relfilenode. Generally, the xlog stuff needs a lot of comments. A pgindent run would be beneficial. -- Álvaro Herrera http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services
В списке pgsql-hackers по дате отправления:
Предыдущее
От: "MauMau"Дата:
Сообщение: Re: UTF8 national character data type support WIP patch and list of open issues.