Re: TOASTing smaller things
От | Chris Browne |
---|---|
Тема | Re: TOASTing smaller things |
Дата | |
Msg-id | 60ircujq6s.fsf@dba2.int.libertyrms.com обсуждение исходный текст |
Ответ на | TOASTing smaller things (Chris Browne <cbbrowne@acm.org>) |
Список | pgsql-hackers |
tgl@sss.pgh.pa.us (Tom Lane) writes: > Chris Browne <cbbrowne@acm.org> writes: >> #define TOAST_DENOMINATOR 17 >> /* Use this as the divisor; current default behaviour falls from TOAST_DENOMINATOR = 4 */ > >> #define TOAST_TUPLE_THRESHOLD^I\ >> ^IMAXALIGN_DOWN((BLCKSZ - \ >> ^I^I^I^I MAXALIGN(sizeof(PageHeaderData) + 3 * sizeof(ItemIdData))) \ >> ^I^I^I^I / TOAST_DENOMINATOR) > > Given that you are quoting code that was demonstrably broken since > the original coding of TOAST up till a month or two back, "it passes > regression" is not adequate proof of "it's right". In fact I think > it's not right; you have not got the roundoff condition straight. OK, then maybe some refinement was needed. That seemed too easy. Mind you, the problem seems to me to be that TOAST_TUPLE_THRESHOLD is not solely a threshold to compare things to (as done in heapam.c/toasting.c), but gets reused to calculate TOAST_MAX_CHUNK_SIZE. If the threshold was solely used as that, alignment wouldn't matter. FYI, I took a sample table and loaded it into the resulting 8.3 backend based on the us of the totally naive TOAST_DENOMINATOR; there may be something off in the sizing of the chunks, but that does not appear to have injured fidelity of the tuples I stored. Vacuum output: --- Production system (no TOASTing) INFO: "xml_log_table": found 0 removable, 1731329 nonremovable row versions in 175870 pages DETAIL: 0 dead row versions cannot be removed yet. --- In the 8.3 instance that did toast things... INFO: "xml_log_table": found 0 removable, 1730737 nonremovable row versions in 41120 pages DETAIL: 0 dead row versions cannot be removed yet. There were 0 unused item pointers. 610 pages contain useful free space. 0 pages are entirely empty. CPU 1.08s/0.36u sec elapsed 14.94 sec. INFO: vacuuming "pg_toast.pg_toast_49194" INFO: index "pg_toast_49194_index" now contains 2303864 row versions in 6319 pages DETAIL: 0 index row versions were removed. 0 index pages have been deleted, 0 are currently reusable. CPU 0.09s/0.03u sec elapsed 1.71 sec. INFO: "pg_toast_49194": found 0 removable, 2303864 nonremovable row versions in 98191 pages DETAIL: 0 dead row versions cannot be removed yet. There were 0 unused item pointers. 514 pages contain useful free space. 0 pages are entirely empty. Problem with alignment of TOAST_MAX_CHUNK_SIZE or not, I seem to be getting the right results, and this nicely partitions the table into 2 chunks, one, with the non-XML data, that occupies 41K pages, and the TOAST section storing those less-frequently-accessed columns. (There is a size difference; the production instance has more empty space since it sees active inserts + deletes.) In all ways except for "strict hygenic correctness of code," this accomplished what I was hoping. If someone could make a round-off-safe calculation of TOAST_TUPLE_THRESHOLD and TOAST_MAX_CHUNK_SIZE that exposed the denominator so it could be safely modified, that would be one step ahead... I generally try not to make changes to the core, so I'll try to avoid that... >> 4. A different mechanism would be to add a fifth storage column >> strategy (the present four are PLAIN, EXTENDED, EXTERNAL, MAIN), let's >> say, TOAST. > > Anything along this line would require invoking the toaster on every > single tuple, since we'd always have to crawl through all the columns > to see if toasting was supposed to happen. No thanks. Ah, I see. I infer from that that the code starts by checking to see if the tuple size is > TOAST_TUPLE_THRESHOLD, and only starts rummaging through TOAST infrastructure if the tuple is big enough. In that case, "TOAST by default" becomes rather a nonstarter, I agree. In the application context I'm thinking of, one table out of ~80 is a "TOAST candidate;" making access to the other 79 slower would not be of benefit. (Aside: I'll link to Simon Rigg's related note, as well as to the item on the TODO list...) <http://archives.postgresql.org/pgsql-hackers/2007-02/msg00213.php> <http://www.postgresql.org/docs/faqs.TODO.html> -- let name="cbbrowne" and tld="cbbrowne.com" in name ^ "@" ^ tld;; http://cbbrowne.com/info/finances.html Where do you *not* want to go today? "Confutatis maledictis, flammis acribus addictis" (<http://www.hex.net/~cbbrowne/msprobs.html>
В списке pgsql-hackers по дате отправления:
Следующее
От: Heikki LinnakangasДата:
Сообщение: Re: relation 71478240 deleted while still in use on 8.1