Обсуждение: About connectby() again
On Sat, 07 Sep 2002 10:21:21 -0700 Joe Conway <mail@joeconway.com> wrote: > I just sent in a patch using the ancestor check method. It turned out > that the performance hit was pretty small on a moderate sized tree. > > My test case was a 220000 record bill-of-material table. The tree built > was 9 levels deep with about 3800 nodes. The performance hit was only > about 1%. The previous patch fixed an infinite recursion bug in contrib/tablefunc/tablefunc.c:connectby. But, other unmanageable error seems to occur even if a table has commonplace tree data(see below). I would think the patch, ancestor check, should be if (strstr(branch_delim || branchstr->data || branch_delim, branch_delim || current_key || branch_delim)) This is my image, not a real code. However, if branchstr->data includes branch_delim, my image will not be perfect. -- test connectby with int based hierarchy DROP TABLE connectby_tree; CREATE TABLE connectby_tree(keyid int, parent_keyid int); INSERT INTO connectby_tree VALUES(11,NULL); INSERT INTO connectby_tree VALUES(10,11); INSERT INTO connectby_tree VALUES(111,11); INSERT INTO connectby_tree VALUES(1,111); SELECT * FROM connectby('connectby_tree', 'keyid', 'parent_keyid', '11', 0, '-') AS t(keyid int, parent_keyid int, levelint, branch text) ERROR: infinite recursion detected Regards, Masaru Sugawara
On Fri, 27 Sep 2002 02:02:49 +0900 I wrote <rk73@sea.plala.or.jp> wrote: > On Sat, 07 Sep 2002 10:21:21 -0700 > Joe Conway <mail@joeconway.com> wrote: > > > I just sent in a patch using the ancestor check method. It turned out > > that the performance hit was pretty small on a moderate sized tree. > > > > My test case was a 220000 record bill-of-material table. The tree built > > was 9 levels deep with about 3800 nodes. The performance hit was only > > about 1%. > > > The previous patch fixed an infinite recursion bug in > contrib/tablefunc/tablefunc.c:connectby. But, other unmanageable error > seems to occur even if a table has commonplace tree data(see below). > > > I would think the patch, ancestor check, should be > > if (strstr(branch_delim || branchstr->data || branch_delim, > branch_delim || current_key || branch_delim)) > > This is my image, not a real code. However, if branchstr->data includes ^^^^^^^^^^^^^^^^^^^^^^^^^ keyid or parent_keyid > branch_delim, my image will not be perfect. > > > > > -- test connectby with int based hierarchy > DROP TABLE connectby_tree; > CREATE TABLE connectby_tree(keyid int, parent_keyid int); > > INSERT INTO connectby_tree VALUES(11,NULL); > INSERT INTO connectby_tree VALUES(10,11); > INSERT INTO connectby_tree VALUES(111,11); > INSERT INTO connectby_tree VALUES(1,111); > > SELECT * FROM connectby('connectby_tree', 'keyid', 'parent_keyid', '11', 0, '-') > AS t(keyid int, parent_keyid int, level int, branch text) > > ERROR: infinite recursion detected > > > > Regards, > Masaru Sugawara > > > > ---------------------------(end of broadcast)--------------------------- > TIP 3: if posting/reading through Usenet, please send an appropriate > subscribe-nomail command to majordomo@postgresql.org so that your > message can get through to the mailing list cleanly Regards, Masaru Sugawara
Masaru Sugawara wrote: > The previous patch fixed an infinite recursion bug in > contrib/tablefunc/tablefunc.c:connectby. But, other unmanageable error > seems to occur even if a table has commonplace tree data(see below). > > I would think the patch, ancestor check, should be > > if (strstr(branch_delim || branchstr->data || branch_delim, > branch_delim || current_key || branch_delim)) > > This is my image, not a real code. However, if branchstr->data includes > branch_delim, my image will not be perfect. Good point. Thank you Masaru for the suggested fix. Attached is a patch to fix the bug found by Masaru. His example now produces: regression=# SELECT * FROM connectby('connectby_tree', 'keyid', 'parent_keyid', '11', 0, '-') AS t(keyid int, parent_keyid int, level int, branch text); keyid | parent_keyid | level | branch -------+--------------+-------+---------- 11 | | 0 | 11 10 | 11 | 1 | 11-10 111 | 11 | 1 | 11-111 1 | 111 | 2 | 11-111-1 (4 rows) While making the patch I also realized that the "no show branch" form of the function was not going to work very well for recursion detection. Therefore there is now a default branch delimiter ('~') that is used internally, for that case, to enable recursion detection to work. If you need a different delimiter for your specific data, you will have to use the "show branch" form of the function. If there are no objections, please apply. Thanks, Joe Index: contrib/tablefunc/README.tablefunc =================================================================== RCS file: /opt/src/cvs/pgsql-server/contrib/tablefunc/README.tablefunc,v retrieving revision 1.3 diff -c -r1.3 README.tablefunc *** contrib/tablefunc/README.tablefunc 2 Sep 2002 05:44:04 -0000 1.3 --- contrib/tablefunc/README.tablefunc 26 Sep 2002 22:57:27 -0000 *************** *** 365,371 **** branch_delim ! if optional branch value is desired, this string is used as the delimiter Outputs --- 365,373 ---- branch_delim ! If optional branch value is desired, this string is used as the delimiter. ! When not provided, a default value of '~' is used for internal ! recursion detection only, and no "branch" field is returned. Outputs *************** *** 388,394 **** the level value output 3. If the branch field is not desired, omit both the branch_delim input ! parameter *and* the branch field in the query column definition 4. If the branch field is desired, it must be the forth column in the query column definition, and it must be type TEXT --- 390,399 ---- the level value output 3. If the branch field is not desired, omit both the branch_delim input ! parameter *and* the branch field in the query column definition. Note ! that when branch_delim is not provided, a default value of '~' is used ! for branch_delim for internal recursion detection, even though the branch ! field is not returned. 4. If the branch field is desired, it must be the forth column in the query column definition, and it must be type TEXT Index: contrib/tablefunc/tablefunc.c =================================================================== RCS file: /opt/src/cvs/pgsql-server/contrib/tablefunc/tablefunc.c,v retrieving revision 1.9 diff -c -r1.9 tablefunc.c *** contrib/tablefunc/tablefunc.c 14 Sep 2002 19:32:54 -0000 1.9 --- contrib/tablefunc/tablefunc.c 26 Sep 2002 23:09:27 -0000 *************** *** 652,657 **** --- 652,660 ---- branch_delim = GET_STR(PG_GETARG_TEXT_P(5)); show_branch = true; } + else + /* default is no show, tilde for the delimiter */ + branch_delim = pstrdup("~"); per_query_ctx = rsinfo->econtext->ecxt_per_query_memory; oldcontext = MemoryContextSwitchTo(per_query_ctx); *************** *** 798,807 **** --- 801,816 ---- char *current_branch; char **values; StringInfo branchstr = NULL; + StringInfo chk_branchstr = NULL; + StringInfo chk_current_key = NULL; /* start a new branch */ branchstr = makeStringInfo(); + /* need these to check for recursion */ + chk_branchstr = makeStringInfo(); + chk_current_key = makeStringInfo(); + if (show_branch) values = (char **) palloc(CONNECTBY_NCOLS * sizeof(char *)); else *************** *** 854,875 **** { /* initialize branch for this pass */ appendStringInfo(branchstr, "%s", branch); /* get the next sql result tuple */ spi_tuple = tuptable->vals[i]; /* get the current key and parent */ current_key = SPI_getvalue(spi_tuple, spi_tupdesc, 1); current_key_parent = pstrdup(SPI_getvalue(spi_tuple, spi_tupdesc, 2)); - /* check to see if this key is also an ancestor */ - if (strstr(branchstr->data, current_key)) - elog(ERROR, "infinite recursion detected"); - /* get the current level */ sprintf(current_level, "%d", level); ! /* extend the branch */ appendStringInfo(branchstr, "%s%s", branch_delim, current_key); current_branch = branchstr->data; --- 863,886 ---- { /* initialize branch for this pass */ appendStringInfo(branchstr, "%s", branch); + appendStringInfo(chk_branchstr, "%s%s%s", branch_delim, branch, branch_delim); /* get the next sql result tuple */ spi_tuple = tuptable->vals[i]; /* get the current key and parent */ current_key = SPI_getvalue(spi_tuple, spi_tupdesc, 1); + appendStringInfo(chk_current_key, "%s%s%s", branch_delim, current_key, branch_delim); current_key_parent = pstrdup(SPI_getvalue(spi_tuple, spi_tupdesc, 2)); /* get the current level */ sprintf(current_level, "%d", level); ! /* check to see if this key is also an ancestor */ ! if (strstr(chk_branchstr->data, chk_current_key->data)) ! elog(ERROR, "infinite recursion detected"); ! ! /* OK, extend the branch */ appendStringInfo(branchstr, "%s%s", branch_delim, current_key); current_branch = branchstr->data; *************** *** 913,918 **** --- 924,935 ---- /* reset branch for next pass */ xpfree(branchstr->data); initStringInfo(branchstr); + + xpfree(chk_branchstr->data); + initStringInfo(chk_branchstr); + + xpfree(chk_current_key->data); + initStringInfo(chk_current_key); } }
On Thu, 26 Sep 2002 16:32:08 -0700 Joe Conway <mail@joeconway.com> wrote: > Masaru Sugawara wrote: > > The previous patch fixed an infinite recursion bug in > > contrib/tablefunc/tablefunc.c:connectby. But, other unmanageable error > > seems to occur even if a table has commonplace tree data(see below). > > > > I would think the patch, ancestor check, should be > > > > if (strstr(branch_delim || branchstr->data || branch_delim, > > branch_delim || current_key || branch_delim)) > > > > This is my image, not a real code. However, if branchstr->data includes > > branch_delim, my image will not be perfect. > > Good point. Thank you Masaru for the suggested fix. > > Attached is a patch to fix the bug found by Masaru. His example now produces: > > regression=# SELECT * FROM connectby('connectby_tree', 'keyid', > 'parent_keyid', '11', 0, '-') AS t(keyid int, parent_keyid int, level int, > branch text); > keyid | parent_keyid | level | branch > -------+--------------+-------+---------- > 11 | | 0 | 11 > 10 | 11 | 1 | 11-10 > 111 | 11 | 1 | 11-111 > 1 | 111 | 2 | 11-111-1 > (4 rows) > > While making the patch I also realized that the "no show branch" form of the > function was not going to work very well for recursion detection. Therefore > there is now a default branch delimiter ('~') that is used internally, for > that case, to enable recursion detection to work. If you need a different > delimiter for your specific data, you will have to use the "show branch" form > of the function. > > If there are no objections, please apply. Thanks, I have no objection to your internally adding strings to detect a recursion. And I agree with your definition--the default delimiter is a tilde. Thanks a lot. Regards, Masaru Sugawara
Your patch has been added to the PostgreSQL unapplied patches list at: http://candle.pha.pa.us/cgi-bin/pgpatches I will try to apply it within the next 48 hours. --------------------------------------------------------------------------- Joe Conway wrote: > Masaru Sugawara wrote: > > The previous patch fixed an infinite recursion bug in > > contrib/tablefunc/tablefunc.c:connectby. But, other unmanageable error > > seems to occur even if a table has commonplace tree data(see below). > > > > I would think the patch, ancestor check, should be > > > > if (strstr(branch_delim || branchstr->data || branch_delim, > > branch_delim || current_key || branch_delim)) > > > > This is my image, not a real code. However, if branchstr->data includes > > branch_delim, my image will not be perfect. > > Good point. Thank you Masaru for the suggested fix. > > Attached is a patch to fix the bug found by Masaru. His example now produces: > > regression=# SELECT * FROM connectby('connectby_tree', 'keyid', > 'parent_keyid', '11', 0, '-') AS t(keyid int, parent_keyid int, level int, > branch text); > keyid | parent_keyid | level | branch > -------+--------------+-------+---------- > 11 | | 0 | 11 > 10 | 11 | 1 | 11-10 > 111 | 11 | 1 | 11-111 > 1 | 111 | 2 | 11-111-1 > (4 rows) > > While making the patch I also realized that the "no show branch" form of the > function was not going to work very well for recursion detection. Therefore > there is now a default branch delimiter ('~') that is used internally, for > that case, to enable recursion detection to work. If you need a different > delimiter for your specific data, you will have to use the "show branch" form > of the function. > > If there are no objections, please apply. Thanks, > > Joe > Index: contrib/tablefunc/README.tablefunc > =================================================================== > RCS file: /opt/src/cvs/pgsql-server/contrib/tablefunc/README.tablefunc,v > retrieving revision 1.3 > diff -c -r1.3 README.tablefunc > *** contrib/tablefunc/README.tablefunc 2 Sep 2002 05:44:04 -0000 1.3 > --- contrib/tablefunc/README.tablefunc 26 Sep 2002 22:57:27 -0000 > *************** > *** 365,371 **** > > branch_delim > > ! if optional branch value is desired, this string is used as the delimiter > > Outputs > > --- 365,373 ---- > > branch_delim > > ! If optional branch value is desired, this string is used as the delimiter. > ! When not provided, a default value of '~' is used for internal > ! recursion detection only, and no "branch" field is returned. > > Outputs > > *************** > *** 388,394 **** > the level value output > > 3. If the branch field is not desired, omit both the branch_delim input > ! parameter *and* the branch field in the query column definition > > 4. If the branch field is desired, it must be the forth column in the query > column definition, and it must be type TEXT > --- 390,399 ---- > the level value output > > 3. If the branch field is not desired, omit both the branch_delim input > ! parameter *and* the branch field in the query column definition. Note > ! that when branch_delim is not provided, a default value of '~' is used > ! for branch_delim for internal recursion detection, even though the branch > ! field is not returned. > > 4. If the branch field is desired, it must be the forth column in the query > column definition, and it must be type TEXT > Index: contrib/tablefunc/tablefunc.c > =================================================================== > RCS file: /opt/src/cvs/pgsql-server/contrib/tablefunc/tablefunc.c,v > retrieving revision 1.9 > diff -c -r1.9 tablefunc.c > *** contrib/tablefunc/tablefunc.c 14 Sep 2002 19:32:54 -0000 1.9 > --- contrib/tablefunc/tablefunc.c 26 Sep 2002 23:09:27 -0000 > *************** > *** 652,657 **** > --- 652,660 ---- > branch_delim = GET_STR(PG_GETARG_TEXT_P(5)); > show_branch = true; > } > + else > + /* default is no show, tilde for the delimiter */ > + branch_delim = pstrdup("~"); > > per_query_ctx = rsinfo->econtext->ecxt_per_query_memory; > oldcontext = MemoryContextSwitchTo(per_query_ctx); > *************** > *** 798,807 **** > --- 801,816 ---- > char *current_branch; > char **values; > StringInfo branchstr = NULL; > + StringInfo chk_branchstr = NULL; > + StringInfo chk_current_key = NULL; > > /* start a new branch */ > branchstr = makeStringInfo(); > > + /* need these to check for recursion */ > + chk_branchstr = makeStringInfo(); > + chk_current_key = makeStringInfo(); > + > if (show_branch) > values = (char **) palloc(CONNECTBY_NCOLS * sizeof(char *)); > else > *************** > *** 854,875 **** > { > /* initialize branch for this pass */ > appendStringInfo(branchstr, "%s", branch); > > /* get the next sql result tuple */ > spi_tuple = tuptable->vals[i]; > > /* get the current key and parent */ > current_key = SPI_getvalue(spi_tuple, spi_tupdesc, 1); > current_key_parent = pstrdup(SPI_getvalue(spi_tuple, spi_tupdesc, 2)); > > - /* check to see if this key is also an ancestor */ > - if (strstr(branchstr->data, current_key)) > - elog(ERROR, "infinite recursion detected"); > - > /* get the current level */ > sprintf(current_level, "%d", level); > > ! /* extend the branch */ > appendStringInfo(branchstr, "%s%s", branch_delim, current_key); > current_branch = branchstr->data; > > --- 863,886 ---- > { > /* initialize branch for this pass */ > appendStringInfo(branchstr, "%s", branch); > + appendStringInfo(chk_branchstr, "%s%s%s", branch_delim, branch, branch_delim); > > /* get the next sql result tuple */ > spi_tuple = tuptable->vals[i]; > > /* get the current key and parent */ > current_key = SPI_getvalue(spi_tuple, spi_tupdesc, 1); > + appendStringInfo(chk_current_key, "%s%s%s", branch_delim, current_key, branch_delim); > current_key_parent = pstrdup(SPI_getvalue(spi_tuple, spi_tupdesc, 2)); > > /* get the current level */ > sprintf(current_level, "%d", level); > > ! /* check to see if this key is also an ancestor */ > ! if (strstr(chk_branchstr->data, chk_current_key->data)) > ! elog(ERROR, "infinite recursion detected"); > ! > ! /* OK, extend the branch */ > appendStringInfo(branchstr, "%s%s", branch_delim, current_key); > current_branch = branchstr->data; > > *************** > *** 913,918 **** > --- 924,935 ---- > /* reset branch for next pass */ > xpfree(branchstr->data); > initStringInfo(branchstr); > + > + xpfree(chk_branchstr->data); > + initStringInfo(chk_branchstr); > + > + xpfree(chk_current_key->data); > + initStringInfo(chk_current_key); > } > } > > > ---------------------------(end of broadcast)--------------------------- > TIP 4: Don't 'kill -9' the postmaster -- Bruce Momjian | http://candle.pha.pa.us pgman@candle.pha.pa.us | (610) 359-1001+ If your life is a hard drive, | 13 Roberts Road + Christ can be your backup. | Newtown Square, Pennsylvania19073
Patch applied. Thanks. --------------------------------------------------------------------------- Joe Conway wrote: > Masaru Sugawara wrote: > > The previous patch fixed an infinite recursion bug in > > contrib/tablefunc/tablefunc.c:connectby. But, other unmanageable error > > seems to occur even if a table has commonplace tree data(see below). > > > > I would think the patch, ancestor check, should be > > > > if (strstr(branch_delim || branchstr->data || branch_delim, > > branch_delim || current_key || branch_delim)) > > > > This is my image, not a real code. However, if branchstr->data includes > > branch_delim, my image will not be perfect. > > Good point. Thank you Masaru for the suggested fix. > > Attached is a patch to fix the bug found by Masaru. His example now produces: > > regression=# SELECT * FROM connectby('connectby_tree', 'keyid', > 'parent_keyid', '11', 0, '-') AS t(keyid int, parent_keyid int, level int, > branch text); > keyid | parent_keyid | level | branch > -------+--------------+-------+---------- > 11 | | 0 | 11 > 10 | 11 | 1 | 11-10 > 111 | 11 | 1 | 11-111 > 1 | 111 | 2 | 11-111-1 > (4 rows) > > While making the patch I also realized that the "no show branch" form of the > function was not going to work very well for recursion detection. Therefore > there is now a default branch delimiter ('~') that is used internally, for > that case, to enable recursion detection to work. If you need a different > delimiter for your specific data, you will have to use the "show branch" form > of the function. > > If there are no objections, please apply. Thanks, > > Joe > Index: contrib/tablefunc/README.tablefunc > =================================================================== > RCS file: /opt/src/cvs/pgsql-server/contrib/tablefunc/README.tablefunc,v > retrieving revision 1.3 > diff -c -r1.3 README.tablefunc > *** contrib/tablefunc/README.tablefunc 2 Sep 2002 05:44:04 -0000 1.3 > --- contrib/tablefunc/README.tablefunc 26 Sep 2002 22:57:27 -0000 > *************** > *** 365,371 **** > > branch_delim > > ! if optional branch value is desired, this string is used as the delimiter > > Outputs > > --- 365,373 ---- > > branch_delim > > ! If optional branch value is desired, this string is used as the delimiter. > ! When not provided, a default value of '~' is used for internal > ! recursion detection only, and no "branch" field is returned. > > Outputs > > *************** > *** 388,394 **** > the level value output > > 3. If the branch field is not desired, omit both the branch_delim input > ! parameter *and* the branch field in the query column definition > > 4. If the branch field is desired, it must be the forth column in the query > column definition, and it must be type TEXT > --- 390,399 ---- > the level value output > > 3. If the branch field is not desired, omit both the branch_delim input > ! parameter *and* the branch field in the query column definition. Note > ! that when branch_delim is not provided, a default value of '~' is used > ! for branch_delim for internal recursion detection, even though the branch > ! field is not returned. > > 4. If the branch field is desired, it must be the forth column in the query > column definition, and it must be type TEXT > Index: contrib/tablefunc/tablefunc.c > =================================================================== > RCS file: /opt/src/cvs/pgsql-server/contrib/tablefunc/tablefunc.c,v > retrieving revision 1.9 > diff -c -r1.9 tablefunc.c > *** contrib/tablefunc/tablefunc.c 14 Sep 2002 19:32:54 -0000 1.9 > --- contrib/tablefunc/tablefunc.c 26 Sep 2002 23:09:27 -0000 > *************** > *** 652,657 **** > --- 652,660 ---- > branch_delim = GET_STR(PG_GETARG_TEXT_P(5)); > show_branch = true; > } > + else > + /* default is no show, tilde for the delimiter */ > + branch_delim = pstrdup("~"); > > per_query_ctx = rsinfo->econtext->ecxt_per_query_memory; > oldcontext = MemoryContextSwitchTo(per_query_ctx); > *************** > *** 798,807 **** > --- 801,816 ---- > char *current_branch; > char **values; > StringInfo branchstr = NULL; > + StringInfo chk_branchstr = NULL; > + StringInfo chk_current_key = NULL; > > /* start a new branch */ > branchstr = makeStringInfo(); > > + /* need these to check for recursion */ > + chk_branchstr = makeStringInfo(); > + chk_current_key = makeStringInfo(); > + > if (show_branch) > values = (char **) palloc(CONNECTBY_NCOLS * sizeof(char *)); > else > *************** > *** 854,875 **** > { > /* initialize branch for this pass */ > appendStringInfo(branchstr, "%s", branch); > > /* get the next sql result tuple */ > spi_tuple = tuptable->vals[i]; > > /* get the current key and parent */ > current_key = SPI_getvalue(spi_tuple, spi_tupdesc, 1); > current_key_parent = pstrdup(SPI_getvalue(spi_tuple, spi_tupdesc, 2)); > > - /* check to see if this key is also an ancestor */ > - if (strstr(branchstr->data, current_key)) > - elog(ERROR, "infinite recursion detected"); > - > /* get the current level */ > sprintf(current_level, "%d", level); > > ! /* extend the branch */ > appendStringInfo(branchstr, "%s%s", branch_delim, current_key); > current_branch = branchstr->data; > > --- 863,886 ---- > { > /* initialize branch for this pass */ > appendStringInfo(branchstr, "%s", branch); > + appendStringInfo(chk_branchstr, "%s%s%s", branch_delim, branch, branch_delim); > > /* get the next sql result tuple */ > spi_tuple = tuptable->vals[i]; > > /* get the current key and parent */ > current_key = SPI_getvalue(spi_tuple, spi_tupdesc, 1); > + appendStringInfo(chk_current_key, "%s%s%s", branch_delim, current_key, branch_delim); > current_key_parent = pstrdup(SPI_getvalue(spi_tuple, spi_tupdesc, 2)); > > /* get the current level */ > sprintf(current_level, "%d", level); > > ! /* check to see if this key is also an ancestor */ > ! if (strstr(chk_branchstr->data, chk_current_key->data)) > ! elog(ERROR, "infinite recursion detected"); > ! > ! /* OK, extend the branch */ > appendStringInfo(branchstr, "%s%s", branch_delim, current_key); > current_branch = branchstr->data; > > *************** > *** 913,918 **** > --- 924,935 ---- > /* reset branch for next pass */ > xpfree(branchstr->data); > initStringInfo(branchstr); > + > + xpfree(chk_branchstr->data); > + initStringInfo(chk_branchstr); > + > + xpfree(chk_current_key->data); > + initStringInfo(chk_current_key); > } > } > > > ---------------------------(end of broadcast)--------------------------- > TIP 4: Don't 'kill -9' the postmaster -- Bruce Momjian | http://candle.pha.pa.us pgman@candle.pha.pa.us | (610) 359-1001+ If your life is a hard drive, | 13 Roberts Road + Christ can be your backup. | Newtown Square, Pennsylvania19073