Discussion: BUG #14245: Segfault on weird to_tsquery
The following bug has been logged on the website:

Bug reference:      14245
Logged by:          David Kellum
Email address:      david@gravitext.com
PostgreSQL version: 9.6beta2
Operating system:   Linux
Description:

I am doing some (fuzz) testing of full text queries and managed to
generate the following case which causes a SEGFAULT on PostgreSQL 9.6
beta1 and beta2:

select to_tsquery('!(a & !b) & c') as tsquery

This weird query outputs the following on 9.5.2, instead of crashing:

"!( !'b' ) & 'c'"

Below is my log output, which includes a stack trace:

Jul 12 10:04:01 klein kernel: postgres[22191]: segfault at 10 ip 00000000007754cd sp 00007ffc64b4a950 error 4 in postgres[400000+5f8000]
Jul 12 10:04:01 klein systemd[1]: Started Process Core Dump (PID 22192/UID 0).
Jul 12 10:04:01 klein postgres[482]: LOG:  server process (PID 22191) was terminated by signal 11: Segmentation fault
Jul 12 10:04:01 klein postgres[482]: DETAIL:  Failed process was running: select to_tsquery('!(a & !b) & c') as tsquery
Jul 12 10:04:01 klein postgres[482]: LOG:  terminating any other active server processes
Jul 12 10:04:01 klein postgres[482]: WARNING:  terminating connection because of crash of another server process
Jul 12 10:04:01 klein postgres[482]: DETAIL:  The postmaster has commanded this server process to roll back the current transaction and exit, because another server process exited abnormally and possibly corrupted shared memory.
Jul 12 10:04:01 klein postgres[482]: HINT:  In a moment you should be able to reconnect to the database and repeat your command.
Jul 12 10:04:01 klein postgres[482]: LOG:  all server processes terminated; reinitializing
Jul 12 10:04:01 klein postgres[482]: LOG:  database system was interrupted; last known up at 2016-07-12 10:03:47 PDT
Jul 12 10:04:01 klein systemd-coredump[22193]: Process 22191 (postgres) of user 88 dumped core.
    Stack trace of thread 22191:
    #0  0x00000000007754cd normalize_phrase_tree (postgres)
    #1  0x00000000007756e1 normalize_phrase_tree (postgres)
    #2  0x00000000007756d5 normalize_phrase_tree (postgres)
    #3  0x00000000007759bb cleanup_fakeval_and_phrase (postgres)
    #4  0x0000000000774613 parse_tsquery (postgres)
    #5  0x00000000006ca21a to_tsquery_byid (postgres)
    #6  0x00000000007ad5a7 DirectFunctionCall2Coll (postgres)
    #7  0x00000000005b79c1 ExecMakeFunctionResultNoSets (postgres)
    #8  0x00000000005bd285 ExecProject (postgres)
    #9  0x00000000005d1722 ExecResult (postgres)
    #10 0x00000000005b6a58 ExecProcNode (postgres)
    #11 0x00000000005b2fef standard_ExecutorRun (postgres)
    #12 0x00000000006bbaf8 PortalRunSelect (postgres)
    #13 0x00000000006bcf1e PortalRun (postgres)
    #14 0x00000000006ba979 PostgresMain (postgres)
    #15 0x000000000046f35f ServerLoop (postgres)
    #16 0x000000000066124c PostmasterMain (postgres)
    #17 0x00000000004703ff main (postgres)
    #18 0x00007fe114812741 __libc_start_main (libc.so.6)
    #19 0x0000000000470499 _start (postgres)
Jul 12 10:04:02 klein postgres[482]: LOG:  database system was not properly shut down; automatic recovery in progress
Jul 12 10:04:02 klein postgres[482]: LOG:  invalid record length at 1/2FA3E1C8: wanted 24, got 0
Jul 12 10:04:02 klein postgres[482]: LOG:  redo is not required
Jul 12 10:04:02 klein postgres[482]: LOG:  MultiXact member wraparound protections are now enabled
Jul 12 10:04:02 klein postgres[482]: LOG:  database system is ready to accept connections
Jul 12 10:04:02 klein postgres[482]: LOG:  autovacuum launcher started
On Tue, Jul 12, 2016 at 10:58 AM, <david@gravitext.com> wrote:
> The following bug has been logged on the website:
>
> Bug reference: 14245
> Logged by: David Kellum
> Email address: david@gravitext.com
> PostgreSQL version: 9.6beta2
> Operating system: Linux
> Description:
>
> I am doing some (fuzz) testing of full text queries and managed to
> generate the following case which causes a SEGFAULT on PostgreSQL 9.6
> beta1 and beta2:
>
> select to_tsquery('!(a & !b) & c') as tsquery

Interesting discovery. How did you fuzz test?

--
Peter Geoghegan
On Tue, Jul 12, 2016 at 11:40 AM, Peter Geoghegan <pg@heroku.com> wrote:
> Interesting discovery. How did you fuzz test?

This appears to be a NULL pointer dereference. Here is a backtrace
with proper debug info:

#0  0x0000000000e45ada in normalize_phrase_tree (node=0x0) at tsquery_cleanup.c:397
#1  0x0000000000e468f3 in normalize_phrase_tree (node=<optimized out>) at tsquery_cleanup.c:416
#2  0x0000000000e4687f in normalize_phrase_tree (node=0x0) at tsquery_cleanup.c:543
#3  0x0000000000e44ce9 in cleanup_fakeval_and_phrase (in=<optimized out>) at tsquery_cleanup.c:603
#4  0x0000000000e3f528 in parse_tsquery (buf=<optimized out>, pushval=0x6250002e9490, opaque=<optimized out>, isplain=<optimized out>) at tsquery.c:695
#5  0x0000000000c8abcf in to_tsquery_byid (fcinfo=<optimized out>) at to_tsany.c:372
#6  0x0000000000ee0cc6 in DirectFunctionCall2Coll (func=0xc8aac0 <to_tsquery_byid>, collation=1342381084, arg1=12126, arg2=108095739809240) at fmgr.c:1049
#7  0x000000000093d2a9 in ExecMakeFunctionResultNoSets (fcache=<optimized out>, econtext=0x6250002ee368, isNull=<optimized out>, isDone=<optimized out>) at execQual.c:2041
#8  0x000000000093a89c in ExecTargetList (targetlist=0x6250002ef0e0, tupdesc=<optimized out>, econtext=<optimized out>, values=0x6250002eefb8, isnull=0x6250002eefd8 "\276~\276\276\276"..., itemIsDone=0x6250002ef118, isDone=<optimized out>) at execQual.c:5376
#9  0x000000000093a5ab in ExecProject (projInfo=<optimized out>, isDone=<optimized out>) at execQual.c:5600

***SNIP ***

--
Peter Geoghegan
On Tue, Jul 12, 2016 at 11:40 AM, Peter Geoghegan <pg@heroku.com> wrote:
> On Tue, Jul 12, 2016 at 10:58 AM, <david@gravitext.com> wrote:
>> The following bug has been logged on the website:
>>
>> Bug reference: 14245
>>
>> I am doing some (fuzz) testing of full text queries and managed to
>> generate the following case which causes a SEGFAULT on PostgreSQL 9.6
>> beta1 and beta2:
>>
>> select to_tsquery('!(a & !b) & c') as tsquery
>
> Interesting discovery. How did you fuzz test?

Motivated by the new phrase search support in 9.6, I'm working on a
query language which is lenient to any user input when parsed and can
be transformed and output to PG tsquery syntax. The fuzz testing is
by randomly permuted fragments in the custom query language. Using
this, I found and fixed a bunch of issues in my own parser, and
identified lots of characters to treat as whitespace and filter before
output to tsquery, before stumbling on this Postgres crash.
david@gravitext.com writes:
> I am doing some (fuzz) testing of full text queries and managed to
> generate the following case which causes a SEGFAULT on PostgreSQL 9.6
> beta1 and beta2:

> select to_tsquery('!(a & !b) & c') as tsquery

> This weird query outputs the following on 9.5.2, instead of crashing:

> "!( !'b' ) & 'c'"

Note that while crashing is certainly not good, the pre-9.6 behavior
can hardly be called correct either. What happened to 'a'?

Also, it looks like this is specific to to_tsquery; if you just feed
the same thing to tsqueryin, it seems fine with it:

# select '!(a & !b) & c'::tsquery;
        tsquery
-----------------------
 !( 'a' & !'b' ) & 'c'
(1 row)

regards, tom lane
On Tue, Jul 12, 2016 at 12:42 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> david@gravitext.com writes:
>> I am doing some (fuzz) testing of full text queries and managed to
>> generate the following case which causes a SEGFAULT on PostgreSQL 9.6
>> beta1 and beta2:
>
>> select to_tsquery('!(a & !b) & c') as tsquery
>
>> This weird query outputs the following on 9.5.2, instead of crashing:
>
>> "!( !'b' ) & 'c'"
>
> Note that while crashing is certainly not good, the pre-9.6 behavior
> can hardly be called correct either. What happened to 'a'?

'a' is a stopword, dropped by to_tsquery() as described here:

https://www.postgresql.org/docs/9.6/static/textsearch-controls.html#TEXTSEARCH-PARSING-QUERIES

> The difference is that while basic tsquery input takes the tokens at
> face value, to_tsquery normalizes each token into a lexeme using the
> specified or default configuration, and discards any tokens that are
> stop words according to the configuration.

...and I believe I want this behavior. Otherwise queries with stopword
in '&' condition will not match anything.

In truth I have no reason to want to support this kind of weird double
negative, on any version, and will also look at filtering it out in my
code before calling to_tsquery(). It might be worth noting that these
other slightly different cases are fine on 9.6:

select to_tsquery('!(apple & !b) & c');
---> !( 'appl' & !'b' ) & 'c'

select to_tsquery('!(apple & !a) & c');
---> !'appl' & 'c'

Clearly a pretty obscure case, but a crash nonetheless.

> Also, it looks like this is specific to to_tsquery; if you just feed
> the same thing to tsqueryin, it seems fine with it:
>
> # select '!(a & !b) & c'::tsquery;
>         tsquery
> -----------------------
>  !( 'a' & !'b' ) & 'c'
> (1 row)

Against another test table, English search config, I confirmed that
'a & ball'::tsquery doesn't match anything, but to_tsquery('a & ball')
does.

Thanks,
David
David Kellum <david@gravitext.com> writes:
> On Tue, Jul 12, 2016 at 12:42 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> Note that while crashing is certainly not good, the pre-9.6 behavior
>> can hardly be called correct either. What happened to 'a'?

> 'a' is a stopword, dropped by to_tsquery() as described here:

Ah! OK, so it's probably necessary to have a stopword there in order
to break it.

BTW, all these variants also crash:

select to_tsquery('!(a | !b) & c') as tsquery;
select to_tsquery('!( !b & a) & c') as tsquery;
select to_tsquery('!( !b | a) & c') as tsquery;

regards, tom lane
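[Editorial sketch, not part of the original thread.] The point made in the two messages above is that the original query and all three variants collapse to the same shape once the stop word 'a' is dropped: a NOT wrapped around another NOT. The toy Python model below illustrates only that reduction. It is not PostgreSQL's tsquery_cleanup.c, and the names `Val`, `Not`, `BinOp`, `prune`, `render` and the one-word `STOPWORDS` set are all hypothetical. The thread itself says only that `normalize_phrase_tree` received a NULL `node`; the sketch makes no claim about where that NULL came from.

```python
from dataclasses import dataclass
from typing import Optional, Union

# Assumption: a one-word stand-in for the text search configuration's stop word list.
STOPWORDS = {"a"}

@dataclass
class Val:
    word: str

@dataclass
class Not:
    child: "Node"

@dataclass
class BinOp:
    op: str  # "&" or "|"
    left: "Node"
    right: "Node"

Node = Union[Val, Not, BinOp]

def prune(node: Node) -> Optional[Node]:
    """Drop stop words; return None when a whole subtree disappears."""
    if isinstance(node, Val):
        return None if node.word in STOPWORDS else node
    if isinstance(node, Not):
        child = prune(node.child)
        return None if child is None else Not(child)
    left, right = prune(node.left), prune(node.right)
    if left is None:
        return right   # may itself be None
    if right is None:
        return left
    return BinOp(node.op, left, right)

def render(node: Optional[Node]) -> str:
    """Render a tree in a tsquery-like notation (format chosen for this sketch)."""
    if node is None:
        return ""
    if isinstance(node, Val):
        return f"'{node.word}'"
    if isinstance(node, Not):
        inner = render(node.child)
        return f"!{inner}" if isinstance(node.child, Val) else f"!( {inner} )"
    def wrap(n: Node) -> str:
        return f"( {render(n)} )" if isinstance(n, BinOp) else render(n)
    return f"{wrap(node.left)} {node.op} {wrap(node.right)}"

a, b, c = Val("a"), Val("b"), Val("c")
variants = [
    BinOp("&", Not(BinOp("&", a, Not(b))), c),  # !(a & !b) & c
    BinOp("&", Not(BinOp("|", a, Not(b))), c),  # !(a | !b) & c
    BinOp("&", Not(BinOp("&", Not(b), a)), c),  # !( !b & a) & c
    BinOp("&", Not(BinOp("|", Not(b), a)), c),  # !( !b | a) & c
]
for tree in variants:
    print(render(prune(tree)))  # every line prints: !( !'b' ) & 'c'
```

In this model every operator has to handle an operand that vanishes (the `None` checks in `prune`); a cleanup pass that assumes both children are always present would dereference a missing node on exactly this kind of input.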
On Tue, Jul 12, 2016 at 05:11:32PM -0400, Tom Lane wrote:
> David Kellum <david@gravitext.com> writes:
> > On Tue, Jul 12, 2016 at 12:42 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> >> Note that while crashing is certainly not good, the pre-9.6 behavior
> >> can hardly be called correct either. What happened to 'a'?
>
> > 'a' is a stopword, dropped by to_tsquery() as described here:
>
> Ah! OK, so it's probably necessary to have a stopword there in order
> to break it.
>
> BTW, all these variants also crash:
>
> select to_tsquery('!(a | !b) & c') as tsquery;
> select to_tsquery('!( !b & a) & c') as tsquery;
> select to_tsquery('!( !b | a) & c') as tsquery;

[Action required within 72 hours. This is a generic notification.]

The above-described topic is currently a PostgreSQL 9.6 open item. Teodor,
since you committed the patch believed to have created it, you own this open
item. If some other commit is more relevant or if this does not belong as a
9.6 open item, please let us know. Otherwise, please observe the policy on
open item ownership[1] and send a status update within 72 hours of this
message. Include a date for your subsequent status update. Testers may
discover new open items at any time, and I want to plan to get them all fixed
well in advance of shipping 9.6rc1. Consequently, I will appreciate your
efforts toward speedy resolution. Thanks.

[1] http://www.postgresql.org/message-id/20160527025039.GA447393@tornado.leadboat.com
> The above-described topic is currently a PostgreSQL 9.6 open item. Teodor,

I'm working on it now and believe that fix will be published today.

> since you committed the patch believed to have created it, you own this open
> item. If some other commit is more relevant or if this does not belong as a
> 9.6 open item, please let us know. Otherwise, please observe the policy on
> open item ownership[1] and send a status update within 72 hours of this
> message. Include a date for your subsequent status update. Testers may
> discover new open items at any time, and I want to plan to get them all fixed
> well in advance of shipping 9.6rc1. Consequently, I will appreciate your
> efforts toward speedy resolution. Thanks.
>
> [1] http://www.postgresql.org/message-id/20160527025039.GA447393@tornado.leadboat.com

--
Teodor Sigaev
E-mail: teodor@sigaev.ru
WWW: http://www.sigaev.ru/
> select to_tsquery('!(a & !b) & c') as tsquery

Thank you very much for your report, fixed.

--
Teodor Sigaev
E-mail: teodor@sigaev.ru
WWW: http://www.sigaev.ru/