"Campbell, Lance" <lance@illinois.edu> writes:
> Is there a preferred way to search text within an HTML document? I have been reading up on searching via
> to_tsvector. You can pass the to_tsvector two parameters. The first appears to be a dictionary and the second text.
> Is there by chance an English HTML dictionary? That way html tags or html attributes would be ignored.
I believe all the built-in text search configurations ignore HTML tags by
default, since they have no mapping for the "tag" token type that the
built-in parser reports those as. You could of course make a custom
configuration that acts differently.
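
As a rough, untested sketch of both points: ts_debug shows which token
type each piece of the input gets, and a copied configuration can add a
mapping for "tag". The name html_english and the choice of the simple
dictionary are made up for illustration, not a recommendation.

```sql
-- Show how the default parser classifies the input. Tags should appear
-- with alias "tag" and no dictionaries listed under 'english'.
SELECT alias, token, dictionaries
FROM ts_debug('english', '<p class="intro">Searching HTML text</p>');

-- With a built-in configuration, tokens of type "tag" have no mapping,
-- so they contribute nothing to the resulting tsvector.
SELECT to_tsvector('english', '<p class="intro">Searching HTML text</p>');

-- Hypothetical custom configuration (html_english is an assumed name)
-- that indexes tags instead of discarding them.
CREATE TEXT SEARCH CONFIGURATION html_english (COPY = english);
ALTER TEXT SEARCH CONFIGURATION html_english
    ADD MAPPING FOR tag WITH simple;

SELECT to_tsvector('html_english', '<p class="intro">Searching HTML text</p>');
```

If the goal is only to ignore markup, the first two queries should be
all that is needed; the custom configuration matters only if you want
different behavior.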
regards, tom lane