Обсуждение: Text Databases
Hi, Here is a very general question: How good is PostgreSQL at hosting nearly pure text based databases ? Is it possible to search mutilple cols for the same keyword ? How well would it handle 40,000+ rows with say 15 columns each with about 20 words in each cell ? Nicholas
On Wed, 2 Dec 1998, Nicholas Humfrey wrote: # How good is PostgreSQL at hosting nearly pure text based databases ? Is # it possible to search mutilple cols for the same keyword ? How well # would it handle 40,000+ rows with say 15 columns each with about 20 # words in each cell ? Get a book on relational databases, it sounds like you're looking at the wrong tool for a particular job. -- SA, beyond.com My girlfriend asked me which one I like better. pub 1024/3CAE01D5 1994/11/03 Dustin Sallings <dustin@spy.net> | Key fingerprint = 87 02 57 08 02 D0 DA D6 C8 0F 3E 65 51 98 D8 BE L_______________________ I hope the answer won't upset her. ____________
On Wed, 2 Dec 1998, Nicholas Humfrey wrote: > Hi, > > Here is a very general question: > > How good is PostgreSQL at hosting nearly pure text based databases ? Is it > possible to search mutilple cols for the same keyword ? How well would it > handle 40,000+ rows with say 15 columns each with about 20 words in each > cell ? See the 'fulltextindex' directory under the contrib directory. Part of the README follows... Marc G. Fournier Systems Administrator @ hub.org primary: scrappy@hub.org secondary: scrappy@{freebsd|postgresql}.org An attempt at some sort of Full Text Indexing for PostgreSQL. The included software is an attempt to add some sort of Full Text Indexing support to PostgreSQL. I mean by this that we can ask questions like: Give me all rows that have 'still' and 'nash' in the 'artist' field. Ofcourse we can write this as: select * from cds where artist ~* 'stills' and artist ~* 'nash'; But this does not use any indices, and therefore, if your database gets very large, it will not have very high performance (the above query requires at least one sequential scan, it probably takes 2 due to the self-join). The approach used by this add-on is to define a trigger on the table and column you want to do this queries on. On every insert in the table, it takes the value in the specified column, breaks the text in this column up into pieces, and stores all sub-strings into another table, together with a reference to the row in the original table that contained this sub-string (it uses the oid of that row). By now creating an index over the 'fti-table', we can search for substrings that occur in the original table. By making a join between the fti-table and the orig-table, we can get the actual rows we want (this can also be done by using subselects, and maybe there're other ways too).
On Wed, 2 Dec 1998, Dustin Sallings wrote: >it sounds like you're looking at the wrong tool for a particular job. I am trying to get the best of both worlds. A good relational database system as well as being able to do text searches in mutiple fields. On Thu, 3 Dec 1998, Marc G. Fournier wrote: >See the 'fulltextindex' directory under the contrib directory. How fast is PostgreSQL at searching for text. Should I, as Dustin Sallings wrote look for a different relational database system ? I like PostgreSQL's features especially free source code, so I can run it on different platforms (Sparc,PPC,i486 etc.) Is there a more suitable free system available ? Nicholas
Well, I've found isearch to be very useful for massive fulltext indexes.. Perhaps postgres could borrow it's methods.. On Thu, 3 Dec 1998, Nicholas Humfrey wrote: > > On Wed, 2 Dec 1998, Dustin Sallings wrote: > >it sounds like you're looking at the wrong tool for a particular job. > > I am trying to get the best of both worlds. A good relational database > system as well as being able to do text searches in mutiple fields. > > On Thu, 3 Dec 1998, Marc G. Fournier wrote: > >See the 'fulltextindex' directory under the contrib directory. > > How fast is PostgreSQL at searching for text. Should I, as Dustin Sallings > wrote look for a different relational database system ? I like PostgreSQL's > features especially free source code, so I can run it on different > platforms (Sparc,PPC,i486 etc.) Is there a more suitable free system > available ? > > Nicholas > > >
> On Wed, 2 Dec 1998, Dustin Sallings wrote: > >it sounds like you're looking at the wrong tool for a particular job. > > I am trying to get the best of both worlds. A good relational database > system as well as being able to do text searches in mutiple fields. > Why wouldn't you just build a consolidated index from the data in your multiple fields, then load it into postgres or any other database engine? Dustin made a good point: things you are contemplating to do are not exactly the job of a database backend. Its all yours: the database you are building is just as good as the index you design and implement in the form of a relation or a set of relations, which postgres will take care of. --Gene
On Thu, 3 Dec 1998, Nicholas Humfrey wrote: // How fast is PostgreSQL at searching for text. Should I, as Dustin // Sallings wrote look for a different relational database system ? I // like PostgreSQL's features especially free source code, so I can run // it on different platforms (Sparc,PPC,i486 etc.) Is there a more // suitable free system available ? Well, Postgres is just about as good as it gets, but if you're trying to do arbitrary text searches, grep on text files will always be faster. The search system mentioned above is good if you're doing exact matches out of your index, but it still doesn't make sense if you're trying to match over multiple columns, etc... Relational databases make the most sense when you can look up an exact match in a column. -- SA, beyond.com The world is watching America, pub 1024/3CAE01D5 1994/11/03 Dustin Sallings <dustin@spy.net> | Key fingerprint = 87 02 57 08 02 D0 DA D6 C8 0F 3E 65 51 98 D8 BE L________________________________________ and America is watching TV. __
On Thu, 3 Dec 1998, Nicholas Humfrey wrote: > > On Wed, 2 Dec 1998, Dustin Sallings wrote: > >it sounds like you're looking at the wrong tool for a particular job. > > I am trying to get the best of both worlds. A good relational database > system as well as being able to do text searches in mutiple fields. > > On Thu, 3 Dec 1998, Marc G. Fournier wrote: > >See the 'fulltextindex' directory under the contrib directory. > > How fast is PostgreSQL at searching for text. Should I, as Dustin Sallings > wrote look for a different relational database system ? I like PostgreSQL's > features especially free source code, so I can run it on different > platforms (Sparc,PPC,i486 etc.) Is there a more suitable free system > available ? From reading the README file in the fulltextindex directory, it looks like the way they implemented it should be as fast as any other of the indices....check the README file and see what you think... Marc G. Fournier Systems Administrator @ hub.org primary: scrappy@hub.org secondary: scrappy@{freebsd|postgresql}.org
> On Thu, 3 Dec 1998, Nicholas Humfrey wrote: > > > > > On Wed, 2 Dec 1998, Dustin Sallings wrote: > > >it sounds like you're looking at the wrong tool for a particular job. > > > > I am trying to get the best of both worlds. A good relational database > > system as well as being able to do text searches in mutiple fields. > > > > On Thu, 3 Dec 1998, Marc G. Fournier wrote: > > >See the 'fulltextindex' directory under the contrib directory. > > > > How fast is PostgreSQL at searching for text. Should I, as Dustin Sallings > > wrote look for a different relational database system ? I like PostgreSQL's > > features especially free source code, so I can run it on different > > platforms (Sparc,PPC,i486 etc.) Is there a more suitable free system > > available ? > > From reading the README file in the fulltextindex directory, it > looks like the way they implemented it should be as fast as any other of > the indices....check the README file and see what you think... Yes, it should be. -- Bruce Momjian | http://www.op.net/~candle maillist@candle.pha.pa.us | (610) 853-3000 + If your life is a hard drive, | 830 Blythe Avenue + Christ can be your backup. | Drexel Hill, Pennsylvania 19026