Discussion: Large object insert performance.
I'm populating a new database from some text files. I'm using large objects to store the body of the text files. I have a little thing set up to monitor how fast the inserts are going. They started out at about 20/sec and have been slowly dropping. I'm about 6% through my data and I'm already down to 2/sec and dropping. All I'm doing is inserting the large objects. No other action is happening.

Here's the portion of the script that is populating my database:

    self.db.query('begin')
    body_lo = self.db.locreate(pg.INV_READ | pg.INV_WRITE)
    body_lo.open(pg.INV_WRITE)
    body_lo.write(puff.get('message/body', ''))
    body_oid = body_lo.oid
    body_lo.close()
    self.db.query('end')

That is the full extent of my queries to the database. There are no tables or indexes defined. The average size of a body is about 300 bytes, but it goes as high as 30k.

Is there any way to speed this up? If the handling of large objects is this bad, I think I might just store these guys on the file system.
Peter Haight <peterh@sapros.com> writes:
> All I'm doing is inserting the large objects.

How many LOs are we talking about here? The current LO implementation creates a separate table, with index, for each LO. That means two files in the database directory per LO. On most Unix filesystems I've dealt with, performance will go to hell in a handbasket for more than a few thousand files in one directory.

Denis Perchine did a reimplementation of LOs to store 'em in a single table. This hasn't been checked or applied to current sources yet, but if you're feeling adventurous see the pgsql-patches archives from late June.

> Is there any way to speed this up? If the handling of large objects is this
> bad, I think I might just store these guys on the file system.

You could do that too, if you don't need transactional semantics for large-object operations.

			regards, tom lane
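Tom's diagnosis (two files per LO, so hundreds of thousands of files landing in one database directory) is easy to probe outside Postgres. The sketch below is not from the thread; it is a hypothetical micro-benchmark that just creates many empty files in a single directory and times each batch, so that on a filesystem with linear directory lookups the slowdown shows up as growing per-batch times. On modern filesystems with hashed or tree-structured directories you may see no degradation at all.

```python
import os
import tempfile
import time

def time_file_creation(directory, n, batch=1000):
    """Create n empty files in one directory, timing each batch.

    Returns a list of (files_so_far, seconds_for_batch) pairs so any
    slowdown is visible as the directory fills up.
    """
    timings = []
    start = time.time()
    for i in range(n):
        # One empty file per simulated large object.
        with open(os.path.join(directory, f"lo_{i}"), "w"):
            pass
        if (i + 1) % batch == 0:
            now = time.time()
            timings.append((i + 1, now - start))
            start = now
    return timings

with tempfile.TemporaryDirectory() as d:
    results = time_file_creation(d, 5000)
    for count, secs in results:
        print(f"{count:6d} files: {secs:.3f}s for the last 1000")
```

Whether the numbers actually degrade depends entirely on the filesystem, which is exactly the point of the reiserfs suggestion later in the thread.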
> Peter Haight <peterh@sapros.com> writes:
>> All I'm doing is inserting the large objects.
>
> How many LOs are we talking about here?

400,000. It gets too slow way before that, though. Around 30,000. I'm going ahead with storing them in external files with an index in the database. That is going very well. Thanks.
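The external-files-plus-index approach Peter settled on can be sketched roughly as follows. This is an illustration, not his actual code: the `store_body`/`fetch_body` names are made up, sqlite3 stands in for the real database purely to keep the sketch self-contained, and the 256-way subdirectory fan-out is an added assumption to avoid recreating the too-many-files-in-one-directory problem Tom described. As Tom noted, you also give up transactional semantics on the file contents themselves.

```python
import os
import sqlite3
import tempfile

def store_body(conn, root, msg_id, body):
    """Write body to a file under root and record its path in the index.

    Files are fanned out into 256 subdirectories (keyed on the low byte
    of msg_id) so no single directory accumulates all of them.
    """
    subdir = os.path.join(root, f"{msg_id % 256:02x}")
    os.makedirs(subdir, exist_ok=True)
    path = os.path.join(subdir, f"{msg_id}.txt")
    with open(path, "w") as f:
        f.write(body)
    conn.execute("INSERT INTO bodies (id, path) VALUES (?, ?)", (msg_id, path))
    return path

def fetch_body(conn, msg_id):
    """Look the path up in the index and read the file back."""
    (path,) = conn.execute(
        "SELECT path FROM bodies WHERE id = ?", (msg_id,)
    ).fetchone()
    with open(path) as f:
        return f.read()

root = tempfile.mkdtemp()
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE bodies (id INTEGER PRIMARY KEY, path TEXT)")
store_body(conn, root, 12345, "hello large object world")
print(fetch_body(conn, 12345))
```

With bodies averaging 300 bytes, a plain table column (or, in later PostgreSQL versions, TOAST) would likely beat both LOs and external files, but within the constraints of the thread this is the shape of the workaround.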
*** Peter Haight <peterh@sapros.com> [Thursday, 24.August.2000, 17:50 -0700]:
>> Peter Haight <peterh@sapros.com> writes:
>>> All I'm doing is inserting the large objects.
>>
>> How many LOs are we talking about here?
>
> 400,000. It gets too slow way before that, though. Around 30,000. I'm going
> ahead with storing them in external files with an index in the database.
> That is going very well.

[.rs.] As Tom suggested, it is probably caused by the FS. For all people who have to use the current LO implementation on Linux, I suggest using reiserfs (a file system) which is designed to handle large numbers of files in directories (used by proxies and the like).

--
radoslaw.stachowiak.........................................http://alter.pl/