Discussion: Huge sample dataset for testing.
Does anybody know if there is a sample database or text files I can import to do some performance testing?
I would like to have tables with tens of millions of records if possible.
Hello Tim,

you can create this yourself very easily. E.g., you have a table:

CREATE TABLE test1 (
    a_int  serial NOT NULL,
    a_text character varying(200),
    dt     timestamp without time zone DEFAULT now(),
    PRIMARY KEY (a_int)
);

Create a bunch of data with something like:

insert into test1 (a_text)
    (select 'this is row number: '||i::text
     from (select generate_series(1,1000000) as i) as q);

regards...GERD...

Tim Uckun wrote:
> Does anybody know if there is a sample database or text files I can
> import to do some performance testing?
>
> I would like to have tables with tens of millions of records if possible.

--
Gerd König - Infrastruktur - TRANSPOREON GmbH, Pfarrer-Weiss-Weg 12, DE - 89077 Ulm
In response to Tim Uckun:
> Does anybody know if there is a sample database or text files I can import to
> do some performance testing?
>
> I would like to have tables with tens of millions of records if possible.

It is easy to create such a table:

test=# create table huge_data_table as
       select s, md5(s::text) from generate_series(1,10) s;
SELECT
test=*# select * from huge_data_table ;
  s |               md5
----+----------------------------------
  1 | c4ca4238a0b923820dcc509a6f75849b
  2 | c81e728d9d4c2f636f067f89cc14862c
  3 | eccbc87e4b5ce2fe28308fd9f2a7baf3
  4 | a87ff679a2f3e71d9181a67b7542122c
  5 | e4da3b7fbbce2345d7772b0674a318d5
  6 | 1679091c5a880faf6fb5e6087eb1b2dc
  7 | 8f14e45fceea167a5a36dedd4bea2543
  8 | c9f0f895fb98ab9159f51fd0297e236d
  9 | 45c48cce2e2d7fbdea1afc51c7c6ad26
 10 | d3d9446802a44259755d38e6d163e820
(10 rows)

Change the 2nd parameter from 10 to <insert a big number>.

Andreas
--
Andreas Kretschmer
> It is easy to create such a table:
>
>> I would like to have tables with tens of millions of records if possible.
>
> test=# create table huge_data_table as select s, md5(s::text) from generate_series(1,10) s;
Thanks, I'll try something like that.
I guess I can create some random dates or something for the other types of fields too.
I was hoping there was already something like this available, though, because it's going to take some time to create relations and such.
In response to Tim Uckun:
> Thanks, I'll try something like that.
>
> I guess I can create some random dates or something for the other types of
> fields too.

Sure, dates for instance:

test=*# select (current_date + random() * 1000 * '1day'::interval)::date
        from generate_series(1,10);
    date
------------
 2010-12-11
 2009-06-20
 2009-08-13
 2011-10-17
 2011-10-09
 2010-10-13
 2010-02-04
 2011-03-04
 2012-01-17
 2010-11-18
(10 rows)

> I was hoping there was already something like this available though because
> it's going to take some time to create relations and such.

Do you really want to download a database or table with 100 million rows?

Andreas
--
Andreas Kretschmer
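[Editor's note: the generate_series and random-date techniques from the replies above can be combined to build related tables as well. The sketch below is hypothetical — the table and column names are made up, it has not been taken from the thread, and the row counts are only placeholders:]

```sql
-- Hypothetical parent table: 100,000 customers keyed by customer_id.
CREATE TABLE customers AS
    SELECT s AS customer_id, md5(s::text) AS name
    FROM generate_series(1, 100000) s;
ALTER TABLE customers ADD PRIMARY KEY (customer_id);

-- Hypothetical child table: 1M orders, each pointing at a random
-- customer, with a random date within roughly the last 1000 days.
CREATE TABLE orders AS
    SELECT s AS order_id,
           (1 + floor(random() * 100000))::int AS customer_id,
           (current_date - (random() * 1000)::int) AS order_date
    FROM generate_series(1, 1000000) s;
ALTER TABLE orders
    ADD FOREIGN KEY (customer_id) REFERENCES customers (customer_id);
```

Scaling the two generate_series ranges up gives tables with tens of millions of related rows.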
On Tue, 28 Apr 2009, Tim Uckun wrote:
> Does anybody know if there is a sample database or text files I can
> import to do some performance testing? I would like to have tables with
> tens of millions of records if possible.

There is a utility that ships with PostgreSQL named pgbench that includes a simple schema (4 tables) and a data generator. The generator initialization step takes a database scale factor and creates 100,000 records per unit of scale, so a scale of, say, 500 would give you 50M records. These tables are pretty simple, just having some ID number keys and simulated bank account balances.

If you want a more complicated schema, you might try one of those from the various DBT projects. See http://www.slideshare.net/markwkm/postgresql-portland-performance-practice-project-database-test-2-howto for an intro to DBT2, which gives you 9 tables you can populate in various ways to play with.

--
* Greg Smith gsmith@gregsmith.com http://www.gregsmith.com Baltimore, MD
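[Editor's note: the pgbench workflow described above looks roughly like the following. This is a sketch, not from the thread — it assumes a reachable PostgreSQL server and uses a made-up database name "testdb":]

```shell
# Create a scratch database (name "testdb" is an assumption).
createdb testdb

# Initialize the pgbench tables at scale factor 500:
# 100,000 accounts rows per unit of scale => ~50M rows.
pgbench -i -s 500 testdb

# Afterwards, run a simple benchmark against the generated data,
# e.g. 8 concurrent clients, 10000 transactions per client.
pgbench -c 8 -t 10000 testdb
```

The initialization step (-i) only populates the tables; the benchmark run itself is optional if the goal is just a large dataset to query.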