Thread: Huge sample dataset for testing.

Huge sample dataset for testing.

From
Tim Uckun
Date:
Does anybody know if there is a sample database or text files I can import to do some performance testing?

I would like to have tables with tens of millions of records if possible.


Re: Huge sample dataset for testing.

From
Gerd König
Date:
Hello Tim,

You can create this yourself very easily. For example, given a table:
CREATE TABLE test1
(
  a_int serial NOT NULL,
  a_text character varying(200),
  dt timestamp without time zone DEFAULT now(),
  primary key (a_int)
);
you can then create a bunch of data with something like:
insert into test1 (a_text)
  select 'this is row number: ' || i::text
  from generate_series(1,1000000) as i;
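
To reach the tens of millions of rows asked for, the same insert can be run with a larger upper bound. A minimal sketch, assuming the test1 table above already exists; the figure 20000000 is only an illustrative size, not something from this thread:

```sql
-- Assumes the test1 table defined above; 20000000 is an arbitrary example size.
INSERT INTO test1 (a_text)
SELECT 'this is row number: ' || i::text
FROM generate_series(1, 20000000) AS i;

-- Check how many rows landed and how much disk the table plus its index use.
SELECT count(*) AS row_count,
       pg_size_pretty(pg_total_relation_size('test1')) AS total_size
FROM test1;
```

The serial column and the dt default fill themselves in, so only a_text has to be supplied.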

regards...GERD...

Tim Uckun wrote:
> Does anybody know if there is a sample database or text files I can
> import to do some performance testing?
>
> I would like to have tables with tens of millions of records if possible.
>
>

--
/===============================\
| Gerd König
| - Infrastruktur -
|
| TRANSPOREON GmbH
| Pfarrer-Weiss-Weg 12
| DE - 89077 Ulm
|
|
| Tel: +49 [0]731 16906 16
| Fax: +49 [0]731 16906 99
| Web: www.transporeon.com
|
\===============================/




Re: Huge sample dataset for testing.

From
"A. Kretschmer"
Date:
In response to Tim Uckun:
> Does anybody know if there is a sample database or text files I can import to
> do some performance testing?
>
> I would like to have tables with tens of millions of records if possible.

It is easy to create such a table:

test=# create table huge_data_table as select s, md5(s::text) from generate_series(1,10) s;
SELECT
test=*# select * from huge_data_table ;
 s  |               md5
----+----------------------------------
  1 | c4ca4238a0b923820dcc509a6f75849b
  2 | c81e728d9d4c2f636f067f89cc14862c
  3 | eccbc87e4b5ce2fe28308fd9f2a7baf3
  4 | a87ff679a2f3e71d9181a67b7542122c
  5 | e4da3b7fbbce2345d7772b0674a318d5
  6 | 1679091c5a880faf6fb5e6087eb1b2dc
  7 | 8f14e45fceea167a5a36dedd4bea2543
  8 | c9f0f895fb98ab9159f51fd0297e236d
  9 | 45c48cce2e2d7fbdea1afc51c7c6ad26
 10 | d3d9446802a44259755d38e6d163e820
(10 rows)

Change the 2nd parameter from 10 to <insert a big number>
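
As a rough sketch of what that looks like at scale: the statement below uses 10000000 as an example size and gives the columns explicit names. The aliases id and hash_text are hypothetical, chosen here for illustration. Since CREATE TABLE AS does not create any indexes, a primary key and fresh statistics are added afterwards so that test queries behave more like they would on a real table.

```sql
-- Drop the 10-row table from the session above, if it exists.
DROP TABLE IF EXISTS huge_data_table;

-- 10000000 is an illustrative size; id and hash_text are hypothetical aliases.
CREATE TABLE huge_data_table AS
SELECT s AS id, md5(s::text) AS hash_text
FROM generate_series(1, 10000000) AS s;

-- CREATE TABLE AS builds no indexes, so add a key and refresh planner statistics.
ALTER TABLE huge_data_table ADD PRIMARY KEY (id);
ANALYZE huge_data_table;
```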

Andreas
--
Andreas Kretschmer
Contact:  Heynitz: 035242/47150,   D1: 0160/7141639 (more: -> Header)
GnuPG-ID:   0x3FFF606C, private 0x7F4584DA   http://wwwkeys.de.pgp.net

Re: Huge sample dataset for testing.

From
Tim Uckun
Date:


> > I would like to have tables with tens of millions of records if possible.
>
> It is easy to create such a table:
>
> test=# create table huge_data_table as select s, md5(s::text) from generate_series(1,10) s;

Thanks, I'll try something like that.

I guess I can create some random dates or something for other types of fields too.

I was hoping there was already something like this available, though, because it's going to take some time to create relations and such.

Re: Huge sample dataset for testing.

From
"A. Kretschmer"
Date:
In response to Tim Uckun:
> Thanks I'll try something like that.
>
> I guess can create some random dates or something for other types of fields
> too.

Sure, dates for instance:

test=*# select (current_date + random() * 1000 * '1day'::interval)::date from generate_series(1,10);
    date
------------
 2010-12-11
 2009-06-20
 2009-08-13
 2011-10-17
 2011-10-09
 2010-10-13
 2010-02-04
 2011-03-04
 2012-01-17
 2010-11-18
(10 rows)
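
The same random() trick extends to other column types. A small sketch follows; every column alias is made up for illustration, and the ranges are arbitrary:

```sql
-- All column aliases and value ranges here are illustrative assumptions.
SELECT s                                      AS id,
       (random() * 1000)::int                 AS some_int,       -- integer in 0..1000
       round((random() * 10000)::numeric, 2)  AS some_amount,    -- numeric, 2 decimals
       random() < 0.5                         AS some_flag,      -- roughly half true
       now() - random() * interval '365 days' AS some_timestamp, -- within the past year
       md5(random()::text)                    AS some_text       -- 32 hex characters
FROM generate_series(1, 10) AS s;
```

Wrapping this in CREATE TABLE ... AS and raising the upper bound of generate_series turns it into a table of any size.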


>
> I was hoping there was already something like this available though because
> it's going to take some time to create relations and such.

Do you really want to download a database or table with 100 million rows?
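
Related tables can be generated the same way rather than downloaded. The following is only a sketch under assumed names: the customers and orders tables, their columns, and the row counts are all hypothetical and not taken from this thread.

```sql
-- Hypothetical parent/child tables; names and sizes are illustrative only.
CREATE TABLE customers AS
SELECT s AS customer_id, 'customer ' || s::text AS name
FROM generate_series(1, 100000) AS s;

ALTER TABLE customers ADD PRIMARY KEY (customer_id);

-- Each order points at a random existing customer (ids 1..100000).
CREATE TABLE orders AS
SELECT s AS order_id,
       (1 + floor(random() * 100000))::int   AS customer_id,
       current_date - (random() * 1000)::int AS order_date
FROM generate_series(1, 10000000) AS s;

ALTER TABLE orders ADD PRIMARY KEY (order_id);
ALTER TABLE orders ADD FOREIGN KEY (customer_id) REFERENCES customers (customer_id);
```

Adding the keys after loading is usually faster than inserting into a table that already has them, because the index is built once instead of being updated row by row.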


Andreas
--
Andreas Kretschmer
Contact:  Heynitz: 035242/47150,   D1: 0160/7141639 (more: -> Header)
GnuPG-ID:   0x3FFF606C, private 0x7F4584DA   http://wwwkeys.de.pgp.net

Re: Huge sample dataset for testing.

From
Greg Smith
Date:
On Tue, 28 Apr 2009, Tim Uckun wrote:

> Does anybody know if there is a sample database or text files I can
> import to do some performance testing? I would like to have tables with
> tens of millions of records if possible.

There is a utility that ships with PostgreSQL named pgbench that includes
a simple schema (4 tables) and a data generator.  The generator
initialization step takes a database scale factor and creates 100,000
records per unit of scale.  So a scale of, say, 500 would give you 50M
records.  These tables are pretty simple, just having some ID number keys
and simulated bank account balances.
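
To make the scale arithmetic concrete, here is a small sketch that computes the initial row counts for a few scale factors. The 100,000 accounts per unit comes from the description above; the 10 tellers and 1 branch per unit are taken from the pgbench documentation, and the history table starts out empty. The scale values chosen are just examples.

```sql
-- Initial pgbench row counts per scale factor (example scale values).
SELECT scale_factor,
       scale_factor * 100000 AS accounts_rows,
       scale_factor * 10     AS tellers_rows,
       scale_factor          AS branches_rows
FROM (VALUES (1), (100), (500)) AS t(scale_factor);
```

For a scale factor of 500 this gives 50,000,000 account rows, which matches the 50M figure above.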

If you want a more complicated schema, you might try one of those from the
various DBT projects.  See
http://www.slideshare.net/markwkm/postgresql-portland-performance-practice-project-database-test-2-howto
for an intro to DBT2, which gives you 9 tables you can populate in various
ways to play with.

--
* Greg Smith gsmith@gregsmith.com http://www.gregsmith.com Baltimore, MD