random dataset generator for SKYLINE operator

Поиск
Список
Период
Сортировка
От Hannes Eder
Тема random dataset generator for SKYLINE operator
Дата
Msg-id 470E36C2.1040308@HannesEder.net
обсуждение исходный текст
Ответы Re: random dataset generator for SKYLINE operator  (Hannes Eder <Hannes@HannesEder.net>)
Список pgsql-hackers
We wrote a little contrib module, which we'd like to share. It can be
used to generate random datasets as they have been used in
[Borzsonyi2001] and related work. The code is based directly on the
code of the authors, thanks to Donald Kossmann for sharing the
code. Donald Kossmann agrees on sharing our implementation with the
community.

We use it primary for profiling and regression testing of our
implementation of the SKYLINE operator in PostgreSQL, which is work in
progress, details will be announced soon.

We use it in two ways:

(1) create table a2d1000 as (select * from pg_rand_dataset(‘anti’, 2,
1000, 0) ds(id int, d1 float, d2 float)). And then work with this table,
e.g. an additional index and query the data or

(2) directly for profiling and regression testing as: select * from
pg_rand_dataset(‘indep’, 3, 1000, 0) ds(id int, d1 float, d2 float, d3
float) skyline by d1 min, d2 min, d3 max;

It might be useful for other purposes.

We ask you for comments and suggestions.

What's the best way to share this module?


Declaration
-----------

CREATE FUNCTION pg_catalog.pg_rand_dataset(disttype text, dim int,
rows int, seed int)
RETURNS setof record
AS '$libdir/randdataset', 'pg_rand_dataset'
LANGUAGE C IMMUTABLE STRICT;


Usage example
-------------

Hannes=# select * from pg_rand_dataset('indep',1,5,0) ds(id int, d1 float);
id | d1
----+--------------------
1 | 0.170828036112165
2 | 0.749901980510867
3 | 0.0963716553972902
4 | 0.870465227342427
5 | 0.577303506702792
(5 rows)

Hannes=# select * from pg_rand_dataset('corr',2,3,0) ds(id int, d1
float, d2 float);
id | d1 | d2
----+-------------------+-------------------
1 | 0.489722997173791 | 0.431007019449241
2 | 0.385624871971487 | 0.645283734523713
3 | 0.754378154805565 | 0.82440492311745
(3 rows)

Hannes=# select * from pg_rand_dataset('anti',3,2,0) ds(id int, d1
float, d2 float, d3 float);
id | d1 | d2 | d3
----+-------------------+--------------------+-------------------
1 | 0.848072555389202 | 0.0868831583602555 | 0.656344708211334
2 | 0.447278565604571 | 0.641176215525044 | 0.523196923510816
(2 rows)


References
----------

[Borzsonyi2001] Börzsönyi, S.; Kossmann, D. & Stocker, K.:
The Skyline Operator, ICDE 2001, 421--432


Best,
-Hannes

Вложения

В списке pgsql-hackers по дате отправления:

Предыдущее
От: "Pavel Stehule"
Дата:
Сообщение: Re: full text search in 8.3
Следующее
От: "Marko Kreen"
Дата:
Сообщение: Patch: txid in core