Re: Proposed Patch to Improve Performance of Multi-Batch Hash Join for Skewed Data Sets

From: Lawrence, Ramon
Subject: Re: Proposed Patch to Improve Performance of Multi-Batch Hash Join for Skewed Data Sets
Msg-id: 6EEA43D22289484890D119821101B1DF2C180E@exchange20.mercury.ad.ubc.ca
In reply to: Proposed Patch to Improve Performance of Multi-Batch Hash Join for Skewed Data Sets  ("Lawrence, Ramon" <ramon.lawrence@ubc.ca>)
List: pgsql-hackers

Robert,

You do not need to use qgen.exe to generate queries, since you are not
running the actual TPC-H benchmark test.  Attached is an example set of
the 22 sample TPC-H queries defined by the benchmark.

We have not tested this particular patch with the TPC-H queries; we use
the TPC-H database only as a large, skewed data set.  The simpler
queries we test are joins of Part-Lineitem or Supplier-Lineitem, such
as:

SELECT * FROM part, lineitem WHERE p_partkey = l_partkey;

or

SELECT count(*) FROM part, lineitem WHERE p_partkey = l_partkey;

The count(*) version is usually more useful for comparisons because,
with the select * version, the generation of output tuples on the client
side (say, with pgAdmin) dominates the actual time to complete the query.

To isolate query costs, we also test using a simple server-side
function.  I have also attached the setup description.
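
As a rough illustration of the idea only (a hypothetical sketch, not the
function from the attached setup description), a server-side PL/pgSQL
function could loop over the full join result and count the rows, so that
every output tuple is produced but none is sent to the client.  The
function name run_join_count is an assumption made for this example:

```sql
-- Hypothetical helper: NOT the function from the attached setup description.
-- Assumes the TPC-H tables part and lineitem are already loaded.
CREATE OR REPLACE FUNCTION run_join_count() RETURNS bigint AS $$
DECLARE
    n bigint := 0;   -- number of join result rows seen so far
    r record;        -- holds one joined row per loop iteration
BEGIN
    -- Iterate over the join result inside the server, so each output
    -- tuple is generated but never transferred to a client.
    FOR r IN
        SELECT * FROM part, lineitem WHERE p_partkey = l_partkey
    LOOP
        n := n + 1;
    END LOOP;
    RETURN n;
END;
$$ LANGUAGE plpgsql;

-- Run the join entirely on the server and return only the row count.
SELECT run_join_count();
```

Timing the final statement (for example with \timing in psql) should then
reflect the join itself far more than client-side result handling.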

I would be happy to help in any way I can.

Bryce is currently working on an updated patch according to your
suggestions.

--
Dr. Ramon Lawrence
Assistant Professor, Department of Computer Science, University of
British Columbia Okanagan
E-mail: ramon.lawrence@ubc.ca


> -----Original Message-----
> From: pgsql-hackers-owner@postgresql.org [mailto:pgsql-hackers-
> owner@postgresql.org] On Behalf Of Robert Haas
> Sent: December 17, 2008 7:54 PM
> To: Lawrence, Ramon
> Cc: Tom Lane; pgsql-hackers@postgresql.org; Bryce Cutt
> Subject: Re: [HACKERS] Proposed Patch to Improve Performance of Multi-
> Batch Hash Join for Skewed Data Sets
>
> Dr. Lawrence:
>
> I'm still working on reviewing this patch.  I've managed to load the
> sample TPCH data from tpch1g1z.zip after changing the line endings to
> UNIX-style and chopping off the trailing vertical bars.  (If anyone is
> interested, I have the results of pg_dump | bzip2 -9 on the resulting
> database, which I would be happy to upload if someone has server
> space.  It is about 250MB.)
>
> But, I'm not sure quite what to do in terms of generating queries.
> TPCHSkew contains QGEN.EXE, but that seems to require that you provide
> template queries as input, and I'm not sure where to get the
> templates.
>
> Any suggestions?
>
> Thanks,
>
> ...Robert
>
> --
> Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
> To make changes to your subscription:
> http://www.postgresql.org/mailpref/pgsql-hackers
