Обсуждение: selects on differing subsets of a query

Поиск
Список
Период
Сортировка

selects on differing subsets of a query

От
ed.temp.01@gmail.com
Дата:
First post, be gentle as I have terminology problems and so the
subject might be wrongly worded.

Say I have a table with fields
...
gender
diet_pref
...

What I am trying to construct is a *single* query showing the total
number of males in the table
and also the total number of male vegetarians in the table, i.e. the
2nd value is computed on a subset of the records needed for the first
value.

As 2 queries this would be:
select count(*) from mytab where gender='m'
select count(*) from mytab where gender='m' and diet_pref='veg'

The table is big and I'd like to do the select where gender='m' only
once. (In the actual situation the select is on a date range)

If there is a TFM, please point me at it with an indication of exactly
what it is I am trying to achieve. If I'm trying to do something
stupid, gentle advice would be appreciated.


Re: selects on differing subsets of a query

От
Markus Schaber
Дата:
Hi, Ed Temp,

ed.temp.01@gmail.com wrote:

> First post, be gentle as I have terminology problems and so the
> subject might be wrongly worded.

Hmm, the biggest mistake seems that "et temp 01" is a very unlikely real
name, so you should reconfigure your mail client :-)

> What I am trying to construct is a *single* query showing the total
> number of males in the table
> and also the total number of male vegetarians in the table, i.e. the
> 2nd value is computed on a subset of the records needed for the first
> value.
> 
> As 2 queries this would be:
> select count(*) from mytab where gender='m'
> select count(*) from mytab where gender='m' and diet_pref='veg'

Have you tried to UNION ALL the two queries?

> The table is big and I'd like to do the select where gender='m' only
> once. (In the actual situation the select is on a date range)

SELECT count(*),diet_pref='veg' FROM table WHERE gender='m' GROUP BY
diet_pref='veg'

Is not exactly what you want, as your application still has to add two
numbers to get the total result, but avoids the duplicated table scan.


SELECT count(*),count(nullif(diet_pref='veg', f)) FROM table WHERE
gender='m'

Should also give you both counts, this time in different columns, also
avoiding the duplicated table scan. It relies on the fact that
count(something) is only called if something is not null, whereas
count(*) is called for every row (as a special case).

HTH,
Markus

-- 
Markus Schaber | Logical Tracking&Tracing International AG
Dipl. Inf.     | Software Development GIS

Fight against software patents in EU! www.ffii.org www.nosoftwarepatents.org


Re: selects on differing subsets of a query

От
ed.temp.01@gmail.com
Дата:
Hello Markus,

It's actually a temporary mailbox just in case the list attracts spam :-)

Thank you for your help, I will study it when I get development time
on the database.

On 03/05/06, Markus Schaber <schabi@logix-tt.com> wrote:
> Hi, Ed Temp,
>
> ed.temp.01@gmail.com wrote:
>
> > First post, be gentle as I have terminology problems and so the
> > subject might be wrongly worded.
>
> Hmm, the biggest mistake seems that "et temp 01" is a very unlikely real
> name, so you should reconfigure your mail client :-)
>
> > What I am trying to construct is a *single* query showing the total
> > number of males in the table
> > and also the total number of male vegetarians in the table, i.e. the
> > 2nd value is computed on a subset of the records needed for the first
> > value.
> >
> > As 2 queries this would be:
> > select count(*) from mytab where gender='m'
> > select count(*) from mytab where gender='m' and diet_pref='veg'
>
> Have you tried to UNION ALL the two queries?
>
> > The table is big and I'd like to do the select where gender='m' only
> > once. (In the actual situation the select is on a date range)
>
> SELECT count(*),diet_pref='veg' FROM table WHERE gender='m' GROUP BY
> diet_pref='veg'
>
> Is not exactly what you want, as your application still has to add two
> numbers to get the total result, but avoids the duplicated table scan.
>
>
> SELECT count(*),count(nullif(diet_pref='veg', f)) FROM table WHERE
> gender='m'
>
> Should also give you both counts, this time in different columns, also
> avoiding the duplicated table scan. It relies on the fact that
> count(something) is only called if something is not null, whereas
> count(*) is called for every row (as a special case).
>
> HTH,
> Markus
>
> --
> Markus Schaber | Logical Tracking&Tracing International AG
> Dipl. Inf.     | Software Development GIS
>
> Fight against software patents in EU! www.ffii.org www.nosoftwarepatents.org
>


Re: selects on differing subsets of a query

От
Markus Schaber
Дата:
Hi, Ed,

ed.temp.01@gmail.com wrote:

> It's actually a temporary mailbox just in case the list attracts spam :-)

As far as I can see, the PostgreSQL lists are extremely low on spam
attraction, compared to other lists.

I think this is a result of the "only subscribers may post" policy, but
I'm getting of topic.

> Thank you for your help, I will study it when I get development time
> on the database.

Tell us about your findings here.


HTH,
Markus


-- 
Markus Schaber | Logical Tracking&Tracing International AG
Dipl. Inf.     | Software Development GIS

Fight against software patents in EU! www.ffii.org www.nosoftwarepatents.org


Re: selects on differing subsets of a query

От
Bruno Wolff III
Дата:
On Wed, May 03, 2006 at 14:19:27 +0200, Markus Schaber <schabi@logix-tt.com> wrote:
> 
> I think this is a result of the "only subscribers may post" policy, but
> I'm getting of topic.

Note the policy is really only subscribers may post without moderator approval.


Re: selects on differing subsets of a query

От
Bruno Wolff III
Дата:
On Wed, May 03, 2006 at 09:47:49 +0100, ed.temp.01@gmail.com wrote:
> First post, be gentle as I have terminology problems and so the
> subject might be wrongly worded.
> 
> Say I have a table with fields
> ...
> gender
> diet_pref
> ...
> 
> What I am trying to construct is a *single* query showing the total
> number of males in the table
> and also the total number of male vegetarians in the table, i.e. the
> 2nd value is computed on a subset of the records needed for the first
> value.

There are a few ways you could do this. One is to use a CASE function to
return 1 for diet_pref = 'veg' and 0 otherwise. Then you can do a count(*)
and a count of the CASE result in the same query and get both totals
with one pass through the table. Another option would be joining the two
queries. I don't think this is a good idea when you have to count everyone
anyway, but if you were counting a couple of small subsets of the data and
had partial indexes to speed those counts up, this might be a better strategy.


Re: selects on differing subsets of a query

От
Markus Schaber
Дата:
Hi, Bruno,

Bruno Wolff III wrote:

>>I think this is a result of the "only subscribers may post" policy, but
>>I'm getting of topic.
> 
> Note the policy is really only subscribers may post without moderator approval.

Thanks for clarification.

I also think that the moderators are aided by some automatic spam
classification techque.

Markus
-- 
Markus Schaber | Logical Tracking&Tracing International AG
Dipl. Inf.     | Software Development GIS

Fight against software patents in EU! www.ffii.org www.nosoftwarepatents.org