Обсуждение: Need some advice on a difficult query
RecipeId
RecipeTitle
RecipeRating
And an ingredients table:
RecipeId
IngredientId
Amount
When the user searches for a new pasta dish, the UI would generate a query something like this:
SELECT * FROM Recipes where RecipeTitle ilike '%pasta%';
I only need the data from the recipes table since I display a summary of the search results and don't load the full recipe until the user clicks on the link. This works great. However, I'm now in the process of implementing an ingredient blacklist. This means NEVER show me any recipes which have one of my blacklisted ingredients, as if I ingest any I will die a painful death. Or maybe my god will smite me or something. This table would look like:
UserId
IngredientId
Ok, now the question. What's the best way to query for my pasta dish above while excluding any recipe that contains one or more of my blacklisted ingredients? I don't want to join in Ingredients because I'd then have a row for each ingredient of each recipe (which would cause me massive headache and redesign).. I thought about some weird nested query, but I'm concerned it might be slow (searching for recipes is the central feature of my site and must be fast, fast and fast). This basically boils down to some weird Cartesian product thing.
I am, however, willing to redesign the mechanism I store blacklisted ingredients with. I could store this as an array on the Users table, or anything else that might make things easier. Any advice on this?
Mike
Mike
Hi Mike,
here is an untested "weird nested query" for your problem:
SELECT * FROM Recipes r where lower(RecipeTitle) like lower('%pasta%')
and not exists
(select 1 from ingredients inner join blacklist using (IngredientId) where RecipeId = r.RecipeId and blacklist.UserId = 123 limit 1);
Should be quite fast with the right indexes.
Timo
Von: pgsql-general-owner@postgresql.org [mailto:pgsql-general-owner@postgresql.org] Im Auftrag von Mike Christensen
Gesendet: Mittwoch, 16. Dezember 2009 09:48
An: pgsql-general@postgresql.org
Betreff: [GENERAL] Need some advice on a difficult query
Hi all - I'd like some advice on how to write a rather complicated (for me, anyway) query and if there's any nifty Postgres features I can take advantage of in this situation. Imagine a database, if you will, used to store recipes. I have a recipes table:
RecipeId
RecipeTitle
RecipeRating
And an ingredients table:
RecipeId
IngredientId
Amount
When the user searches for a new pasta dish, the UI would generate a query something like this:
SELECT * FROM Recipes where RecipeTitle ilike '%pasta%';
I only need the data from the recipes table since I display a summary of the search results and don't load the full recipe until the user clicks on the link. This works great. However, I'm now in the process of implementing an ingredient blacklist. This means NEVER show me any recipes which have one of my blacklisted ingredients, as if I ingest any I will die a painful death. Or maybe my god will smite me or something. This table would look like:
UserId
IngredientId
Ok, now the question. What's the best way to query for my pasta dish above while excluding any recipe that contains one or more of my blacklisted ingredients? I don't want to join in Ingredients because I'd then have a row for each ingredient of each recipe (which would cause me massive headache and redesign).. I thought about some weird nested query, but I'm concerned it might be slow (searching for recipes is the central feature of my site and must be fast, fast and fast). This basically boils down to some weird Cartesian product thing.
I am, however, willing to redesign the mechanism I store blacklisted ingredients with. I could store this as an array on the Users table, or anything else that might make things easier. Any advice on this?
Mike
Hi Mike,
here is an untested "weird nested query" for your problem:
SELECT * FROM Recipes r where lower(RecipeTitle) like lower('%pasta%')
and not exists
(select 1 from ingredients inner join blacklist using (IngredientId) where RecipeId = r.RecipeId and blacklist.UserId = 123 limit 1);
Should be quite fast with the right indexes.
Timo
Von: pgsql-general-owner@postgresql.org [mailto:pgsql-general-owner@postgresql.org] Im Auftrag von Mike Christensen
Gesendet: Mittwoch, 16. Dezember 2009 09:48
An: pgsql-general@postgresql.org
Betreff: [GENERAL] Need some advice on a difficult query
Hi all - I'd like some advice on how to write a rather complicated (for me, anyway) query and if there's any nifty Postgres features I can take advantage of in this situation.  Imagine a database, if you will, used to store recipes.  I have a recipes table:
 RecipeId
 RecipeTitle
 RecipeRating
 And an ingredients table:
 RecipeId
 IngredientId
 Amount
 When the user searches for a new pasta dish, the UI would generate a query something like this:
 SELECT * FROM Recipes where RecipeTitle ilike '%pasta%';
 I only need the data from the recipes table since I display a summary of the search results and don't load the full recipe until the user clicks on the link.  This works great.  However, I'm now in the process of implementing an ingredient blacklist.  This means NEVER show me any recipes which have one of my blacklisted ingredients, as if I ingest any I will die a painful death.  Or maybe my god will smite me or something.  This table would look like:
 UserId
 IngredientId
 Ok, now the question.  What's the best way to query for my pasta dish above while excluding any recipe that contains one or more of my blacklisted ingredients?  I don't want to join in Ingredients because I'd then have a row for each ingredient of each recipe (which would cause me massive headache and redesign)..   I thought about some weird nested query, but I'm concerned it might be slow (searching for recipes is the central feature of my site and must be fast, fast and fast).  This basically boils down to some weird Cartesian product thing.
 I am, however, willing to redesign the mechanism I store blacklisted ingredients with.  I could store this as an array on the Users table, or anything else that might make things easier.  Any advice on this?
 Mike
On Wed, Dec 16, 2009 at 12:47:36AM -0800, Mike Christensen wrote:
> When the user searches for a new pasta dish, the UI would generate a query
> something like this:
>
> SELECT * FROM Recipes where RecipeTitle ilike '%pasta%';
>
> I only need the data from the recipes table since I display a summary of the
> search results and don't load the full recipe until the user clicks on the
> link.  This works great.  However, I'm now in the process of implementing an
> ingredient blacklist.  This means NEVER show me any recipes which have one
> of my blacklisted ingredients, as if I ingest any I will die a painful
> death.
If you expect the number of blacklisted recipes to be low, the
following may be a good alternative as well:
  SELECT r.*
  FROM recipes r LEFT JOIN (
    SELECT i.recipeid FROM ingredients i, blacklist b
    WHERE i.ingredientid = b.ingredientid
      AND b.userid = 123
    GROUP BY i.recipeid) x ON r.recipeid = x.recipeid
  WHERE r.recipetitle ILIKE '%pasta%'
    AND x.recipeid IS NULL;
Note that it's generally considered bad form to include "*" in the
return of a query when it's code dealing with the response.  Humans can
deal with the columns coming back differently, but code has the habit of
getting confused.
Also, you may want to consider using full text search when searching the
titles.  That ILIKE requires a full table scan and will slow down as
more recipes get added.
--
  Sam  http://samason.me.uk/
			
		I'll do some tests with your technique below and see which works better..
Mike
On Wed, Dec 16, 2009 at 12:47:36AM -0800, Mike Christensen wrote:If you expect the number of blacklisted recipes to be low, the
> When the user searches for a new pasta dish, the UI would generate a query
> something like this:
>
> SELECT * FROM Recipes where RecipeTitle ilike '%pasta%';
>
> I only need the data from the recipes table since I display a summary of the
> search results and don't load the full recipe until the user clicks on the
> link. This works great. However, I'm now in the process of implementing an
> ingredient blacklist. This means NEVER show me any recipes which have one
> of my blacklisted ingredients, as if I ingest any I will die a painful
> death.
following may be a good alternative as well:
SELECT r.*
FROM recipes r LEFT JOIN (
SELECT i.recipeid FROM ingredients i, blacklist b
WHERE i.ingredientid = b.ingredientid
AND b.userid = 123
GROUP BY i.recipeid) x ON r.recipeid = x.recipeid
WHERE r.recipetitle ILIKE '%pasta%'
AND x.recipeid IS NULL;
Note that it's generally considered bad form to include "*" in the
return of a query when it's code dealing with the response. Humans can
deal with the columns coming back differently, but code has the habit of
getting confused.
Also, you may want to consider using full text search when searching the
titles. That ILIKE requires a full table scan and will slow down as
more recipes get added.
--
Sam http://samason.me.uk/
--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general