Hi,
I’ve run into this pattern a few times, and I usually get a little confused. I’m wondering if there are some common
solutionsor techniques.
Simplified example:
I have tables `items`, `colors`, and `images`. Items have many colors, and many images.
I want a query to list items, with their colors and images. Each result row is an item, so the colors and images must
beaggregated into arrays or json.
If there were only one other table, it’s a simple join and group…
select items.*, array_agg(color_name)
from items join colors on items.id = colors.item_id
group by items.id
Now to get images too, my first try uses a CTE…
// Tables:
// items (id, name, foo)
// colors (item_id, color_name, color_foo)
// images (item_id, image_file, image_foo)
with items_with_colors as (
// This is the same query as above
select items.*, array_agg(colors.color_name) as color_names
from items
join colors on items.id = colors.item_id
group by items.id
)
// Same idea repeated but now joining to images table
select items.*, array_agg(images.image_file) as image_files
from items_with_colors items
join images on items.id = images.item_id
group by items.id, items.name, items.foo, items.color_names; // mmm :(
One first problem is on the last line. I have to list out all the columns. In my real situation there are many more.
Listingthem is only an inconvenience, but I’m more worried that it is internally comparing all the columns, when really
itcould compare only `items.id`.
So… are there some other techniques that usually (or might) work better? It seems like the kind of thing that could run
inparallel, if I wrote it right.
Rob