Hello together,
my name is Florian (ok, that much was obvious) and I’m using PostgreSQL as well as MySQL for a caching solution on a routing server. One thing I am always trying to do is optimize the db times, so here I go again…
I have a table myTable with many rows (millions, potentially billions). This table looks approx. like this...
- attribute_id
- source
- destination
- payload1
- payload2
- payload3
All of these columns are integers. The primary key is (attribute_id, source, destination), but there are also single indexes on the first three columns.
What I want to do is to get the “matrix” for n source/destination points, for example if I had the points 1,2, then I would want the payload for…
source = 1, destination = 1
source = 1, destination = 2
source = 2, destination = 1
source = 2, destination = 2
So, my query, using attribute_1 = 1 as an example, looks like this:
select c.source, c.destination, c.payload1, c.payload2, c.payload3 from myTable
inner join TEMP_TABLE t1 ON t1.point = c.source
inner join TEMP_TABLE t2 ON t2.point = c.destination
where c.attribute_id = 1;
The TEMP_TABLE simply contains the list of integers as “point” (with an index).
With the temp table is filled with around 1700 numers, the result will be approx. 2.9mil entries (1700 * 1700). Unfortunately, it’s quite slow (10-15secs). With MySQL, the main problem is the fetch, while the query itself only takes milliseconds, but I don’t know where the problem with PostgreSQL is (if any). But with only selecting count(*) the whole thing is much faster (1.3 seconds), so I guess the fetch part is the problem here, too.
I have already tried more indices, sub-queries, IN, = ANY, etc. – nothing seems to speed it up.
Does anyone see a chance to speed this query up significantly?
Regards,
Flo