Data Set Growth causing 26+hour runtime, on what we believe to be very simple SQL
However, much of the time when I see the CPU at 65%-90% wait, there is very little disk access, so it's not the disk subsystem (in my opinion). I've also moved CPUs around and the SQL seems to stall regardless of which CPU the job has been given. Memory: I pulled memory to test, and again, other than this data set consuming 10 GB, there is about 700K free (I will add more memory), but since the disks are not a bottleneck and I don't appear to be swapping, I keep coming back to the data or the SQL.
I'm providing the data that I think is requested when a performance issue is observed.
There is an autovacuum running, but the queries are run on static data, so no INSERTs/UPDATEs/DELETEs happen while the reports run.
The query seems to have gotten slower as the data set grew.
Red Hat
Postgres 8.3.4
8 CPU box
10 GB of RAM

· Number of rows in the table = 100 million
· Size of table including indices = 21 GB
· Time to create a combined index on 2 columns (tagged boolean, makeid text) = more than 1 hr 30 minutes
· Data distribution = In the 98 mill records, there are 7000 unique makeid's, and 21 mill unique UID's. About 41 mill of the records have tagged=true
· Time to execute the following query with indices on makeid and tagged = 90-120 seconds. The planner uses the webid index and filters on tagged and then rechecks the webid index
SELECT COUNT(DISTINCT uid ) AS active_users FROM pixelpool.userstats WHERE makeid ='bmw-ferman' AND tagged =true
· Time to execute the same query with a combined index on makeid and tagged = 60-100 seconds. The planner uses the combined index and then filters on tagged.
SELECT COUNT(DISTINCT uid ) AS active_users FROM pixelpool.userstats WHERE makeid ='bmw-ferman' AND tagged =true
Plan
"Aggregate (cost=49467.00..49467.01 rows=1 width=8)"
" -> Bitmap Heap Scan on userstats (cost=363.49..49434.06 rows=13175 width=8)"
" Recheck Cond: (makeid = 'b1mw-ferman'::text)"
" Filter: tagged"
" -> Bitmap Index Scan on idx_retargetuserstats_makeidtag (cost=0.00..360.20 rows=13175 width=0)"
" Index Cond: ((makeid = 'b1mw-ferman'::text) AND (tagged = true))"
Any assistance would be appreciated; don't worry about slapping me around, I need to figure this out. Otherwise I'm buying new hardware where it may not be required.
Thanks
Tory
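For reference, the combined two-column index being timed above would be created with something like the following (a sketch; the name matches the index the planner reports in the plans in this thread):

```sql
-- Combined two-column index on (makeid, tagged); reported above to take
-- more than 1 hr 30 min to build on this ~100-million-row table
CREATE INDEX idx_retargetuserstats_makeidtag
    ON pixelpool.userstats (makeid, tagged);
```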
Re: Data Set Growth causing 26+hour runtime, on what we believe to be very simple SQL
Tory M Blue wrote:
> Any assistance would be appreciated, don't worry about slapping me
> around I need to figure this out. Otherwise I'm buying new hardware
> where it may not be required.

What is the reporting query that takes 26 hours? You didn't seem to include it, or any query plan information for it (EXPLAIN or EXPLAIN ANALYZE results).

What sort of activity is happening on the db concurrently with your tests? What's your max connection limit?

What're your shared_buffers and effective_cache_size settings?

Could sorts be spilling to disk? Check work_mem size and enable logging of tempfiles (see the manual).

Does an explicit ANALYZE of the problem table(s) help?

--
Craig Ringer
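A quick way to gather the numbers Craig asks about, from psql (a sketch using standard commands):

```sql
-- Current settings relevant to Craig's questions
SHOW max_connections;
SHOW shared_buffers;
SHOW effective_cache_size;
SHOW work_mem;

-- Refresh planner statistics for the problem table
ANALYZE pixelpool.userstats;
```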
Re: Data Set Growth causing 26+hour runtime, on what we believe to be very simple SQL
On 21/01/10 22:15, Tory M Blue wrote:
> · Data distribution = In the 98mill records, there are 7000 unique
> makeid's, and 21mill unique UID's. About 41mill of the records have
> tagged=true
>
> · Time to execute the following query with indices on makeid and
> tagged = 90-120 seconds. The planner uses the webid index and filters on
> tagged and then rechecks the webid index
>
> SELECT COUNT(DISTINCT uid ) AS active_users FROM
> pixelpool.userstats WHERE makeid ='bmw-ferman' AND tagged =true
>
> · Time to execute the same query with a combined index on makeid
> and tagged = 60-100 seconds. The planner uses the combined index and then
> filters tagged.

Two things:

1. You have got the combined index on (makeid, tagged) and not (tagged, makeid), haven't you? Just checking.

2. If it's mostly tagged=true you are interested in, you can always use a partial index: CREATE INDEX ... (makeid) WHERE tagged

This might be a win even if you need a second index with WHERE NOT tagged.

Also, either I've not had enough coffee yet, or a bitmap scan is an odd choice for only ~13000 rows out of 100 million.

> "  ->  Bitmap Index Scan on idx_retargetuserstats_makeidtag (cost=0.00..360.20 rows=13175 width=0)"
> "        Index Cond: ((makeid = 'b1mw-ferman'::text) AND (tagged = true))"

Otherwise, see what Craig said. I'm assuming this isn't the query that is CPU bound for a long time. Unless your table is horribly bloated, there's no reason for that judging by this plan.

--
Richard Huxton
Archonet Ltd
> > Any assistance would be appreciated, don't worry about slapping me
> > around I need to figure this out. Otherwise I'm buying new hardware
> > where it may not be required.
>
> What is the reporting query that takes 26 hours? You didn't seem to
> include it, or any query plan information for it (EXPLAIN or EXPLAIN
> ANALYZE results).

It's this query, run 6000 times with a diff makeid's:
SELECT COUNT(DISTINCT uid ) AS active_users FROM pixelpool.userstats WHERE makeid ='bmw-ferman' AND tagged =true
Plan
"Aggregate (cost=49467.00..49467.01 rows=1 width=8)"
" -> Bitmap Heap Scan on userstats (cost=363.49..49434.06 rows=13175 width=8)"
" Recheck Cond: (makeid = 'b1mw-ferman'::text)"
" Filter: tagged"
" -> Bitmap Index Scan on idx_retargetuserstats_makeidtag (cost=0.00..360.20 rows=13175 width=0)"
" Index Cond: ((makeid = 'b1mw-ferman'::text) AND (tagged = true))"
> What sort of activity is happening on the db concurrently with your
> tests? What's your max connection limit?
50 max, and there is nothing; usually one person connected, if that. Otherwise it's a cron job that bulk inserts, and then jobs run later that generate the reports off the static data. No deletes or updates happening.
> What're your shared_buffers and effective_cache_size settings?
shared_buffers = 1028MB (Had this set at 128 and 256 and just recently bumped it higher, didn't buy me anything)
maintenance_work_mem = 128MB
fsync=on
random_page_cost = 4.0
effective_cache_size = 7GB
default vac settings
> Could sorts be spilling to disk? Check work_mem size and enable logging
> of tempfiles (see the manual).
work_mem = 100MB # min 64kB
Will do. I guess it's possible, but during the queries/reports I don't see a ton of writes, mostly reads.
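For reference, tempfile logging is a single postgresql.conf setting (available from 8.3 on; a value of 0 logs every temp file regardless of size, -1, the default, disables logging):

```
# postgresql.conf (sketch): log creation of temporary files larger than
# this size in kB; 0 = log all temp files
log_temp_files = 0
```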
> Does an explicit ANALYZE of the problem table(s) help?
It didn't.
Thanks
Tory
> On 21/01/10 22:15, Tory M Blue wrote:
> > · Data distribution = In the 98mill records, there are 7000 unique
> > makeid's, and 21mill unique UID's. About 41mill of the records have
> > tagged=true
> >
> > · Time to execute the following query with indices on makeid and
> > tagged = 90-120 seconds. The planner uses the webid index and filters on
> > tagged and then rechecks the webid index
> >
> > SELECT COUNT(DISTINCT uid ) AS active_users FROM
> > pixelpool.userstats WHERE makeid ='bmw-ferman' AND tagged =true
> >
> > · Time to execute the same query with a combined index on makeid
> > and tagged = 60-100 seconds. The planner uses the combined index and then
> > filters tagged.
> Two things:
>
> 1. You have got the combined index on (makeid, tagged) and not (tagged,
> makeid), haven't you? Just checking.
Yes we do
> 2. If it's mostly tagged=true you are interested in you can always use a
> partial index: CREATE INDEX ... (makeid) WHERE tagged
>
> This might be a win even if you need a second index with WHERE NOT tagged.
A partial index doesn't seem to fit here, due to the fact that 35-40% are marked true.
Didn't think about creating a second index for false, may give that a shot.
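The pair of partial indexes under discussion might look like this (index names are invented for illustration):

```sql
-- Partial index covering the tagged=true case the reports query
CREATE INDEX userstats_makeid_tagged_idx
    ON pixelpool.userstats (makeid) WHERE tagged;

-- Optional second partial index if the tagged=false case is also queried
CREATE INDEX userstats_makeid_untagged_idx
    ON pixelpool.userstats (makeid) WHERE NOT tagged;
```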
> Also, either I've not had enough coffee yet, or a bitmap scan is an odd
> choice for only ~13000 rows out of 100 million.
>
> "  ->  Bitmap Index Scan on idx_retargetuserstats_makeidtag (cost=0.00..360.20 rows=13175 width=0)"
> "        Index Cond: ((makeid = 'b1mw-ferman'::text) AND (tagged = true))"
> Otherwise, see what Craig said.
>
> I'm assuming this isn't the query that is CPU bound for a long time. Unless
> your table is horribly bloated, there's no reason for that judging by this
> plan.
It is, but not always; only when there are 10K or more matches. And the explain estimates are unfortunately sometimes way high or way low, so the explain is hit and miss.
But the same SQL that returns maybe 500 rows is pretty fast; it's the return of 10K+ rows that seems to stall and is CPU bound.
Thanks
Tory
Re: Data Set Growth causing 26+hour runtime, on what we believe to be very simple SQL
On 22/01/10 18:03, Tory M Blue wrote:
> On Fri, Jan 22, 2010 at 1:42 AM, Richard Huxton <dev@archonet.com> wrote:
>> On 21/01/10 22:15, Tory M Blue wrote:
>> 2. If it's mostly tagged=true you are interested in you can always use a
>> partial index: CREATE INDEX ... (makeid) WHERE tagged
>> This might be a win even if you need a second index with WHERE NOT tagged.
>
> Partial index doesn't seem to fit here due to the fact that there are 35-40%
> Marked True.
>
> Didn't think about creating a second index for false, may give that a shot.

If you're mostly searching tagged=true, try the partial index - it'll mean the planner is just scanning the index for the one term.

>> Also, either I've not had enough coffee yet, or a bitmap scan is an odd
>> choice for only ~13000 rows out of 100 million.
>>
>> "  ->  Bitmap Index Scan on idx_retargetuserstats_makeidtag (cost=0.00..360.20 rows=13175 width=0)"
>> "        Index Cond: ((makeid = 'b1mw-ferman'::text) AND (tagged = true))"
>>
>> Otherwise, see what Craig said.
>>
>> I'm assuming this isn't the query that is CPU bound for a long time. Unless
>> your table is horribly bloated, there's no reason for that judging by this
>> plan.
>
> It is, but not always, only when there are 10K more matches. And the explain
> unfortunately is sometimes way high or way low, so the expalin is hit and
> miss.
>
> But the same sql that returns maybe 500 rows is pretty fast, it's the return
> of 10K+ rows that seems to stall and is CPU Bound.

Hmm - might be able to push that cross-over point up a bit by tweaking various costs, but you've got to be careful you don't end up making all your other queries worse. It'd be good to figure out what the problem is first.

Looking at the query there are four stages:
1. Scan the index, build a bitmap of heap pages with matching rows
2. Scan those pages, find the rows that match
3. Run DISTINCT on the uids
4. Count them

I wonder if it could be the DISTINCT.
What happens with a count(*) or count(uid) instead?

Also - you might find EXPLAIN ANALYZE more useful than straight EXPLAIN here. That will show actual times for each stage.

On Craig's branch of this thread, you say you call it 6000 times with different "makeid"s. Any reason why you can't join to a temp table and just do it in one query?

--
Richard Huxton
Archonet Ltd
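Richard's diagnostic suggestions, sketched against the query from the thread:

```sql
-- Drop the DISTINCT to see how much of the runtime it accounts for
SELECT COUNT(uid) FROM pixelpool.userstats
 WHERE makeid = 'bmw-ferman' AND tagged = true;

-- EXPLAIN ANALYZE runs the query for real and reports actual per-node times
EXPLAIN ANALYZE
SELECT COUNT(DISTINCT uid) AS active_users
  FROM pixelpool.userstats
 WHERE makeid = 'bmw-ferman' AND tagged = true;
```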
Re: Data Set Growth causing 26+hour runtime, on what we believe to be very simple SQL
On Fri, 22 Jan 2010, Tory M Blue wrote:
> But the same sql that returns maybe 500 rows is pretty fast, it's the return
> of 10K+ rows that seems to stall and is CPU Bound.

Okay, so you have two differing cases. Show us the EXPLAIN ANALYSE for both of them, and we will see what the difference is.

Matthew

--
The third years are wandering about all worried at the moment because they
have to hand in their final projects. Please be sympathetic to them, say
things like "ha-ha-ha", but in a sympathetic tone of voice
-- Computer Science Lecturer
Re: Data Set Growth causing 26+hour runtime, on what we believe to be very simple SQL
On Fri, Jan 22, 2010 at 10:59 AM, Tory M Blue <tmblue@gmail.com> wrote:
> On Thu, Jan 21, 2010 at 7:46 PM, Craig Ringer <craig@postnewspapers.com.au> wrote:
>>> Any assistance would be appreciated, don't worry about slapping me
>>> around I need to figure this out. Otherwise I'm buying new hardware
>>> where it may not be required.
>>
>> What is the reporting query that takes 26 hours? You didn't seem to
>> include it, or any query plan information for it (EXPLAIN or EXPLAIN
>> ANALYZE results).
>
> It's this query, run 6000 times with a diff makeid's
>
> SELECT COUNT(DISTINCT uid ) AS active_users FROM pixelpool.userstats
> WHERE makeid ='bmw-ferman' AND tagged =true

Any chance of trying this instead:

select makeid, count(distinct uid) as active_users
from pixelpool.userstats
where tagged=true
group by makeid

And seeing how long it takes? If you're limiting the total number of makeids then you could add

and makeid in (biglistofmakeidsgoeshere)

Note that a partial index of

create index xyz on pixelpool.userstats (makeid) where tagged;

might help both the original and this query.
On Fri, Jan 22, 2010 at 10:26 AM, Matthew Wakeling <matthew@flymine.org> wrote:
> On Fri, 22 Jan 2010, Tory M Blue wrote:
>> But the same sql that returns maybe 500 rows is pretty fast, it's the return
>> of 10K+ rows that seems to stall and is CPU Bound.
>
> Okay, so you have two differing cases. Show us the EXPLAIN ANALYSE for both
> of them, and we will see what the difference is.

Okay, understood.

Here is the explain plan for the query. Actual rows that the query returns is 6369.

SLOW

This is for this SQL:

SELECT COUNT(distinct uid ) AS active_users FROM
pixelpool.userstats WHERE makeid = 'gmps-armen-chevy' and tagged=true

Explain
"Aggregate  (cost=118883.96..118883.97 rows=1 width=8)"
"  ->  Bitmap Heap Scan on userstats  (cost=797.69..118850.46 rows=13399 width=8)"
"        Recheck Cond: (makeid = 'gmps-armen-chevy'::text)"
"        Filter: tagged"
"        ->  Bitmap Index Scan on idx_retargetuserstats_makeid  (cost=0.00..794.34 rows=33276 width=0)"
"              Index Cond: (makeid = 'gmps-armen-chevy'::text)"

Explain Analyze
"Aggregate  (cost=118883.96..118883.97 rows=1 width=8) (actual time=31219.376..31219.376 rows=1 loops=1)"
"  ->  Bitmap Heap Scan on userstats  (cost=797.69..118850.46 rows=13399 width=8) (actual time=281.604..31190.290 rows=19799 loops=1)"
"        Recheck Cond: (makeid = 'gmps-armen-chevy'::text)"
"        Filter: tagged"
"        ->  Bitmap Index Scan on idx_retargetuserstats_makeid  (cost=0.00..794.34 rows=33276 width=0) (actual time=258.506..258.506 rows=23242 loops=1)"
"              Index Cond: (makeid = 'gmps-armen-chevy'::text)"
"Total runtime: 31219.536 ms"

FAST

This is for this SQL:

SELECT a.makeid, COUNT(DISTINCT CASE WHEN tagged=true THEN a.uid END) AS active_users,
COUNT(DISTINCT CASE WHEN tagged=false THEN a.uid END) as unassociated
FROM pixelpool.userstats a
WHERE makeid ='gmps-oden' GROUP BY a.makeid

Explain Analyze
"GroupAggregate  (cost=802.66..119105.01 rows=1 width=23) (actual time=3813.550..3813.551 rows=1 loops=1)"
"  ->  Bitmap Heap Scan on userstats a  (cost=802.66..118855.43 rows=33276 width=23) (actual time=55.400..3807.908 rows=2606 loops=1)"
"        Recheck Cond: (makeid = 'gmps-oden'::text)"
"        ->  Bitmap Index Scan on idx_retargetuserstats_makeid  (cost=0.00..794.34 rows=33276 width=0) (actual time=51.748..51.748 rows=2677 loops=1)"
"              Index Cond: (makeid = 'gmps-oden'::text)"
"Total runtime: 3813.626 ms"

Thanks again
Tory
On Fri, Jan 22, 2010 at 11:06 AM, Tory M Blue <tmblue@gmail.com> wrote:
> Here is the explain plan for the query. Actual rows that the query
> returns is 6369
> [...]

Sorry, mis-pasted the FAST!!!

Explain
"GroupAggregate  (cost=802.66..119105.01 rows=1 width=23)"
"  ->  Bitmap Heap Scan on userstats a  (cost=802.66..118855.43 rows=33276 width=23)"
"        Recheck Cond: (makeid = 'gmps-oden'::text)"
"        ->  Bitmap Index Scan on idx_retargetuserstats_makeid  (cost=0.00..794.34 rows=33276 width=0)"
"              Index Cond: (makeid = 'gmps-oden'::text)"

Explain Analyze
"GroupAggregate  (cost=802.66..119105.01 rows=1 width=23) (actual time=3813.550..3813.551 rows=1 loops=1)"
"  ->  Bitmap Heap Scan on userstats a  (cost=802.66..118855.43 rows=33276 width=23) (actual time=55.400..3807.908 rows=2606 loops=1)"
"        Recheck Cond: (makeid = 'gmps-oden'::text)"
"        ->  Bitmap Index Scan on idx_retargetuserstats_makeid  (cost=0.00..794.34 rows=33276 width=0) (actual time=51.748..51.748 rows=2677 loops=1)"
"              Index Cond: (makeid = 'gmps-oden'::text)"
"Total runtime: 3813.626 ms"
Re: Data Set Growth causing 26+hour runtime, on what we believe to be very simple SQL
On 23/01/2010 1:59 AM, Tory M Blue wrote:
> It's this query, run 6000 times with a diff makeid's
>
> SELECT COUNT(DISTINCT uid ) AS active_users FROM
> pixelpool.userstats WHERE makeid ='bmw-ferman' AND tagged =true
>
> Plan
> "Aggregate  (cost=49467.00..49467.01 rows=1 width=8)"
> "  ->  Bitmap Heap Scan on userstats  (cost=363.49..49434.06 rows=13175 width=8)"
> "        Recheck Cond: (makeid = 'b1mw-ferman'::text)"
> "        Filter: tagged"
> "        ->  Bitmap Index Scan on idx_retargetuserstats_makeidtag  (cost=0.00..360.20 rows=13175 width=0)"
> "              Index Cond: ((makeid = 'b1mw-ferman'::text) AND (tagged = true))"

Try:

- Adding a partial index on makeid, eg:

  CREATE INDEX userstats_makeid_where_tagged_idx
  ON userstats (makeid) WHERE (tagged);

- Instead of repeating the query 6000 times in a loop, collect the data in one pass by joining against a temp table containing the makeids of interest.

  SELECT COUNT(DISTINCT u.uid) AS active_users
  FROM pixelpool.userstats u
  INNER JOIN temp_makeids m ON (u.makeid = m.makeid)
  WHERE u.tagged = true;

(If the 6000 repeats are really a correlated subquery part of a bigger query you still haven't shown, then you might be able to avoid 6000 individual passes by adjusting your outer query instead.)

--
Craig Ringer
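Craig's temp-table pass might be set up like this (a sketch; temp_makeids is hypothetical, and a GROUP BY is added so you get one count per makeid rather than a single combined count):

```sql
-- Hypothetical setup: load the ~6000 makeids of interest once,
-- then aggregate per makeid in a single pass over userstats
CREATE TEMP TABLE temp_makeids (makeid text PRIMARY KEY);
-- INSERT INTO temp_makeids VALUES ('bmw-ferman'), ...;

SELECT u.makeid, COUNT(DISTINCT u.uid) AS active_users
  FROM pixelpool.userstats u
  JOIN temp_makeids m ON (u.makeid = m.makeid)
 WHERE u.tagged
 GROUP BY u.makeid;
```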
Re: Data Set Growth causing 26+hour runtime, on what we believe to be very simple SQL
On 22/01/10 19:06, Tory M Blue wrote:
> Here is the explain plan for the query. Actual rows that the query
> returns is 6369

Actually, it processes 19,799 rows (see the actual rows= below).

> SLOW
> "  ->  Bitmap Heap Scan on userstats  (cost=797.69..118850.46
> rows=13399 width=8) (actual time=281.604..31190.290 rows=19799
> loops=1)"
> "Total runtime: 31219.536 ms"

> FAST
> "  ->  Bitmap Heap Scan on userstats a  (cost=802.66..118855.43
> rows=33276 width=23) (actual time=55.400..3807.908 rows=2606 loops=1)"
> "Total runtime: 3813.626 ms"

OK - so the first query processes 19,799 rows in 31,219 ms (about 1.5ms per row). The second processes 2,606 rows in 3,813 ms (about 1.3ms per row).

You are asking for DISTINCT user-ids, so it seems reasonable that it will take slightly longer to check a larger set of user-ids. Otherwise, both queries are the same. I'm still a little puzzled by the bitmap scan, but the planner probably knows more about your data than I do.

The main time is spent in the "bitmap heap scan", which is where it's grabbing actual row data (and presumably building a hash over the uid column). You can see how long in the "actual time": the first number (e.g. 281.604) is the time spent before it starts, and the second is the total time at finish (31190.290). If "loops" was greater than 1 you would multiply the times by the number of loops to get a total.

So - there's nothing "wrong" in the sense that the second query does the same as the first.

Let's take a step back. What you really want is your reports to be faster. You mentioned you were running this query thousands of times with a different "makeid" each time. Running it once for all possible values and stashing the results in a temp table will probably be *much* faster. The planner can just scan the whole table once and build up its results as it goes.

--
Richard Huxton
Archonet Ltd
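Richard's "run once for all values and stash the results" approach could be sketched as follows (names are illustrative):

```sql
-- One pass over the table: per-makeid distinct-user counts, stashed in a
-- temp table the report then reads from
CREATE TEMP TABLE active_users_by_makeid AS
SELECT makeid, COUNT(DISTINCT uid) AS active_users
  FROM pixelpool.userstats
 WHERE tagged
 GROUP BY makeid;

-- Each of the ~6000 report lookups becomes a cheap single-row fetch
SELECT active_users
  FROM active_users_by_makeid
 WHERE makeid = 'bmw-ferman';
```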
Re: Data Set Growth causing 26+hour runtime, on what we believe to be very simple SQL
On Mon, 25 Jan 2010, Richard Huxton wrote:
> OK - so the first query processes 19,799 rows in 31,219 ms (about 1.5ms per
> row)
>
> The second processes 2,606 rows in 3,813 ms (about 1.3ms per row).

Agreed. One query is faster than the other because it has to do an eighth the amount of work.

Matthew

--
I wouldn't be so paranoid if you weren't all out to get me!!
On Mon, Jan 25, 2010 at 3:59 AM, Matthew Wakeling <matthew@flymine.org> wrote:
> On Mon, 25 Jan 2010, Richard Huxton wrote:
>> OK - so the first query processes 19,799 rows in 31,219 ms (about 1.5ms
>> per row)
>>
>> The second processes 2,606 rows in 3,813 ms (about 1.3ms per row).
>
> Agreed. One query is faster than the other because it has to do an eighth
> the amount of work.
>
> Matthew

Thanks guys, ya this has dropped the time by half. The process is manageable now. Thanks again.

For some reason we thought this method would make it take more time, vs less. So again appreciate the help :)

Tory