Re: need much better query perfomance

Поиск
Список
Период
Сортировка
От Steve Crawford
Тема Re: need much better query perfomance
Дата
Msg-id 20030129164951.61C81103E8@polaris.pinpointresearch.com
обсуждение исходный текст
Ответ на need much better query perfomance  (andy@mixonic.com (Andy))
Список pgsql-general
First the basics - have you run vacuum analyze? Also, if you can have
exclusive access to the database for a while, "vacuum full".

I haven't had my coffee but the query doesn't look right. If I understand the
question you want to find those tracks which are marked for deletion and do
not appear on any "active" (non-deleted) album.

I don't think union is the correct solution. I'd try something like:

select track_id from track where track.deleted = 'Y'
and
not exists (select * from album_track where
album_track.track_id = track.track_id
and
album_track.album_id = album.album.id
and
album.deleted is null)

Cheers,
Steve


>     from track t
>     where t.deleted = 'Y'
>       UNION
>   select track_id from track where track_id not in (
>     select at.track_id from album_track at, album alb
>       where at.album_id = alb.album_id and alb.deleted is null
>   )


On Friday 24 January 2003 3:28 pm, Andy wrote:
> Here's the problem:
>
> Table track has 100,000 rows
> Table album has 20,000 rows
> Table album_track is a lookup table linking tracks to albums and has
> 80,000 rows
>
> The track table has a PK, a deleted field ('Y' for deleted tracks /
> null otherwise), and several track-specific fields.
> The album table also has a PK and a deleted field ('Y' or null), as
> well as several album-specific fields.
> The album_track table simple has two columns: The PKs of the album and
> track tables.
>
> A deleted track may still exist in an not-deleted album.  A deleted
> track is simply one that cannot be added to new albums.
> Likewise, a deleted album may contain not-deleted tracks.  It is
> simply an album that the user has deleted.
>
> So even though we've marked deleted albums & tracks in the database,
> we haven't removed "deleted" tracks from our fileserver.  Now we're
> almost out of space, so we need to find all tracks that are marked as
> deleted and which are not a part of any non-deleted albums.
>
> I wrote what must be a very naive query to find such tracks.  Doing an
> explain on this query gave a huge time estimate.  My query is below.
> Would someone please suggest a faster approach?
>
>   select t.track_id
>     from track t
>     where t.deleted = 'Y'
>       UNION
>   select track_id from track where track_id not in (
>     select at.track_id from album_track at, album alb
>       where at.album_id = alb.album_id and alb.deleted is null
>   )
>
> The above query should work, but it takes too long to execute.  The
> sample query below, however, is fast but incorrect:
>
>   select t.track_id
>     from track t
>     where t.deleted = 'Y'
>       UNION
>   select at.track_id
>     from album_track at, album alb
>     where at.album_id = alb.album_id and alb.deleted = 'Y'
>
> This query is incorrect because it the bottom-most select is getting
> all deleted albums, but a track might be in both deleted AND
> non-deleted albums.  This incorrect query will return all the
> "completely deleted" tracks, but will also return tracks that are part
> of both deleted and non-deleted albums, and we don't want this.
>
> ---------------------------(end of broadcast)---------------------------
> TIP 6: Have you searched our list archives?
>
> http://archives.postgresql.org

В списке pgsql-general по дате отправления:

Предыдущее
От: Tom Lane
Дата:
Сообщение: Re: monitoring postgres
Следующее
От: Stephan Szabo
Дата:
Сообщение: Re: index on timestamp performance