need much better query perfomance

Поиск
Список
Период
Сортировка
От andy@mixonic.com (Andy)
Тема need much better query perfomance
Дата
Msg-id 8d60b5ea.0301241528.230778b2@posting.google.com
обсуждение исходный текст
Ответы Re: need much better query perfomance  (Stephan Szabo <sszabo@megazone23.bigpanda.com>)
Re: need much better query perfomance  (Tom Lane <tgl@sss.pgh.pa.us>)
Re: need much better query perfomance  (Steve Crawford <scrawford@pinpointresearch.com>)
Список pgsql-general
Here's the problem:

Table track has 100,000 rows
Table album has 20,000 rows
Table album_track is a lookup table linking tracks to albums and has
80,000 rows

The track table has a PK, a deleted field ('Y' for deleted tracks /
null otherwise), and several track-specific fields.
The album table also has a PK and a deleted field ('Y' or null), as
well as several album-specific fields.
The album_track table simple has two columns: The PKs of the album and
track tables.

A deleted track may still exist in an not-deleted album.  A deleted
track is simply one that cannot be added to new albums.
Likewise, a deleted album may contain not-deleted tracks.  It is
simply an album that the user has deleted.

So even though we've marked deleted albums & tracks in the database,
we haven't removed "deleted" tracks from our fileserver.  Now we're
almost out of space, so we need to find all tracks that are marked as
deleted and which are not a part of any non-deleted albums.

I wrote what must be a very naive query to find such tracks.  Doing an
explain on this query gave a huge time estimate.  My query is below.
Would someone please suggest a faster approach?

  select t.track_id
    from track t
    where t.deleted = 'Y'
      UNION
  select track_id from track where track_id not in (
    select at.track_id from album_track at, album alb
      where at.album_id = alb.album_id and alb.deleted is null
  )

The above query should work, but it takes too long to execute.  The
sample query below, however, is fast but incorrect:

  select t.track_id
    from track t
    where t.deleted = 'Y'
      UNION
  select at.track_id
    from album_track at, album alb
    where at.album_id = alb.album_id and alb.deleted = 'Y'

This query is incorrect because it the bottom-most select is getting
all deleted albums, but a track might be in both deleted AND
non-deleted albums.  This incorrect query will return all the
"completely deleted" tracks, but will also return tracks that are part
of both deleted and non-deleted albums, and we don't want this.

В списке pgsql-general по дате отправления:

Предыдущее
От: "Matt Block"
Дата:
Сообщение: Re: How do I unsubscribe?
Следующее
От: "Williams, Travis L, NPONS"
Дата:
Сообщение: psql command line question..