Jeff,
> I'll post this message to the list, in case anyone wants to follow - or,
> if it's too OT, let me know, and I'll stop. The query "SELECT authuser,
> SUM(bytes) AS traffic FROM logfile GROUP BY authuser ORDER BY traffic"
> on a DB of approx. 1.8 million rows (same data in Pgsql and in mysql)
> takes 1.83min. in mysql, and 7.36min. on pgsql. The mysql db is raw...no
> indexes or anything 'tuning' done.
First off, this is nonsense: one of MySQL's benefits for web developers is
that it creates indexes automatically.
> The pgsql db is indexed on the
> 'authuser' field, and I've run 'analyze logfile'.
He also needs to index the bytes field (traffic is only an alias for
SUM(bytes), so it can't be indexed itself). And run VACUUM, not just
ANALYZE, if this is a high-activity table, which I suspect.
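As a minimal sketch of that tuning pass, assuming the table and column
names from the quoted query:

```sql
-- Index the grouping column to help the GROUP BY;
-- "traffic" is an alias for SUM(bytes) and cannot be indexed directly.
CREATE INDEX logfile_authuser_idx ON logfile (authuser);

-- Reclaim dead rows and refresh the planner's statistics in one pass.
VACUUM ANALYZE logfile;
```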
> The machine is a PIII 600 w/728Mb RAM - and it's definitely CPU bound
> (both scream up to 100% and stay ;). As far as efficient queries, I'm
> not too sure how much more efficient that query can be made - I'm a
> netadmin, not a DBA :)
We can tell. To be blunt, MySQL is the database for non-DBAs. He
should probably stick to using it rather than Postgres, which requires
some knowledge of performance tuning and query structure.
> My guess is he hasn't optimized PostgreSQL at all on his system. I will
> try and find out more from him as to what version of PostgreSQL he's
> running and try and get a copy of his postgresql.conf file. But can
> anyone think of how the SQL statement could be written to be more
> efficient?
He also needs to up his sort_mem to the max his system and load will
allow. He's sorting 1.8 million rows.
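For illustration, in PostgreSQL 7.x sort_mem is measured in kilobytes and
can be raised per session or in postgresql.conf; 32768 below is a
hypothetical value, not a recommendation for his box:

```sql
-- PostgreSQL 7.x: sort_mem is in KB, applied per sort operation.
-- 32768 = ~32 MB for the big sort; pick a value the machine's RAM
-- and concurrent load can actually afford.
SET sort_mem = 32768;
```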
-Josh Berkus