Thread: Re: Utilizing multiple disks


Re: Utilizing multiple disks

From
Gaetano Mendola
Date:
Steve wrote:

> Hi,
>
> I've been running postgres on my server for over a year now and the
> tables have become huge. I have 3 tables that have data over 10GB each
> and these tables are read very very frequently. In fact, heavy searches
> on these tables are expected every 2 to 3 minutes. This unfortunately
> gives a very poor response time to the end user and so I'm looking at
> other alternatives now.

That depends on the queries you are running:

Are you running queries that use the LIKE operator? If so, did you define
an index on that column using the right operator class?

Are you querying on a calculated field? If so, you need
to construct a sort of materialized view.

If you are on Linux, did you mount your data partition with the noatime
option?


Please give us more information on your queries, your data,
and your configuration.

Splitting your tables across multiple disks is usually the last optimization
step. Are you sure you have already reached the bottleneck of your system?



Regards
Gaetano Mendola



Re: Utilizing multiple disks

From
Gaetano Mendola
Date:
Steve wrote:

> Gaetano,
>
> Thanks for your reply.
>
> Gaetano Mendola wrote:
>
>> Are you running queries that use the LIKE operator? If so, did you define
>> an index on that column using the right operator class?
>
>
> Yes I do use the like operator to perform queries on a few 'varchar' and
> 'text' fields. I have indexed almost every column in the table (because
> the client application allows complex searches on multiple columns and
> each column has an equal chance of being used in the 'WHERE' condition)

And did you use the right operator class?
See http://www.postgresql.org/docs/7.4/interactive/indexes-opclass.html
for details.
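
As an illustration only, a pattern-matching index might look like the sketch
below. The table "items" and the columns "title" and "notes" are made-up names,
not taken from your schema. As far as I know, the special operator class is
needed only when the database locale is not "C", and it helps only patterns
anchored at the start of the string.

```sql
-- Hypothetical table and column names, for illustration only.
-- For a varchar column, use varchar_pattern_ops:
CREATE INDEX items_title_like_idx ON items (title varchar_pattern_ops);

-- For a text column, use text_pattern_ops instead:
CREATE INDEX items_notes_like_idx ON items (notes text_pattern_ops);

-- A left-anchored pattern can use the index:
SELECT * FROM items WHERE title LIKE 'abc%';

-- A pattern with a leading wildcard cannot:
SELECT * FROM items WHERE title LIKE '%abc';
```

Running EXPLAIN on your real queries will show whether the index is actually
chosen.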

>> Are you querying on a calculated field? If so, you need
>> to construct a sort of materialized view.
>
>
> Hmm not sure what you mean by a calculated field. Since I'm not sure
> what it means, I'm probably not using it. I'm definitely not running any
> functions on columns to get an accumulated result back etc.

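If you have, for example, a view like:

CREATE VIEW v_test AS
SELECT foo(a)   AS field_1,
       bar(a+b) AS field_2
FROM test;


and you do:

SELECT * FROM v_test WHERE field_2 = 123;

then you are in trouble, because bar(a+b) has to be evaluated for every row
of test and ordinary indexes on a or b cannot help.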
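
One way to build that "sort of materialized view" by hand is sketched below.
The names mv_test and mv_test_field_2_idx are invented for this example, and
foo() and bar() are the placeholder functions from the view above. Treat it as
a sketch: the copy is only a snapshot and has to be rebuilt (or maintained with
triggers) whenever test changes.

```sql
-- Snapshot the view into an ordinary table (mv_test is a made-up name).
CREATE TABLE mv_test AS
SELECT foo(a)   AS field_1,
       bar(a+b) AS field_2
FROM test;

-- Index the precomputed column so the search no longer evaluates bar() per row.
CREATE INDEX mv_test_field_2_idx ON mv_test (field_2);
ANALYZE mv_test;

-- Query the snapshot instead of the view.
SELECT * FROM mv_test WHERE field_2 = 123;
```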


>> If you are on Linux, did you mount your data partition with the noatime
>> option?
>
>
> Yes I'm on linux and am using the ext3 filesystem. I am just about to
> mount the data partition with the noatime option. Docos do say that the
> speed increases considerably. I'll give it a shot but I'm still very
> interested in utilizing the extra three SCSI disks I have at my disposal.

Then set those disks up in a RAID configuration; at the very least you can try
moving the WAL to a separate disk.

> Database integrity is of utmost importance and so is speed. I know there
> are tradeoffs but I really do think that moving the data to a separate
> disk will help. Can you please tell me how to set this up? Thanks a lot!
>
> Steve

Version 7.5 will give you the tablespaces feature. In the meantime you
can spread your tables and/or indexes across your disks using symlinks: move
the file associated with a table or index to another disk and replace it with
a link.
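
To find which files belong to a table, something like the following catalog
queries should work. The table name big_table is a placeholder. My assumptions:
the files live in the data directory under base/ followed by the database OID,
each file is named after its relfilenode, tables larger than 1GB are split into
extra segments with the suffixes .1, .2 and so on, and the server must be
stopped while files are moved. Commands that rewrite a table or index (CLUSTER
and REINDEX, for instance) create a new file in the original directory, so the
link would have to be redone afterwards.

```sql
-- OID of the current database: the directory name under base/.
SELECT oid FROM pg_database WHERE datname = current_database();

-- File name of the table itself, plus its TOAST table (where large
-- 'text' values are stored), if it has one.
SELECT c.relname,
       c.relfilenode,
       t.relname     AS toast_table,
       t.relfilenode AS toast_filenode
FROM pg_class c
LEFT JOIN pg_class t ON t.oid = c.reltoastrelid
WHERE c.relname = 'big_table';

-- File names of the indexes defined on that table.
SELECT i.relname, i.relfilenode
FROM pg_index x
JOIN pg_class i ON i.oid = x.indexrelid
JOIN pg_class c ON c.oid = x.indrelid
WHERE c.relname = 'big_table';
```

Take a full backup before moving anything.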


Regards
Gaetano Mendola



Re: Utilizing multiple disks

From
Steve
Date:
Gaetano,

Thanks for your reply.

Gaetano Mendola wrote:

> Are you running queries that use the LIKE operator? If so, did you define
> an index on that column using the right operator class?

Yes I do use the like operator to perform queries on a few 'varchar' and
'text' fields. I have indexed almost every column in the table (because
the client application allows complex searches on multiple columns and
each column has an equal chance of being used in the 'WHERE' condition)

> Are you querying on a calculated field? If so, you need
> to construct a sort of materialized view.

Hmm not sure what you mean by a calculated field. Since I'm not sure
what it means, I'm probably not using it. I'm definitely not running any
functions on columns to get an accumulated result back etc.

> If you are on Linux, did you mount your data partition with the noatime
> option?

Yes I'm on linux and am using the ext3 filesystem. I am just about to
mount the data partition with the noatime option. Docos do say that the
speed increases considerably. I'll give it a shot but I'm still very
interested in utilizing the extra three SCSI disks I have at my disposal.

> Splitting your tables across multiple disks is usually the last optimization
> step. Are you sure you have already reached the bottleneck of your system?

Actually IO is currently the only bottleneck for my system. I was using
a single drive and had optimized it as much as I could (used indexes,
partitioned tables horizontally on old dates etc) but I thought it would
help me in the long run if I could store the actual physical 'table
files' on separate disks (as the rule says: more spindles == parallel
IO). My current database basically has quite a few tables, out of which
only four are the 'largest'. Each table is completely independent (I
don't have a relational database setup because I didn't need it in this
case) and is accessed simultaneously by around 10 users at one time.
Each table also has almost all of its columns indexed. Some of these
columns are of type 'varchar', 'text' or 'double'. I am aware that the
'text' fields are stored internally in a separate table. Does this
complicate things? Is it as easy as moving the files (I don't know which
ones though - please help me out here) to an independent drive and then
creating symbolic links to them in the '/opt/postgresql/data/...'
directory? How do I achieve this without losing any of the data?

Database integrity is of utmost importance and so is speed. I know there
are tradeoffs but I really do think that moving the data to a separate
disk will help. Can you please tell me how to set this up? Thanks a lot!

Steve