Re: Efficient use of space in large table?

Поиск
Список
Период
Сортировка
От Manfred Koizar
Тема Re: Efficient use of space in large table?
Дата
Msg-id 535biug0vv0v2olbhpt46nr8281ktqijtq@4ax.com
обсуждение исходный текст
Ответ на Efficient use of space in large table?  (Josh Jore <josh@greentechnologist.org>)
Ответы Re: Efficient use of space in large table?  (Curt Sampson <cjs@cynic.net>)
Re: Efficient use of space in large table?  (Josh Jore <josh@greentechnologist.org>)
Список pgsql-general
On Thu, 4 Jul 2002 21:43:10 -0500 (CDT), Josh Jore
<josh@greentechnologist.org> wrote:
>I was just wondering - I've got two large tables and I was wondering
>if there is anyway to shrink them somewhat. I imagined compression for
>non-indexed columns or something. Is varchar or char more efficient than
>text?
>
Josh,

first of all, text is ok.  You might want to store NULL instead of ''
to squeeze out a few bytes here and there.

Now I have even more questions instead of answers :-)

PG version? OS?
Do you have lots of UPDATEs/DELETEs?
Do you ANALYZE regularly?
Please show us the outputs of
    VACUUM VERBOSE sospeople;
    VACUUM VERBOSE votes;

>
>A size summary
>        relname         | size  |   type    | reltuples
>------------------------+-------+---------- +----------
> sospeople              | 599MB | table     |    2M
> sospeople_fn           | 71MB  | index     |    2M
> sospeople_ln           | 73MB  | index     |    2M
> sospeople_zip          | 73MB  | index     |    2M
> votes                  | 937MB | table     |   15M

Could you show us the tuple counts in your relations, just like the
wild guess I have inserted?

>
>
>                Table "sospeople"
>    Attribute     |  Type   |      Modifier
>------------------+---------+---------------------
> sosid            | text    | not null default ''
> countyname       | text    | not null
> firstname        | text    |
> middlename       | text    |
> lastname         | text    |
> suffix           | text    |
> homephone        | text    |
> registrationdate | text    |
> birthyear        | integer |
> status           | text    |
> precinctcode     | text    |
> housenumber      | text    |
> streetname       | text    |
> unittype         | text    |
> unit             | text    |
> address2         | text    |
> city             | text    |
> state            | text    |
> zipcode          | text    |

What are the average sizes of these text columns?  If there are long
repeated values (e.g. countyname), it might help do pull these out
into a separate table

    CREATE TABLE county (id serial, name text);

and replace
> countyname       | text    | not null
by
    county_id NOT NULL REFERENCES county

>Indices: sospeople_fn,
>         sospeople_ln,
>         sospeople_zip
>
Also post the results of
    \d sospeople_fn
    \d sospeople_ln
    \d sospeople_zip

>             Table "votes"
> Attribute  |     Type      | Modifier
>------------+---------------+----------
> sosid      | character(10) |
> electionid | integer       |
> votetype   | character(1)  |

No index here?
Using sosno INT instead of sosid CHAR(10) could save you (vaguely
estimated) up to 10% of space for this table.  OTOH you would have to
insert sosno into sospeople, so this would only be a win, if votes has
far more rows than sospeople.

>I'm sort of just grasping at straws for something to get the data
>smaller.

Is your problem really related to space, or to speed?

Servus
 Manfred



В списке pgsql-general по дате отправления:

Предыдущее
От: Do Not Spam
Дата:
Сообщение: Time bug fixed?
Следующее
От: "Marc G. Fournier"
Дата:
Сообщение: Re: I am being interviewed by OReilly