Re: Zedstore - compressed in-core columnar storage

Поиск
Список
Период
Сортировка
От Justin Pryzby
Тема Re: Zedstore - compressed in-core columnar storage
Дата
Msg-id 20190820020425.GX11185@telsasoft.com
обсуждение исходный текст
Ответ на Re: Zedstore - compressed in-core columnar storage  (Alexandra Wang <lewang@pivotal.io>)
Ответы Re: Zedstore - compressed in-core columnar storage  (Heikki Linnakangas <hlinnaka@iki.fi>)
Список pgsql-hackers
On Mon, Aug 19, 2019 at 04:15:30PM -0700, Alexandra Wang wrote:
> On Sun, Aug 18, 2019 at 12:35 PM Justin Pryzby <pryzby@telsasoft.com> wrote:
> 
> >  . I was missing a way to check for compression ratio;
> 
> Here are the ways to check compression ratio for zedstore:
> 
> Table level:
> SELECT sum(uncompressedsz::numeric)/sum(totalsz) AS compratio FROM pg_zs_btree_pages(<tablename>);

postgres=# SELECT sum(uncompressedsz::numeric)/sum(totalsz) AS compratio FROM
pg_zs_btree_pages('child.cdrs_huawei_pgwrecord_2019_07_01');
compratio | 4.2730304163521529

For a fair test, I created a separate ZFS tablspace for storing just a copy of
that table.

ts=# CREATE TABLE test TABLESPACE testcomp AS SELECT * FROM child.cdrs_huawei_pgwrecord_2019_07_01;
SELECT 39933381
Time: 882417.775 ms (14:42.418)

zfs/testJTP20190819  compressratio  6.01x     -
zfs/testJTP20190819  compression    gzip-1    inherited from zfs

> Per column level:
> select attno, count(*), sum(uncompressedsz::numeric)/sum(totalsz) as compratio from pg_zs_btree_pages(<tablename>)
groupby attno order by attno;
 

Order by 3; I see we have SOME highly compressed columns.

It's still surprising to me that's as low as it is, given their content: phone
numbers and IPv4 addresses in text form, using characters limited to
[[:digit:].]

(I realize we can probably save space using inet type.)

     0 |  4743 | 1.00000000000000000000
    32 | 21912 | 1.05953637381493823513
    80 | 36441 |     1.2416446300175039
     4 | 45059 |     1.3184106811322728
    83 | 45059 |     1.3184106811322728
    52 | 39208 |     1.3900788061770992
...
    74 |  3464 |    10.8258665101057364
    17 |  3535 |    10.8776086243096534
     3 |  7092 |    11.0388009154683678
    11 |  3518 |    11.4396055611832109
    65 |  3333 |    14.6594723104237634
    35 | 14077 |    15.1642131499381887
...
    43 |  1601 |    21.4200106784573211
    79 |  1599 |    21.4487670806076829
    89 |  1934 |    23.6292134031933401
    33 |  1934 |    23.6292134031933401

It seems clear the columns with high n_distinct have low compress ratio, and
columns with high compress ratio are those with n_distinct=1...

CREATE TEMP TABLE zs AS SELECT zs.*, n_distinct, avg_width, a.attname FROM (SELECT
'child.cdrs_huawei_pgwrecord_2019_07_01'::regclasst)t , LATERAL (SELECT attno, count(*),
sum(uncompressedsz::numeric)/sum(totalsz)AS compratio FROM pg_zs_btree_pages(t) GROUP BY attno)zs , pg_attribute a,
pg_classc, pg_stats s WHERE a.attrelid=t AND a.attnum=zs.attno AND c.oid=a.attrelid AND c.relname=s.tablename AND
s.attname=a.attname;

 n_distinct |       compratio        
------------+------------------------
     217141 |     1.2416446300175039
     154829 |     1.5306062496764190
     144486 |     1.3900788061770992
     128334 |     1.5395022739568842
     121324 |     1.4005533187886683
      86341 |     1.6262709389296389
      84073 |     4.4379336418590519
      65413 |     5.1890181028038757
      63703 |     5.5029855093836425
      63637 |     5.3648468796642262
      46450 |     1.3184106811322728
      46450 |     1.3184106811322728
      43029 |     1.8003513772661308
      39363 |     1.5845730687475706
      36720 |     1.4751147557399539
      36445 |     1.8403087513759131
      36445 |     1.5453935268318613
      11455 | 1.05953637381493823513
       2862 |     9.8649823666870671
       2625 |     2.3573614181847621
       1376 |     1.7895024285340428
       1335 |     2.2812551964262787
        807 |     7.1192324141359373
        610 |     7.9373623460089360
         16 |    11.4396055611832109
         10 |     5.5429763442365557
          7 |     5.0440578041440675
          7 |     5.2000132813261135
          4 |     6.9741514753325536
          4 |     4.2872818036896340
          3 |     1.9080838412634827
          3 |     2.9915954457453485
          3 |     2.3056387009407882
          2 |    10.8776086243096534
          2 |     5.5950929307378287
          2 |    18.5796576388128741
          2 |    10.8258665101057364
          2 |     9.1112820658021406
          2 |     3.4986057630739795
          2 |     4.6250999234025238
          2 |    11.0388009154683678
          1 |    15.1642131499381887
          1 |     2.8855860118178798
          1 |    23.6292134031933401
          1 |    21.4200106784573211
[...]

> > it looks like zedstore
> >    with lz4 gets ~4.6x for our largest customer's largest table.  zfs using
> >    compress=gzip-1 gives 6x compression across all their partitioned
> > tables,
> >    and I'm surprised it beats zedstore .
> >
> 
> What kind of tables did you use? Is it possible to give us the schema
> of the table? Did you perform 'INSERT INTO ... SELECT' or COPY?

I did this:

|time ~/src/postgresql.bin/bin/pg_restore /srv/cdrperfbackup/ts/final/child.cdrs_huawei_pgwrecord_2019_07_01 -f-
|PGOPTIONS='-cdefault_table_access_method=zedstore'psql --port 5678 postgres --host /tmp
 
...
COPY 39933381
...
real    100m25.764s

 child  | cdrs_huawei_pgwrecord_2019_07_01 | table | pryzbyj | permanent   | 8277 MB | 

postgres=# SELECT array_to_string(array_agg(format_type(atttypid, atttypmod) ||CASE WHEN attnotnull THEN ' not null'
ELSE'' END ORDER BY attnum),',') FROM pg_attribute WHERE attrelid='child.cdrs_huawei_pgwrecord_2019_07_01'::regclass
ANDattnum>0;
 
array_to_string | text not null,text,text not null,text not null,text not
null,text,text,text,boolean,text,text,text,text,text,text,text,text,text,text,text,text,text,text,text,text,text,timestamp
withouttime zone not null,bigint not null,text not null,text,text,text,text,text,text,text,text,text,text not
null,text,boolean,text,text,text,text,text,text,text,text,text,text,text,text,text,text,text,text,text,text,text,text,text,text,text,text,text,text,text,text,text,text,text,text,text,text,text,text,text,text,timestamp
withtime zone,timestamp with time zone,text,text,boolean,text,text,boolean,boolean,text not null,text not null
 




В списке pgsql-hackers по дате отправления:

Предыдущее
От: Michael Paquier
Дата:
Сообщение: Re: Cleanup isolation specs from unused steps
Следующее
От: Alvaro Herrera
Дата:
Сообщение: Re: Cleanup isolation specs from unused steps