Thread: [GENERAL] Function to return per-column counts?

[GENERAL] Function to return per-column counts?

From
Seamus Abshere
Date:
hey,

Does anybody have a function lying around (preferably pl/pgsql) that
takes a table name and returns coverage counts?

e.g.

#> select * from column_counts('cats'::regclass);
 column_name | all_count | present_count | null_count | coverage
-------------+-----------+---------------+------------+----------
 name        |       300 |           100 |        200 |     0.66

Thanks!
Seamus

--
Seamus Abshere, SCEA
https://www.faraday.io
https://github.com/seamusabshere
https://linkedin.com/in/seamusabshere


-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general

Re: [GENERAL] Function to return per-column counts?

From
Tomas Vondra
Date:

On 09/28/2017 04:34 PM, Seamus Abshere wrote:
> hey,
> 
> Does anybody have a function lying around (preferably pl/pgsql) that
> takes a table name and returns coverage counts?
> 

What is "coverage count"?

cheers

-- 
Tomas Vondra                  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



Re: [GENERAL] Function to return per-column counts?

From
John McKown
Date:
On Thu, Sep 28, 2017 at 12:15 PM, Tomas Vondra <tomas.vondra@2ndquadrant.com> wrote:
> On 09/28/2017 04:34 PM, Seamus Abshere wrote:
> > hey,
> >
> > Does anybody have a function lying around (preferably pl/pgsql) that
> > takes a table name and returns coverage counts?
> >
>
> What is "coverage count"?

I'm guessing it's what is described here: https://www.red-gate.com/blog/sql-cover

IIUC, this is "code coverage" for things kept in your RDBMS, such as triggers, procedures, and other "code" items which are implicitly part of your application code.

--
I just child proofed my house.
But the kids still manage to get in.


Maranatha! <><
John McKown

Re: [GENERAL] Function to return per-column counts?

From
Seamus Abshere
Date:
> > > Does anybody have a function lying around (preferably pl/pgsql) that
> > > takes a table name and returns coverage counts?
> >
> > What is "coverage count"?

Ah, I should have explained better. I meant how much of a column is
null.

Basically you have to

0. count how many total records in a table
1. discover the column names in a table
2. for each column name, count how many nulls and subtract from total
count

If nobody has one written, I'll write one and blog it.
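
A minimal PL/pgSQL sketch of the three steps above, not taken from the thread. The function name column_counts and its output columns follow the example in the first message; treating coverage as present_count / all_count is an assumption, since the thread does not define it. The function scans the table once per column, so it may be slow on large tables.

```sql
-- Hypothetical column_counts(): exact per-column null counts for one table.
CREATE OR REPLACE FUNCTION column_counts(tbl regclass)
RETURNS TABLE (
    column_name   name,
    all_count     bigint,
    present_count bigint,
    null_count    bigint,
    coverage      numeric
)
LANGUAGE plpgsql AS $$
DECLARE
    col name;
BEGIN
    -- Step 0: total number of rows in the table.
    -- A regclass value renders as a properly quoted table name.
    EXECUTE format('SELECT count(*) FROM %s', tbl) INTO all_count;

    -- Step 1: user columns (attnum > 0), skipping dropped ones.
    FOR col IN
        SELECT a.attname
          FROM pg_attribute a
         WHERE a.attrelid = tbl
           AND a.attnum > 0
           AND NOT a.attisdropped
         ORDER BY a.attnum
    LOOP
        -- Step 2: count(column) counts only non-null values.
        EXECUTE format('SELECT count(%I) FROM %s', col, tbl)
           INTO present_count;

        column_name := col;
        null_count  := all_count - present_count;
        -- Assumption: coverage = share of non-null values; NULL for an empty table.
        coverage    := round(present_count::numeric / NULLIF(all_count, 0), 2);
        RETURN NEXT;
    END LOOP;
END;
$$;
```

It would be called as in the original request: SELECT * FROM column_counts('cats'::regclass);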

Thanks!
Seamus

PS. In a similar vein, we published
http://blog.faraday.io/how-to-do-histograms-in-postgresql/ which gives
PL/pgSQL so you can do:

SELECT * FROM histogram($table_name_or_subquery, $column_name)

--
Seamus Abshere, SCEA
https://www.faraday.io
https://github.com/seamusabshere
https://linkedin.com/in/seamusabshere



Re: [GENERAL] Function to return per-column counts?

From
Melvin Davidson
Date:


On Thu, Sep 28, 2017 at 3:31 PM, Seamus Abshere <seamus@abshere.net> wrote:
> > > > Does anybody have a function lying around (preferably pl/pgsql) that
> > > > takes a table name and returns coverage counts?
> > >
> > > What is "coverage count"?
>
> Ah, I should have explained better. I meant how much of a column is
> null.
>
> Basically you have to
>
> 0. count how many total records in a table
> 1. discover the column names in a table
> 2. for each column name, count how many nulls and subtract from total
> count
>
> If nobody has one written, I'll write one and blog it.

I can't really do the full query for you, but the following should be able to give you a head start:

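-- Planner statistics per column (estimates from the last ANALYZE).
-- By default, reading pg_statistic directly requires superuser privileges.
SELECT c.relname     AS table,
       a.attname     AS column,
       a.attnum      AS colnum,
       s.stanullfrac AS pct_null,   -- fraction of NULLs, 0.0 to 1.0
       s.stadistinct
  FROM pg_class c
  JOIN pg_attribute a ON a.attrelid = c.oid
  JOIN pg_statistic s ON (s.starelid = c.oid AND s.staattnum = a.attnum)
 WHERE c.relname = 'your_table_name'
   AND a.attnum > 0
 ORDER BY a.attnum;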
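
As a possible variation on the query above, the same statistics are also exposed through the pg_stats view, which ordinary users can read for tables they have access to. The sketch below combines its null_frac with pg_class.reltuples to approximate the counts from the original request. The schema name 'public', the table name, and the output column names are placeholders, and every figure is an estimate that depends on a recent ANALYZE.

```sql
-- Approximate per-column null coverage from planner statistics.
-- 'public' and 'your_table_name' are placeholders.
-- Inheritance parents may show two rows per column (see pg_stats.inherited).
SELECT s.attname                                      AS column_name,
       c.reltuples::bigint                            AS approx_all_count,
       round(c.reltuples * (1 - s.null_frac))::bigint AS approx_present_count,
       round(c.reltuples * s.null_frac)::bigint       AS approx_null_count,
       round((1 - s.null_frac)::numeric, 2)           AS coverage
  FROM pg_stats s
  JOIN pg_namespace n ON n.nspname = s.schemaname
  JOIN pg_class c     ON c.relnamespace = n.oid
                     AND c.relname = s.tablename
 WHERE s.schemaname = 'public'
   AND s.tablename  = 'your_table_name'
 ORDER BY s.attname;
```

This avoids scanning the table, at the cost of exactness; columns that have never been analyzed do not appear in the result at all.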

--
Melvin Davidson
I reserve the right to fantasize.  Whether or not you
wish to share my fantasy is entirely up to you.