Обсуждение: since when has pg_stat_user_indexes.idx_scan been counting?
Hi,
We have some indexes that don't seem to be used at all.
I'd like to know since when they have not been used.
That is, the time when postgres started counting to reach the number that is in pg_stat_user_indexes.idx_scan
Is there a way to retrieve that from the database ?
Cheers,
WBL
--
"Patriotism is the conviction that your country is superior to all others because you were born in it." -- George Bernard Shaw
We have some indexes that don't seem to be used at all.
I'd like to know since when they have not been used.
That is, the time when postgres started counting to reach the number that is in pg_stat_user_indexes.idx_scan
Is there a way to retrieve that from the database ?
Cheers,
WBL
--
"Patriotism is the conviction that your country is superior to all others because you were born in it." -- George Bernard Shaw
Dne 12.5.2011 17:39, Willy-Bas Loos napsal(a): > Hi, > > We have some indexes that don't seem to be used at all. > I'd like to know since when they have not been used. > That is, the time when postgres started counting to reach the number > that is in pg_stat_user_indexes.idx_scan > > Is there a way to retrieve that from the database ? Well, not really :-( You could call pg_postmaster_start_time() to get the start time, but that has two major drawbacks (1) The stats may be actually collected for much longer, because restart does not reset them. (2) If someone called pg_stat_reset(), the stats are lost but the start time remains the same. So there really is no reliable way to do detect this. In 9.1 this is not true - there's a timestamp for each database (and global stats) to keep track of the last reset. regards Tomas
Dne 12.5.2011 17:39, Willy-Bas Loos napsal(a): > Hi, > > We have some indexes that don't seem to be used at all. > I'd like to know since when they have not been used. > That is, the time when postgres started counting to reach the number > that is in pg_stat_user_indexes.idx_scan > > Is there a way to retrieve that from the database ? BTW it's really really tricky to remove indexes once they're created. What if the index is created for a single batch process that runs once a year to close the fiscal year etc? So be very careful about this. Tomas
Then, are the index scans counted in a memory variable and written at analyze time?
--
"Patriotism is the conviction that your country is superior to all others because you were born in it." -- George Bernard Shaw
On Thu, May 12, 2011 at 8:22 PM, raghu ram <raghuchennuru@gmail.com> wrote:
"Analyze" activity will update the statistics of each catalog table.--Raghu Ram
--
"Patriotism is the conviction that your country is superior to all others because you were born in it." -- George Bernard Shaw
Tomas Vondra wrote: > BTW it's really really tricky to remove indexes once they're created. > What if the index is created for a single batch process that runs once a > year to close the fiscal year etc? > True in theory. Reports that are executing something big at the end of the year fall into three categories: 1) They touch a whole lot of the data for the year first. In this case, sequential scan is likely regardless. 2) They access data similarly to regular queries, using the same indexes. 3) They have some very specific data only they touch that is retrieved with an index. You're saying to watch out for (3); I think that's not usually the case, but that's a fair thing to warn about. Even in that case, though, it may still be worth dropping the index. Year-end processes are not usually very sensitive to whether they take a little or a long time to execute. But you will be paying to maintain the index every day while it is there. -- Greg Smith 2ndQuadrant US greg@2ndQuadrant.com Baltimore, MD PostgreSQL Training, Services, and 24x7 Support www.2ndQuadrant.us "PostgreSQL 9.0 High Performance": http://www.2ndQuadrant.com/books
Greg Smith <greg@2ndquadrant.com> writes: > You're saying to watch out for (3); I think that's not usually the case, > but that's a fair thing to warn about. Even in that case, though, it > may still be worth dropping the index. Year-end processes are not > usually very sensitive to whether they take a little or a long time to > execute. But you will be paying to maintain the index every day while > it is there. Yeah. Another idea worth considering is to have the year-end processing build the index it wants, use it, drop it. It seems unlikely that it's worth maintaining an index year-round for such infrequent usage. regards, tom lane
On Thu, May 12, 2011 at 9:09 PM, Willy-Bas Loos <willybas@gmail.com> wrote:
Hi,
We have some indexes that don't seem to be used at all.
I'd like to know since when they have not been used.
That is, the time when postgres started counting to reach the number that is in pg_stat_user_indexes.idx_scan
Is there a way to retrieve that from the database ?
"Analyze" activity will update the statistics of each catalog table.
pg_postmaster_start_time --> Retrieves the Postmaster [ PostgreSQL Instance] start time
postgres=# select pg_postmaster_start_time();
--Raghu Ram
Dne 12.5.2011 22:03, Willy-Bas Loos napsal(a): > Then, are the index scans counted in a memory variable and written at > analyze time? No, I believe raghu mixed two things - stats used by the planner and stats about access to the data (how many tuples were read using an index, etc.) Stats for the planner are stored in pg_class/pg_statistic/pg_stats catalogs and are updated by ANALYZE (either manual or automatic). This is what raghu refered to, but these stats are completely useless when looking for unused indexes. Stats about access to the data (index/seq scans, cache hit ratio etc.) are stored in pg_stat_* and pg_statio_* catalogs, and are updated after running each query. AFAIK it's not a synchronous process, but when a backend finishes a query, it sends the stats to the postmaster (and postmaster updates the catalogs). Tomas
Tomas Vondra wrote: > Stats about access to the data (index/seq scans, cache hit ratio etc.) > are stored in pg_stat_* and pg_statio_* catalogs, and are updated after > running each query. AFAIK it's not a synchronous process, but when a > backend finishes a query, it sends the stats to the postmaster (and > postmaster updates the catalogs). > Description in the docs goes over this in a little more detail http://www.postgresql.org/docs/current/static/monitoring-stats.html : "The statistics collector communicates with the backends needing information (including autovacuum) through temporary files. These files are stored in the pg_stat_tmp subdirectory...When using the statistics to monitor current activity, it is important to realize that the information does not update instantaneously. Each individual server process transmits new statistical counts to the collector just before going idle; so a query or transaction still in progress does not affect the displayed totals. Also, the collector itself emits a new report at most once per PGSTAT_STAT_INTERVAL milliseconds (500 unless altered while building the server). So the displayed information lags behind actual activity. However, current-query information collected by track_activities is always up-to-date." It's not synchronous at all. The clients create a temporary file for the statistics collector and move on. The actual statistics don't get updated until the statistics collector decides enough time has passed to bother, which defaults to at most every 500ms. -- Greg Smith 2ndQuadrant US greg@2ndQuadrant.com Baltimore, MD PostgreSQL Training, Services, and 24x7 Support www.2ndQuadrant.us "PostgreSQL 9.0 High Performance": http://www.2ndQuadrant.com/books
> It's not synchronous at all. The clients create a temporary file for > the statistics collector and move on. The actual statistics don't get > updated until the statistics collector decides enough time has passed to > bother, which defaults to at most every 500ms. Really? I thought the clients send the updates using a socket, at least that's what I see in backend/postmaster/pgstat.c (e.g. in pgstat_send_bgwriter where the data are sent, and in PgstatCollectorMain where it's read from the socket and applied). But no matter how exactly this works, this kind of stats has nothing to do with ANALYZe - it's asynchronously updated every time you run a query. regards Tomas