Обсуждение: [HACKERS] Explicit relation name in VACUUM VERBOSE log
Hi all, Currently vacuum verbose outputs vacuum logs as follows. The first log message INFO: vacuuming "public.hoge" writes the relation name with schema name but subsequent vacuum logs output only relation name without schema name. I've encountered a situation where there are some same name tables in different schemas and the concurrent vacuum logs made me hard to distinguish tables. Is there any reasons why we don't write an explicit name in vacuum verbose logs? If not, can we add schema names to be more clearly? =# vacuum verbose hoge; INFO: vacuuming "public.hoge" INFO: index "hoge_idx" now contains 100000 row versions in 276 pages DETAIL: 0 index row versions were removed. 0 index pages have been deleted, 0 are currently reusable. CPU: user: 0.00 s, system: 0.00 s, elapsed: 0.00 s. INFO: "hoge": found 0 removable, 108 nonremovable row versions in 1 out of 443 pages DETAIL: 0 dead row versions cannot be removed yet, oldest xmin: 559 There were 0 unused item pointers. Skipped 0 pages due to buffer pins, 0 frozen pages. 0 pages are entirely empty. CPU: user: 0.00 s, system: 0.00 s, elapsed: 0.00 s. VACUUM Attached patch and will add to commit fest. Regards, -- Masahiko Sawada NIPPON TELEGRAPH AND TELEPHONE CORPORATION NTT Open Source Software Center -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Вложения
On Tue, Aug 15, 2017 at 10:27 AM, Masahiko Sawada <sawada.mshk@gmail.com> wrote: > Currently vacuum verbose outputs vacuum logs as follows. The first log > message INFO: vacuuming "public.hoge" writes the relation name with > schema name but subsequent vacuum logs output only relation name > without schema name. I've encountered a situation where there are some > same name tables in different schemas and the concurrent vacuum logs > made me hard to distinguish tables. Is there any reasons why we don't > write an explicit name in vacuum verbose logs? If not, can we add > schema names to be more clearly? That's definitely a good idea. lazy_vacuum_rel() uses in one place dbname.schname.relname for autovacuum. This is an inconsistent bit, but that's not really worth changing and there is always log_line_prefix = '%d'. In vacuum_rel()@vacuum.c, there are a couple of logs that could be improved as well with the schema name. -- Michael
Michael Paquier wrote: > On Tue, Aug 15, 2017 at 10:27 AM, Masahiko Sawada <sawada.mshk@gmail.com> wrote: > > Currently vacuum verbose outputs vacuum logs as follows. The first log > > message INFO: vacuuming "public.hoge" writes the relation name with > > schema name but subsequent vacuum logs output only relation name > > without schema name. I've encountered a situation where there are some > > same name tables in different schemas and the concurrent vacuum logs > > made me hard to distinguish tables. Is there any reasons why we don't > > write an explicit name in vacuum verbose logs? If not, can we add > > schema names to be more clearly? > > That's definitely a good idea. lazy_vacuum_rel() uses in one place > dbname.schname.relname for autovacuum. This is an inconsistent bit, > but that's not really worth changing and there is always > log_line_prefix = '%d'. Worth keeping in mind that INFO messages do not normally go to the server log, but rather only to the client. If it were a problem at the server side, you could also suggest adding %p to the log line prefix to disambiguate. Maybe the scenario where this is a real problem is vacuumdb -j ... > In vacuum_rel()@vacuum.c, there are a couple of logs that could be > improved as well with the schema name. I agree that there's a lot of room for improvement there. If I'm allowed some scope creep, I'd say that gathering detailed vacuum info from both autovacuum and user-invoked vacuum in a central place would be very welcome. -- Álvaro Herrera https://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
On Tue, Aug 15, 2017 at 11:14 AM, Alvaro Herrera <alvherre@2ndquadrant.com> wrote: >> In vacuum_rel()@vacuum.c, there are a couple of logs that could be >> improved as well with the schema name. > > I agree that there's a lot of room for improvement there. If I'm > allowed some scope creep, I'd say that gathering detailed vacuum info > from both autovacuum and user-invoked vacuum in a central place would be > very welcome. Hm. I am not sure what you have in mind here. The existing logs are spread all other the place, and it is useful to VERBOSE information when things are being run, bot just once a run has been done (if what you mean here is to store vacuum verbose-related stats for each relation in memory and then spawn it all at once, or we could log this info and also store them). -- Michael
Michael Paquier wrote: > On Tue, Aug 15, 2017 at 11:14 AM, Alvaro Herrera > <alvherre@2ndquadrant.com> wrote: > >> In vacuum_rel()@vacuum.c, there are a couple of logs that could be > >> improved as well with the schema name. > > > > I agree that there's a lot of room for improvement there. If I'm > > allowed some scope creep, I'd say that gathering detailed vacuum info > > from both autovacuum and user-invoked vacuum in a central place would be > > very welcome. > > Hm. I am not sure what you have in mind here. I'm thinking that this data is useful to analyze as a stream of related events, rather than as individual data points. Grepping logs in order to extract the numbers is lame and slow. If you additionally have to mix data that comes from vacuumdb with the autovacuum data from the server log, that's much worse. Maybe you remember this thread? https://www.postgresql.org/message-id/flat/509300F7.5000803%402ndQuadrant.com I'm thinking it would be damn convenient to have both user-invoked VACUUM and autovacuum emit some sort of event data which is saved somewhere for later analysis. I *did* mention this was scope creep ;-) -- Álvaro Herrera https://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
On Tue, Aug 15, 2017 at 12:45 PM, Alvaro Herrera <alvherre@2ndquadrant.com> wrote: > I'm thinking that this data is useful to analyze as a stream of related > events, rather than as individual data points. Grepping logs in order to > extract the numbers is lame and slow. If you additionally have to mix > data that comes from vacuumdb with the autovacuum data from the server > log, that's much worse. Maybe you remember this thread? > https://www.postgresql.org/message-id/flat/509300F7.5000803%402ndQuadrant.com > I'm thinking it would be damn convenient to have both user-invoked > VACUUM and autovacuum emit some sort of event data which is saved > somewhere for later analysis. Ah... A tracker for the history activity. With these kind of things there need to be a careful design to control data retention as events keep piling up. Yeah that would be useful to save CPU grepping for dedicated logs. And we could just have an API to a history table to which is sent a timestamp, an event name and an object like a JSON blob or a text array. The stats collector could do the cleanup of the oldest records by maintaining a counter to know the number of records it should keep around, defined by a GUC, or a duration to allow retention of history for this period, say the last X days of events. For vacuum logs, it would be possible to get things done with saving this information into a private memory area, and then offer a hook to let callers do what they want with this data... Not extensible and ugly, but that would do the work. Still I don't like that. -- Michael
On Tue, Aug 15, 2017 at 11:14 AM, Alvaro Herrera <alvherre@2ndquadrant.com> wrote: > Michael Paquier wrote: >> On Tue, Aug 15, 2017 at 10:27 AM, Masahiko Sawada <sawada.mshk@gmail.com> wrote: >> > Currently vacuum verbose outputs vacuum logs as follows. The first log >> > message INFO: vacuuming "public.hoge" writes the relation name with >> > schema name but subsequent vacuum logs output only relation name >> > without schema name. I've encountered a situation where there are some >> > same name tables in different schemas and the concurrent vacuum logs >> > made me hard to distinguish tables. Is there any reasons why we don't >> > write an explicit name in vacuum verbose logs? If not, can we add >> > schema names to be more clearly? >> >> That's definitely a good idea. lazy_vacuum_rel() uses in one place >> dbname.schname.relname for autovacuum. This is an inconsistent bit, >> but that's not really worth changing and there is always >> log_line_prefix = '%d'. > > Worth keeping in mind that INFO messages do not normally go to the > server log, but rather only to the client. If it were a problem at the > server side, you could also suggest adding %p to the log line prefix to > disambiguate. Maybe the scenario where this is a real problem is > vacuumdb -j ... Yeah, the situation I encountered is that. Invoke vaucuumdb -j and save its logs to check later. Regards, -- Masahiko Sawada NIPPON TELEGRAPH AND TELEPHONE CORPORATION NTT Open Source Software Center
On 15 August 2017 at 02:27, Masahiko Sawada <sawada.mshk@gmail.com> wrote: > Is there any reasons why we don't > write an explicit name in vacuum verbose logs? None. Sounds like a good idea. > If not, can we add > schema names to be more clearly? Yes, we can. I'm not sure why you would do this only for VACUUM though? I see many messages in various places that need same treatment I would also be inclined to do this by changing only the string presented, not the actual message string. e.g. replace RelationGetRelationName() with RelationGetOptionallyQualifiedRelationName() and then control whether we include this new behaviour with log_qualified_object_names = on | off -- Simon Riggs http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
On Tue, Aug 22, 2017 at 3:23 PM, Simon Riggs <simon@2ndquadrant.com> wrote: > On 15 August 2017 at 02:27, Masahiko Sawada <sawada.mshk@gmail.com> wrote: > >> Is there any reasons why we don't >> write an explicit name in vacuum verbose logs? > > None. Sounds like a good idea. > >> If not, can we add >> schema names to be more clearly? > > Yes, we can. I'm not sure why you would do this only for VACUUM > though? I see many messages in various places that need same treatment Yeah, I was thinking that too. But since there are a lot of message that output relation name I proposed this for the first step. > I would also be inclined to do this by changing only the string > presented, not the actual message string. +1 > e.g. > replace RelationGetRelationName() with > RelationGetOptionallyQualifiedRelationName() > and then control whether we include this new behaviour with > log_qualified_object_names = on | off Is there any case where we don't want to get non-qualified object names? If users want to get the same log message as what they got so far, it would be better to have a GUC that allows us to switch between the existing behavior and the forcibly logging qualified object names. Regards, -- Masahiko Sawada NIPPON TELEGRAPH AND TELEPHONE CORPORATION NTT Open Source Software Center
On Wed, Aug 23, 2017 at 10:59 AM, Masahiko Sawada <sawada.mshk@gmail.com> wrote: > On Tue, Aug 22, 2017 at 3:23 PM, Simon Riggs <simon@2ndquadrant.com> wrote: >> e.g. >> replace RelationGetRelationName() with >> RelationGetOptionallyQualifiedRelationName() >> and then control whether we include this new behaviour with >> log_qualified_object_names = on | off > > Is there any case where we don't want to get non-qualified object > names? If users want to get the same log message as what they got so > far, it would be better to have a GUC that allows us to switch between > the existing behavior and the forcibly logging qualified object names. I can imagine plenty of cases where providing more information is valuable, but not really any where it makes more sense to provide less information, so -1 for a GUC to control such behavior. I would imagine that people are not going to set it anyway. A RangeVar may not set the schema_name, so I would suggest to rely on that to decide if the error messages show the schema name or name. Still we are only talking about two messages in the vacuum code paths, which are the ones close to the checks where is assigned the OID of the relation with a RangeVar. -- Michael
On 23 August 2017 at 08:18, Michael Paquier <michael.paquier@gmail.com> wrote: > On Wed, Aug 23, 2017 at 10:59 AM, Masahiko Sawada <sawada.mshk@gmail.com> wrote: >> On Tue, Aug 22, 2017 at 3:23 PM, Simon Riggs <simon@2ndquadrant.com> wrote: >>> e.g. >>> replace RelationGetRelationName() with >>> RelationGetOptionallyQualifiedRelationName() >>> and then control whether we include this new behaviour with >>> log_qualified_object_names = on | off >> >> Is there any case where we don't want to get non-qualified object >> names? If users want to get the same log message as what they got so >> far, it would be better to have a GUC that allows us to switch between >> the existing behavior and the forcibly logging qualified object names. > > I can imagine plenty of cases where providing more information is > valuable, but not really any where it makes more sense to provide less > information, so -1 for a GUC to control such behavior. I would imagine > that people are not going to set it anyway. A RangeVar may not set the > schema_name, so I would suggest to rely on that to decide if the error > messages show the schema name or name. We can put in the GUC if there are objections about backwards compaibility, so I am OK to leave it out. > Still we are only talking about > two messages in the vacuum code paths, which are the ones close to the > checks where is assigned the OID of the relation with a RangeVar. The proposal is to change all log messages so we have consistency, not just VACUUM. -- Simon Riggs http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Alvaro Herrera <alvherre@2ndquadrant.com> writes: > Michael Paquier wrote: >> Hm. I am not sure what you have in mind here. > I'm thinking that this data is useful to analyze as a stream of related > events, rather than as individual data points. Grepping logs in order to > extract the numbers is lame and slow. Yes. And there is a bigger issue here, which is that the output of VACUUM VERBOSE is meant to be sent to the client for *human* readability. (As you noted, INFO-level messages don't normally go to the server log in the first place, and that's not by accident.) Repeating the full table name in every line will be really annoying for that primary use-case. I am not sure what we want to do to address Masahiko-san's use-case, but ideally his workflow wouldn't involve log-scraping at all. It definitely shouldn't involve depending on INFO messages --- for one thing, what happens when those get translated? regards, tom lane
On Thu, Aug 24, 2017 at 11:35 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote: > Alvaro Herrera <alvherre@2ndquadrant.com> writes: >> Michael Paquier wrote: >>> Hm. I am not sure what you have in mind here. > >> I'm thinking that this data is useful to analyze as a stream of related >> events, rather than as individual data points. Grepping logs in order to >> extract the numbers is lame and slow. > > Yes. And there is a bigger issue here, which is that the output of > VACUUM VERBOSE is meant to be sent to the client for *human* readability. > (As you noted, INFO-level messages don't normally go to the server log > in the first place, and that's not by accident.) Repeating the full table > name in every line will be really annoying for that primary use-case. > I am not sure what we want to do to address Masahiko-san's use-case, but > ideally his workflow wouldn't involve log-scraping at all. The use-case I had is that I run vacuumdb *without -j option* and save all verbose logs into a text file, and then checking it later. I said vacuumdb with -j option before but it was wrong. It cannot happen two vacuum verbose logs are emitted mixed together even if -j option is specified. I sometimes search a particular table/index from the logs but also in that case it was hard to distinguish logs. This is still not primary case though. Regards, -- Masahiko Sawada NIPPON TELEGRAPH AND TELEPHONE CORPORATION NTT Open Source Software Center
On Tue, Aug 22, 2017 at 2:23 AM, Simon Riggs <simon@2ndquadrant.com> wrote: > Yes, we can. I'm not sure why you would do this only for VACUUM > though? I see many messages in various places that need same treatment I'm skeptical about the idea of doing this too generally. rhaas=> select * from foo; ERROR: permission denied for relation foo Do we really want to start saying public.foo in all such error messages? To me, that's occasionally helpful but more often just useless chatter. One problem with making this behavior optional is that we'd then need two separate translatable strings in every case, one saying "table %s has problems" and the other saying "table %s.%s has problems". Maybe we could avoid that via some clever trick, but that's how we're doing it in some existing cases. I have a feeling we're making a small patch with a narrow goal into a giant patch for which everyone has a different goal. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
> On 29 Aug 2017, at 17:21, Robert Haas <robertmhaas@gmail.com> wrote: > > On Tue, Aug 22, 2017 at 2:23 AM, Simon Riggs <simon@2ndquadrant.com> wrote: >> Yes, we can. I'm not sure why you would do this only for VACUUM >> though? I see many messages in various places that need same treatment > > I'm skeptical about the idea of doing this too generally. > > rhaas=> select * from foo; > ERROR: permission denied for relation foo > > Do we really want to start saying public.foo in all such error > messages? To me, that's occasionally helpful but more often just > useless chatter. > > One problem with making this behavior optional is that we'd then need > two separate translatable strings in every case, one saying "table %s > has problems" and the other saying "table %s.%s has problems". Maybe > we could avoid that via some clever trick, but that's how we're doing > it in some existing cases. > > I have a feeling we're making a small patch with a narrow goal into a > giant patch for which everyone has a different goal. Based on the concerns raised in this thread which are left to address or resolve, and the fact that the patch has been without update during the CF, I’m marking this Returned with Feedback. Please re-submit a new version of the patch to a future commitfest. cheers ./daniel -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
On Mon, Oct 2, 2017 at 4:33 PM, Daniel Gustafsson <daniel@yesql.se> wrote: >> On 29 Aug 2017, at 17:21, Robert Haas <robertmhaas@gmail.com> wrote: >> >> On Tue, Aug 22, 2017 at 2:23 AM, Simon Riggs <simon@2ndquadrant.com> wrote: >>> Yes, we can. I'm not sure why you would do this only for VACUUM >>> though? I see many messages in various places that need same treatment >> >> I'm skeptical about the idea of doing this too generally. >> >> rhaas=> select * from foo; >> ERROR: permission denied for relation foo >> >> Do we really want to start saying public.foo in all such error >> messages? To me, that's occasionally helpful but more often just >> useless chatter. >> >> One problem with making this behavior optional is that we'd then need >> two separate translatable strings in every case, one saying "table %s >> has problems" and the other saying "table %s.%s has problems". Maybe >> we could avoid that via some clever trick, but that's how we're doing >> it in some existing cases. >> >> I have a feeling we're making a small patch with a narrow goal into a >> giant patch for which everyone has a different goal. > > Based on the concerns raised in this thread which are left to address or > resolve, and the fact that the patch has been without update during the CF, I’m > marking this Returned with Feedback. Please re-submit a new version of the > patch to a future commitfest. > After more thought, I'd like to focus this work on only log messages related to VACUUM. The making all logs show explicit names might not be good solution so we can leave the more generic solution for other log messages. But especially for VACUUM VERBOSE log, since the log messages could be very long when the table has multiple indices this patch would be useful for users. Since the previous patch conflicts with current HEAD, attached the updated patch. Regards, -- Masahiko Sawada NIPPON TELEGRAPH AND TELEPHONE CORPORATION NTT Open Source Software Center -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers