Обсуждение: log_hostname and pg_stat_activity
Somehow I fantasized that log_hostname would also turn pg_stat_activity.client_addr into names instead of IP addresses. It doesn't, but perhaps it should? It would be nice to be able to configure an IP-address free setup. Or would it be worth having a separate configuration parameter for that?
On Thu, Dec 23, 2010 at 22:09, Peter Eisentraut <peter_e@gmx.net> wrote: > Somehow I fantasized that log_hostname would also turn > pg_stat_activity.client_addr into names instead of IP addresses. It > doesn't, but perhaps it should? It would be nice to be able to > configure an IP-address free setup. Or would it be worth having a > separate configuration parameter for that? It should certainly be renamed to something else if it does both, but I don't see the point of having two separate parameters between them. As long as you can use a cached version of the lookup, you're only paying the price once, after all... However, pg_stat_activity.client_addr is an inet field, not a text string, so you'd have to invent a separate field for it I think... -- Magnus Hagander Me: http://www.hagander.net/ Work: http://www.redpill-linpro.com/
On tor, 2010-12-23 at 22:21 +0100, Magnus Hagander wrote: > On Thu, Dec 23, 2010 at 22:09, Peter Eisentraut <peter_e@gmx.net> wrote: > > Somehow I fantasized that log_hostname would also turn > > pg_stat_activity.client_addr into names instead of IP addresses. It > > doesn't, but perhaps it should? It would be nice to be able to > > configure an IP-address free setup. Or would it be worth having a > > separate configuration parameter for that? > > It should certainly be renamed to something else if it does both, but > I don't see the point of having two separate parameters between them. > As long as you can use a cached version of the lookup, you're only > paying the price once, after all... > > However, pg_stat_activity.client_addr is an inet field, not a text > string, so you'd have to invent a separate field for it I think... Here is a patch that adds a client_hostname field to pg_stat_activity. It takes the hostname if it is available either by having log_hostname set or if the pg_hba.conf processing resolved it. So I think for most people who would care about this thing, it would "just work". I'm not so sure about the pgstat.c internals. Do we need the separate buffer in shared memory, as in this patch, or could we just copy the name directly into the PgBackendStatus struct? I guess not the latter, since my compiler complains about copying a string into a volatile pointer.
Вложения
Excerpts from Peter Eisentraut's message of sáb ene 15 13:15:45 -0300 2011: > Here is a patch that adds a client_hostname field to pg_stat_activity. > It takes the hostname if it is available either by having log_hostname > set or if the pg_hba.conf processing resolved it. So I think for most > people who would care about this thing, it would "just work". > > I'm not so sure about the pgstat.c internals. Do we need the separate > buffer in shared memory, as in this patch, or could we just copy the > name directly into the PgBackendStatus struct? I guess not the latter, > since my compiler complains about copying a string into a volatile > pointer. I think you should do it like clientaddr is handled: fill a string with the value from MyProcPort, then make the PgBackendStatus point to that. BTW you need to touch BackendStatusShmemSize if you change the routine below that (but AFAIU that should be taken out). ... I'm confused ... why isn't this code broken? I don't understand why isn't clientaddr clobbered the next time someone uses the stack space, given that it's not allocated anywhere. -- Álvaro Herrera <alvherre@commandprompt.com> The PostgreSQL Company - Command Prompt, Inc. PostgreSQL Replication, Consulting, Custom Development, 24x7 support
Here is my review for this patch Submission Review ---------------- -Patch applies cleanly -Patch does not include documentation changes. At a minimum: update the table that lists what pg_stat_activity and pg_stat_replication includes in monitoring.sgml but I propose more below. -No tests are included but writing unit tests that depend on produce the same output involving the hostname of the client is not possible. Usability review ---------------- See my comments below in the testing section. The patch does do what it says but the log_hostname issue is a usability issue (it not being obvious why you get only null owhen log_hostname is turned off). Documenting it might be fine. If log_hostname were new to 9.1 I would encourage renaming it to something that implies it does more than just control logging output but I don't see this as a good enough reason to rename a parameter. I think being able to see the hostnames connections in pg_stat_activity come from is useful and it is a feature we don't already have. Feature Test ---------------- If my connection is authorized through a line in pg_hba that uses client_hostname then the column shows what I expect even with log_hostname set to off. However if I connect with a line in pg_hba that matches on an IP network then my client_hostname is always null unless log_hostname is set to true. This is consistent with the behavior you describe but I think the average user will find it a bit confusing. Having a column that is always null unless a GUC is set is less than ideal but I understand why log_hostname isn't on by default. I would be inclined to add an entry for these views in catalogs.sgml that where we can then give users a pointer that they need to set log_hostname to get anything out of this column. If I connect through unix sockets (say psql -h /tmp --port 5433) I was also expecting to see "[local]" in client_hostname but I am just getting NULL. This this lookup is static I don't see why it should be dependent on log_hostname (even with log_hostname set I'm not seeing [local]) I have not tested this with ipv6 Performance Review ------------------ The lookup is done when the connection is established not each time the view is queried. I don't see any performance issues Coding Review ------------- As Alvaro pointed out BackendStatusShmemSize should be updated. To answer his question about why clientaddr works: clientaddr is a SockAddr which is a structure not a pointer so the data gets copied by value to beentry. That won't work for strings, I have no issues with how your allocating space in beentry and then strlcpy'ing the values. I see no issues with the implementation I'm marking this as 'Waiting for Author' pending documentation changes and maybe a fix the behaviour with unix sockets Steve
Excerpts from Steve Singer's message of mar ene 18 21:24:25 -0300 2011: > Coding Review > ------------- > > As Alvaro pointed out BackendStatusShmemSize should be updated. > > To answer his question about why clientaddr works: clientaddr is a > SockAddr which is a structure not a pointer so the data gets copied by > value to beentry. That won't work for strings, I have no issues with > how your allocating space in beentry and then strlcpy'ing the values. Doh, of course. Thanks. -- Álvaro Herrera <alvherre@commandprompt.com> The PostgreSQL Company - Command Prompt, Inc. PostgreSQL Replication, Consulting, Custom Development, 24x7 support
On Tue, Jan 18, 2011 at 7:24 PM, Steve Singer <ssinger_pg@sympatico.ca> wrote: > If my connection is authorized through a line in pg_hba that uses > client_hostname then the column shows what I expect even with log_hostname > set to off. > > However if I connect with a line in pg_hba that matches on an IP network > then my client_hostname is always null unless log_hostname is set to true. > This is consistent with the behavior you describe but I think the average > user will find it a bit confusing. I agree. I'm not sure there's enough value to this feature to warrant the amount of user confusion this is likely to produce. But if we're going to do it anyway we at least need to do this: > I would be inclined to add an entry for these views in catalogs.sgml that > where we can then give users a pointer that they need to set log_hostname to > get anything out of this column. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
On sön, 2011-01-30 at 15:03 -0500, Robert Haas wrote: > On Tue, Jan 18, 2011 at 7:24 PM, Steve Singer > <ssinger_pg@sympatico.ca> wrote: > > If my connection is authorized through a line in pg_hba that uses > > client_hostname then the column shows what I expect even with > log_hostname > > set to off. > > > > However if I connect with a line in pg_hba that matches on an IP > network > > then my client_hostname is always null unless log_hostname is set to > true. > > This is consistent with the behavior you describe but I think the > average > > user will find it a bit confusing. > > I agree. I'm not sure there's enough value to this feature to warrant > the amount of user confusion this is likely to produce. What alternative behavior would you suggest?
On Sun, Jan 30, 2011 at 4:30 PM, Peter Eisentraut <peter_e@gmx.net> wrote: > On sön, 2011-01-30 at 15:03 -0500, Robert Haas wrote: >> On Tue, Jan 18, 2011 at 7:24 PM, Steve Singer >> <ssinger_pg@sympatico.ca> wrote: >> > If my connection is authorized through a line in pg_hba that uses >> > client_hostname then the column shows what I expect even with >> log_hostname >> > set to off. >> > >> > However if I connect with a line in pg_hba that matches on an IP >> network >> > then my client_hostname is always null unless log_hostname is set to >> true. >> > This is consistent with the behavior you describe but I think the >> average >> > user will find it a bit confusing. >> >> I agree. I'm not sure there's enough value to this feature to warrant >> the amount of user confusion this is likely to produce. > > What alternative behavior would you suggest? I don't know. As I said in my previous email, I'd either (a) forget the whole thing or (b) make sure that the documentation is *extremely* explicit about what the behavior is. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
On tis, 2011-01-18 at 19:24 -0500, Steve Singer wrote: > However if I connect with a line in pg_hba that matches on an IP > network then my client_hostname is always null unless log_hostname is > set to true. This is consistent with the behavior you describe but I > think the average user will find it a bit confusing. Having a column > that is always null unless a GUC is set is less than ideal but I > understand why log_hostname isn't on by default. Well, we have all these track_* variables, which also control what appears in the statistics views. After thinking about this some more, I think it might be better to be less cute and forget about the interaction with the pg_hba.conf hostname behavior. That is, the host name is set if and only if log_hostname is on. Otherwise you will for example have an inconsistency between the statistics views and the server log, unless you want to argue that we can override the log_hostname setting based on what happens in pg_hba.conf. That's just getting too weird.
On Tue, Feb 1, 2011 at 1:09 PM, Peter Eisentraut <peter_e@gmx.net> wrote: > On tis, 2011-01-18 at 19:24 -0500, Steve Singer wrote: >> However if I connect with a line in pg_hba that matches on an IP >> network then my client_hostname is always null unless log_hostname is >> set to true. This is consistent with the behavior you describe but I >> think the average user will find it a bit confusing. Having a column >> that is always null unless a GUC is set is less than ideal but I >> understand why log_hostname isn't on by default. > > Well, we have all these track_* variables, which also control what > appears in the statistics views. > > After thinking about this some more, I think it might be better to be > less cute and forget about the interaction with the pg_hba.conf hostname > behavior. That is, the host name is set if and only if log_hostname is > on. +1 for doing it that way. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
On Tue, Feb 1, 2011 at 1:33 PM, Robert Haas <robertmhaas@gmail.com> wrote: > On Tue, Feb 1, 2011 at 1:09 PM, Peter Eisentraut <peter_e@gmx.net> wrote: >> On tis, 2011-01-18 at 19:24 -0500, Steve Singer wrote: >>> However if I connect with a line in pg_hba that matches on an IP >>> network then my client_hostname is always null unless log_hostname is >>> set to true. This is consistent with the behavior you describe but I >>> think the average user will find it a bit confusing. Having a column >>> that is always null unless a GUC is set is less than ideal but I >>> understand why log_hostname isn't on by default. >> >> Well, we have all these track_* variables, which also control what >> appears in the statistics views. >> >> After thinking about this some more, I think it might be better to be >> less cute and forget about the interaction with the pg_hba.conf hostname >> behavior. That is, the host name is set if and only if log_hostname is >> on. > > +1 for doing it that way. I think there are no outstanding issues with this patch of any significance, so I'm marking it Ready for Committer. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
On 11-02-10 10:13 AM, Robert Haas wrote: > On Tue, Feb 1, 2011 at 1:33 PM, Robert Haas<robertmhaas@gmail.com> wrote: >> On Tue, Feb 1, 2011 at 1:09 PM, Peter Eisentraut<peter_e@gmx.net> wrote: >>> On tis, 2011-01-18 at 19:24 -0500, Steve Singer wrote: >>>> However if I connect with a line in pg_hba that matches on an IP >>>> network then my client_hostname is always null unless log_hostname is >>>> set to true. This is consistent with the behavior you describe but I >>>> think the average user will find it a bit confusing. Having a column >>>> that is always null unless a GUC is set is less than ideal but I >>>> understand why log_hostname isn't on by default. >>> Well, we have all these track_* variables, which also control what >>> appears in the statistics views. >>> >>> After thinking about this some more, I think it might be better to be >>> less cute and forget about the interaction with the pg_hba.conf hostname >>> behavior. That is, the host name is set if and only if log_hostname is >>> on. >> +1 for doing it that way. > I think there are no outstanding issues with this patch of any > significance, so I'm marking it Ready for Committer. > Was there an uodated version of this patch I missed? The original patch needed some sort of documentation saying that having something showup in the new pg_stat_activity columns is controlled by log_hostname. Above Peter and you seem to agree that having the having the line matched in pg_hba being a controlling factor should be removed but I haven't seen an updated patch that implements that.
On Thu, Feb 10, 2011 at 10:22 AM, Steve Singer <ssinger_pg@sympatico.ca> wrote: > On 11-02-10 10:13 AM, Robert Haas wrote: >> >> On Tue, Feb 1, 2011 at 1:33 PM, Robert Haas<robertmhaas@gmail.com> wrote: >>> >>> On Tue, Feb 1, 2011 at 1:09 PM, Peter Eisentraut<peter_e@gmx.net> wrote: >>>> >>>> On tis, 2011-01-18 at 19:24 -0500, Steve Singer wrote: >>>>> >>>>> However if I connect with a line in pg_hba that matches on an IP >>>>> network then my client_hostname is always null unless log_hostname is >>>>> set to true. This is consistent with the behavior you describe but I >>>>> think the average user will find it a bit confusing. Having a column >>>>> that is always null unless a GUC is set is less than ideal but I >>>>> understand why log_hostname isn't on by default. >>>> >>>> Well, we have all these track_* variables, which also control what >>>> appears in the statistics views. >>>> >>>> After thinking about this some more, I think it might be better to be >>>> less cute and forget about the interaction with the pg_hba.conf hostname >>>> behavior. That is, the host name is set if and only if log_hostname is >>>> on. >>> >>> +1 for doing it that way. >> >> I think there are no outstanding issues with this patch of any >> significance, so I'm marking it Ready for Committer. >> > Was there an uodated version of this patch I missed? > > The original patch needed some sort of documentation saying that having > something showup in the new pg_stat_activity columns is controlled by > log_hostname. > > Above Peter and you seem to agree that having the having the line matched in > pg_hba being a controlling factor should be removed but I haven't seen an > updated patch that implements that. I was assuming those changes were sufficiently trivial that they could be made at commit-time, especially if Peter is committing it himself. Of course if he'd like a re-review, he can always post an updated patch, but I just thought that was overly pedantic in this particular case. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
On 11-02-10 10:32 AM, Robert Haas wrote: > > I was assuming those changes were sufficiently trivial that they could > be made at commit-time, especially if Peter is committing it himself. > Of course if he'd like a re-review, he can always post an updated > patch, but I just thought that was overly pedantic in this particular > case. > Sounds reasonable.
On Thu, Feb 10, 2011 at 10:40 AM, Steve Singer <ssinger_pg@sympatico.ca> wrote: > On 11-02-10 10:32 AM, Robert Haas wrote: >> I was assuming those changes were sufficiently trivial that they could >> be made at commit-time, especially if Peter is committing it himself. >> Of course if he'd like a re-review, he can always post an updated >> patch, but I just thought that was overly pedantic in this particular >> case. > > Sounds reasonable. I rebased this patch, wrote documentation, and fixed BackendStatusShmemSize. PFA. On further review, I'm inclined to go with Peter's original approach to displaying the hostname: show it if we have it, and don't if we don't. It's not that hard to document the relevant criteria, and it seems silly to suppress the information if we have it. I had a thought of declaring st_clienthostname as char st_clienthostname[NAMEDATALEN] rather than char *st_clienthostname. That would simplify the initialization code. But I believe that the protocol used for updating this data structure is unsafe unless the whole thing fits into a single cache line. I'm not positive that's going to be true on every architecture even as things stand. On my Mac, with this patch, it's 208 bytes (which means that it's presumably 200 without the patch, and that it would be 264 with the alternate approach proposed above). According to that font of knowledge, Wikipedia, the size of a cache line can vary from 8 to 512 bytes [citation needed]. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company