Обсуждение: pg_affected Change Request
I've been using PostgreSQL for about three months. So far my only objection is that "pg_affected" provides the number of attempts rather than the number of rows that are actually changed. I could easily do that with a little counter.
I think PostgreSQL should be changed to return the number of rows actually changed (whether updated, inserted or deleted).
On Wed, Feb 09, 2005 at 09:50:41PM -0500, Jan wrote: > I've been using PostgreSQL for about three months. So far my > only objection is that "pg_affected" provides the number of attempts > rather than the number of rows that are actually changed. I could > easily do that with a little counter. Please provide some context: What "pg_affected" are you talking about? How are you determining that it provides "attempts" instead of "actual"? > I think PostgreSQL should be changed to return the number of rows > actually changed (whether updated, inserted or deleted). Please provide an example of what you're doing, what results you expect or would like to see, and what results you actually get. -- Michael Fuhr http://www.fuhr.org/~mfuhr/
On Thu, Feb 10, 2005 at 05:56:33AM -0500, Jan wrote: > > I write a program that mines data from a small few websites. I revisit > those websites on a daily basis. I find a matching key (actually two fields > comprise my unique key) and with the data collected on this visit I attempt > to UPDATE an existing record. I want to know whether I just changed the > data or that the data collected is the same as on my last visit. > > I use PHP. If I check pg_affected_rows($result) I find one record is always > "affected" even when no data has actually changed. Nothing has changed so > the rows affected should be zero. The "affected" is actually "attempted". PostgreSQL stores a new version of each row regardless of whether the update changed any columns or not, so in that sense all of the rows were "affected." Presumably there's a reason for doing this, although at the moment I'm not remembering why. The following is a bit ugly, but if you want to update only those rows where a value has changed, then you could do something like this: UPDATE tablename SET col1 = <col1value>, col2 = <col2value>, ... WHERE keycol = <keyvalue> AND (col1 IS DISTINCT FROM <col1value> OR col2 IS DISTINCT FROM <col2value> ...) This statement uses IS DISTINCT FROM instead of <> so the comparisons will handle NULLs properly. If the columns are all NOT NULL then you could use <>. -- Michael Fuhr http://www.fuhr.org/~mfuhr/
Michael, I've never seen that syntax. Is that ANSI standard? The last SQL database I used did not require that syntax to return the "affected" count I needed. Is there any industry standard concerning the implementation of "affected"? Thanks ----- Original Message ----- From: "Michael Fuhr" <mike@fuhr.org> To: "Jan" <jan@fastpitchcentral.com> Cc: <pgsql-general@postgresql.org> Sent: Thursday, February 10, 2005 12:31 PM Subject: Re: [GENERAL] pg_affected Change Request > On Thu, Feb 10, 2005 at 05:56:33AM -0500, Jan wrote: > > > > I write a program that mines data from a small few websites. I revisit > > those websites on a daily basis. I find a matching key (actually two fields > > comprise my unique key) and with the data collected on this visit I attempt > > to UPDATE an existing record. I want to know whether I just changed the > > data or that the data collected is the same as on my last visit. > > > > I use PHP. If I check pg_affected_rows($result) I find one record is always > > "affected" even when no data has actually changed. Nothing has changed so > > the rows affected should be zero. The "affected" is actually "attempted". > > PostgreSQL stores a new version of each row regardless of whether > the update changed any columns or not, so in that sense all of the > rows were "affected." Presumably there's a reason for doing this, > although at the moment I'm not remembering why. > > The following is a bit ugly, but if you want to update only those > rows where a value has changed, then you could do something like > this: > > UPDATE tablename SET col1 = <col1value>, col2 = <col2value>, ... > WHERE keycol = <keyvalue> > AND (col1 IS DISTINCT FROM <col1value> OR > col2 IS DISTINCT FROM <col2value> ...) > > This statement uses IS DISTINCT FROM instead of <> so the comparisons > will handle NULLs properly. If the columns are all NOT NULL then > you could use <>. > > -- > Michael Fuhr > http://www.fuhr.org/~mfuhr/ >
On Fri, Feb 11, 2005 at 09:26:22AM -0500, Jan wrote: > > I've never seen that syntax. Is that ANSI standard? If you're referring to IS DISTINCT FROM, it's defined in SQL:1999 and SQL:2003. > The last SQL database I used did not require that syntax to return the > "affected" count I needed. That database probably had a different interpretation of "affected". MySQL 4.1.9, for example, reports the following: UPDATE foo ... Query OK, 3 rows affected (0.02 sec) Rows matched: 5 Changed: 3 Warnings: 0 > Is there any industry standard concerning the implementation of "affected"? I'm not sure if this is the correct reference, but I'm looking at a draft[1] of SQL:2003, Part 2: Foundation, 22.1 <get diagnostics management>, p. 1064, where it describes ROW_COUNT. On that page are the following paragraphs (line breaks changed): 1) If S does not contain a <search condition>, then the value of ROW_COUNT is the cardinality of T before the execution of S. 2) Otherwise, let SC be the <search condition> directly contained in S. The value of ROW_COUNT is effectively derived by executing the statement: SELECT COUNT(*) FROM T MCN WHERE SC before the execution of S. If that's an appropriate reference and if I'm reading it correctly, then PostgreSQL's behavior appears to comply with the standard. Maybe one of the developers can confirm or deny. [1] http://www.wiscorp.com/sql/sql_2003_standard.zip -- Michael Fuhr http://www.fuhr.org/~mfuhr/
On Thu, Feb 10, 2005 at 10:31:40AM -0700, Michael Fuhr wrote: > > UPDATE tablename SET col1 = <col1value>, col2 = <col2value>, ... > WHERE keycol = <keyvalue> > AND (col1 IS DISTINCT FROM <col1value> OR > col2 IS DISTINCT FROM <col2value> ...) You could also do this with a trigger. If all columns in the old and new rows are identical, then the trigger would return NULL to skip the update for that row. -- Michael Fuhr http://www.fuhr.org/~mfuhr/