Обсуждение: EncodeDateTime performance

Поиск
Список
Период
Сортировка

EncodeDateTime performance

От
George McCollister
Дата:
I'm trying to optimize postgres performance on a headless solid state
hardware platform (no fans or disks). I have the database stored on a
USB 2.0 flash drive (hdparm benchmarks reads at 10 MB/s). Performance is
limited by the 533Mhz CPU.

Hardware:
IXP425 XScale (big endian) 533Mhz 64MB RAM
USB 2.0 Flash Drive

Software:
Linux 2.6.21.4
postgres 8.2.5

I created a fresh database using initdb, then added one table.

Here is the create table:
CREATE TABLE archivetbl
(
  "DateTime" timestamp without time zone,
  "StationNum" smallint,
  "DeviceDateTime" timestamp without time zone,
  "DeviceNum" smallint,
  "Tagname" character(64),
  "Value" double precision,
  "Online" boolean
)
WITH (OIDS=FALSE);
ALTER TABLE archivetbl OWNER TO novatech;

I populated the table with 38098 rows.

I'm doing this simple query:
select * from archivetbl;

It takes 79 seconds to complete the query (when postgres is compiled
with -O2). I'm running the query from pgadmin3 over TCP/IP.

oprofile is showing that memset (via dopr) is using about 60% of the CPU. I traced back further and noticed most of the
usagewas coming from EncodeDateTime. 

I'm not quite sure why oprofile is showing that memset is hogging so much CPU. Regardless, I found way to eliminate
mostof the sprintf calls that were taking place in my situation. 

I made some modifications to EncodeDateTime and have attached them as a patch. These changes alone reduced the query
timeof the "select *  from archivetbl;" from 79 seconds to just 35 seconds. 

This patch is against 8.2.5. Since I'm cross compiling changing versions is a bit of a pain, but if someone thinks the
performancehas changed much in this area I could probably get the latest version cross compiling. 

Regards,
George McCollister


diff -Naur postgresql-8.2.5/src/backend/utils/adt/datetime.c postgresql-8.2.5.new/src/backend/utils/adt/datetime.c
--- postgresql-8.2.5/src/backend/utils/adt/datetime.c    2007-06-12 10:58:39.000000000 -0500
+++ postgresql-8.2.5.new/src/backend/utils/adt/datetime.c    2008-09-15 12:16:32.000000000 -0500
@@ -3287,6 +3287,53 @@
     return TRUE;
 }    /* EncodeTimeOnly() */

+void ymdhm(char * buf, int year, int mon, int day, int hour, int min)
+{
+    buf[0] = (char)((year / 1000) % 10) + '0';
+    buf[1] = (char)((year / 100) % 10) + '0';
+    buf[2] = (char)((year / 10) % 10) + '0';
+    buf[3] = (char)(year % 10) + '0';
+    buf[4] = '-';
+    buf[5] = (char)((mon / 10) % 10) + '0';
+    buf[6] = (char)(mon % 10) + '0';
+    buf[7] = '-';
+    buf[8] = (char)((day / 10) % 10) + '0';
+    buf[9] = (char)(day % 10) + '0';
+    buf[10] = ' ';
+    buf[11] = (char)((hour / 10) % 10) + '0';
+    buf[12] = (char)(hour % 10) + '0';
+    buf[13] = ':';
+    buf[14] = (char)((min / 10) % 10) + '0';
+    buf[15] = (char)(min % 10) + '0';
+    buf[16] = '\0';
+}
+
+void append_seconds(char * buf, int sec)
+{
+    buf[0] = ':';
+    buf[1] = (char)((sec / 10) % 10) + '0';
+    buf[2] = (char)(sec % 10) + '0';
+    buf[3] = '\0';
+}
+
+#ifdef HAVE_INT64_TIMESTAMP
+void append_seconds_and_fsecs(char * buf, int sec, fsec_t fsec)
+{
+    buf[0] = ':';
+    buf[1] = (char)((sec / 10) % 10) + '0';
+    buf[2] = (char)(sec % 10) + '0';
+    buf[3] = '.';
+    buf[4] = (char)((fsec / 100000) % 10) + '0';
+    buf[5] = (char)((fsec / 10000) % 10) + '0';
+    buf[6] = (char)((fsec / 1000) % 10) + '0';
+    buf[7] = (char)((fsec / 100) % 10) + '0';
+    buf[8] = (char)((fsec / 10) % 10) + '0';
+    buf[9] = (char)(fsec % 10) + '0';
+    buf[10] = '\0';
+}
+#endif
+
+

 /* EncodeDateTime()
  * Encode date and time interpreted as local time.
@@ -3315,9 +3362,8 @@
         case USE_ISO_DATES:
             /* Compatible with ISO-8601 date formats */

-            sprintf(str, "%04d-%02d-%02d %02d:%02d",
-                    (tm->tm_year > 0) ? tm->tm_year : -(tm->tm_year - 1),
-                    tm->tm_mon, tm->tm_mday, tm->tm_hour, tm->tm_min);
+            ymdhm(str, (tm->tm_year > 0) ? tm->tm_year : -(tm->tm_year - 1),
+                tm->tm_mon, tm->tm_mday, tm->tm_hour, tm->tm_min);

             /*
              * Print fractional seconds if any.  The field widths here should
@@ -3329,7 +3375,7 @@
 #ifdef HAVE_INT64_TIMESTAMP
             if (fsec != 0)
             {
-                sprintf(str + strlen(str), ":%02d.%06d", tm->tm_sec, fsec);
+                append_seconds_and_fsecs(str + strlen(str), tm->tm_sec, fsec);
                 TrimTrailingZeros(str);
             }
 #else
@@ -3340,7 +3386,7 @@
             }
 #endif
             else
-                sprintf(str + strlen(str), ":%02d", tm->tm_sec);
+                append_seconds(str + strlen(str), tm->tm_sec);

             /*
              * tzp == NULL indicates that we don't want *any* time zone info

Re: EncodeDateTime performance

От
Heikki Linnakangas
Дата:
George McCollister wrote:
> oprofile is showing that memset (via dopr) is using about 60% of the CPU. I traced back further and noticed most of
theusage was coming from EncodeDateTime.
 
> 
> I'm not quite sure why oprofile is showing that memset is hogging so much CPU. Regardless, I found way to eliminate
mostof the sprintf calls that were taking place in my situation.
 
> 
> I made some modifications to EncodeDateTime and have attached them as a patch. These changes alone reduced the query
timeof 
 
the "select *  from archivetbl;" from 79 seconds to just 35 seconds.

I remember I saw a similar effect some time ago, in a test case of 
dumping a table with lots of timestamp columns. It was on 32-bit Linux, 
which uses the glibc sprintf, instead of the one in src/port, and the 
time wasn't spent in MemSet but in the glibc sprintf. However, the fix I 
came up with was exactly the same: replace sprintf with functions like 
that, which takes removes the overhead of parsing the format string on 
every invocation.

It then occurred to me that the compiler could do the same for us, and 
posted that idea on the gcc mailing list:

http://gcc.gnu.org/ml/gcc/2007-10/msg00073.html

I got busy with other stuff, and never got the chance to follow up with 
that for gcc, or with a patch to PostgreSQL.

It does seem like it would be worthwhile to do something about this.

--   Heikki Linnakangas  EnterpriseDB   http://www.enterprisedb.com