Why is pq_begintypsend so slow?

Поиск
Список
Период
Сортировка
От Jeff Janes
Тема Why is pq_begintypsend so slow?
Дата
Msg-id CAMkU=1whbRDUwa4eayD9+59K-coxO9senDkPRbTn3cg0pUz4AQ@mail.gmail.com
обсуждение исходный текст
Ответы Re: Why is pq_begintypsend so slow?
Список pgsql-hackers
I was using COPY recently and was wondering why BINARY format is not much (if any) faster than the default format.  Once I switched from mostly exporting ints to mostly exporting double precisions (7e6 rows of 100 columns, randomly generated), it was faster, but not by as much as I intuitively thought it should be.  

Running 'perf top' to profile a "COPY BINARY .. TO '/dev/null'" on a AWS m5.large machine running Ubuntu 18.04, with self compiled PostgreSQL:

PostgreSQL 13devel on x86_64-pc-linux-gnu, compiled by gcc (Ubuntu 7.4.0-1ubuntu1~18.04.1) 7.4.0, 64-bit

I saw that the hotspot was pq_begintypsend at 20%, which was twice the percentage as the next place winner (AllocSetAlloc).  If I drill down into teh function, I see something like the below.  I don't really speak assembly, but usually when I see an assembly instruction being especially hot and not being the inner most instruction in a loop, I blame it on CPU cache misses.  But everything being touched here should already be well cached, since  initStringInfo has just got done setting it up. And if not for that, then the by the 2nd invocation of appendStringInfoCharMacro it certainly should be in the cache, yet that one is even slower than the 1st appendStringInfoCharMacro.

Why is this such a bottleneck?

pq_begintypsend  /usr/local/pgsql/bin/postgres

 0.15 |    push  %rbx
 0.09 |    mov  %rdi,%rbx
   |       initStringInfo(buf);
 3.03 |    callq initStringInfo
   |       /* Reserve four bytes for the bytea length word */
   |       appendStringInfoCharMacro(buf, '\0');
   |    movslq 0x8(%rbx),%rax
 1.05 |    lea  0x1(%rax),%edx
 0.72 |    cmp  0xc(%rbx),%edx
   |    jge  b0
 2.92 |    mov  (%rbx),%rdx
   |    movb  $0x0,(%rdx,%rax,1)
13.76 |    mov  0x8(%rbx),%eax
 0.81 |    mov  (%rbx),%rdx
 0.52 |    add  $0x1,%eax
 0.12 |    mov  %eax,0x8(%rbx)
 2.85 |    cltq
 0.01 |    movb  $0x0,(%rdx,%rax,1)
   |       appendStringInfoCharMacro(buf, '\0');
10.65 |    movslq 0x8(%rbx),%rax
   |    lea  0x1(%rax),%edx
 0.90 |    cmp  0xc(%rbx),%edx
   |    jge  ca
 0.54 | 42:  mov  (%rbx),%rdx
 1.84 |    movb  $0x0,(%rdx,%rax,1)
13.88 |    mov  0x8(%rbx),%eax
 0.03 |    mov  (%rbx),%rdx
   |    add  $0x1,%eax
 0.33 |    mov  %eax,0x8(%rbx)
 2.60 |    cltq
 0.06 |    movb  $0x0,(%rdx,%rax,1)
   |       appendStringInfoCharMacro(buf, '\0');
 3.21 |    movslq 0x8(%rbx),%rax
 0.23 |    lea  0x1(%rax),%edx
 1.74 |    cmp  0xc(%rbx),%edx
   |    jge  e0
 0.21 | 67:  mov  (%rbx),%rdx
 1.18 |    movb  $0x0,(%rdx,%rax,1)
 9.29 |    mov  0x8(%rbx),%eax
 0.18 |    mov  (%rbx),%rdx
   |    add  $0x1,%eax
 0.19 |    mov  %eax,0x8(%rbx)
 3.14 |    cltq
 0.12 |    movb  $0x0,(%rdx,%rax,1)
   |       appendStringInfoCharMacro(buf, '\0');
 5.29 |    movslq 0x8(%rbx),%rax
 0.03 |    lea  0x1(%rax),%edx
 1.45 |    cmp  0xc(%rbx),%edx
   |    jge  f6
 0.41 | 8c:  mov  (%rbx),%rdx

Cheers,

Jeff

В списке pgsql-hackers по дате отправления:

Предыдущее
От: David Fetter
Дата:
Сообщение: Re: Make autovacuum sort tables in descending order of xid_age
Следующее
От: Tom Lane
Дата:
Сообщение: Re: Why is pq_begintypsend so slow?