Обсуждение: Fsync IO issue

Поиск
Список
Период
Сортировка

Fsync IO issue

От
ProfiVPS Support
Дата:

Hi there, 

 I've been struggling with very high write load on a server. 

 We are collecting around 400k values each 5 minutes into a hypertable. (We use timescaledb extension, I also shared this on timescale forum but then I realised the issue is postgresql related.) 

 When running iostat I see a constant 7-10MB/s write by postgres, and this just doesn’t add up for me :frowning: and I’m fully stuck with this. Even with the row overhead it should be around 20Mb / 5 mins ! Even with indeces this 7-10MB/s constant write is inexplicable for me.

  The writes may trigger an update in an other table, but not all of them do (I use a time filter). Let’s say 70% does (which I dont think). There we update two timestamps, and two ints. This still doesnt add up for me. Even if we talk about 50MB of records, that should be 0,16MB/s at most!

 So I dag in and found it was WAL, of course, what else. 

 Tweaking all around the config, reading forums and docs, to no avail. The only thing that made the scenario realistic is disabling fsync (which I know I must not, but for the experiment I did). That eased the write load to 0.6MB/s.

 I also found that the 16MB WAL segment got 80+ MB written into it before being closed. So what's happening here? Does fsync cause the whole file to be written out again and again? 

 I checked with pg_dump, the content is as expected. 

 We are talking about some insane data overhead here, two magnitudes more is being written to WAL than the actual useful data. 


All help is greatly appreciated. 

Thanks!

András



---
Olcsó Virtuális szerver:
http://www.ProfiVPS.hu

Támogatás: Support@ProfiVPS.hu

Re: Fsync IO issue

От
ProfiVPS Support
Дата:


Oh, sorry, we are using PostgreSQL 13.10 (Debian 13.10-1.pgdg100+1)  on the server with with TimescaleDB 2.5.1 on Debian 10. 


2023-05-04 19:31 időpontban ProfiVPS Support ezt írta:

Hi there, 

 I've been struggling with very high write load on a server. 

 We are collecting around 400k values each 5 minutes into a hypertable. (We use timescaledb extension, I also shared this on timescale forum but then I realised the issue is postgresql related.) 

 When running iostat I see a constant 7-10MB/s write by postgres, and this just doesn’t add up for me :frowning: and I’m fully stuck with this. Even with the row overhead it should be around 20Mb / 5 mins ! Even with indeces this 7-10MB/s constant write is inexplicable for me.

  The writes may trigger an update in an other table, but not all of them do (I use a time filter). Let’s say 70% does (which I dont think). There we update two timestamps, and two ints. This still doesnt add up for me. Even if we talk about 50MB of records, that should be 0,16MB/s at most!

 So I dag in and found it was WAL, of course, what else. 

 Tweaking all around the config, reading forums and docs, to no avail. The only thing that made the scenario realistic is disabling fsync (which I know I must not, but for the experiment I did). That eased the write load to 0.6MB/s.

 I also found that the 16MB WAL segment got 80+ MB written into it before being closed. So what's happening here? Does fsync cause the whole file to be written out again and again? 

 I checked with pg_dump, the content is as expected. 

 We are talking about some insane data overhead here, two magnitudes more is being written to WAL than the actual useful data. 


All help is greatly appreciated. 

Thanks!

András



---
Olcsó Virtuális szerver:
http://www.ProfiVPS.hu

Támogatás: Support@ProfiVPS.hu


---
Olcsó Virtuális szerver:
http://www.ProfiVPS.hu

Támogatás: Support@ProfiVPS.hu
Вложения

Re: Fsync IO issue

От
Andres Freund
Дата:
Hi,

On 2023-05-04 19:31:45 +0200, ProfiVPS Support wrote:
>  We are collecting around 400k values each 5 minutes into a hypertable. (We
> use timescaledb extension, I also shared this on timescale forum but then I
> realised the issue is postgresql related.)

I don't know how timescale does its storage - how did you conclude this is
about postgres, not about timescale? Obviously WAL write patterns depend on
the way records are inserted and flushed.


>  I also found that the 16MB WAL segment got 80+ MB written into it before
> being closed. So what's happening here? Does fsync cause the whole file to
> be written out again and again?

One possible reason for this is that you are committing small transactions
very frequently. When a transaction commits, the commit records needs to be
flushed to disk. If the transactions are small, the next commit might reside
on the same page - which needs to be written out again. Which of course can
increase the write rate considerably.

Your workload does not sound like it actually needs to commit in tiny
transactions? Some larger batching / using longer lived transactions might
help a lot.

Another possibility is that timescale does flush WAL too frequently for some
reason...

Greetings,

Andres Freund



Re: Fsync IO issue

От
ProfiVPS Support
Дата:

Hi, 

 thank you for your response :)

 Yes, that's exactly what's happening and I understand the issue with fsync in these cases. But I see no workaround about this as the data is ingested one-by-one (sent by collectd) and a db function handles it (it has to do lookup and set state in a different table based on the incoming value). 

 I feel like ANYTHING would be better than this. Even risking loosing _some_ of the latest data in case of a server crash (if it crashes we lose data anyways until restart, ofc we could have HA I know and we will when there'll be a need) . 

 Around 100 times the write need for wall than the useful data! That's insane. This is actually endangering the whole project we've been working on for the last 1.5 years and I face this issue after 100k devices have been added for a client. So I'm between a rock and a hard place :( 

 Ye, I think this is called "experience", but I must be honest, I was not expecting this at all :(

 However,  collectd's plugin does have an option to increase commit interval, but that kept the records locked and it caused strange issues, that's why I disabled it. I tried now to add that setting back and it does ease the situation somewhat with write spikes on commit. 

 All in all, thank you for your help. Honestly, after todays journey I thought that's the issue, but didn't want to believe it.


Thanks,

András


2023-05-04 21:21 időpontban Andres Freund ezt írta:

Hi,

On 2023-05-04 19:31:45 +0200, ProfiVPS Support wrote:
 We are collecting around 400k values each 5 minutes into a hypertable. (We
use timescaledb extension, I also shared this on timescale forum but then I
realised the issue is postgresql related.)

I don't know how timescale does its storage - how did you conclude this is
about postgres, not about timescale? Obviously WAL write patterns depend on
the way records are inserted and flushed.


 I also found that the 16MB WAL segment got 80+ MB written into it before
being closed. So what's happening here? Does fsync cause the whole file to
be written out again and again?

One possible reason for this is that you are committing small transactions
very frequently. When a transaction commits, the commit records needs to be
flushed to disk. If the transactions are small, the next commit might reside
on the same page - which needs to be written out again. Which of course can
increase the write rate considerably.

Your workload does not sound like it actually needs to commit in tiny
transactions? Some larger batching / using longer lived transactions might
help a lot.

Another possibility is that timescale does flush WAL too frequently for some
reason...

Greetings,

Andres Freund



---
Olcsó Virtuális szerver:
http://www.ProfiVPS.hu

Támogatás: Support@ProfiVPS.hu

Re: Fsync IO issue

От
Thomas Munro
Дата:
On Fri, May 5, 2023 at 8:37 AM ProfiVPS Support <support@profivps.hu> wrote:
>  I feel like ANYTHING would be better than this. Even risking loosing _some_ of the latest data in case of a server
crash(if it crashes we lose data anyways until restart, ofc we could have HA I know and we will when there'll be a
need). 

Try synchronous_commit=off:

https://www.postgresql.org/docs/current/wal-async-commit.html



Re: Fsync IO issue

От
Merlin Moncure
Дата:
On Thu, May 4, 2023 at 4:23 PM Thomas Munro <thomas.munro@gmail.com> wrote:
On Fri, May 5, 2023 at 8:37 AM ProfiVPS Support <support@profivps.hu> wrote:
>  I feel like ANYTHING would be better than this. Even risking loosing _some_ of the latest data in case of a server crash (if it crashes we lose data anyways until restart, ofc we could have HA I know and we will when there'll be a need) .

Try synchronous_commit=off:

https://www.postgresql.org/docs/current/wal-async-commit.html

Yeah, or batch multiple inserts into a transaction somehow.   In the worst case, each insert can cause multiple things to happen, write to WAL, flush to heap (8k write), commit bit set (another 8k write), etc.  In most workloads these steps can aggregate together in various ways but not always.   

merlin