Re: Enhance pg_dump multi-threaded streaming (WAS: Re: filesystem full during vacuum - space recovery issues)

From: Andrew Dunstan
Subject: Re: Enhance pg_dump multi-threaded streaming (WAS: Re: filesystem full during vacuum - space recovery issues)
Date:
Msg-id: 5e3a16ad-c8e2-4695-93cb-74121c3add56@dunslane.net
In reply to: Re: Enhance pg_dump multi-threaded streaming (WAS: Re: filesystem full during vacuum - space recovery issues)  (Thomas Simpson <ts@talentstack.to>)
Responses: Re: Enhance pg_dump multi-threaded streaming (WAS: Re: filesystem full during vacuum - space recovery issues)
List: pgsql-hackers
On 2024-07-19 Fr 9:46 AM, Thomas Simpson wrote:
>
> Hi Scott,
>
> I realize some of the background was snipped from what I sent to the 
> hackers list, so I'll try to fill in the details.
>
> Short background: a very large database ran out of space during a 
> vacuum full, taking down the server.  There is a replica which was 
> applying the WALs, so it too ran out of space.  On restart after 
> clearing some space, the database came back up but left behind the 
> in-progress rebuild files.  I've cleared that replica and am using it 
> as my rebuild target just now.
>
> Trying to identify the 'orphan' files and move them away always led to 
> the database spotting that the supposedly unused files were gone and 
> refusing to start, so I had no successful way to clean up and get 
> space back.
>
> Last resort after discussion is pg_dumpall & reload.  I'm doing this 
> via a network pipe (netcat) as I do not have the vast amount of 
> storage needed to store the dump file (in any format).
>
> On 19-Jul-2024 09:26, Scott Ribe wrote:
>> Do you actually have 100G networking between the nodes? Because if 
>> not, a single CPU should be able to saturate 10G.
> Servers connect via 10G WAN; sending is not the issue, it's the 
> application of the incoming stream on the destination that is 
> bottlenecked.
>> Likewise the receiving end would need disk capable of keeping up. 
>> Which brings up the question: why not write to disk, but directly on 
>> the destination rather than writing locally and then copying?
> In this case, it's not a local write, it's piped via netcat.
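
A minimal sketch of such a pipe, assuming netcat on both ends (the host 
name "newserver" and port 5555 are illustrative; some netcat variants 
spell the listen option "-l -p 5555"):

    # on the receiving server: listen and feed the incoming stream to psql
    nc -l 5555 | psql -d postgres

    # on the source server: stream the dump over the wire, no local file
    pg_dumpall | nc newserver 5555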
>> Do you require dump-reload because of suspected corruption? That's a 
>> tough one. But if not, if the goal is just to get up and running on a 
>> new server, why not pg_basebackup, streaming replica, promote? That 
>> depends on the level of data modification activity being low enough 
>> that pg_basebackup can keep up with WAL as it's generated and apply 
>> it faster than new WAL comes in, but given that your server is 
>> currently keeping up with writing that much WAL and flushing that 
>> many changes, it seems likely it would keep up as long as the network 
>> connection is fast enough. Anyway, in that scenario, you don't need 
>> to care how long pg_basebackup takes.
>>
>> If you do need a dump/reload because of suspected corruption, the 
>> only thing I can think of is something like doing it a table at a 
>> time -- partitioning would help here, if practical.
>
> The basebackup is, to the best of my understanding, essentially just 
> copying the database files.  Since the failed vacuum has left extra 
> files, my expectation is these too would be copied, leaving me in the 
> same position I started in.  If I'm wrong, please tell me as that 
> would be vastly quicker - it is how I originally set up the replica 
> and it took only a few hours on the 10G link.
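
For reference, the replica-and-promote path discussed here is roughly 
the following; a sketch only, with illustrative host, user, and 
directory names:

    # on the new server: take a base backup and write streaming-replica
    # configuration (-R); stream WAL alongside the data (-X stream)
    pg_basebackup -h oldserver -U replicator -D /pgdata/new -R -X stream -P

    # start the replica, let it catch up on WAL, then promote it
    pg_ctl -D /pgdata/new start
    pg_ctl -D /pgdata/new promote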
>
> The inability to get a clean start if I move any files out of the way 
> leads me to be concerned about some underlying corruption/issue, and 
> the recommendation earlier in the discussion was to opt for 
> dump/reload as the fail-safe.
>
> Resigned to my fate, my thoughts were to see if there is a way to 
> improve the dump-reload approach for the future.  Since dump-reload is 
> the ultimate upgrade suggestion in the documentation, it seems 
> worthwhile to see if there is a way to improve its performance, 
> especially as very large databases like mine are a thing with 
> PostgreSQL.  From a quick review of pg_dump.c (I'm no expert on it, 
> obviously), it feels like it's already doing most of what needs to be 
> done; the addition is some sort of multi-thread coordination with a 
> restore client to ensure each thread can successfully complete each 
> task it has before accepting more work.  I realize that's actually 
> difficult to implement.
>
>
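
In the meantime, something close to the table-at-a-time idea can be 
parallelized by hand today. A rough sketch, assuming the schema has 
already been created on the target (for example via a schema-only dump) 
and that the table names need no special quoting; all names are 
illustrative:

    # list the tables, then run four concurrent per-table data pipes
    psql -Atc "select tablename from pg_tables where schemaname = 'public'" dbname \
      | xargs -P 4 -I{} sh -c \
          'pg_dump -t {} --data-only dbname | psql -h newserver -d dbname'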

There is a plan for a non-text mode for pg_dumpall. I have started work 
on it, and hope to have a WIP patch in a month or so. It's not my 
intention to parallelize it for the first cut, but it could definitely 
be parallelizable in future. However, it will require writing to disk 
somewhere, albeit that the data will be compressed. It's well nigh 
impossible to parallelize text format dumps.

Restoration of custom and directory format dumps has long been 
parallelized. Parallel dumps require directory format, and so will 
non-text pg_dumpall.
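
For comparison, the existing parallel paths look like this; a sketch, 
assuming local disk is available to hold the directory-format dump 
(paths and job counts are illustrative):

    # parallel dump in directory format with 8 worker jobs
    pg_dump -F d -j 8 -f /backup/mydb.dir dbname

    # parallel restore from the same directory-format dump
    pg_restore -j 8 -d dbname /backup/mydb.dir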


cheers


andrew



--
Andrew Dunstan
EDB: https://www.enterprisedb.com



