Thread: blending fast and temp space volumes


blending fast and temp space volumes

From:
Rick Otten
Date:
Some of my data processes use large quantities of temp space - 5 or 6T anyway.

We are running in Google Cloud.  In order to get the best performance out of all of my queries that might need temp space, I've configured temp space on a concatenated local (volatile) SSD volume.  In GCE, local SSDs are more than 20x faster than SAN-backed SSDs.

side note:  The disadvantage of local SSD is that it won't survive "hitting the virtual power button" on an instance, nor can it migrate automatically to other hardware.  (We have to hit the power button to add memory/CPU to the system, and sometimes the power button gets hit by accident.)  This is OK for temp space.  I never have my database come up automatically on boot, and I have scripted the entire setup of the temp space volume and data structures.  I can run that script before starting the database.  I've done some tests and it seems to work great.  I don't mind rolling back any transaction that might be in flight during a power failure.
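For readers wanting to try the same approach, the rebuild script might look roughly like this dry-run sketch. Every device name, path, and the filesystem choice here is an assumption for illustration, not a detail from the post; the commands are printed rather than executed so the flow can be reviewed before being run on a real host.

```shell
#!/bin/sh
# Dry-run sketch of a boot-time temp-volume rebuild (all names hypothetical).
DEV=${DEV:-/dev/md0}          # hypothetical concatenated local-SSD device
MNT=${MNT:-/mnt/localssd}     # hypothetical mount point for temp space

# Print the rebuild steps instead of running them, so they can be reviewed.
rebuild_cmds() {
    cat <<EOF
mkfs.xfs -f $DEV
mount $DEV $MNT
mkdir -p $MNT/pgtemp
chown postgres:postgres $MNT/pgtemp
chmod 700 $MNT/pgtemp
EOF
}

rebuild_cmds    # print only; on the real host, review and then execute
```

On the real host the database would be started only after this completes, matching the "run that script before starting the database" workflow described above.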

So here is the problem:   The largest local SSD configuration I can get in GCE is 3T.  Since I have processes that sometimes use more than that, I've configured a second temp space volume on regular SAN SSD.   My hope was that if a query ran out of temp space on one volume it would spill over onto the other volume.  Unfortunately it doesn't appear to do that automatically.  When it hits the 3T limit on the one volume, the query fails.  :-(

So, the obvious solution is to anticipate which processes will need more than 3T temp space and then 'set temp_tablespaces' to not use the 3T volume.  And that is what we'll try next.
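As a sketch of that routing (the tablespace names "temp_san" and "temp_localssd" are hypothetical, as is the database name), the known-big jobs can be pointed at the SAN volume per session, while everything else defaults to the fast volume. The SQL is generated here rather than sent to a server, so it can be inspected first:

```shell
#!/bin/sh
# Sketch of per-session temp space routing; tablespace and database names
# are assumptions, substitute whatever CREATE TABLESPACE names are in use.

# SQL prologue for a job known to need more than 3T of temp space:
big_job_sql() {
    cat <<'EOF'
SET temp_tablespaces = 'temp_san';   -- SAN only: slower, but room for 5-6T
-- ...the known-big query runs here...
EOF
}

# Default everything else to the fast 3T local-SSD volume, once, per database:
default_sql() {
    echo "ALTER DATABASE mydb SET temp_tablespaces = 'temp_localssd';"
}

big_job_sql
default_sql
```

Since temp_tablespaces can also be set with ALTER ROLE, big batch jobs can alternatively be routed by the user they connect as, without editing each script.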

Meanwhile, I'd like other processes to "prefer" the fast volume over the slow one when the space is available.  Ideally I'd like to always use the fast volume and have the planner know about the different performance characteristics and capacity of the available temp space volumes and then choose the best one (speed or size) depending on the query's needs.  

I was wondering if anyone had ideas for how to make that possible.  I don't think I want to add the SAN disk to the same LVM volume group as the local disk, but maybe that would work, since I'm already building it with a script anyhow ... Is LVM smart enough to optimize across radically different disk performance characteristics?

At the moment it seems like when multiple temp spaces are available, the temp spaces are chosen in a 'round robin' or perhaps 'random' fashion.  Is that true?

I'm meeting with my GCE account rep next week to see if there is any way to get more than 3T of local SSD, but I'm skeptical it will be available any time soon.

thoughts?



Re: blending fast and temp space volumes

From:
Tom Lane
Date:
Rick Otten <rottenwindfish@gmail.com> writes:
> At the moment it seems like when multiple temp spaces are available, the
> temp spaces are chosen in a 'round robin' or perhaps 'random' fashion.  Is
> that true?

Yes, see fd.c's SetTempTablespaces and GetNextTempTableSpace.
There's no concept of different temp spaces having different performance
characteristics, and anyway we don't really have enough info to make
accurate predictions of temp space consumption.  So it's hard to see the
planner doing this for you automagically.

            regards, tom lane


Re: blending fast and temp space volumes

From:
Craig James
Date:

On Wed, Feb 21, 2018 at 7:53 AM, Rick Otten <rottenwindfish@gmail.com> wrote:
> I was wondering if anyone had ideas for how to make that possible.  I don't
> think I want to add the SAN disk to the same LVM volume group as the local
> disk, but maybe that would work, since I'm already building it with a script
> anyhow ... Is LVM smart enough to optimize radically different disk
> performances?

Couldn't you configure both devices into a single 6T device via RAID0 using md?

Craig
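Craig's md idea could be sketched as below. The device names are assumptions (not from the thread), and the command is built as a string and printed rather than executed, so nothing gets striped without review:

```shell
#!/bin/sh
# Dry-run sketch of combining a local SSD and a SAN volume into one RAID0
# device with md.  Device names are hypothetical.
LOCAL_SSD=${LOCAL_SSD:-/dev/nvme0n1}   # hypothetical local-SSD device
SAN_SSD=${SAN_SSD:-/dev/sdb}           # hypothetical SAN-backed device

# Build the mdadm invocation as a reviewable string instead of running it.
build_raid0_cmd() {
    printf 'mdadm --create /dev/md0 --level=0 --raid-devices=2 %s %s' \
        "$LOCAL_SSD" "$SAN_SSD"
}

build_raid0_cmd
echo
```

One caveat, raised later in the thread: RAID0 stripes every large read and write across both members, so overall throughput tends to track the slower device.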
 

--
---------------------------------
Craig A. James
Chief Technology Officer
eMolecules, Inc.
---------------------------------

Re: blending fast and temp space volumes

From:
Peter Geoghegan
Date:
On Wed, Feb 21, 2018 at 7:53 AM, Rick Otten <rottenwindfish@gmail.com> wrote:
> side note:  The disadvantage of local SSD is that it won't survive "hitting
> the virtual power button" on an instance, nor can it migrate automatically
> to other hardware.  (We have to hit the power button to add memory/cpu to
> the system, and sometimes the power button might get hit by accident.)  This
> is OK for temp space.  I never have my database come up automatically on
> boot, and I have scripted the entire setup of the temp space volume and data
> structures.  I can run that script before starting the database.   I've done
> some tests and it seems to work great.  I don't mind rolling back any
> transaction that might be in play during a power failure.

It sounds like you're treating a temp_tablespaces tablespace as
ephemeral, which IIRC can have problems that an ephemeral
stats_temp_directory does not have.

-- 
Peter Geoghegan


Re: blending fast and temp space volumes

From:
Claudio Freire
Date:
On Wed, Feb 21, 2018 at 4:50 PM, Peter Geoghegan <pg@bowt.ie> wrote:
> On Wed, Feb 21, 2018 at 7:53 AM, Rick Otten <rottenwindfish@gmail.com> wrote:
>> [side note about the volatile local-SSD temp volume snipped]
>
> It sounds like you're treating a temp_tablespaces tablespace as
> ephemeral, which IIRC can have problems that an ephemeral
> stats_temp_directory does not have.

For instance?

I've been doing that for years without issue, as long as you're careful to
restore the skeleton directory structure at server boot.
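A minimal sketch of that boot-time restore, assuming the tablespace lives at a dedicated mount point. The path is an assumption, and the exact skeleton layout (including any PG_<version>_<catversion> subdirectory) should be copied from a healthy cluster rather than guessed:

```shell
#!/bin/sh
# Recreate an empty tablespace directory after a power cycle wipes the
# local SSD, before starting PostgreSQL.  Path is hypothetical; on a real
# host a chown to the postgres OS user would follow.
restore_tblspc_skeleton() {
    dir=$1
    mkdir -p "$dir"
    chmod 700 "$dir"   # restrictive permissions, matching data-directory convention
}

# Demo against a throwaway path; the real target would be the local-SSD mount.
restore_tblspc_skeleton "${PGTEMP_DIR:-/tmp/pgtemp-demo}"
```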



On Wed, Feb 21, 2018 at 4:22 PM, Craig James <cjames@emolecules.com> wrote:
>
> On Wed, Feb 21, 2018 at 7:53 AM, Rick Otten <rottenwindfish@gmail.com>
>> I was wondering if anyone had ideas for how to make that possible.
>> I don't think I want to add the SAN disk to the same LVM volume group as the
>> local disk, but maybe that would work, since I'm already building it with a
>> script anyhow ... Is LVM smart enough to optimize radically different disk
>> performances?
>
>
> Couldn't you configure both devices into a single 6T device via RAID0 using
> md?

That would probably perform only as fast as the slowest disk.


Re: blending fast and temp space volumes

From:
Peter Geoghegan
Date:
On Wed, Feb 21, 2018 at 12:07 PM, Claudio Freire <klaussfreire@gmail.com> wrote:
> On Wed, Feb 21, 2018 at 4:50 PM, Peter Geoghegan <pg@bowt.ie> wrote:
>> On Wed, Feb 21, 2018 at 7:53 AM, Rick Otten <rottenwindfish@gmail.com> wrote:
>>> [side note about the volatile local-SSD temp volume snipped]
>>
>> It sounds like you're treating a temp_tablespaces tablespace as
>> ephemeral, which IIRC can have problems that an ephemeral
>> stats_temp_directory does not have.
>
> For instance?
>
> I've been doing that for years without issue. If you're careful to
> restore the skeleton directory structure at server boot up, I haven't
> had any issues.

Then you clearly know what I mean already. That's not documented as
either required or safe anywhere.

-- 
Peter Geoghegan


Re: blending fast and temp space volumes

From:
Claudio Freire
Date:
On Wed, Feb 21, 2018 at 5:09 PM, Peter Geoghegan <pg@bowt.ie> wrote:
> On Wed, Feb 21, 2018 at 12:07 PM, Claudio Freire <klaussfreire@gmail.com> wrote:
>> On Wed, Feb 21, 2018 at 4:50 PM, Peter Geoghegan <pg@bowt.ie> wrote:
>>> On Wed, Feb 21, 2018 at 7:53 AM, Rick Otten <rottenwindfish@gmail.com> wrote:
>>>> [side note about the volatile local-SSD temp volume snipped]
>>>
>>> It sounds like you're treating a temp_tablespaces tablespace as
>>> ephemeral, which IIRC can have problems that an ephemeral
>>> stats_temp_directory does not have.
>>
>> For instance?
>>
>> I've been doing that for years without issue. If you're careful to
>> restore the skeleton directory structure at server boot up, I haven't
>> had any issues.
>
> Then you clearly know what I mean already. That's not documented as
> either required or safe anywhere.

Ah, ok.

But the OP did mention he was doing that already, so it should be safe.