Re: Dynamic Partitioning using Segment Visibility Maps

Поиск
Список
Период
Сортировка
От Markus Schiltknecht
Тема Re: Dynamic Partitioning using Segment Visibility Maps
Дата
Msg-id 477FD451.1010406@bluegap.ch
обсуждение исходный текст
Ответ на Re: Dynamic Partitioning using Segment Visibility Maps  (Robert Treat <xzilla@users.sourceforge.net>)
Ответы Re: Dynamic Partitioning using Segment Visibility Maps  (Robert Treat <xzilla@users.sourceforge.net>)
Re: Dynamic Partitioning using Segment Visibility Maps  (Andrew Sullivan <ajs@crankycanuck.ca>)
Список pgsql-hackers
Hi,

Robert Treat wrote:
> Personally I cant say it complicates things, because it isn't clear how it 
> will be managed. :-)

Well, management of relations is easy enough, known to the DBA and most 
importantly: it already exists. Having to set up something which is 
*not* tied to a relation complicates things just because it's an 
additional concept.

But as I've pointed out, maybe what we have in mind isn't that different 
at all. Just have a sentinel relation mean a set of segments, i.e. all 
read-only segments of a table. Then again, a table - in a way - is not 
much else than a set of segments. So where's the real difference?

>> To satisfy all the different requirements of partitioning with segments
>> based partitioning, we'd have to allow a table to span multiple table
>> spaces. I'm not very keen on going that way.
>>
> 
> Why?

Uh.. if a table (RELKIND_RELATION) can only span one table space, as it 
is now, all of its segments are in the same table space. I don't quite 
call that partitioning. Well, sure, you could call it so, but then, each 
and every Postgres table is already partitioned in 1G segments.

It all depends on the definitions, but in my world, horizontal 
partitioning for databases involves multiple table spaces (and is quite 
useless without that). Calling anything else partitioning is confusing, IMO.

> So the one thing that always scares me about these "define it all and let the 
> database sort it out" methods is they seem to lead to cases where the system 
> ends up rewriting the data to fit into some new partition layout.

That holds true no matter if you shuffle between segments or relations. 
To be able to let the DBA define an exact split point, the database 
*will* have to shuffle tuples around. Why does that scare you? It's a 
regular database system's maintenance procedure.

> One thing 
> that is nice about the current partitioning scheme is you can control the 
> impact of this behavior in these scenarios, but moving around small portions 
> of the table at a time.  

Uh.. I'm not quite following. What "current partitioning scheme" are you 
referring to?

Why should that not be possible with other schemes? Moving the split 
point between two partitions involves moving tuples around, no matter if 
you are going to move them between segments or between relations 
building the partitions.

> More to the point (I think) is that people define access to the data based on 
> the meaning of the data, not how it is stored on disk.  For example, in some 
> tables we only need to be active on 1 months worth of data... how that is 
> laid out on disk (# partitions, which tablespaces) is a means to the end of 
> working actively on 1 months worth of data. I can't think of many cases where 
> people would actually say the want to work actively on the most recent GB of 
> data.  

Agreed. I'd say that's why the DBA needs to be able to define the split 
point between partitions: only he knows the meaning of the data.

>> To me, both of SVM and
>> SE look much more like an optimization for certain special cases and
>> don't have much to do with partitioning.
> 
> Even if this were true, it might still be a useful optimization.

Possibly, yes. To me, the use case seems pretty narrow, though. For 
example it doesn't affect index scans much.

> One table I 
> am thinking of in particular in my system has one query we need to run across 
> partitions, which ends up doing a slew of bitmap index scans for all the 
> partitions. If using segment exclusion on it meant that I could get a global 
> index to help that query, I'd be happy. 

As proposed, Segment Exclusion works only on exactly one table. Thus, if 
you already have your data partitioned into multiple relations, it most 
probably won't affect your setup much. It certainly has nothing to do 
with what I understand by 'global index' (that's an index spanning 
multiple tables, right?).

Regards

Markus



В списке pgsql-hackers по дате отправления:

Предыдущее
От: Tom Lane
Дата:
Сообщение: Re: SSL over Unix-domain sockets
Следующее
От: Mark Mielke
Дата:
Сообщение: Re: SSL over Unix-domain sockets