Обсуждение: Streaming replication and sharding

Поиск
Список
Период
Сортировка

Streaming replication and sharding

От
Tiemo Kieft
Дата:
Hi,

We are developing an application that uses various web analytics packages
(like Google Analytics) to run analyses on. We are currently in closed beta
stadium where we don't have a lot of data in the database, but at some
point it will grow considerably.

We basically have two different sets of data, on the one hand we have raw
metrics from the datasource, and on the other hand we have account and meta
information. The former can be re-downloaded at any time, and will grow to
quite large sizes. The latter set is the one that we really care about, and
don't want to risk losing.

Currently we plan on using streaming replication to replicate all data to
at least one slave, for the near future this will do, since we can run some
of the large (read-only) aggregation queries on the slave database. In the
future the dataset might grow to the point where we need to start thinking
about sharding. The analytics data can be sharded on a per-customer basis,
and doesn't have to be replicated.

Since Postgres doesn't support per-table streaming replication (as far as I
can tell), the only solution would be to run two separate instances of
postgres per server. One instance is replicated to all servers, and will
contain account and other important information. The other instance is used
to store analytics data. Is this a viable way of solving this problem, or
are we overlooking something?

The problem is not really immediate, as the dataset is currently small
enough to fit on one machine (and replicated to a second), just want to be
future proof, and get this solved before the problems start.

--
- Tiemo

Re: Streaming replication and sharding

От
Albe Laurenz
Дата:
VGllbW8gS2llZnQgd3JvdGU6DQo+IFdlIGFyZSBkZXZlbG9waW5nIGFuIGFwcGxpY2F0aW9uIHRo
YXQgdXNlcyB2YXJpb3VzIHdlYiBhbmFseXRpY3MgcGFja2FnZXMgKGxpa2UgR29vZ2xlIEFuYWx5
dGljcykgdG8NCj4gcnVuIGFuYWx5c2VzIG9uLiBXZSBhcmUgY3VycmVudGx5IGluIGNsb3NlZCBi
ZXRhIHN0YWRpdW0gd2hlcmUgd2UgZG9uJ3QgaGF2ZSBhIGxvdCBvZiBkYXRhIGluIHRoZQ0KPiBk
YXRhYmFzZSwgYnV0IGF0IHNvbWUgcG9pbnQgaXQgd2lsbCBncm93IGNvbnNpZGVyYWJseS4NCj4g
DQo+IFdlIGJhc2ljYWxseSBoYXZlIHR3byBkaWZmZXJlbnQgc2V0cyBvZiBkYXRhLCBvbiB0aGUg
b25lIGhhbmQgd2UgaGF2ZSByYXcgbWV0cmljcyBmcm9tIHRoZSBkYXRhc291cmNlLA0KPiBhbmQg
b24gdGhlIG90aGVyIGhhbmQgd2UgaGF2ZSBhY2NvdW50IGFuZCBtZXRhIGluZm9ybWF0aW9uLiBU
aGUgZm9ybWVyIGNhbiBiZSByZS1kb3dubG9hZGVkIGF0IGFueQ0KPiB0aW1lLCBhbmQgd2lsbCBn
cm93IHRvIHF1aXRlIGxhcmdlIHNpemVzLiBUaGUgbGF0dGVyIHNldCBpcyB0aGUgb25lIHRoYXQg
d2UgcmVhbGx5IGNhcmUgYWJvdXQsIGFuZA0KPiBkb24ndCB3YW50IHRvIHJpc2sgbG9zaW5nLg0K
PiANCj4gQ3VycmVudGx5IHdlIHBsYW4gb24gdXNpbmcgc3RyZWFtaW5nIHJlcGxpY2F0aW9uIHRv
IHJlcGxpY2F0ZSBhbGwgZGF0YSB0byBhdCBsZWFzdCBvbmUgc2xhdmUsIGZvciB0aGUNCj4gbmVh
ciBmdXR1cmUgdGhpcyB3aWxsIGRvLCBzaW5jZSB3ZSBjYW4gcnVuIHNvbWUgb2YgdGhlIGxhcmdl
IChyZWFkLW9ubHkpIGFnZ3JlZ2F0aW9uIHF1ZXJpZXMgb24gdGhlDQo+IHNsYXZlIGRhdGFiYXNl
LiBJbiB0aGUgZnV0dXJlIHRoZSBkYXRhc2V0IG1pZ2h0IGdyb3cgdG8gdGhlIHBvaW50IHdoZXJl
IHdlIG5lZWQgdG8gc3RhcnQgdGhpbmtpbmcNCj4gYWJvdXQgc2hhcmRpbmcuIFRoZSBhbmFseXRp
Y3MgZGF0YSBjYW4gYmUgc2hhcmRlZCBvbiBhIHBlci1jdXN0b21lciBiYXNpcywgYW5kIGRvZXNu
J3QgaGF2ZSB0byBiZQ0KPiByZXBsaWNhdGVkLg0KPiANCj4gU2luY2UgUG9zdGdyZXMgZG9lc24n
dCBzdXBwb3J0IHBlci10YWJsZSBzdHJlYW1pbmcgcmVwbGljYXRpb24gKGFzIGZhciBhcyBJIGNh
biB0ZWxsKSwgdGhlIG9ubHkNCj4gc29sdXRpb24gd291bGQgYmUgdG8gcnVuIHR3byBzZXBhcmF0
ZSBpbnN0YW5jZXMgb2YgcG9zdGdyZXMgcGVyIHNlcnZlci4gT25lIGluc3RhbmNlIGlzIHJlcGxp
Y2F0ZWQgdG8NCj4gYWxsIHNlcnZlcnMsIGFuZCB3aWxsIGNvbnRhaW4gYWNjb3VudCBhbmQgb3Ro
ZXIgaW1wb3J0YW50IGluZm9ybWF0aW9uLiBUaGUgb3RoZXIgaW5zdGFuY2UgaXMgdXNlZCB0bw0K
PiBzdG9yZSBhbmFseXRpY3MgZGF0YS4gSXMgdGhpcyBhIHZpYWJsZSB3YXkgb2Ygc29sdmluZyB0
aGlzIHByb2JsZW0sIG9yIGFyZSB3ZSBvdmVybG9va2luZyBzb21ldGhpbmc/DQo+IA0KPiBUaGUg
cHJvYmxlbSBpcyBub3QgcmVhbGx5IGltbWVkaWF0ZSwgYXMgdGhlIGRhdGFzZXQgaXMgY3VycmVu
dGx5IHNtYWxsIGVub3VnaCB0byBmaXQgb24gb25lIG1hY2hpbmUNCj4gKGFuZCByZXBsaWNhdGVk
IHRvIGEgc2Vjb25kKSwganVzdCB3YW50IHRvIGJlIGZ1dHVyZSBwcm9vZiwgYW5kIGdldCB0aGlz
IHNvbHZlZCBiZWZvcmUgdGhlIHByb2JsZW1zDQo+IHN0YXJ0Lg0KDQpUaGUgcHJvYmxlbXMgSSBz
ZWUgd2l0aCBkaXN0cmlidXRpbmcgeW91ciBkYXRhIGFjcm9zcw0Kc2V2ZXJhbCBQb3N0Z3JlU1FM
IGNsdXN0ZXJzIGlzIHRoYXQgdGhleSBiZWNvbWUgZGlzY29ubmVjdGVkLg0KDQpJdCB3aWxsIGJl
Y29tZSBtdWNoIG1vcmUgZGlmZmljdWx0IHRvIGtlZXAgdGhlbSBjb25zaXN0ZW50Og0KDQpZb3Ug
Y2Fubm90IGhhdmUgcmVmZXJlbnRpYWwgaW50ZWdyaXR5LCBhbmQgaWYgYSBkYXRhYmFzZSByZXN0
b3JlDQppcyBuZWVkZWQsIHlvdSBoYXZlIHRvIG1ha2UgZXh0cmEgcHJvdmlzaW9ucyB0aGF0IHlv
dSBjYW4gcmVzdG9yZQ0KeW91ciBzeXN0ZW0gdG8gYSBjb25zaXN0ZW50IHN0YXRlIGFjcm9zcyBh
bGwgY2x1c3RlcnMuDQoNCllvdSBhbHNvIGxvc2UgdGhlIGFiaWxpdHkgdG8gam9pbiBiZXR3ZWVu
IHRhYmxlcyB0aGF0IGFyZQ0KZGlzdHJpYnV0ZWQgYWNyb3NzIGRpZmZlcmVudCBkYXRhYmFzZXMu
ICBUaGlzIGNhbiBiZSBhDQpwZXJkb3JtYW5jZSBwcm9ibGVtLCBwYXJ0aWN1bGFybHkgaW4gYW4g
T0xBUCBzY2VuYXJpby4NCg0KVGhhdCdzIGFsbCBJIGNhbiB0aGluayBvZiBhdCB0aGUgbW9tZW50
Lg0KDQpZb3VycywNCkxhdXJlbnogQWxiZQ0K

Re: Streaming replication and sharding

От
"René Romero Benavides"
Дата:
<p style=" margin-top:0px; margin-bottom:0px; margin-left:0px; margin-right:0px; -qt-block-indent:0; text-indent:0px;
-qt-user-state:0;">I'malso interested in leveraging something like that, these are my thoughts:<p
style="-qt-paragraph-type:empty;margin-top:0px; margin-bottom:0px; margin-left:0px; margin-right:0px;
-qt-block-indent:0;text-indent:0px; "> <p style=" margin-top:0px; margin-bottom:0px; margin-left:0px; margin-right:0px;
-qt-block-indent:0;text-indent:0px; -qt-user-state:0;">- Have a master server with all the data<p style="
margin-top:0px;margin-bottom:0px; margin-left:0px; margin-right:0px; -qt-block-indent:0; text-indent:0px;
-qt-user-state:0;">-Have the data partitioned vertically (inheritance, exclusion constrains, etc)<p style="
margin-top:0px;margin-bottom:0px; margin-left:0px; margin-right:0px; -qt-block-indent:0; text-indent:0px;
-qt-user-state:0;">-One synchronous slave<p style=" margin-top:0px; margin-bottom:0px; margin-left:0px;
margin-right:0px;-qt-block-indent:0; text-indent:0px; -qt-user-state:0;">- N asynchronous slaves that feed from the
synchronousslave (cascading replication)<p style=" margin-top:0px; margin-bottom:0px; margin-left:0px;
margin-right:0px;-qt-block-indent:0; text-indent:0px; -qt-user-state:0;">- Use plproxy to enforce that one instance
servesqueries of only one portion of the data (pseudo-sharding). Also with plproxy is possible to parallelize some
queries.<pstyle="-qt-paragraph-type:empty; margin-top:0px; margin-bottom:0px; margin-left:0px; margin-right:0px;
-qt-block-indent:0;text-indent:0px; "> <p style=" margin-top:0px; margin-bottom:0px; margin-left:0px; margin-right:0px;
-qt-block-indent:0;text-indent:0px; -qt-user-state:0;">Advantages:<p style=" margin-top:0px; margin-bottom:0px;
margin-left:0px;margin-right:0px; -qt-block-indent:0; text-indent:0px; -qt-user-state:0;"> + Highly redundant<p style="
margin-top:0px;margin-bottom:0px; margin-left:0px; margin-right:0px; -qt-block-indent:0; text-indent:0px;
-qt-user-state:0;">+ Lends itself to automatic failover to the syncrhonous slave<p style=" margin-top:0px;
margin-bottom:0px;margin-left:0px; margin-right:0px; -qt-block-indent:0; text-indent:0px; -qt-user-state:0;"> + Read
scalable<pstyle=" margin-top:0px; margin-bottom:0px; margin-left:0px; margin-right:0px; -qt-block-indent:0;
text-indent:0px;-qt-user-state:0;"> + No synchronization conflicts among "shards"<p style=" margin-top:0px;
margin-bottom:0px;margin-left:0px; margin-right:0px; -qt-block-indent:0; text-indent:0px; -qt-user-state:0;"> + Some
queriescould be parallelized<p style="-qt-paragraph-type:empty; margin-top:0px; margin-bottom:0px; margin-left:0px;
margin-right:0px;-qt-block-indent:0; text-indent:0px; "> <p style=" margin-top:0px; margin-bottom:0px; margin-left:0px;
margin-right:0px;-qt-block-indent:0; text-indent:0px; -qt-user-state:0;">Disadvantages:<p style=" margin-top:0px;
margin-bottom:0px;margin-left:0px; margin-right:0px; -qt-block-indent:0; text-indent:0px; -qt-user-state:0;"> + Plproxy
worksonly with functions<p style=" margin-top:0px; margin-bottom:0px; margin-left:0px; margin-right:0px;
-qt-block-indent:0;text-indent:0px; -qt-user-state:0;"> + Writes can be performed only in the master ( not write
scalable)<pstyle=" margin-top:0px; margin-bottom:0px; margin-left:0px; margin-right:0px; -qt-block-indent:0;
text-indent:0px;-qt-user-state:0;"> + Needs good communication infrastructure<p style=" margin-top:0px;
margin-bottom:0px;margin-left:0px; margin-right:0px; -qt-block-indent:0; text-indent:0px; -qt-user-state:0;"> + Table
structuresneed to be simple.<p style=" margin-top:0px; margin-bottom:0px; margin-left:0px; margin-right:0px;
-qt-block-indent:0;text-indent:0px; -qt-user-state:0;"> + At some point (failover), you'll need to implement plproxy
automaticre-configuration<p style="-qt-paragraph-type:empty; margin-top:0px; margin-bottom:0px; margin-left:0px;
margin-right:0px;-qt-block-indent:0; text-indent:0px; "> <p style=" margin-top:0px; margin-bottom:0px; margin-left:0px;
margin-right:0px;-qt-block-indent:0; text-indent:0px; -qt-user-state:0;">On Monday, February 18, 2013 08:00:33 AM Albe
Laurenzwrote:<p style=" margin-top:0px; margin-bottom:0px; margin-left:0px; margin-right:0px; -qt-block-indent:0;
text-indent:0px;-qt-user-state:0;">> Tiemo Kieft wrote:<p style=" margin-top:0px; margin-bottom:0px;
margin-left:0px;margin-right:0px; -qt-block-indent:0; text-indent:0px; -qt-user-state:0;">> <p style="
margin-top:0px;margin-bottom:0px; margin-left:0px; margin-right:0px; -qt-block-indent:0; text-indent:0px;
-qt-user-state:0;">>> We are developing an application that uses various web analytics packages<p style="
margin-top:0px;margin-bottom:0px; margin-left:0px; margin-right:0px; -qt-block-indent:0; text-indent:0px;
-qt-user-state:0;">>> (like Google Analytics) to<p style=" margin-top:0px; margin-bottom:0px; margin-left:0px;
margin-right:0px;-qt-block-indent:0; text-indent:0px; -qt-user-state:0;"> run analyses on. We are currently in closed<p
style="margin-top:0px; margin-bottom:0px; margin-left:0px; margin-right:0px; -qt-block-indent:0; text-indent:0px;
-qt-user-state:0;">>> beta stadium where we don't have a lot of data in the database, but at<p style="
margin-top:0px;margin-bottom:0px; margin-left:0px; margin-right:0px; -qt-block-indent:0; text-indent:0px;
-qt-user-state:0;">>> some point it will grow considerably.<p style=" margin-top:0px; margin-bottom:0px;
margin-left:0px;margin-right:0px; -qt-block-indent:0; text-indent:0px; -qt-user-state:0;">> > <p style="
margin-top:0px;margin-bottom:0px; margin-left:0px; margin-right:0px; -qt-block-indent:0; text-indent:0px;
-qt-user-state:0;">>> We basically have two different sets of data, on the one hand we have raw<p style="
margin-top:0px;margin-bottom:0px; margin-left:0px; margin-right:0px; -qt-block-indent:0; text-indent:0px;
-qt-user-state:0;">>> metrics from the datasource,<p style=" margin-top:0px; margin-bottom:0px; margin-left:0px;
margin-right:0px;-qt-block-indent:0; text-indent:0px; -qt-user-state:0;"> and on the other hand we have account and<p
style="margin-top:0px; margin-bottom:0px; margin-left:0px; margin-right:0px; -qt-block-indent:0; text-indent:0px;
-qt-user-state:0;">>> meta information. The former can be re-downloaded at any time, and will<p style="
margin-top:0px;margin-bottom:0px; margin-left:0px; margin-right:0px; -qt-block-indent:0; text-indent:0px;
-qt-user-state:0;">>> grow to quite large sizes. The latter set is the one that we really care<p style="
margin-top:0px;margin-bottom:0px; margin-left:0px; margin-right:0px; -qt-block-indent:0; text-indent:0px;
-qt-user-state:0;">>> about, and don't want to risk losing.<p style=" margin-top:0px; margin-bottom:0px;
margin-left:0px;margin-right:0px; -qt-block-indent:0; text-indent:0px; -qt-user-state:0;">> > <p style="
margin-top:0px;margin-bottom:0px; margin-left:0px; margin-right:0px; -qt-block-indent:0; text-indent:0px;
-qt-user-state:0;">>> Currently we plan on using streaming replication to replicate all data to<p style="
margin-top:0px;margin-bottom:0px; margin-left:0px; margin-right:0px; -qt-block-indent:0; text-indent:0px;
-qt-user-state:0;">>> at least one slave, for the<p style=" margin-top:0px; margin-bottom:0px; margin-left:0px;
margin-right:0px;-qt-block-indent:0; text-indent:0px; -qt-user-state:0;"> near future this will do, since we can run<p
style="margin-top:0px; margin-bottom:0px; margin-left:0px; margin-right:0px; -qt-block-indent:0; text-indent:0px;
-qt-user-state:0;">>> some of the large (read-only) aggregation queries on the slave database.<p style="
margin-top:0px;margin-bottom:0px; margin-left:0px; margin-right:0px; -qt-block-indent:0; text-indent:0px;
-qt-user-state:0;">>> In the future the dataset might grow to the point where we need to start<p style="
margin-top:0px;margin-bottom:0px; margin-left:0px; margin-right:0px; -qt-block-indent:0; text-indent:0px;
-qt-user-state:0;">>> thinking about sharding. The analytics data can be sharded on a<p style=" margin-top:0px;
margin-bottom:0px;margin-left:0px; margin-right:0px; -qt-block-indent:0; text-indent:0px; -qt-user-state:0;">> >
per-customerbasis, and doesn't have to be replicated.<p style=" margin-top:0px; margin-bottom:0px; margin-left:0px;
margin-right:0px;-qt-block-indent:0; text-indent:0px; -qt-user-state:0;">> > <p style=" margin-top:0px;
margin-bottom:0px;margin-left:0px; margin-right:0px; -qt-block-indent:0; text-indent:0px; -qt-user-state:0;">> >
SincePostgres doesn't support per-table streaming replication (as far as<p style=" margin-top:0px; margin-bottom:0px;
margin-left:0px;margin-right:0px; -qt-block-indent:0; text-indent:0px; -qt-user-state:0;">> > I can tell), the
only<pstyle=" margin-top:0px; margin-bottom:0px; margin-left:0px; margin-right:0px; -qt-block-indent:0;
text-indent:0px;-qt-user-state:0;"> solution would be to run two separate instances of<p style=" margin-top:0px;
margin-bottom:0px;margin-left:0px; margin-right:0px; -qt-block-indent:0; text-indent:0px; -qt-user-state:0;">> >
postgresper server. One instance is replicated to all servers, and will<p style=" margin-top:0px; margin-bottom:0px;
margin-left:0px;margin-right:0px; -qt-block-indent:0; text-indent:0px; -qt-user-state:0;">> > contain account and
otherimportant information. The other instance is<p style=" margin-top:0px; margin-bottom:0px; margin-left:0px;
margin-right:0px;-qt-block-indent:0; text-indent:0px; -qt-user-state:0;">> > used to store analytics data. Is
thisa viable way of solving this<p style=" margin-top:0px; margin-bottom:0px; margin-left:0px; margin-right:0px;
-qt-block-indent:0;text-indent:0px; -qt-user-state:0;">> > problem, or are we overlooking something? <p style="
margin-top:0px;margin-bottom:0px; margin-left:0px; margin-right:0px; -qt-block-indent:0; text-indent:0px;
-qt-user-state:0;">>> The problem is not really immediate, as the dataset is currently small<p style="
margin-top:0px;margin-bottom:0px; margin-left:0px; margin-right:0px; -qt-block-indent:0; text-indent:0px;
-qt-user-state:0;">>> enough to fit on one machine<p style=" margin-top:0px; margin-bottom:0px; margin-left:0px;
margin-right:0px;-qt-block-indent:0; text-indent:0px; -qt-user-state:0;"> (and replicated to a second), just want to<p
style="margin-top:0px; margin-bottom:0px; margin-left:0px; margin-right:0px; -qt-block-indent:0; text-indent:0px;
-qt-user-state:0;">>> be future proof, and get this solved before the problems start.<p style=" margin-top:0px;
margin-bottom:0px;margin-left:0px; margin-right:0px; -qt-block-indent:0; text-indent:0px; -qt-user-state:0;">> <p
style="margin-top:0px; margin-bottom:0px; margin-left:0px; margin-right:0px; -qt-block-indent:0; text-indent:0px;
-qt-user-state:0;">><p style=" margin-top:0px; margin-bottom:0px; margin-left:0px; margin-right:0px;
-qt-block-indent:0;text-indent:0px; -qt-user-state:0;">> The problems I see with distributing your data across<p
style="margin-top:0px; margin-bottom:0px; margin-left:0px; margin-right:0px; -qt-block-indent:0; text-indent:0px;
-qt-user-state:0;">>several PostgreSQL clusters is that they become disconnected.<p style=" margin-top:0px;
margin-bottom:0px;margin-left:0px; margin-right:0px; -qt-block-indent:0; text-indent:0px; -qt-user-state:0;">> <p
style="margin-top:0px; margin-bottom:0px; margin-left:0px; margin-right:0px; -qt-block-indent:0; text-indent:0px;
-qt-user-state:0;">>It will become much more difficult to keep them consistent:<p style=" margin-top:0px;
margin-bottom:0px;margin-left:0px; margin-right:0px; -qt-block-indent:0; text-indent:0px; -qt-user-state:0;">> <p
style="margin-top:0px; margin-bottom:0px; margin-left:0px; margin-right:0px; -qt-block-indent:0; text-indent:0px;
-qt-user-state:0;">>You cannot have referential integrity, and if a database restore<p style=" margin-top:0px;
margin-bottom:0px;margin-left:0px; margin-right:0px; -qt-block-indent:0; text-indent:0px; -qt-user-state:0;">> is
needed,you have to make extra provisions that you can restore<p style=" margin-top:0px; margin-bottom:0px;
margin-left:0px;margin-right:0px; -qt-block-indent:0; text-indent:0px; -qt-user-state:0;">> your system to a
consistentstate across all clusters.<p style=" margin-top:0px; margin-bottom:0px; margin-left:0px; margin-right:0px;
-qt-block-indent:0;text-indent:0px; -qt-user-state:0;">> <p style=" margin-top:0px; margin-bottom:0px;
margin-left:0px;margin-right:0px; -qt-block-indent:0; text-indent:0px; -qt-user-state:0;">> You also lose the
abilityto join between tables that are<p style=" margin-top:0px; margin-bottom:0px; margin-left:0px; margin-right:0px;
-qt-block-indent:0;text-indent:0px; -qt-user-state:0;">> distributed across different databases. This can be a<p
style="margin-top:0px; margin-bottom:0px; margin-left:0px; margin-right:0px; -qt-block-indent:0; text-indent:0px;
-qt-user-state:0;">>perdormance problem, particularly in an OLAP scenario.<p style=" margin-top:0px;
margin-bottom:0px;margin-left:0px; margin-right:0px; -qt-block-indent:0; text-indent:0px; -qt-user-state:0;">> <p
style="margin-top:0px; margin-bottom:0px; margin-left:0px; margin-right:0px; -qt-block-indent:0; text-indent:0px;
-qt-user-state:0;">>That's all I can think of at the moment.<p style=" margin-top:0px; margin-bottom:0px;
margin-left:0px;margin-right:0px; -qt-block-indent:0; text-indent:0px; -qt-user-state:0;">> <p style="
margin-top:0px;margin-bottom:0px; margin-left:0px; margin-right:0px; -qt-block-indent:0; text-indent:0px;
-qt-user-state:0;">>Yours,<p style=" margin-top:0px; margin-bottom:0px; margin-left:0px; margin-right:0px;
-qt-block-indent:0;text-indent:0px; -qt-user-state:0;">> Laurenz Albe<p style=" margin-top:0px; margin-bottom:0px;
margin-left:0px;margin-right:0px; -qt-block-indent:0; text-indent:0px; -qt-user-state:0;"><br />-- <p style="
margin-top:0px;margin-bottom:0px; margin-left:0px; margin-right:0px; -qt-block-indent:0; text-indent:0px;
-qt-user-state:0;">RenéRomero Benavides<span style=" font-family:'liberation sans';"> </span><a
href="https://twitter.com/iCodeiExist"><spanstyle=" font-family:'liberation sans'; text-decoration: underline;
color:#0057ae;">@iCodeiExist</span></a><a href="https://twitter.com/PgsqlMx"><span style=" text-decoration: underline;
color:#0057ae;">@PgsqlMx</span></a><pstyle="-qt-paragraph-type:empty; margin-top:0px; margin-bottom:0px;
margin-left:0px;margin-right:0px; -qt-block-indent:0; text-indent:0px; "> <p style=" margin-top:0px; margin-bottom:0px;
margin-left:0px;margin-right:0px; -qt-block-indent:0; text-indent:0px; -qt-user-state:0;">Postgresql Tips en español
parala comunidad de México e Hispanoamérica.<p style=" margin-top:0px; margin-bottom:0px; margin-left:0px;
margin-right:0px;-qt-block-indent:0; text-indent:0px; -qt-user-state:0;"><a href="http://postgresql.org.mx"><span
style="text-decoration: underline; color:#0057ae;">http://postgresql.org.mx</span></a><p
style="-qt-paragraph-type:empty;margin-top:0px; margin-bottom:0px; margin-left:0px; margin-right:0px;
-qt-block-indent:0;text-indent:0px; "> <p style="-qt-paragraph-type:empty; margin-top:0px; margin-bottom:0px;
margin-left:0px;margin-right:0px; -qt-block-indent:0; text-indent:0px; "> <p style="-qt-paragraph-type:empty;
margin-top:0px;margin-bottom:0px; margin-left:0px; margin-right:0px; -qt-block-indent:0; text-indent:0px; "> <p
style="-qt-paragraph-type:empty;margin-top:0px; margin-bottom:0px; margin-left:0px; margin-right:0px;
-qt-block-indent:0;text-indent:0px; "> <p style="-qt-paragraph-type:empty; margin-top:0px; margin-bottom:0px;
margin-left:0px;margin-right:0px; -qt-block-indent:0; text-indent:0px; "> <p style="-qt-paragraph-type:empty;
margin-top:0px;margin-bottom:0px; margin-left:0px; margin-right:0px; -qt-block-indent:0; text-indent:0px; ">  

Re: Streaming replication and sharding

От
"René Romero Benavides"
Дата:
<p style=" margin-top:0px; margin-bottom:0px; margin-left:0px; margin-right:0px; -qt-block-indent:0; text-indent:0px;
-qt-user-state:0;">There'salso one project called Postgres-XC which seems very promising, but I'm kind of wary about
itslevel of maturity.<p style="-qt-paragraph-type:empty; margin-top:0px; margin-bottom:0px; margin-left:0px;
margin-right:0px;-qt-block-indent:0; text-indent:0px; "> <p style=" margin-top:0px; margin-bottom:0px; margin-left:0px;
margin-right:0px;-qt-block-indent:0; text-indent:0px; -qt-user-state:0;">On Monday, February 18, 2013 08:00:33 AM Albe
Laurenzwrote:<p style=" margin-top:0px; margin-bottom:0px; margin-left:0px; margin-right:0px; -qt-block-indent:0;
text-indent:0px;-qt-user-state:0;">> Tiemo Kieft wrote:<p style=" margin-top:0px; margin-bottom:0px;
margin-left:0px;margin-right:0px; -qt-block-indent:0; text-indent:0px; -qt-user-state:0;">> <p style="
margin-top:0px;margin-bottom:0px; margin-left:0px; margin-right:0px; -qt-block-indent:0; text-indent:0px;
-qt-user-state:0;">>> We are developing an application that uses various web analytics packages<p style="
margin-top:0px;margin-bottom:0px; margin-left:0px; margin-right:0px; -qt-block-indent:0; text-indent:0px;
-qt-user-state:0;">>> (like Google Analytics) to<p style=" margin-top:0px; margin-bottom:0px; margin-left:0px;
margin-right:0px;-qt-block-indent:0; text-indent:0px; -qt-user-state:0;"> run analyses on. We are currently in closed<p
style="margin-top:0px; margin-bottom:0px; margin-left:0px; margin-right:0px; -qt-block-indent:0; text-indent:0px;
-qt-user-state:0;">>> beta stadium where we don't have a lot of data in the database, but at<p style="
margin-top:0px;margin-bottom:0px; margin-left:0px; margin-right:0px; -qt-block-indent:0; text-indent:0px;
-qt-user-state:0;">>> some point it will grow considerably.<p style=" margin-top:0px; margin-bottom:0px;
margin-left:0px;margin-right:0px; -qt-block-indent:0; text-indent:0px; -qt-user-state:0;">> > <p style="
margin-top:0px;margin-bottom:0px; margin-left:0px; margin-right:0px; -qt-block-indent:0; text-indent:0px;
-qt-user-state:0;">>> We basically have two different sets of data, on the one hand we have raw<p style="
margin-top:0px;margin-bottom:0px; margin-left:0px; margin-right:0px; -qt-block-indent:0; text-indent:0px;
-qt-user-state:0;">>> metrics from the datasource,<p style=" margin-top:0px; margin-bottom:0px; margin-left:0px;
margin-right:0px;-qt-block-indent:0; text-indent:0px; -qt-user-state:0;"> and on the other hand we have account and<p
style="margin-top:0px; margin-bottom:0px; margin-left:0px; margin-right:0px; -qt-block-indent:0; text-indent:0px;
-qt-user-state:0;">>> meta information. The former can be re-downloaded at any time, and will<p style="
margin-top:0px;margin-bottom:0px; margin-left:0px; margin-right:0px; -qt-block-indent:0; text-indent:0px;
-qt-user-state:0;">>> grow to quite large sizes. The latter set is the one that we really care<p style="
margin-top:0px;margin-bottom:0px; margin-left:0px; margin-right:0px; -qt-block-indent:0; text-indent:0px;
-qt-user-state:0;">>> about, and don't want to risk losing.<p style=" margin-top:0px; margin-bottom:0px;
margin-left:0px;margin-right:0px; -qt-block-indent:0; text-indent:0px; -qt-user-state:0;">> > <p style="
margin-top:0px;margin-bottom:0px; margin-left:0px; margin-right:0px; -qt-block-indent:0; text-indent:0px;
-qt-user-state:0;">>> Currently we plan on using streaming replication to replicate all data to<p style="
margin-top:0px;margin-bottom:0px; margin-left:0px; margin-right:0px; -qt-block-indent:0; text-indent:0px;
-qt-user-state:0;">>> at least one slave, for the<p style=" margin-top:0px; margin-bottom:0px; margin-left:0px;
margin-right:0px;-qt-block-indent:0; text-indent:0px; -qt-user-state:0;"> near future this will do, since we can run<p
style="margin-top:0px; margin-bottom:0px; margin-left:0px; margin-right:0px; -qt-block-indent:0; text-indent:0px;
-qt-user-state:0;">>> some of the large (read-only) aggregation queries on the slave database.<p style="
margin-top:0px;margin-bottom:0px; margin-left:0px; margin-right:0px; -qt-block-indent:0; text-indent:0px;
-qt-user-state:0;">>> In the future the dataset might grow to the point where we need to start<p style="
margin-top:0px;margin-bottom:0px; margin-left:0px; margin-right:0px; -qt-block-indent:0; text-indent:0px;
-qt-user-state:0;">>> thinking about sharding. The analytics data can be sharded on a<p style=" margin-top:0px;
margin-bottom:0px;margin-left:0px; margin-right:0px; -qt-block-indent:0; text-indent:0px; -qt-user-state:0;">> >
per-customerbasis, and doesn't have to be replicated.<p style=" margin-top:0px; margin-bottom:0px; margin-left:0px;
margin-right:0px;-qt-block-indent:0; text-indent:0px; -qt-user-state:0;">> > <p style=" margin-top:0px;
margin-bottom:0px;margin-left:0px; margin-right:0px; -qt-block-indent:0; text-indent:0px; -qt-user-state:0;">> >
SincePostgres doesn't support per-table streaming replication (as far as<p style=" margin-top:0px; margin-bottom:0px;
margin-left:0px;margin-right:0px; -qt-block-indent:0; text-indent:0px; -qt-user-state:0;">> > I can tell), the
only<pstyle=" margin-top:0px; margin-bottom:0px; margin-left:0px; margin-right:0px; -qt-block-indent:0;
text-indent:0px;-qt-user-state:0;"> solution would be to run two separate instances of<p style=" margin-top:0px;
margin-bottom:0px;margin-left:0px; margin-right:0px; -qt-block-indent:0; text-indent:0px; -qt-user-state:0;">> >
postgresper server. One instance is replicated to all servers, and will<p style=" margin-top:0px; margin-bottom:0px;
margin-left:0px;margin-right:0px; -qt-block-indent:0; text-indent:0px; -qt-user-state:0;">> > contain account and
otherimportant information. The other instance is<p style=" margin-top:0px; margin-bottom:0px; margin-left:0px;
margin-right:0px;-qt-block-indent:0; text-indent:0px; -qt-user-state:0;">> > used to store analytics data. Is
thisa viable way of solving this<p style=" margin-top:0px; margin-bottom:0px; margin-left:0px; margin-right:0px;
-qt-block-indent:0;text-indent:0px; -qt-user-state:0;">> > problem, or are we overlooking something? <p style="
margin-top:0px;margin-bottom:0px; margin-left:0px; margin-right:0px; -qt-block-indent:0; text-indent:0px;
-qt-user-state:0;">>> The problem is not really immediate, as the dataset is currently small<p style="
margin-top:0px;margin-bottom:0px; margin-left:0px; margin-right:0px; -qt-block-indent:0; text-indent:0px;
-qt-user-state:0;">>> enough to fit on one machine<p style=" margin-top:0px; margin-bottom:0px; margin-left:0px;
margin-right:0px;-qt-block-indent:0; text-indent:0px; -qt-user-state:0;"> (and replicated to a second), just want to<p
style="margin-top:0px; margin-bottom:0px; margin-left:0px; margin-right:0px; -qt-block-indent:0; text-indent:0px;
-qt-user-state:0;">>> be future proof, and get this solved before the problems start.<p style=" margin-top:0px;
margin-bottom:0px;margin-left:0px; margin-right:0px; -qt-block-indent:0; text-indent:0px; -qt-user-state:0;">> <p
style="margin-top:0px; margin-bottom:0px; margin-left:0px; margin-right:0px; -qt-block-indent:0; text-indent:0px;
-qt-user-state:0;">><p style=" margin-top:0px; margin-bottom:0px; margin-left:0px; margin-right:0px;
-qt-block-indent:0;text-indent:0px; -qt-user-state:0;">> The problems I see with distributing your data across<p
style="margin-top:0px; margin-bottom:0px; margin-left:0px; margin-right:0px; -qt-block-indent:0; text-indent:0px;
-qt-user-state:0;">>several PostgreSQL clusters is that they become disconnected.<p style=" margin-top:0px;
margin-bottom:0px;margin-left:0px; margin-right:0px; -qt-block-indent:0; text-indent:0px; -qt-user-state:0;">> <p
style="margin-top:0px; margin-bottom:0px; margin-left:0px; margin-right:0px; -qt-block-indent:0; text-indent:0px;
-qt-user-state:0;">>It will become much more difficult to keep them consistent:<p style=" margin-top:0px;
margin-bottom:0px;margin-left:0px; margin-right:0px; -qt-block-indent:0; text-indent:0px; -qt-user-state:0;">> <p
style="margin-top:0px; margin-bottom:0px; margin-left:0px; margin-right:0px; -qt-block-indent:0; text-indent:0px;
-qt-user-state:0;">>You cannot have referential integrity, and if a database restore<p style=" margin-top:0px;
margin-bottom:0px;margin-left:0px; margin-right:0px; -qt-block-indent:0; text-indent:0px; -qt-user-state:0;">> is
needed,you have to make extra provisions that you can restore<p style=" margin-top:0px; margin-bottom:0px;
margin-left:0px;margin-right:0px; -qt-block-indent:0; text-indent:0px; -qt-user-state:0;">> your system to a
consistentstate across all clusters.<p style=" margin-top:0px; margin-bottom:0px; margin-left:0px; margin-right:0px;
-qt-block-indent:0;text-indent:0px; -qt-user-state:0;">> <p style=" margin-top:0px; margin-bottom:0px;
margin-left:0px;margin-right:0px; -qt-block-indent:0; text-indent:0px; -qt-user-state:0;">> You also lose the
abilityto join between tables that are<p style=" margin-top:0px; margin-bottom:0px; margin-left:0px; margin-right:0px;
-qt-block-indent:0;text-indent:0px; -qt-user-state:0;">> distributed across different databases. This can be a<p
style="margin-top:0px; margin-bottom:0px; margin-left:0px; margin-right:0px; -qt-block-indent:0; text-indent:0px;
-qt-user-state:0;">>perdormance problem, particularly in an OLAP scenario.<p style=" margin-top:0px;
margin-bottom:0px;margin-left:0px; margin-right:0px; -qt-block-indent:0; text-indent:0px; -qt-user-state:0;">> <p
style="margin-top:0px; margin-bottom:0px; margin-left:0px; margin-right:0px; -qt-block-indent:0; text-indent:0px;
-qt-user-state:0;">>That's all I can think of at the moment.<p style=" margin-top:0px; margin-bottom:0px;
margin-left:0px;margin-right:0px; -qt-block-indent:0; text-indent:0px; -qt-user-state:0;">> <p style="
margin-top:0px;margin-bottom:0px; margin-left:0px; margin-right:0px; -qt-block-indent:0; text-indent:0px;
-qt-user-state:0;">>Yours,<p style=" margin-top:0px; margin-bottom:0px; margin-left:0px; margin-right:0px;
-qt-block-indent:0;text-indent:0px; -qt-user-state:0;">> Laurenz Albe<p style=" margin-top:0px; margin-bottom:0px;
margin-left:0px;margin-right:0px; -qt-block-indent:0; text-indent:0px; -qt-user-state:0;"><br />-- <p style="
margin-top:0px;margin-bottom:0px; margin-left:0px; margin-right:0px; -qt-block-indent:0; text-indent:0px;
-qt-user-state:0;">RenéRomero Benavides<span style=" font-family:'liberation sans';"> </span><a
href="https://twitter.com/iCodeiExist"><spanstyle=" font-family:'liberation sans'; text-decoration: underline;
color:#0057ae;">@iCodeiExist</span></a><a href="https://twitter.com/PgsqlMx"><span style=" text-decoration: underline;
color:#0057ae;">@PgsqlMx</span></a><pstyle="-qt-paragraph-type:empty; margin-top:0px; margin-bottom:0px;
margin-left:0px;margin-right:0px; -qt-block-indent:0; text-indent:0px; "> <p style=" margin-top:0px; margin-bottom:0px;
margin-left:0px;margin-right:0px; -qt-block-indent:0; text-indent:0px; -qt-user-state:0;">Postgresql Tips en español
parala comunidad de México e Hispanoamérica.<p style=" margin-top:0px; margin-bottom:0px; margin-left:0px;
margin-right:0px;-qt-block-indent:0; text-indent:0px; -qt-user-state:0;"><a href="http://postgresql.org.mx"><span
style="text-decoration: underline; color:#0057ae;">http://postgresql.org.mx</span></a><p
style="-qt-paragraph-type:empty;margin-top:0px; margin-bottom:0px; margin-left:0px; margin-right:0px;
-qt-block-indent:0;text-indent:0px; "> <p style="-qt-paragraph-type:empty; margin-top:0px; margin-bottom:0px;
margin-left:0px;margin-right:0px; -qt-block-indent:0; text-indent:0px; "> <p style="-qt-paragraph-type:empty;
margin-top:0px;margin-bottom:0px; margin-left:0px; margin-right:0px; -qt-block-indent:0; text-indent:0px; "> <p
style="-qt-paragraph-type:empty;margin-top:0px; margin-bottom:0px; margin-left:0px; margin-right:0px;
-qt-block-indent:0;text-indent:0px; "> <p style="-qt-paragraph-type:empty; margin-top:0px; margin-bottom:0px;
margin-left:0px;margin-right:0px; -qt-block-indent:0; text-indent:0px; "> <p style="-qt-paragraph-type:empty;
margin-top:0px;margin-bottom:0px; margin-left:0px; margin-right:0px; -qt-block-indent:0; text-indent:0px; ">  

Re: Streaming replication and sharding

От
Michael Paquier
Дата:
On Tue, Feb 19, 2013 at 6:01 AM, Ren=E9 Romero Benavides <ichbinrene@gmail.=
com
> wrote:

> **
>
> There's also one project called Postgres-XC which seems very promising,
> but I'm kind of wary about its level of maturity.
>
Project page is here: https://sourceforge.net/projects/postgres-xc/
The latest version 1.0 released last year in June lacks of a couple of
features widely used these days like returning or triggers, but the team is
working hard in implementing that for 1.1 planned for April-June this year
with many other things.
--=20
Michael

Re: Streaming replication and sharding

От
"René Romero Benavides"
Дата:
<p style=" margin-top:0px; margin-bottom:0px; margin-left:0px; margin-right:0px; -qt-block-indent:0; text-indent:0px;
-qt-user-state:0;">Lookingforward to that release and best wishes for the project in general.<p style=" margin-top:0px;
margin-bottom:0px;margin-left:0px; margin-right:0px; -qt-block-indent:0; text-indent:0px; -qt-user-state:0;"><br />On
Tuesday,February 19, 2013 01:02:34 PM Michael Paquier wrote:<br /><p style=" margin-top:0px; margin-bottom:0px;
margin-left:40px;margin-right:40px; -qt-block-indent:0; text-indent:0px; -qt-user-state:0;">On Tue, Feb 19, 2013 at
6:01AM, René Romero Benavides <<a href="mailto:ichbinrene@gmail.com"><span style=" text-decoration: underline;
color:#0057ae;">ichbinrene@gmail.com</span></a>>wrote:<br /><p style=" margin-top:0px; margin-bottom:0px;
margin-left:44px;margin-right:40px; -qt-block-indent:0; text-indent:0px; -qt-user-state:0;">There's also one project
calledPostgres-XC which seems very promising, but I'm kind of wary about its level of maturity.<p style="
margin-top:0px;margin-bottom:0px; margin-left:40px; margin-right:40px; -qt-block-indent:0; text-indent:0px;
-qt-user-state:0;">Projectpage is here: <a href="https://sourceforge.net/projects/postgres-xc/"><span style="
text-decoration:underline; color:#0057ae;">https://sourceforge.net/projects/postgres-xc/</span></a><br />The latest
version1.0 released last year in June lacks of a couple of features widely used these days like returning or triggers,
butthe team is working hard in implementing that for 1.1 planned for April-June this year with many other things.<br
/><pstyle=" margin-top:0px; margin-bottom:0px; margin-left:0px; margin-right:0px; -qt-block-indent:0; text-indent:0px;
-qt-user-state:0;">--<br />Michael<br /><p style=" margin-top:0px; margin-bottom:0px; margin-left:0px;
margin-right:0px;-qt-block-indent:0; text-indent:0px; -qt-user-state:0;"><br /><br /><br />-- <p style="
margin-top:0px;margin-bottom:0px; margin-left:0px; margin-right:0px; -qt-block-indent:0; text-indent:0px;
-qt-user-state:0;">RenéRomero Benavides<span style=" font-family:'liberation sans';"> </span><a
href="https://twitter.com/iCodeiExist"><spanstyle=" font-family:'liberation sans'; text-decoration: underline;
color:#0057ae;">@iCodeiExist</span></a><a href="https://twitter.com/PgsqlMx"><span style=" text-decoration: underline;
color:#0057ae;">@PgsqlMx</span></a><pstyle="-qt-paragraph-type:empty; margin-top:0px; margin-bottom:0px;
margin-left:0px;margin-right:0px; -qt-block-indent:0; text-indent:0px; "> <p style=" margin-top:0px; margin-bottom:0px;
margin-left:0px;margin-right:0px; -qt-block-indent:0; text-indent:0px; -qt-user-state:0;">Postgresql Tips en español
parala comunidad de México e Hispanoamérica.<p style=" margin-top:0px; margin-bottom:0px; margin-left:0px;
margin-right:0px;-qt-block-indent:0; text-indent:0px; -qt-user-state:0;"><a href="http://postgresql.org.mx"><span
style="text-decoration: underline; color:#0057ae;">http://postgresql.org.mx</span></a><p
style="-qt-paragraph-type:empty;margin-top:0px; margin-bottom:0px; margin-left:0px; margin-right:0px;
-qt-block-indent:0;text-indent:0px; "> <p style="-qt-paragraph-type:empty; margin-top:0px; margin-bottom:0px;
margin-left:0px;margin-right:0px; -qt-block-indent:0; text-indent:0px; "> <p style="-qt-paragraph-type:empty;
margin-top:0px;margin-bottom:0px; margin-left:0px; margin-right:0px; -qt-block-indent:0; text-indent:0px; "> <p
style="-qt-paragraph-type:empty;margin-top:0px; margin-bottom:0px; margin-left:0px; margin-right:0px;
-qt-block-indent:0;text-indent:0px; "> <p style="-qt-paragraph-type:empty; margin-top:0px; margin-bottom:0px;
margin-left:0px;margin-right:0px; -qt-block-indent:0; text-indent:0px; "> <p style="-qt-paragraph-type:empty;
margin-top:0px;margin-bottom:0px; margin-left:0px; margin-right:0px; -qt-block-indent:0; text-indent:0px; ">