Selective Replication help

Поиск

Список

Период

Сортировка

От	Hailey Eckstrand
Тема	Selective Replication help
Дата	28 мая 2015 г. 22:13:42
Msg-id	CABCgRVEzwwzyeo0kXCTSJs0_geaQ+EUvTDM_deyOZfj-o4EmrQ@mail.gmail.com обсуждение исходный текст
Ответы	Re: Selective Replication help
Список	pgsql-novice

Дерево обсуждения

I have one local master database and 4 remote slave databases. Every hour I send, around 1000 rows from the master to each of the slaves (4 columns). For context, my master database is where I do web scraping. After the data is scraped, I send it out to the slave databases.

Of the 1000 rows, there are some duplicates in the primary key of the slave databases so part of the push involves deleting the primary key overlaps and then inserting all of the data.

On the master database, the data I send to the slaves is compiled from a view which is a subset of 5 tables. The reason I am explaining this is because it is not a simple master table, slave table replication. The data I send out is from a view of multiple tables in the master and inserted into one table on the slave.

Currently, the way I push the data from local to remote is,

I've created an SQL function which operates on a slave database table (via a foreign data wrapper) that checks if the row exists and if it does, it updates the values and if it doesn't exist, it inserts the new row.

I find that this is very slow and I'm guessing inefficient but I'm not very good at testing that. The amount of rows that I'm sending out is about to increase to 10,000 and I'd like to look into other solutions.

Option 1) In my research, I've found the function 'dblink_build_sql_insert' that states it 'can be useful in doing selective replication of a local table to a remote database' which sounds like what I'm doing but I've not found any examples online.

Option 2) I've also been looking into a master-slave replication software called bucardo but it works table to table and so if I did that.. I think I would have to create local versions of the slave table (which would be storing duplicated data), update them by triggers (when new data was inserted into the 5 tables via web scraping scripts) and then set up replication on those tables to the slave databases. However, with replication comes a steep learning curve. I'm not sure if what I'm trying to do is complicated enough to warrant replication.

Can anyone suggest other options which would be fast and perhaps more efficient or let me know if 1 or 2 are a good idea?

I know enough python and php to get by and can figure out triggers. I am not amazing at plpgsql but could figure it out.

Thank you,

Hailey

В списке pgsql-novice по дате отправления:

Вход в личный кабинет

Восстановление пароля

Подтверждение аккаунта

Изменение пароля

Selective Replication help