announce: spark-postgres 3 released

From	Nicolas Paris
Subject	announce: spark-postgres 3 released
Date	11 November 2019, 03:05:36
Msg-id	20191111000536.4vuo3wlmqkv3wojd@riseup.net
Replies	Re: announce: spark-postgres 3 released
List	pgsql-general

Hello postgres users,

Spark-postgres is designed for reliable and performant ETL in big-data
workloads and offers read/write/SCD capabilities to better bridge Spark
and Postgres. Version 3 introduces a datasource API. It outperforms
Sqoop by a factor of 8 and the Apache Spark core JDBC by infinity.

Features:
- use of Postgres COPY statements
- parallel reads/writes
- use of HDFS to store intermediate CSV files
- reindex after bulk-loading
- SCD1 computations done on the Spark side
- use of unlogged tables when needed
- handling of arrays and multiline string columns
- useful JDBC functions (DDL, updates...)
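
To make the SCD1 item concrete: in a type 1 slowly changing dimension,
changed rows are overwritten in place and unseen keys are inserted, with
no history kept. The following is only a minimal plain-Python sketch of
that merge rule. It is not the spark-postgres API, and the function name
scd1_merge, the key column "id" and the sample rows are assumptions made
up for illustration.

```python
def scd1_merge(target, incoming, key):
    """Apply a type 1 slowly changing dimension merge.

    Rows are dicts. A row whose key is new gets inserted; a row whose key
    exists but whose content differs overwrites the old one (no history).
    Returns (merged_rows, inserted_keys, updated_keys).
    """
    # Index the current table by key, copying rows so inputs stay untouched.
    merged = {row[key]: dict(row) for row in target}
    inserted, updated = [], []
    for row in incoming:
        k = row[key]
        if k not in merged:
            inserted.append(k)
            merged[k] = dict(row)
        elif merged[k] != row:
            updated.append(k)
            merged[k] = dict(row)
        # Identical rows are left alone: nothing to write.
    return list(merged.values()), inserted, updated


# Hypothetical sample data.
target = [{"id": 1, "city": "Paris"}, {"id": 2, "city": "Lyon"}]
incoming = [{"id": 2, "city": "Nice"}, {"id": 3, "city": "Lille"}]

merged, inserted, updated = scd1_merge(target, incoming, "id")
print(inserted, updated)
```

Splitting the incoming rows into inserts and updates before touching the
database means only the rows that actually changed have to be written.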

The official repository:
https://framagit.org/parisni/spark-etl/tree/master/spark-postgres

And its mirror on Microsoft GitHub:
https://github.com/EDS-APHP/spark-etl/tree/master/spark-postgres

-- 
nicolas
