Re: [SQL] avoid update - insert churn

Поиск
Список
Период
Сортировка
От Steve Midgley
Тема Re: [SQL] avoid update - insert churn
Дата
Msg-id CAJexoSLeeZbBF8ZiV-i+DYNW_wff=mnCkGQZhFCwgMkKxBKOkA@mail.gmail.com
обсуждение исходный текст
Ответ на Re: [SQL] avoid update - insert churn  (Kevin Duffy <kevind0718@gmail.com>)
Список pgsql-sql


On Sun, Mar 5, 2017 at 12:07 PM, Kevin Duffy <kevind0718@gmail.com> wrote:
"PostgreSQL 9.5.4, compiled by Visual C++ build 1800, 64-bit"

some examples here:  http://www.postgresqltutorial.com/postgresql-upsert/

many thanks for your swift reply.

KD

On Sun, Mar 5, 2017 at 2:56 PM, Adrian Klaver <adrian.klaver@aklaver.com> wrote:
On 03/05/2017 10:20 AM, Kevin Duffy wrote:

Hello All:

I am looking for suggestions on how to optimize a relatively simple problem.

Say I have a group of shoe stores.  The store report sales daily and
daily I
calc stats off on the sales numbers.  Ie rolling averages & volatility
numbers, with a couple of different look backs.

I am storing the raw sales figures in one table and the calc'ed stats in
another.

Here is the interesting part,  it is possible and it does happen that a
store will not
report sales for a day or two or prior figures reported will get revised.
There is at max a five day window on this.

For the prior dates there are two possibilities, I will be updating
a record or inserting a new record in sales table.  For the current
date's sales it will be all inserts.
Currently what I do is loop through the data by store and shoe style
attempt an update and if that fails
do an insert.

Same deal for the stat's table except there are multiple records for
each Store/Shoe style pair.


So here is my question/issue:  Is this there a better design pattern for
doing this?
I could gather up the data for the prior dates and do one "bulk"
update.  But that would
fail for sales that have not been reported todate.
The current date's sales are of course an insert.  And I could fix this
to be one "bulk" insert.

Was just chewing this over in my heading wondering if there is a more
efficient way to this.
Other than attempt update if fail insert.  Seems to be a bit of churn to me.

You do not say what version of Postgres you are using, but if 9.5+

https://www.postgresql.org/docs/9.5/static/sql-insert.html

"ON CONFLICT Clause

The optional ON CONFLICT clause specifies an alternative action to raising a unique violation or exclusion constraint violation error. For each individual row proposed for insertion, either the insertion proceeds, or, if an arbiter constraint or index specified by conflict_target is violated, the alternative conflict_action is taken. ON CONFLICT DO NOTHING simply avoids inserting a row as its alternative action. ON CONFLICT DO UPDATE updates the existing row that conflicts with the row proposed for insertion as its alternative action."

It wraps the INSERT/UPDATE as an UPSERT in one command.




But at the end of the day, it do believe Keep It Simple.

Many thanks for considering this problem.

KD


This seems more like a data modeling problem than a PG SQL issue per se (though obviously your SQL vocab matters in terms what what/how you implement). If that's right, I'd suggest taking a look at Ralph Kimball's book Data Warehouse Toolkit. It basically deals with practical issues like this in many ways. Your summary table is basically a warehouse fact table aggregated against a couple of dimensions. If you like his approach, you might find that it eliminates the design problems you're facing and improves performance on the underlying calculations as well.

I haven't spent much time doing data warehouse work for more than 10 years, so I'm rusty, but I do remember reading his book and feeling grateful at all the many hours of time he saved me, and countless errors avoided.

Steve

В списке pgsql-sql по дате отправления:

Предыдущее
От: Kevin Duffy
Дата:
Сообщение: Re: [SQL] avoid update - insert churn
Следующее
От: Günce Kaya
Дата:
Сообщение: [SQL] parameter type is unknown error