Re: [GENERAL] is (not) distinct from

Поиск
Список
Период
Сортировка
От Adrian Klaver
Тема Re: [GENERAL] is (not) distinct from
Дата
Msg-id ccbfafd3-2f26-8f11-4a0b-ed7d4723c29e@aklaver.com
обсуждение исходный текст
Ответ на Re: [GENERAL] is (not) distinct from  (Johann Spies <johann.spies@gmail.com>)
Список pgsql-general
On 03/07/2017 01:22 AM, Johann Spies wrote:
> Thanks (again Adrian) and Tom.
>
> The situation was that I had a table with 731million records which I
> wanted to copy into a partitioned one where there was a unique
> constraint on the fields used in my query.
>
> The "backup" table was the single one.
>
> While inserting into the partitioned table from the backup one, several
> (about 120000) records failed to insert.
>
> I wanted to find out which records were involved and found that some had
> "'" characters in the values which broke some of the functions used to
> do some calculations..
>
> As there were fields that might have null values I have tried the "is
> not distinct from".
>
> Both sides of the query had primary keys and I did not use group by.
> That was why I used "distinct".

Would it not be easier to use a LEFT JOIN between the original
table(backup) and the new table:

https://www.postgresql.org/docs/9.6/static/sql-select.html

join_type

"LEFT OUTER JOIN returns all rows in the qualified Cartesian product
(i.e., all combined rows that pass its join condition), plus one copy of
each row in the left-hand table for which there was no right-hand row
that passed the join condition. This left-hand row is extended to the
full width of the joined table by inserting null values for the
right-hand columns. Note that only the JOIN clause's own condition is
considered while deciding which rows have matches. Outer conditions are
applied afterwards."

So something like:

SELECT
    a.pk
FROM
    original_table AS a
LEFT JOIN                    --The OUTER is not required
    new_table AS b
ON
    a.pk = b.pk
WHERE
    b.pk IS NULL

That would show all the rows in the original table that where not
transferred over.

>
> Anyhow in the end, I made some progress with a modified query:
>
> where
> s.citing_article = A.citing_article
>    and
>           s.cited_article !=  A.cited_article
>    and
>    s.pubyear is  distinct from A.pubyear
>    and
>    s.year_cited is distinct from A.year_cited
>    and
>    s.cited_author is distinct from A.cited_author
>    and
>         regexp_replace(s.cited_title,  $$'$$, $$''$$,'g') is distinct from
>         regexp_replace(A.cited_title,  $$'$$, $$''$$,'g')
>    and
>         regexp_replace(s.cited_work,  $$'$$, $$''$$,'g') is distinct
> from regexp_replace(A.cited_work,   $$'$$, $$''$$,'g')
>    and
>         s.doi is distinct from A.doi
>
> Regards.
> Johann
>
> --
> Because experiencing your loyal love is better than life itself,
> my lips will praise you.  (Psalm 63:3)


--
Adrian Klaver
adrian.klaver@aklaver.com


В списке pgsql-general по дате отправления:

Предыдущее
От: Stephen Frost
Дата:
Сообщение: Re: [GENERAL] intentional or oversight? pg_dump -c does not restoredefault priviliges on schema public
Следующее
От: Caleb Cushing
Дата:
Сообщение: Re: [GENERAL] Feature request - psql --quote-variable