Re: [GENERAL] COPY: row is too big

Поиск
Список
Период
Сортировка
От Adrian Klaver
Тема Re: [GENERAL] COPY: row is too big
Дата
Msg-id c0394462-95c3-cc3a-b861-0ef34f96aa0b@aklaver.com
обсуждение исходный текст
Ответ на Re: [GENERAL] COPY: row is too big  (vod vos <vodvos@zoho.com>)
Список pgsql-general
On 01/05/2017 04:44 AM, vod vos wrote:
> I finally figured it out as follows:
>
> 1. modified the corresponding data type of the columns to the csv file
>
> 2. if null values existed, defined the data type to varchar. The null
> values cause problem too.

Did you change the NULLs to something else?

As Pavel said the type does not really matter for NULL:

https://www.postgresql.org/docs/9.6/static/storage-page-layout.html

See marked(<***>) up part

"All table rows are structured in the same way. There is a fixed-size
header (occupying 23 bytes on most machines), followed by an optional
null bitmap, an optional object ID field, and the user data. The header
is detailed in Table 65-4. The actual user data (columns of the row)
begins at the offset indicated by t_hoff, which must always be a
multiple of the MAXALIGN distance for the platform. <***>The null bitmap
is only present if the HEAP_HASNULL bit is set in t_infomask. If it is
present it begins just after the fixed header and occupies enough bytes
to have one bit per data column (that is, t_natts bits altogether). In
this list of bits, a 1 bit indicates not-null, a 0 bit is a null. When
the bitmap is not present, all columns are assumed not-null. <***> The
object ID is only present if the HEAP_HASOID bit is set in t_infomask.
If present, it appears just before the t_hoff boundary. Any padding
needed to make t_hoff a MAXALIGN multiple will appear between the null
bitmap and the object ID. (This in turn ensures that the object ID is
suitably aligned.)"

In this post:

https://www.postgresql.org/message-id/1595fd48444.ba3ec57e13739.3837934651947496063%40zoho.com

you said:

"And some the values in the csv file contain nulls, do this null values
matter?"

It looks like there are a good deal of NULLs in a row. In your original
post COPY failed on the second line, so assuming the same data what is
the NULL count in that line. Or can you provide some estimate of the
high count of NULLS in your data rows?

>
> so 1100 culumns work well now.
>
> This problem wasted me three days. I have lots of csv data to COPY.

You solved the problem so it was not entirely wasted and it provided
information for future reference when folks hit this list with a similar
issue.

>
>
>
>
> ---- On 星期三, 04 一月 2017 08:39:42 -0800 *Adrian Klaver
> <adrian.klaver@aklaver.com>* wrote ----
>
>     On 01/04/2017 08:32 AM, Steve Crawford wrote:
>     > ...
>     >
>     > Numeric is expensive type - try to use float instead, maybe double.
>     >
>     >
>     > If I am following the OP correctly the table itself has all the
>     > columns declared as varchar. The data in the CSV file is a mix of
>     > text, date and numeric, presumably cast to text on entry into the
>     table.
>     >
>     >
>     > But a CSV *is* purely text - no casting to text is needed.
>     Conversion is
>     > only needed when the strings in the CSV are text representations of
>     > *non*-text data.
>
>     Yeah, muddled thinking.
>
>     >
>     > I'm guessing that the OP is using all text fields to deal with
>     possibly
>     > flawed input data and then validating and migrating the data in
>     > subsequent steps. In that case, an ETL solution may be a better
>     > approach. Many options, both open- closed- and hybrid-source exist.
>     >
>     > Cheers,
>     > Steve
>
>
>     --
>     Adrian Klaver
>     adrian.klaver@aklaver.com <mailto:adrian.klaver@aklaver.com>
>
>
>     --
>     Sent via pgsql-general mailing list (pgsql-general@postgresql.org
>     <mailto:pgsql-general@postgresql.org>)
>     To make changes to your subscription:
>     http://www.postgresql.org/mailpref/pgsql-general
>
>


--
Adrian Klaver
adrian.klaver@aklaver.com


В списке pgsql-general по дате отправления:

Предыдущее
От: junior
Дата:
Сообщение: Re: [GENERAL] Queries on very big table
Следующее
От: Tom DalPozzo
Дата:
Сообщение: [GENERAL] requested timeline doesn't contain minimum recovery point