Обсуждение: field error on refreshed materialized view

Поиск
Список
Период
Сортировка

field error on refreshed materialized view

От
Michael Nolan
Дата:
I'm in the process of updating a server from postgresql 10.4 on Centos
7 to 16.1 on AlmaLInux 9, and am reminded of the old line about when
you're up to your necks in alligators, it is difficult to remember
that your goal was to drain the swamp.

We have several materialized views that are refreshed from an external
mysql/mariadb server.

On the 10.4 server this materialized view works, but on the 16.1
server it fails:

uscf=# refresh materialized view uscf_vip;
refresh materialized view uscf_vip;
ERROR: value too long for type character varying(255)

Both materialized view tables appear to be identical.  (We did a
restore from a recent database dump to populate the new server.)

I don't see anything in the log files to help me determine which field
or row is causing the error.  As far as I can tell, encoding is the
same on both versions, but there could be issues in how the mysql
server is coding non-ASCII characters.

Is this something that has to be tracked down on the mysql/mariadb
server or is there anything I can do to help narrow down the problem
on my 16.1 server?
--
Mike Nolan
htfoot@gmail.com



Re: field error on refreshed materialized view

От
Adrian Klaver
Дата:


On 1/25/24 1:29 PM, Michael Nolan wrote:
I'm in the process of updating a server from postgresql 10.4 on Centos
7 to 16.1 on AlmaLInux 9, and am reminded of the old line about when
you're up to your necks in alligators, it is difficult to remember
that your goal was to drain the swamp.

We have several materialized views that are refreshed from an external
mysql/mariadb server.

Define "... refreshed from an external mysql/mariadb server"?


On the 10.4 server this materialized view works, but on the 16.1
server it fails:

uscf=# refresh materialized view uscf_vip;
refresh materialized view uscf_vip;
ERROR: value too long for type character varying(255)


Is there more to the error, that shows the actual value?

Or have you looked in server log to see if there is more information?



Both materialized view tables appear to be identical.  (We did a
restore from a recent database dump to populate the new server.)

I don't see anything in the log files to help me determine which field
or row is causing the error.  As far as I can tell, encoding is the
same on both versions, but there could be issues in how the mysql
server is coding non-ASCII characters.

Is this something that has to be tracked down on the mysql/mariadb
server or is there anything I can do to help narrow down the problem
on my 16.1 server?
--
Mike Nolan
htfoot@gmail.com


Re: field error on refreshed materialized view

От
Tom Lane
Дата:
Michael Nolan <htfoot@gmail.com> writes:
> On the 10.4 server this materialized view works, but on the 16.1
> server it fails:

> uscf=# refresh materialized view uscf_vip;
> refresh materialized view uscf_vip;
> ERROR: value too long for type character varying(255)

> Both materialized view tables appear to be identical.  (We did a
> restore from a recent database dump to populate the new server.)

> I don't see anything in the log files to help me determine which field
> or row is causing the error.  As far as I can tell, encoding is the
> same on both versions, but there could be issues in how the mysql
> server is coding non-ASCII characters.

Yeah, an encoding discrepancy could explain this, by causing the
server to count characters differently than before.  (Note that
we consider varchar(255) to allow 255 characters, not 255 bytes.)

One idea for tracking it down could be to remove or increase all the
varchar limits in the matview declaration, load the data, and then
probe for bad entries with something like "select * from matview where
length(some_varchar_field) > 255".

            regards, tom lane



Re: field error on refreshed materialized view

От
Michael Nolan
Дата:
On Thu, Jan 25, 2024 at 3:38 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:
>
> Michael Nolan <htfoot@gmail.com> writes:
> > On the 10.4 server this materialized view works, but on the 16.1
> > server it fails:
>
> > uscf=# refresh materialized view uscf_vip;
> > refresh materialized view uscf_vip;
> > ERROR: value too long for type character varying(255)
>
> > Both materialized view tables appear to be identical.  (We did a
> > restore from a recent database dump to populate the new server.)
>
> > I don't see anything in the log files to help me determine which field
> > or row is causing the error.  As far as I can tell, encoding is the
> > same on both versions, but there could be issues in how the mysql
> > server is coding non-ASCII characters.
>
> Yeah, an encoding discrepancy could explain this, by causing the
> server to count characters differently than before.  (Note that
> we consider varchar(255) to allow 255 characters, not 255 bytes.)

Thanks, Tom.

Both postgresql databases are using SQL-ASCII and C LC values, I don't
know what the mariadb is using, but I've asked the people who maintain
that server for that information.  I suspect it may be UTF-8 or
something else that supports multi-byte characters.  I don't know if
changing the encoding on the new database server would introduce more
issues than it would solve, but as we're still in testing mode I could
test that.

> One idea for tracking it down could be to remove or increase all the
> varchar limits in the matview declaration, load the data, and then
> probe for bad entries with something like "select * from matview where
> length(some_varchar_field) > 255".

The reasons for the 255 are mostly historical (the original server
this data came from was a COBOL system.)  Increasing or removing the
255 character limits may not cause any new problems.

==
Mike Nolan



Re: field error on refreshed materialized view

От
Michael Nolan
Дата:
Followup:  The problem turned out to be a field in the mysql server
that was not size-limited to 255 characters and had several rows with
as many as 299 characters in them.  Apparently when using an FDW and
materialized view in postgresql 10.4, field size limits aren't
checked, because a query on the matview in the 10.4 system finds those
records with more than 255 characters in that field.

So, not a character encoding issue at all.
--
Mike Nolan



Re: field error on refreshed materialized view

От
Tom Lane
Дата:
Michael Nolan <htfoot@gmail.com> writes:
> Followup:  The problem turned out to be a field in the mysql server
> that was not size-limited to 255 characters and had several rows with
> as many as 299 characters in them.  Apparently when using an FDW and
> materialized view in postgresql 10.4, field size limits aren't
> checked, because a query on the matview in the 10.4 system finds those
> records with more than 255 characters in that field.

> So, not a character encoding issue at all.

Hm ... if the foreign table's field was declared as varchar(255),
older PG might've believed that and not rechecked the length while
storing into the matview.  Not sure if this behavior change was
intentional or an accidental product of refactoring.

            regards, tom lane