Re: BUG #15121: Multiple UBSAN errors

From: Tomas Vondra
Subject: Re: BUG #15121: Multiple UBSAN errors
Date:
Msg-id: 7df78014-c931-6c75-6f92-ec48f390b2b9@2ndquadrant.com
In response to: BUG #15121: Multiple UBSAN errors  (PG Bug reporting form <noreply@postgresql.org>)
Responses: Re: BUG #15121: Multiple UBSAN errors
List: pgsql-bugs
On 03/18/2018 08:59 PM, PG Bug reporting form wrote:
> The following bug has been logged on the website:
> 
> Bug reference:      15121
> Logged by:          Martin Liška
> Email address:      marxin.liska@gmail.com
> PostgreSQL version: 10.3
> Operating system:   Linux
> Description:        
> 
> Building current trunk with -fsanitize=undefined I see following errors with
> make check:
> 
> clog.c:299:3: runtime error: null pointer passed as argument 1, which is
> declared to never be null
>     #0 0x65c865 in TransactionIdSetPageStatus
> /home/marxin/Programming/postgres/src/backend/access/transam/clog.c:299
>     #1 0x65c4a5 in TransactionIdSetTreeStatus
> /home/marxin/Programming/postgres/src/backend/access/transam/clog.c:190
>     #2 0x680830 in TransactionIdCommitTree
> /home/marxin/Programming/postgres/src/backend/access/transam/transam.c:262
>     #3 0x68d47d in RecordTransactionCommit
> /home/marxin/Programming/postgres/src/backend/access/transam/xact.c:1290
>     #4 0x68f1fd in CommitTransaction
> /home/marxin/Programming/postgres/src/backend/access/transam/xact.c:2037
>     #5 0x6908cd in CommitTransactionCommand
> /home/marxin/Programming/postgres/src/backend/access/transam/xact.c:2768
>     #6 0x6e297f in BootstrapModeMain
> /home/marxin/Programming/postgres/src/backend/bootstrap/bootstrap.c:515
>     #7 0x6e275f in AuxiliaryProcessMain
> /home/marxin/Programming/postgres/src/backend/bootstrap/bootstrap.c:434
>     #8 0xc1964c in main
> /home/marxin/Programming/postgres/src/backend/main/main.c:220
>     #9 0x7ffff635ca86 in __libc_start_main (/lib64/libc.so.6+0x21a86)
>     #10 0x4863d9 in _start
> (/home/marxin/Programming/postgres/tmp_install/usr/local/pgsql/bin/postgres+0x4863d9)
> 

Not sure what this is: the line numbers don't seem to match the sources
I have, so presumably they're shifted somehow. That makes it hard to say
which pointer it is complaining about ...

> relcache.c:5932:6: runtime error: null pointer passed as argument 1, which
> is declared to never be null
>     #0 0x140aa86 in write_item
> /home/marxin/Programming/postgres/src/backend/utils/cache/relcache.c:5932
>     #1 0x140a2e2 in write_relcache_init_file
> /home/marxin/Programming/postgres/src/backend/utils/cache/relcache.c:5837
>     #2 0x13f7a63 in RelationCacheInitializePhase3
> /home/marxin/Programming/postgres/src/backend/utils/cache/relcache.c:3887
>     #3 0x14612a5 in InitPostgres
> /home/marxin/Programming/postgres/src/backend/utils/init/postinit.c:997
>     #4 0x104661a in PostgresMain
> /home/marxin/Programming/postgres/src/backend/tcop/postgres.c:3777
>     #5 0xc19777 in main
> /home/marxin/Programming/postgres/src/backend/main/main.c:224
>     #6 0x7ffff635ca86 in __libc_start_main (/lib64/libc.so.6+0x21a86)
>     #7 0x4863d9 in _start
> (/home/marxin/Programming/postgres/tmp_install/usr/local/pgsql/bin/postgres+0x4863d9)
> 

This is apparently because we call write_item() like this:

    /* next, do the access method specific field */
    write_item(rel->rd_options,
               (rel->rd_options ? VARSIZE(rel->rd_options) : 0),
               fp);

and it then does this:

    static void
    write_item(const void *data, Size len, FILE *fp)
    {
        if (fwrite(&len, 1, sizeof(len), fp) != sizeof(len))
            elog(FATAL, "could not write init file");
        if (fwrite(data, 1, len, fp) != len)
            elog(FATAL, "could not write init file");
    }

So the second fwrite call may end up as "fwrite(NULL, 1, 0, fp)", i.e. it
writes 0 bytes from a NULL pointer. I guess that should work fine, because
it does not need to access the pointer at all.

I don't know where the "declared to never be null" comes from.
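
If we wanted to silence this report, one option would be to simply not
call fwrite() for empty items. A minimal sketch of that idea follows; it
is not a proposed patch. write_item_guarded is a hypothetical name, it
uses size_t instead of Size, and it returns an error code instead of
calling elog(FATAL, ...) so that it compiles outside the backend.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/*
 * Hypothetical variant of write_item(): returns 0 on success and -1 on a
 * short write (the real function calls elog(FATAL, ...) instead).
 */
static int
write_item_guarded(const void *data, size_t len, FILE *fp)
{
    /* A NULL payload is only acceptable when there is nothing to write. */
    assert(data != NULL || len == 0);

    /* The length word is always written, even for empty items. */
    if (fwrite(&len, 1, sizeof(len), fp) != sizeof(len))
        return -1;

    /* Skip the second call for empty items, so NULL never reaches fwrite(). */
    if (len > 0 && fwrite(data, 1, len, fp) != len)
        return -1;

    return 0;
}
```

The file contents are the same as before: a zero-length item still
consists of just its length word.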

> pg_crc32c_sse42.c:37:18: runtime error: load of misaligned address
> 0x7fffffffd484 for type 'const uint64', which requires 8 byte alignment
> 0x7fffffffd484: note: pointer points here
>   c0 d4 ff ff 01 00 00 00  7f 06 00 00 09 00 00 00  b3 ee bd f7 b3 0a 02 00 
> cf 10 32 01 00 00 00 80
>               ^ 
>     #0 0x153f045 in pg_comp_crc32c_sse42
> /home/marxin/Programming/postgres/src/port/pg_crc32c_sse42.c:37
>     #1 0x6ca43d in XLogRecordAssemble
> /home/marxin/Programming/postgres/src/backend/access/transam/xloginsert.c:780
>     #2 0x6c8d6f in XLogInsert
> /home/marxin/Programming/postgres/src/backend/access/transam/xloginsert.c:459
>     #3 0x6997bb in XactLogCommitRecord
> /home/marxin/Programming/postgres/src/backend/access/transam/xact.c:5370
>     #4 0x68d3c0 in RecordTransactionCommit
> /home/marxin/Programming/postgres/src/backend/access/transam/xact.c:1225
>     #5 0x68f1fd in CommitTransaction
> /home/marxin/Programming/postgres/src/backend/access/transam/xact.c:2037
>     #6 0x6908cd in CommitTransactionCommand
> /home/marxin/Programming/postgres/src/backend/access/transam/xact.c:2768
>     #7 0x104442d in finish_xact_command
> /home/marxin/Programming/postgres/src/backend/tcop/postgres.c:2498
>     #8 0x104052a in exec_simple_query
> /home/marxin/Programming/postgres/src/backend/tcop/postgres.c:1145
>     #9 0x1046bf1 in PostgresMain
> /home/marxin/Programming/postgres/src/backend/tcop/postgres.c:4144
>     #10 0xc19777 in main
> /home/marxin/Programming/postgres/src/backend/main/main.c:224
>     #11 0x7ffff635ca86 in __libc_start_main (/lib64/libc.so.6+0x21a86)
>     #12 0x4863d9 in _start
> (/home/marxin/Programming/postgres/tmp_install/usr/local/pgsql/bin/postgres+0x4863d9)
> 

This comes from the following call in pg_comp_crc32c_sse42():

    crc = (uint32) _mm_crc32_u64(crc, *((const uint64 *) p));

and it's explained in the comment right above it:

    /*
     * Process eight bytes of data at a time.
     *
     * NB: We do unaligned accesses here. The Intel architecture allows
     * that, and performance testing didn't show any performance gain
     * from aligning the begin address.
     */

So, not a bug.
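
For comparison, this is roughly what a load that does not trip the
alignment check could look like: copy the eight bytes into a local
variable with memcpy() instead of dereferencing a cast pointer. This is
only a sketch. load_u64_unaligned is a hypothetical helper, not
something in the tree, and I have not measured whether it changes the
generated code or the performance.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/*
 * Hypothetical helper: read eight bytes starting at p, which may have any
 * alignment, and return them as a uint64_t in host byte order.
 */
static uint64_t
load_u64_unaligned(const unsigned char *p)
{
    uint64_t    v;

    assert(p != NULL);

    /* memcpy() has no alignment requirement on its source pointer. */
    memcpy(&v, p, sizeof(v));
    return v;
}
```

The call above would then pass load_u64_unaligned(p) as the second
argument instead of *((const uint64 *) p).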

> 
> arrayfuncs.c:3740:17: runtime error: member access within misaligned address
> 0x0000028b937c for type 'struct ExpandedObjectHeader', which requires 8 byte
> alignment
> 0x0000028b937c: note: pointer points here
>   6f 6f 00 00 80 02 00 00  01 00 00 00 00 00 00 00  19 00 00 00 08 00 00 00 
> 01 00 00 00 40 00 00 00
>               ^ 
>     #0 0x10d22b0 in array_cmp
> /home/marxin/Programming/postgres/src/backend/utils/adt/arrayfuncs.c:3740
>     #1 0x10d208a in btarraycmp
> /home/marxin/Programming/postgres/src/backend/utils/adt/arrayfuncs.c:3724
>     #2 0x14d7fd8 in comparison_shim
> /home/marxin/Programming/postgres/src/backend/utils/sort/sortsupport.c:53
>     #3 0x8f6bcb in ApplySortComparator
> ../../../src/include/utils/sortsupport.h:225
>     #4 0x9079c7 in compare_scalars
> /home/marxin/Programming/postgres/src/backend/commands/analyze.c:2855
>     #5 0x153d1e6 in qsort_arg
> /home/marxin/Programming/postgres/src/port/qsort_arg.c:140
>     #6 0x904cfa in compute_scalar_stats
> /home/marxin/Programming/postgres/src/backend/commands/analyze.c:2412
>     #7 0x10ed240 in compute_array_stats
> /home/marxin/Programming/postgres/src/backend/utils/adt/array_typanalyze.c:250
>     #8 0x8f990f in do_analyze_rel
> /home/marxin/Programming/postgres/src/backend/commands/analyze.c:579
>     #9 0x8f7c9f in analyze_rel
> /home/marxin/Programming/postgres/src/backend/commands/analyze.c:310
>     #10 0xa7e1bb in vacuum
> /home/marxin/Programming/postgres/src/backend/commands/vacuum.c:357
>     #11 0xa7d925 in ExecVacuum
> /home/marxin/Programming/postgres/src/backend/commands/vacuum.c:141
>     #12 0x104f38e in standard_ProcessUtility
> /home/marxin/Programming/postgres/src/backend/tcop/utility.c:667
>     #13 0x104e364 in ProcessUtility
> /home/marxin/Programming/postgres/src/backend/tcop/utility.c:358
>     #14 0x104c6d2 in PortalRunUtility
> /home/marxin/Programming/postgres/src/backend/tcop/pquery.c:1178
>     #15 0x104cca6 in PortalRunMulti
> /home/marxin/Programming/postgres/src/backend/tcop/pquery.c:1324
>     #16 0x104afc0 in PortalRun
> /home/marxin/Programming/postgres/src/backend/tcop/pquery.c:799
>     #17 0x1040463 in exec_simple_query
> /home/marxin/Programming/postgres/src/backend/tcop/postgres.c:1120
>     #18 0x1046bf1 in PostgresMain
> /home/marxin/Programming/postgres/src/backend/tcop/postgres.c:4144
>     #19 0xc19777 in main
> /home/marxin/Programming/postgres/src/backend/main/main.c:224
>     #20 0x7ffff635ca86 in __libc_start_main (/lib64/libc.so.6+0x21a86)
>     #21 0x4863d9 in _start
> (/home/marxin/Programming/postgres/tmp_install/usr/local/pgsql/bin/postgres+0x4863d9)
> 


Again, the line numbers don't really match the code I have, but I guess
it's the same kind of issue as in pg_comp_crc32c_sse42. This is apparently
related to array serialization: I guess we have a compact structure
(intentionally, to make it smaller), and we accept the unaligned access.

> print.c:916:4: runtime error: null pointer passed as argument 1, which is
> declared to never be null
>     #0 0x4904da in print_aligned_text
> /home/marxin/Programming/postgres/src/fe_utils/print.c:916
>     #1 0x4a0ca2 in printTable
> /home/marxin/Programming/postgres/src/fe_utils/print.c:3235
>     #2 0x4a171f in printQuery
> /home/marxin/Programming/postgres/src/fe_utils/print.c:3347
>     #3 0x414286 in PrintQueryTuples
> /home/marxin/Programming/postgres/src/bin/psql/common.c:890
>     #4 0x414d6f in PrintQueryResults
> /home/marxin/Programming/postgres/src/bin/psql/common.c:1224
>     #5 0x41559d in SendQuery
> /home/marxin/Programming/postgres/src/bin/psql/common.c:1408
>     #6 0x4356c6 in MainLoop
> /home/marxin/Programming/postgres/src/bin/psql/mainloop.c:431
>     #7 0x40d248 in process_file
> /home/marxin/Programming/postgres/src/bin/psql/command.c:3563
>     #8 0x44c8f8 in main
> /home/marxin/Programming/postgres/src/bin/psql/startup.c:375
>     #9 0x7ffff5feca86 in __libc_start_main (/lib64/libc.so.6+0x21a86)
>     #10 0x4048f9 in _start
> (/home/marxin/Programming/postgres/tmp_install/usr/local/pgsql/bin/psql+0x4048f9)
> 

No idea, line numbers shifted again. My guess would be something like
the fwrite() report, but this time with fputs(). Not sure which of the
calls, though.

regards

-- 
Tomas Vondra                  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

