Re: new heapcheck contrib module

Поиск
Список
Период
Сортировка
От Amul Sul
Тема Re: new heapcheck contrib module
Дата
Msg-id CAAJ_b94Wh_7kM9JuJk7wnNsYSNPdSzz35cOEaE_ZRnPcqvupCg@mail.gmail.com
обсуждение исходный текст
Ответ на Re: new heapcheck contrib module  (Amul Sul <sulamul@gmail.com>)
Ответы Re: new heapcheck contrib module  (Mark Dilger <mark.dilger@enterprisedb.com>)
Список pgsql-hackers
Few comments for v14 version:

v14-0001:

verify_heapam.c: In function ‘verify_heapam’:
verify_heapam.c:339:14: warning: variable ‘ph’ set but not used
[-Wunused-but-set-variable]
   PageHeader ph;
              ^
verify_heapam.c: In function ‘check_toast_tuple’:
verify_heapam.c:877:8: warning: variable ‘chunkdata’ set but not used
[-Wunused-but-set-variable]
  char    *chunkdata;

Got these compilation warnings


+++ b/contrib/amcheck/amcheck.h
@@ -0,0 +1,5 @@
+#include "postgres.h"
+
+Datum verify_heapam(PG_FUNCTION_ARGS);
+Datum bt_index_check(PG_FUNCTION_ARGS);
+Datum bt_index_parent_check(PG_FUNCTION_ARGS);

bt_index_* are needed?


#include "access/htup_details.h"
#include "access/xact.h"
#include "catalog/pg_type.h"
#include "catalog/storage_xlog.h"
#include "storage/smgr.h"
#include "utils/lsyscache.h"
#include "utils/rel.h"
#include "utils/snapmgr.h"
#include "utils/syscache.h"

These header file inclusion to verify_heapam.c. can be omitted. Some of those
might be implicitly got added by other header files or no longer need due to
recent changes.


+ *   on_error_stop:
+ *     Whether to stop at the end of the first page for which errors are
+ *     detected.  Note that multiple rows may be returned.
+ *
+ *   check_toast:
+ *     Whether to check each toasted attribute against the toast table to
+ *     verify that it can be found there.
+ *
+ *   skip:
+ *     What kinds of pages in the heap relation should be skipped.  Valid
+ *     options are "all-visible", "all-frozen", and "none".

I think it would be good if the description also includes what will be default
value otherwise.


+ /*
+ * Optionally open the toast relation, if any, also protected from
+ * concurrent vacuums.
+ */

Now lock is changed to AccessShareLock, I think we need to rephrase this comment
as well since we are not really doing anything extra explicitly to protect from
the concurrent vacuum.


+/*
+ * Return wehter a multitransaction ID is in the cached valid range.
+ */

Typo: s/wehter/whether


v14-0002:

+#define NOPAGER 0

Unused macro.


+ appendPQExpBuffer(querybuf,
+   "SELECT c.relname, v.blkno, v.offnum, v.attnum, v.msg"
+   "\nFROM public.verify_heapam("
+ "\nrelation := %u,"
+ "\non_error_stop := %s,"
+ "\nskip := %s,"
+ "\ncheck_toast := %s,"
+ "\nstartblock := %s,"
+ "\nendblock := %s) v, "
+ "\npg_catalog.pg_class c"
+   "\nWHERE c.oid = %u",
+   tbloid, stop, skip, toast, startblock, endblock, tbloid);
[....]
+ appendPQExpBuffer(querybuf,
+   "SELECT public.bt_index_parent_check('%s'::regclass, %s, %s)",
+   idxoid,
+   settings.heapallindexed ? "true" : "false",
+   settings.rootdescend ? "true" : "false");

The assumption that the amcheck extension will be always installed in the public
schema doesn't seem to be correct. This will not work if amcheck install
somewhere else.

Regards,
Amul




On Thu, Aug 20, 2020 at 5:17 PM Amul Sul <sulamul@gmail.com> wrote:
>
> On Thu, Aug 20, 2020 at 8:00 AM Mark Dilger
> <mark.dilger@enterprisedb.com> wrote:
> >
> >
> >
> > > On Aug 16, 2020, at 9:37 PM, Amul Sul <sulamul@gmail.com> wrote:
> > >
> > > In addition to this, I found a few more things while reading v13 patch are as
> > > below:
> > >
> > > Patch v13-0001:
> > >
> > > -
> > > +#include "amcheck.h"
> > >
> > > Not in correct order.
> >
> > Fixed.
> >
> > > +typedef struct BtreeCheckContext
> > > +{
> > > + TupleDesc tupdesc;
> > > + Tuplestorestate *tupstore;
> > > + bool is_corrupt;
> > > + bool on_error_stop;
> > > +} BtreeCheckContext;
> > >
> > > Unnecessary spaces/tabs between } and BtreeCheckContext.
> >
> > This refers to a change in verify_nbtree.c that has been removed.  Per discussions with Peter and Robert, I have
simplywithdrawn that portion of the patch. 
> >
> > > static void bt_index_check_internal(Oid indrelid, bool parentcheck,
> > > - bool heapallindexed, bool rootdescend);
> > > + bool heapallindexed, bool rootdescend,
> > > + BtreeCheckContext * ctx);
> > >
> > > Unnecessary space between * and ctx. The same changes needed for other places as
> > > well.
> >
> > Same as above.  The changes to verify_nbtree.c have been withdrawn.
> >
> > > ---
> > >
> > > Patch v13-0002:
> > >
> > > +-- partitioned tables (the parent ones) don't have visibility maps
> > > +create table test_partitioned (a int, b text default repeat('x', 5000))
> > > + partition by list (a);
> > > +-- these should all fail
> > > +select * from verify_heapam('test_partitioned',
> > > + on_error_stop := false,
> > > + skip := NULL,
> > > + startblock := NULL,
> > > + endblock := NULL);
> > > +ERROR:  "test_partitioned" is not a table, materialized view, or TOAST table
> > > +create table test_partition partition of test_partitioned for values in (1);
> > > +create index test_index on test_partition (a);
> > >
> > > Can't we make it work? If the input is partitioned, I think we could
> > > collect all its leaf partitions and process them one by one. Thoughts?
> >
> > I was following the example from pg_visibility.  I haven't thought about your proposal enough to have much opinion
asyet, except that if we do this for pg_amcheck we should do likewise to pg_visibility, for consistency of the user
interface.
> >
>
> pg_visibility does exist from before the declarative partitioning came
> in, I think it's time to improve that as well.
>
> > > + ctx->chunkno++;
> > >
> > > Instead of incrementing  in check_toast_tuple(), I think incrementing should
> > > happen at the caller  -- just after check_toast_tuple() call.
> >
> > I agree.
> >
> > > ---
> > >
> > > Patch v13-0003:
> > >
> > > + resetPQExpBuffer(query);
> > > + destroyPQExpBuffer(query);
> > >
> > > resetPQExpBuffer() will be unnecessary if the next call is destroyPQExpBuffer().
> >
> > Thanks.  I removed it in cases where destroyPQExpBuffer is obviously the very next call.
> >
> > > + appendPQExpBuffer(query,
> > > +   "SELECT c.relname, v.blkno, v.offnum, v.lp_off, "
> > > +   "v.lp_flags, v.lp_len, v.attnum, v.chunk, v.msg"
> > > +   "\nFROM verify_heapam(rel := %u, on_error_stop := %s, "
> > > +   "skip := %s, startblock := %s, endblock := %s) v, "
> > > +   "pg_class c"
> > > +   "\nWHERE c.oid = %u",
> > > +   tbloid, stop, skip, settings.startblock,
> > > +   settings.endblock, tbloid
> > >
> > > pg_class should be schema-qualified like elsewhere.
> >
> > Agreed, and changed.
> >
> > > IIUC, pg_class is meant to
> > > get relname only, instead, we could use '%u'::pg_catalog.regclass in the target
> > > list for the relname. Thoughts?
> >
> > get_table_check_list() creates the list of all tables to be checked, which check_tables() then iterates over,
callingcheck_table() for each one.  I think some verification that the table still exists is in order.  Using
'%u'::pg_catalog.regclassfor a table that has since been dropped would pass in the old table Oid and draw an error of
the'ERROR:  could not open relation with OID 36311' variety, whereas the current coding will just skip the dropped
table.
> >
> > > Also I think we should skip '\n' from the query string (see appendPQExpBuffer()
> > > in pg_dump.c)
> >
> > I'm not sure I understand.  pg_dump.c uses "\n" in query strings it passes to appendPQExpBuffer(), in a manner very
similarto what this patch does. 
> >
>
> I see there is a mix of styles, I was referring to dumpDatabase() from pg_dump.c
> which doesn't include '\n'.
>
> > > + appendPQExpBuffer(query,
> > > +   "SELECT i.indexrelid"
> > > +   "\nFROM pg_catalog.pg_index i, pg_catalog.pg_class c"
> > > +   "\nWHERE i.indexrelid = c.oid"
> > > +   "\n    AND c.relam = %u"
> > > +   "\n    AND i.indrelid = %u",
> > > +   BTREE_AM_OID, tbloid);
> > > +
> > > + ExecuteSqlStatement("RESET search_path");
> > > + res = ExecuteSqlQuery(query->data, PGRES_TUPLES_OK);
> > > + PQclear(ExecuteSqlQueryForSingleRow(ALWAYS_SECURE_SEARCH_PATH_SQL));
> > >
> > > I don't think we need the search_path query. The main query doesn't have any
> > > dependencies on it.  Same is in check_indexes(), check_index (),
> > > expand_table_name_patterns() & get_table_check_list().
> > > Correct me if I am missing something.
> >
> > Right.
> >
> > > + output = PageOutput(lines + 2, NULL);
> > > + for (lineno = 0; usage_text[lineno]; lineno++)
> > > + fprintf(output, "%s\n", usage_text[lineno]);
> > > + fprintf(output, "Report bugs to <%s>.\n", PACKAGE_BUGREPORT);
> > > + fprintf(output, "%s home page: <%s>\n", PACKAGE_NAME, PACKAGE_URL);
> > >
> > > I am not sure why we want PageOutput() if the second argument is always going to
> > > be NULL? Can't we directly use printf() instead of PageOutput() + fprintf() ?
> > > e.g. usage() function in pg_basebackup.c.
> >
> > Done.
> >
> >
> > Please find attached the next version of the patch.  In addition to your review comments (above), I have made
changesin response to Peter and Robert's review comments upthread. 
>
> Thanks for the updated version, I'll have a look.
>
> Regards,
> Amul



В списке pgsql-hackers по дате отправления:

Предыдущее
От: Greg Nancarrow
Дата:
Сообщение: Issue with past commit: Allow fractional input values for integer GUCs ...
Следующее
От: Bharath Rupireddy
Дата:
Сообщение: Avoid displaying unnecessary "Recheck Cond" in EXPLAIN ANALYZE output if the bitmap is non-lossy