Re: [PATCH] Add hint for misspelled relations
| От | Kirill Reshke |
|---|---|
| Тема | Re: [PATCH] Add hint for misspelled relations |
| Дата | |
| Msg-id | CALdSSPgcJ=zueMiR3su7Dh4YE4TSY4dG_F8FDszbbZC2XqPeUg@mail.gmail.com обсуждение исходный текст |
| Ответ на | [PATCH] Add hint for misspelled relations (Steve Chavez <steve@supabase.io>) |
| Список | pgsql-hackers |
On Tue, 2 Dec 2025 at 07:46, Steve Chavez <steve@supabase.io> wrote: > > Hello hackers, > > Currently misspelled columns offer a hint but not misspelled relations. > > This patch enables that, the result is like: > > -- having this table > create table clients (id int); > -- we misspell the table name > select * from cliants; > ERROR: relation "cliants" does not exist > LINE 1: select * from cliants; > HINT: Perhaps you meant to reference the table "public.clients". > > The possible matches are searched in pg_class for the schemas present in search_path (or if the relation is qualified,it only searches matches in that schema). The logic reuses the `varstr_levenshtein_less_equal` function similarto how the column matching is done. > > If there's a tie in the fuzzy match, it's solved by preferring the schema that appears first on the search path. If thatfails, then the lexicographic order is used to break the tie. > > One problem is that scanning all pg_class entries can get expensive on big catalogs, so the number of searches is cappedby MAX_REL_HINT_CANDIDATES. I've set this to 4096 arbitrarily, any guidance on what would be a good number is appreciated.Personally I've seen a catalog that contains 125K tables, with mostly auto generated names. For these casesI don't think the hint helps that much anyway, so it seemed fine to bail here. > > The changes are split into two commits, one refactoring some reusable functions for easier review and another one implementingthe relation hint. > > Any feedback is welcomed. > > Best regards, > Steve Chavez > > [1]: Hi! I did not reviewed this patch closely, but I stopped a this: > + reldistance = varstr_levenshtein_less_equal(target->relname, strlen(target->relname), > + candidate, candidatelen, > + 1, 1, 1, > + Min(fuzzystate->distance + 1, MAX_FUZZY_DISTANCE + 1), > + true); > + > + /* The above can return MAX_FUZZY_DISTANCE + 1 results, skip these */ > + if (reldistance > MAX_FUZZY_DISTANCE) > + continue; Why do we even do this? Can't we just pass fuzzystate->distance to varstr_levenshtein_less_equal? It is initialized in outer func to MAX_FUZZY_DISTANCE + 1 -- Best regards, Kirill Reshke
В списке pgsql-hackers по дате отправления: