Re: Advice - indexing on varchar fields where only last x characters known

Поиск
Список
Период
Сортировка
От David Johnston
Тема Re: Advice - indexing on varchar fields where only last x characters known
Дата
Msg-id 1382103852101-5775023.post@n5.nabble.com
обсуждение исходный текст
Ответ на Re: Re: Advice - indexing on varchar fields where only last x characters known  (Gary Stainburn <gary.stainburn@ringways.co.uk>)
Ответы Re: Re: Advice - indexing on varchar fields where only last x characters known  (Gary Stainburn <gary.stainburn@ringways.co.uk>)
Список pgsql-sql
Gary Stainburn wrote
> On Thursday 17 October 2013 20:16:30 David Johnston wrote:
>> Gary Stainburn wrote
>>
>> > However, it means that every time I'm trying to connect various tables
>> up
>> > using foreign keys
>>
>> The degree to which each input source guarantees uniqueness of a given
>> VIN
>> matters.  Keep in mind that any system that required the user to manually
>> enter the VIN has the propensity for errors.  Either outright invalid
>> VINs
>> or marginally correct VINs with typos (which mean the VIN might be less
>> or
>> more than 17 characters even if the 17-character version was intended).
>> Specifically it is not uncommon for the VIN to be made-up when it is a
>> required field but the user does not know what the VIN is.
>>
>> 6 characters are unique within a model year but you need at least 8
>> characters to be generally unique for a given manufacturer.
>>
>> For foreign key purposes it may be worthwhile to generate a "matching"
>> table and then during import use an algorithm to match up different
>> records.  Then during general queries that table can be used for joins. 
>> In
>> this way you only pay the price of matching once and that during import
>> as
>> opposed to during user requests.  Having a canonical VIN table helps here
>> though during import that table then has to be maintained.  The added
>> advantage is that such a mapping table allows you to search against a
>> single table and such a table (and likely its indexes) should be fairly
>> small so as to make good use of memory.
>>
>> David J.
> 
> David
> 
> You raise a number of valid points here. Thankfully it is very rare that
> the 
> user creates a VIN number so the chances of creating the wrong VIN is
> slim. 
> Most data is created from data feeds from external sources. Although these 
> may also be wrong (even Ford get it wrong some times) the chances are
> slim.
> 
> Where a user enters a 6 character VIN the times that more than one vehicle
> is 
> returned is negligable, and where it does I return an error and ask for a 
> longer VIN, which may be anywhere between 8 and the full 17 characters.

Unless they scan barcodes the service department usually has a frequent need
to key in VINs manually.

A better UI would be to list all matches and let the user pick.  Zero
matches could also result in a similarity search...

The nature of VINs make them poor FKs.  This is one case where I either use
a surrogate key or, in most cases, some other natural key and leae the VIN
as a descriptive attribute.  The VIN is often of secondary priority. Stock
numbers and invoice numbers are the primaries.

I haven't done much work with cross-department linking of vehicles; mostly
worked in a vertical fashion so a vehicle is informative.  I also only have
USA exposure.

As to the trigger usage: it is not redundant information.  One column
contains raw data and the other is normalized.  While they both represent
the same concept generally the specific characteristics makes them
non-redundant in the model.  A surrogate key is, in many ways, redundant to
the table's natural key but it does have different characteristics and
purpose so my dislike for them is not because they add redundant information
to the model.

David J.





--
View this message in context:
http://postgresql.1045698.n5.nabble.com/Advice-indexing-on-varchar-fields-where-only-last-x-characters-known-tp5774839p5775023.html
Sent from the PostgreSQL - sql mailing list archive at Nabble.com.



В списке pgsql-sql по дате отправления:

Предыдущее
От: skinner@britvault.co.uk (Craig R. Skinner)
Дата:
Сообщение: Re: Re: Advice - indexing on varchar fields where only last x characters known
Следующее
От: Gary Stainburn
Дата:
Сообщение: Re: Re: Advice - indexing on varchar fields where only last x characters known