Обсуждение: To all the pgsql developers..Have a look at the operators proposed by me in my research paper.

Поиск
Список
Период
Сортировка

To all the pgsql developers..Have a look at the operators proposed by me in my research paper.

От
Tasneem Memon
Дата:
<br /><p class="MsoNormal" style="MARGIN: 0in 0in 0pt"><font color="#000000" face="Times New Roman" size="3">Dear All,
</font><pclass="MsoNormal" style="MARGIN: 0in 0in 0pt"><font color="#000000" face="Times New Roman" size="3">I am
explaininghere how i have defined the operators NEAR and NOT NEAR in my research paper. I want critics/comments from
allof you on the operators.</font><p class="MsoNormal" style="MARGIN: 0in 0in 0pt"><font color="#000000" face="Times
NewRoman" size="3"> </font><p class="MsoNormal" style="MARGIN: 0in 0in 0pt"><font color="#000000" face="Times New
Roman"size="3">These operators belong to the category of “Fuzzy Operators” proposed in the paper for ANSI SQL.</font><p
class="MsoNormal"style="MARGIN: 0in 0in 0pt"><font color="#000000" face="Times New Roman" size="3"> </font><ol
style="MARGIN-TOP:0in" type="1"><li class="MsoNormal" style="MARGIN: 0in 0in 0pt; mso-list: l0 level1 lfo1; tab-stops:
list.5in"><font color="#000000" face="Times New Roman" size="3">NEAR</font></ol><p class="MsoNormal" style="MARGIN: 0in
0in0pt"><b style="mso-bidi-font-weight: normal"><span style="FONT-SIZE: 11pt"><font color="#000000" face="Times New
Roman"> </font></span></b><pclass="MsoNormal" style="MARGIN: 0in 0in 0pt; TEXT-INDENT: 0.5in"><font color="#000000"
face="TimesNew Roman" size="3">It deals with the NUMBER and DATE datatypes simulating the human behavior and processing
theinformation contained in NEAR in the same way as we humans take it. This is a binary operator with the
syntax:</font><pclass="MsoNormal" style="MARGIN: 0in 0in 0pt; TEXT-INDENT: 0.5in"><span style="FONT-FAMILY:
Verdana"><fontsize="3"><font color="#000000">op1<span style="mso-spacerun: yes">  </span>NEAR<span style="mso-spacerun:
yes"> </span>op2</font></font></span><p class="MsoNormal" style="MARGIN: 0in 0in 0pt"><font color="#000000" face="Times
NewRoman" size="3">Here, the <b style="mso-bidi-font-weight: normal">op1</b> refers to an attribute, whereas <b
style="mso-bidi-font-weight:normal">op2 </b>is a fixed value, both of the same datatype. </font><p class="MsoNormal"
style="MARGIN:0in 0in 0pt"><font color="#000000" face="Times New Roman" size="3">Suppose we want a list of all the
VGAs,price of which should be somewhere around 30$ .. the query will look like:</font><p class="MsoNormal"
style="MARGIN:0in 0in 0pt"><span style="FONT-SIZE: 8pt"><font color="#000000"><font face="Times New Roman"><span
style="mso-spacerun:yes"> </span></font></font></span><p class="MsoNormal" style="MARGIN: 0in 0in 0pt 9pt"><span
style="FONT-FAMILY:Verdana"><font size="3"><font color="#000000">SELECT <span style="mso-tab-count: 1">  
</span>*</font></font></span><pclass="MsoNormal" style="MARGIN: 0in 0in 0pt 9pt"><span style="FONT-FAMILY:
Verdana"><fontsize="3"><font color="#000000">FROM <span style="mso-tab-count: 1">    
</span>accessories</font></font></span><pclass="MsoNormal" style="MARGIN: 0in 0in 0pt 9pt"><span style="FONT-FAMILY:
Verdana"><fontsize="3"><font color="#000000">WHERE<span style="mso-tab-count: 1">    </span>prod_name =
‘VGA’</font></font></span><pclass="MsoNormal" style="MARGIN: 0in 0in 0pt 9pt"><span style="FONT-FAMILY: Verdana"><font
size="3"><fontcolor="#000000">AND<span style="mso-tab-count: 2">                </span>prod_price <b
style="mso-bidi-font-weight:normal">NEAR</b> 30</font></font></span><p class="MsoNormal" style="MARGIN: 0in 0in 0pt
9pt"><spanstyle="FONT-FAMILY: Verdana"><font color="#000000" size="3"> </font></span><p class="MsoNormal"
style="MARGIN:0in 0in 0pt"><font color="#000000" face="Times New Roman" size="3">A query for the datatype DATE will
looklike:</font><p class="MsoNormal" style="MARGIN: 0in 0in 0pt"><span style="FONT-SIZE: 8pt"><font color="#000000"
face="TimesNew Roman"> </font></span><p class="MsoNormal" style="MARGIN: 0in 0in 0pt 9pt"><span style="FONT-FAMILY:
Verdana"><fontsize="3"><font color="#000000">SELECT <span style="mso-tab-count: 1">   </span>*</font></font></span><p
class="MsoNormal"style="MARGIN: 0in 0in 0pt 9pt"><span style="FONT-FAMILY: Verdana"><font size="3"><font
color="#000000">FROM<span style="mso-tab-count: 1">     </span>sales</font></font></span><p class="MsoNormal"
style="MARGIN:0in 0in 0pt 9pt"><span style="FONT-FAMILY: Verdana"><font size="3"><font color="#000000">WHERE<span
style="mso-tab-count:1">    </span>item = ’printer’</font></font></span><p class="MsoNormal" style="MARGIN: 0in 0in 0pt
9pt"><fontsize="3"><font color="#000000"><span style="FONT-FAMILY: Verdana">AND<span style="mso-tab-count:
2">               </span>s_date <b style="mso-bidi-font-weight: normal">NEAR</b> </span><span style="FONT-FAMILY:
Verdana">10-7-06</span><spanstyle="FONT-FAMILY: Verdana"></span></font></font><p class="MsoNormal" style="MARGIN: 0in
0in0pt"><font color="#000000" face="Times New Roman" size="3"> </font><p class="MsoNormal" style="MARGIN: 0in 0in 0pt
9pt"><spanstyle="FONT-FAMILY: Verdana"><font color="#000000" size="3"> </font></span><p class="MsoNormal"
style="MARGIN:0in 0in 0pt; TEXT-INDENT: 0.5in"><font color="#000000" face="Times New Roman" size="3">The algorithm for
theNEAR operator works as follows:</font><p class="MsoNormal" style="MARGIN: 0in 0in 0pt; TEXT-INDENT: 0.5in"><font
color="#000000"face="Times New Roman" size="3"> </font><ol style="MARGIN-TOP: 0in" type="1"><li class="MsoNormal"
style="TEXT-JUSTIFY:inter-ideograph; MARGIN: 0in 0in 0pt; TEXT-ALIGN: justify; mso-list: l1 level1 lfo2; tab-stops:
list.5in"><font color="#000000" face="Times New Roman" size="3">The margins to the <b style="mso-bidi-font-weight:
normal">op2</b>,i.e. <b style="mso-bidi-font-weight: normal">m1</b> and<b style="mso-bidi-font-weight: normal"> m2,</b>
areadded dynamically on both the sides, considering the value it contains. To keep this margin big is important for a
certainreason discussed later.</font><li class="MsoNormal" style="TEXT-JUSTIFY: inter-ideograph; MARGIN: 0in 0in 0pt;
TEXT-ALIGN:justify; mso-list: l1 level1 lfo2; tab-stops: list .5in"><font color="#000000" face="Times New Roman"
size="3">The<b style="mso-bidi-font-weight: normal">NEAR</b> operator is supposed to obtain the values near to the op2,
thusthe target membership degree(<b style="mso-bidi-font-weight: normal">md</b>) is initially set to 0.8.</font><li
class="MsoNormal"style="TEXT-JUSTIFY: inter-ideograph; MARGIN: 0in 0in 0pt; TEXT-ALIGN: justify; mso-list: l1 level1
lfo2;tab-stops: list .5in"><font color="#000000" face="Times New Roman" size="3">The algorithm compares the <b
style="mso-bidi-font-weight:normal">op1</b>(column) values row by row to the elements of the set that <b
style="mso-bidi-font-weight:normal">NEAR</b> defined, i.e. the values from md 1.0 to 0.8, adding matching tuples to the
resultset.</font></ol><p class="MsoNormal" style="TEXT-JUSTIFY: inter-ideograph; MARGIN: 0in 0in 0pt 0.5in;
TEXT-INDENT:-0.25in; TEXT-ALIGN: justify; mso-list: l1 level1 lfo2; tab-stops: list .25in"><font face="Times New
Roman"><spanstyle="mso-fareast-font-family: 'Times New Roman'"><span style="mso-list: Ignore"><font
color="#000000"><fontsize="3">4.</font><span style="FONT: 7pt 'Times New Roman'">     
</span></font></span></span><fontcolor="#000000" size="3">It is very much possible that the result set is empty since
novalues within the range exist in the column. Thus, the algorithm checks for empty result set, and in that case,
decreasesthe target <b style="mso-bidi-font-weight: normal">md</b> by 0.2 and jumps to step 3. This is the reason big
marginsto the <b style="mso-bidi-font-weight: normal">op2</b> are added.</font></font><p class="MsoNormal"
style="TEXT-JUSTIFY:inter-ideograph; MARGIN: 0in 0in 0pt 0.5in; TEXT-INDENT: -0.25in; TEXT-ALIGN: justify; mso-list: l1
level1lfo2; tab-stops: list .25in"><font face="Times New Roman"><span style="mso-fareast-font-family: 'Times New
Roman'"><spanstyle="mso-list: Ignore"><font color="#000000"><font size="3">5.</font><span style="FONT: 7pt 'Times New
Roman'">     </span></font></span></span><font color="#000000" size="3">In case there are no values in <b
style="mso-bidi-font-weight:normal">op1</b> that are between <b style="mso-bidi-font-weight: normal">m1</b> and <b
style="mso-bidi-font-weight:normal">m2 </b>(where the<b style="mso-bidi-font-weight: normal"> </b>membership degree of
thevalues with respect to NEAR becomes 0.1) and the result set is empty, the algorithm fetches the two nearest values
(tuples)to <b style="mso-bidi-font-weight: normal">op2, </b>one smaller and one larger than the op2, as the
result.</font></font><pclass="MsoNormal" style="TEXT-JUSTIFY: inter-ideograph; MARGIN: 0in 0in 0pt; TEXT-ALIGN:
justify"><fontcolor="#000000" face="Times New Roman" size="3"> </font><p class="MsoNormal" style="MARGIN: 0in 0in
0pt"><fontcolor="#000000" face="Times New Roman" size="3">The algorithm will give an empty result only if the table
referredto in the query is empty. </font><p class="MsoNormal" style="MARGIN: 0in 0in 0pt"><font color="#000000"
face="TimesNew Roman" size="3"> </font><p class="MsoNormal" style="MARGIN: 0in 0in 0pt"><b style="mso-bidi-font-weight:
normal"><spanstyle="FONT-SIZE: 11pt"><font color="#000000"><font face="Times New Roman">2. NOT NEAR
</font></font></span></b><pclass="MsoNormal" style="MARGIN: 0in 0in 0pt"><b style="mso-bidi-font-weight: normal"><span
style="FONT-SIZE:11pt"><font color="#000000" face="Times New Roman"> </font></span></b><p class="MsoNormal"
style="MARGIN:0in 0in 0pt"><font size="3"><font color="#000000"><font face="Times New Roman"><b
style="mso-bidi-font-weight:normal"><span style="mso-tab-count: 1">            </span></b>This operator is also a
binaryoperator, dealing with the datatype NUMBER and DATE. It has the syntax:</font></font></font><p class="MsoNormal"
style="MARGIN:0in 0in 0pt; TEXT-INDENT: 0.5in"><span style="FONT-FAMILY: Verdana"><font size="3"><font
color="#000000">op1<spanstyle="mso-spacerun: yes">  </span>NOT NEAR<span style="mso-spacerun: yes"> 
</span>op2</font></font></span><pclass="MsoNormal" style="MARGIN: 0in 0in 0pt"><font color="#000000" face="Times New
Roman"size="3">The <b style="mso-bidi-font-weight: normal">op1</b> refers to an attribute, whereas <b
style="mso-bidi-font-weight:normal">op2 </b>is a fixed value, both of the same data type.</font><p class="MsoNormal"
style="MARGIN:0in 0in 0pt"><font size="3"><font color="#000000"><font face="Times New Roman"><span
style="mso-tab-count:1">            </span>A query containing the operator looks like:</font></font></font><p
class="MsoNormal"style="MARGIN: 0in 0in 0pt"><font color="#000000" face="Times New Roman" size="3"> </font><p
class="MsoNormal"style="MARGIN: 0in 0in 0pt 0.25in"><span style="FONT-FAMILY: Verdana"><font size="3"><font
color="#000000">SELECT<span style="mso-tab-count: 1"></span>id, name, age, history</font></font></span><p
class="MsoNormal"style="MARGIN: 0in 0in 0pt 0.25in"><span style="FONT-FAMILY: Verdana"><font size="3"><font
color="#000000">FROM<spanstyle="mso-tab-count: 1">    </span>casualties </font></font></span><p class="MsoNormal"
style="MARGIN:0in 0in 0pt 0.25in"><span style="FONT-FAMILY: Verdana"><font size="3"><font color="#000000">WHERE<span
style="mso-tab-count:1">  </span>cause = ‘heart attack’</font></font></span><p class="MsoNormal" style="MARGIN: 0in 0in
0pt0.25in"><span style="FONT-FAMILY: Verdana"><font size="3"><font color="#000000">AND<span style="mso-tab-count:
1">     </span>age NOT NEAR 55</font></font></span><p class="MsoNormal" style="MARGIN: 0in 0in 0pt 0.25in"><span
style="FONT-FAMILY:Verdana"><font color="#000000" size="3"> </font></span><p class="MsoNormal" style="MARGIN: 0in 0in
0pt;TEXT-INDENT: 0.5in"><font color="#000000" face="Times New Roman" size="3">Or suppose we need a list of some event
thatis not clashing with some commitment of ours:</font><p class="MsoNormal" style="MARGIN: 0in 0in 0pt; TEXT-INDENT:
0.5in"><fontcolor="#000000" face="Times New Roman" size="3"> </font><p class="MsoNormal" style="MARGIN: 0in 0in 0pt
0.25in"><spanstyle="FONT-FAMILY: Verdana"><font size="3"><font color="#000000">SELECT <span style="mso-tab-count:
1"></span>*</font></font></span><p class="MsoNormal" style="MARGIN: 0in 0in 0pt 0.25in"><span style="FONT-FAMILY:
Verdana"><fontsize="3"><font color="#000000">FROM <span style="mso-tab-count: 1">  
</span>events</font></font></span><pclass="MsoNormal" style="MARGIN: 0in 0in 0pt 0.25in"><span style="FONT-FAMILY:
Verdana"><fontsize="3"><font color="#000000">WHERE<span style="mso-tab-count: 1">  </span>e_name=
‘concert’</font></font></span><pclass="MsoNormal" style="MARGIN: 0in 0in 0pt 1in; TEXT-INDENT: -0.75in; tab-stops:
423.0pt"><fontsize="3"><font color="#000000"><span style="FONT-FAMILY: Verdana">AND <span style="mso-tab-count: 1">    
</span>dateNOT NEAR </span><span style="FONT-FAMILY: Verdana">8/28/2007</span><span style="FONT-FAMILY:
Verdana"></span></font></font><pclass="MsoNormal" style="MARGIN: 0in 0in 0pt"><span style="mso-tab-count: 1"><font
color="#000000"face="Times New Roman" size="3">            </font></span><p class="MsoNormal" style="MARGIN: 0in 0in
0pt;TEXT-INDENT: 0.5in"><font size="3"><font color="#000000"><font face="Times New Roman">The algorithm for NOT NEAR
workslike this:</font></font></font><ol style="MARGIN-TOP: 0in" type="1"><li class="MsoNormal" style="TEXT-JUSTIFY:
inter-ideograph;MARGIN: 0in 0in 0pt; TEXT-ALIGN: justify; mso-list: l2 level1 lfo3; tab-stops: list .5in"><font
color="#000000"face="Times New Roman" size="3">First of all it adds the margins to the <b style="mso-bidi-font-weight:
normal">op2</b>,i.e. <b style="mso-bidi-font-weight: normal">m1</b> and<b style="mso-bidi-font-weight: normal"> m2,</b>
dynamicallyon both the sides, considering the value <b style="mso-bidi-font-weight: normal">op2</b> contains.</font><li
class="MsoNormal"style="TEXT-JUSTIFY: inter-ideograph; MARGIN: 0in 0in 0pt; TEXT-ALIGN: justify; mso-list: l2 level1
lfo3;tab-stops: list .5in"><font size="3"><font color="#000000"><font face="Times New Roman"><b
style="mso-bidi-font-weight:normal">op1 v</b>alues outside the scope of the <b style="mso-bidi-font-weight:
normal">op2</b>(<b style="mso-bidi-font-weight: normal">m1, m2</b>) are retrieved and added to the result.
</font></font></font><liclass="MsoNormal" style="TEXT-JUSTIFY: inter-ideograph; MARGIN: 0in 0in 0pt; TEXT-ALIGN:
justify;mso-list: l2 level1 lfo3; tab-stops: list .5in"><font color="#000000" face="Times New Roman" size="3">If the
resultset is empty, the farthest values within the <b style="mso-bidi-font-weight: normal">op2 </b>fuzzy set (those
possessingthe least membership degree) are retrieved. This is done by continuing the search from values with md=0.1
tillthe md=0.6, where the md for <b style="mso-bidi-font-weight: normal">NOT NEAR</b> reaches 0.4. </font></ol><p
class="MsoNormal"style="MARGIN: 0in 0in 0pt"><b style="mso-bidi-font-weight: normal"><font color="#000000" face="Times
NewRoman" size="3"> </font></b><p class="MsoNormal" style="MARGIN: 0in 0in 0pt"><font color="#000000" face="Times New
Roman"size="3">Regards,</font><p class="MsoNormal" style="MARGIN: 0in 0in 0pt"><font color="#000000" face="Times New
Roman"size="3"> </font><p class="MsoNormal" style="MARGIN: 0in 0in 0pt"><font face="Times New Roman, Times, Serif"
size="3"><strong>TasneemIshaque Memon <br /></strong>MS (Computer Science) <br />Institute of Mathmetics and Computer
Science<br />University of Sindh, Jamshoro <br />Pakistan <br /><br /></font><a
href="http://www.geocities.com/tasneememon/"><fontface="Times New Roman, Times, Serif"
size="3">http://www.geocities.com/tasneememon/</font></a><br/><br /><font size="3"><font color="#660099"><font
face="TimesNew Roman, Times, Serif"><strong><font color="#3300cc">- </font></strong><strong><font color="#3300cc">The
processof gaining knowledge is not producing results but clearing away ignorance.</font><br
/></strong></font></font></font><br/><hr />Discover the new Windows Vista <a
href="http://search.msn.com/results.aspx?q=windows+vista&mkt=en-US&form=QBRE"target="_new">Learn more!</a> 
On Jun 1, 2007, at 8:24 AM, Tasneem Memon wrote:
> NEAR
>
> It deals with the NUMBER and DATE datatypes simulating the human
> behavior and processing the

Why just number and date?

> information contained in NEAR in the same way as we humans take it.
> This is a binary operator with the syntax:
> op1  NEAR  op2
> Here, the op1 refers to an attribute, whereas op2 is a fixed value,
> both of the same datatype.
> Suppose we want a list of all the VGAs, price of which should be
> somewhere around 30$ .. the query will look like:
>
> SELECT    *
> FROM      accessories
> WHERE    prod_name = ‘VGA’
> AND                prod_price NEAR 30
>
> A query for the datatype DATE will look like:
>
> SELECT    *
> FROM      sales
> WHERE    item = ’printer’
> AND                s_date NEAR 10-7-06
>
>
> The algorithm for the NEAR operator works as follows:
>
> The margins to the op2, i.e. m1 and m2, are added dynamically on
> both the sides, considering the value it contains. To keep this
> margin big is important for a certain reason discussed later.
> The NEAR operator is supposed to obtain the values near to the op2,
> thus the target membership degree(md) is initially set to 0.8.
> The algorithm compares the op1(column) values row by row to the
> elements of the set that NEAR defined, i.e. the values from md 1.0
> to 0.8, adding matching tuples to the result set.

How would one change 0.8 to some other value?

> 4.      It is very much possible that the result set is empty since
> no values within the range exist in the column. Thus, the algorithm
> checks for empty result set, and in that case, decreases the target
> md by 0.2 and jumps to step 3. This is the reason big margins to
> the op2 are added.
> 5.      In case there are no values in op1 that are between m1 and
> m2 (where the membership degree of the values with respect to NEAR
> becomes 0.1) and the result set is empty, the algorithm fetches the
> two nearest values (tuples) to op2, one smaller and one larger than
> the op2, as the result.
>
> The algorithm will give an empty result only if the table referred
> to in the query is empty.
>
> 2. NOT NEAR
>
>             This operator is also a binary operator, dealing with
> the datatype NUMBER and DATE. It has the syntax:
> op1  NOT NEAR  op2
> The op1 refers to an attribute, whereas op2 is a fixed value, both
> of the same data type.
>             A query containing the operator looks like:
>
> SELECT id, name, age, history
> FROM    casualties
> WHERE  cause = ‘heart attack’
> AND      age NOT NEAR 55
>
> Or suppose we need a list of some event that is not clashing with
> some commitment of ours:
>
> SELECT *
> FROM    events
> WHERE  e_name= ‘concert’
> AND      date NOT NEAR 8/28/2007
>
> The algorithm for NOT NEAR works like this:
> First of all it adds the margins to the op2, i.e. m1 and m2,
> dynamically on both the sides, considering the value op2 contains.
> op1 values outside the scope of the op2 (m1, m2) are retrieved and
> added to the result.
> If the result set is empty, the farthest values within the op2
> fuzzy set (those possessing the least membership degree) are
> retrieved. This is done by continuing the search from values with
> md=0.1 till the md=0.6, where the md for NOT NEAR reaches 0.4.

Why isn't this just the exact opposite set of NEAR?
--
Jim Nasby                                            jim@nasby.net
EnterpriseDB      http://enterprisedb.com      512.569.9461 (cell)




Tasneem,

> > The margins to the op2, i.e. m1 and m2, are added dynamically on  
> > both the sides, considering the value it contains. To keep this  
> > margin big is important for a certain reason discussed later.
> > The NEAR operator is supposed to obtain the values near to the op2,  
> > thus the target membership degree(md) is initially set to 0.8.
> > The algorithm compares the op1(column) values row by row to the  
> > elements of the set that NEAR defined, i.e. the values from md 1.0  
> > to 0.8, adding matching tuples to the result set.

Are we talking about a mathematical calculation on the values, or an algorithm
against the population of the result set?  I'm presuming the latter or you
could just use a function.  If so, is NEAR an absolute range or based on
something logarithmic like standard deviation?

Beyond that, I would think that this mechanism would need some kind of extra
heuristics to be at all performant, otherwise you're querying the entire
table (or at least the entire index) every time you run a query.  Have you
given any thought to this?

--
Josh Berkus
PostgreSQL @ Sun
San Francisco