Topic: Agglomerative clustering

Поиск
Список
Период
Сортировка
От Akansha Singh
Тема Topic: Agglomerative clustering
Дата
Msg-id 2701ba0a-ce70-45db-9cc2-bf098c019780@default
обсуждение исходный текст
Список pgsql-students
hi
 i have another clustering algorithm.. Please give your suggestions for this one whether it can be implemented? I will
beobliged. 

Agglomerative Clustering
In agglomorative clustering, each data point is initially placed in a cluster by itself. At each step, two clusters
withthe highest similarity score are merged. The output of agglomerative clustering is a tree, where the leaves are the
dataitems. At any intermediate step, the clusters so far are different trees. When two clusters are merged by the
algorithm,the respective trees are merged by making the roots of these trees the left and right children of a new node
(sothat the total number of trees decreases by 1). In the end, the process yields a single tree. See here for an
example.

This algorithm closely resembles Kruskal's algorithm for Minimum Spanning Tree construction. In Kruskal's algorithm, at
eachstep the smallest edge connecting two components is added till there is only one component left. Each new edge
addedmerges two components.  This can be seen as clustering where the similarity score between two components is the
negativethe length of the smallest edge between any two points in the cluster. 

Many different clustering algorithms can be obtained by varying the similarity measures between clusters. Some examples
ofsimilarity measures are the following. Here, A and B are two clusters, and sim(A,B) is the similarity measure between
them.The similarity measure sim(u,v) between input data items u and v is specified as input. Typically, the similarity
isa number between 0 and 1, with larger values implying greater similarity. 

Regards
Akansha SIngh


В списке pgsql-students по дате отправления:

Предыдущее
От: Christoph Berg
Дата:
Сообщение: Re: [GSoC 2013] [Debian]
Следующее
От: Thom Brown
Дата:
Сообщение: Re: GSoC proposals deadline