Thread: Fuzzy C means algorithms in MAdlib

Fuzzy C means algorithms in MAdlib

From: Akansha Singh
Hi,
I would like to know whether this algorithm can be implemented in MADlib.
In the K-means algorithm, each vector is classified as belonging to a single cluster (hard clustering), and the
centroids are updated based on the classified samples. In a variation of this approach known as fuzzy c-means, all
vectors have a degree of membership for each cluster, and the respective centroids are calculated based on these
membership degrees.
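
To make the hard versus fuzzy distinction concrete, here is a minimal Python sketch. It is not MADlib code. The function names are hypothetical, and the membership formula is the common textbook one with a fuzzifier m = 2, which is an assumption since the message above does not give a formula.

```python
import math

def hard_assignment(point, centroids):
    """K-means style: return the index of the single nearest centroid."""
    dists = [math.dist(point, c) for c in centroids]
    return dists.index(min(dists))

def fuzzy_memberships(point, centroids, m=2.0):
    """Fuzzy c-means style: one membership degree per cluster, summing to 1.

    Assumes the textbook rule u_j = 1 / sum_k (d_j / d_k) ** (2 / (m - 1)),
    with fuzzifier m > 1 (m = 2 is an illustrative default).
    """
    dists = [math.dist(point, c) for c in centroids]
    # A point lying exactly on one or more centroids belongs only to those.
    zero = [i for i, d in enumerate(dists) if d == 0.0]
    if zero:
        return [1.0 / len(zero) if i in zero else 0.0
                for i in range(len(centroids))]
    exponent = 2.0 / (m - 1.0)
    return [1.0 / sum((d_j / d_k) ** exponent for d_k in dists)
            for d_j in dists]

if __name__ == "__main__":
    centroids = [(0.0, 0.0), (4.0, 0.0)]
    point = (1.0, 0.0)
    print(hard_assignment(point, centroids))    # index of the nearest centroid
    print(fuzzy_memberships(point, centroids))  # degrees for both clusters
```

The hard assignment discards how close the point is to the other centroid, while the fuzzy memberships keep that information as a smaller, nonzero degree.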

Whereas the K-means algorithm computes the average of the vectors in a cluster as the center, fuzzy c-means finds the
center as a weighted average of all points, using the membership probabilities for each point as weights. Vectors with a
high probability of belonging to the class have larger weights, and more influence on the centroid.
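
The two centroid updates described above can be sketched side by side in plain Python. This is an illustration only, not MADlib's implementation, and the function names are hypothetical. The textbook fuzzy c-means update raises each membership to a fuzzifier power m before weighting (m = 2 is an assumed default here); with m = 1 the weights are the memberships themselves, as the paragraph above describes.

```python
def kmeans_centroid(points):
    """K-means: plain average of the points assigned to one cluster."""
    n = len(points)
    dim = len(points[0])
    return [sum(p[d] for p in points) / n for d in range(dim)]

def fuzzy_centroid(points, memberships, m=2.0):
    """Fuzzy c-means: weighted average of ALL points for one cluster.

    memberships[i] is the degree to which points[i] belongs to this cluster;
    the weight of each point is memberships[i] ** m (assumed fuzzifier m).
    """
    weights = [u ** m for u in memberships]
    total = sum(weights)
    dim = len(points[0])
    return [sum(w * p[d] for w, p in zip(weights, points)) / total
            for d in range(dim)]

if __name__ == "__main__":
    points = [(0.0, 0.0), (2.0, 0.0), (10.0, 0.0)]
    # Hard clustering: only the first two points are in the cluster.
    print(kmeans_centroid(points[:2]))
    # Fuzzy clustering: the distant point still contributes, but weakly.
    print(fuzzy_centroid(points, [0.9, 0.9, 0.1]))
```

In the fuzzy case the far point pulls the centroid slightly toward itself instead of being ignored, which is the "more influence for higher membership" behaviour in the description.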

Regards
Akansha Singh


--
Sent via pgsql-students mailing list (pgsql-students@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-students


Re: Fuzzy C means algorithms in MAdlib

From: Tomas Vondra
On 27.4.2013 12:36, Akansha Singh wrote:
> Hi, I would like to know whether this algorithm can be implemented in MADlib.
> In the K-means algorithm, each vector is classified as belonging to a
> single cluster (hard clustering), and the centroids are updated based
> on the classified samples. In a variation of this approach known as
> fuzzy c-means, all vectors have a degree of membership for each
> cluster, and the respective centroids are calculated based on these
> membership degrees.
>
> Whereas the K-means algorithm computes the average of the vectors in
> a cluster as the center, fuzzy c-means finds the center as a weighted
> average of all points, using the membership probabilities for each
> point as weights. Vectors with a high probability of belonging to the
> class have larger weights, and more influence on the centroid.

While I'm a fan of data analysis, I'm still struggling with the question
of why this should be implemented as a PostgreSQL GSoC project. What would
be the result? An update to MADlib, implementing k-means, or something
like a PostgreSQL extension?

regards
Tomas