AffinityPropagation
A model type for constructing an Affinity Propagation clusterer, based on Clustering.jl, and implementing the MLJ model interface.
From MLJ, the type can be imported using
AffinityPropagation = @load AffinityPropagation pkg=Clustering
Do model = AffinityPropagation() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in AffinityPropagation(damp=...).
Affinity Propagation is a clustering algorithm based on the concept of "message passing" between data points. More information is available in the Clustering.jl documentation. Use predict to get cluster assignments. Indices of the exemplars, their values, etc., are accessed from the machine report (see below).
This is a static implementation, i.e., it does not generalize to new data instances, and there is no training data. For clusterers that do generalize, see KMeans or KMedoids.
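The "message passing" idea can be made concrete with a small, package-free sketch. It follows the standard textbook formulation of affinity propagation (responsibility and availability updates with damping) and is not the Clustering.jl implementation. The function name affinity_propagation_sketch and its return fields are hypothetical. It assumes observations are stored as columns and that there are at least two of them, uses negative squared Euclidean distance as the similarity, and runs a fixed number of iterations rather than stopping on a tolerance.

```julia
using Statistics: median

# Hypothetical helper for illustration only; not part of Clustering.jl or MLJ.
# Columns of X are observations (at least two are assumed).
function affinity_propagation_sketch(X::AbstractMatrix{<:Real};
                                     damp::Real=0.5, maxiter::Integer=200,
                                     preference=nothing)
    n = size(X, 2)
    # Similarity: negative squared Euclidean distance between observations.
    S = Float64[-sum(abs2, X[:, i] .- X[:, k]) for i in 1:n, k in 1:n]
    # Diagonal ("preference"): the given value, else the median pairwise similarity.
    offdiag = [S[i, k] for i in 1:n for k in 1:n if i != k]
    p = preference === nothing ? median(offdiag) : preference
    for k in 1:n
        S[k, k] = p
    end

    R = zeros(n, n)  # responsibilities: evidence that k should be the exemplar of i
    A = zeros(n, n)  # availabilities: evidence that i should choose k as exemplar
    for _ in 1:maxiter
        # Responsibility: r(i,k) = s(i,k) - max over k' != k of (a(i,k') + s(i,k')).
        AS = A .+ S
        Rnew = similar(R)
        for i in 1:n
            row = AS[i, :]            # slicing copies, so AS is left untouched
            k1 = argmax(row)
            m1 = row[k1]              # largest value in the row
            row[k1] = -Inf
            m2 = maximum(row)         # second-largest value in the row
            for k in 1:n
                Rnew[i, k] = S[i, k] - (k == k1 ? m2 : m1)
            end
        end
        R = damp .* R .+ (1 - damp) .* Rnew

        # Availability: a(i,k) = min(0, r(k,k) + sum of positive r(i',k), i' not in {i,k});
        # a(k,k) = sum of positive r(i',k), i' != k.
        Rp = max.(R, 0.0)
        for k in 1:n
            Rp[k, k] = R[k, k]
        end
        colsum = vec(sum(Rp, dims=1))
        Anew = similar(A)
        for i in 1:n, k in 1:n
            v = colsum[k] - Rp[i, k]
            Anew[i, k] = i == k ? v : min(0.0, v)
        end
        A = damp .* A .+ (1 - damp) .* Anew
    end

    # A point is an exemplar when its self-responsibility plus self-availability is positive.
    exemplars = [k for k in 1:n if R[k, k] + A[k, k] > 0]
    labels = zeros(Int, n)            # 0 means "unassigned" (no exemplar was found)
    if !isempty(exemplars)
        for i in 1:n
            labels[i] = exemplars[argmax(S[i, exemplars])]
        end
        for e in exemplars
            labels[e] = e             # each exemplar represents itself
        end
    end
    return (exemplars=exemplars, labels=labels)
end

# Two well-separated groups of three points each (toy data).
X = [0.0 0.1 0.2 10.0 10.1 10.2;
     0.0 0.1 0.0 10.0 10.1 10.0]
result = affinity_propagation_sketch(X)
println(result.exemplars)
println(result.labels)
```

In this sketch, labels holds the index of the exemplar chosen for each observation, which loosely mirrors the exemplar indices exposed in the machine report. For real work, use the MLJ model described on this page.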
In MLJ or MLJBase, create a machine with
mach = machine(model)
Hyper-parameters
- damp = 0.5: damping factor
- maxiter = 200: maximum number of iterations
- tol = 1e-6: tolerance for convergence
- preference = nothing: the (single float) value of the diagonal elements of the similarity matrix. If unspecified, the median (negative) similarity of all pairs is used
- metric = Distances.SqEuclidean(): metric (see Distances.jl for available metrics)
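As a brief illustration of overriding these defaults, the snippet below constructs a model with non-default values. It assumes MLJ and MLJClusteringInterface are installed in the active environment. The specific values are arbitrary and only show the keyword syntax; they are not tuned recommendations.

```julia
using MLJ

# Assumes MLJ and MLJClusteringInterface are available in the active environment.
AffinityPropagation = @load AffinityPropagation pkg=Clustering verbosity=0

# Illustrative values only; any hyper-parameter left out keeps its default.
model = AffinityPropagation(damp=0.8, maxiter=500, tol=1e-5, preference=-5.0)

# Hyper-parameters are ordinary fields and can also be changed after construction.
model.damp = 0.9
```

A more negative preference generally yields fewer exemplars, so it is the main knob for controlling the number of clusters.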
Operations
- predict(mach, X): return cluster label assignments, as an unordered CategoricalVector. Here X is any table of input features (e.g., a DataFrame) whose columns are of scitype Continuous; check column scitypes with schema(X).
Report
After calling predict(mach, X), the fields of report(mach) are:
- exemplars: indices of the data picked as exemplars in X
- centers: positions of the exemplars in the feature space
- cluster_labels: labels of clusters given to each datum in X
- iterations: the number of iterations run by the algorithm
- converged: whether or not the algorithm converged within the maximum number of iterations
Examples
using MLJ
X, labels = make_moons(400, noise=0.9, rng=1)
AffinityPropagation = @load AffinityPropagation pkg=Clustering
model = AffinityPropagation(preference=-10.0)
mach = machine(model)
## compute and output cluster assignments for observations in `X`:
yhat = predict(mach, X)
## Get the positions of the exemplars
report(mach).centers
## Plot clustering result
using GLMakie
scatter(MLJ.matrix(X)', color=yhat.refs)