01 Machine Learning — 2.7 Project — Clustering: Wine color, corporate bonds and wheat seed kernels.
This is a logbook (bitácora) of the wine color project, kept to track my learning.
Previously assimilated stuff:
2. Finding natural patterns in data
2.1 Course example — basketball players
2.2 Low dimensional visualization
2.3 k-means clustering
2.4 Gaussian mixture models
2.5 Interpreting the clusters
2.6 Hierarchical clustering
2.7 Project — Clustering: Wine color, corporate bonds and wheat seed kernels.
1 WINE COLOR
The wineData table describes the chemical composition of different wines. The numeric data is stored in the matrix numData.
We import the data (wineData.txt) with the classic readtable function, convert the Color variable to categorical, and extract the numeric data.
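Here is a minimal sketch of that import step. Since wineData.txt is not included here, the snippet first writes a tiny stand-in file; the file name demoWine.txt, the column names Acidity and Sugar, and all the values are made up for illustration. It also assumes Color is the last column of the table.

```matlab
% Build a tiny stand-in file so the sketch runs on its own.
% File name, column names and values are made up for illustration.
demo = table([7.4; 6.3; 8.1; 7.0], [1.9; 20.7; 2.3; 1.6], ...
    {'red'; 'white'; 'red'; 'white'}, ...
    'VariableNames', {'Acidity', 'Sugar', 'Color'});
writetable(demo, 'demoWine.txt');

% Import the table; in the project the file would be wineData.txt.
wineData = readtable('demoWine.txt');

% Convert the text labels into a categorical variable.
wineData.Color = categorical(wineData.Color);

% Extract every column except Color (assumed to be last) as a numeric matrix.
numData = wineData{:, 1:end-1};
```

Curly braces on a table return the underlying values, so numData ends up as a plain numeric matrix, which is what pca and kmeans expect.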
The wine data table is 6448x13.
Now let’s do the PCA stuff.
% TODO - Task 1: Perform PCA
[~,scrs,~,~,pexp] = pca(numData);
% Create the Pareto chart of the variance explained by each component
pareto(pexp)
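To see what the outputs of pca mean, here is a small sketch on synthetic data (the random matrix below is only a stand-in for numData, not the wine values). It assumes the Statistics and Machine Learning Toolbox is available. The names cumExp and nComp are mine, not from the course.

```matlab
% Synthetic stand-in for numData (values are made up for illustration).
rng(1);                                        % reproducible random numbers
numData = randn(200, 5) .* [10 5 1 0.5 0.1];   % columns with very different spread

% scrs: the data projected onto the principal components
% pexp: percent of the total variance explained by each component
[~, scrs, ~, ~, pexp] = pca(numData);

% Cumulative variance explained, and how many components reach 95%.
cumExp = cumsum(pexp);
nComp = find(cumExp >= 95, 1);

% Pareto chart: bars for each component plus a cumulative line.
pareto(pexp)
xlabel('Principal component')
ylabel('Variance explained (%)')
```

Note that pca centers the columns but does not rescale them, so variables with large units can dominate the first components. Normalizing numData first (for example with zscore) is one option if that is not what you want.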
Cool. Now let’s go to task 2. Here we are going to cluster the data and experiment with k-means and GMM clustering. As always, we want to visualize this stuff with a scatter plot.
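%% TODO TASK 2: Clustering into two groups
g = kmeans(numData,2,'Replicates',5);
gmm = fitgmdist(numData,2,'Replicates',5);
% gGMM (a name chosen here) holds the most likely GMM component of each wine
gGMM = cluster(gmm,numData);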
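A rough sketch of the scatter plot idea, again on synthetic data (two made-up blobs standing in for numData). It assumes the Statistics and Machine Learning Toolbox, and gGMM is just my name for the GMM cluster labels. The points are drawn in the plane of the first two principal components and colored by cluster.

```matlab
% Synthetic stand-in for numData: two made-up blobs of 150 points each.
rng(2);
numData = [randn(150, 4); randn(150, 4) + 4];

% k-means returns one cluster index per row.
g = kmeans(numData, 2, 'Replicates', 5);

% GMM: fit the model, then assign each row to its most likely component.
gmm = fitgmdist(numData, 2, 'Replicates', 5);
gGMM = cluster(gmm, numData);

% Project onto the principal components, only for plotting.
[~, scrs] = pca(numData);

subplot(1, 2, 1)
scatter(scrs(:,1), scrs(:,2), 10, g, 'filled')      % color by k-means group
title('k-means')
subplot(1, 2, 2)
scatter(scrs(:,1), scrs(:,2), 10, gGMM, 'filled')   % color by GMM group
title('GMM')
```

The cluster numbers themselves are arbitrary: group 1 in k-means may correspond to group 2 in the GMM, so compare the shapes of the groups and not the labels.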
Awesome. Now Task 3: here we have to compare the resulting clusters with the groups in the variable Color. Let’s create a stacked bar chart with groups 1 and 2 along the x-axis, and the red and white wines plotted in different colors. Let’s include a legend with the corresponding colors.
%% TODO — TASK 3: Cross tabulate grouping and wine color
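One way this task could look, sketched on synthetic data (the blobs and the Color labels below are made up, not the wine values). It assumes the Statistics and Machine Learning Toolbox for kmeans and crosstab; counts is my own variable name.

```matlab
% Synthetic stand-in: 100 "red" rows and 200 "white" rows (made-up data).
rng(3);
numData = [randn(100, 3); randn(200, 3) + 5];
Color = categorical([repmat({'red'}, 100, 1); repmat({'white'}, 200, 1)]);

% Cluster into two groups.
g = kmeans(numData, 2, 'Replicates', 5);

% Cross tabulate: rows are clusters 1 and 2, columns are red and white.
counts = crosstab(g, Color);

% One stacked bar per cluster, one color per wine color.
bar(counts, 'stacked')
xticklabels({'Group 1', 'Group 2'})
xlabel('Cluster')
ylabel('Number of wines')
legend(categories(Color))
```

If the clusters line up with the wine color, each bar is almost entirely one color. A bar with a real mix of both means the clustering is splitting the data along something other than color.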
That’s it. That was fun. Now let’s do part 2 of 3: corporate bonds.
2 CORPORATE BONDS
3 KERNEL SEEDS