Applying and evaluating the k-means data clustering algorithm, using the RapidMiner Data Mining tool on a given data set

computer science

Description

A. Objective
Apply and evaluate the k-means clustering algorithm on a given data set, using the RapidMiner data mining tool.

B. Data Set
One of the best-known data sets referenced in data mining is the "Iris data set". The data set contains five attributes:

1. Class Label: type of Iris plant (Iris Setosa, Iris Versicolour, Iris Virginica)
2. A1: sepal length in cm
3. A2: sepal width in cm
4. A3: petal length in cm
5. A4: petal width in cm

Each class of Iris plant has 50 instances (tuples/examples), for 150 instances in total. The data set has traditionally been used for classification. One class is linearly separable from the other two; the latter are NOT linearly separable from each other.

Knowing these facts, we will use the same data set for clustering. You will use the k-means clustering algorithm to cluster the data set on the four measurement attributes A1 to A4 (items 2, 3, 4 and 5 above). In k-means clustering you must set the number of clusters in advance; in this case, it will be three.

After applying the clustering model, you will compare the results with the facts you already know. For example, you will check how many instances were placed in each cluster against the fact that there are actually 50 instances of each Iris plant type.
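The clustering and comparison steps can also be sketched outside RapidMiner. The following is a minimal pure-Python illustration, not the RapidMiner process the assignment asks for. It assumes a nine-row sample in the Iris layout rather than the full 150-row data set, and fixed starting centroids so the run is repeatable. The names `kmeans`, `crosstab`, `ROWS`, `FEATURES` and `LABELS` are hypothetical helpers introduced here for illustration only.

```python
import math
from collections import Counter

# Assumption: a small sample in the Iris layout (A1, A2, A3, A4, class label),
# used only to keep the sketch short. The assignment uses all 150 rows.
ROWS = [
    (5.1, 3.5, 1.4, 0.2, "Iris Setosa"),
    (4.9, 3.0, 1.4, 0.2, "Iris Setosa"),
    (4.7, 3.2, 1.3, 0.2, "Iris Setosa"),
    (7.0, 3.2, 4.7, 1.4, "Iris Versicolour"),
    (6.4, 3.2, 4.5, 1.5, "Iris Versicolour"),
    (6.9, 3.1, 4.9, 1.5, "Iris Versicolour"),
    (6.3, 3.3, 6.0, 2.5, "Iris Virginica"),
    (5.8, 2.7, 5.1, 1.9, "Iris Virginica"),
    (7.1, 3.0, 5.9, 2.1, "Iris Virginica"),
]
FEATURES = [row[:4] for row in ROWS]  # cluster on A1..A4 only
LABELS = [row[4] for row in ROWS]     # kept aside for the evaluation step


def kmeans(points, k, init_centroids, max_iter=100):
    """Basic k-means: alternate assignment and centroid update until stable."""
    centroids = [list(c) for c in init_centroids]
    assignments = None
    for _ in range(max_iter):
        # Assignment step: each point joins its nearest centroid (Euclidean).
        new_assignments = [
            min(range(k), key=lambda j: math.dist(p, centroids[j]))
            for p in points
        ]
        if new_assignments == assignments:
            break  # no point changed cluster, so the result has converged
        assignments = new_assignments
        # Update step: move each centroid to the mean of its members.
        for j in range(k):
            members = [p for p, a in zip(points, assignments) if a == j]
            if members:
                centroids[j] = [sum(col) / len(members) for col in zip(*members)]
    return assignments, centroids


def crosstab(assignments, labels):
    """Count how many instances of each class landed in each cluster."""
    return Counter(zip(assignments, labels))


# Assumption: fixed starting centroids (rows 0, 3 and 6) for a repeatable run.
assignments, centroids = kmeans(
    FEATURES, 3, [FEATURES[0], FEATURES[3], FEATURES[6]]
)
sizes = Counter(assignments)
table = crosstab(assignments, LABELS)

for cluster in sorted(sizes):
    print(f"cluster {cluster}: {sizes[cluster]} instances")
for (cluster, label), count in sorted(table.items()):
    print(f"cluster {cluster} / {label}: {count}")
```

On the full data set the same two outputs are the ones to inspect: the size of each cluster, and the cluster-by-class counts. Because two of the classes are not linearly separable from each other, the cluster sizes need not come out as exactly 50 each, and the cross-tabulation shows which classes were mixed. Cluster numbers are arbitrary, so match clusters to classes by their counts, not by their index.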



