Q1.
Download the CSV file from the URL below and
store it as a dataset named iris_new.
URL:
https://archive.ics.uci.edu/ml/machine-learning-databases/iris/iris.data
Assign the headers “Sepal.Length”,
“Sepal.Width”, “Petal.Length”, “Petal.Width” and “Species” to the dataset’s
columns, respectively.
Find the mean of the first 5 rows of
“Sepal.Length” in your dataset iris_new.
In the “Species” variable, remove the
"Iris-" string so that the column values become “setosa”, “virginica”
and “versicolor”. Then capitalize the first letter of each “Species” value.
10 marks
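One possible way to approach Q1 in base R is sketched below. The helper name read_iris is hypothetical (it is not part of the assignment), and the sketch assumes the UCI file has no header row and exactly five comma-separated columns. The two inline rows are only an offline check of the cleaning step.

```r
# URL given in the question.
iris_url <- "https://archive.ics.uci.edu/ml/machine-learning-databases/iris/iris.data"

# Hypothetical helper: read the raw file (URL, path, or connection),
# name the columns, and tidy the Species values.
read_iris <- function(src) {
  df <- read.csv(src, header = FALSE, stringsAsFactors = FALSE,
                 col.names = c("Sepal.Length", "Sepal.Width",
                               "Petal.Length", "Petal.Width", "Species"))
  # Drop the "Iris-" prefix, then upper-case the first letter.
  sp <- sub("^Iris-", "", df$Species)
  df$Species <- paste0(toupper(substring(sp, 1, 1)), substring(sp, 2))
  df
}

# Offline check on two rows written in the same format as the file.
demo <- read_iris(textConnection(c("5.1,3.5,1.4,0.2,Iris-setosa",
                                   "7.0,3.2,4.7,1.4,Iris-versicolor")))
print(demo)

# With network access, the full task would look like this:
# iris_new <- read_iris(iris_url)
# mean(iris_new$Sepal.Length[1:5])   # mean of the first 5 rows
```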
Q2.
Create a new dataset called “iris_sub” and
load into it the observations from “iris_new” where the “Sepal.Length” value is less than
6.4 and “Petal.Length” is greater than 5.1.
How many rows are there in iris_sub? What
are the mean values of “Sepal.Width” and “Petal.Length” in iris_sub?
20 marks
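A minimal sketch of the Q2 filter follows. It assumes R's built-in iris data frame as a stand-in for iris_new, so the numbers it prints may differ slightly from those obtained with the downloaded UCI file.

```r
# Assumption: built-in iris stands in for the downloaded iris_new.
iris_new <- iris

# Keep rows that satisfy both conditions from the question.
iris_sub <- subset(iris_new, Sepal.Length < 6.4 & Petal.Length > 5.1)

# Number of rows in the subset.
print(nrow(iris_sub))

# Mean of the two requested columns.
print(colMeans(iris_sub[, c("Sepal.Width", "Petal.Length")]))
```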
Q3.
Using a boxplot, find the outliers in the
iris_new dataset. For each variable that has outliers, report the variable name, the number of
outliers and the outlier values. Store the cases that have an outlier in any of
the variables in a new dataset. Find the mean of “Petal.Width” in this new dataset.
30 marks
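The sketch below shows one way Q3 could be approached. It assumes the built-in iris data frame as a stand-in for iris_new, and it takes "outlier" to mean a point beyond the default boxplot whiskers (1.5 times the interquartile range), which is what boxplot.stats() reports. The name iris_out is hypothetical.

```r
# Assumption: built-in iris stands in for the downloaded iris_new.
iris_new <- iris
num_cols <- names(iris_new)[sapply(iris_new, is.numeric)]

# Draw one box per numeric variable; outliers appear as individual points.
boxplot(iris_new[, num_cols], main = "iris_new: numeric variables")

# boxplot.stats()$out returns the values plotted beyond the whiskers.
outliers <- lapply(iris_new[num_cols], function(x) boxplot.stats(x)$out)
print(sapply(outliers, length))  # number of outliers per variable
print(outliers)                  # outlier values per variable

# Flag rows that hold an outlier value in at least one variable.
is_out <- Reduce(`|`, Map(function(x, o) x %in% o,
                          iris_new[num_cols], outliers))
iris_out <- iris_new[is_out, ]

print(mean(iris_out$Petal.Width))
```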
Q4.
Use ggplot2 to visualize the variables in the
iris dataset. Use an appropriate chart type to plot “Sepal.Width” vs. “Sepal.Length”, and
“Petal.Length” vs. “Species”.
30 marks
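As a rough illustration for Q4, the sketch below uses a scatter plot for the two numeric variables and a box plot for the numeric-versus-categorical pair. These chart types are one reasonable choice, not the only acceptable one. The sketch assumes the ggplot2 package is installed and uses the built-in iris data frame as a stand-in for iris_new.

```r
library(ggplot2)

# Assumption: built-in iris stands in for the downloaded iris_new.
iris_new <- iris

# Two numeric variables: scatter plot, coloured by species.
p1 <- ggplot(iris_new, aes(x = Sepal.Length, y = Sepal.Width, colour = Species)) +
  geom_point() +
  labs(title = "Sepal.Width vs. Sepal.Length")

# Numeric vs. categorical: one box per species.
p2 <- ggplot(iris_new, aes(x = Species, y = Petal.Length, fill = Species)) +
  geom_boxplot() +
  labs(title = "Petal.Length vs. Species")

print(p1)
print(p2)
```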
Q5.
From the iris_new dataset, subset Sepal.Length,
Petal.Length and Species. Find the Species
for which Sepal.Length = 5.7 and Petal.Length = 4.1.
10 marks
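A minimal sketch for Q5 is given below, again assuming the built-in iris data frame as a stand-in for iris_new. The names iris_sel and match_rows are hypothetical. The comparison uses a small tolerance rather than ==, since exact equality on decimal values can be unreliable.

```r
# Assumption: built-in iris stands in for the downloaded iris_new.
iris_new <- iris

# Keep only the three requested columns.
iris_sel <- iris_new[, c("Sepal.Length", "Petal.Length", "Species")]

# Match both values with a small tolerance instead of exact equality.
tol <- 1e-8
match_rows <- subset(iris_sel,
                     abs(Sepal.Length - 5.7) < tol &
                     abs(Petal.Length - 4.1) < tol)

print(match_rows)
print(unique(as.character(match_rows$Species)))
```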