Check the attachments.

Please read the instructions and questions carefully in the “Assignment_4_2023_Fall.pdf” file and use “Auto.csv” to finish the assignment. You should submit both 1) your R code and 2) a PDF report with answers through the link “Submit Assignment 4 Here”.

Guidelines:

· Use only R for this assignment

· Submit both R code and Report on findings

· Work is to be done individually for this assignment

Fitting a Classification Tree

1. This problem involves the OJ data set, which is part of the ISLR package (Hint: the first three lines of code should be: library(tree), library(ISLR), attach(OJ)).

1.1 Create a training set containing a random sample of 800 observations, and a test set containing the remaining observations. Take a screenshot of your code. (Hint: set.seed (2), train=sample())
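A minimal sketch of the split described in 1.1, assuming the ISLR package is installed. The object names OJ.train and OJ.test are illustrative and not prescribed by the assignment.

```r
library(ISLR)   # provides the OJ data set

set.seed(2)     # seed given in the hint, so the split is reproducible

# Draw 800 distinct row indices for the training set
train <- sample(1:nrow(OJ), 800)

OJ.train <- OJ[train, ]    # 800 sampled observations
OJ.test  <- OJ[-train, ]   # all remaining observations

# Check the sizes of the two sets
nrow(OJ.train)
nrow(OJ.test)
```

Negative indexing (OJ[-train, ]) keeps every row that was not sampled, so the two sets never overlap.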

1.2 Fit a tree to the training data, with Purchase as the response and the other variables as predictors. Use the summary() function to produce summary statistics about the tree. Take a screenshot of the summary statistics. How many terminal nodes does the tree have? What is the training misclassification error rate?

1.3 Plot the tree and take a screenshot of the tree (Hint: plot() and text())
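Steps 1.2 and 1.3 could be sketched as follows, assuming the tree and ISLR packages are installed. The name tree.oj is illustrative.

```r
library(tree)
library(ISLR)

set.seed(2)
train <- sample(1:nrow(OJ), 800)   # same split as in 1.1

# Fit a classification tree on the training rows only;
# "Purchase ~ ." uses every other column as a predictor
tree.oj <- tree(Purchase ~ ., data = OJ, subset = train)

# The printed summary lists the variables actually used, the number of
# terminal nodes, and the training misclassification error rate
tree.summary <- summary(tree.oj)
tree.summary

# Draw the tree, then add the split labels
plot(tree.oj)
text(tree.oj, pretty = 0)   # pretty = 0 prints full factor level names
```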

1.4 Predict the response on the test data, and produce a confusion matrix comparing the test labels to the predicted test labels. What is the accuracy rate?
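One possible way to carry out 1.4, assuming the tree and ISLR packages are installed. The object names are illustrative.

```r
library(tree)
library(ISLR)

set.seed(2)
train   <- sample(1:nrow(OJ), 800)
OJ.test <- OJ[-train, ]
tree.oj <- tree(Purchase ~ ., data = OJ, subset = train)

# type = "class" returns predicted labels instead of class probabilities
tree.pred <- predict(tree.oj, OJ.test, type = "class")

# Confusion matrix: rows are predicted labels, columns are true labels
conf.mat <- table(Predicted = tree.pred, Actual = OJ.test$Purchase)
conf.mat

# Accuracy = correctly classified observations (the diagonal) / all observations
accuracy <- sum(diag(conf.mat)) / sum(conf.mat)
accuracy
```

The test error rate is simply 1 - accuracy.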

1.5 Apply the cv.tree() function to the training set in order to determine the optimal tree size. (Use set.seed(7)). Print the results (Hint: the results should contain the size, k, method etc).

1.6 Produce a plot with tree size (i.e. size) on the x-axis and cross-validated classification error rate (i.e. dev) on the y-axis.

1.7 Which tree size corresponds to the lowest cross-validated classification error rate (i.e. dev)?
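Steps 1.5 to 1.7 might look like the sketch below, assuming the tree and ISLR packages are installed. Passing FUN = prune.misclass is an assumption borrowed from the ISLR lab so that pruning is guided by misclassifications; in that case dev holds the number of cross-validated misclassifications rather than a rate. The object names are illustrative.

```r
library(tree)
library(ISLR)

set.seed(2)
train   <- sample(1:nrow(OJ), 800)
tree.oj <- tree(Purchase ~ ., data = OJ, subset = train)

set.seed(7)   # seed given in 1.5
cv.oj <- cv.tree(tree.oj, FUN = prune.misclass)
cv.oj         # prints size, dev, k and method

# Tree size against cross-validated error
plot(cv.oj$size, cv.oj$dev, type = "b",
     xlab = "Tree size (terminal nodes)",
     ylab = "Cross-validated misclassifications (dev)")

# Size with the smallest dev; which.min() returns the first minimum,
# so ties are resolved by the order in which cv.tree lists the sizes
best.size <- cv.oj$size[which.min(cv.oj$dev)]
best.size
```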

1.8 Produce a pruned tree corresponding to the optimal tree size obtained using cross-validation. Take a screenshot of a pruned tree. What is the accuracy rate for the pruned tree? Is it improved compared to the accuracy rate in (1.4)?

1.9 If cross-validation does not lead to selection of a pruned tree (i.e. the accuracy rate produced in (1.8) is lower than the one in (1.4)), then create a pruned tree with five terminal nodes. What is the accuracy rate now?
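A sketch of the pruning step, assuming the tree and ISLR packages are installed. It uses the five-node tree from 1.9; for 1.8, replace best = 5 with the size selected by cross-validation. The object names are illustrative.

```r
library(tree)
library(ISLR)

set.seed(2)
train   <- sample(1:nrow(OJ), 800)
OJ.test <- OJ[-train, ]
tree.oj <- tree(Purchase ~ ., data = OJ, subset = train)

# Prune back to (at most) five terminal nodes, guided by misclassification
prune.oj <- prune.misclass(tree.oj, best = 5)

plot(prune.oj)
text(prune.oj, pretty = 0)

# Test accuracy of the pruned tree, for comparison with the unpruned tree
prune.pred     <- predict(prune.oj, OJ.test, type = "class")
prune.accuracy <- mean(prune.pred == OJ.test$Purchase)
prune.accuracy
```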

Fitting a Regression Tree

2. In the lab, a classification tree was applied to the Carseats data set after converting Sales into a qualitative response variable. Now we will seek to predict Sales using regression trees and related approaches, treating the response as a quantitative variable.

2.1 Use the validation-set approach to split the data set into a training set and a test set (Hint: use set.seed(2); in the validation-set approach, half of the observations are selected as the training set and the other half are treated as the test set). Take a screenshot of your code.

2.2 Fit a regression tree to the training set.

a) Use summary () to print out the results. How many terminal nodes do you get? What is RMD (Residual Mean Deviance)?

b) Plot the tree and take a screenshot of the tree;

c) What test MSE do you obtain?
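Steps 2.1 and 2.2 could be sketched as below, assuming the tree and ISLR packages are installed. The object names are illustrative.

```r
library(tree)
library(ISLR)

# 2.1: validation-set approach, half of the rows for training
set.seed(2)
train <- sample(1:nrow(Carseats), nrow(Carseats) / 2)

# 2.2: Sales is numeric, so tree() fits a regression tree
tree.cs <- tree(Sales ~ ., data = Carseats, subset = train)

# a) The summary reports the number of terminal nodes and the
#    residual mean deviance
summary(tree.cs)

# b) Plot the tree with split labels
plot(tree.cs)
text(tree.cs, pretty = 0)

# c) Test MSE: mean squared difference between predicted and observed Sales
yhat     <- predict(tree.cs, newdata = Carseats[-train, ])
test.mse <- mean((yhat - Carseats$Sales[-train])^2)
test.mse
```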

2.3 Use cross-validation in order to determine the optimal level of tree complexity (use set.seed(2)).

a) Produce a plot with tree size on the x-axis and cross-validated error (i.e. dev, the deviance) on the y-axis.

b) What is the optimal level of tree complexity?

c) Using the optimal level of tree size to prune the tree, does pruning the tree improve the test MSE?
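A hedged sketch of 2.3, assuming the tree and ISLR packages are installed. For a regression tree, cv.tree() reports the cross-validated deviance in dev. The max(..., 2) guard is an assumption added here so that the pruned tree always has at least one split and can be plotted; the object names are illustrative.

```r
library(tree)
library(ISLR)

set.seed(2)
train   <- sample(1:nrow(Carseats), nrow(Carseats) / 2)
tree.cs <- tree(Sales ~ ., data = Carseats, subset = train)

set.seed(2)   # seed given in 2.3
cv.cs <- cv.tree(tree.cs)

# a) Tree size against cross-validated deviance
plot(cv.cs$size, cv.cs$dev, type = "b",
     xlab = "Tree size (terminal nodes)",
     ylab = "Cross-validated deviance (dev)")

# b) Size with the lowest cross-validated deviance
best.size <- cv.cs$size[which.min(cv.cs$dev)]
best.size <- max(best.size, 2)   # guard (assumption): keep at least one split

# c) Prune to that size and compare test MSE with the unpruned tree
prune.cs  <- prune.tree(tree.cs, best = best.size)
y.test    <- Carseats$Sales[-train]
mse.full  <- mean((predict(tree.cs,  newdata = Carseats[-train, ]) - y.test)^2)
mse.prune <- mean((predict(prune.cs, newdata = Carseats[-train, ]) - y.test)^2)
c(unpruned = mse.full, pruned = mse.prune)
```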

2.4 Use the bagging approach in order to analyze this data. Take a screenshot of the results. What test MSE do you obtain? (Hint: use set.seed(1); mtry=10 since we have 10 predictors in the Carseats data set and we use all of the predictors in the bagging approach).
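Bagging as described in 2.4 could be sketched as follows, assuming the randomForest and ISLR packages are installed. Bagging is a random forest in which every predictor is a candidate at each split, hence mtry equals the number of predictors. The object names are illustrative.

```r
library(randomForest)
library(ISLR)

set.seed(2)
train <- sample(1:nrow(Carseats), nrow(Carseats) / 2)

n.pred <- ncol(Carseats) - 1   # every column except Sales (10 predictors)

set.seed(1)   # seed given in the hint
bag.cs <- randomForest(Sales ~ ., data = Carseats, subset = train,
                       mtry = n.pred, importance = TRUE)
bag.cs        # prints the call, number of trees, and out-of-bag error

# Test MSE on the held-out half
yhat.bag <- predict(bag.cs, newdata = Carseats[-train, ])
bag.mse  <- mean((yhat.bag - Carseats$Sales[-train])^2)
bag.mse
```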

2.5 Use random forests to analyze this data.

a) What test MSE do you obtain? (Hint: use set.seed(1); mtry=10/3 since we usually use 1/3 of the predictors when building a random forest of regression trees)

b) Use the importance() function to determine which variables are most important. Take a screenshot of your results.

c) Plots of these importance measures can be produced using the varImpPlot() function. Take a screenshot of your output.

d) So which variables are most important?
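A sketch of 2.5, assuming the randomForest and ISLR packages are installed. Because mtry must be a whole number of predictors, this sketch passes floor(10/3) = 3 explicitly instead of 10/3; that choice, like the object names, is an assumption made for illustration.

```r
library(randomForest)
library(ISLR)

set.seed(2)
train <- sample(1:nrow(Carseats), nrow(Carseats) / 2)

set.seed(1)   # seed given in the hint
rf.cs <- randomForest(Sales ~ ., data = Carseats, subset = train,
                      mtry = floor(10 / 3), importance = TRUE)

# a) Test MSE on the held-out half
yhat.rf <- predict(rf.cs, newdata = Carseats[-train, ])
rf.mse  <- mean((yhat.rf - Carseats$Sales[-train])^2)
rf.mse

# b) Importance table: %IncMSE and IncNodePurity for each predictor
imp <- importance(rf.cs)
imp

# c) Dot charts of the same two measures
varImpPlot(rf.cs)

# d) Predictors ranked by %IncMSE, most important first
imp[order(imp[, "%IncMSE"], decreasing = TRUE), ]
```

Larger values of either measure indicate that a predictor contributes more to reducing prediction error.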

What to submit:

1. R code.

a. Should include all the code to accomplish the tasks.

b. Clear and concise comments to indicate what part of the assignment each code chunk pertains to.

c. Code should be easily readable.

d. Filename should be in the format of: LastnameFirstname_A4.R

2. Report.

a. Take screenshots of your outputs in RStudio and answer all the questions.

b. Submit in PDF format.

c. Answer questions clearly and concisely.

d. Include appropriate plots.

e. Make sure the plots are properly labeled.

The assignment will be graded on the correctness of the answers, comprehensiveness of the analysis, clarity of the presentation of results, and neatness of the report.
