The dataset ToyotaCorolla.csv contains data on used cars on sale during the late summer of 2004 in the Netherlands. It has 1436 records containing details on 38 attributes, including Price, Age, Kilometers, HP, and other specifications.

Explore the data using the data visualization capabilities of R. Which pairs of variables appear to be correlated?
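One way to start the exploration is a correlation matrix plus a scatterplot matrix. The sketch below uses a small synthetic data frame as a stand-in for ToyotaCorolla.csv, so it runs without the file; the column names (`Price`, `Age`, `KM`, `HP`) and the generated values are assumptions for illustration, not the real data.

```r
# Synthetic stand-in for ToyotaCorolla.csv (values and column names are assumptions)
set.seed(1)
n <- 200
age <- round(runif(n, 1, 80))
km <- pmax(0, round(age * 1500 + rnorm(n, 0, 15000)))
hp <- sample(c(69, 86, 97, 110), n, replace = TRUE)
price <- 20000 - 150 * age - 0.02 * km + 20 * hp + rnorm(n, 0, 1000)
cars.df <- data.frame(Price = price, Age = age, KM = km, HP = hp)

# Pairwise Pearson correlations, rounded for readability
cor.mat <- round(cor(cars.df), 2)
print(cor.mat)

# Scatterplot matrix to inspect the same relationships visually
pairs(cars.df)
```

With the real file, `cars.df` would instead come from `read.csv("ToyotaCorolla.csv")`, and `cor()` should be applied to the numeric columns only.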

We plan to analyze the data using various data mining techniques described in future chapters. Prepare the data for use as follows:

The dataset has two categorical attributes, Fuel Type and Metallic. Describe how you would convert these to binary variables. Confirm this using R’s functions to transform categorical data into dummies.
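A minimal sketch of the dummy conversion, assuming the columns are named `Fuel_Type` and `Met_Color` and that the metallic flag is already coded 0/1 (both are assumptions; check the actual file). It uses base R's `model.matrix()` on a toy data frame and stores the result in `origDummies.df`.

```r
# Toy data; column names and values are assumptions for illustration
cars.df <- data.frame(
  Price = c(13500, 13750, 13950, 14950, 12950, 16900),
  Fuel_Type = factor(c("Diesel", "Petrol", "CNG", "Petrol", "Diesel", "Petrol")),
  Met_Color = c(1, 0, 1, 0, 0, 1)
)

# model.matrix() expands the factor into 0/1 columns.
# "- 1" removes the intercept so every fuel type gets its own column.
origDummies.df <- as.data.frame(model.matrix(~ . - 1, data = cars.df))
print(names(origDummies.df))
print(origDummies.df)
```

A factor with three levels yields three dummy columns here. For linear regression, one of them is normally dropped to avoid perfect collinearity with the intercept.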

Prepare the dataset (with the categorical variables converted to dummies) for supervised learning by creating partitions in R. Select all the variables and use default values for the random seed and the partitioning percentages: training (50%) and validation (50%). Describe the roles that these partitions will play in modeling.

# how many total rows are there in origDummies.df?

# randomly select row numbers for the training partition

# randomly select row numbers for the validation partition: sample from (all rows - training rows)

# Now create the train.data and valid.data dataframes
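The partitioning steps above could be sketched as follows. The data frame here is synthetic, and the seed value 1 is an arbitrary assumption; with a 50/50 split, the validation partition is simply every row not chosen for training.

```r
set.seed(1)  # arbitrary seed (assumption) so the split is reproducible

# Synthetic stand-in for the dummy-coded data frame
origDummies.df <- data.frame(
  Price = rnorm(100, mean = 10000, sd = 2000),
  Age = sample(1:80, 100, replace = TRUE)
)

# Total number of rows
n.rows <- nrow(origDummies.df)

# Randomly select 50% of the row numbers for training
train.rows <- sample(seq_len(n.rows), size = floor(0.5 * n.rows))

# Validation rows: all rows minus the training rows
valid.rows <- setdiff(seq_len(n.rows), train.rows)

train.data <- origDummies.df[train.rows, ]
valid.data <- origDummies.df[valid.rows, ]
print(c(train = nrow(train.data), valid = nrow(valid.data)))
```

The model is fitted on `train.data` only; `valid.data` is held back to check how well the fitted model performs on records it has not seen.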

# 3) Propose three variables that could be used in a linear regression model

# 4) Create a linear regression model on the training dataset using variables
# Age, Kilometers and Manufacturer's Guarantee to predict "Price"
# use "reg" as the name of the model

# See the predicted values and actual values side by side

# plot the residuals

# see the model coefficients and their statistical significance

# compute accuracy on a training set
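The modeling steps from fitting `reg` through checking training accuracy might look like the sketch below. The training data is synthetic, the column names `KM` and `Mfr_Guarantee` are assumptions about the file's headers, and RMSE is used as one possible accuracy measure.

```r
# Synthetic training data (values and column names are assumptions)
set.seed(1)
n <- 300
train.data <- data.frame(
  Age = sample(1:80, n, replace = TRUE),
  KM = round(runif(n, 1000, 200000)),
  Mfr_Guarantee = rbinom(n, 1, 0.4)
)
train.data$Price <- 20000 - 150 * train.data$Age - 0.02 * train.data$KM +
  500 * train.data$Mfr_Guarantee + rnorm(n, 0, 1000)

# Fit the linear regression on the training partition
reg <- lm(Price ~ Age + KM + Mfr_Guarantee, data = train.data)

# Predicted and actual values side by side
comparison <- data.frame(
  actual = train.data$Price,
  predicted = fitted(reg),
  residual = residuals(reg)
)
print(head(comparison))

# Residuals against fitted values; a patternless band around 0 is what we hope for
plot(fitted(reg), residuals(reg), xlab = "Fitted price", ylab = "Residual")
abline(h = 0, lty = 2)

# Coefficients with standard errors, t values and p-values
print(summary(reg)$coefficients)

# Root mean squared error on the training set
rmse <- sqrt(mean(residuals(reg)^2))
print(rmse)
```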

# use the model to make a prediction using new data

# What would be your price estimate for a car that is 30 months old, has 22000 kilometers,
# and has no manufacturer's guarantee?
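A prediction for the car described above can be obtained with `predict()`. This sketch refits a model on synthetic data so that it runs on its own; the column names are assumptions, and the number it prints reflects the synthetic data, not the real ToyotaCorolla.csv.

```r
# Synthetic training data (values and column names are assumptions)
set.seed(1)
n <- 300
train.data <- data.frame(
  Age = sample(1:80, n, replace = TRUE),
  KM = round(runif(n, 1000, 200000)),
  Mfr_Guarantee = rbinom(n, 1, 0.4)
)
train.data$Price <- 20000 - 150 * train.data$Age - 0.02 * train.data$KM +
  500 * train.data$Mfr_Guarantee + rnorm(n, 0, 1000)
reg <- lm(Price ~ Age + KM + Mfr_Guarantee, data = train.data)

# New car: 30 months old, 22000 km, no manufacturer's guarantee
new.car <- data.frame(Age = 30, KM = 22000, Mfr_Guarantee = 0)

# The column names in newdata must match the predictors used in the model
estimate <- predict(reg, newdata = new.car)
print(estimate)
```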
