
Please complete the following instructions:

1. Download the Chicago_Crimes_Assign_4500Bal.csv dataset.

2. Create an Orange workflow that will do the following:

Ingest the Chicago_Crimes_Assign_4500Bal.csv dataset.

Preprocess the data to select the 2 most relevant features.

Continuize the discrete categorical variables as numerical. See the video for which option to select here.

Create a k-Means module and let the number of clusters be chosen by the Silhouette score.

Create a Silhouette Plot.

Create a Scatter Plot and compare the two features. Color by cluster.
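To make the "number of clusters chosen by the Silhouette score" step concrete, here is a minimal plain-Python sketch of the idea: run k-means for several values of k and keep the k with the highest mean silhouette. This is not Orange's implementation. The functions `kmeans`, `mean_silhouette` and `pick_k` are hypothetical helpers written for this illustration, the start positions are chosen with a simple farthest-point rule, and the 15 points are synthetic stand-ins for the two selected features.

```python
import math


def kmeans(points, k, iters=50):
    """Plain Lloyd's k-means with a deterministic farthest-point start."""
    centers = [points[0]]
    while len(centers) < k:
        centers.append(max(points, key=lambda p: min(math.dist(p, c) for c in centers)))
    labels = []
    for _ in range(iters):
        # Assign each point to its nearest center.
        new_labels = [min(range(k), key=lambda c: math.dist(p, centers[c])) for p in points]
        if new_labels == labels:
            break
        labels = new_labels
        # Move each center to the mean of its members (unchanged if it has none).
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centers[c] = tuple(sum(axis) / len(members) for axis in zip(*members))
    return labels


def mean_silhouette(points, labels):
    """Average of s = (b - a) / max(a, b) over all points; singletons score 0."""
    clusters = {}
    for p, lab in zip(points, labels):
        clusters.setdefault(lab, []).append(p)
    total = 0.0
    for p, lab in zip(points, labels):
        own = clusters[lab]
        if len(own) == 1:
            continue
        # a: mean distance to the other members of the point's own cluster.
        a = sum(math.dist(p, q) for q in own) / (len(own) - 1)
        # b: smallest mean distance to the members of any other cluster.
        b = min(
            sum(math.dist(p, q) for q in other) / len(other)
            for key, other in clusters.items() if key != lab
        )
        total += (b - a) / (max(a, b) or 1.0)
    return total / len(points)


def pick_k(points, k_values):
    """Score every candidate k and return the best one with all scores."""
    scores = {k: mean_silhouette(points, kmeans(points, k)) for k in k_values}
    return max(scores, key=scores.get), scores


# Synthetic data: three tight groups of five points each (not the assignment data).
offsets = [(-0.5, -0.5), (-0.5, 0.5), (0.5, -0.5), (0.5, 0.5), (0.0, 0.0)]
points = [(cx + dx, cy + dy) for cx, cy in [(0, 0), (10, 0), (5, 9)] for dx, dy in offsets]

best_k, scores = pick_k(points, range(2, 7))
print("best k:", best_k)
for k, score in scores.items():
    print(k, round(score, 3))
```

On real data the scores for neighbouring values of k can be close, so it is worth reading the whole score list rather than only the winner.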

Answer the following questions in a Word document:

How many clusters did your final result contain? What was the Silhouette score for the optimal number of clusters?

What were the 2 features chosen by your preprocessing?

What can you say about the scatter plot produced? Think about how the categorical variables are transformed into numbers. You don't need to know what the values are that are encoded to make observations about the relationships between the variables.

Try a few different numbers of clusters. Look at your scatter plot. Does the result make more sense or less?

Using the Silhouette Plot at the same time as the Scatter Plot, how many points in each cluster rank at the bottom for cohesion? You can highlight them to see them on the scatter plot (see video). What can you say about these data points in each cluster?
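For the scatter plot question above, it may help to see what turning categories into numbers does to the data. The sketch below assumes the categories are mapped to integer codes; which option the video actually selects is not stated here, and a one-hot option would give 0/1 columns instead. The function `ordinal_encode` and the category values are hypothetical and are not taken from the dataset.

```python
from collections import Counter


def ordinal_encode(values):
    """Map each distinct category to an integer code, in sorted order."""
    codes = {cat: i for i, cat in enumerate(sorted(set(values)))}
    return [codes[v] for v in values], codes


# Hypothetical category values, for illustration only (not taken from the dataset).
crime_type = ["THEFT", "BATTERY", "THEFT", "ASSAULT", "BATTERY", "THEFT"]
location = ["STREET", "STREET", "APARTMENT", "STREET", "APARTMENT", "STREET"]

x, x_codes = ordinal_encode(crime_type)
y, y_codes = ordinal_encode(location)

# Every record lands on an integer grid point, so many records overlap exactly.
grid_counts = Counter(zip(x, y))
for (cx, cy), n in sorted(grid_counts.items()):
    print(f"({cx}, {cy}): {n} record(s)")
```

Two things follow from this kind of encoding: the points form a lattice rather than a cloud, and the numeric order of the codes (alphabetical in this sketch) is arbitrary, so the distance between two codes does not say how similar the two categories are.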

3. Open your workflow from the Chicago Crimes Classification Assignment. You can choose the Undersampled or Oversampled version (note that the Oversampled version will take longer for the Neural Network to train).

Add the Neural Network model widget.

Configure the Neural Network as follows:

Give it 2 hidden layers of 50 neurons in each layer

Make the Activation function: ReLu

Make the Solver: Adam

Regularization: leave as is at 0.0001

Maximal number of iterations: 100

Ensure replicable training is checked

Connect the Neural Network widget to the training data as an input and the Test & Score as output (see video)

Connect the Neural Network output to the Predict widget.

Compare the new results in Test & Score, Confusion Matrix and ROC Score.
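As a rough picture of what "2 hidden layers of 50 neurons" with a ReLU activation means, here is a minimal plain-Python sketch of the forward pass through such a network. It is an illustration only, not the widget's code. The input and output sizes, the softmax output layer and the random starting weights are all assumptions, and training (the Adam solver, regularization and the iteration limit) is not shown.

```python
import math
import random


def init_layer(n_in, n_out, rng):
    """Random weights (n_out rows of n_in values) and zero biases."""
    weights = [[rng.uniform(-0.1, 0.1) for _ in range(n_in)] for _ in range(n_out)]
    return weights, [0.0] * n_out


def dense(x, weights, biases):
    """Affine layer: one weighted sum per output neuron."""
    return [sum(w * v for w, v in zip(row, x)) + b for row, b in zip(weights, biases)]


def relu(values):
    """ReLU activation: negative values become 0."""
    return [max(0.0, v) for v in values]


def softmax(values):
    """Turn raw output scores into probabilities that sum to 1."""
    shift = max(values)
    exps = [math.exp(v - shift) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


def forward(x, layers):
    """ReLU after every hidden layer, softmax on the output layer."""
    for weights, biases in layers[:-1]:
        x = relu(dense(x, weights, biases))
    weights, biases = layers[-1]
    return softmax(dense(x, weights, biases))


rng = random.Random(0)          # fixed seed, in the spirit of "replicable training"
n_features, n_classes = 8, 2    # assumed sizes; the real ones depend on your workflow
sizes = [n_features, 50, 50, n_classes]
layers = [init_layer(a, b, rng) for a, b in zip(sizes, sizes[1:])]

# Weights plus biases across the three weight layers.
n_params = sum(a * b + b for a, b in zip(sizes, sizes[1:]))
probs = forward([0.5] * n_features, layers)
print("trainable parameters:", n_params)
print("class probabilities:", [round(p, 3) for p in probs])
```

With these assumed sizes the network has a few thousand trainable parameters, and each training iteration passes over every row, which is one way to see why the larger Oversampled version takes longer to train.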

In the Word Document you created for clustering, answer these questions:

List the results in the Word Document.

Did the Neural Network model perform better or worse than the other models?

Why do you think it performed better or worse?
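When listing and comparing results, it can help to recompute the headline numbers from the Confusion Matrix counts. The sketch below assumes a two-class target; for more classes the scores are usually averaged across classes. The function `binary_metrics` is a hypothetical helper and the counts are made up, so replace them with the values from your own workflow.

```python
def binary_metrics(tp, fp, fn, tn):
    """Accuracy, precision, recall and F1 from the four confusion-matrix cells."""
    total = tp + fp + fn + tn
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return {"accuracy": (tp + tn) / total, "precision": precision, "recall": recall, "f1": f1}


# Made-up counts for two hypothetical models; replace with your own Confusion Matrix values.
results = {
    "Model A": binary_metrics(tp=80, fp=20, fn=20, tn=80),
    "Model B": binary_metrics(tp=90, fp=30, fn=10, tn=70),
}
for name, m in results.items():
    print(name, {key: round(value, 3) for key, value in m.items()})
```

In this made-up example both models have the same accuracy but trade precision against recall differently, which is why "better or worse" is easier to argue from several metrics than from a single one.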

Send me your email for the link!
