Mathematics, 12.11.2019 00:31 124319

For a given day $i$, we let $y_i = 1$ if the ground-level ozone concentration near some city (Houston, in our data) is at a dangerously high level. This is called an "ozone day". We let $y_i = 0$ if the ozone concentration is low enough to be considered safe. We want to predict $y_i$ from more easily measured "features" describing atmospheric pollutant levels and meteorological conditions (temperature, humidity, wind speed, etc.). There are a total of $m = 72$ of these features collected each day, which we denote by $x_i = \{x_{ij} \mid j = 1, \ldots, m\}$. Each feature $x_{ij} \in \mathbb{R}$ is a real number, and we will thus use a Gaussian distribution to model these continuous random variables. We will build a "naive Bayes" classifier, which predicts observation $i$ to be an ozone day if $p(y_i = 1 \mid x_i) > p(y_i = 0 \mid x_i)$, and a non-ozone day otherwise.
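Using Bayes' rule, this classifier is equivalent to one that chooses $y_i = 1$ if and only if

$$\frac{p_Y(1)\, f_{X|Y}(x_i \mid 1)}{f_X(x_i)} > \frac{p_Y(0)\, f_{X|Y}(x_i \mid 0)}{f_X(x_i)} \quad\Longleftrightarrow\quad \ln p_Y(1) + \ln f_{X|Y}(x_i \mid 1) > \ln p_Y(0) + \ln f_{X|Y}(x_i \mid 0). \tag{1}$$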
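The logarithmic form of the rule is also the numerically safer one to evaluate. A minimal sketch of why, using made-up per-feature density values (the numbers `1e-5` and `2e-5` are hypothetical, not taken from the ozone data): multiplying 72 small densities underflows to zero for both classes, while the sums of logarithms remain easy to compare.

```python
import math

m = 72  # number of features, as in the problem statement

# Hypothetical per-feature density values under each class (illustrative only).
dens_given_1 = [1e-5] * m
dens_given_0 = [2e-5] * m

# Direct products underflow in double precision, so the comparison is lost.
prod1 = math.prod(dens_given_1)
prod0 = math.prod(dens_given_0)

# Sums of logs stay finite and preserve the ordering of the two classes.
log1 = sum(math.log(d) for d in dens_given_1)
log0 = sum(math.log(d) for d in dens_given_0)

print(prod1, prod0)  # both products have underflowed
print(log1, log0)    # the log-domain scores still differ
```

With equal priors, the log-domain comparison here favors class 0, a distinction the direct products cannot make.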

In this equation, $p_Y(y_i)$ is the probability mass function that defines the prior probability of ozone and non-ozone days. The conditional probability density function $f_{X|Y}(x_i \mid y_i)$ describes the distribution of the $m = 72$ environmental features, which we assume depends on the type of day. We make two simplifying assumptions about these densities: the features $x_{ij}$ are conditionally independent given $y_i$, and their distributions are Gaussian. Thus:

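$$f_{X|Y}(x_i \mid 1) = \prod_{j=1}^{m} \frac{1}{\sqrt{2\pi\sigma_{1j}^2}} \exp\left\{-\frac{(x_{ij} - \mu_{1j})^2}{2\sigma_{1j}^2}\right\} \tag{2}$$

$$f_{X|Y}(x_i \mid 0) = \prod_{j=1}^{m} \frac{1}{\sqrt{2\pi\sigma_{0j}^2}} \exp\left\{-\frac{(x_{ij} - \mu_{0j})^2}{2\sigma_{0j}^2}\right\} \tag{3}$$

Given $y_i = 1$, $x_{ij}$ is Gaussian with mean $\mu_{1j}$ and variance $\sigma_{1j}^2$. Given $y_i = 0$, $x_{ij}$ is Gaussian with mean $\mu_{0j}$ and variance $\sigma_{0j}^2$. There are a total of $2m$ mean parameters and $2m$ variance parameters, since every feature $x_{ij}$ has a distinct distribution for each of the two classes.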

a) Derive equations for $\ln f_{X|Y}(x_i \mid 1)$ and $\ln f_{X|Y}(x_i \mid 0)$, the (natural) logarithms of the conditional probability density functions in equations (2, 3). For numerical robustness, simplify your answer so that it does not involve the exponential function.
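One way to carry out part (a), sketched under the Gaussian model above: write $\mu_{cj}$ and $\sigma_{cj}^2$ for the mean and variance of feature $j$ in class $c \in \{0, 1\}$. The logarithm turns the product over features into a sum and cancels the exponential:

$$\ln f_{X|Y}(x_i \mid c) = \sum_{j=1}^{m} \ln\left[\frac{1}{\sqrt{2\pi\sigma_{cj}^2}} \exp\left\{-\frac{(x_{ij} - \mu_{cj})^2}{2\sigma_{cj}^2}\right\}\right] = -\sum_{j=1}^{m} \left[\frac{1}{2}\ln\left(2\pi\sigma_{cj}^2\right) + \frac{(x_{ij} - \mu_{cj})^2}{2\sigma_{cj}^2}\right]$$

Setting $c = 1$ and $c = 0$ gives the two requested expressions. Neither involves the exponential function.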

Because ozone days are relatively rare, a classifier that always predicts $y_i = 0$ would be correct over 95% of the time, but would obviously not be practically useful for reducing ozone hazard. To evaluate our classifiers, we will thus separately compute the numbers of false alarms (predictions of ozone days when in reality $y_i = 0$) and missed detections (predictions of non-ozone days when in reality $y_i = 1$). We are willing to allow some false alarms as long as there are very few missed detections. For all parts below, assume that the mean parameters $\mu_{1j}$, $\mu_{0j}$ are set to match the mean of the empirical distribution of the training data. The demo code computes these means.
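The demo code itself is not reproduced here, so the following is only a sketch of the two bookkeeping steps this paragraph describes: per-class feature means, and the counts of false alarms and missed detections. The function names `class_means` and `error_counts` and the toy arrays are hypothetical.

```python
import numpy as np

def class_means(X, y):
    """Return (mu0, mu1): per-feature means of the rows of X with y == 0 and y == 1."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y)
    return X[y == 0].mean(axis=0), X[y == 1].mean(axis=0)

def error_counts(y_true, y_pred):
    """Return (accuracy, false_alarms, missed_detections)."""
    y_true = np.asarray(y_true)
    y_pred = np.asarray(y_pred)
    accuracy = float(np.mean(y_true == y_pred))
    # False alarm: predicted ozone day (1) when the day was actually safe (0).
    false_alarms = int(np.sum((y_pred == 1) & (y_true == 0)))
    # Missed detection: predicted safe (0) when the day was actually an ozone day (1).
    missed = int(np.sum((y_pred == 0) & (y_true == 1)))
    return accuracy, false_alarms, missed

# Toy data (made up): three training days with two features each.
X_train = np.array([[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]])
y_train = np.array([0, 0, 1])
mu0, mu1 = class_means(X_train, y_train)

# Toy labels and predictions (made up) to exercise the error counts.
y_true = np.array([0, 0, 0, 1, 1])
y_pred = np.array([0, 1, 0, 1, 0])
accuracy, false_alarms, missed = error_counts(y_true, y_pred)
print(mu0, mu1, accuracy, false_alarms, missed)
```

Reporting the two error types separately is what exposes the always-predict-0 classifier: it has high accuracy and zero false alarms, but misses every ozone day.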

b) Start by assuming the classes are equally probable ($p_Y(1) = p_Y(0) = 1/2$), and have unit variance ($\sigma_{1j}^2 = \sigma_{0j}^2 = 1$). Write code to compute the log conditional densities from part (a). Then, using equation (1), classify each test example. Report your classification accuracy, and the numbers of false alarms and missed detections. Hint: your classifier should have fewer than 10 missed detections.
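A minimal sketch of what part (b) asks for, assuming the standard Gaussian log density and NumPy arrays with one row per day. The function names `log_gaussian` and `classify` and the toy data are hypothetical. The real training and test sets are not included here, so no accuracy figures are reported.

```python
import numpy as np

def log_gaussian(X, mu, var):
    """Log of the product of per-feature Gaussian densities, one value per row of X.

    Sums -0.5*ln(2*pi*var) - (x - mu)^2 / (2*var) over features,
    so the exponential function is never evaluated.
    """
    X = np.asarray(X, dtype=float)
    var = np.asarray(var, dtype=float)
    per_feature = -0.5 * np.log(2.0 * np.pi * var) - (X - mu) ** 2 / (2.0 * var)
    return per_feature.sum(axis=1)

def classify(X, mu0, mu1, var0=1.0, var1=1.0, prior1=0.5):
    """Predict 1 where ln p_Y(1) + ln f(x|1) > ln p_Y(0) + ln f(x|0), else 0."""
    score1 = np.log(prior1) + log_gaussian(X, mu1, var1)
    score0 = np.log(1.0 - prior1) + log_gaussian(X, mu0, var0)
    return (score1 > score0).astype(int)

# Toy data (made up): two features, class means as if estimated from training data.
mu0 = np.array([0.0, 0.0])
mu1 = np.array([2.0, 2.0])
X_test = np.array([[0.1, -0.2], [1.9, 2.3], [0.8, 0.9]])

# Defaults correspond to part (b): unit variances and equal class priors.
y_pred = classify(X_test, mu0, mu1)
print(y_pred)
```

With unit variances and equal priors, the rule reduces to assigning each day to the class whose mean vector is closer in Euclidean distance. Ties go to the non-ozone class because the inequality in the rule is strict.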

c) Rather than assuming features have variance one, set the variance parameters $\sigma_{1j}^2$, $\sigma_{0j}^2$ equal to the variance of the empirical distribution of the training data. Classify each test example using equation (1) with these variance estimates. Report your classification accuracy, and the numbers of false alarms and missed detections.
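A sketch of the variance estimate for part (c), assuming NumPy arrays with one row per training day. The variance of the empirical distribution divides by the number of samples, which is NumPy's default (`ddof=0`). The function name `class_variances` and the small `floor` guard against zero-variance features are assumptions added for illustration, not part of the problem statement.

```python
import numpy as np

def class_variances(X, y, floor=1e-9):
    """Return (var0, var1): per-feature empirical variances for each class.

    The floor (an added assumption) keeps a constant feature from producing
    a zero variance, which would cause a division by zero in the log density.
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y)
    var0 = X[y == 0].var(axis=0)  # ddof=0: divide by the class sample count
    var1 = X[y == 1].var(axis=0)
    return np.maximum(var0, floor), np.maximum(var1, floor)

# Toy data (made up): the second feature is constant within class 0.
X_train = np.array([[1.0, 10.0], [3.0, 10.0], [2.0, 4.0], [4.0, 8.0]])
y_train = np.array([0, 0, 1, 1])
var0, var1 = class_variances(X_train, y_train)
print(var0, var1)
```

Unlike the unit-variance case, the $\frac{1}{2}\ln(2\pi\sigma^2)$ terms now differ between classes, so they no longer cancel in the comparison and must be kept in the log densities.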

d) Rather than assuming the classes are equally probable, estimate $p_Y(1)$ as the fraction of training examples that are ozone days. Classify each test example using equation (1) with this informative class prior, and the variances from part (c). Report your classification accuracy, and the numbers of false alarms and missed detections.
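A sketch of the prior estimate for part (d). The function name `estimate_prior` is hypothetical, and the toy label vector (1 ozone day in 20) is made up for illustration rather than taken from the data set.

```python
import numpy as np

def estimate_prior(y_train):
    """Estimate p_Y(1) as the fraction of training days labeled as ozone days."""
    return float(np.mean(np.asarray(y_train) == 1))

# Toy labels (made up): 19 safe days and 1 ozone day.
y_train = np.array([0] * 19 + [1])
prior1 = estimate_prior(y_train)

# Contribution of the prior to equation (1): ln p_Y(1) - ln p_Y(0).
log_prior_gap = np.log(prior1) - np.log(1.0 - prior1)
print(prior1, log_prior_gap)
```

In this toy case the prior term is negative, so an ozone day is predicted only when $\ln f_{X|Y}(x_i \mid 1) - \ln f_{X|Y}(x_i \mid 0)$ exceeds $\ln 19 \approx 2.94$. A rare-class prior therefore tends to trade false alarms for missed detections, which is why the two counts are reported separately.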
