# Mastering Data Analysis in Excel (Coursera): Week 5 Quiz Answers

### Probability, AUC, and Excel Linest Function

**Question 1)**

**Keep the 125 outcomes in the Histogram Spreadsheet unchanged. Change the bin ranges so that bin 1 is [-3, -1), bin 2 is [-1, 1), and bin 3 is [1, 3).**

- Histograms Spreadsheet.xlsx

**What is the approximate probability that a new outcome will fall within bin 1?**

- .4
- 4%
- 5
- 5%
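
The probability asked for here is a relative frequency: the number of outcomes that land in a bin divided by the total number of outcomes. A minimal sketch of that calculation follows. The sample values are made up for illustration and are not the 125 outcomes from the spreadsheet; `bin_probability` is a hypothetical helper name.

```python
def bin_probability(outcomes, low, high):
    """Fraction of outcomes in the half-open interval [low, high)."""
    in_bin = sum(1 for x in outcomes if low <= x < high)
    return in_bin / len(outcomes)


# Hypothetical sample, NOT the spreadsheet's 125 outcomes.
sample = [-2.5, -1.0, -0.3, 0.0, 0.4, 0.9, 1.0, 1.7, 2.2, -1.4]

# Bin edges from the question: [-3, -1), [-1, 1), [1, 3).
bins = [(-3, -1), (-1, 1), (1, 3)]
probabilities = [bin_probability(sample, lo, hi) for lo, hi in bins]

for (lo, hi), p in zip(bins, probabilities):
    print(f"[{lo}, {hi}): {p:.2f}")
```

Note that a value sitting exactly on an edge, such as -1.0, belongs to the bin whose lower bound it equals, because each interval is closed on the left and open on the right.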

**Question 2)**

- Use the Excel Probability Functions Spreadsheet.
- Excel_Probability_Functions.xlsx
- Assume a continuous uniform probability distribution over the range [47, 51.5].

**What is the skewness of the probability distribution?**

- 0
- 49.25
- 1.69
- 2.17
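
Skewness is the third central moment divided by the cube of the standard deviation, so it is zero for any distribution that is symmetric about its mean. The sketch below checks this numerically for a uniform distribution with a midpoint sum. The function name and the step count are illustrative choices, not something taken from the spreadsheet.

```python
import math


def uniform_moments(a, b, steps=100_000):
    """Return (mean, std, skewness) of a continuous uniform on [a, b]."""
    mean = (a + b) / 2                 # closed-form mean
    std = (b - a) / math.sqrt(12)      # closed-form standard deviation
    width = (b - a) / steps
    density = 1 / (b - a)              # constant height of the uniform pdf
    third = 0.0
    for i in range(steps):
        x = a + (i + 0.5) * width      # midpoint of each slice
        third += (x - mean) ** 3 * density * width
    return mean, std, third / std ** 3


mean, std, skew = uniform_moments(47, 51.5)
print(mean, round(std, 3), round(skew, 6))
```

The mean of this distribution is 49.25, which is the midpoint of the range rather than a measure of asymmetry.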

**Question 3)**

- Use the Excel Probability Functions Spreadsheet, provided in question #2.
- Assume a continuous uniform probability distribution over the range [-12, 20].

**What is the entropy of this distribution?**

- 3 bits
- 5 bits
- 6 bits
- 4 bits
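
For a continuous uniform distribution, the differential entropy in bits is the base-2 logarithm of the width of the range. The sketch below assumes the spreadsheet uses that same definition; `uniform_entropy_bits` is a hypothetical helper name.

```python
import math


def uniform_entropy_bits(a, b):
    """Differential entropy, in bits, of a continuous uniform on [a, b]."""
    return math.log2(b - a)


# Range [-12, 20] has width 32.
entropy = uniform_entropy_bits(-12, 20)
print(entropy)
```

Doubling the width of the range adds exactly one bit, which is a quick sanity check on any answer.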

**Question 4)**

- Use the Excel Probability Functions Spreadsheet that was previously provided.
- Assume a Gaussian probability density function with mean = 3 and standard deviation = 4.

**What is the value of f(x) at x = 3.5?**

- .099
- .550
- 4.05
- .352
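
The Gaussian density can be evaluated directly from its formula, which is a useful cross-check on the spreadsheet. The sketch below uses only the standard library; `gaussian_pdf` is an illustrative name. In Excel, the equivalent call should be `NORM.DIST(3.5, 3, 4, FALSE)`.

```python
import math


def gaussian_pdf(x, mean, sd):
    """Density of a normal distribution with the given mean and sd at x."""
    z = (x - mean) / sd
    return math.exp(-0.5 * z * z) / (sd * math.sqrt(2 * math.pi))


density = gaussian_pdf(3.5, mean=3, sd=4)
print(round(density, 3))
```

Because x = 3.5 is only one eighth of a standard deviation from the mean, the result is close to the peak height 1 / (4 * sqrt(2 * pi)), roughly 0.0997.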

**Question 5)**

- Use the Excel Probability Functions Spreadsheet previously provided in this quiz.
- Assume a Gaussian Probability Distribution with mean = 3 and standard deviation = 4.

**What is the cumulative distribution at x = 7?**

- .960
- .841
- .060
- 1.00
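
The cumulative distribution of a Gaussian can be written with the error function, so it is easy to verify outside the spreadsheet. This is a minimal sketch using the standard library; `gaussian_cdf` is an illustrative name. In Excel, the equivalent call should be `NORM.DIST(7, 3, 4, TRUE)`.

```python
import math


def gaussian_cdf(x, mean, sd):
    """P(X <= x) for a normal distribution with the given mean and sd."""
    z = (x - mean) / (sd * math.sqrt(2))
    return 0.5 * (1 + math.erf(z))


cumulative = gaussian_cdf(7, mean=3, sd=4)
print(round(cumulative, 3))
```

Here x = 7 lies exactly one standard deviation above the mean, so the result is the familiar one-sigma cumulative probability of about 0.841.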

**Question 6)**

- Use the AUC Calculator Spreadsheet.
- AUC_Calculator and Review of AUC Curve.xlsx

**If the “modification factor” in the original example given in the AUC Calculator Spreadsheet is changed from -1 to -2, what is the change in the actual Area Under the ROC Curve?**

- The area increases
- The area decreases
- No change

**Question 7)**

- Use the AUC Calculator Spreadsheet provided in question #6.

**If the “modification factor” in the original example given in the AUC Calculator Spreadsheet is changed from -1 to -2, what is the threshold (row 10) that results in the lowest cost per event?**

- .45
- 3.5
- .9
- 1.3

**Question 8)**

- Refer to the AUC Calculator Spreadsheet previously provided.
- Assume a binary classification model is trained on 200 ordered pairs of scores and outcomes and has an AUC of .91 on this “training set.” The same model, on 5,000 new scores and outcomes, has an AUC of .5.

**Which statement is most likely to be correct?**

- The original model identified signal as noise and has no predictive value on new data.

- The model overfit the training set data and will need to be improved to work better on the new data.

- The original model is expected to perform worse on test set data and is functioning acceptably.
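
One way to read an AUC value is as the probability that a randomly chosen positive case scores higher than a randomly chosen negative case, so .5 means the scores rank cases no better than chance. The sketch below computes AUC by comparing every positive/negative pair. It assumes higher scores indicate the positive outcome, which may differ from the convention in the AUC Calculator Spreadsheet, and the data are made up.

```python
def auc(scores, outcomes):
    """AUC as the share of (positive, negative) pairs ranked correctly.

    Assumes a higher score means 'more likely positive'. Ties count half.
    """
    positives = [s for s, y in zip(scores, outcomes) if y == 1]
    negatives = [s for s, y in zip(scores, outcomes) if y == 0]
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


# Hypothetical scores and outcomes, not taken from the spreadsheet.
scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.3]
outcomes = [1, 1, 0, 1, 0, 0]
auc_value = auc(scores, outcomes)

# Scores that carry no information: every pair is a tie.
auc_no_signal = auc([0.5, 0.5, 0.5, 0.5], [1, 0, 1, 0])
print(round(auc_value, 3), auc_no_signal)
```

A large gap between the AUC on the data used to fit a model and the AUC on new data is the pattern described in the question above.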

**Question 9)**

- Refer to the Excel Linest Function Spreadsheet.
- Excel Linest Function.xlsx

**If a multivariate linear regression gives a weight beta(1) of 0.4 on x(1) = “age in years,” and a new input x(7) of “age in months” is added to the regression data, which of the following statements is false?**

- If the x(1) data are removed, the new beta(7) on the new x(7) data will be .033

- Using Excel linest, and including x(1) and x(7) data, the new beta(7) on the age in months will be 0.

- If the x(1) data are removed, the new beta(7) on the new x(7) data will be 0.4.
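
Rescaling an input rescales its regression weight by the inverse factor: measuring age in months makes each x value 12 times larger, so the weight shrinks by a factor of 12. The sketch below shows this with a single-input least-squares fit, which is a simplification of the multivariate case in the question. The data and the `slope` helper are made up for illustration.

```python
def slope(xs, ys):
    """Ordinary least-squares slope of ys on a single input xs."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    return sxy / sxx


# Hypothetical data generated from y = 0.4 * age_in_years + 2.
age_years = [20, 30, 40, 50]
y = [10, 14, 18, 22]
age_months = [12 * a for a in age_years]

beta_years = slope(age_years, y)
beta_months = slope(age_months, y)
print(round(beta_years, 3), round(beta_months, 3))
```

When both columns are included at once, the two inputs are perfectly collinear and no unique pair of weights exists; this sketch does not cover that case.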

**Question 10)**

- Use the Excel Linest Function Spreadsheet that was provided in question #9.

**What is the correlation, R, for the linear regression shown in the example?**

- .367
- .606
- .778 or -.778
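
For a regression with a single input, R squared is the square of the Pearson correlation R, so R squared alone fixes the magnitude of R but not its sign; the sign follows the direction of the slope. The sketch below illustrates this with made-up data and a hypothetical `pearson_r` helper.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


# Hypothetical data, not the example from the spreadsheet.
xs = [1, 2, 3, 4, 5]
ys = [2, 1, 4, 3, 5]

r = pearson_r(xs, ys)
r_flipped = pearson_r([-x for x in xs], ys)  # reverse the direction of x

# Both fits share the same R squared, but R has opposite signs.
print(round(r, 3), round(r_flipped, 3), round(r * r, 3))
```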