**Directions**

- Familiarize yourself with the codebook for the Saratoga Houses dataset (codebook below).
- Import/load the Saratoga Houses dataset.
- Determine whether price and number of bathrooms are significantly associated based on an appropriate bivariate testing procedure from the previous weeks. Answer Question 1 below.
- Now construct a regression equation with price as your response variable and number of bathrooms as your explanatory variable. Build an appropriate model. Answer Questions 2-4 below.
- Test the bivariate association between living area and price. Answer Question 5 below.
- Test the bivariate association between living area and number of bathrooms. Answer Question 6 below.
- Construct a multiple regression equation to test whether number of bathrooms is significantly associated with price after controlling for living area. Answer Question 7 below.
- Familiarize yourself with the codebook for the Student Health dataset (codebook below).
- Import/load the Student Health dataset.
- Determine the mean weight of students based on major. Answer Question 8 below.
- Determine whether weight and major are significantly associated based on an appropriate bivariate testing procedure from the previous weeks. Answer Question 9 below.
- Now construct a regression equation with weight as your response variable and major as your explanatory variable. Build an appropriate model. Answer Questions 10-11 below.
- Test the bivariate association between gender and weight. Answer Question 12 below.
- Test the bivariate association between gender and major. Answer Question 13 below.
- Construct the appropriate model to determine whether weight and major are significantly associated after controlling for gender. Answer Question 14 below.
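The idea of "controlling for" a variable in the directions above can be illustrated with a small sketch. It uses made-up toy numbers (not rows from the Saratoga Houses data) in which price depends on both living area and bathrooms, and compares the bathroom slope from a simple regression with the slope obtained after removing the part of each variable explained by living area. The helper names `simple_ols` and `residuals` are hypothetical, and in practice you would fit the multiple regression with your statistics software, which also reports the p-values this sketch leaves out.

```python
def mean(values):
    return sum(values) / len(values)


def simple_ols(x, y):
    """Return (intercept, slope) of the least-squares line y = a + b*x."""
    x_bar, y_bar = mean(x), mean(y)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    slope = sxy / sxx
    return y_bar - slope * x_bar, slope


def residuals(x, y):
    """Return what is left of y after removing its linear fit on x."""
    intercept, slope = simple_ols(x, y)
    return [yi - (intercept + slope * xi) for xi, yi in zip(x, y)]


# Made-up toy values, NOT taken from the Saratoga Houses dataset.
living_area = [1000, 1500, 2000, 2500, 3000, 3500]
baths = [1, 1, 2, 2, 3, 3]
# By construction: $100 per square foot plus $5,000 per bathroom.
price = [100 * a + 5000 * b for a, b in zip(living_area, baths)]

# Simple regression: the bathroom slope also absorbs the living-area effect.
_, unadjusted_slope = simple_ols(baths, price)

# Adjusted slope: remove living area from both variables, then regress
# the price residuals on the bathroom residuals. This equals the bathroom
# coefficient in the multiple regression price ~ baths + living_area.
price_resid = residuals(living_area, price)
baths_resid = residuals(living_area, baths)
_, adjusted_slope = simple_ols(baths_resid, price_resid)

print("Unadjusted bathroom slope:", round(unadjusted_slope, 2))
print("Adjusted bathroom slope:", round(adjusted_slope, 2))
```

In this toy setup the unadjusted slope is much larger than the adjusted one because larger houses tend to have more bathrooms, which is the pattern a confounding variable produces.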

**Questions**

*Question 1:* What is the p-value for the test of whether number of bathrooms and house price are significantly associated? What conclusion can be drawn?

*Question 2:* What is the appropriate regression equation? How, specifically, can you explain the relationship between number of bathrooms and house price?

*Question 3:* Why is number of bathrooms significantly associated with house price?

*Question 4:* Why should living area be considered as a possible confounding variable?

*Question 5:* What is the p-value for the test of whether price and living area are significantly associated? What conclusion can be drawn?

*Question 6:* What is the p-value for the test of whether number of bathrooms and living area are significantly associated? What conclusion can be drawn?

*Question 7:* After controlling for living area, is there a significant association between number of bathrooms and house price?

*Question 8:* State the mean weight for nursing and engineering majors below.

*Question 9:* What are the test statistic and p-value for the test of whether major and weight are significantly associated? What conclusion can be drawn?

*Question 10:* Find an appropriate regression equation that uses weight as the response variable and major as the explanatory variable. What does the regression equation tell us about how the mean weights of Nursing majors and Engineering majors compare?

*Question 11:* Is gender a reasonable choice to consider as a confounding variable? Why or why not?

*Question 12:* What are the test statistic and p-value for the test of whether gender and weight are significantly associated? What conclusion can be drawn?

*Question 13:* What are the test statistic and p-value for the test of whether gender and major are significantly associated? What conclusion can be drawn?

*Question 14:* Are major and weight significantly associated after controlling for gender?

Submit your answers in Moodle under Mini-Assignment 9.

CODEBOOK: Saratoga Houses

The dataset contains information on 1,063 houses in Saratoga County, New York, USA in 2006.

The variables in the dataset include:

- Price: Price of house in US dollars
- Living.Area: Square feet of house
- Baths: Number of baths
- Bedrooms: Number of bedrooms
- Fireplace: “yes” or “no”
- Acres: Number of acres on the property
- Age: Age of house in years
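As a rough illustration of how the codebook maps onto a loaded dataset, the sketch below parses a few made-up rows that use the column names listed above, converts each column to a sensible type, and computes a Pearson correlation with its t statistic for Price and Baths. The rows are placeholders, not real Saratoga records, and the function names `load_houses` and `pearson_r` are hypothetical. It also assumes that a Pearson correlation is the bivariate procedure your course uses for two quantitative variables. The p-value would come from a t distribution with n - 2 degrees of freedom, which your statistics software reports directly.

```python
import csv
import io
import math

# Placeholder rows using the codebook's column names (NOT real data).
SAMPLE_CSV = """Price,Living.Area,Baths,Bedrooms,Fireplace,Acres,Age
150000,1200,1.0,2,no,0.5,30
210000,1600,1.5,3,yes,0.8,20
240000,2000,2.0,3,yes,1.0,15
300000,2600,2.5,4,no,1.2,5
"""

NUMERIC_COLUMNS = ["Price", "Living.Area", "Baths", "Bedrooms", "Acres", "Age"]


def load_houses(handle):
    """Read house records, converting numeric columns and the yes/no flag."""
    rows = []
    for raw in csv.DictReader(handle):
        row = {name: float(raw[name]) for name in NUMERIC_COLUMNS}
        row["Fireplace"] = raw["Fireplace"].strip().lower() == "yes"
        rows.append(row)
    return rows


def pearson_r(x, y):
    """Sample Pearson correlation between two equal-length lists."""
    x_bar = sum(x) / len(x)
    y_bar = sum(y) / len(y)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    syy = sum((yi - y_bar) ** 2 for yi in y)
    return sxy / math.sqrt(sxx * syy)


# With the real file you would pass open("<your file>.csv", newline="") instead.
houses = load_houses(io.StringIO(SAMPLE_CSV))
price = [h["Price"] for h in houses]
baths = [h["Baths"] for h in houses]

r = pearson_r(baths, price)
n = len(houses)
# Test statistic for H0: no linear association (n - 2 degrees of freedom).
t_stat = r * math.sqrt((n - 2) / (1 - r ** 2))

print("r =", round(r, 4), " t =", round(t_stat, 4), " df =", n - 2)
```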

CODEBOOK: Student Health

The data are self-reported by 100 student volunteers majoring in nursing or engineering at the University of Miami in 2014. The following variables were recorded:

- major: Either “Nursing” or “Engineering” based on sample collected
- gender: Listed as “Male” or “Female”
- sleep: Estimated number of hours slept on a typical night
- depression: Diagnosed as depressed (1) or not diagnosed as depressed (0)
- weight: Student weight in pounds
- smedia: Estimated amount of time (in hours) spent on social media each day
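When the explanatory variable is a two-level category such as major, a regression with a 0/1 indicator reproduces the group means: the intercept is the mean of the reference group and the slope is the difference between the two group means. The sketch below shows this with made-up toy weights (not values from the Student Health data). Treating Engineering as the reference group is an assumption made here for illustration, so check which level your software uses as its reference.

```python
def mean(values):
    return sum(values) / len(values)


def simple_ols(x, y):
    """Return (intercept, slope) of the least-squares line y = a + b*x."""
    x_bar, y_bar = mean(x), mean(y)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    slope = sxy / sxx
    return y_bar - slope * x_bar, slope


# Made-up toy records (major, weight in pounds), NOT real student data.
records = [
    ("Nursing", 120), ("Nursing", 130), ("Nursing", 140),
    ("Engineering", 150), ("Engineering", 160), ("Engineering", 170),
]

# Group means computed directly.
group_means = {}
for level in ("Nursing", "Engineering"):
    group_means[level] = mean([w for m, w in records if m == level])

# Indicator coding: 1 for Nursing, 0 for Engineering (the assumed reference).
is_nursing = [1 if m == "Nursing" else 0 for m, _ in records]
weights = [w for _, w in records]
intercept, slope = simple_ols(is_nursing, weights)

print("Group means:", group_means)
print("Intercept (Engineering mean):", intercept)
print("Slope (Nursing mean minus Engineering mean):", slope)
```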