Mini-Assignment 11


  1. Familiarize yourself with the codebook for the movies dataset below.
  2. Import/load the movies dataset.
  3. Suppose you want to determine whether movie budget is significantly associated with movie rating. Construct the appropriate regression. Answer Question 1 below.
  4. Suppose you want to determine whether movie budget is significantly associated with movie rating after controlling for MPAA designation. Construct the appropriate regression. Answer Questions 2-4 below.
  5. There is reason to believe that the relationship between budget and rating may vary with the movie's MPAA designation. Construct the appropriate regression that will allow you to test whether this is the case. Answer Question 5 below.
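
As a rough illustration of how the three regressions in steps 3-5 differ, the sketch below builds the predictor columns each model would use: budget alone, budget plus MPAA indicator variables, and budget plus MPAA indicators plus budget-by-MPAA interaction terms. The toy rows, the list of MPAA levels, the choice of PG as the reference category, and the helper names design_row, column_names, and ols are all assumptions made for illustration; they are not part of the assignment or the dataset. The small solver only returns coefficients. Standard errors and p-values, which the questions ask about, would come from the regression output of your statistical software.

```python
# Assumed MPAA levels; the first entry (PG) is treated as the reference category.
MPAA_LEVELS = ["PG", "PG-13", "R", "NC-17"]
MODELS = ("simple", "additive", "interaction")


def design_row(budget, mpaa, model):
    """Predictor row for one movie under the named model."""
    if model not in MODELS:
        raise ValueError(f"unknown model: {model}")
    row = [1.0, budget]  # intercept and budget slope
    dummies = [1.0 if mpaa == lvl else 0.0 for lvl in MPAA_LEVELS[1:]]
    if model in ("additive", "interaction"):
        row += dummies  # shifts in average rating relative to PG
    if model == "interaction":
        row += [budget * d for d in dummies]  # changes in budget slope relative to PG
    return row


def column_names(model):
    """Names matching the columns produced by design_row."""
    names = ["Intercept", "budget"]
    if model in ("additive", "interaction"):
        names += [f"mpaa[{lvl}]" for lvl in MPAA_LEVELS[1:]]
    if model == "interaction":
        names += [f"budget:mpaa[{lvl}]" for lvl in MPAA_LEVELS[1:]]
    return names


def ols(X, y):
    """Least-squares coefficients from the normal equations (X'X) b = X'y."""
    k = len(X[0])
    # Augmented matrix [X'X | X'y], solved by Gauss-Jordan elimination.
    A = [[sum(r[i] * r[j] for r in X) for j in range(k)]
         + [sum(r[i] * yi for r, yi in zip(X, y))] for i in range(k)]
    for col in range(k):
        pivot = max(range(col, k), key=lambda r: abs(A[r][col]))
        A[col], A[pivot] = A[pivot], A[col]
        for r in range(k):
            if r != col:
                f = A[r][col] / A[col][col]
                A[r] = [a - f * b for a, b in zip(A[r], A[col])]
    return [A[i][k] / A[i][i] for i in range(k)]


# Made-up rows for illustration only: (budget in millions of USD, mpaa, rating).
toy = [(10, "PG", 6.1), (40, "PG", 6.4), (15, "PG-13", 5.8), (60, "PG-13", 6.3),
       (5, "R", 6.0), (30, "R", 6.6), (2, "NC-17", 5.5), (8, "NC-17", 5.9)]
y = [rating for _, _, rating in toy]

for model in MODELS:
    X = [design_row(b, m, model) for b, m, _ in toy]
    for name, value in zip(column_names(model), ols(X, y)):
        print(f"{model:12s} {name:20s} {value: .4f}")
```

In this layout, the coefficient on an MPAA indicator compares that designation with the reference category at a fixed budget, and the interaction coefficients are the terms to examine when asking whether the budget slope differs across designations.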


Question 1: Is budget significantly associated with movie rating? Interpret the term in the model that describes the relationship.

Question 2: Is budget significantly associated with movie rating when controlling for MPAA-designation?

Question 3: How do the ratings of PG-13 movies compare to the ratings of PG movies on average when holding budget constant?

Question 4: Are NC-17 movies rated significantly differently than PG movies?

Question 5: Does it appear that the relationship between budget and rating varies by MPAA designation?

Question 6: Is there a significant association between whether an email is spam and whether the email has an attachment after controlling for the number of characters in the email message? What is the associated p-value and conclusion?

Question 7: What is the correct interpretation of the odds ratio corresponding to characters?

Question 8: Controlling for all other predictor variables, is whether a message is spam independently associated with whether there is an exclamation point in the subject? Why?
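
For Question 7, it may help to recall how a logistic regression coefficient turns into an odds ratio: exponentiating the coefficient gives the multiplicative change in the odds of the outcome for a one-unit increase in the predictor, holding the other predictors fixed. The sketch below uses a hypothetical coefficient value, not one estimated from the email data, and the function name odds_ratio is likewise only illustrative.

```python
import math


def odds_ratio(log_odds_coef, units=1.0):
    """Multiplicative change in the odds for a `units` increase in the predictor."""
    return math.exp(log_odds_coef * units)


# Hypothetical coefficient for illustration; NOT estimated from the email data.
beta_characters = -0.06

one_unit = odds_ratio(beta_characters)
percent_change = (one_unit - 1) * 100  # negative means the odds decrease

print(f"Odds ratio per one-unit increase: {one_unit:.3f}")
print(f"Percent change in odds: {percent_change:.1f}%")
print(f"Odds ratio per ten-unit increase: {odds_ratio(beta_characters, 10):.3f}")
```

An odds ratio below 1 means the odds of the outcome go down as the predictor increases, a value above 1 means they go up, and a value of exactly 1 corresponds to no association.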

Submit your answers here.

CODEBOOK: Movies Data

The Internet Movie Database (IMDb) is a website devoted to collecting movie data supplied by studios and fans. It claims to be the biggest movie database on the web and is run by Amazon. More information can be found online, including information about the data collection process.

The description of the data is as follows:

  • title. Title of the movie.
  • year. Year of release.
  • budget. Total budget (if known) in US dollars.
  • length. Length in minutes.
  • rating. Average IMDB user rating.
  • votes. Number of IMDB users who rated this movie.
  • r1-10. Multiplying by ten gives the percentile (to the nearest 10%) of users who rated this movie a 1 (for r1), and likewise for ratings 2 through 10 (r2 to r10).
  • mpaa. MPAA rating.
  • action, animation, comedy, drama, documentary, romance, short. Binary variables indicating whether the movie was classified as belonging to that genre.
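
Because the codebook notes that budget is recorded only "if known", it can be worth handling missing budgets explicitly when loading the data. The sketch below parses a few of the codebook's columns from CSV text using only the standard library. The two sample rows are made up, and the missing-value markers ("NA" or an empty field), the 0/1 coding of the genre columns, and the helper name parse_movie are assumptions; check how your copy of the dataset actually encodes these.

```python
import csv
import io

# Two made-up rows in the codebook's column layout (not real IMDb records).
SAMPLE_CSV = """title,year,budget,length,rating,votes,mpaa,action
Example Movie A,1999,25000000,101,6.4,1200,PG-13,1
Example Movie B,2001,NA,88,5.9,300,,0
"""


def parse_movie(row):
    """Convert one CSV row of strings into typed values; a missing budget becomes None."""
    budget = row["budget"]
    return {
        "title": row["title"],
        "year": int(row["year"]),
        "budget": None if budget in ("", "NA") else float(budget),
        "length": int(row["length"]),
        "rating": float(row["rating"]),
        "votes": int(row["votes"]),
        "mpaa": row["mpaa"] or None,     # empty string treated as no MPAA rating
        "action": row["action"] == "1",  # assumed 0/1 coding of the genre flag
    }


movies = [parse_movie(r) for r in csv.DictReader(io.StringIO(SAMPLE_CSV))]

# Regressions that use budget can only include movies whose budget is known.
with_budget = [m for m in movies if m["budget"] is not None]
print(f"{len(with_budget)} of {len(movies)} sample movies have a known budget")
```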