Mini-Assignment 2

For this assignment you will be working with the Health Evaluation and Linkage to Primary Care (HELP) data set. The data set is located on the P-drive in the Practice Data folder.

The HELP study was a clinical trial for adult inpatients recruited from a detoxification unit. Patients with no primary care physician were randomized to receive a multidisciplinary assessment and a brief motivational intervention or usual care, with the goal of linking them to primary medical care.

The data set contains 453 observations on the following variables:

  • age: subject age at baseline (in years)
  • anysub: use of any substance post-detox: a factor with levels no yes
  • cesd: Center for Epidemiologic Studies Depression measure at baseline (high scores indicate more depressive symptoms)
  • d1: lifetime number of hospitalizations for medical problems (measured at baseline)
  • daysanysub: time (in days) to first use of any substance post-detox
  • dayslink: time (in days) to linkage to primary care
  • drugrisk: Risk Assessment Battery drug risk scale at baseline
  • e2b: number of times in past 6 months entered a detox program (measured at baseline)
  • female: 0 for male, 1 for female
  • sex: a factor with levels male female
  • g1b: experienced serious thoughts of suicide in last 30 days (measured at baseline): a factor with levels no yes
  • homeless: housing status: a factor with levels housed homeless
  • i1: average number of drinks (standard units) consumed per day, in the past 30 days (measured at baseline)
  • i2: maximum number of drinks (standard units) consumed per day, in the past 30 days (measured at baseline)
  • id: subject identifier
  • indtot: Inventory of Drug Use Consequences (InDUC) total score (measured at baseline)
  • linkstatus: post-detox linkage to primary care (0 = no, 1 = yes)
  • link: post-detox linkage to primary care: no yes
  • mcs: SF-36 Mental Component Score (measured at baseline, lower scores indicate worse status)
  • pcs: SF-36 Physical Component Score (measured at baseline, lower scores indicate worse status)
  • pss_fr: perceived social support by friends (measured at baseline, higher scores indicate more support)
  • racegrp: race/ethnicity: levels black hispanic other white
  • satreat: any BSAS substance abuse treatment at baseline: no yes
  • sexrisk: Risk Assessment Battery sex risk score (measured at baseline)
  • substance: primary substance of abuse: alcohol cocaine heroin
  • treat: randomized to HELP clinic: no yes


  1. Load in the HELP data set.
  2. Make a frequency table for sex. Then separately make a frequency table for d1.
  3. Answer Questions 1-3 below.
  4. Now, subset the data to only include patients whose primary substance of abuse is cocaine and who are at least 40 years old.
  5. Make a frequency table for sex based on this subset.
  6. Answer Questions 4-5 below.
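
The steps above can be sketched in code. The example below is only an illustration in plain Python, not the required method for this assignment; use whichever software the course uses. The records are made-up toy rows, not the HELP data, and the helper name frequency_table is hypothetical. Only the variable names (age, sex, substance, d1) come from the list above.

```python
from collections import Counter

# Made-up toy records, NOT the HELP data; only the variable names match the handout.
patients = [
    {"id": 1, "age": 45, "sex": "male", "substance": "cocaine", "d1": 0},
    {"id": 2, "age": 38, "sex": "female", "substance": "cocaine", "d1": 2},
    {"id": 3, "age": 52, "sex": "female", "substance": "cocaine", "d1": 1},
    {"id": 4, "age": 41, "sex": "male", "substance": "alcohol", "d1": 0},
    {"id": 5, "age": 40, "sex": "male", "substance": "cocaine", "d1": 7},
]


def frequency_table(rows, variable):
    """Return {value: (count, percent)} for one variable (hypothetical helper)."""
    counts = Counter(row[variable] for row in rows)
    total = sum(counts.values())
    return {value: (n, 100 * n / total) for value, n in sorted(counts.items())}


# Step 2: frequency tables on the full data, one variable at a time.
sex_table = frequency_table(patients, "sex")
d1_table = frequency_table(patients, "d1")

# Step 4: keep rows that satisfy BOTH conditions ("at least 40" means age >= 40).
subset = [
    row for row in patients
    if row["substance"] == "cocaine" and row["age"] >= 40
]

# Step 5: the same frequency table, computed on the subset only.
subset_sex_table = frequency_table(subset, "sex")

for value, (n, pct) in subset_sex_table.items():
    print(f"{value}: n={n}, {pct:.1f}%")
```

Note that the percentages in the subset table use the subset size as the denominator, not the size of the full data set.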


Question 1: How many patients in the study are female?

Question 2: How many patients in the study have never been hospitalized for medical problems?

Question 3: What percentage of patients in the study have been hospitalized fewer than 5 times? (Round to the nearest whole percentage.)

Question 4: How many patients in the study are at least 40 years old and have cocaine listed as their primary substance of abuse?

Question 5: What percentage of patients who are at least 40 years old and have cocaine listed as their primary substance of abuse are male?
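
Questions 3 and 5 ask for percentages. As a minimal sketch of the arithmetic for a "fewer than" question, the example below counts the values strictly below a cutoff and divides by the total. The values are made up and are not the HELP data, and the function name percent_below is hypothetical.

```python
# Made-up hospitalization counts, NOT the HELP data.
d1_values = [0, 0, 1, 2, 2, 3, 5, 6, 8, 12]


def percent_below(values, cutoff):
    """Percentage of values strictly less than cutoff (hypothetical helper)."""
    below = sum(1 for v in values if v < cutoff)
    return 100 * below / len(values)


# "Fewer than 5" excludes 5 itself, so the comparison is v < 5, not v <= 5.
pct = percent_below(d1_values, 5)

# Round to the nearest whole percentage.
# Note: Python's round() sends exact halves to the nearest even integer.
rounded = round(pct)

print(f"{pct:.1f}% -> {rounded}%")
```

The same result can be read from a frequency table by adding the percentages for the values 0 through 4.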

Submit your answers in Moodle under Mini-Assignment 2.