Quiz 11 Instructions

Please complete the following questions and submit a file named Quiz11.R to Gradescope for autograding.

Remember:

  • Do not rename external data files or edit them in any way. In other words, don’t modify CLEANDATA.txt, etc. unless specifically requested and instructed below. Your code won’t work properly on my version of that data set, if you do.

  • Do not use global paths in you script. Instead, use setwd() interactively in the console, but do not forget to remove or comment out this part of the code before you submit. The directory structure of your machine is not the same as the one on Gradescope’s virtual machines.

  • Do not destroy or overwrite any variables in your program. I check them only after I have run your entire program from start to finish.

  • Check to make sure you do not have any syntax errors. Code that doesn’t run will get a very bad grade.

  • Make sure to name your submission Quiz11.R

Tip: before submitting, it might help to clear all the objects from your workspace, and then source your file before you submit it. This will often uncover bugs.

Data Preparation – Practice converting XLSX file to CSV

Please review the instructions on “How to convert Excel file to CSV” at https://www.ablebits.com/office-addins-blog/convert-excel-csv/. Then, save the “BLACKFRIDAY” and “COFFEE” spreadsheets from “Datasets.xlsx” as two separate CSV files named “BLACKFRIDAY.csv” and “COFFEE.csv”.

After this step, we will disregard “Datasets.xlsx” and only use the two newly created CSV files for the following questions.

Question 1

  1. [1pt] Import the first 4 columns and first 8 rows of “COFFEE.csv” and save the data frame as COFFEE.PART1. [Hint: You may use nrows and colClasses for this task.]

The imported data frame should be

#>   DEMAND PRICE         X Pred_DEMAND
#> 1   1120   3.0 0.3333333   1088.8869
#> 2    999   3.1 0.3225806   1015.6815
#> 3    932   3.2 0.3125000    947.0514
#> 4    884   3.3 0.3030303    882.5807
#> 5    807   3.4 0.2941176    821.9024
#> 6    760   3.5 0.2857143    764.6914
#> 7    701   3.6 0.2777778    710.6588
#> 8    688   3.7 0.2702703    659.5469
  1. [2pts] Export the regression analysis results to file “REGRESSION_COFFEE_PART1.txt”.

[Hint: You may use cat() and sink() for this task; You may use the following R codes.

lmod <- lm(PRICE ~ DEMAND, data = COFFEE.PART1)
summary(lmod)
anova(lmod)

].

The final TXT file “REGRESSION_COFFEE_PART1.txt” should look like:

  1. [1pt] Import the last 3 columns of “COFFEE.csv” and save the data frame as COFFEE.PART2. Then rename the second column name as inverseP.

The imported data frame should be: (Below we only show the first and last 3 rows to save space)

head(COFFEE.PART2, 3)
#>      P  inverseP     Dhat
#> 1 3.00 0.3333333 1089.333
#> 2 3.01 0.3322259 1081.794
#> 3 3.02 0.3311258 1074.305
tail(COFFEE.PART2, 3)
#>       P  inverseP     Dhat
#> 69 3.68 0.2717391 670.0000
#> 70 3.69 0.2710027 664.9864
#> 71 3.70 0.2702703 660.0000
  1. [1pt] Use plot(COFFEE.PART2$P, COFFEE.PART2$inverseP) to draw the scatterplot and save the plot as “PLOT_COFFEE_PART2.png”, make sure you use bg = "transparent" to set the background to transparent.

Question 2

[2pts] Download the BLACKFRIDAY.csv file. Import “BLACKFRIDAY.csv” and save the data frame as BLACKFRIDAY. Filter the data frame with YEARS greater than 10, and save the result as BLACKFRIDAY_gt10. Export data frame BLACKFRIDAY_gt10 to CSV file “BLACKFRIDAY_gt10.csv”. Make sure you use row.names = FALSE to avoid writing the row names.

Question 3

Use “CLEANDATA.txt” for this question. Complete the following parts.

  1. [2pts]

Step 1: Skip the first five rows (consider the column names as Row 1) and import “CLEANDATA.txt”. Save the data frame as CLEANDATA.

Step 2: Scan the column names from “CLEANDATA.txt” as header. Then assemble the data frame CLEANDATA and header.

The data frame CLEANDATA should be:

#>    EXP AUTOMATI OUTPUT
#> 1   <1        M     12
#> 2   <1        M     10
#> 3   <1        L      5
#> 4   <1        L      4
#> 5   <1        L      8
#> 6  1-2        H     15
#> 7  1-2        H     18
#> 8  1-2        H     17
#> 9  1-2        M     10
#> 10 1-2        M     10
#> 11 1-2        M     12
#> 12 1-2        L      8
#> 13 1-2        L     10
#> 14 1-2        L     10
#> 15  >2        H     20
#> 16  >2        H     18
#> 17  >2        H     19
#> 18  >2        M     13
#> 19  >2        M     13
#> 20  >2        M     14
#> 21  >2        L     12
#> 22  >2        L     11
#> 23  >2        L     13
  1. [1pt] Write the CLEANDATA$AUTOMATI, one element per line, to the file “AUTOMATI.txt”.