Lab 3

Question 1

[5 pts] List all the ways that a list differs from an atomic vector.

Question 2

Use the longley dataset from the R package datasets.

  1. [20 pts] Fix each of the following common data frame subsetting errors, and explain what’s wrong.

longley[longley$GNP.deflator = 100, ]
longley[-1:4, ] # remove the first four rows
longley[longley$GNP.deflator <= 99]
longley[longley$GNP.deflator == 83 | 99 | 100, ]

  2. [10 pts] Why does longley[1:15] return an error? How does it differ from the similar longley[1:15,]?

  3. [5 pts] Find the dimensions of longley. Display the 7 column names of longley.

  4. [5 pts] Run the command longley[, c(TRUE, FALSE)] and describe the output. Also, describe how the recycling rule applies in this context.

  5. [5 pts] List as many ways as possible (at least three) to extract the third value of the GNP variable, which is 258.054, in the longley dataset.

Question 3

  1. [5 pts] Re-create the data frame DeathRate shown below. You may need the rep() function and the following vectors. Note that the values in the Death_rate column are decimals.
c("0-4", "5-9", "10-14", "15-19", "20-24", "25-34", "35-44", "45-54", "55-64", "65-74", "75+")
c(9.4, 10.0, 9.3, 8.6, 7.6, 13.3, 12.7, 11.3, 9.1, 5.8, 2.8)
c(11.8, 13.9, 12.8, 12.2, 9.6, 12.6, 11.0, 8.3, 4.6, 2.3, 1.0)
c(2056, 186, 140, 223, 370, 391, 545, 1085, 2036, 5219, 13645)
c(2392, 185, 184, 426, 645, 871, 1242, 1994, 3313, 6147, 14136)
#>      Age       Location Population_size Death_rate
#> 1    0-4          Maine             9.4    0.02056
#> 2    5-9          Maine            10.0    0.00186
#> 3  10-14          Maine             9.3    0.00140
#> 4  15-19          Maine             8.6    0.00223
#> 5  20-24          Maine             7.6    0.00370
#> 6  25-34          Maine            13.3    0.00391
#> 7  35-44          Maine            12.7    0.00545
#> 8  45-54          Maine            11.3    0.01085
#> 9  55-64          Maine             9.1    0.02036
#> 10 65-74          Maine             5.8    0.05219
#> 11   75+          Maine             2.8    0.13645
#> 12   0-4 South Carolina            11.8    0.02392
#> 13   5-9 South Carolina            13.9    0.00185
#> 14 10-14 South Carolina            12.8    0.00184
#> 15 15-19 South Carolina            12.2    0.00426
#> 16 20-24 South Carolina             9.6    0.00645
#> 17 25-34 South Carolina            12.6    0.00871
#> 18 35-44 South Carolina            11.0    0.01242
#> 19 45-54 South Carolina             8.3    0.01994
#> 20 55-64 South Carolina             4.6    0.03313
#> 21 65-74 South Carolina             2.3    0.06147
#> 22   75+ South Carolina             1.0    0.14136
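
As a reminder of how rep() behaves, here is a minimal sketch on made-up vectors (the names groups and toy are hypothetical and unrelated to the lab data). It only contrasts the times and each arguments; how to apply them to DeathRate is left to you.

```r
# Toy vector, not part of the lab data.
groups <- c("A", "B")

# `times` repeats the whole vector end to end.
whole <- rep(groups, times = 3)   # "A" "B" "A" "B" "A" "B"

# `each` repeats every element in turn before moving to the next.
block <- rep(groups, each = 3)    # "A" "A" "A" "B" "B" "B"

# Both forms can fill columns of a data frame of matching length.
toy <- data.frame(
  group = block,
  id    = rep(1:3, times = 2)     # 1 2 3 1 2 3
)
print(toy)
```
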
  2. [5 pts] Write a command that returns the following data from DeathRate. Note that the extracted rows are those where Death_rate is greater than or equal to 0.023.
#>          Location   Age Population_size
#> 10          Maine 65-74             5.8
#> 11          Maine   75+             2.8
#> 12 South Carolina   0-4            11.8
#> 20 South Carolina 55-64             4.6
#> 21 South Carolina 65-74             2.3
#> 22 South Carolina   75+             1.0

Question 4

Use expand.grid() to create the deck data frame as follows. (Not required, but you can use is.data.frame() to check whether deck is a data frame.)

suit <- c('spades', 'hearts', 'clubs', 'diamonds')
face <- 1:13
deck <- expand.grid(suit, face)
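
For intuition about what expand.grid() produces, the following sketch builds a much smaller grid from made-up inputs (the names grid, letter, and number are hypothetical). It shows that the result has one row per combination and that the first argument varies fastest.

```r
# Toy inputs, unrelated to the deck: 2 letters crossed with 3 numbers.
grid <- expand.grid(letter = c("a", "b"), number = 1:3)

# One row per combination: 2 * 3 = 6 rows.
n_combinations <- nrow(grid)

# The first argument cycles fastest, so the rows start (a, 1), (b, 1), (a, 2), ...
print(grid)
```
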
  1. [5 pts] What are the column names of deck? Additionally, please rename the columns of deck as suit and face, respectively.

  2. [5 pts] Please write some commands to count the number of rows whose suit is hearts.

  3. [10 pts] Please shuffle the rows of deck. Please use set.seed(1) and sample() for this question.