Lab 2

Question 1

Let \(X\) be a matrix with column names. For instance:

X <- matrix(1:12, byrow=TRUE, nrow=3)      # example matrix
dimnames(X)[[2]] <- c("a", "b", "c", "d")  # set column names
print(X)
#>      a  b  c  d
#> [1,] 1  2  3  4
#> [2,] 5  6  7  8
#> [3,] 9 10 11 12
  1. [5pts] Recreate the matrix above using the ncol argument instead of nrow, and set the column names with colnames() instead of dimnames().

  2. Explain the meaning of the following expressions involving matrix subsetting. Note that a few of them are invalid. For example,

Example 1

X[1,]
#> a b c d 
#> 1 2 3 4

Meaning: the first row of X

Example 2
X[, c(1, "c")]

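The expression above is invalid: c(1, "c") mixes a number with a string, so R coerces both to character and then looks for a column named "1". Either of the corrected expressions below extracts the first and third columns.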

X[, c("a", "c")]
#>      a  c
#> [1,] 1  3
#> [2,] 5  7
#> [3,] 9 11

# OR
X[, c(1, 3)]
#>      a  c
#> [1,] 1  3
#> [2,] 5  7
#> [3,] 9 11
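
To see why Example 2 fails, it can help to inspect the index vector on its own. The sketch below rebuilds the example matrix and captures the error with tryCatch(); the names idx and msg are only illustrative and are not part of the lab.

```r
# Mixing a number and a string in c() coerces everything to character.
idx <- c(1, "c")
print(idx)         # a character vector: "1" "c"
print(class(idx))  # "character"

# The example matrix from the start of this question.
X <- matrix(1:12, byrow = TRUE, nrow = 3)
dimnames(X)[[2]] <- c("a", "b", "c", "d")

# "1" is looked up as a column *name*, and no column is called "1",
# so the subsetting fails. tryCatch() captures the error message.
msg <- tryCatch(X[, idx], error = function(e) conditionMessage(e))
print(msg)         # the message reports a subscript out of bounds
```

The same reasoning applies to any index vector that mixes positions and names: pick one kind of index and use it consistently.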

[10pts] Please work on the questions below.

X[2, ]

X[, 3]

X[3, 1]

X[, "a"]

X[, c("a", "b", "c")]

X[, -2]

X[X[,1] > 5, ]

X[X[,1]>=5 & X[,1]<=10, ]

X[X[,1]>=5 & X[,1]<=10, c("a", "b", "c")]

X[, c(1, "b", "d")]
  3. [5pts] Use the row-bind function rbind() to add a new row 13, 14, 15, 16 to \(X\) (still denoted \(X\) after the addition), so that it prints as follows.
#>       a  b  c  d
#> [1,]  1  2  3  4
#> [2,]  5  6  7  8
#> [3,]  9 10 11 12
#> [4,] 13 14 15 16
  4. [5pts] Use the function rowMeans() to find the mean of each row of the \(X\) obtained in the previous part, and store the resulting vector as rmeans. Then use the column-bind function cbind() to add rmeans to \(X\) as a new column.
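
As a rough illustration of how rbind(), rowMeans() and cbind() fit together, here is a minimal sketch on a made-up matrix. The names M and m_means and all the values are hypothetical and unrelated to \(X\).

```r
# A hypothetical 2 x 3 matrix, unrelated to X.
M <- matrix(c(2,  4,  6,
              8, 10, 12), nrow = 2, byrow = TRUE)
colnames(M) <- c("p", "q", "r")

# rbind() appends a row; the new row needs one value per column.
M <- rbind(M, c(14, 16, 18))

# rowMeans() returns one mean per row.
m_means <- rowMeans(M)
print(m_means)      # 4 10 16

# cbind() appends a column; the variable name becomes the column name.
M <- cbind(M, m_means)
print(colnames(M))  # "p" "q" "r" "m_means"
```

Note that the row means are computed before the new column is attached, so the means themselves are not averaged in.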

Question 2

[5pts] Create the matrix shown below.

(Please note the NA and the 0 in the matrix, which do not follow the pattern of the other elements.)

#>      [,1] [,2] [,3] [,4] [,5] [,6]
#> [1,]    1   21   41   61   81  101
#> [2,]    5   25   45   65   85  105
#> [3,]    9   NA   49   69   89  109
#> [4,]   13   33   53   73   93    0
#> [5,]   17   37   57   77   97  117
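
One possible approach is to generate a regular sequence, shape it into a matrix, and then overwrite the exceptions one at a time. The sketch below shows that idea on a smaller, made-up matrix G; the sequence, the dimensions and the overwritten positions are illustrative assumptions, not the answer to this question.

```r
# A regular sequence 2, 4, ..., 24 (12 values).
vals <- seq(2, 24, by = 2)

# matrix() fills column by column unless byrow = TRUE is given.
G <- matrix(vals, nrow = 3)

# Overwrite individual cells that should break the pattern.
G[2, 3] <- NA   # row 2, column 3
G[3, 4] <- 0    # row 3, column 4
print(G)
```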

Question 3

Given the vector of months below, how can we identify whether it contains any typos? Find the answer by completing the following steps.

y <- c(
  "Jan", "Feb", "Feb", "Jan", "Jul", "Oct", "Apr", "Feb",
  "Jun", "Jul", "Dec", "Dec", "Mar", "Dec", "Aug", "Sep",
  "Aug", "Sep", "Dec", "Nov", "Sep", "Feb", "May", "Nov",
  "Oct", "Jun", "Sep", "Oct", "Aug", "Nov", "May", "May",
  "Jul", "May", "Sep", "Oct", "Jul", "Mar", "Aug", "Mar",
  "Mar", "Jan", "Aug", "Jun", "Oct", "Apr", "Dec", "Dec",
  "Oct", "Sep", "Oct", "Aug", "Feb", "Nov", "Dec", "Sep",
  "May", "Sep", "Feb", "Dec", "Oct", "Dec", "Aug", "Dec",
  "Nov", "Sep", "Mar", "Nov", "Dec", "May", "Jul", "Nov",
  "Dec", "Dec", "Mar", "Oct", "Nov", "Sep", "Dec", "May",
  "Jan", "Oct", "Jul", "Jun", "Oct", "Dec", "Jul", "Jun",
  "Mar", "Jan", "Aug", "Mar", "Dec", "Feb", "Jul", "May",
  "Jul", "Oct", "Jan", "Nov", "Aug", "Jan", "Nov", "Jun",
  "Dec", "Jul", "Mar", "May", "Feb", "May", "Jan", "Oct",
  "May", "Sep", "Jan", "Aug", "Jul", "Feb", "Nov", "Feb",
  "Oct", "Apr", "Jan", "Jun", "Aug", "Jan", "Jun", "Nov",
  "Ju1", "Sep", "May", "Sep", "Aug", "Aug", "Feb", "Feb",
  "Jun", "Jun", "Sep", "Apr", "Dec", "Mar", "Jan", "Jun",
  "Feb", "Jun", "Oct", "Nov", "Jul", "May", "Oct", "Jul",
  "Nov", "Mar", "Aug", "Mar", "Ju1", "Feb", "Sep", "May",
  "Sep", "Sep", "Sep", "Mar", "Feb", "Dec", "Nov", "Mar",
  "Apr", "Jan", "Jan", "Aug", "Apr", "Sep", "Jul", "Oct",
  "Dec", "Oct", "Jun", "Jun", "Dec", "Apr", "Oct", "Mar",
  "Jan", "Dec", "Jul", "Jun", "May", "Mar", "Sep", "Mar",
  "Jan", "Jul", "Dec", "Apr", "Sep", "Jul", "Jun", "Oct"
)
  1. [5pts] Step 1: Create a vector month_levels with the values from month.abb.

  2. [5pts] Step 2: Encode the vector y into a factor y_f, by using month_levels as the levels for the factor.

  3. [5pts] Step 3: Find how many NAs the factor y_f contains and store the count in N_NAs. (Hint: You can use the is.na() and sum() functions to determine how many elements in y are not part of month.abb.)

  4. [5pts] Step 4: Identify the typo(s) in y and store them in a vector named typos.

  5. [5pts] Step 5: Correct the typos in both y and y_f.

  6. [5pts] Step 6: Sort the vectors y and y_f separately using the sort() function. Discuss the differences in the sorted results.
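
The steps above rely on the fact that factor() turns any value not listed in levels into NA. A minimal sketch of the same workflow on made-up weekday data is shown below; days, day_levels, days_f, n_bad and bad are hypothetical names chosen for illustration, and the data are not taken from y.

```r
# Hypothetical toy data: four weekday labels, one of them misspelled.
days <- c("Thu", "Mon", "Wedd", "Tue")
day_levels <- c("Mon", "Tue", "Wed", "Thu")

# Values that are not among the levels become NA in the factor.
days_f <- factor(days, levels = day_levels)
n_bad <- sum(is.na(days_f))   # number of unmatched entries
bad <- days[is.na(days_f)]    # the unmatched entries themselves
print(n_bad)  # 1
print(bad)    # "Wedd"

# Correct the character vector, then rebuild the factor from it.
days[days == "Wedd"] <- "Wed"
days_f <- factor(days, levels = day_levels)

# A character vector sorts alphabetically; a factor sorts by level order.
print(sort(days))                  # "Mon" "Thu" "Tue" "Wed"
print(as.character(sort(days_f)))  # "Mon" "Tue" "Wed" "Thu"
```

Rebuilding the factor from the corrected character vector is only one option; it is used here because it avoids editing factor levels directly.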