Week 2: Working with R's Basic Data

Learn about variables, data types, vectors, and operators in R.

Chapter 2: Fundamental Data Types and Operations

Variables and Assignment.

In R, variables are used to store data values. You can think of them as named containers for information in your R environment.

Naming Variables

As discussed in Week 1, variable names in R should be descriptive. They typically consist of letters, numbers, dots (`.`), and underscores (`_`). Remember:

  • They must start with a letter or a dot (if starting with a dot, the second character cannot be a number).
  • R is case-sensitive (`myVar` is different from `myvar`).
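  • Avoid names of built-in functions (e.g., `c`, `mean`), and note that reserved words such as `if`, `TRUE`, or `function` cannot be used as ordinary variable names at all.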
  • Common styles include `snake_case` (e.g., `user_age`) or using dots (e.g., `user.age`).
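
The naming rules above can be checked with base R's `make.names()`, which rewrites a string into a syntactically valid name; a candidate that comes back unchanged already follows the rules. This is only a small sketch, and the candidate names are made up for illustration.

```r
# Hypothetical candidate names, chosen to exercise each rule
candidates <- c("user_age", "user.age", ".hidden", "2nd_try", ".5x", "if")

# A name is syntactically valid if make.names() leaves it unchanged
is_valid <- candidates == make.names(candidates)
is_valid   # [1]  TRUE  TRUE  TRUE FALSE FALSE FALSE

# Case sensitivity: these are two separate variables
myVar <- 1
myvar <- 2
identical(myVar, myvar)   # [1] FALSE
```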

Assigning Values

The preferred way to assign a value to a variable in R is using the assignment operator `<-`. You can also use `=`, but `<-` is the standard convention and avoids potential confusion with argument passing in functions.

# Assigning values to variables
student_count <- 30         # Using preferred '<-'
course.name <- "Introduction to R"
pi_approx = 3.14159     # '=' also works but '<-' is conventional

# Printing the values
print(student_count)
course.name # Typing the name also prints in the console
pi_approx
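
The "confusion with argument passing" is easiest to see inside a function call. In the sketch below, `=` only names the argument of `mean()`, while `<-` creates a variable and then passes it on. The wrapper `local()` and the name `demo` are assumptions added here so the example does not touch variables in your workspace.

```r
demo <- local({
  # '=' inside a call names the argument x; no variable x is created
  avg_named <- mean(x = c(2, 4, 6))
  x_exists_before <- exists("x", inherits = FALSE)

  # '<-' inside a call assigns x in this environment, then passes the value
  avg_assigned <- mean(x <- c(2, 4, 6))
  x_exists_after <- exists("x", inherits = FALSE)

  c(before = x_exists_before, after = x_exists_after)
})

demo
# before  after
#  FALSE   TRUE
```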

R is dynamically typed, meaning you don't need to declare the type of a variable before assigning a value to it. R infers the data type from the value assigned.
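
As a brief illustration of dynamic typing, the same name can hold values of different types over time. The variable names here are illustrative only.

```r
value <- 42L                  # value currently holds an integer
first_class <- class(value)

value <- "forty-two"          # the same name now holds a character string
second_class <- class(value)

c(first_class, second_class)  # [1] "integer"   "character"
```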

Fundamental Data Types (Atomic Vectors).

In R, the most basic objects hold data of a single type. These are often referred to as atomic vectors. The main fundamental data types are:

  • Numeric (`numeric` or `double`): Represents real numbers, including decimals (floating-point numbers). This is the default type for numbers in R.
    height <- 175.5
    temperature <- -5.2
  • Integer (`integer`): Represents whole numbers. To explicitly create an integer, append `L` to the number.
    count <- 100L
    year <- 2025L
  • Logical (`logical`): Represents boolean values, which can only be `TRUE` or `FALSE` (note the uppercase). You can also use `T` for `TRUE` and `F` for `FALSE`, but the full words are recommended: `T` and `F` are ordinary variables that can be reassigned, whereas `TRUE` and `FALSE` are reserved words.
    is_valid <- TRUE
    has_data <- F
  • Character (`character`): Represents text strings. Enclose strings in double quotes (`"..."`) or single quotes (`'...'`).
    message <- "Welcome to R!"
    city <- 'Delhi'
  • Complex (`complex`): Represents complex numbers with real and imaginary parts (e.g., `3 + 2i`).
  • Raw (`raw`): Holds raw bytes.
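
The last two types in the list have no example above, so here is a minimal sketch of each. The variable names `z_num` and `bytes` are made up for illustration.

```r
# Complex: real and imaginary parts can be extracted with Re() and Im()
z_num <- 3 + 2i
Re(z_num)       # [1] 3
Im(z_num)       # [1] 2
typeof(z_num)   # [1] "complex"

# Raw: charToRaw() returns the bytes of a string, printed in hexadecimal
bytes <- charToRaw("R")
bytes           # [1] 52
typeof(bytes)   # [1] "raw"
```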

You can check the data type of an object using the `class()` or `typeof()` function.

x <- 10.5
y <- 5L
z <- "hello"
a <- TRUE

class(x)  # Output: [1] "numeric"
typeof(x) # Output: [1] "double"
class(y)  # Output: [1] "integer"
class(z)  # Output: [1] "character"
class(a)  # Output: [1] "logical"
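
One practical consequence is that a number typed without the `L` suffix is stored as a double even when it looks like a whole number. The sketch below uses the base `is.*()` helpers to test for a type directly; the variable names are illustrative.

```r
n <- 5                       # no L suffix, so this is a double
n_type <- typeof(n)          # "double"
n_class <- class(n)          # "numeric"

is.integer(n)                # [1] FALSE
is.integer(5L)               # [1] TRUE
is.numeric(5L)               # [1] TRUE  (integers also count as numeric)
is.character("5")            # [1] TRUE
```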

Introduction to Vectors.

Vectors are arguably the most fundamental data structure in R. They are ordered collections of elements of the same basic data type. Even single values like `x <- 5` are actually vectors of length one.
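
A quick way to convince yourself that a single value is a length-one vector (the name `single` is illustrative):

```r
single <- 5
length(single)            # [1] 1
is.vector(single)         # [1] TRUE
identical(single, c(5))   # [1] TRUE
```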

Creating Vectors

The most common way to create a vector is using the combine function `c()`.

# Numeric vector
numeric_vec <- c(1.5, 2.3, 0.7, 4.1)

# Integer vector
integer_vec <- c(1L, 5L, 10L, 15L)

# Logical vector
logical_vec <- c(TRUE, FALSE, T, F)

# Character vector
character_vec <- c("apple", "banana", "cherry")

# Printing vectors
numeric_vec
character_vec
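
Because `c()` combines its arguments, it can also join existing vectors into one flat vector. A small sketch with made-up names:

```r
first_half <- c(1.5, 2.3)
second_half <- c(0.7, 4.1)

# c() flattens its arguments into a single vector
combined <- c(first_half, second_half)
combined           # [1] 1.5 2.3 0.7 4.1
length(combined)   # [1] 4
typeof(combined)   # [1] "double"
```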

Vector Coercion

If you combine different data types in a single vector using `c()`, R coerces every element to the most flexible type present so that the vector stays homogeneous. For the common types, the coercion hierarchy goes: Logical -> Integer -> Numeric -> Complex -> Character.

mixed_vec <- c(1L, "apple", 3.5, TRUE)
mixed_vec  # Output: [1] "1" "apple" "3.5" "TRUE"
class(mixed_vec) # Output: [1] "character" (All elements coerced to character)
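
Each step of the hierarchy can be checked on its own. The sketch below is illustrative; `steps` is just a name chosen here to collect the results.

```r
steps <- c(
  typeof(c(TRUE, 2L)),    # logical + integer   -> "integer"
  typeof(c(2L, 3.5)),     # integer + numeric   -> "double"
  typeof(c(3.5, "a"))     # numeric + character -> "character"
)
steps   # [1] "integer"   "double"    "character"

# Coercion is also why logical values can be summed: TRUE becomes 1, FALSE becomes 0
sum(c(TRUE, FALSE, TRUE))   # [1] 2
```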

We will explore vector indexing, slicing, and operations in more detail next week.

Operators in R.

Operators are symbols that perform operations on variables and values (operands).

Arithmetic Operators

Used for mathematical calculations. They work element-wise on vectors; when the operands differ in length, R recycles the shorter one.

Operator      Name                 Example (`a <- 5`, `b <- 2`)   Result
`+`           Addition             `a + b`                        `7`
`-`           Subtraction          `a - b`                        `3`
`*`           Multiplication       `a * b`                        `10`
`/`           Division             `a / b`                        `2.5`
`^` or `**`   Exponentiation       `a ^ b`                        `25`
`%%`          Modulo (Remainder)   `a %% b`                       `1`
`%/%`         Integer Division     `a %/% b`                      `2`

vec1 <- c(1, 2, 3)
vec2 <- c(4, 5, 6)
vec1 + vec2 # Output: [1] 5 7 9 (Element-wise addition)
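
Integer division and modulo fit together: the quotient times the divisor plus the remainder gives back the original number. The sketch below checks this with the values used in the examples above and also shows an operator applied between a vector and a single value. The name `vec` is illustrative.

```r
a <- 5
b <- 2

quotient <- a %/% b                # 2
remainder <- a %% b                # 1
quotient * b + remainder == a      # [1] TRUE

# A single value is applied to every element of the vector
vec <- c(1, 2, 3)
vec * 2    # [1] 2 4 6
vec ^ 2    # [1] 1 4 9
```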

Comparison Operators

Used to compare values. The result is a logical vector (`TRUE` or `FALSE`) with one element per comparison.

Operator   Name                       Example (`a <- 5`, `b <- 2`)   Result
`==`       Equal to                   `a == b`                       `FALSE`
`!=`       Not equal to               `a != b`                       `TRUE`
`>`        Greater than               `a > b`                        `TRUE`
`<`        Less than                  `a < b`                        `FALSE`
`>=`       Greater than or equal to   `a >= 5`                       `TRUE`
`<=`       Less than or equal to      `a <= b`                       `FALSE`

vec1 > vec2 # Output: [1] FALSE FALSE FALSE (Element-wise comparison)
vec1 == c(1,5,3) # Output: [1] TRUE FALSE TRUE
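
Because a comparison returns a logical vector, its result can be counted with `sum()`. The sketch below uses made-up `scores` data and a hypothetical pass mark of 60. It also shows a caveat that goes slightly beyond the table: `==` on decimal numbers can be affected by floating-point rounding, and `all.equal()` is the usual tolerance-based alternative.

```r
scores <- c(55, 72, 90)      # hypothetical data
passed <- scores >= 60       # element-wise comparison
passed                       # [1] FALSE  TRUE  TRUE
sum(passed)                  # [1] 2  (each TRUE counts as 1)

# Exact equality on decimals can surprise you
0.1 + 0.2 == 0.3                     # [1] FALSE
isTRUE(all.equal(0.1 + 0.2, 0.3))    # [1] TRUE
```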

Logical Operators

Used to combine or negate logical values.

  • Element-wise: `&` (AND), `|` (OR), `!` (NOT) - Operate on each element of vectors.
  • Short-circuiting: `&&` (AND), `||` (OR) - Evaluate from left to right and stop as soon as the result is determined. They expect single logical values (since R 4.3.0, passing a vector longer than one is an error), so they are typically used in `if` statements rather than on vectors.

Operator   Name                          Example (`x <- TRUE`, `y <- FALSE`)   Result
`&`        Element-wise AND              `x & y`                               `FALSE`
`|`        Element-wise OR               `x | y`                               `TRUE`
`!`        NOT                           `!x`                                  `FALSE`
`&&`       Logical AND (Short-circuit)   `x && y`                              `FALSE`
`||`       Logical OR (Short-circuit)    `x || y`                              `TRUE`

vec_a <- c(TRUE, TRUE, FALSE)
vec_b <- c(TRUE, FALSE, FALSE)

vec_a & vec_b  # Output: [1]  TRUE FALSE FALSE
vec_a | vec_b  # Output: [1]  TRUE  TRUE FALSE
!vec_a         # Output: [1] FALSE FALSE  TRUE

# && and || typically used with single values in control flow
if ( (5 > 3) && (2 == 2) ) { print("Both true") }
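
Short-circuiting can be made visible by putting a call that would fail on the right-hand side. In this sketch the `stop()` calls are only there to prove whether the right side runs, and the variable names are illustrative.

```r
# && stops at the first FALSE, so stop() is never reached
and_result <- FALSE && stop("never evaluated")
and_result   # [1] FALSE

# || stops at the first TRUE, so stop() is never reached
or_result <- TRUE || stop("never evaluated")
or_result    # [1] TRUE

# & evaluates both sides, so here the error is raised and caught
elementwise_result <- tryCatch(
  FALSE & stop("evaluated"),
  error = function(e) "error raised"
)
elementwise_result   # [1] "error raised"
```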