Introduction to Data Analysis for Business Fall 2020 Lecture 3 by Professor Michael Hamilton

BUSQOM 1080:
Data Analysis for Business
Fall 2020
Lecture 3 (8/27)
Professor: Michael Hamilton
Course Specifics
Lecture 3 - Introduction to R Programming
Outline for today:
1. Misc. Course Updates
2. Topic: Lists and Matrices [5 Mins Each]
3. Topic: Dataframes [15 Mins]
4. Topic: Functions in R [15 Mins]
Topic: Lists
Lists are more general than vectors and can contain multiple data types. We won't see them too often, but they are important to know.
Lists are made using the function list(), and a list's elements can be accessed using brackets: single brackets [ ] return a sub-list, while double brackets [[ ]] return the element itself.
Example:
> test_list = list(5, "5")
> test_vec = c(5, "5")
> c(typeof(test_list[1]), typeof(test_vec[1])) # returns "list" "character"
We can flexibly add to and remove from lists as well.
Example Cont.:
> test_list[3] = 6 # adds six in the previously undefined third position
> test_list[3] = NULL # removes the newly added third element
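The list example above can be run end to end as the short sketch below. It reuses the slide's test_list and test_vec; the double-bracket line and the length() checks are additions for illustration, not part of the original slide.

```r
# A list keeps each element's own type; c() coerces everything to one type.
test_list <- list(5, "5")
test_vec <- c(5, "5")

typeof(test_list[1])    # "list": single brackets return a sub-list
typeof(test_list[[1]])  # "double": double brackets return the element itself
typeof(test_vec[1])     # "character": c() coerced the number 5 to "5"

test_list[3] <- 6       # adds a third element
length(test_list)       # 3
test_list[3] <- NULL    # removes the third element again
length(test_list)       # 2
```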
Topic: Matrices
In this class we'll be interested in data that is primarily multidimensional, i.e. each observation corresponds to a row of features. The most natural representation of this is a matrix.
In R we can build matrices (column by column) out of long vectors using the function matrix(vector, num_rows, num_cols).
Example:
We index matrices like vectors, with a comma separating rows and columns, e.g. mat_2x2[1,1] returns 11 and mat_2x2[1,2] returns 21.
[Figure: the example 2x2 matrix, with rows labeled Row 1 and Row 2 and columns labeled Col 1 and Col 2]
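A minimal sketch of building and indexing such a matrix follows. The slide's original example did not survive extraction, so the input vector c(11, 12, 21, 22) is an assumption, chosen so that mat_2x2[1,1] is 11 and mat_2x2[1,2] is 21 as stated above.

```r
# Assumed input vector; matrix() fills column by column by default.
mat_2x2 <- matrix(c(11, 12, 21, 22), nrow = 2, ncol = 2)

mat_2x2[1, 1]  # 11: row 1, column 1
mat_2x2[2, 1]  # 12: row 2, column 1
mat_2x2[1, 2]  # 21: row 1, column 2
mat_2x2[2, 2]  # 22: row 2, column 2
```

Because the fill is column by column, the first two values of the vector become column 1 and the last two become column 2.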
Topic: Matrices
Indexing matrices is slightly more complicated than indexing vectors. In R we can pass a vector of row indices in the first component of the brackets and a vector of column indices in the second component. R will return the intersection of the specified rows and columns!
We can also specify all rows or all columns by leaving that component blank!
Examples:
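The slide's own examples were images and are not recoverable, so the sketch below uses a hypothetical 3x4 matrix m to illustrate the same indexing rules.

```r
# Hypothetical matrix: columns are 1:3, 4:6, 7:9, 10:12.
m <- matrix(1:12, nrow = 3, ncol = 4)

# Rows 1 and 3 intersected with columns 2 and 4:
# a 2x2 matrix with 4, 10 in its first row and 6, 12 in its second.
m[c(1, 3), c(2, 4)]

m[2, ]  # all columns of row 2: 2 5 8 11
m[, 1]  # all rows of column 1: 1 2 3
```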
Topic: Dataframes
While matrices are good, they lack several features that are helpful for representing data. The major data structure we'll use in this class is what's called a DataFrame (df). DataFrames can be accessed similarly to matrices and, unlike matrices, can contain multiple data types (one per column).
We can make a DataFrame (df) out of (equal length!) vectors using the command data.frame().
Example: Gradebook df
We make a df out of three length-three vectors containing different information about students.
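The gradebook code itself was shown as an image, so the following is only a sketch of what such a df might look like. The column name grades appears elsewhere in the lecture; the other vector names and all of the values are hypothetical.

```r
# Three equal-length vectors, one per piece of student information.
# All values below are made up for illustration.
students <- c("Ana", "Ben", "Cy")
grades <- c(99, 85, 100)
passed <- c(TRUE, TRUE, TRUE)

# data.frame() turns each vector into a column named after the vector.
gradebook <- data.frame(students, grades, passed)
str(gradebook)  # shows the type of each column
```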
Topic: Dataframes
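Accessing Dataframes:
1. Dataframe columns inherit the names of the vectors they were built from and can be accessed by name with $, e.g.
> gradebook$grades # returns vector of grades
> names(gradebook) # returns the names of all the columns
2. Dataframes can be subsetted like matrices
> gradebook[gradebook$grades > 98, ] # rows where the grade is above 98, all columns
> gradebook[1:2, 2:3] # rows 1 to 2, columns 2 to 3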
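The access patterns above can be tried on a small stand-in gradebook. The df below is hypothetical (made-up names and values); only the column name grades comes from the lecture.

```r
# Hypothetical gradebook used only to demonstrate access and subsetting.
gradebook <- data.frame(
  students = c("Ana", "Ben", "Cy"),
  grades = c(99, 85, 100),
  passed = c(TRUE, TRUE, TRUE)
)

gradebook$grades                                  # 99 85 100
names(gradebook)                                  # "students" "grades" "passed"

top <- gradebook[gradebook$grades > 98, ]         # rows for Ana and Cy
top

first_two <- gradebook[1:2, 2:3]                  # rows 1-2, columns grades and passed
first_two
```

Writing gradebook$grades inside the brackets makes the row filter refer to the df's own column rather than to a separate grades vector that may or may not exist in the environment.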
Topic: Functions
Finally, in R you can write your own functions! This can be immensely helpful but is more involved than the code we've looked at so far. As a beginner, I highly recommend that each time you write a function, you do it in the R Notebook as its own separate chunk.
The syntax for a function is:
function_name <- function(arg_1, arg_2, ...) {
  # function body
}
Note that function is a special keyword. This syntax will let you build functions exactly like the ones we've been using in R, e.g. mean() or sd().
In general, any time you're repeating a piece of computation many times, a function may be warranted!
Topic: Functions
Let’s demonstrate by example.
This is our function; it takes in no arguments.
When we run the block from the notebook, it is sent down to the console.
Here we use our newly defined function. It simply runs the code in the function body!
Note that once we run (“compile”) our function, it is stored in the Env tab just like variables!
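The function on this slide was shown as a screenshot, so its name and body are unknown. As a stand-in, here is a hypothetical no-argument function, say_hello, that behaves the way the notes describe.

```r
# Hypothetical example: a function that takes in no arguments.
say_hello <- function() {
  print("Hello from my first function!")
}

# Calling the function simply runs the code in its body.
say_hello()
```

After the definition is run, say_hello appears in the Environment tab alongside any variables.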
Topic: Functions
Let’s write a function that takes in a temperature (numeric) in F
and converts the temperature to C.
This function takes an input (of any type!) called temp. temp is the only defined variable for the code inside the function!
This line performs a computation using the input.
return() is a special function; it sends its argument back to where the function was called.
Order of execution goes like:
 > F_to_C(32)
1. 32 gets sent to the function F_to_C
2. temp = 32
3. new_temp = (temp - 32)*5/9 # i.e. 0
4. return(new_temp) sends new_temp back to the console
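The function's code was shown as an image; the sketch below reconstructs it from the execution steps above, using the names F_to_C, temp, and new_temp given there. The extra calls after F_to_C(32) are additions for illustration.

```r
# Convert a temperature from Fahrenheit to Celsius.
F_to_C <- function(temp) {
  new_temp <- (temp - 32) * 5 / 9  # computation using the input
  return(new_temp)                 # send the result back to the caller
}

F_to_C(32)           # 0
F_to_C(212)          # 100
F_to_C(c(32, 212))   # arithmetic is vectorized, so this returns 0 100
```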
For Next Time
In the Zoom session we will do many more examples of writing and using simple functions.
For additional practice with today's topics, complete these swirl lessons (~15 Mins Each):
7: Matrices and Dataframes,
8: Logic,
9: Functions
All in the swirl course “R Programming”.
 

In Lecture 3 of BUSQOM 1080, Professor Michael Hamilton covers Lists, Matrices, Functions in R, and Dataframes. Lists in R are versatile and can contain multiple data types, while Matrices are essential for handling multidimensional data. Functions in R allow for efficient programming, and Dataframes provide a structured way to represent data. Gain insights into R programming fundamentals and essential techniques for data analysis in a business context.


