Introduction to R Workshop - STAT CLUB November 19, 2014

Introduction to R
A workshop hosted by STAT CLUB
November 19, 2014
Outline
I. About R
II. Getting started
III. The very basics
IV. Importing data
V. Common commands
9/30/2024
An Introduction to R
2
I. About R

What is R?
Free, open-source software with its own language
Works on all major operating systems (Windows, macOS, Linux)
Extensive: LOTS of built-in functions and downloadable packages available
Flexible: you can define your own functions, modify existing commands, customize graphics, etc.
Powerful: can do all sorts of analyses and can handle large data sets
Integrates with other environments such as Excel, LaTeX, Hadoop, etc.

II. Getting started

Installing R
Download from http://lib.stat.cmu.edu/R/CRAN/
Select the correct platform
Download the “base” package
Run the set-up
It’s quick, easy, and free!

Interacting with R
The command prompt ‘>’ indicates that R is ready for you to type a command
Press the Esc key to cancel a partially typed line of code
Basic rule: type a command and press Enter to execute it
Click the “Stop” button at the top to interrupt a command that is running
For example: x = 1:100 creates a vector of the values 1, 2, …, 100

R Script/Editor
A file where you can write, edit, and save code
Go to File > New script
When you have typed the code you want to run, highlight the chunk you want to run and either press ‘Ctrl+R’ or right-click and select “Run line or selection”
You can save the script for later use by pressing ‘Ctrl+S’ or going to File > Save while the script window is active
This saves the text of the script only; it will NOT save any results of running the commands

Workspaces
An R workspace includes all the functions and variables (called “objects”) defined in a session
Any object you create by assigning it a name is stored in the workspace; output that is only printed to the console is not
Save the workspace by going to File > Save Workspace
Load a workspace by going to File > Load Workspace
To remove objects from the workspace, use rm()
Use ls() to list the objects currently in the workspace

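A minimal sketch of inspecting and clearing the workspace with ls() and rm(). The object names mileage and label are made up for illustration.

```r
# Create two objects; assignment is what puts them in the workspace
mileage <- c(1200, 200, 6700)
label <- "age"

ls()                   # lists the objects currently defined

rm(mileage)            # remove one object by name
"mileage" %in% ls()    # FALSE: it is gone
"label" %in% ls()      # TRUE: the other object is still there
```
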
Working directory
The folder that files are saved to and loaded from by default
By default, it is usually My Documents
Can change this under File > Change dir
To save to or load from a different place, you can usually give the full file path in place of the file name and R will find it
It is usually easier to always work from your working directory

R packages
Collections of R functions and datasets created by others
Many standard packages are included; others have to be downloaded
If you know the name of the package, you can install it by going to Packages > Install package(s) or by using install.packages("packagename") in the command line
Even if a package is installed on your computer, R will not automatically use it, so if you need to use a function from a package, first run library("packagename") in the command line

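The install-then-load pattern can be sketched as follows. This is only an illustration: it uses the tools package as the example because it ships with R, so the install step is skipped and nothing is downloaded. The variable name pkg is an assumption.

```r
pkg <- "tools"   # example package; it is part of a standard R installation

# Install only if the package is not already available on this computer
if (!requireNamespace(pkg, quietly = TRUE)) {
  install.packages(pkg)
}

# Load the package so its functions can be called by name
library(pkg, character.only = TRUE)

file_ext("mydata.txt")   # a function from the tools package; returns "txt"
```
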
Use as a calculator
Basic arithmetic can be done intuitively: 12+5, 3/8, 17+(6*5), 5^2, etc.
Don’t use square brackets to group calculations! They mean something else! Use parentheses

R commands and language
Mostly in the form of functions: mean(x), plot(x,y), etc.
CaSe SeNsItIvE!
Spaces don’t usually mean anything
Can use periods ‘.’ and underscores ‘_’ in object names

Getting help in R
Go to the Help menu
If you know the exact command that you need help with, type ‘?’ before the command name in the console (for example, ?mean) to bring up the documentation for that command
If you do not know the exact command but have an idea of what it might look like or what words may be used in its description, type ‘??’ before the search term
Google!

III. The very basics

Basics to know first
Creating your own objects (variables, vectors, matrices, lists, functions, etc.)
Assigning names to these objects
Learning to access objects
Performing simple calculations and transformations on these objects

Types of objects
The functions that you can apply to an object depend on its “class” or type:
Numeric (double-precision numbers)
Double (same as numeric)
Integer (integer-valued; rarely used)
Character (strings, non-numerical values)
Matrix (matrix of numerical values)
Logical (Boolean: true/false)
Factor (“groups” or levels)
List (list of other types of objects)
Data frame (table or other collection of data that is numerical or non-numerical)
Function (takes inputs and returns a result)
To find out which class a variable belongs to, use class()
To determine the dimensions of an object, use dim()
Verify a class by using is.numeric(), is.character(), is.logical(), is.data.frame(), etc.
Change a class by using as.numeric(), as.character(), as.logical(), as.data.frame(), etc.

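A short illustration of checking and changing classes with the functions listed above. The values are arbitrary examples.

```r
x <- 36
class(x)                 # "numeric"
is.numeric(x)            # TRUE
is.character(x)          # FALSE

s <- as.character(x)     # convert the number to a string
class(s)                 # "character"

as.numeric("3.5") + 1    # 4.5: the string is converted back to a number

m <- matrix(1:6, nrow = 2)
dim(m)                   # 2 3 (rows, columns)
dim(x)                   # NULL: a plain vector has no dimensions
```
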
Single value
Use ‘=’ or ‘<-’ to assign a name to a value
Use quotation marks if the value is not numerical
Example: x = 36
Example: y <- "age"

Vectors
Vector: c()
Use ‘=’ or ‘<-’ to assign a name to a vector
If the vector contains non-numerical values, use quotation marks
Example: mileage = c(1200,200,6700,1000,1200)
Example: type <- c("Compact", "Minivan", "SUV", "Roadster", "Truck")

Matrix
Matrix: matrix(data=c(2,3,4,5), nrow=2, ncol=2)
data = vector of values you want entered (filled in by COLUMN!)
nrow = number of rows
ncol = number of columns

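To make the fill-by-column behaviour concrete, here is the matrix from the slide printed out, next to the same values filled by row. The byrow argument is not mentioned on the slide; it is a standard argument of matrix().

```r
# Values are entered down the first column, then down the second
M <- matrix(data = c(2, 3, 4, 5), nrow = 2, ncol = 2)
M
#      [,1] [,2]
# [1,]    2    4
# [2,]    3    5

# Same values, filled across rows instead
M_by_row <- matrix(data = c(2, 3, 4, 5), nrow = 2, ncol = 2, byrow = TRUE)
M_by_row
#      [,1] [,2]
# [1,]    2    3
# [2,]    4    5
```
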
Dataframe
Like a table
Can contain both numerical and string variables
Use data.frame(vars)

Lists
Each element in a list can be ANY object: vector, matrix, data frame, even another list!
Use list(vars)

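As a rough sketch, the mileage and type vectors from the Vectors slide can be combined into a data frame, which can then be stored in a list alongside other objects. The names vehicles and stuff are made up for this example.

```r
mileage <- c(1200, 200, 6700, 1000, 1200)
type <- c("Compact", "Minivan", "SUV", "Roadster", "Truck")

# A data frame: one column per variable, mixing strings and numbers
vehicles <- data.frame(type = type, mileage = mileage)
dim(vehicles)      # 5 2

# A list: elements can be of different kinds and lengths
stuff <- list(table = vehicles, values = mileage, label = "age")
length(stuff)      # 3
class(stuff$table) # "data.frame"
```
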
Functions
Creating functions is more complex
Of the form: g <- function(var1, var2) {var1 + var2}
g is the function name
var1, var2 are the input variables
The body of the function goes in the curly brackets
To use the function: g(input1, input2)

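The function from the slide, defined and then called with a pair of single values and with a pair of vectors:

```r
# Define g: takes two inputs and returns their sum
g <- function(var1, var2) {
  var1 + var2
}

g(2, 3)                   # 5
g(c(1, 2), c(10, 20))     # 11 22 (addition is element-wise)
```
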
IV. Importing data

Importing data
Can import from many formats (.txt, .csv, .xls, .xlsx, .sav, .dta, .ssd, …)
Recommend .txt or .csv; others need packages
If the file is in the working directory:
data1 = read.table("mydata.txt", header=TRUE, sep=",")
header=TRUE indicates that the file includes a row of column headings/titles; set to FALSE if not
sep="," indicates that a comma separates the values in each row, as in a .csv; omit this argument if the values are separated by spaces or tabs (the default), or change it if they are separated by something else
If the file is not in the working directory, use the file path:
data1 = read.table("C:/Users/xyz/Desktop/folder/mydata.txt", header=TRUE, sep=",")

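A self-contained sketch of the import step. Because no data file accompanies these slides, the example first writes a small comma-separated file to a temporary location; the file contents and the name csv_path are invented for illustration.

```r
# Write a tiny comma-separated file with a header row
csv_path <- tempfile(fileext = ".csv")
writeLines(c("type,mileage",
             "Compact,1200",
             "Minivan,200",
             "SUV,6700"), csv_path)

# Read it back: the first row supplies column names, commas separate values
data1 <- read.table(csv_path, header = TRUE, sep = ",")

names(data1)   # "type" "mileage"
nrow(data1)    # 3
```
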
Working with data sets
Attach a dataset so that its variables can be used directly by name: attach(dataset)
Use a variable from a dataset: dataset$varname
Retrieve the names of the variables: names(dataset)
Take a subset of your data according to some criterion: subset(dataset,criterion)

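These commands might be used as follows on a small made-up data set (the values reuse the mileage and type examples from earlier):

```r
dataset <- data.frame(
  type = c("Compact", "Minivan", "SUV", "Roadster", "Truck"),
  mileage = c(1200, 200, 6700, 1000, 1200)
)

names(dataset)       # "type" "mileage"
dataset$mileage      # the mileage column as a vector

# Keep only the rows where mileage is above 1000
high <- subset(dataset, mileage > 1000)
nrow(high)           # 3
```
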
V. Common commands

Arithmetic/calculator
 
Add: +
Subtract: -
Multiply: *
Divide: /
Raise to a power: ^
Natural logarithm: log()
Exponential function (e raised to a power): exp()
Square root: sqrt()

Vector commands
Create a vector of numbers: c(num1,num2)
Combine vectors together to create one: c(vec1,vec2)
Create a vector of numbers from a to b in increments of 1: a:b
Create a vector of numbers from a to b in increments of d: seq(a,b,d)
Create a vector of numbers from a to b in equal increments such that there are k total numbers: seq(a,b,length.out=k)
Return the number of elements in a vector x: length(x)
Sort entries in a vector x: sort(x, decreasing=FALSE)
Element-wise arithmetic: 3*x, 4+x, log(x), sqrt(x), etc.
Arithmetic of two vectors x and y will be element-wise: x*y, x+y, etc.

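A few of these vector commands applied to small example vectors (the values are arbitrary):

```r
x <- c(3, 1, 2)
y <- 1:3

c(x, y)                      # 3 1 2 1 2 3
seq(0, 10, 5)                # 0 5 10
seq(0, 1, length.out = 5)    # 0.00 0.25 0.50 0.75 1.00
length(x)                    # 3
sort(x)                      # 1 2 3
sort(x, decreasing = TRUE)   # 3 2 1
3 * x                        # 9 3 6
x * y                        # 3 2 6 (element by element)
```
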
Matrix commands
Create a matrix: matrix(vals,nrow,ncol)
Create a diagonal matrix: diag(vals)
Multiply matrices M1 and M2: M1 %*% M2
Note that M1*M2 will be element-wise
Find the determinant of matrix M: det(M)
Find inverse of matrix M: solve(M)
Find transpose of matrix M: t(M)
Combine matrices by column: cbind(M1,M2)
Combine matrices by row: rbind(M1,M2)
Find dimensions of a matrix M: dim(M)

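The difference between matrix multiplication and element-wise multiplication is easiest to see on a small case. The matrices below are arbitrary examples.

```r
M1 <- matrix(c(2, 3, 4, 5), nrow = 2)   # rows are (2, 4) and (3, 5)
M2 <- diag(c(1, 2))                     # diagonal matrix with 1 and 2

M1 %*% M2     # matrix product: rows are (2, 8) and (3, 10)
M1 * M2       # element-wise:   rows are (2, 0) and (0, 10)

det(M1)       # 2*5 - 4*3 = -2
solve(M1) %*% M1    # identity matrix, up to rounding error
t(M1)               # rows and columns swapped

dim(cbind(M1, M2))  # 2 4
dim(rbind(M1, M2))  # 4 2
```
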
Retrieving parts of objects
Return the kth element of a vector x: x[k]
Return the i,j th element of a matrix x: x[i,j]
Return the kth object of a list x: x[[k]]
Return the ith element of the kth object of a list x: x[[k]][i]
Return the element or object called “name”: x$name
Can retrieve more than one element at a time, e.g. x[c(1,3)] or x[2:4]

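The indexing forms above, tried on small example objects (names and values are invented for illustration):

```r
x <- c(10, 20, 30, 40)
x[2]           # 20
x[c(1, 3)]     # 10 30
x[2:4]         # 20 30 40

M <- matrix(1:6, nrow = 2)   # columns are (1,2), (3,4), (5,6)
M[2, 3]        # 6

L <- list(name = "age", values = c(5, 6, 7))
L[[2]]         # 5 6 7
L[[2]][3]      # 7
L$name         # "age"
```
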
Summaries and statistics
Mean: mean(x)
Standard deviation: sd(x)
Median: median(x)
Minimum: min(x)
Maximum: max(x)
Range (min and max): range(x)
Sum: sum(x)
Which index contains the minimum value: which.min(x)
Which index contains the maximum value: which.max(x)

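For example, on a small vector of arbitrary values:

```r
x <- c(4, 8, 15, 16, 23, 42)

mean(x)        # 18
median(x)      # 15.5
sum(x)         # 108
min(x)         # 4
max(x)         # 42
range(x)       # 4 42
which.min(x)   # 1 (the first entry is the smallest)
which.max(x)   # 6 (the last entry is the largest)
sd(x)          # sample standard deviation
```
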
Logical
Operators: > greater than, >= greater than or equal to, < less than, <= less than or equal to, == equal to, != not equal to, & and, | or
Entering an expression that uses these operators returns a Boolean (‘TRUE’ or ‘FALSE’) vector
R often treats TRUE as 1 and FALSE as 0, so you can carry out mathematical operations on them
Return the indices of the elements of a vector that satisfy a criterion: which(x > 45)
To get the actual values: x[x>45]

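A brief example of logical vectors in use, with arbitrary values for x:

```r
x <- c(12, 50, 47, 3, 99)

x > 45                 # FALSE TRUE TRUE FALSE TRUE
sum(x > 45)            # 3, since each TRUE counts as 1
which(x > 45)          # 2 3 5 (positions)
x[x > 45]              # 50 47 99 (values)
x[x > 45 & x < 60]     # 50 47 (both conditions must hold)
```
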
Logical
If-then statements: if (criterion) {command} else {command}
For-loops: for (i in x) {commands}

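A small sketch combining both constructs: loop over the values of a vector and treat each one differently depending on a criterion. The variable names and the rule itself are made up for illustration.

```r
x <- c(12, 50, 47)
total <- 0

for (i in x) {
  if (i > 45) {
    total <- total + i     # add large values
  } else {
    total <- total - 1     # subtract 1 for each small value
  }
}

total   # 96: -1 for 12, then +50, then +47
```
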
Apply functions
Apply a function to rows or columns of a matrix:
apply(M, 1, mean) will take the average of each row
apply(M, 2, sum) will sum each column
Apply a function to each element of a vector, list or data frame: sapply(L, length)

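The same calls on a small example matrix and list (contents are arbitrary):

```r
M <- matrix(1:6, nrow = 2)   # rows are (1, 3, 5) and (2, 4, 6)

apply(M, 1, mean)   # 3 4      one mean per row
apply(M, 2, sum)    # 3 7 11   one sum per column

L <- list(a = 1:3, b = c("x", "y"), c = 10)
sapply(L, length)   # a = 3, b = 2, c = 1
```
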
Plots
Scatterplot of a vector x and vector y: plot(x,y)
Add points to an already-existing scatterplot: points(xvals,yvals)
Add a line to an already-existing scatterplot: lines(xvals,yvals)
Histogram of a vector of values x: hist(x)

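A sketch of the plotting commands on simulated data. The data, the seed, and the choice to draw into a temporary PDF file (so the example also runs without a graphics window) are assumptions for illustration; in an interactive session the pdf() and dev.off() lines can be dropped.

```r
set.seed(1)
x <- 1:20
y <- 2 * x + rnorm(20)      # a noisy straight line

plot_file <- tempfile(fileext = ".pdf")
pdf(plot_file)              # send the plots to a file

plot(x, y)                          # scatterplot
points(x[1:5], y[1:5], pch = 19)    # redraw the first five points as filled dots
lines(x, 2 * x)                     # add the line y = 2x
hist(y)                             # histogram on a new page

dev.off()                   # close the file
```
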
Tables
Create a table of frequencies from a vector of values x: table(x)
Create a two-way table between vectors x and y of the same length: table(x,y)

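For instance, with two made-up categorical vectors of the same length:

```r
type  <- c("SUV", "Truck", "SUV", "Compact", "SUV")
color <- c("red", "red", "blue", "blue", "red")

table(type)                # counts per type: Compact 1, SUV 3, Truck 1

tab <- table(type, color)  # two-way table: types in rows, colors in columns
tab["SUV", "red"]          # 2 red SUVs
```
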
Linear regression
Linear regression of y on x: lm(y~x)
Store the fit, e.g. model = lm(y~x), then get more information using summary(model)

Working with datasets: linear regression
Can find all objects in the model: names(model)

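A minimal regression sketch on simulated data. The true intercept (3), slope (2), and the seed are assumptions chosen only so the example has something to estimate.

```r
set.seed(42)
x <- 1:30
y <- 3 + 2 * x + rnorm(30)   # straight line plus random noise

model <- lm(y ~ x)           # fit and store the regression

summary(model)               # coefficients, standard errors, R-squared, etc.
names(model)                 # components stored in the fitted object
model$coefficients           # estimated intercept and slope
```
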
Hypothesis tests
One-sample t-test for vector of values x: t.test(x, alternative="two.sided", mu=0)
Two-sample t-test between vectors x and y: t.test(x,y)
Chi-square test of independence in two-way table “tab”: chisq.test(tab)

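The three tests can be tried on simulated data as follows. The sample sizes, means, seed, and table counts are all invented for illustration.

```r
set.seed(7)
x <- rnorm(25, mean = 0.5)
y <- rnorm(25)

one <- t.test(x, alternative = "two.sided", mu = 0)   # is the mean of x zero?
two <- t.test(x, y)                                   # do x and y share a mean?

# A 2-by-2 table of counts
tab <- as.table(matrix(c(30, 20, 15, 35), nrow = 2))
chi <- chisq.test(tab)                                # are rows and columns independent?

one$p.value
two$p.value
chi$p.value
```
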
Final warnings!
Floating point arithmetic is not exact!
Missing values are not excluded by default; you must use the na.rm = TRUE option
Combining values of different classes will coerce all entries to the same class
Some things, such as quotation marks, cannot be easily copied and pasted into R from other applications such as Word

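The first three warnings can each be seen in a line or two. The use of all.equal() for approximate comparison is a suggestion added here, not something from the slides.

```r
# Floating point: 0.1 + 0.2 is not stored as exactly 0.3
0.1 + 0.2 == 0.3                     # FALSE
isTRUE(all.equal(0.1 + 0.2, 0.3))    # TRUE: equal up to a small tolerance

# Missing values propagate unless removed
x <- c(1, NA, 3)
mean(x)                  # NA
mean(x, na.rm = TRUE)    # 2

# Mixing classes in one vector converts everything to a single class
class(c(1, "a", TRUE))   # "character"
```
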

Hosted by STAT CLUB on November 19, 2014, the Introduction to R Workshop covered the basics of R, including getting started, importing data, common commands, and the features of R as a free, open-source software. It outlined the installation process and interactions with R, emphasizing the script editor for writing, editing, and saving code.


Uploaded on Sep 30, 2024 | 0 Views

