Understanding Binomial Distribution in R Programming

 
Basic Probability Distributions in R Programming
 
By Dr. Mohamed Surputheen
 
Probability Distributions in R
 
Many statistical tools and techniques used in data analysis are based on probability.
Probability measures how likely it is for an event to occur on a scale from 0 (the event never
occurs) to 1 (the event always occurs).
A probability distribution describes how a random variable is distributed; it tells us which values a
random variable is most likely to take on and which values are less likely.
R comes with built-in implementations of many probability distributions.
Each probability distribution in R is associated with four functions, which follow a naming convention:
The d-prefix function calculates the probability density function (PDF) of a continuous probability distribution, or the probability mass function (PMF) of a discrete probability distribution, at a specific value of the random variable.
The p-prefix function calculates the cumulative distribution function (CDF) of a probability distribution, which gives the probability of observing a value less than or equal to a given value of the random variable.
The q-prefix function calculates the quantile function of a probability distribution, which is the inverse of the CDF.
The r-prefix function generates random numbers from a probability distribution.
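As a small illustration of the naming convention, the sketch below calls the four functions of the normal distribution (dnorm, pnorm, qnorm, rnorm) with their default mean of 0 and standard deviation of 1. The seed value is an arbitrary choice made here only so that the random draw is reproducible.

```r
# d-prefix: density of the standard normal distribution at 0
d <- dnorm(0)

# p-prefix: cumulative probability P(X <= 0)
p <- pnorm(0)

# q-prefix: quantile, the value whose cumulative probability is 0.5
q <- qnorm(0.5)

# r-prefix: three random draws (the seed is arbitrary, for reproducibility)
set.seed(1)
r <- rnorm(3)

print(c(density = d, cdf = p, quantile = q))
print(r)
```

The same four prefixes combine with other distribution names, such as binom and pois.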
 
Binomial Distribution
 
The binomial distribution is a discrete probability distribution that describes the number of successes in a fixed number of independent trials, each with two possible outcomes (success or failure) and a constant probability of success.
A binomial experiment has the following properties:
The experiment consists of n identical and independent trials.
Each trial results in one of two outcomes: success or failure.
P(success) = p and P(failure) = q = 1 - p for all trials.
The random variable of interest, X, is the number of successes in the n trials.
X has a binomial distribution with parameters n and p.
 
 
 
Binomial Distribution
 
If the probability of success in each trial is p, then the probability of getting exactly x successes in n trials is given by the binomial PMF:
$$P(X = x) = \binom{n}{x} p^{x} (1-p)^{n-x} = \frac{n!}{x!\,(n-x)!}\, p^{x} (1-p)^{n-x}, \qquad x = 0, 1, \ldots, n$$
where
- n is the total number of trials
- x is the number of successes
- p is the probability of success on each trial.
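The binomial PMF can be checked numerically. The sketch below writes the formula as choose(n, x) * p^x * (1 - p)^(n - x) and compares it with R's built-in dbinom(). The helper name binom_pmf and the values n = 10 and p = 0.3 are illustrative assumptions, not part of the slides.

```r
# Binomial PMF written out from the formula; choose(n, x) is nCx.
# binom_pmf is a hypothetical helper name used only for this sketch.
binom_pmf <- function(x, n, p) {
  choose(n, x) * p^x * (1 - p)^(n - x)
}

n <- 10       # illustrative number of trials
p <- 0.3      # illustrative probability of success
x <- 0:n      # every possible number of successes

manual <- binom_pmf(x, n, p)
builtin <- dbinom(x, size = n, prob = p)

print(round(rbind(x, manual, builtin), 5))
print(all.equal(manual, builtin))  # the two agree up to floating-point error
print(sum(manual))                 # probabilities over all x sum to 1
```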
 
Mean, variance and standard deviation of the Binomial Distribution
X is the sum of n independent trials, each with mean p and variance pq, so:
The mean, E(X) = p + p + … + p = n*p
The variance, V(X) = pq + pq + … + pq = n*p*q
The standard deviation = √(n*p*q)
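These three results can be verified directly from the PMF. The sketch below computes the mean, variance and standard deviation as probability-weighted sums and compares them with n*p, n*p*q and sqrt(n*p*q). The values n = 10 and p = 0.3 are assumptions chosen only for illustration.

```r
n <- 10        # illustrative number of trials
p <- 0.3       # illustrative probability of success
q <- 1 - p     # probability of failure

x <- 0:n
pmf <- dbinom(x, size = n, prob = p)

mean_x <- sum(x * pmf)               # E(X) as a weighted sum
var_x <- sum((x - mean_x)^2 * pmf)   # V(X) as a weighted sum
sd_x <- sqrt(var_x)

print(c(mean = mean_x, formula = n * p))
print(c(variance = var_x, formula = n * p * q))
print(c(sd = sd_x, formula = sqrt(n * p * q)))
```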
 
Probability Computations Related to Binomial Distributions
 
R has several functions related to the binomial distribution. Here are some commonly used ones:
 
1. dbinom(x, size, prob) - Probability mass function (PMF) of the binomial distribution. Calculates the probability of getting exactly x successes in size trials, given a probability prob of success on each trial.
2. pbinom(q, size, prob) - Cumulative distribution function (CDF) of the binomial distribution. Calculates the probability of getting at most q successes in size trials, given a probability prob of success on each trial.
3. qbinom(p, size, prob) - Inverse CDF (quantile function) of the binomial distribution. Calculates the smallest number q such that the CDF at q is greater than or equal to p, given size trials and a probability prob of success on each trial. That is, the function takes a probability and returns the smallest number of successes whose cumulative probability reaches it.
4. rbinom(n, size, prob) - Random number generator for the binomial distribution. Generates n random values from a binomial distribution with size trials and a probability prob of success on each trial.
 
Binomial probabilities using dbinom() function in R
 
dbinom is the function that evaluates the probability mass function of the binomial distribution, i.e. the exact probability P(X = x).
The syntax to compute the probability at x for the binomial distribution in R is
dbinom(x, size, prob)
where
    x : the value(s) of the variable (number of successes),
    size : the number of trials, and
    prob : the probability of success on each trial.
 
Example: dbinom
A fair coin is tossed 5 times. What is the probability of getting exactly one head? What is the probability of getting exactly three heads?
To solve this problem in R, we can use the dbinom() function, which calculates the binomial probability mass function.
For getting one head:
> dbinom(1, size=5, prob=0.5)
[1] 0.15625
For getting three heads:
> dbinom(3, size=5, prob=0.5)
[1] 0.3125
 
Manual verification
The probability of getting a head on a single toss of a fair coin is 1/2, and the probability of getting a tail is also 1/2.
To find the probability of getting a certain number of heads in 5 coin tosses, we can use the binomial probability formula:
P(x) = nCx * p^x * q^(n-x)
Given n=5, p=0.5, q=0.5
 
So for getting one head, we have:
P(X=1) = (5C1) * (1/2)^1 * (1/2)^4 = 5/32 = 0.15625
For getting three heads, we have:
P(X=3) = (5C3) * (1/2)^3 * (1/2)^2 = 10/32 = 5/16 = 0.3125
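Because dbinom() accepts a vector for x, the whole distribution of the number of heads in 5 tosses can be tabulated in one call. The sketch below also adds the two probabilities from the coin example to get the chance of exactly one or exactly three heads; reading the question that way is an assumption made here for illustration.

```r
heads <- 0:5                                  # every possible number of heads
probs <- dbinom(heads, size = 5, prob = 0.5)  # P(X = 0), ..., P(X = 5)

print(data.frame(heads = heads, probability = probs))
print(sum(probs))                             # the probabilities sum to 1

# P(X = 1 or X = 3): the two events are mutually exclusive, so add them
one_or_three <- sum(dbinom(c(1, 3), size = 5, prob = 0.5))
print(one_or_three)
```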
 
Binomial cumulative probability using pbinom() function in R
 
The syntax to compute the cumulative distribution function (CDF) of the binomial distribution in R is
pbinom(q, size, prob)
where
q : the value(s) of the variable,
size : the number of trials, and
prob : the probability of success on each trial.
The function returns the cumulative probability P(X ≤ q) for the given value(s) of q, size and prob.
 
Example: In a university, 45% of the students are female. A random sample of 10 students is selected. What is the probability that 2 or fewer female students are selected?
Answer: We need P(X ≤ 2) with size = 10 and prob = 0.45.
> pbinom(2, 10, 0.45)
[1] 0.09955965
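The cumulative probability is the sum of the individual PMF values, which the sketch below confirms for the university example. It also uses the lower.tail = FALSE argument of pbinom() to get the complementary probability P(X > 2), that is, 3 or more female students. The variable names are illustrative.

```r
size <- 10
prob <- 0.45

# P(X <= 2) from the CDF
at_most_2 <- pbinom(2, size, prob)

# The same probability as a sum of PMF values P(X=0) + P(X=1) + P(X=2)
by_sum <- sum(dbinom(0:2, size, prob))

# Upper tail: P(X > 2), i.e. 3 or more female students in the sample
at_least_3 <- pbinom(2, size, prob, lower.tail = FALSE)

print(c(cdf = at_most_2, summed_pmf = by_sum, upper_tail = at_least_3))
```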
 
 
Binomial Distribution Quantiles using qbinom() in R
 
qbinom is the R function that calculates the inverse CDF (or quantiles) of the binomial distribution.
It takes a probability value and returns the smallest number of successes whose cumulative probability is at least that value.
The syntax to compute the inverse CDF or quantiles of the binomial distribution in R is
qbinom(p, size, prob)
where
    p : the value(s) of the probabilities,
    size : the number of trials, and
    prob : the probability of success on each trial.
 
Note: qbinom is the inverse of the pbinom function. pbinom calculates the cumulative distribution function (CDF) of a binomial random variable, while qbinom calculates the inverse CDF, or quantile function. Because the binomial distribution is discrete, qbinom(p, size, prob) returns the smallest q with pbinom(q, size, prob) ≥ p, so the round trip is exact only when p is a value that the CDF actually takes.
 
Difference between CDF and inverse CDF
The CDF gives the probability that a random variable takes on a value less than or equal to a given value. The inverse CDF does the opposite: it takes a probability as input and returns the value of the random variable that corresponds to that probability.
 
Example problem that demonstrates the relationship between pbinom and qbinom:
Suppose we flip a fair coin 10 times. What is the probability of getting 3 or fewer heads?
To solve this problem using pbinom, we set size = 10 and prob = 0.5 (since the coin is fair) and use the following code:
> pbinom(3, 10, 0.5)
[1] 0.171875
This returns a probability of approximately 0.1719, meaning there is a 17.19% chance of getting 3 or fewer heads in 10 coin flips.
To go in the other direction, we pass this probability to qbinom with the same size and prob:
> qbinom(0.171875, 10, 0.5) # takes a probability and returns the number of successes whose cumulative probability reaches it
[1] 3
This returns a value of 3, which confirms that the probability of getting 3 or fewer heads is approximately 0.1719. Here, we used qbinom to find the smallest value of k such that P(X ≤ k) ≥ 0.171875.
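When the probability passed to qbinom() is not a value that the CDF takes exactly, the function returns the smallest number of successes whose cumulative probability is at least that value. The sketch below shows this for 10 fair coin flips and a probability of 0.5, which is an illustrative choice.

```r
size <- 10
prob <- 0.5
p <- 0.5   # illustrative probability that the CDF does not hit exactly

k <- qbinom(p, size, prob)               # smallest k with P(X <= k) >= p
cdf_at_k <- pbinom(k, size, prob)        # reaches or exceeds p
cdf_below_k <- pbinom(k - 1, size, prob) # still below p

print(k)
print(c(cdf_below_k = cdf_below_k, p = p, cdf_at_k = cdf_at_k))
```

Here pbinom(qbinom(p, ...), ...) is larger than p, which is why the round trip is exact only for probabilities produced by pbinom itself.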
 
Simulating Binomial random variable using rbinom() function in R
 
The general R function to generate random numbers from the binomial distribution is
rbinom(n, size, prob)
where
    n is the number of random values to generate,
    size is the number of trials, and
    prob is the probability of success on each trial.
The function rbinom(n, size, prob) generates n random numbers, each the number of successes in size trials with probability of success prob.
Example: Generate 8 random values from a binomial distribution with 150 trials and a success probability of 0.4.
> x <- rbinom(8, 150, .4)
> x
[1] 61 51 54 54 56 62 62 48
Because the values are random, the output differs from run to run.
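A larger simulation shows how the random values relate to the theoretical mean n*p and variance n*p*q. The sketch below fixes a seed with set.seed() so that the run is reproducible; the seed value and the sample size of 10000 are arbitrary assumptions.

```r
set.seed(42)                      # arbitrary seed, only for reproducibility

size <- 150
prob <- 0.4
x <- rbinom(10000, size = size, prob = prob)

print(head(x))                    # first few simulated counts
print(mean(x))                    # close to size * prob = 60
print(var(x))                     # close to size * prob * (1 - prob) = 36
```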
 
Poisson Distribution
 
The Poisson distribution is a discrete probability distribution that describes the probability of a certain number of events occurring within a fixed interval of time or space, given the average rate of occurrence (λ) of those events.
 
The binomial distribution models the number of successes in a fixed number of independent trials, while the Poisson distribution models the number of occurrences in a fixed interval of time or space.
 
 
 
 
In 1837 the French mathematician Siméon Denis Poisson derived the distribution as a limiting case of the binomial distribution. It is named after him.
Conditions:
(i) The number of trials n is indefinitely large, i.e., n → ∞
(ii) The probability of success p for each trial is very small, i.e., p → 0
(iii) np = λ is finite
(iv) Events are independent
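The limiting relationship can be seen numerically. The sketch below holds n * p fixed at λ while n grows and p shrinks, and measures the largest gap between the binomial and Poisson PMFs. The values λ = 4, the range 0 to 10 and the three values of n are illustrative assumptions.

```r
lambda <- 4                      # fixed mean, so that n * p = lambda
x <- 0:10                        # values at which the two PMFs are compared
n_values <- c(10, 100, 10000)    # increasing number of trials

# Largest absolute gap between Binomial(n, lambda / n) and Poisson(lambda)
max_gap <- sapply(n_values, function(n) {
  max(abs(dbinom(x, size = n, prob = lambda / n) - dpois(x, lambda)))
})

print(data.frame(n = n_values, p = lambda / n_values, max_gap = max_gap))
```

The gap shrinks as n increases, which is the sense in which the Poisson distribution is a limiting case of the binomial.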
 
 
The random variable X is said to follow the Poisson probability distribution if its probability mass function (PMF) is
$$P(X = x) = p(x) = \frac{e^{-\lambda} \lambda^{x}}{x!}, \qquad x = 0, 1, 2, \ldots$$
where
p(x) = the probability of x successes over a given period of time or space, given λ
λ = the expected number of successes per unit of time or space (λ > 0)
e = 2.71828 (the base of natural logarithms)
The mean of the distribution is λ.
The variance of the distribution is also λ.
The standard deviation of the distribution is √λ.
 
Probability Computations Related to Poisson Distributions in R
 
In R, you can use the dpois(), ppois(), qpois(), and rpois() functions to work with the Poisson distribution.
1. dpois(x, lambda) calculates the probability mass function (PMF) of the Poisson distribution at a specific value x, given the Poisson parameter lambda.
2. ppois(q, lambda) calculates the cumulative distribution function (CDF) of the Poisson distribution at a specific value q, given the Poisson parameter lambda.
3. qpois(p, lambda) calculates the inverse cumulative distribution function (quantile function) of the Poisson distribution at a specific probability value p, given the Poisson parameter lambda.
4. rpois(n, lambda) generates n random values from a Poisson distribution with the Poisson parameter lambda.
 
dpois
The dpois function calculates the probability mass function of a Poisson distribution at a particular value x, given the parameter lambda.
dpois(x, lambda)
x: number of events (successes)
lambda: average number of events per interval
Example:
Suppose a call center receives an average of 10 customer calls per hour. What is the probability that the call center will receive exactly 7 calls in the next hour?
Ans:
> lambda <- 10
> x <- 7
> prob <- dpois(x, lambda)
> prob
[1] 0.09007923
 
 
To solve this problem manually using the Poisson distribution, we can use the formula:
 
$$P(X = x) = p(x) = \frac{e^{-\lambda} \lambda^{x}}{x!}$$
where λ is the average number of events per interval (in this case, 10 customer calls per hour), x is the number of events we're interested in (in this case, 7 customer calls in the next hour), and e is the mathematical constant approximately equal to 2.71828.
P(X = 7) = (e^(-10) * 10^7) / 7!
= (0.0000454 * 10,000,000) / (7 * 6 * 5 * 4 * 3 * 2 * 1)
= 454 / 5040
= 0.09008
Therefore, the probability of receiving exactly 7 calls in the next hour is approximately 0.090 or 9.0%.
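The manual calculation can be reproduced in R by typing the Poisson formula with exp() and factorial() and comparing the result with dpois(). This is a minimal sketch for the call center example; the variable names are illustrative.

```r
lambda <- 10   # average number of calls per hour
x <- 7         # number of calls of interest

# Poisson PMF written out: e^(-lambda) * lambda^x / x!
manual <- exp(-lambda) * lambda^x / factorial(x)
builtin <- dpois(x, lambda)

print(c(manual = manual, builtin = builtin))
print(all.equal(manual, builtin))
```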
 
 
ppois
In R, you can use the ppois function to calculate the cumulative distribution function (CDF) of the Poisson distribution. The CDF gives the probability of getting q or fewer events in an interval, given the average number of events per interval.
ppois(q, lambda)
q: number of events (successes)
lambda: average number of events per interval
Example: It is known that a certain hospital experiences an average of 4 births per hour. In a given hour, what is the probability that 4 or fewer births occur?
Answer: Using the Poisson distribution with λ = 4 and x = 4, we find that P(X ≤ 4) = 0.62884.
> ppois(4, 4)
[1] 0.6288369
So the probability of 4 or fewer births in an hour is approximately 0.6288 or 62.88%.
 
To solve this problem manually using the Poisson distribution, we can use the formula:
$$P(X = x) = p(x) = \frac{e^{-\lambda} \lambda^{x}}{x!}$$
where λ is the average number of events per hour and x is the number of events.
To find the probability of 4 or fewer births in an hour, we need to calculate the probabilities for x = 0, 1, 2, 3, and 4, and add them up:
P(0) = (4^0 * e^(-4)) / 0! = 0.0183
P(1) = (4^1 * e^(-4)) / 1! = 0.0733
P(2) = (4^2 * e^(-4)) / 2! = 0.1465
P(3) = (4^3 * e^(-4)) / 3! = 0.1954
P(4) = (4^4 * e^(-4)) / 4! = 0.1954
Therefore, the probability of 4 or fewer births in an hour is:
P(0 or 1 or 2 or 3 or 4) = P(0) + P(1) + P(2) + P(3) + P(4)
= 0.6288 (summing the unrounded terms), which agrees with the ppois result.
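The term-by-term sum can be carried out in R without rounding the intermediate values. The sketch below adds the PMF values for 0 to 4 births, compares the total with ppois(), and also computes the probability of more than 4 births with lower.tail = FALSE. The variable names are illustrative.

```r
lambda <- 4                        # average births per hour

terms <- dpois(0:4, lambda)        # P(0), P(1), P(2), P(3), P(4)
print(round(terms, 4))

total <- sum(terms)                # P(X <= 4) as a sum of PMF values
print(total)
print(ppois(4, lambda))            # the same value from the CDF

# Complement: probability of more than 4 births in the hour
more_than_4 <- ppois(4, lambda, lower.tail = FALSE)
print(more_than_4)
```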
 
qpois
qpois is an R function that calculates the inverse cumulative distribution function (also known as the quantile function) of the Poisson distribution.
qpois(p, lambda)
p: the probability value for which you want to find the inverse CDF.
lambda: the average rate of events per unit time.
Example:
Suppose that the number of people who visit a website in a day follows a Poisson distribution with a mean of 500 people. What is the smallest number k such that, with a probability of at least 95%, no more than k people visit the website in a day?
To solve this problem, we find the quantile that corresponds to a probability of 0.95 using the qpois function with a mean of 500:
> qpois(0.95, 500)
[1] 537
This means that, with a probability of at least 95%, at most 537 people visit the website in a day.
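The meaning of the quantile can be checked with ppois(): the value returned by qpois(0.95, 500) is the smallest k with P(X ≤ k) ≥ 0.95. The sketch below verifies this and, as an illustrative extension not taken from the slides, uses the 5% quantile to get a lower bound that the visitor count reaches with probability above 95%.

```r
lambda <- 500                          # mean number of daily visitors

k <- qpois(0.95, lambda)               # smallest k with P(X <= k) >= 0.95
print(k)
print(ppois(k, lambda))                # at least 0.95
print(ppois(k - 1, lambda))            # below 0.95

# Lower bound: m such that P(X >= m) is above 0.95
m <- qpois(0.05, lambda)
prob_at_least_m <- ppois(m - 1, lambda, lower.tail = FALSE)
print(c(m = m, prob_at_least_m = prob_at_least_m))
```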
 
 
rpois
rpois is a function in R that generates random numbers from a Poisson distribution with a specified
mean. The function takes two arguments: the number of random numbers to generate (n) and the
mean of the Poisson distribution (lambda).
rpois(n, lambda)
n: number of random values to generate
lambda: mean of the Poisson distribution
Example: Suppose we want to generate 10 random numbers from a Poisson distribution with a mean of 5:
> rpois(10, 5)
 [1] 5 3 8 2 6 6 4 2 3 8
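With a larger sample, the simulated values reflect the property that the mean and the variance of a Poisson distribution are both equal to λ. The sketch below uses an arbitrary seed and an assumed sample size of 10000.

```r
set.seed(123)                  # arbitrary seed, only for reproducibility

lambda <- 5
x <- rpois(10000, lambda)

print(mean(x))                 # close to lambda
print(var(x))                  # also close to lambda
print(mean(x == 5))            # observed share of 5s
print(dpois(5, lambda))        # theoretical P(X = 5), for comparison
```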
 
The Normal Distribution
 
In probability theory and statistics, the Normal Distribution, also called the Gaussian Distribution, is the most important continuous probability distribution.
The normal distribution is a bell-shaped, symmetrical distribution (the values to the left of the mean are a mirror image of the values to the right of the mean) in which the mean, median and mode are all equal.
If the mean, median and mode are unequal, the distribution is either positively or negatively skewed.
A continuous random variable X having the bell-shaped distribution is called a normal random variable.
 
 
A random variable X is said to have a Normal distribution with mean μ and variance σ² if its probability density function is given by
$$f(x) = \frac{1}{\sigma\sqrt{2\pi}} \, e^{-\frac{(x-\mu)^{2}}{2\sigma^{2}}}$$
It is denoted by X ~ N(μ, σ²)
where
f(x) = probability density at the value x
π = 3.14159; e = 2.71828
σ = population standard deviation (σ > 0)
x = value of the random variable (-∞ < x < ∞)
μ = population mean (-∞ < μ < ∞)
 
Properties of Normal Distribution
 
Standard Normal Distribution
The simplest case of a normal distribution is known as the standard normal distribution.
This is the special case where μ = 0 and σ = 1, and it is described by the probability density function
$$\varphi(z) = \frac{1}{\sqrt{2\pi}} \, e^{-z^{2}/2}$$
If X ~ N(μ, σ²), let Z = (X - μ) / σ [Z-transformation]; then
E(Z) = 0, V(Z) = 1,
i.e. Z ~ N(0, 1), and Z is said to have a standard normal distribution.
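The Z-transformation means that any normal probability can be computed from the standard normal distribution. The sketch below standardizes a value and shows that pnorm() gives the same answer either way; the mean 544, standard deviation 103 and score 480 are taken from the GRE example used in this document.

```r
mu <- 544       # population mean
sigma <- 103    # population standard deviation
x <- 480        # value of interest

z <- (x - mu) / sigma                        # Z-transformation

p_direct <- pnorm(x, mean = mu, sd = sigma)  # P(X <= 480) on the original scale
p_standard <- pnorm(z)                       # P(Z <= z) on the standard scale

print(z)
print(c(direct = p_direct, standardized = p_standard))
```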
 
 
Probability Computations Related to Normal Distributions in R
 
dnorm: density function of the normal distribution
pnorm: cumulative distribution function of the normal distribution
qnorm: quantile function of the normal distribution
rnorm: random sampling from the normal distribution
 
Normal probabilities using dnorm() function in R
 
dnorm
The function dnorm returns the value of the probability density function (PDF) of the normal distribution at a given value x, for a population mean μ and population standard deviation σ. For a continuous distribution this value is a density, not a probability.
The syntax for using dnorm is as follows:
dnorm(x, mean, sd)
Example:
The GRE (Graduate Record Examinations) is widely used to help predict the performance of applicants to graduate schools. The range of possible scores on a GRE is 200 to 900. The psychology department at a university finds that the students in their department have scores with a mean of 544 and standard deviation of 103.
Find the value of the density function at x = 550.
> dnorm(550,544,103)
[1] 0.00386666
 
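Manual verification
The value of the density function at x = 550 is
$$f(550) = \frac{1}{103\sqrt{2\pi}} \, e^{-\frac{(550-544)^{2}}{2 \cdot 103^{2}}} \approx 0.003867$$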
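The manual verification can be carried out in R by typing the normal density formula, 1 / (σ√(2π)) * exp(-(x - μ)² / (2σ²)), and comparing it with dnorm(). The helper name normal_pdf is a hypothetical name used only in this sketch.

```r
# Normal PDF written out from the formula.
# normal_pdf is a hypothetical helper name used only for this sketch.
normal_pdf <- function(x, mu, sigma) {
  1 / (sigma * sqrt(2 * pi)) * exp(-(x - mu)^2 / (2 * sigma^2))
}

manual <- normal_pdf(550, mu = 544, sigma = 103)
builtin <- dnorm(550, mean = 544, sd = 103)

print(c(manual = manual, builtin = builtin))
print(all.equal(manual, builtin))
```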
 
Normal cumulative distribution function using pnorm() function in R
The function pnorm returns the value of the cumulative distribution function (CDF) of the normal distribution at a given value q, for a population mean μ and population standard deviation σ.
The syntax for using pnorm is as follows:
pnorm(q, mean, sd)
By definition, pnorm(q, mean, sd) = P(X ≤ q).
Example:
The GRE (Graduate Record Examinations) is widely used to help predict the performance of applicants to graduate schools. The range of possible scores on a GRE is 200 to 900. The psychology department at a university finds that the students in their department have scores with a mean of 544 and standard deviation of 103.
Find the probability that a student in the psychology department has a score less than 480.
We need to find the probability P(X ≤ 480), which for a continuous distribution equals P(X < 480):
> pnorm(480,544,103)
[1] 0.2671816
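The same function also answers questions about upper tails and intervals. The sketch below uses the GRE parameters from the example; the cut-off of 600 is an assumed value chosen only for illustration.

```r
mu <- 544
sigma <- 103

# P(X <= 480): lower tail, as in the example
below_480 <- pnorm(480, mu, sigma)

# P(X > 600): upper tail, using lower.tail = FALSE
above_600 <- pnorm(600, mu, sigma, lower.tail = FALSE)

# P(480 < X <= 600): difference of two CDF values
between <- pnorm(600, mu, sigma) - pnorm(480, mu, sigma)

print(c(below_480 = below_480, between = between, above_600 = above_600))
print(below_480 + between + above_600)   # the three pieces sum to 1
```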
 
 
Normal Distribution Quantiles using qnorm() in R
qnorm
The function qnorm returns the value of the inverse cumulative distribution function (quantile function) of the normal distribution at a given probability p, for a population mean μ and population standard deviation σ.
The syntax for using qnorm is as follows:
qnorm(p, mean, sd)
qnorm is the inverse function of pnorm.
Example:
Suppose that the heights of a certain population follow a normal distribution with a mean of 170 cm and a standard deviation of 5 cm. What is the height below which 90% of the population lies?
> qnorm(0.9,170,5)
[1] 176.4078
So the height below which 90% of the population lies is approximately 176.41 cm.
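Because the normal distribution is continuous, qnorm and pnorm undo each other exactly (up to floating-point error). The sketch below checks the round trip for the height example and also obtains the same quantile from the standard normal quantile via x = μ + σz. The variable names are illustrative.

```r
mu <- 170
sigma <- 5

h90 <- qnorm(0.9, mean = mu, sd = sigma)   # 90th percentile of height
back <- pnorm(h90, mean = mu, sd = sigma)  # applying the CDF recovers 0.9

z90 <- qnorm(0.9)                          # 90th percentile of N(0, 1)
h90_from_z <- mu + sigma * z90             # undo the Z-transformation

print(c(h90 = h90, back = back, h90_from_z = h90_from_z))
```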
 
 
Simulating Normal random variable using rnorm() function in R
rnorm is a function in R that generates random numbers from a normal distribution.
rnorm(n, mean, sd)
This function generates n random numbers from a normal distribution with the given mean and sd.
The only required argument is n, the number of normal variates to produce. If mean and sd are omitted, they default to 0 and 1, so rnorm(n) samples from the standard normal distribution.
Example:
Generate 10 random numbers from a normal distribution with a mean of 5 and a standard deviation of 2:
> rnorm( 10, 5, 2)
 [1] 7.448164 5.719628 5.801543 5.221365 3.888318 8.573826 5.995701 1.066766
 [9] 6.402712 4.054417
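A larger sample makes the connection between rnorm and the other normal functions visible. The sketch below draws 10000 values with an arbitrary seed and compares the sample mean, the sample standard deviation and the share of values at or below 7 with their theoretical counterparts; the sample size and the cut-off of 7 are assumptions made for illustration.

```r
set.seed(2024)                      # arbitrary seed, only for reproducibility

x <- rnorm(10000, mean = 5, sd = 2)

print(mean(x))                      # close to 5
print(sd(x))                        # close to 2
print(mean(x <= 7))                 # observed share of values at or below 7
print(pnorm(7, mean = 5, sd = 2))   # theoretical P(X <= 7), for comparison
```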

Probability distributions play a crucial role in data analysis, with the binomial distribution being a key one in R. This distribution helps describe the number of successes in a fixed number of trials with two possible outcomes. Learn about the properties, probability computations, mean, variance, and standard deviation associated with the binomial distribution. R provides built-in functions for working with this distribution, making it easier to analyze data effectively.


Uploaded on Jul 25, 2024


Download Presentation

Please find below an Image/Link to download the presentation.

The content on the website is provided AS IS for your information and personal use only. It may not be sold, licensed, or shared on other websites without obtaining consent from the author. Download presentation by click this link. If you encounter any issues during the download, it is possible that the publisher has removed the file from their server.

E N D

Presentation Transcript


  1. Basic Probability Distributions in Basic Probability Distributions in R Programming R Programming By Dr. Mohamed Surputheen

  2. probability distributions in R probability distributions in R Many statistical tools and techniques used in data analysis are based on probability. Probability measures how likely it is for an event to occur on a scale from 0 (the event never occurs) to 1 (the event always occurs). A probability distribution describes how a random variable is distributed; it tells us which values a random variable is most likely to take on and which values are less likely. R comes with built-in implementations of many probability distributions. Each probability distribution in R is associated with four functions which follow a naming convention: The d-prefix function calculates the probability density function (PDF) of a continuous probability distribution, or the probability mass function (PMF) of a discrete probability distribution, at a specific value of the random variable. The p-prefix function calculates the cumulative distribution function (CDF) of a probability distribution, which gives the probability of observing a value less than or equal to a given value of the random variable The q-prefix function calculates the quantile of a probability distribution, which is the inverse of the CDF. The r-prefix function generates random numbers from a probability distribution

  3. Binomial Distribution The binomial distribution is a discrete probability distribution that describes the number of successes in a fixed number of independent trials with two possible outcomes (success or failure) and a constant probability of success for each trial. A binomial experiment has the following properties: experiment consists of n identical and independent trials each trial results in one of two outcomes: success or failure P(success) = p P(failure) = q = 1 - p for all trials The random variable of interest, X, is the number of successes in the n trials. X has a binomial distribution with parameters n and p

  4. Binomial Distribution If the probability of success in each trial is given by p , then the probability of getting exactly x successful events among n trials is given by the Binomial PMF or ! n = = x n x ( ) 1 ( ) for , 1 , 0 ..., P x p p x n x ( ! )! n x -n is the total number of trials -x is the number of successes -p is the probability of success on each trial.

  5. Mean , variance and Standard deviation of Binomial Distribution The mean, E(X) = p + p + + p = n*p The variance, V(X) = pq + pq + + pq = n*pq The standard deviation =

  6. Probability Computations Related to Binomial Distributions Probability Computations Related to Binomial Distributions R has several functions related to the binomial distribution. Here are some commonly used ones: 1. dbinom(x, size, prob) - Probability Mass Function (PMF) or probability distribution of the binomial distribution. Calculates the probability of getting exactly x successes in size trials, given a probability prob of success on each trial. 2. pbinom(q, size, prob) - Cumulative Distribution Function (CDF) of the binomial distribution. Calculates the probability of getting up to q successes in size trials, given a probability prob of success on each trial. 3. qbinom(p, size, prob) - Inverse CDF of the binomial distribution. Calculates the smallest number q such that the CDF is less than or equal to p, given size trials and a probability prob of success on each trial. (ie) This function takes the probability value and gives a number whose cumulative value matches the probability 4. rbinom(n, size, prob) - Random number generator for the binomial distribution. Generates n random samples from a binomial distribution with size trials and a probability prob of success on each trial.

  7. Binomial probabilities using dbinom() function in R dbinom is the function used to find the probability mass function for the binomial distribution. The function dbinom is used to obtain the exact probability using Binomial distribution, i.e. P(X=x). The syntax to compute the probability at x for binomial distribution using R is dbinom(x,size,prob) where x : the value(s) of the variable, size : the number of trials, and prob : the probability of success (prob). The dbinom() function gives the probability for given value(s) x (no. of successes), size (no. of trials) and prob (probability of success).

  8. Manual verification Example: dbinom The probability of getting a head on a single coin toss is 1/2, and the probability of getting a tail is also 1/2. A coin is tossed 5 times. What is the probability of getting one head and three heads? To solve this problem using R language, we can use the dbinom() function, which calculates the binomial probability mass function. To find the probability of getting a certain number of heads in 5 coin tosses, we can use the binomial probability formula: For getting one head: P(x) = nCx pxq(n-x) > dbinom(1, size=5, prob=0.5) Given n=5, p=0.5,q=0.5 [1] 0.15625 For getting three heads: So for getting one head, we have: P(X=1) = (5C1) * (1/2)^1 * (1/2)^4 = 5/32 =0.15625 > dbinom(3, size=5, prob=0.5) For getting three heads, we have: [1] 0.3125 P(X=3) = (5 C 3) * (1/2)^3 * (1/2)^2 = 10/32 = 5/16 =0.3125

  9. Binomial cumulative probability using Binomial cumulative probability usingpbinom() pbinom() function in R function in R The syntax to compute the cumulative probability distribution function (CDF) for binomial distribution using R is pbinom(q,size,prob) where q : the value(s) of the variable, size : the number of trials, and prob : the probability of success (prob). This function is very useful for calculating the cumulative binomial probabilities for given value(s) of q (value of the variable x), size (no. of trials) and prob (probability of success).

  10. In a university 45% of the students are female. A random sample of 10 students are selected. What is the probability that 2 or less female students are selected? Answer: pbinom(q,size,prob) pbinom(2,10,0.45) [1] 0.09955965

  11. Binomial Distribution Quantiles using qbinom() in R qbinom is the R function that calculates the inverse CDF (or quqntiles) of the binomial distribution. W.k.t, This function takes the probability value and gives a number whose cumulative value matches the probability value. The syntax to compute the inverse CDF or quantiles of binomial distribution using R is qbinom(p,size,prob) where p : the value(s) of the probabilities, size : the number of trials, and prob : the probability of success (prob). Diff between CDF and invers CDF The CDF represents the probability that a random variable takes on a value less than or equal to a given value. The inverse CDF, on the other hand, does the opposite. It takes a probability as input and returns the value of the random variable that corresponds to that probability. The function qbinom(p,size,prob) gives the Inverse CDF of Binomial distribution for given value of p, size and prob. Note: qbinom is the inverse of the pbinom function. pbinom calculates the cumulative probability distribution function (CDF) of a binomial random variable, while qbinom calculates the inverse CDF or the quantile function of the binomial distribution.

  12. Example problem that demonstrates the relationship between pbinom and qbinom: Suppose we flip a fair coin 10 times. What is the probability of getting 3 or fewer heads? To solve this problem using pbinom, we can set n = 10 and p = 0.5 (since the coin is fair) and use the following code: > pbinom(3, 10, 0.5) [1] 0.171875 This returns a probability of approximately 0.1719, meaning there is a 17.19% chance of getting 3 or fewer heads in 10 coin flips. To solve this problem using qbinom, we can again set n = 10 and p = 0.5 and use the following code: > qbinom( 0.171875, 10, 0.5) # This function takes the probability value and gives a number whose cumulative value matches the probability value. [1] 3 This returns a value of 3, which confirms that the probability of getting 3 or fewer heads is approximately 0.1719. Here, we used qbinom to find the value of k such that P(X k) = 0.1719.

  13. Simulating Binomial random variable using Simulating Binomial random variable using rbinom rbinom() function in R () function in R The general R function to generate random numbers from Binomial distribution is rbinom(n,size,prob) where, n is the sample size, size is the number of trials, and prob is the the probability of success in binomial distribution. The function rbinom(n,size,prob) generates n random numbers from Binomial distribution with the number of trials size and the probability of success prob. Example: Generate 8 random values from a sample of 150 with probability of 0.4. > x <- rbinom(8,150,.4) > x [1] 61 51 54 54 56 62 62 48

  14. Poisson Distribution Poisson Distribution The Poisson distribution is a probability distribution that describes the probability of a certain number of events occurring within a fixed time or space interval, given the average rate of occurrence( )of those events (ie) The Poisson distribution models the probability of a certain number of events occurring in a fixed interval of time, given the average rate at which the events occur. The binomial distribution models the probability of a fixed number of successes in a fixed number of independent trials, while the Poisson distribution models the probability of a fixed number of occurrences in a fixed time or space interval.

  15. In 1837 French mathematician Simeon Dennis Poisson derived the distribution as a limiting case of Binomial distribution. It is called after his name as Poisson distribution. Conditions: (i) The number of trails n is indefinitely large i.e., n (ii) The probability of a success p for each trial is very small i.e., p 0 (iii) np= is finite (iv) Events are Independent

  16. The random variable X is said to follow the Poisson probability distribution if it has the probability function: The pmf is given by P(X=x)= p(x) = e- x/ x! , for x=0,1,2 where P(x) = the probability of x successes over a given period of time or space, given = the expected number of successes per time > 0 e = 2.71828 (the base for natural logarithms) The mean of the distribution is . The variance of the distribution is also . The standard deviation of the distribution is .

  17. Probability Computations Related to Poisson Distributions in R Probability Computations Related to Poisson Distributions in R In R, you can use the dpois(), ppois(), qpois(), and rpois() functions to work with the Poisson distribution. 1.dpois(x, lambda) calculates the Probability Mass Function (PMF) of the Poisson distribution at a specific value of x, given a Poisson parameter lambda. 2. ppois(q, lambda) calculates the Cumulative Distribution Function (CDF) of the Poisson distribution at a specific value of q, given a Poisson parameter lambda. 3. qpois(p, lambda) calculates the Inverse Cumulative Distribution Function (quantile function) of the Poisson distribution at a specific probability value p, given a Poisson parameter lambda. 4. rpois(n, lambda) generates n random samples from a Poisson distribution with a Poisson parameter lambda.

  18. dpois
      The dpois function calculates the probability mass function of a Poisson distribution, given a particular value x and a parameter lambda.
      dpois(x, lambda)
      x: number of successes
      lambda: average rate of success
      Example: Suppose a call center receives an average of 10 customer calls per hour. What is the probability that the call center will receive exactly 7 calls in the next hour?
      Ans:
      > lambda <- 10
      > x <- 7
      > prob <- dpois(x, lambda)
      > prob
      [1] 0.09007923
      To solve this problem manually, we can use the formula
      $P(X = x) = p(x) = \dfrac{e^{-\lambda} \lambda^{x}}{x!}$
      where $\lambda$ is the average number of events per interval (in this case, 10 customer calls per hour), x is the number of events we are interested in (in this case, 7 customer calls in the next hour), and e is the mathematical constant approximately equal to 2.71828.
      $P(X = 7) = \dfrac{e^{-10} \cdot 10^{7}}{7!} = \dfrac{0.0000454 \times 10{,}000{,}000}{7 \times 6 \times 5 \times 4 \times 3 \times 2 \times 1} = 0.09008$
      Therefore, the probability of receiving exactly 7 calls in the next hour is approximately 0.090 or 9.0%.
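The manual calculation for the call-center example can also be reproduced in R. This is a small sketch that evaluates the formula with base functions and compares it with dpois(); the variable name `manual` is introduced here only for illustration.

```r
lambda <- 10                     # average calls per hour (from the example)
x <- 7                           # number of calls of interest

# e^(-lambda) * lambda^x / x!, evaluated term by term
manual <- exp(-lambda) * lambda^x / factorial(x)
manual
all.equal(manual, dpois(x, lambda))   # TRUE: formula and dpois() agree
```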

  19. ppois
      In R, you can use the ppois function to calculate the Cumulative Distribution Function (CDF) of the Poisson distribution. The CDF gives the probability of getting k or fewer events in a certain interval of time, given the average rate of events per unit time.
      ppois(q, lambda)
      q: number of successes
      lambda: average rate of success
      Example: It is known that a certain hospital experiences an average of 4 births per hour. In a given hour, what is the probability that 4 or fewer births occur?
      Answer: Using the Poisson distribution with $\lambda = 4$ and x = 4, we find that $P(X \le 4) = 0.62884$.
      > ppois(4,4)
      [1] 0.6288369
      To solve this problem manually, we can use the formula
      $P(X = x) = p(x) = \dfrac{e^{-\lambda} \lambda^{x}}{x!}$
      where $\lambda$ is the average rate of events per hour and x is the number of events. To find the probability of 4 or fewer births in an hour, we calculate the probabilities for x = 0, 1, 2, 3, and 4, and add them up:
      $P(0) = \dfrac{4^{0} e^{-4}}{0!} = 0.0183$
      $P(1) = \dfrac{4^{1} e^{-4}}{1!} = 0.0733$
      $P(2) = \dfrac{4^{2} e^{-4}}{2!} = 0.1465$
      $P(3) = \dfrac{4^{3} e^{-4}}{3!} = 0.1954$
      $P(4) = \dfrac{4^{4} e^{-4}}{4!} = 0.1954$
      Therefore, the probability of 4 or fewer births in an hour is:
      $P(X \le 4) = P(0) + P(1) + P(2) + P(3) + P(4) \approx 0.6288$ (summing the unrounded terms)
      So the probability of 4 or fewer births in an hour is approximately 0.6288 or 62.88%, which matches the result obtained with ppois.
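The term-by-term sum for the hospital example can be sketched in R as follows. The last line, the probability of more than 4 births, is an extra illustration that is not part of the original example.

```r
lambda <- 4                      # average births per hour (from the example)

terms <- dpois(0:4, lambda)      # P(X = 0), P(X = 1), ..., P(X = 4)
round(terms, 4)

sum(terms)                       # P(X <= 4) by adding the individual terms
ppois(4, lambda)                 # the same probability from the CDF

# Complement: P(X > 4), i.e. 5 or more births in the hour
ppois(4, lambda, lower.tail = FALSE)
```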

  20. qpois
      qpois is the R function that calculates the inverse cumulative distribution function (also known as the quantile function) of the Poisson distribution.
      qpois(p, lambda)
      p: the probability value for which you want to find the inverse CDF
      lambda: the average rate of events per unit time
      Example: Suppose that the number of people who visit a website in a day follows a Poisson distribution with a mean of 500 people. What is the smallest number x such that the number of visitors in a day is at most x with a probability of at least 95%?
      qpois returns the smallest value x for which $P(X \le x) \ge p$, so we call it with p = 0.95 and the mean value of 500:
      > qpois(0.95, 500)
      [1] 537
      This means that, with a probability of at least 95%, no more than 537 people visit the website in a day.
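The defining property of the quantile function can be checked against ppois(). The sketch below also shows, as an assumption about what a reader might want instead, how a lower bound ("at least this many visitors with probability of at least 95%") could be obtained from the 5% quantile; the names `q_upper` and `q_lower` are hypothetical.

```r
lambda <- 500                    # mean visitors per day (from the example)

# Smallest x with P(X <= x) >= 0.95
q_upper <- qpois(0.95, lambda)
ppois(q_upper, lambda)           # at least 0.95
ppois(q_upper - 1, lambda)       # below 0.95, so q_upper is the smallest such x

# A different reading of the question: a value that the visitor count
# reaches or exceeds with probability of at least 95%
q_lower <- qpois(0.05, lambda)
ppois(q_lower - 1, lambda, lower.tail = FALSE)   # P(X >= q_lower)
```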

  21. rpois
      rpois is a function in R that generates random numbers from a Poisson distribution with a specified mean. The function takes two arguments: the number of random numbers to generate (n) and the mean of the Poisson distribution (lambda).
      rpois(n, lambda)
      n: number of random values to generate
      lambda: mean of the Poisson distribution
      Example: Suppose we want to generate 10 random numbers from a Poisson distribution with a mean of 5.
      > rpois(10, 5)
      [1] 5 3 8 2 6 6 4 2 3 8
      Because the values are random, the output differs from run to run.
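To make a simulation repeatable, a seed can be set before calling rpois(). The following is a minimal sketch; the seed value 42 and the sample size of 10000 are arbitrary assumptions.

```r
set.seed(42)                     # arbitrary seed so the draws can be reproduced
draws <- rpois(10000, lambda = 5)

mean(draws)                      # close to lambda = 5
var(draws)                       # also close to 5, since mean = variance
table(draws)[1:5]                # counts of the five smallest observed values
```

With a large sample, the sample mean and sample variance both settle near lambda, which is consistent with the Poisson property that the mean equals the variance.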

  22. The Normal Distribution
      In probability theory and statistics, the Normal Distribution, also called the Gaussian Distribution, is the most significant continuous probability distribution.
      The normal distribution is a bell-shaped, symmetrical distribution (the values to the left of the mean are a mirror image of the values to the right of the mean) in which the mean, median and mode are all equal. If the mean, median and mode are unequal, the distribution will be either positively or negatively skewed.
      A continuous random variable X having the bell-shaped distribution is called a normal random variable.

  23. A random variable X is said to have a Normal distribution with mean $\mu$ and variance $\sigma^{2}$ if its probability density function is given by
      $f(x) = \dfrac{1}{\sigma \sqrt{2\pi}} \, e^{-\frac{(x - \mu)^{2}}{2\sigma^{2}}}$
      It is denoted by $X \sim N(\mu, \sigma^{2})$
      where
      f(x) = density of the random variable at x
      $\pi$ = 3.14159; e = 2.71828
      $\sigma$ = population standard deviation ($\sigma > 0$)
      x = value of the random variable ($-\infty < x < \infty$)
      $\mu$ = population mean ($-\infty < \mu < \infty$)
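The density formula can be written out directly in R and compared with the built-in dnorm(). This is a sketch only: `normal_pdf` is a hypothetical helper, and the mean of 50 and standard deviation of 10 are assumed values chosen for illustration.

```r
# Hypothetical helper implementing the normal density formula
normal_pdf <- function(x, mu, sigma) {
  exp(-(x - mu)^2 / (2 * sigma^2)) / (sigma * sqrt(2 * pi))
}

mu <- 50                         # assumed mean
sigma <- 10                      # assumed standard deviation
x <- c(20, 40, 50, 60, 80)       # points at which to evaluate the density

normal_pdf(x, mu, sigma)
all.equal(normal_pdf(x, mu, sigma), dnorm(x, mean = mu, sd = sigma))
```

The density is largest at x = mu and takes equal values at points the same distance on either side of the mean.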

  24. Properties of Normal Distribution
      1. It is a continuous distribution.
      2. The normal distribution curve is bell-shaped.
      3. The mean, median, and mode are equal (Mean = Median = Mode = $\mu$) and located at the center of the distribution.
      4. The normal distribution curve is unimodal (single mode).
      5. The curve is symmetrical about the mean, i.e., each half of the distribution is a mirror image of the other half.
      6. It is asymptotic to the horizontal axis. That is, it does not touch the x-axis and it goes on forever in each direction.
      7. The random variable X can take any value from $-\infty$ to $\infty$.
      8. The total area under the normal distribution curve is equal to 1 or 100%, i.e.,
         $\int_{-\infty}^{\infty} \dfrac{1}{\sigma \sqrt{2\pi}} \, e^{-\frac{(x - \mu)^{2}}{2\sigma^{2}}} \, dx = 1$
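Several of these properties can be checked numerically. The sketch below assumes illustrative values (`mu <- 10`, `sigma <- 2`, and an offset `d <- 3`) that do not come from the slides, and uses the standard normal density for the area check.

```r
mu <- 10                         # assumed mean
sigma <- 2                       # assumed standard deviation
d <- 3                           # assumed distance from the mean

# Total area under the (standard) normal curve
area <- integrate(dnorm, lower = -Inf, upper = Inf)
area$value

# Symmetry about the mean: equal density at mu - d and mu + d
left  <- dnorm(mu - d, mean = mu, sd = sigma)
right <- dnorm(mu + d, mean = mu, sd = sigma)
c(left = left, right = right)

# Mean = median: half of the probability lies below mu
pnorm(mu, mean = mu, sd = sigma)
```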

  25. Standard Normal Distribution
      The simplest case of a normal distribution is known as the standard normal distribution. This is the special case when $\mu = 0$ and $\sigma = 1$, and it is described by the probability density function
      $\varphi(z) = \dfrac{1}{\sqrt{2\pi}} \, e^{-\frac{z^{2}}{2}}$
      If $X \sim N(\mu, \sigma^{2})$, let $Z = \dfrac{X - \mu}{\sigma}$ [Z-transformation]; then E(Z) = 0 and V(Z) = 1, i.e., $Z \sim N(0, 1)$. Z is said to have a standard normal distribution.
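The Z-transformation means that any normal probability can be computed from the standard normal CDF. A minimal sketch, assuming an illustrative mean of 544, standard deviation of 103, and cut-off of 480:

```r
mu <- 544                        # assumed mean
sigma <- 103                     # assumed standard deviation
x <- 480                         # assumed value of interest

z <- (x - mu) / sigma            # Z-transformation
z

pnorm(z)                         # P(Z <= z) under N(0, 1)
pnorm(x, mean = mu, sd = sigma)  # P(X <= x) under N(mu, sigma^2), same value
```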

  26. Probability Computations Related to Normal Distributions in R
      dnorm: density function of the normal distribution
      pnorm: cumulative distribution function of the normal distribution
      qnorm: quantile function of the normal distribution
      rnorm: random sampling from the normal distribution

  27. Normal probabilities using the dnorm() function in R
      The function dnorm returns the value of the probability density function (pdf) of the normal distribution at a given value x of the random variable, for a population mean $\mu$ and population standard deviation $\sigma$. The syntax for using dnorm is as follows:
      dnorm(x, mean, sd)
      Example: The GRE (Graduate Record Examinations) is widely used to help predict the performance of applicants to graduate schools. The range of possible scores on a GRE is 200 to 900. The psychology department at a university finds that the students in their department have scores with a mean of 544 and standard deviation of 103. Find the value of the density function at x = 550.
      > dnorm(550,544,103)
      [1] 0.00386666
      Manual verification: the value of the density function at x = 550 is
      $f(550) = \dfrac{1}{103 \sqrt{2\pi}} \, e^{-\frac{(550 - 544)^{2}}{2 \cdot 103^{2}}} \approx 0.003867$

  28. Normal Cumulative Distribution Function using the pnorm() function in R
      The function pnorm returns the value of the Cumulative Distribution Function (CDF) of the normal distribution at a given value q of the random variable, for a population mean $\mu$ and population standard deviation $\sigma$. The syntax for using pnorm is as follows:
      pnorm(q, mean, sd)
      By definition, pnorm(x) = $P(X \le x)$.
      Example: The GRE (Graduate Record Examinations) is widely used to help predict the performance of applicants to graduate schools. The range of possible scores on a GRE is 200 to 900. The psychology department at a university finds that the students in their department have scores with a mean of 544 and standard deviation of 103. Find the probability that a student in the psychology department has a score less than 480.
      We need to find the probability $P(X < 480)$. Because the normal distribution is continuous, this equals $P(X \le 480)$:
      > pnorm(480,544,103)
      [1] 0.2671816
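The same function also gives upper-tail and interval probabilities. In the sketch below, the score of 600 is a hypothetical second cut-off added for illustration; the mean and standard deviation are taken from the GRE example.

```r
mu <- 544                        # mean GRE score (from the example)
sigma <- 103                     # standard deviation (from the example)

p_below   <- pnorm(480, mu, sigma)                       # P(X < 480)
p_above   <- pnorm(600, mu, sigma, lower.tail = FALSE)   # P(X > 600), hypothetical cut-off
p_between <- pnorm(600, mu, sigma) - pnorm(480, mu, sigma)   # P(480 < X < 600)

c(below = p_below, between = p_between, above = p_above)
p_below + p_between + p_above    # the three pieces cover the whole range, so they sum to 1
```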

  29. Normal Distribution Quantiles using qnorm() in R
      The function qnorm returns the value of the inverse cumulative distribution function (inverse CDF) of the normal distribution for a given probability p, a population mean $\mu$ and population standard deviation $\sigma$. The syntax for using qnorm is as follows:
      qnorm(p, mean, sd)
      qnorm is the inverse function of pnorm.
      Example: Suppose that the heights of a certain population follow a normal distribution with a mean of 170 cm and a standard deviation of 5 cm. What is the height below which 90% of the population lies?
      > qnorm(0.9,170,5)
      [1] 176.4078
      So the height below which 90% of the population lies is approximately 176.41 cm.
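That qnorm inverts pnorm can be confirmed with a short round trip. The sketch below reuses the height example and also rebuilds the quantile from the standard normal quantile; the variable names are introduced only for illustration.

```r
mu <- 170                        # mean height in cm (from the example)
sigma <- 5                       # standard deviation in cm (from the example)

h <- qnorm(0.9, mean = mu, sd = sigma)   # 90th percentile of height
h

pnorm(h, mean = mu, sd = sigma)  # round trip returns 0.9

# Same quantile built from the standard normal: mu + sigma * z
mu + sigma * qnorm(0.9)
```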

  30. Simulating Normal random variables using the rnorm() function in R
      rnorm is a function in R that generates random numbers from a normal distribution.
      rnorm(n, mean, sd)
      This function generates n random numbers from a normal distribution with the given mean and sd. The only required argument is n, the number of normal variates to produce; if mean and sd are omitted, they default to 0 and 1, and rnorm generates values from the standard normal distribution.
      Example: Generate 10 random numbers from a normal distribution with a mean of 5 and a standard deviation of 2:
      > rnorm(10, 5, 2)
      [1] 7.448164 5.719628 5.801543 5.221365 3.888318 8.573826 5.995701 1.066766
      [9] 6.402712 4.054417
      Because the values are random, the output differs from run to run.
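A larger simulated sample gives a rough check that rnorm() draws from the intended distribution. This is a sketch under assumed settings: the seed 123 and the sample size of 10000 are arbitrary choices.

```r
set.seed(123)                    # arbitrary seed so the draws can be reproduced
draws <- rnorm(10000, mean = 5, sd = 2)

mean(draws)                      # close to the requested mean of 5
sd(draws)                        # close to the requested sd of 2

# Share of draws within one standard deviation of the mean
mean(abs(draws - 5) <= 2)        # roughly 0.68 for a normal distribution
```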
