Understanding Descriptive Statistics: A Comprehensive Overview

 
Chapter 2
 
Organizing and Visualizing Data
 
9/10/2024
 
Towson University - J. Jung
 
2.1
Chapters
1.    Introduction
2.    Graphs
3.    Descriptive statistics
4.    Basic probability
5.    Discrete distributions
6.    Continuous distributions
7.    Central limit theorem
8.    Estimation
9.    Hypothesis testing
10.  Two-sample tests
13.  Linear regression
14.  Multivariate regression
 
Introduction & Re-cap…
 
Descriptive statistics involves arranging, summarizing, and presenting a set of data in such a way that useful information is produced.

Its methods make use of graphical techniques and numerical descriptive measures (such as averages) to summarize and present the data.

Data -> Statistics -> Information
 
Populations & Samples
 
The graphical & tabular methods presented here apply to both entire populations and samples drawn from populations.

[Diagram: a sample is a subset of the population.]
 
Definitions
 
A variable is some characteristic of a population or sample. Typically denoted with a capital letter: X, Y, Z…
  E.g. student grades: X = {B, A-, C, A, B, …}

The values of the variable are the range of possible values for a variable.
  E.g. student marks (0..100)

Data are the observed values of a variable.
  E.g. student marks: {67, 74, 71, 83, 93, 55, 48}
 
Types of Data & Information

Data (at least for purposes of statistics) fall into three main groups:

Quantitative data:
  (1) Numerical (interval) data: discrete data, continuous data

Qualitative or categorical data:
  (2) Ordinal data
  (3) Nominal data
 
 
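Example: Types of Data

Person Nr.  age  income in $  student year  major       weight in pounds
1           19   1000         freshman      econ        170
2           20   0            sophomore     finance     120
3           23   30000        junior        finance     147
4           20   2000         senior        accounting  160

To count the number of observations per category, use Excel:
=COUNTIF(range, "sophomore")

category    frequency  relative frequency
econ        1          0.25
finance     2          0.5
accounting  1          0.25
sum         4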
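For readers working outside Excel, the same kind of count can be sketched in Python. This is only an illustration and not part of the original slides: the helper name `count_if` is hypothetical, and the two lists copy the student-year and major columns of the four-person example (freshman, sophomore, junior, senior; econ, finance, finance, accounting).

```python
from collections import Counter

# Columns copied from the four-person example.
student_year = ["freshman", "sophomore", "junior", "senior"]
major = ["econ", "finance", "finance", "accounting"]


def count_if(values, target):
    """Count the entries equal to target (an exact-match COUNTIF)."""
    return sum(1 for value in values if value == target)


# One category at a time, as with COUNTIF.
sophomores = count_if(student_year, "sophomore")

# All categories at once: Counter maps each category to its count.
major_counts = Counter(major)

print(sophomores)
print(dict(major_counts))
```

`Counter` is convenient when every category is needed, while `count_if` mirrors the one-category-per-formula style of the spreadsheet.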
 
Types of Data

Can you do math with the data?
  Yes: (1) Numerical data (discrete or continuous)
  No: Categorical data. Is it ordered?
    Yes: (2) Ordinal data
    No: (3) Nominal data
 
1 Interval data
 
Real numbers, e.g. heights, weights, prices, etc.
Also referred to as quantitative or numerical.
Arithmetic operations can be performed on interval data, so it is meaningful to talk about 2*Height, or Price + $1, and so on.
Discrete data: gaps exist between possible values, e.g. # of children in a family
Continuous data: no gaps exist between possible values, e.g. annual income of a family
 
2 Ordinal Data…

Ordinal data appear to be categorical in nature, but their values have an order; a ranking to them:

E.g. College course rating system:
poor = 1, fair = 2, good = 3, very good = 4, excellent = 5

While it's still not meaningful to do arithmetic on this data (e.g. does 2*fair = very good?!), we can say things like:
excellent > poor   or   fair < very good

That is, order is maintained no matter what numeric values are assigned to each category.
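The point about order versus arithmetic can be checked with a small sketch. The first coding is the 1 to 5 scale from the rating example; the second coding (10, 20, 50, 70, 100) is a made-up alternative introduced here purely for illustration, as is the helper name `same_ranking`.

```python
ratings = ["poor", "fair", "good", "very good", "excellent"]

# Coding from the rating example: poor = 1, ..., excellent = 5.
coding_a = {rating: rank for rank, rating in enumerate(ratings, start=1)}

# Hypothetical alternative coding that keeps the same order.
coding_b = {"poor": 10, "fair": 20, "good": 50, "very good": 70, "excellent": 100}


def same_ranking(first, second):
    """True if both codings sort the categories into the same order."""
    return sorted(first, key=first.get) == sorted(second, key=second.get)


# Order comparisons agree under both codings.
order_preserved = same_ranking(coding_a, coding_b)

# Arithmetic does not: "2 * fair = very good" depends on the arbitrary codes.
holds_under_a = 2 * coding_a["fair"] == coding_a["very good"]
holds_under_b = 2 * coding_b["fair"] == coding_b["very good"]

print(order_preserved, holds_under_a, holds_under_b)
```

Because the arithmetic statement is true under one coding and false under the other, it says something about the codes rather than about the ratings themselves.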
 
3 Nominal Data…

The values of nominal data are categories.
  E.g. responses to questions about marital status:
  Single = 1, Married = 2, Divorced = 3, Widowed = 4

Because the numbers are arbitrary, arithmetic operations don't make any sense (e.g. does Widowed ÷ 2 = Married?!)

Only counts of the number of items in a category are allowed.

More examples: gender, religious preference, etc.
 
 
Hierarchy of Data…
 
1 Interval
 
Values are real numbers.
 
All calculations are valid.
 
Data may be treated as ordinal or nominal.
 
2 Ordinal
 
Values must represent the ranked order of the data.
 
Calculations based on an ordering process are valid.
 
Data may be treated as nominal but not as interval.
 
3 Nominal
 
Values are the arbitrary numbers that represent categories.
 
Only calculations based on the frequencies of occurrence are valid.
 
Data may not be treated as ordinal or interval.
 
 
Graphical & Tabular Techniques for Nominal Data…

The only allowable calculation on nominal data is to count the frequency of each value of the variable.
We can summarize the data in a table, called a frequency distribution, that presents the categories and their counts.
A relative frequency distribution lists the categories and the proportion with which each occurs.
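A frequency distribution and its relative counterpart can be sketched in a few lines of Python. The function name `relative_frequency_distribution` is hypothetical, and the input reuses the four majors from the earlier example (econ, finance, finance, accounting); treat this as one possible way to build the table, not as the method used in the course.

```python
from collections import Counter


def relative_frequency_distribution(observations):
    """Map each category to (frequency, relative frequency)."""
    counts = Counter(observations)
    total = sum(counts.values())
    return {category: (count, count / total) for category, count in counts.items()}


majors = ["econ", "finance", "finance", "accounting"]
table = relative_frequency_distribution(majors)

for category, (frequency, proportion) in table.items():
    print(f"{category:<12}{frequency:>3}{proportion:>8.2f}")
```

The relative frequencies always add up to 1, which is a quick sanity check on any table built this way.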
 
Nominal Data (Tabular Summary)

Nominal Data (Frequency)

Bar charts are often used to display absolute frequencies…

Nominal Data (Relative Frequency)

Pie charts show relative frequencies…

Nominal Data

It's all the same information (based on the same data), just a different presentation.
 
Graphical Techniques for Interval Data
 
There are several graphical methods that are used when the data are interval (i.e. numeric, non-categorical).
The most important of these graphical methods is the histogram.
The histogram is not only a powerful graphical technique used to summarize interval data, but it is also used to help explain probabilities.
 
Building a Histogram…
 
1) Create a frequency distribution for the data…

How?

a) Determine the number of classes to use.

b) Determine how large to make each class.

c) Place the data into each class…
   classes are mutually exclusive and collectively exhaustive;
   each item can only belong to one class;
   classes contain observations greater than or equal to their lower limits and less than their upper limits -> [lower limit, upper limit)
   class limits; class mark; class interval
 
Example: Histogram
 
As part of a larger study, a long-distance company
wanted to acquire information about the monthly
bills of new subscribers in the first month after
signing with the company.
The company’s marketing manager conducted a
survey of 200 new residential subscribers
wherein the first month’s bills were recorded.
The general manager planned to present his
findings to senior executives.
What information can be extracted from these
data?
 
 
Building a Histogram
 
1. Collect the data
2. Create a frequency distribution for the data…
   How?
   Determine the number of classes to use… How?

Refer to table 2.6: with 200 observations, we should have between 7 & 10 classes…

Alternatively, we could use Sturges' formula:
Number of class intervals = 1 + 3.3 log10(n)
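As a quick check of Sturges' formula for the 200 observations in this example, the calculation can be sketched as follows. The logarithm is taken to base 10, and the function name `sturges_class_count` is a hypothetical label chosen for this illustration.

```python
import math


def sturges_class_count(n):
    """Suggested number of class intervals: 1 + 3.3 * log10(n)."""
    return 1 + 3.3 * math.log10(n)


suggested = sturges_class_count(200)
print(f"{suggested:.2f}")  # roughly 8.6
```

The result falls inside the 7 to 10 range suggested for 200 observations, so rounding to either 8 or 9 classes would be defensible.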
 
Histogram
 
1) Collect the data
2) Create a frequency distribution for the data…

How?

a) Determine the number of classes to use. [8]

b) Determine how large to make each class…

How?
Look at the range of the data, that is:
Range = Largest Observation - Smallest Observation
Range = $119.63 - $0 = $119.63

Then each class width becomes:
Range ÷ (# classes) = 119.63 ÷ 8 ≈ 15
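The class-width arithmetic can be reproduced with a short sketch. Rounding the raw width up to a whole number is an assumption made here so that eight classes are guaranteed to cover the largest observation; the function name `raw_class_width` is hypothetical.

```python
import math


def raw_class_width(largest, smallest, n_classes):
    """Range of the data divided by the number of classes."""
    return (largest - smallest) / n_classes


raw = raw_class_width(119.63, 0.0, 8)   # a little under 15
width = math.ceil(raw)                  # round up to a convenient whole number

# With this width, the last class ends at or beyond the largest bill.
covers_all_data = 0.0 + width * 8 >= 119.63

print(raw, width, covers_all_data)
```

Rounding up rather than to the nearest value matters in general: a width rounded down could leave the largest observations outside the last class.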
 
Example: Histogram

In the previous example we created a frequency distribution of the 5 categories.
In this example we also create a frequency distribution, by counting the number of observations that fall into a series of intervals, called classes.
We have chosen eight classes, defined in such a way that each observation falls into one and only one class.
 
 
Example: Histogram

Classes
1. Amounts that are less than 15; [0, 15)
2. Amounts that are greater than or equal to 15 but less than 30; [15, 30)
3. Amounts that are greater than or equal to 30 but less than 45; [30, 45)
4. Amounts that are greater than or equal to 45 but less than 60; [45, 60)
5. Amounts that are greater than or equal to 60 but less than 75; [60, 75)
6. Amounts that are greater than or equal to 75 but less than 90; [75, 90)
7. Amounts that are greater than or equal to 90 but less than 105; [90, 105)
8. Amounts that are greater than or equal to 105 but less than 120; [105, 120)
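The half-open class rule, lower limit included and upper limit excluded, can be sketched as a small function. The name `class_number` and its default parameters (lower limit 0, width 15, eight classes) are assumptions that simply mirror the class list above.

```python
def class_number(amount, lower=0.0, width=15.0, n_classes=8):
    """Return the 1-based class for amount, using [lower limit, upper limit) classes."""
    upper = lower + width * n_classes
    if amount < lower or amount >= upper:
        raise ValueError(f"{amount} is outside [{lower}, {upper})")
    # Floor division counts how many whole class widths lie below the amount.
    return int((amount - lower) // width) + 1


# A bill of exactly 15 belongs to class 2, not class 1.
print(class_number(14.99), class_number(15), class_number(119.63))
```

Because every amount maps to exactly one class number, the classes are mutually exclusive and collectively exhaustive over the interval from 0 up to, but not including, 120.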
 
Example: Histogram

[Figure: histogram of the 200 bills. Horizontal axis: Bills, labeled with the class upper limits 15, 30, 45, 60, 75, 90, 105, 120. Vertical axis: Frequency, from 0 to 80.]

Interpretation

About half (71+37=108) of the bills are "small", i.e. less than $30.

There are only a few telephone bills in the middle range.

(18+28+14=60)÷200 = 30%, i.e. nearly a third of the phone bills are $75 or more.
 
Shapes of Histograms…

Symmetry
A histogram is said to be symmetric if, when we draw a vertical line down the center of the histogram, the two sides are identical in shape and size:

[Figure: three examples of symmetric histograms. Horizontal axis: Variable. Vertical axis: Frequency.]
 
Shapes of Histograms…
 
Skewness
A skewed histogram is one with a long tail extending to either the right or the left:

[Figure: two histograms, one Positively Skewed (Right Skewed) and one Negatively Skewed (Left Skewed). Horizontal axis: Variable. Vertical axis: Frequency.]
 
Shapes of Histograms…
 
Modality
A unimodal histogram is one with a single peak, while a bimodal histogram is one with two peaks:

[Figure: two histograms, one Unimodal and one Bimodal. Horizontal axis: Variable. Vertical axis: Frequency.]

A modal class is the class with the largest number of observations.
 
Shapes of Histograms…
 
Bell Shape
A special type of symmetric unimodal histogram is one that is bell shaped:

[Figure: Bell Shaped histogram. Horizontal axis: Variable. Vertical axis: Frequency.]

Many statistical techniques require that the population be bell shaped.
Drawing the histogram helps verify the shape of the population in question.
 
Histogram Comparison
 
The two courses, Business Statistics and Mathematical Statistics, have very different histograms…

unimodal vs. bimodal

spread of the marks (narrower | wider)
 
Frequency Polygon
 
It is a line version of the histogram.
It is plotted using class midpoints as X values and frequencies as Y values.
Refer to Lab Manual Chapter 2!
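The points of a frequency polygon for the telephone-bill classes can be sketched as follows. The class counts 71, 37, 18, 28 and 14 appear in the slides; the three middle counts (13, 9, 10) are not stated there and are assumed here only so that the total comes to 200, and placing 18, 28 and 14 in the last three classes is likewise an inference. The function name `class_midpoints` is hypothetical.

```python
def class_midpoints(lower, width, n_classes):
    """Midpoint of each class [lower + k*width, lower + (k+1)*width)."""
    return [lower + (k + 0.5) * width for k in range(n_classes)]


# Counts per class; the middle three values (13, 9, 10) are ASSUMED for illustration.
frequencies = [71, 37, 13, 9, 10, 18, 28, 14]

midpoints = class_midpoints(0.0, 15.0, 8)

# Each (midpoint, frequency) pair is one vertex of the polygon.
polygon_points = list(zip(midpoints, frequencies))
print(polygon_points)
```

Joining these points with straight line segments, for example with a line chart in a spreadsheet, gives the frequency polygon.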
 
Ogive…

An ogive (pronounced "Oh-jive") is a graph of a cumulative frequency distribution.
We create an ogive in three steps…
1) First, from the frequency distribution created earlier, calculate relative frequencies:
   Relative Frequency = (# of observations in a class) ÷ (Total # of observations)
2) Calculate cumulative relative frequencies by adding the current class' relative frequency to the previous class' cumulative relative frequency.
   (For the first class, its cumulative relative frequency is just its relative frequency.)
 
 
Cumulative Relative Frequencies…

first class…
next class: .355+.185=.540
:
:
last class: .930+.070=1.00

Ogive…

Is a graph of a cumulative frequency distribution.
1) Calculate relative frequencies.
2) Calculate cumulative relative frequencies.
3) Graph the cumulative relative frequencies…
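The first two ogive steps can be sketched with a running sum. As a loud caveat, only the counts 71, 37, 18, 28 and 14 come from the slides; the middle three counts (13, 9, 10) are assumed for illustration so that the total is 200 and the cumulative value before the last class is .930.

```python
from itertools import accumulate

# Counts per class; the middle three values (13, 9, 10) are ASSUMED for illustration.
frequencies = [71, 37, 13, 9, 10, 18, 28, 14]
total = sum(frequencies)

# Step 1: relative frequency of each class.
relative = [count / total for count in frequencies]

# Step 2: running sum of the relative frequencies.
cumulative = list(accumulate(relative))

for rel, cum in zip(relative, cumulative):
    print(f"{rel:.3f}  {cum:.3f}")
```

The last cumulative value should be 1 (up to floating-point rounding), which is a useful check that no class was skipped. Step 3 plots each cumulative value against the upper limit of its class.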
 
Ogive…

The ogive can be used to answer questions like:

What telephone bill value is at the 50th percentile?

Just under $30: about 54% of the bills are less than $30.

(Refer also to Fig. 2.13 in your textbook)
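Reading a percentile off an ogive amounts to interpolating along the line segment that crosses the desired cumulative value. The sketch below uses only the first ogive points implied by the slides (cumulative relative frequency .355 at $15 and .540 at $30) plus the conventional starting point of 0 at the lower limit of the first class; the function name `percentile_from_ogive` is hypothetical, and a value read by eye from a printed chart may differ somewhat.

```python
def percentile_from_ogive(points, p):
    """Linearly interpolate the value whose cumulative relative frequency is p.

    points is a list of (class limit, cumulative relative frequency) pairs
    in increasing order.
    """
    for (x0, c0), (x1, c1) in zip(points, points[1:]):
        if c0 <= p <= c1:
            if c1 == c0:
                return x0
            return x0 + (p - c0) / (c1 - c0) * (x1 - x0)
    raise ValueError("p is outside the range covered by the points")


# (class limit in $, cumulative relative frequency)
ogive_points = [(0.0, 0.0), (15.0, 0.355), (30.0, 0.540)]

median_bill = percentile_from_ogive(ogive_points, 0.50)
print(f"{median_bill:.2f}")
```

Under straight-line interpolation the 50th percentile lands between $15 and $30, because the cumulative relative frequency passes 0.5 inside the second class.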
 
One Nominal Variable
 
Bar or Column Chart:
  X axis: category labels
  Y axis: absolute frequencies

Pie Chart: relative frequency

Pareto Diagram: a special type of column chart in which categories are ordered from left to right, largest frequency to smallest
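The ordering behind a Pareto diagram is just a sort by frequency, largest first. The sketch below reuses the major counts from the earlier four-person example (econ 1, finance 2, accounting 1); the function name `pareto_order` is hypothetical.

```python
def pareto_order(counts):
    """Return (category, frequency) pairs sorted from largest to smallest frequency."""
    return sorted(counts.items(), key=lambda item: item[1], reverse=True)


major_counts = {"econ": 1, "finance": 2, "accounting": 1}
ordered = pareto_order(major_counts)

# Left-to-right order of the columns in the Pareto diagram.
print([category for category, _ in ordered])
```

Python's sort is stable, so categories with equal frequencies keep their original relative order.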
 
 
Graphing the Relationship Between Two Interval Variables

How are two interval variables related? We employ a scatter plot, which plots two variables against one another.

Example 2.9 A real estate agent wanted to know to what extent the selling price of a home is related to its size…

1) Collect the data
2) Determine the independent variable (X: house size) and the dependent variable (Y: selling price)
3) Use Excel to create a "scatter plot"…
 
Patterns of Scatter Plots…

Linearity and direction are two concepts we are interested in.

[Figure: three scatter plots: Positive Linear Relationship, Negative Linear Relationship, Weak or Non-Linear Relationship.]
 
Time Series Data…
 
Observations measured at the same point in time are called cross-sectional data.

Observations measured at successive points in time are called time-series data.

Time-series data are graphed on a line chart, which plots the value of the variable on the vertical axis against the time periods on the horizontal axis.
 
Line Chart…

From '87 to '92, the tax was fairly flat. Starting in '93, there was a rapid increase in taxes until 2001. Finally, there was a downturn in 2002.
 
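Appendix

Summation Notation

$$\sum_{i=a}^{b} X_i$$

where a and b are integers satisfying $b \ge a$; a is the starting value for i, b is the ending value.
The above notation is to sum up $X_a$ to $X_b$.

That is, for variable X where X has values

$$X = X_a, X_{a+1}, \ldots, X_{b-1}, X_b$$

The sum of all the values of X can be written as

$$\sum_{i=a}^{b} X_i$$

If all the values for X's are given, we can get the value for

$$\sum_{i=a}^{b} X_i = X_a + X_{a+1} + X_{a+2} + X_{a+3} + \cdots + X_{b-1} + X_b$$

Example:
   Suppose X1=2, X2=0, X3=2, X4=5, and X5=1

$$\sum_{i=1}^{5} X_i = X_1 + X_2 + X_3 + X_4 + X_5 = 2 + 0 + 2 + 5 + 1 = 10$$

$$\sum_{i=1}^{5} \frac{X_i}{5} = \frac{X_1}{5} + \frac{X_2}{5} + \frac{X_3}{5} + \frac{X_4}{5} + \frac{X_5}{5} = \frac{2}{5} + \frac{0}{5} + \frac{2}{5} + \frac{5}{5} + \frac{1}{5} = 2$$

$$\sum_{i=1}^{5} X_i^2 = 2^2 + 0^2 + 2^2 + 5^2 + 1^2 = 4 + 0 + 4 + 25 + 1 = 34$$

In general,

$$\sum_{i=a}^{b} X_i^2 \ne \left( \sum_{i=a}^{b} X_i \right)^2$$

Example:

$$\sum_{i=1}^{5} X_i^2 = 34, \quad \text{while} \quad \left( \sum_{i=1}^{5} X_i \right)^2 = 10^2 = 100$$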
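The summation examples can be verified directly with Python's built-in `sum`. This is a small sketch using the five values from the example (2, 0, 2, 5, 1); the variable names are chosen here for illustration.

```python
x = [2, 0, 2, 5, 1]   # X1..X5 from the example

sum_x = sum(x)                             # X1 + X2 + ... + X5
sum_x_over_5 = sum(xi / 5 for xi in x)     # sum of Xi / 5, equal to 2 up to rounding
sum_of_squares = sum(xi ** 2 for xi in x)  # square each value, then add
square_of_sum = sum(x) ** 2                # add the values, then square

print(sum_x, sum_x_over_5, sum_of_squares, square_of_sum)
```

The last two quantities differ because squaring before adding and adding before squaring are different operations, which is why the two notations should not be confused.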
 
 
 
 
 
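Summary II…

Single set of data:
  Interval data: Histogram, Ogive, Frequency Polygon, Stem-and-Leaf
  Nominal data: Pie Charts, Column/Bar Chart, Pareto Diagram

Relationship between two variables:
  Interval data: Scatter Plot
  Nominal data: Contingency Table, Bar Charts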
 
Review: Chapter 2 - Graphs
 
What is categorical data?
What is numeric/interval data?
What graphs can you make for ordinal data?
What graphs can you make for interval data?
What are the steps involved in making a bar
chart in Excel?
How do you make a histogram in Excel?
 
 

Descriptive statistics involve organizing and presenting data to extract valuable insights. This overview covers key concepts like populations, samples, variable definitions, types of data, and methods for summarizing information. It also touches on organizing and visualizing data using graphical techniques and numerical measures. Dive into this comprehensive guide to enhance your statistical knowledge.


Uploaded on Sep 10, 2024










