Introduction to Descriptive Statistics in F.Y.B.Sc. Statistics Paper I Semester 1
Statistics is the science of numbers and data, involving populations and samples. Characteristics can be qualitative or quantitative, while variables can be discrete or continuous. Data can be primary or secondary, collected through various methods. Scales of measurement include the nominal, ordinal, interval, and ratio scales. This introductory study of descriptive statistics also covers sampling techniques, tabulation, and the association of attributes.
Presentation Transcript
F.Y.B.Sc. Statistics Paper I, Semester 1: Descriptive Statistics
Introduction: Statistics is the science of numbers and data.
Population: the collection of all units in the universe under study. A population can be finite or infinite. Example: all the students in a college or a class form a finite population; the collection of stars in the sky can be treated as an infinite population.
Sample: a subset of the population. For example, the students of the F.Y.B.Sc. class are a sample of the students of the college. Inferences about the population are drawn on the basis of the sample.
Sampling: the process of drawing units from a population.
Types of sampling: there are two ways of drawing a simple random sample.
1. Simple random sampling with replacement (SRSWR)
2. Simple random sampling without replacement (SRSWOR)
In SRSWR each drawn unit is returned before the next draw, so the number of units available remains n at every draw. In SRSWOR drawn units are not returned, so the number of units available decreases: n, n-1, n-2, and so on.
Types of characteristics: characteristics are qualitative or quantitative in nature. They are also termed attributes (non-numeric) and variables (numeric).
Example of attributes: colour of eyes, colour of hair, beauty, and intelligence are qualities possessed by individuals.
Example of variables: any numeric value taken by a random variable, such as the height or weight of individuals or the marks of students.
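The two sampling schemes described on this slide can be illustrated with a minimal Python sketch. The function names `srswr` and `srswor` and the population of ten numbered units are illustrative assumptions, not part of the original slides.

```python
import random

def srswr(population, k, seed=None):
    """Draw k units with replacement: every draw is made from all n units."""
    rng = random.Random(seed)
    return [rng.choice(population) for _ in range(k)]

def srswor(population, k, seed=None):
    """Draw k units without replacement: a drawn unit cannot be drawn again."""
    rng = random.Random(seed)
    return rng.sample(population, k)

units = list(range(1, 11))               # hypothetical population of n = 10 units
with_repl = srswr(units, 5, seed=1)      # may contain repeated units
without_repl = srswor(units, 5, seed=1)  # all units are distinct
print(with_repl)
print(without_repl)
```

Under SRSWOR the sample size cannot exceed n, whereas under SRSWR any number of draws is possible because the pool of units never shrinks.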
Variables are further classified as discrete and continuous random variables.
A variable that takes only whole-number (integer) values is said to be discrete. Ex. number of children in a family.
A variable that can take any value within an interval, including fractional or decimal values, is said to be continuous. Ex. temperature of a place, rainfall data.
Time series data and geographic data
Time series data relate to time (day/week/year). Ex. population over 10 years, university results for 2 years.
Geographic data relate to location. Ex. temperature or rainfall in four cities.
Types of data: primary and secondary data
Primary data: data collected first-hand by the investigator. Ex. through an interview or a questionnaire.
Secondary data: data not collected personally but taken from existing sources. Ex. data from journals, reports, magazines.
Methods of primary data collection
1. Direct personal observation
2. Indirect oral interview: information is obtained not directly from the source but by interviewing persons closely related to it. Used for sensitive issues such as wealth or illegal activities.
3. Mailed questionnaire
4. Schedule method: through enumerators
5. Local agents
6. Online survey
Methods of secondary data collection
1. Published sources: reports, publications, journals, magazines.
2. Unpublished sources: records kept by NGOs, unpublished research material.
Scales of measurement: a scale of measurement describes the nature of the information carried by the numbers assigned to a variable. The four scales of measurement are:
Nominal scale: applied to qualitative data that are classified into groups. Ex. sex, religion, eye colour.
Ordinal scale: applied to data that can be classified and ranked. Ex. socioeconomic status of families.
Interval scale: allows classification, ranking, and measurement of the differences between items, but has no absolute zero. Ex. temperature measured on the Celsius or Fahrenheit scale.
Ratio scale: has an absolute zero point, so values can be expressed as ratios. Ex. weight, height, distance.
Tabulation of data
The process of consolidating raw data is known as classification and tabulation.
Classification: arranging or organizing data according to their behaviour, nature, and characteristics. Ex. classification of students faculty-wise, sex-wise, category-wise, etc. Classification helps in identifying specific patterns in the data.
Data are classified according to attributes and variables. Classification by attributes is of two kinds:
a) Simple classification (dichotomy): the presence of an attribute is denoted by a capital letter and its absence by the corresponding small letter, e.g. $A$ and $\alpha$.
b) Multiple (manifold) classification.
For example, dividing a population into males and females is a dichotomous classification. Dividing the males further as employed/unemployed and by education level (HSC, UG, PG) is a manifold or multiple classification.
Tabulation: the process of condensing data for convenience in presentation, processing, and interpretation.
Components of a table
a. Table number and title
b. Stubs (row headings) and captions (column headings)
c. Body of the table
d. Footnote (symbols, abbreviations), source note, and units of measurement
Table Number 01: Workers in a factory according to sex and location (in thousands)

Location   Male   Female   Total
Urban        21       14      35
Rural        15        5      20
Total        36       19      55
Source: office records. Footnote: figures are rounded to the nearest thousand.
Merits and demerits of a table
1. A table simplifies complex data and facilitates comparison.
2. It makes graphical presentation easy, so that prominent features of the data are revealed.
3. The main demerits are that tabulation is time-consuming and that a table lacks description.
Tables can be simple or complex, general or specific, according to the nature of the data.
Attributes: classifying units into two disjoint classes according to the presence or absence of an attribute is dichotomous classification. Ex. $A$: literate; $\alpha$: illiterate.
Class frequencies: the different attributes, their subgroups, and their combinations are called classes, and the numbers of observations assigned to them are the class frequencies.
For $k$ attributes the total number of class frequencies is $3^k$. For $k = 2$ there are 9 class frequencies: $N$, $(A)$, $(\alpha)$, $(B)$, $(\beta)$, $(AB)$, $(A\beta)$, $(\alpha B)$, $(\alpha\beta)$.
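The count $3^k$ arises because each attribute can enter a class symbol in three ways: as present, as absent, or not at all. A minimal Python sketch that enumerates the classes follows; the function name `class_symbols` and the letter pairs passed to it are illustrative assumptions.

```python
from itertools import product

def class_symbols(attributes):
    """List every class symbol for a set of dichotomous attributes.

    attributes: list of (present, absent) letter pairs,
    e.g. [("A", "α"), ("B", "β")].
    """
    symbols = []
    # Each attribute contributes its capital letter, its Greek letter,
    # or nothing at all: three options per attribute, hence 3^k classes.
    for choice in product(*[(p, a, "") for p, a in attributes]):
        name = "".join(choice)
        # The class that specifies no attribute at all is the total N.
        symbols.append("N" if name == "" else f"({name})")
    return symbols

two = class_symbols([("A", "α"), ("B", "β")])
three = class_symbols([("A", "α"), ("B", "β"), ("C", "γ")])
print(len(two), two)
print(len(three))
```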
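The contingency table of order (2 × 2) for two attributes A and B can be displayed as given below.
$$
\begin{array}{c|cc|c}
 & A & \alpha & \text{Total} \\
\hline
B & (AB) & (\alpha B) & (B) \\
\beta & (A\beta) & (\alpha\beta) & (\beta) \\
\hline
\text{Total} & (A) & (\alpha) & N
\end{array}
$$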
Relationship between the class frequencies: the frequency of a lower-order class can always be expressed in terms of higher-order class frequencies, i.e.
$$N = (A) + (\alpha) = (B) + (\beta)$$
$$(A) = (AB) + (A\beta), \qquad (\alpha) = (\alpha B) + (\alpha\beta)$$
$$(B) = (AB) + (\alpha B), \qquad (\beta) = (A\beta) + (\alpha\beta)$$
Consistency of data: a set of class frequencies is consistent if
1. all class frequencies are non-negative, and
2. all the relations between the class frequencies are satisfied.
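As a hedged illustration of these relations, the sketch below derives the remaining class frequencies of a 2 × 2 table from $N$, $(A)$, $(B)$, and $(AB)$. The function name `two_by_two` is hypothetical, and reading the factory-workers table with A = male and B = urban is an assumption made only for this example.

```python
def two_by_two(N, A, B, AB):
    """Derive the remaining class frequencies from N, (A), (B) and (AB)."""
    alpha = N - A                  # (α)  = N - (A)
    beta = N - B                   # (β)  = N - (B)
    A_beta = A - AB                # (Aβ) = (A) - (AB)
    alpha_B = B - AB               # (αB) = (B) - (AB)
    alpha_beta = alpha - alpha_B   # (αβ) = (α) - (αB)
    return {
        "(α)": alpha,
        "(β)": beta,
        "(Aβ)": A_beta,
        "(αB)": alpha_B,
        "(αβ)": alpha_beta,
    }

# Factory-workers table (in thousands), taking A = male and B = urban:
# N = 55, (A) = 36, (B) = 35, (AB) = 21.
cells = two_by_two(N=55, A=36, B=35, AB=21)
print(cells)
```

The derived values reproduce the other entries of that table: 19 females, 20 rural workers, 15 rural males, 14 urban females, and 5 rural females.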
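Conditions for consistency
When we have one attribute:
1. $(A) \ge 0$
2. $(\alpha) \ge 0$, i.e. $(A) \le N$
Hence $0 \le (A) \le N$.
When we have two attributes:
1. $(AB) \ge 0$
2. $(\alpha B) \ge 0$, i.e. $(AB) \le (B)$
3. $(A\beta) \ge 0$, i.e. $(AB) \le (A)$
4. $(\alpha\beta) \ge 0$, i.e. $(AB) \ge (A) + (B) - N$
Hence
$$\max\{0,\ (A) + (B) - N\} \le (AB) \le \min\{(A),\ (B)\}$$
When we have three attributes:
1. $(ABC) \ge 0$
2. $(A\beta C) \ge 0$, i.e. $(ABC) \le (AC)$
3. $(AB\gamma) \ge 0$, i.e. $(ABC) \le (AB)$
4. $(A\beta\gamma) \ge 0$, i.e. $(ABC) \ge (AB) + (AC) - (A)$
5. $(\alpha BC) \ge 0$, i.e. $(ABC) \le (BC)$
6. $(\alpha B\gamma) \ge 0$, i.e. $(ABC) \ge (BC) + (AB) - (B)$
7. $(\alpha\beta C) \ge 0$, i.e. $(ABC) \ge (AC) + (BC) - (C)$
8. $(\alpha\beta\gamma) \ge 0$, i.e. $(ABC) \le (AB) + (BC) + (AC) - (A) - (B) - (C) + N$
Hence
$$\max\{0,\ (AB) + (AC) - (A),\ (BC) + (AB) - (B),\ (AC) + (BC) - (C)\} \le (ABC) \le \min\{(AB),\ (BC),\ (AC),\ (AB) + (BC) + (AC) - (A) - (B) - (C) + N\}$$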
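The bounds for two and three attributes translate directly into a small consistency check. The following is a minimal Python sketch; the function names `ab_bounds`, `abc_bounds`, and `is_consistent`, and the three-attribute figures, are illustrative assumptions rather than anything given in the slides.

```python
def ab_bounds(N, A, B):
    """Limits for (AB): max{0, (A)+(B)-N} <= (AB) <= min{(A), (B)}."""
    return max(0, A + B - N), min(A, B)

def abc_bounds(N, A, B, C, AB, AC, BC):
    """Limits for (ABC) implied by the eight non-negativity conditions."""
    lower = max(0, AB + AC - A, BC + AB - B, AC + BC - C)
    upper = min(AB, BC, AC, AB + BC + AC - A - B - C + N)
    return lower, upper

def is_consistent(N, A, B, AB):
    """True if N, (A), (B), (AB) can all occur in one 2 x 2 table."""
    lower, upper = ab_bounds(N, A, B)
    return 0 <= A <= N and 0 <= B <= N and lower <= AB <= upper

# With N = 1000, (A) = 600, (B) = 500 the limits for (AB) are 100 and 500,
# so (AB) = 50 is inconsistent while (AB) = 300 is admissible.
print(ab_bounds(1000, 600, 500))
print(is_consistent(1000, 600, 500, 50))
print(is_consistent(1000, 600, 500, 300))

# Hypothetical three-attribute figures, chosen only for illustration.
print(abc_bounds(N=100, A=50, B=50, C=50, AB=25, AC=25, BC=25))
```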
Ex. Test the consistency of the following data, with the symbols having their usual meaning.
i) $N = 1000$, $(A) = 600$, $(B) = 500$, $(AB) = 50$
ii) $(A) = 48$, $(AB) = 50$
Solution:
i) $(\alpha\beta) = N - (A) - (B) + (AB) = 1000 - 600 - 500 + 50 = -50$, which is negative; equivalently, $(AB) = 50$ is less than $(A) + (B) - N = 100$. Hence the given data are inconsistent.
ii) Here $(AB) > (A)$, but consistency requires $(AB) \le (A)$; hence the data are inconsistent.
Independence of attributes: two attributes are said to be independent if the presence or absence of one does not affect the presence or absence of the other. For example, the attributes skin colour and intelligence of persons are independent. If two attributes A and B are independent, the actual frequency equals the expected frequency: $(AB) = (A)(B)/N$. Similarly, $(\alpha\beta) = (\alpha)(\beta)/N$.
Association of attributes: two attributes A and B are said to be associated if they are not independent but are related to each other in some way. They are positively associated if $(AB) > (A)(B)/N$ and negatively associated if $(AB) < (A)(B)/N$.
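The comparison of the observed frequency $(AB)$ with the expected frequency $(A)(B)/N$ can be sketched in a few lines of Python. The function name `association` is hypothetical, the first set of figures reads the factory-workers table with A = male and B = urban (an assumption), and the second set is invented purely for illustration.

```python
from fractions import Fraction

def association(N, A, B, AB):
    """Compare (AB) with the expected frequency (A)(B)/N under independence."""
    expected = Fraction(A * B, N)  # exact arithmetic avoids rounding issues
    if AB > expected:
        return "positive"
    if AB < expected:
        return "negative"
    return "independent"

# Factory-workers table: N = 55, (A) = 36, (B) = 35, (AB) = 21.
# Expected frequency is 36 * 35 / 55, about 22.9, which exceeds 21.
print(association(55, 36, 35, 21))

# Hypothetical figures where (AB) equals (A)(B)/N exactly.
print(association(100, 50, 40, 20))
```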
If A always occurs with B and never with $\beta$, or B always occurs with A, i.e. $(AB) = (A)$ or $(AB) = (B)$, then A and B are said to be completely associated. If A never occurs with B, i.e. $(AB) = 0$, they are said to be completely dissociated.
Yule's coefficient of association: Prof. G. Udny Yule suggested a formula to measure the degree of association. It is a relative measure of association between two attributes A and B. If $(AB)$, $(\alpha B)$, $(A\beta)$, and $(\alpha\beta)$ are the frequencies of the four distinct combinations of A, B, $\alpha$, and $\beta$, then Yule's coefficient of association is
$$Q = \frac{(AB)(\alpha\beta) - (A\beta)(\alpha B)}{(AB)(\alpha\beta) + (A\beta)(\alpha B)}$$
and it lies between -1 and +1.
If $Q = +1$ there is perfect positive association.
If $Q = -1$ there is perfect negative association.
If $Q = 0$ then A and B are independent.
Yule's coefficient of colligation:
$$Y = \frac{\sqrt{(AB)(\alpha\beta)} - \sqrt{(A\beta)(\alpha B)}}{\sqrt{(AB)(\alpha\beta)} + \sqrt{(A\beta)(\alpha B)}}$$
The relation between $Q$ and $Y$ is
$$Q = \frac{2Y}{1 + Y^2}$$
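Both coefficients, and the relation between them, can be checked numerically. The sketch below is a minimal illustration; the function names `yule_q` and `yule_y` are hypothetical, and the cell frequencies come from reading the factory-workers table with A = male and B = urban, which is an assumption made for this example.

```python
import math

def yule_q(AB, A_beta, alpha_B, alpha_beta):
    """Yule's coefficient of association Q (undefined if the denominator is 0)."""
    num = AB * alpha_beta - A_beta * alpha_B
    den = AB * alpha_beta + A_beta * alpha_B
    return num / den

def yule_y(AB, A_beta, alpha_B, alpha_beta):
    """Yule's coefficient of colligation Y (undefined if the denominator is 0)."""
    r1 = math.sqrt(AB * alpha_beta)
    r2 = math.sqrt(A_beta * alpha_B)
    return (r1 - r2) / (r1 + r2)

# (AB) = 21, (Aβ) = 15, (αB) = 14, (αβ) = 5
q = yule_q(21, 15, 14, 5)   # (105 - 210) / (105 + 210) = -1/3
y = yule_y(21, 15, 14, 5)
print(q, y)
print(math.isclose(q, 2 * y / (1 + y * y)))  # checks Q = 2Y / (1 + Y^2)
```

For these figures Q is negative, which agrees with the criterion that $(AB)$ is smaller than $(A)(B)/N$.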