Understanding Dominance Rule in Data Analysis
The dominance rule in data analysis addresses the risk that a dominant observation, one that accounts for a majority (or a large part) of a measure, can be identified or have information attributed to it. Concentration ratios, and related measures such as Herfindahl indices, can reveal dominant players in a sector. Dramatic changes in time series data may also indicate dominance. The Safe Analyst Training - Dominance Rule module, created by Cancer Research UK, DKFZ, and The Health Foundation, covers how to apply these concepts in safe data analysis.
Dominance
If a single observation accounts for a majority (or a large part) of a measure, that observation can be considered dominant. Dominant observations are easier to identify, and it is easier to attribute information to them. To meet the dominance rule, no single observation should account for more than x% of a measure.
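The rule above can be sketched as a simple share check. This is a minimal illustration, not the module's own tooling: the function names are hypothetical, and the default threshold of 0.4 is an assumption borrowed from the 40% figure mentioned in the concentration ratios slide, since the rule itself leaves x% unspecified.

```python
def dominant_observations(values, threshold=0.4):
    """Return (index, share) pairs for observations whose share of the
    total exceeds `threshold`.

    `threshold` stands in for the unspecified x% in the rule; 0.4 is an
    assumed default, not a prescribed value.
    """
    total = sum(values)
    if total <= 0:
        raise ValueError("total of the measure must be positive")
    return [
        (i, v / total)
        for i, v in enumerate(values)
        if v / total > threshold
    ]


def passes_dominance_rule(values, threshold=0.4):
    """True when no single observation exceeds the threshold share."""
    return not dominant_observations(values, threshold)
```

For instance, four equal observations each hold a 25% share and pass at a 40% threshold, whereas a set in which one observation holds 90% of the total does not.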
Dominance example

Shop                     Annual turnover
Grocery Superstore             8,500,000
Hairdressers                     110,000
Pet Shop                         130,000
Nail Salon                        85,000
Antique Shop                     100,000
Café                             160,000
Fish and Chips                   200,000
Yoga Studio                       75,000
Charity Shop                      95,000
Supermarket Megastore         11,600,000
Total turnover                21,055,000

Imagine we worked for Grocery Superstore and wanted to work out the annual turnover of Supermarket Megastore. We know that the total annual turnover for this parade of shops is 21,055,000, and that our own annual turnover is 8,500,000. We can deduce that the combined turnover of the other 9 shops is 12,555,000. This is a reasonable estimate of the turnover of Supermarket Megastore (approximately 8% above the actual value). We could also assume that the turnover of the other 8 businesses is not 0, and so improve our estimate.

If Supermarket Megastore wanted to estimate the annual turnover of Grocery Superstore, they could do a similar calculation. They could deduce that the combined turnover of the other 9 shops is 9,455,000. This is 11% more than the actual value. Because Grocery Superstore's share of the total is lower, it is harder for another dominant observation to calculate its value accurately.
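The arithmetic in this example can be reproduced in a few lines. The turnover figures are those given on the slide; the function name `residual_estimate` is a hypothetical label for "published total minus my own value".

```python
# Annual turnover figures as given in the example.
turnover = {
    "Grocery Superstore": 8_500_000,
    "Hairdressers": 110_000,
    "Pet Shop": 130_000,
    "Nail Salon": 85_000,
    "Antique Shop": 100_000,
    "Café": 160_000,
    "Fish and Chips": 200_000,
    "Yoga Studio": 75_000,
    "Charity Shop": 95_000,
    "Supermarket Megastore": 11_600_000,
}
total = sum(turnover.values())  # the published total for the parade


def residual_estimate(own_name, target_name):
    """Estimate a rival's turnover as the total minus our own turnover.

    Returns (estimate, relative_error), where relative_error is the
    overestimate as a fraction of the rival's actual turnover.
    """
    estimate = total - turnover[own_name]
    actual = turnover[target_name]
    return estimate, (estimate - actual) / actual


for own, target in [
    ("Grocery Superstore", "Supermarket Megastore"),
    ("Supermarket Megastore", "Grocery Superstore"),
]:
    estimate, error = residual_estimate(own, target)
    print(f"{own} estimates {target}: {estimate:,} ({error:.0%} too high)")
```

Supermarket Megastore holds about 55% of the total and Grocery Superstore about 40%, which is why the first estimate is the closer of the two.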
Concentration ratios
Concentration ratios, and related measures such as Herfindahl indices, show how much of a measure is attributable to a small number of observations. These may highlight dominance issues. Here we can see the concentration ratios for the 3 largest companies in various sectors.

Industry            Number of firms    Top 3 C.R. 2018
Manufacturing                  1204               0.05
Retail                         8365               0.01
Digital services                190               0.2
Insurance                        48               0.3
Oil refining                     13               0.7

We can see that for oil refining, the 3 largest companies account for around 70% of the whole sector. This might suggest that there is a dominant company in this sector.

Is the concentration ratio close to 1? If a concentration ratio is approaching 1, is there a chance that any one observation might account for more than 40% of the measure (i.e. 0.4 in this example)?
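As a rough sketch of how these measures are computed, the functions below calculate a top-n concentration ratio and a Herfindahl index from raw values, using the shop turnovers from the dominance example as input. The function names are hypothetical. `largest_share_bounds` relies on a simple fact: the largest of the top n observations holds at least the average of those n shares and at most their combined share.

```python
def concentration_ratio(values, n=3):
    """Share of the total held by the n largest observations."""
    total = sum(values)
    return sum(sorted(values, reverse=True)[:n]) / total


def herfindahl_index(values):
    """Sum of squared shares; approaches 1 as one observation dominates."""
    total = sum(values)
    return sum((v / total) ** 2 for v in values)


def largest_share_bounds(cr, n=3):
    """Bounds on the largest single share implied by a top-n ratio `cr`.

    The largest observation holds at least cr / n (the mean of the top n)
    and at most cr (if the other n - 1 are negligible).
    """
    return cr / n, cr


# Shop turnovers from the dominance example.
shops = [8_500_000, 110_000, 130_000, 85_000, 100_000,
         160_000, 200_000, 75_000, 95_000, 11_600_000]

print(f"Top 3 C.R.: {concentration_ratio(shops, 3):.2f}")
print(f"Herfindahl index: {herfindahl_index(shops):.2f}")
print("Oil refining, largest share between:", largest_share_bounds(0.7, 3))
```

For the oil refining row, a top-3 ratio of 0.7 bounds the largest firm's share between roughly 0.23 and 0.7, so a share above 0.4 is possible but not implied by the ratio alone.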
Dominance in time series data
Does dramatic change suggest dominance?

[Figure: total household income by year, 2010 to 2018. Vertical axis: Household Income, 0 to 12,000,000. Horizontal axis: Year.]

Year    Number of households
2010                      18
2011                      18
2012                      19
2013                      17
2014                      17
2015                      19
2016                      20
2017                      22
2018                      21

If business data have been analysed, be aware that some business activity (e.g. investment) tends to be lumpy. For example, a business may spend a large amount investing in new capital equipment in one particular year, and this single organisation would therefore dominate any measure of that investment.
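One way to screen a series for this kind of change is to flag years in which the value moves by more than some fraction of the previous year's value. The sketch below is an assumption-laden illustration: `flag_jumps` is a hypothetical helper, the 50% threshold is arbitrary, and only the household counts from the table are used as input because the income values themselves are not given in the text.

```python
def flag_jumps(series, threshold=0.5):
    """Return the years whose value changed by more than `threshold`
    (as a fraction of the previous year's value).

    `series` is a chronologically ordered list of (year, value) pairs.
    The 0.5 default is an arbitrary illustrative threshold.
    """
    flagged = []
    for (_, prev), (year, curr) in zip(series, series[1:]):
        if prev != 0 and abs(curr - prev) / abs(prev) > threshold:
            flagged.append(year)
    return flagged


# Number of households per year, as given in the table.
households = [(2010, 18), (2011, 18), (2012, 19), (2013, 17), (2014, 17),
              (2015, 19), (2016, 20), (2017, 22), (2018, 21)]

# The counts are stable, so no year is flagged at this threshold.
print(flag_jumps(households))
```

Because the number of households barely changes from year to year, a year flagged in the total income series could not be explained by the group growing, which would point towards one household dominating the total in that year.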
Assessments for this module
SDAP: Safe Analyst Training - Dominance Rule
Created by Cancer Research UK, DKFZ, and The Health Foundation for the Safe Data Access Professionals Working Group