Machine learning applied to mobile phone data for public statistics prediction


Researchers from several institutions are studying whether mobile phone data in Senegal can be used to predict public statistics with machine learning. The GUISSTANN project aims to predict census indicators from telephone and mobile money data. The data include call/text records (origin, destination, duration, date) and distinguish national from international communication; earlier studies have used such data for poverty mapping and population mobility analysis. Data confidentiality is a key concern, handled through a dedicated IT architecture, restricted access, and pseudonymized data.


Uploaded on Mar 20, 2024 | 4 Views



Presentation Transcript


  1. Machine learning applied to mobile phone data for public statistics prediction. Ayizou R. (1), Clochard G. (2), Diakit S. (3), Diallo H. (1), Fall O. (3), Hollard G. (4). Affiliations: (1) ENSAE of Dakar; (2) University of Chicago; (3) ANSD; (4) CREST, Ecole Polytechnique and CNRS. April 2023.

  2. GUISSTANN (ANSD, Sonatel, IP-Paris): an ongoing project in Senegal to study the possibility of predicting indicators contained in the general population census of Senegal (to be conducted in 2023) from telephone and mobile money data.

  3. GUISSTANN: preliminary phase. The purpose is to predict some indicators using the pre-census data and Orange's subscribers' mobile phone data. Status: effective access to pre-census data; the process for access to CDR (call detail records) is still going on; no empirical results at this time.

  4. Mobile phone data: digital footprint. Information generated: call/text origin and destination; call duration (volume for text); call/text date (day of the week, time, ...); etc. Phone data vs big data: high frequency of generation; used by a large part of the population; available from telephone operators, given good collaboration. [Diagram labels: subscriber mobile phone, towers, national and international calls.]

  5. Mobile phone data: empirical evidence. Example: poverty map using CDR in Rwanda (Blumenstock et al. 2015). Successfully used for: high-resolution poverty maps (Blumenstock et al. 2015); improved targeting of humanitarian aid (Aiken et al. 2022); population mobility analysis (Erfani and Frias-Martinez, 2022); prediction of individual characteristics (Eaman J. et al. 2017).

  6. Mobile phone data and ML: the paradigm. [Diagram labels: field survey data; CDRs; features extraction; processing; train set; test set; predictive program; predict the indicator's level for all the operator's subscribers; field survey geographic level aggregation; geographic area level aggregation; public statistics indicator; external validation (correlation analysis).]

  7. Mobile phone data and ML: key concerns. First, the matching of Sonatel and ANSD data introduces a risk for the confidentiality of personal data. Safeguards: an IT architecture dedicated to the project, which avoids data exchange between ANSD and Sonatel; very restricted access, from Senegal only; use of pseudonymized data; evaluation of the ethical aspects (CDP). Second, models are estimated using only subscriber data, so people who do not have a phone are poorly taken into account.

  8. The contribution of GUISSTANN. More frequent updates of census indicators. Possibility to conduct surveys on small samples and still calculate indicators for the whole population at finer scales. Possibility to combine the mobile data with census information to measure what can be predicted with mobile phone data, something that is still unclear. Possibility to evaluate the accuracy using census data (which is exhaustive!).

  9. Conclusion. Using CDR data to predict something else requires matching databases that come from different data producers. Learning can be performed on relatively small datasets (thus far from big data). AI projects using CDR require collecting data on the ground (AI is not a substitute). The idea would be to collect better data on the ground and to use AI as an amplifier. No AI project without ethics and data security!

  10. Thanks for your attention!
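
The paradigm on slide 6 (features extraction from CDRs, aggregation at the geographic area level, a train/test split, prediction, and external validation by correlation analysis) can be illustrated with a minimal sketch. Everything below is an assumption made for illustration and not the GUISSTANN method: the CDRs are synthetic, the record layout, the hidden "wealth" indicator standing in for a field survey value, the single feature (mean calls per subscriber), and the one-variable least-squares model are all hypothetical, since the slides report no empirical results.

```python
import random
from collections import defaultdict

random.seed(0)

def make_synthetic_cdrs(n_areas=20, subs_per_area=30):
    """Hypothetical CDRs: (subscriber_id, area, duration_seconds, is_international)."""
    cdrs, truth = [], {}
    for area in range(n_areas):
        wealth = random.uniform(0, 1)  # stand-in for an area-level survey indicator
        truth[area] = wealth
        for s in range(subs_per_area):
            # Assumption: wealthier areas generate more calls per subscriber.
            n_calls = 5 + int(20 * wealth + random.uniform(0, 5))
            for _ in range(n_calls):
                cdrs.append((f"{area}-{s}", area,
                             random.uniform(10, 300), random.random() < 0.1))
    return cdrs, truth

def subscriber_features(cdrs):
    """Features extraction: per-subscriber call count, total duration, international count."""
    acc = defaultdict(lambda: {"area": None, "n": 0, "dur": 0.0, "intl": 0})
    for sub, area, duration, is_intl in cdrs:
        f = acc[sub]
        f["area"] = area
        f["n"] += 1
        f["dur"] += duration
        f["intl"] += int(is_intl)
    return acc

def aggregate_by_area(features):
    """Geographic aggregation: mean number of calls per subscriber in each area."""
    totals = defaultdict(lambda: [0, 0])
    for f in features.values():
        totals[f["area"]][0] += f["n"]
        totals[f["area"]][1] += 1
    return {area: calls / subs for area, (calls, subs) in totals.items()}

def fit_line(xs, ys):
    """Ordinary least squares for y = intercept + slope * x."""
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return my - slope * mx, slope

def pearson(xs, ys):
    """Pearson correlation, used here as the external validation step."""
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / (sxx * syy) ** 0.5

cdrs, truth = make_synthetic_cdrs()
area_x = aggregate_by_area(subscriber_features(cdrs))

# Train on areas with survey values, then predict the held-out areas.
areas = sorted(area_x)
train, test = areas[:14], areas[14:]
intercept, slope = fit_line([area_x[a] for a in train], [truth[a] for a in train])
pred = {a: intercept + slope * area_x[a] for a in test}

# External validation: correlate predictions with the held-out indicator values.
r = pearson([pred[a] for a in test], [truth[a] for a in test])
print(f"slope={slope:.4f}, correlation on held-out areas={r:.3f}")
```

The sketch keeps the survey-labelled training areas small relative to the volume of CDRs, which mirrors the point made in the conclusion that learning can be performed on relatively small datasets while field data remain necessary.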
