Real-Time Cough and Sneeze Detection Project Overview


This project focuses on real-time cough and sneeze detection for assessing disease likelihood and individual well-being. Deep learning, particularly CNN and CRNN models, is utilized for efficient detection and classification. The team conducted a literature survey on keyword spotting techniques and leveraged datasets like AudioSet by Google and COVID-19 Cough Dataset for training. Feature extraction methods such as PCEN and Mel spectrogram are employed for analysis. Baseline CNN and CRNN model architectures are detailed, showcasing the design for accurate detection in audio samples.


Uploaded on Sep 08, 2024



Presentation Transcript


  1. ECE 228: Real-Time Cough and Sneeze Detection
     Group 71: Ravi Patel, Victor Miranda, Ali Zaidi

  2. Project Background
     Detecting a cough or sneeze can indicate the likelihood of disease and the well-being of an individual.

  3. Project Background
     The system can be used to further analyze an individual's health from the detected cough and/or sneeze.

  4. Literature Survey
     Traditionally, keyword spotting (KWS) is based on hidden Markov models (HMMs) [1]
     State-of-the-art KWS systems replace HMMs with deep neural networks (DNNs) [1]
     End-to-end CRNN for keyword spotting [2]

  5. Why Deep Learning
     Performance
       Deep learning has been shown to outperform other models
       Works well with big data
     Speed
       Fast classification time
     Power
       Lightweight models with low power consumption
       Can run on mobile platforms, e.g. phone app, Raspberry Pi

  6. Dataset Details
     1. AudioSet by Google
        a. CSV of ~2M YouTube video IDs containing 527 classes
        b. Developed script to extract ~650 cough and ~850 sneeze WAV files
     2. Freesound Collaborative Database
        a. ~40 each of cough and sneeze WAV files
     3. COVID-19 Cough Dataset (from TA Brain)
        a. ~1000 cough WAV files
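The AudioSet extraction step on this slide could look roughly like the sketch below. It assumes the segment CSV layout of `YTID, start_seconds, end_seconds, positive_labels` with `#` comment lines. The function name `select_clips` and the label IDs are placeholders introduced here for illustration; the real cough and sneeze IDs would come from AudioSet's class label file. Downloading the audio and cutting it into WAV files is not shown.

```python
import csv
import io

# Placeholder label IDs (assumption): look up the real cough/sneeze IDs
# in AudioSet's class label file before using this on real data.
WANTED = {"/m/placeholder_cough": "cough", "/m/placeholder_sneeze": "sneeze"}


def select_clips(csv_file, wanted=WANTED):
    """Return (video_id, start_s, end_s, class_names) for rows with a wanted label."""
    clips = []
    for row in csv.reader(csv_file, skipinitialspace=True):
        # Skip blank lines and the '#' header comments.
        if not row or row[0].startswith("#"):
            continue
        video_id, start_s, end_s, label_field = row[:4]
        # The fourth column is a quoted, comma-separated list of label IDs.
        names = sorted({wanted[l] for l in label_field.split(",") if l in wanted})
        if names:
            clips.append((video_id, float(start_s), float(end_s), names))
    return clips


# Tiny in-memory stand-in for the real segment CSV (made-up rows).
SAMPLE = io.StringIO(
    '# YTID, start_seconds, end_seconds, positive_labels\n'
    'vid_a, 30.000, 40.000, "/m/placeholder_cough,/m/other"\n'
    'vid_b, 0.000, 10.000, "/m/other"\n'
    'vid_c, 10.000, 20.000, "/m/placeholder_sneeze"\n'
)
clips = select_clips(SAMPLE)
```

With the sample rows, `clips` keeps `vid_a` and `vid_c` and drops `vid_b`, which carries neither label.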

  7. Feature Extraction
     Per-Channel Energy Normalization (PCEN)
     Mel Spectrogram [1][2][3][4]
     FFT window length = 25 ms
     Step size = 10 ms
     Number of Mel channels = 40
     Input dimension: (batch_size, time_samples, frequency_samples, channel_size) = (N_samples, T_samples, 40, 1)
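A minimal NumPy sketch of the feature pipeline on this slide, using the stated 25 ms window, 10 ms step, and 40 Mel channels. Several things are assumptions rather than details from the slides: the 16 kHz sample rate, the Hann window, the triangular filterbank construction, and the PCEN constants (`s`, `alpha`, `delta`, `r`, `eps`), which are commonly quoted defaults. The team's own code may well have used an audio library instead.

```python
import numpy as np

SAMPLE_RATE = 16_000                    # assumption: not stated in the slides
WIN_LENGTH = SAMPLE_RATE * 25 // 1000   # 25 ms window -> 400 samples
HOP_LENGTH = SAMPLE_RATE * 10 // 1000   # 10 ms step   -> 160 samples
N_MELS = 40


def hz_to_mel(hz):
    return 2595.0 * np.log10(1.0 + hz / 700.0)


def mel_to_hz(mel):
    return 700.0 * (10.0 ** (mel / 2595.0) - 1.0)


def mel_filterbank(n_mels, n_fft, sample_rate):
    """Triangular Mel filters, shape (n_mels, n_fft // 2 + 1)."""
    mel_edges = np.linspace(hz_to_mel(0.0), hz_to_mel(sample_rate / 2), n_mels + 2)
    hz_edges = mel_to_hz(mel_edges)
    fft_freqs = np.fft.rfftfreq(n_fft, d=1.0 / sample_rate)
    bank = np.zeros((n_mels, fft_freqs.size))
    for m in range(n_mels):
        low, mid, high = hz_edges[m], hz_edges[m + 1], hz_edges[m + 2]
        rising = (fft_freqs - low) / (mid - low)
        falling = (high - fft_freqs) / (high - mid)
        bank[m] = np.clip(np.minimum(rising, falling), 0.0, None)
    return bank


def mel_spectrogram(signal):
    """Power Mel spectrogram, shape (T_samples, N_MELS)."""
    n_frames = 1 + (len(signal) - WIN_LENGTH) // HOP_LENGTH
    idx = np.arange(WIN_LENGTH)[None, :] + HOP_LENGTH * np.arange(n_frames)[:, None]
    frames = signal[idx] * np.hanning(WIN_LENGTH)
    power = np.abs(np.fft.rfft(frames, axis=1)) ** 2
    return power @ mel_filterbank(N_MELS, WIN_LENGTH, SAMPLE_RATE).T


def pcen(mel, s=0.025, alpha=0.98, delta=2.0, r=0.5, eps=1e-6):
    """Per-channel energy normalization with a first-order smoother over time."""
    smooth = np.empty_like(mel)
    smooth[0] = mel[0]                  # assumption: initialise with the first frame
    for t in range(1, len(mel)):
        smooth[t] = (1 - s) * smooth[t - 1] + s * mel[t]
    return (mel / (eps + smooth) ** alpha + delta) ** r - delta ** r


rng = np.random.default_rng(0)
one_second = rng.standard_normal(SAMPLE_RATE)   # stand-in for a real recording
features = pcen(mel_spectrogram(one_second))
batch = features[None, :, :, None]              # (N_samples, T_samples, 40, 1)
```

Under these assumptions a one-second clip yields 98 frames, so `batch.shape` is `(1, 98, 40, 1)`, matching the input layout listed on the slide.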

  8. Baseline CNN Model Architecture [1]
     1. Input
     2. Conv2D (8 filters)
        a. 3x3 kernel
        b. 1x1 stride
        c. No activation
     3. MaxPooling2D (2x2)
     4. Flatten
     5. Dense (64 units)
        a. ReLU activation
     6. Softmax Output (2 units)
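To make the layer list concrete, the sketch below traces tensor shapes and parameter counts through the baseline CNN in plain Python. Two things are assumed, not stated on the slide: an input of 98 time frames (a one-second clip at 16 kHz with 25 ms / 10 ms framing) and unpadded ("valid") convolution and pooling. The helper `conv_out` is introduced here for illustration.

```python
def conv_out(size, kernel, stride):
    """Output length of an unpadded ('valid') convolution or pooling window."""
    return (size - kernel) // stride + 1


T_SAMPLES, N_MELS = 98, 40  # assumption: 1 s clip, 40 Mel channels

# Conv2D: 8 filters, 3x3 kernel, 1x1 stride, one input channel.
conv = (conv_out(T_SAMPLES, 3, 1), conv_out(N_MELS, 3, 1), 8)   # (96, 38, 8)
conv_params = 3 * 3 * 1 * 8 + 8                                 # weights + biases

# MaxPooling2D 2x2 halves both spatial axes and has no parameters.
pool = (conv_out(conv[0], 2, 2), conv_out(conv[1], 2, 2), conv[2])  # (48, 19, 8)

# Flatten, then Dense (64 units) and the 2-unit softmax output.
flat = pool[0] * pool[1] * pool[2]          # 7296
dense_params = flat * 64 + 64               # 467008
output_params = 64 * 2 + 2                  # 130

total_params = conv_params + dense_params + output_params  # 467218
```

Under these assumptions nearly all of the parameters sit in the first Dense layer, because it is fed the full flattened feature map.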

  9. CRNN Model Architecture [2]
     1. Input
     2. Conv2D (32 filters)
        a. 20x5 kernel
        b. 8x2 stride
        c. ReLU activation
        d. BatchNorm
     3. MaxPooling2D (2x2)
        a. Dropout (30%)
     4. Reshape for RNN
     5. RNN (32 units)
     6. RNN (32 units)
     7. Flatten
     8. Dense (64 units)
        a. ReLU activation
     9. Softmax Output (2 units)
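The "Reshape for RNN" step is easier to follow with shapes written out. The plain-Python sketch below traces them under assumptions not given on the slide: a 98-frame input, unpadded convolution and pooling, the 20x5 kernel and 8x2 stride read as (time, frequency), and both recurrent layers returning their full output sequence (suggested by the Flatten that follows them). The gate-count formula is the textbook one for GRU and LSTM cells; some library implementations add extra bias terms.

```python
def conv_out(size, kernel, stride):
    """Output length of an unpadded ('valid') convolution or pooling window."""
    return (size - kernel) // stride + 1


def rnn_params(input_dim, units, gates):
    """Textbook parameter count: per gate, input and recurrent weights plus a bias."""
    return gates * (units * (input_dim + units) + units)


T_SAMPLES, N_MELS = 98, 40  # assumption: 1 s clip, 40 Mel channels

# Conv2D: 32 filters, 20x5 kernel, 8x2 stride (time, frequency).
conv = (conv_out(T_SAMPLES, 20, 8), conv_out(N_MELS, 5, 2), 32)   # (10, 18, 32)

# MaxPooling2D 2x2.
pool = (conv_out(conv[0], 2, 2), conv_out(conv[1], 2, 2), 32)     # (5, 9, 32)

# Reshape for RNN: keep the time axis, merge frequency and channels.
rnn_input = (pool[0], pool[1] * pool[2])                          # (5, 288)

# Each 32-unit recurrent layer emits one 32-vector per time step.
rnn_output = (rnn_input[0], 32)                                   # (5, 32)
flat = rnn_output[0] * rnn_output[1]                              # 160

# First recurrent layer: GRU (3 gates) versus LSTM (4 gates).
gru_first = rnn_params(rnn_input[1], 32, gates=3)                 # 30816
lstm_first = rnn_params(rnn_input[1], 32, gates=4)                # 41088
```

The large stride shrinks the time axis from 98 frames to 5 steps before the recurrent layers, so the RNNs only unroll over a short sequence.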

  10. Baseline CNN Results

  11. CRNN Results --- GRU Cells

  12. CRNN Results --- LSTM Cell

  13. Future Steps
      So far: developed classification model for sneeze vs. cough
        Will stick with and improve CRNN model
      Next step: improve live audio feed
        Pass real-time audio chunks into model
      Next step: develop inverse anomaly detection
        Ignores sound files that are NOT sneeze or cough
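One possible way to feed real-time audio chunks into a model is an overlapping sliding window over the incoming sample stream, sketched below in plain Python. The slides do not describe the team's approach, so this is only an illustration: `sliding_windows`, `label_or_reject`, the window and hop sizes, and the 0.8 confidence threshold are all hypothetical. Rejecting low-confidence predictions is a simple stand-in for the planned "inverse anomaly detection", not the method the slides propose.

```python
from collections import deque
from typing import Iterable, Iterator, List, Optional, Sequence


def sliding_windows(samples: Iterable[float], window: int, hop: int) -> Iterator[List[float]]:
    """Yield overlapping windows from a sample stream (assumes 0 < hop <= window)."""
    buffer = deque(maxlen=window)   # old samples fall off the left automatically
    since_last = 0
    for sample in samples:
        buffer.append(sample)
        since_last += 1
        # Emit once the buffer is full and `hop` new samples have arrived.
        if len(buffer) == window and since_last >= hop:
            yield list(buffer)
            since_last = 0


def label_or_reject(probs: Sequence[float],
                    labels: Sequence[str] = ("cough", "sneeze"),
                    threshold: float = 0.8) -> Optional[str]:
    """Return the most likely label, or None when the model is not confident."""
    best = max(range(len(probs)), key=lambda i: probs[i])
    return labels[best] if probs[best] >= threshold else None


# Toy stream of ten "samples" with a 4-sample window and 2-sample hop.
windows = list(sliding_windows(range(10), window=4, hop=2))
```

In a live setting each yielded window would go through feature extraction and the classifier, and `label_or_reject` would then decide whether to report the prediction.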

  14. References
      1. Tara Sainath, et al. "Convolutional Neural Networks for Small-Footprint Keyword Spotting." Interspeech, 2015.
      2. Sercan O. Arik, et al. "Convolutional Recurrent Neural Networks for Small-Footprint Keyword Spotting." 2017.
      3. G. Chen, et al. "Small-footprint keyword spotting using deep neural networks." 2014 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), 2014.
      4. A. H. Michaely, et al. "Keyword spotting for Google assistant using contextual speech recognition." 2017 IEEE Automatic Speech Recognition and Understanding Workshop (ASRU), 2017.
      5. J. Liu, et al. "Cough detection using deep neural networks." 2014 IEEE International Conference on Bioinformatics and Biomedicine (BIBM), 2014.

  15. Code Rundown
      1. AudioSet data extraction
      2. Feature extraction
      3. Model building and training
      4. Model performance metrics
      5. Initial real-time examples
