Decoding Sarcasm in Tweets: A Comprehensive Analysis


This research examines sarcasm detection in tweets, using a dataset of sarcastic and non-sarcastic tweets to build a classification model. Through feature extraction and model building with WEKA, the study aims to improve sarcasm detection on social media platforms, particularly Twitter. The content covers the data, skewed and non-skewed tweet sets, the evaluation methodology, and the process of classifying tweets from their features and attributes. The study also highlights the challenges and techniques involved in identifying sarcasm in social media communication.


Uploaded on Sep 12, 2024



Presentation Transcript


  1. Analyzing Sarcasm in Tweets
     Sydney Mabry, Dr. Bridget McInnes

  2. Finding Sarcastic Tweets
     - "@Microsoft Is it normal that it takes hours to check the email name for creating an microsoft account? 2nd try!"
     - "Looks like we're getting the heaviest snowfall in five years tomorrow. Awesome. I'll never get tired of winter."

  3. Data
     - Skewed set: 100 sarcastic tweets and 491 non-sarcastic tweets
     - Non-skewed set: 100 sarcastic tweets and 100 non-sarcastic tweets
     - Sample rows from the data files:
       430871355035119617 TS111209 null
       434472369948991488 TS111210 positive @frisky_ferret @ThunderDramon @XerxesWuff
       523094999802863616 TS111420 neutral 0.2 It's October 17th and I just saw a Christmas commercial...
       621375187255083008 TS113870 null 0.04 Today only is Amazon Prime Day, which offers more deals than Black Friday! If you plan to shop, don't forget to... http://t.co/NAr3TbYeMd 1.0 1.0
     - Example sarcastic tweets:
       "@NickKlopsis c'mon man, you HAVE to take a side...#sarcasm"
       "More snow tomorrow Wooooooooooooooo! #sarcasm @Taggzzz @SchismV"

  4. Method
     - WEKA builds a model based on the features of the tweets
     - Results reported as precision, recall, and F-measure

  5. Features of a Tweet
     Tweet: So excited for snow #SnowOnTuesday
     - Unigrams: So, excited, for, snow
     - Bigrams: So excited, excited for, for snow
     - Hashtags: #SnowOnTuesday
     - Label: sarcastic
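The feature extraction described on this slide can be sketched as follows. This is an illustrative Python version (the study's own code was a Perl module), and the tokenization details are assumptions, not the authors' exact implementation:

```python
import re

def tweet_features(tweet):
    """Extract the unigram, bigram, and hashtag features the slide
    describes. A hedged sketch: the regex-based tokenization here is
    an assumption, not the study's actual Perl implementation."""
    hashtags = re.findall(r"#\w+", tweet)
    # Drop hashtags and @-mentions before tokenizing the plain words.
    text = re.sub(r"[#@]\w+", " ", tweet)
    tokens = re.findall(r"[A-Za-z']+", text)
    bigrams = [f"{a} {b}" for a, b in zip(tokens, tokens[1:])]
    return {"unigrams": tokens, "bigrams": bigrams, "hashtags": hashtags}

feats = tweet_features("So excited for snow #SnowOnTuesday")
print(feats["unigrams"])  # ['So', 'excited', 'for', 'snow']
print(feats["bigrams"])   # ['So excited', 'excited for', 'for snow']
print(feats["hashtags"])  # ['#SnowOnTuesday']
```

Run on the slide's example tweet, this reproduces the unigrams, bigrams, and hashtag shown above.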

  6. Method
     - Make a file to send to WEKA
     - Determine whether each tweet contains the words listed in %docWords

     Training file:
     @RELATION tweet_sentiment_train
     @ATTRIBUTE wed NUMERIC
     @ATTRIBUTE wednesday NUMERIC
     @ATTRIBUTE week NUMERIC
     @ATTRIBUTE weekend NUMERIC
     @ATTRIBUTE well NUMERIC
     @ATTRIBUTE were NUMERIC
     @ATTRIBUTE label {sarcastic, not_sarcastic}
     @DATA
     0,1,1,1,0,0,not_sarcastic
     1,0,0,0,0,1,sarcastic
     0,1,0,1,0,0,sarcastic
     0,1,0,1,0,1,not_sarcastic
     1,0,1,0,1,1,sarcastic

     Test file:
     @RELATION tweet_sentiment_test
     @ATTRIBUTE wed NUMERIC
     @ATTRIBUTE wednesday NUMERIC
     @ATTRIBUTE week NUMERIC
     @ATTRIBUTE weekend NUMERIC
     @ATTRIBUTE well NUMERIC
     @ATTRIBUTE were NUMERIC
     @ATTRIBUTE label {sarcastic, not_sarcastic}
     @DATA
     0,0,0,1,0,0,not_sarcastic
     1,0,0,1,0,0,sarcastic
     0,0,0,1,1,0,not_sarcastic
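Generating an ARFF file of this shape is mechanical: a relation name, one NUMERIC attribute per word in %docWords, a nominal label attribute, and the data rows. A minimal Python sketch (the study built these files from Perl; the function name and row format here are illustrative):

```python
def arff_text(relation, words, rows):
    """Build the text of a minimal ARFF file shaped like the slide's
    example: one NUMERIC word-presence attribute per word, a nominal
    sarcastic/not_sarcastic label, then the @DATA rows.
    A sketch only; not the study's actual Perl code."""
    lines = [f"@RELATION {relation}", ""]
    lines += [f"@ATTRIBUTE {word} NUMERIC" for word in words]
    lines.append("@ATTRIBUTE label {sarcastic, not_sarcastic}")
    lines += ["", "@DATA"]
    for values, label in rows:
        lines.append(",".join(str(v) for v in values) + "," + label)
    return "\n".join(lines) + "\n"

words = ["wed", "wednesday", "week", "weekend", "well", "were"]
train_rows = [([0, 1, 1, 1, 0, 0], "not_sarcastic"),
              ([1, 0, 0, 0, 0, 1], "sarcastic")]
# Write the training file that would be handed to WEKA:
with open("tweet_sentiment_train.arff", "w") as f:
    f.write(arff_text("tweet_sentiment_train", words, train_rows))
```

Each row's 0/1 values record whether the tweet contains the corresponding %docWords entry, matching the slide's @DATA lines.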

  7. Evaluation Methodology
     - Split the data into 10 buckets: 9 buckets train, 1 bucket tests
     - A Perl module gathers all the attributes from the tweets into a hash table
     - Attributes from the training buckets go into %docAtts
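The bucket-based split amounts to 10-fold cross-validation. A Python sketch of the idea (the study used a Perl module and hash tables; the round-robin bucket assignment and whitespace tokenization here are assumptions):

```python
def ten_fold_buckets(tweets, k=10):
    """Yield (train, test, doc_atts) for each cross-validation round:
    the data is split into k buckets, one bucket tests while the other
    k-1 train. doc_atts mirrors the study's %docAtts, holding only
    attributes seen in the training buckets. A sketch, not the study's
    actual Perl module."""
    buckets = [tweets[i::k] for i in range(k)]  # round-robin assignment
    for held_out in range(k):
        test = buckets[held_out]
        train = [t for i, b in enumerate(buckets) if i != held_out for t in b]
        doc_atts = {word for tweet in train for word in tweet.split()}
        yield train, test, doc_atts
```

Building doc_atts from the training buckets only keeps test tweets from leaking attributes into the model, which is why the slide restricts %docAtts to the 9 training buckets.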

  8. Evaluation
     Class            Precision   Recall   F-Measure
     sarcastic        0.800       0.571    0.667
     not_sarcastic    0.727       0.889    0.800
     Weighted Avg.    0.759       0.750    0.742
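These metrics follow directly from true/false positive and negative counts, with F-measure the harmonic mean of precision and recall. A short sketch; the counts below are hypothetical, chosen only so that precision 0.800 and recall 0.571 yield the F-measure 0.667:

```python
def prf(tp, fp, fn):
    """Precision, recall, and F-measure from raw counts."""
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    f_measure = 2 * precision * recall / (precision + recall)
    return precision, recall, f_measure

# Hypothetical counts: 8 true positives, 2 false positives,
# 6 false negatives (not taken from the study's data).
p, r, f = prf(tp=8, fp=2, fn=6)
print(round(p, 3), round(r, 3), round(f, 3))  # 0.8 0.571 0.667
```

The weighted average rows WEKA reports are the per-class values weighted by class frequency.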

  9. Results
     Skewed:
     - Around 87% accuracy when using unigrams, regardless of other attributes
     - Without unigrams, around 84% accuracy
     Non-skewed:
     - Ranges from 85%-91% accuracy when using unigrams and other attributes
     - Without unigrams, around 77% accuracy
