Analyzing Sarcasm in Tweets
 
Sydney Mabry
Dr. Bridget McInnes
 
Finding Sarcastic Tweets
 
@Microsoft Is it normal that it takes hours to check the
email name for creating an microsoft account? 2nd try!
 
Looks like we're getting the heaviest snowfall in five
years tomorrow. Awesome. I'll never get tired of winter.
 
Data
 
- Skewed set: 100 sarcastic tweets and 491 non-sarcastic tweets
- Non-skewed set: 100 sarcastic tweets and 100 non-sarcastic tweets
 
Sample tweets from the data set:

Tweet ID            ID        Sentiment  Score  Text
430871355035119617  TS111209  null       1.0    @NickKlopsis c'mon man, you HAVE to take a side...#sarcasm
434472369948991488  TS111210  positive   1.0    More snow tomorrow Wooooooooooooooo! #sarcasm @Taggzzz @SchismV @frisky_ferret @ThunderDramon @XerxesWuff
523094999802863616  TS111420  neutral    0.2    It's October 17th and I just saw a Christmas commercial...
621375187255083008  TS113870  null       0.04   Today only is Amazon Prime Day, which offers more deals than Black Friday! If you plan to shop, don't forget to... http://t.co/NAr3TbYeMd
 
Method
 
Tweets are fed to WEKA, which makes a model based on the features and reports the results:
- Precision
- Recall
- F-measure
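
The slides do not show how WEKA is invoked; a plausible sketch, assuming the train/test ARFF files built in the later Method slide and a Naive Bayes classifier (the deck does not name one), is:

use strict;
use warnings;

# Hypothetical WEKA run from Perl; the classifier choice and file names
# are assumptions. -t names the training ARFF file, -T the test file.
system('java', 'weka.classifiers.bayes.NaiveBayes',
       '-t', 'train.arff', '-T', 'test.arff') == 0
    or die "WEKA run failed: $?";

WEKA's output includes per-class precision, recall, and F-measure, matching the results listed above.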
 
Features of a Tweet
 
Tweet: So excited for snow #SnowOnTuesday

Unigrams:
So
excited
for
snow

Bigrams:
So excited
excited for
for snow

Hashtags:
#SnowOnTuesday

Label:
sarcastic
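
A minimal Perl sketch of this feature split (the tokenization and variable names are illustrative assumptions; the deck's own code is not shown):

use strict;
use warnings;

my $tweet = "So excited for snow #SnowOnTuesday";

# Hashtags: tokens beginning with '#'
my @hashtags = $tweet =~ /(#\w+)/g;

# Strip hashtags, then split the remainder into word tokens
(my $text = $tweet) =~ s/#\w+//g;
my @unigrams = split ' ', $text;

# Bigrams: each token paired with its successor
my @bigrams;
push @bigrams, "$unigrams[$_] $unigrams[$_ + 1]" for 0 .. $#unigrams - 1;

print "Unigrams: @unigrams\n";                  # So excited for snow
print "Bigrams: ", join(', ', @bigrams), "\n";  # So excited, excited for, for snow
print "Hashtags: @hashtags\n";                  # #SnowOnTuesday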
 
Method
 
- Make a file to send to WEKA
- Determine whether each tweet contains the words listed in %docWords (a sketch of this step follows the ARFF examples below)
 
@RELATION tweet_sentiment_test
 
@ATTRIBUTE wed NUMERIC
@ATTRIBUTE wednesday NUMERIC
@ATTRIBUTE week NUMERIC
@ATTRIBUTE weekend NUMERIC
@ATTRIBUTE well NUMERIC
@ATTRIBUTE were NUMERIC
@ATTRIBUTE label {sarcastic, not_sarcastic}
 
@DATA
0,0,0,1,0,0,not_sarcastic
1,0,0,1,0,0,sarcastic
0,0,0,1,1,0,not_sarcastic
 
@RELATION tweet_sentiment_train
 
@ATTRIBUTE wed NUMERIC
@ATTRIBUTE wednesday NUMERIC
@ATTRIBUTE week NUMERIC
@ATTRIBUTE weekend NUMERIC
@ATTRIBUTE well NUMERIC
@ATTRIBUTE were NUMERIC
@ATTRIBUTE label {sarcastic, not_sarcastic}
 
@DATA
0,1,1,1,0,0,not_sarcastic
1,0,0,0,0,1,sarcastic
0,1,0,1,0,0,sarcastic
0,1,0,1,0,1,not_sarcastic
1,0,1,0,1,1,sarcastic
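
A hedged sketch of how such a file could be generated, assuming %docWords holds the attribute vocabulary and each tweet is a (label, text) pair; the names and example data are illustrative, not the deck's actual code:

use strict;
use warnings;

my %docWords = map { $_ => 1 } qw(wed wednesday week weekend well were);
my @attrs    = sort keys %docWords;

my @tweets = (
    [ 'not_sarcastic', 'busy weekend ahead'                  ],
    [ 'sarcastic',     'snow again this wed and weekend yay' ],
);

open my $fh, '>', 'train.arff' or die "cannot write train.arff: $!";
print $fh "\@RELATION tweet_sentiment_train\n\n";
print $fh "\@ATTRIBUTE $_ NUMERIC\n" for @attrs;
print $fh "\@ATTRIBUTE label {sarcastic, not_sarcastic}\n\n\@DATA\n";

for my $t (@tweets) {
    my ($label, $text) = @$t;
    my %seen = map { $_ => 1 } split ' ', lc $text;
    # One 0/1 column per attribute word: does the tweet contain it?
    print $fh join(',', (map { $seen{$_} ? 1 : 0 } @attrs), $label), "\n";
}
close $fh;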
 
Evaluation Methodology
 
- Split the data into 10 buckets: 9 buckets train, 1 bucket tests
- Gather all the attributes in the tweets into a hash table; attributes from the training buckets go in %docAtts
- A Perl module takes the hash table of attributes and a bucket (see the sketch below)
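
A rough Perl sketch of the 10-bucket split, with stand-in data; the actual module's interface is not shown in the slides:

use strict;
use warnings;

my @tweets = map { "tweet_$_" } 1 .. 50;   # stand-ins for labeled tweets
my $k = 10;

# Deal the tweets round-robin into 10 buckets
my @buckets;
push @{ $buckets[ $_ % $k ] }, $tweets[$_] for 0 .. $#tweets;

for my $i (0 .. $k - 1) {
    my @test  = @{ $buckets[$i] };
    my @train = map { @{ $buckets[$_] } } grep { $_ != $i } 0 .. $k - 1;
    # Build %docAtts from @train only, write the ARFF files, run WEKA...
}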
 
Evaluation
 
               Precision  Recall  F-Measure  Class
               0.800      0.571   0.667      sarcastic
               0.727      0.889   0.800      not_sarcastic
Weighted Avg.  0.759      0.750   0.742
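
The F-Measure column is the harmonic mean of precision and recall; recomputing the sarcastic row from the rounded table values:

use strict;
use warnings;

my ($p, $r) = (0.800, 0.571);
my $f = 2 * $p * $r / ($p + $r);
printf "F-measure: %.3f\n", $f;   # 0.666, i.e. the table's 0.667 up to rounding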
 
Results
 
- Skewed:
  - Around 87% accuracy when using unigrams, regardless of the other attributes
  - Around 84% accuracy without unigrams

- Non-skewed:
  - 85%-91% accuracy when using unigrams together with other attributes
  - Around 77% accuracy without unigrams
Summary

This presentation examines sarcasm detection in tweets. A data set of sarcastic and non-sarcastic tweets, in skewed and non-skewed variants, is represented through features such as unigrams, bigrams, and hashtags; WEKA builds a classification model from those features, and performance is evaluated with 10-fold cross-validation and reported as precision, recall, and F-measure.
