CS-485 FINAL PROJECT

Corrine Elliott
Data Mining / Liu
28 April 2016
 
PROBLEM OVERVIEW
 
 
Research Question: 
Given information on a shelter cat or dog’s breed, color, sex and
age, can we predict the animal’s fate?
 
Data-Mining Approaches:
Naïve Bayes Classifier
C4.5 Decision Tree
Apriori
Frequent-Pattern (FP) Growth
 
Existing Kaggle submissions:
Random Forest
Conditional probabilities, e.g., P(outcome|age)
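
A conditional probability such as P(outcome|age) can be estimated by counting how often each outcome occurs within each age group. The sketch below is only an illustration: the function name, the field names and the toy rows are hypothetical and are not taken from the shelter data or from the Kaggle submissions.

```python
from collections import Counter, defaultdict

def conditional_probabilities(records, given, target):
    """Estimate P(target | given) by counting co-occurrences in `records`."""
    joint = defaultdict(Counter)  # value of `given` -> counts of `target`
    for row in records:
        joint[row[given]][row[target]] += 1
    return {
        g: {t: n / sum(counts.values()) for t, n in counts.items()}
        for g, counts in joint.items()
    }

# Hypothetical toy rows, not real shelter records.
toy = [
    {"age": "<1 year", "outcome": "Adoption"},
    {"age": "<1 year", "outcome": "Adoption"},
    {"age": "<1 year", "outcome": "Transfer"},
    {"age": ">=1 year", "outcome": "Return_to_owner"},
]
probs = conditional_probabilities(toy, given="age", target="outcome")
print(probs["<1 year"]["Adoption"])  # 2 of the 3 "<1 year" rows are adoptions
```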
 
DATASET: SHELTER ANIMALS
 
Training Data-set:
26729 animals

Attributes:
ID: A######
Name
Date / Time
Outcome / subtype
Species: Cat or Dog
Sex: Intact, Neutered or Spayed + M/F
Age: # + units
Breed and Color
 
Test Data-set:
11456 animals

Attributes:
ID: 1 - 11456
Name
Date / Time
Species: Cat or Dog
Sex: Intact, Neutered or Spayed + M/F
Age: # + units
Breed and Color
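
The Age and Sex attributes each pack two pieces of information into one string, so they usually need to be split before mining. The sketch below assumes strings shaped like "2 years" and "Neutered Male"; those exact formats, the unit-to-day conversion factors and the function names are assumptions made for illustration.

```python
# Approximate conversion factors (assumed; months and years are rounded).
DAYS_PER_UNIT = {"day": 1, "week": 7, "month": 30, "year": 365}

def age_in_days(text):
    """Convert a '# + units' string such as '2 years' to approximate days.

    Returns None for missing or unparseable values.
    """
    if not text:
        return None
    parts = text.split()
    if len(parts) != 2 or not parts[0].isdigit():
        return None
    unit = parts[1].lower().rstrip("s")  # 'years' -> 'year'
    if unit not in DAYS_PER_UNIT:
        return None
    return int(parts[0]) * DAYS_PER_UNIT[unit]

def split_sex(text):
    """Split e.g. 'Neutered Male' into (status, sex); (None, None) if missing."""
    parts = (text or "").split()
    if len(parts) != 2:
        return (None, None)
    return (parts[0], parts[1])

print(age_in_days("2 years"))      # 2 * 365 = 730
print(split_sex("Neutered Male"))  # ('Neutered', 'Male')
```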
 
NAÏVE BAYES CLASSIFIER
 
 
Missing data omitted when computing conditional probabilities
 
Analysis:
k-fold cross-validation
Assigned highest-probability classification
 
 
 
 
 
 
C4.5 Decision Tree: 37.9 %
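
The procedure on this slide (skip missing values when counting, assign the highest-probability class, evaluate with k-fold cross-validation) can be sketched as follows. This is a minimal illustration rather than the project's code: the Laplace smoothing, the use of the population variance across folds, the function names and the toy rows are all assumptions.

```python
import math
import random
from collections import Counter, defaultdict

def train(rows, features, target):
    """Count classes, per-class feature values, and the values seen per feature.

    Missing values (None) are skipped, so they never enter a conditional
    probability estimate.
    """
    class_counts = Counter(r[target] for r in rows)
    value_counts = defaultdict(Counter)  # (feature, class) -> value counts
    vocab = defaultdict(set)             # feature -> distinct observed values
    for r in rows:
        for f in features:
            if r[f] is not None:
                value_counts[(f, r[target])][r[f]] += 1
                vocab[f].add(r[f])
    return class_counts, value_counts, vocab

def predict(model, row, features, alpha=1.0):
    """Return the class with the highest posterior, scored in log space."""
    class_counts, value_counts, vocab = model
    total = sum(class_counts.values())
    best, best_score = None, -math.inf
    for c, n_c in class_counts.items():
        score = math.log(n_c / total)
        for f in features:
            if row[f] is None:
                continue  # a missing value contributes no evidence
            counts = value_counts[(f, c)]
            # Laplace smoothing is an assumption; one extra slot covers unseen values.
            n_values = len(vocab[f]) + 1
            score += math.log((counts[row[f]] + alpha)
                              / (sum(counts.values()) + alpha * n_values))
        if score > best_score:
            best, best_score = c, score
    return best

def k_fold_error(rows, features, target, k, seed=0):
    """Return (mean, population variance) of the error rate over k folds."""
    rows = rows[:]
    random.Random(seed).shuffle(rows)
    folds = [rows[i::k] for i in range(k)]
    errors = []
    for i, held_out in enumerate(folds):
        training = [r for j, fold in enumerate(folds) if j != i for r in fold]
        model = train(training, features, target)
        wrong = sum(predict(model, r, features) != r[target] for r in held_out)
        errors.append(wrong / len(held_out))
    mean = sum(errors) / k
    variance = sum((e - mean) ** 2 for e in errors) / k
    return mean, variance

# Hypothetical toy rows, not real shelter records.
toy = ([{"species": "Cat", "sex": None, "outcome": "Transfer"}] * 5
       + [{"species": "Dog", "sex": "Male", "outcome": "Adoption"}] * 5)
mean_error, variance = k_fold_error(toy, ["species", "sex"], "outcome", k=5)
print(mean_error, variance)
```

Working in log space avoids underflow when many small probabilities are multiplied, and it does not change which class scores highest.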
 
APRIORI / FP GROWTH
 
 
Minimum support: 20 %
 
Maximal itemsets:
{Transfer, Cat} : 20.60 %
{Adoption, <1 year} : 21.47 %
{Adoption, Dog} : 24.31 %
Relative to 15.98 % for {Adoption, Cat}
 
Association Rules:
{Transfer, Cat} -> Domestic Shorthair Mix
Support : 20.60 %
Confidence : 82.4342 %
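
For reference, the support of an itemset is the fraction of records that contain it, and the confidence of a rule X -> Y is support(X and Y together) divided by support(X). The sketch below uses hypothetical toy transactions and function names. Its last lines show one possible reading of the figures above: if the 20.60 % support is that of the antecedent {Transfer, Cat}, the quoted confidence would put the support of the full rule at roughly 17 %, which is below the 20 % minimum and would fit {Transfer, Cat} being maximal.

```python
def support(transactions, itemset):
    """Fraction of transactions that contain every item in `itemset`."""
    itemset = set(itemset)
    return sum(itemset <= t for t in transactions) / len(transactions)

def confidence(transactions, antecedent, consequent):
    """Confidence of antecedent -> consequent: support(X | Y) / support(X)."""
    both = set(antecedent) | set(consequent)
    return support(transactions, both) / support(transactions, antecedent)

# Hypothetical toy transactions, not real shelter records.
transactions = [
    {"Transfer", "Cat", "Domestic Shorthair Mix"},
    {"Transfer", "Cat", "Siamese Mix"},
    {"Adoption", "Dog", "Labrador Retriever Mix"},
    {"Adoption", "Cat", "Domestic Shorthair Mix"},
]
print(support(transactions, {"Transfer", "Cat"}))  # 2 of 4 transactions
print(confidence(transactions, {"Transfer", "Cat"}, {"Domestic Shorthair Mix"}))

# Assumption: 20.60 % is the antecedent support reported on the slide.
implied_rule_support = 0.2060 * 0.824342  # about 0.17
print(implied_rule_support)
```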
 
“Take A Look at the Data” [1]
 
 
“Dogs tend to be returned to owner more
often than cats … and cats are transferred
more often than dogs.”
 
“Young cats and dogs [tend] to be adopted
or transferred, while older animals with
approximately equal probability can be
adopted, transferred or returned.”
 
“Neutered animals have high chances to be
adopted, while intact animals are more likely
to be transferred.”
 
[1] https://www.kaggle.com/uchayder/shelter-animal-outcomes/take-a-look-at-the-data
 
ROOM FOR IMPROVEMENT:
 
 
Incorporate name data
 
Subset by species
 
Categorize breeds
 
Reassess age categories
 
Visualize the data
 
 
 
Figure source: Megan L. Risdal’s “Quick & Dirty Random Forest” Kaggle submission
 
https://www.kaggle.com/mrisdal/shelter-animal-outcomes/quick-dirty-randomforest
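
Two of the improvements listed above, categorizing breeds and reassessing age categories, could start from something like the sketch below. The slides do not say which categories were intended, so the bin edges, the labels, the treatment of "Mix" and "/" in breed strings and the function names are all hypothetical.

```python
import bisect

# Assumed bin edges in days; each label covers ages below the next edge.
AGE_EDGES_DAYS = [30, 180, 365, 365 * 5]
AGE_LABELS = ["<1 month", "1-6 months", "6-12 months", "1-5 years", "5+ years"]

def age_category(days):
    """Map an age in days to one of the assumed age labels."""
    return AGE_LABELS[bisect.bisect_right(AGE_EDGES_DAYS, days)]

def breed_features(breed):
    """Reduce a free-text breed to (primary_breed, is_mix).

    Assumes mixes are written with a trailing ' Mix' or as 'Breed A/Breed B'.
    """
    is_mix = breed.endswith(" Mix") or "/" in breed
    base = breed[:-4] if breed.endswith(" Mix") else breed
    primary = base.split("/")[0].strip()
    return primary, is_mix

print(age_category(200))                         # falls in the 180-364 day bin
print(breed_features("Domestic Shorthair Mix"))  # ('Domestic Shorthair', True)
```

Collapsing breeds to a primary breed plus a mix flag reduces the number of distinct values, which matters for count-based methods such as Naïve Bayes and frequent-itemset mining.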
This project applies four data-mining approaches (a Naïve Bayes classifier, a C4.5 decision tree, Apriori, and Frequent-Pattern (FP) Growth) to predict the outcomes of shelter cats and dogs from breed, color, sex, and age. The analysis covers k-fold cross-validation with highest-probability classification, error rates, maximal itemsets and association rules, and observations on adoptions, transfers, and returns to owner.

  • Shelter Animals
  • Data Mining
  • Naive Bayes
  • Decision Tree
  • Association Rules

Uploaded on Feb 15, 2025 | 0 Views



NAÏVE BAYES CLASSIFIER: k-FOLD CROSS-VALIDATION RESULTS

k     Expected Error Rate    Variance in Error Rate
2     0.469619874289         2.90263253541e-05
4     0.46905866507          9.53140200466e-05
6     0.471448884897         4.986052551e-05
8     0.466252618976         1.99723963229e-05
10    0.468163448586         0.000100299847022
