Chunking with Support Vector Machines: An Overview

Chunking with Support Vector Machines identifies chunks in a sequence of tokens and classifies them into grammatical classes using SVMs. The method combines chunk representations (IOB1, IOB2, IOE1, IOE2, IOBES) with pairwise classification to achieve strong performance on text chunking tasks. Weighted voting over multiple systems and features drawn from the surrounding context of each token play important roles in improving classification accuracy.



Presentation Transcript


  1. Chunking with Support Vector Machines (Zheng Luo)

  2. Background. Chunking: identify proper chunks in a sequence of tokens and classify them into grammatical classes. Support Vector Machines: a soft-margin hyperplane with a polynomial kernel K(x_i, x_j) = (x_i · x_j + 1)^d, which implicitly considers all combinations of up to d features (a small sketch of this kernel follows below).
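
Below is a minimal sketch of the polynomial kernel named on this slide; the function name and the toy vectors are illustrative, not from the paper. With binary feature vectors, expanding (x_i · x_j + 1)^d implicitly takes into account all conjunctions of up to d original features.

```python
import numpy as np

def poly_kernel(xi, xj, d=2):
    """Polynomial kernel K(xi, xj) = (xi . xj + 1)^d."""
    return (np.dot(xi, xj) + 1.0) ** d

# Toy binary feature vectors (e.g., one-hot encoded word/POS features).
a = np.array([1, 0, 1, 1])
b = np.array([1, 1, 0, 1])
print(poly_kernel(a, b, d=2))  # (2 + 1)^2 = 9.0
```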

  3. Chunk Representations
  IOB1: I - current token is inside a chunk; O - current token is outside any chunk; B - current token is the beginning of a chunk that immediately follows another chunk.
  IOB2: the B tag is given to every token at the beginning of a chunk.
  IOE1: the E tag marks the last token of a chunk immediately preceding another chunk.
  IOE2: the E tag is given to every token at the end of a chunk.

  4. Chunk Representations (IOBES)
  B - current token is the start of a chunk consisting of more than one token.
  E - current token is the end of a chunk consisting of more than one token.
  I - current token is in the middle of a chunk consisting of more than two tokens.
  S - current token is a chunk consisting of only one token.
  O - current token is outside any chunk.
  All five schemes are illustrated in the sketch below.
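
As an illustration of the five schemes above, this sketch tags a toy sentence containing two adjacent NP chunks, which is exactly where IOB1 and IOE1 differ from IOB2 and IOE2; the function name and span format are my own, not from the paper.

```python
def tag_chunks(tokens, chunks, scheme="IOB2"):
    """Tag tokens given chunk spans [(start, end, type), ...], end exclusive.

    Supports the IOB1, IOB2, IOE1, IOE2, and IOBES schemes.
    """
    tags = ["O"] * len(tokens)
    spans = sorted(chunks)
    for idx, (s, e, typ) in enumerate(spans):
        prev_adjacent = idx > 0 and spans[idx - 1][1] == s   # chunk ends right before this one
        next_adjacent = idx + 1 < len(spans) and spans[idx + 1][0] == e
        for i in range(s, e):
            tags[i] = "I-" + typ
        if scheme == "IOB2":
            tags[s] = "B-" + typ
        elif scheme == "IOB1" and prev_adjacent:
            tags[s] = "B-" + typ
        elif scheme == "IOE2":
            tags[e - 1] = "E-" + typ
        elif scheme == "IOE1" and next_adjacent:
            tags[e - 1] = "E-" + typ
        elif scheme == "IOBES":
            if e - s == 1:
                tags[s] = "S-" + typ
            else:
                tags[s] = "B-" + typ
                tags[e - 1] = "E-" + typ
    return tags

# "Hong Kong" and "Monday" are adjacent NP chunks.
tokens = ["In", "early", "trading", "in", "Hong", "Kong", "Monday", ","]
chunks = [(1, 3, "NP"), (4, 6, "NP"), (6, 7, "NP")]
for scheme in ("IOB1", "IOB2", "IOE1", "IOE2", "IOBES"):
    print(scheme, tag_chunks(tokens, chunks, scheme))
```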

  5. Chunking with SVMs. One vs. all others: requires K SVMs for K classes. Pairwise classification: K(K-1)/2 SVMs for K classes. Pairwise classification is used in this paper because it performs better and keeps the training data for each individual SVM small enough to be tractable (see the sketch below).
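
A minimal sketch of pairwise (one-vs-one) classification, assuming scikit-learn for the underlying binary SVMs; each of the K(K-1)/2 classifiers sees only the training samples of its two classes, which is what keeps per-SVM training tractable. Note that sklearn's SVC already performs one-vs-one internally; this spells the mechanics out.

```python
from itertools import combinations
import numpy as np
from sklearn.svm import SVC

def train_pairwise(X, y, degree=2):
    """Train K(K-1)/2 binary SVMs, one per pair of chunk-tag classes."""
    X, y = np.asarray(X), np.asarray(y)
    models = {}
    for a, b in combinations(sorted(set(y)), 2):
        mask = (y == a) | (y == b)
        # coef0=1.0 mirrors the (x . x' + 1)^d kernel (sklearn also
        # applies a gamma scaling to the dot product).
        clf = SVC(kernel="poly", degree=degree, coef0=1.0)
        clf.fit(X[mask], y[mask])
        models[(a, b)] = clf
    return models

def predict_pairwise(models, x):
    """Each pairwise SVM votes; the tag with the most votes wins."""
    votes = {}
    for clf in models.values():
        label = clf.predict(np.asarray(x).reshape(1, -1))[0]
        votes[label] = votes.get(label, 0) + 1
    return max(votes, key=votes.get)
```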

  6. Feature Design. Features are drawn from the surrounding context of each token. Parsing proceeds in one of two directions: forward parsing or backward parsing (see the sketch below).
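
A sketch of context-window feature extraction, under the assumptions that the window spans a few tokens on each side and that, when parsing forward, the chunk tags already assigned to preceding tokens are available as features; for backward parsing, run over the reversed sequence so the already-decided tags lie to the right. The window size and feature naming are illustrative.

```python
def token_features(tokens, pos_tags, chunk_tags, i, window=2):
    """Features for position i: words and POS tags within +/-window,
    plus chunk tags already decided for preceding tokens (forward
    parsing; reverse the sequence for backward parsing)."""
    feats = []
    for off in range(-window, window + 1):
        j = i + off
        if 0 <= j < len(tokens):
            feats.append("w[%d]=%s" % (off, tokens[j]))
            feats.append("pos[%d]=%s" % (off, pos_tags[j]))
    for off in range(-window, 0):  # chunk tags exist only to the left
        j = i + off
        if j >= 0:
            feats.append("tag[%d]=%s" % (off, chunk_tags[j]))
    return feats

tokens = ["He", "reckons", "the", "deficit"]
pos = ["PRP", "VBZ", "DT", "NN"]
decided = ["B-NP", "B-VP"]  # tags assigned so far while parsing forward
print(token_features(tokens, pos, decided, i=2))
```

These string features would then be one-hot encoded into the binary vectors that the polynomial kernel expands.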

  7. Weighted Voting. Different combinations of chunk representation and parsing direction yield multiple systems of SVMs, each containing K(K-1)/2 pairwise classifiers, so their outputs are combined by weighted voting. (1) Uniform weights: every system votes equally. (2) Cross validation: the final voting weight of each system is the average of its N accuracies under N-fold cross validation (see the sketch below).
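
A sketch of cross-validation-weighted voting, assuming scikit-learn's cross_val_score for the fold accuracies; the tag sequences are assumed to have been converted to one uniform representation first.

```python
import numpy as np
from sklearn.model_selection import cross_val_score

def cv_weight(clf, X, y, n_folds=5):
    """Voting weight = mean accuracy over N-fold cross validation."""
    return cross_val_score(clf, X, y, cv=n_folds).mean()

def weighted_vote(system_tags, weights):
    """system_tags: one tag sequence per system, all in the same
    uniform representation; weights: one voting weight per system."""
    result = []
    for i in range(len(system_tags[0])):
        scores = {}
        for tags, w in zip(system_tags, weights):
            scores[tags[i]] = scores.get(tags[i], 0.0) + w
        result.append(max(scores, key=scores.get))
    return result

# e.g. three hypothetical systems voting on four tokens
tags = [["B-NP", "I-NP", "O", "O"],
        ["B-NP", "I-NP", "I-NP", "O"],
        ["B-NP", "O", "O", "O"]]
print(weighted_vote(tags, [0.93, 0.95, 0.90]))  # ['B-NP', 'I-NP', 'O', 'O']
```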

  8. Weighted Voting. (3) VC bound: the voting weight is w = 1 - (VC bound), where the VC bound on the expected test error is computed from the empirical error and an estimate of the VC dimension, and D denotes the maximum distance from the origin to a training sample, used in that estimate.
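
The slide's formula appears to have been an image; what survives is w = 1 - (VC bound) and the definition of D. Below is a hedged LaTeX reconstruction using Vapnik's standard bound; the specific VC-dimension estimate is an assumption based on the usual sphere/margin argument, not text recovered from the slide.

```latex
% Slide 8: voting weight from the VC bound. With probability 1 - \eta,
\[
  w = 1 - (\text{VC bound}), \qquad
  R(\alpha) \le R_{\mathrm{emp}}(\alpha)
    + \sqrt{\frac{h\left(\ln\frac{2l}{h} + 1\right) - \ln\frac{\eta}{4}}{l}}
\]
% l = number of training samples, h = VC dimension, estimated
% (assumption: Vapnik's sphere/margin estimate) as
\[
  h_{\mathrm{est}} = \min\left(D^{2}\,\lVert \mathbf{w} \rVert^{2},\; l\right) + 1
\]
% where D is the maximum distance from the origin to a training sample
% and \mathbf{w} is the SVM weight vector (distinct from the voting weight w).
```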

  9. Weighted Voting. (4) Leave-one-out bound: the voting weight is w = 1 - E_l, where E_l is the leave-one-out error bound of the trained SVM.
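
Again only w = 1 - E_l survives on the slide; the standard SVM leave-one-out bound, which E_l presumably denotes, is the fraction of training samples that end up as support vectors:

```latex
% Slide 9: voting weight from the leave-one-out bound.
\[
  w = 1 - E_{l}, \qquad
  E_{l} = \frac{\#\,\text{support vectors}}{\#\,\text{training samples}}
\]
```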

  10. Experiment Setting. 2 parsing directions for each of 4 chunk representations give 8 systems of SVMs in total. Each system's results are converted to each of the 4 chunk representations, so that all systems are comparable in a uniform representation for weighted voting (span-level conversion is sketched below). With 4 uniform representations and 4 types of voting weights, a given dataset yields 16 results from the 8 systems of SVMs.
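
Converting between representations is easiest at the span level: parse each system's tags back into chunk spans, then re-emit the spans in the uniform scheme (tag_chunks from the sketch after slide 4 can do the re-emission). The parser below handles well-formed IOB2 and is illustrative only; one such parser per scheme makes all 8 systems comparable.

```python
def spans_from_iob2(tags):
    """Recover chunk spans (start, end, type), end exclusive, from IOB2."""
    spans, start, typ = [], None, None
    for i, t in enumerate(list(tags) + ["O"]):  # sentinel flushes the last chunk
        if start is not None and not t.startswith("I-"):
            spans.append((start, i, typ))
            start = None
        if t.startswith("B-"):
            start, typ = i, t[2:]
    return spans

iob2 = ["B-NP", "I-NP", "O", "B-NP", "I-NP", "B-NP", "O"]
print(spans_from_iob2(iob2))  # [(0, 2, 'NP'), (3, 5, 'NP'), (5, 6, 'NP')]
```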

  11. Experiment Results. Accuracy measure: F_{beta=1}, the harmonic mean of chunk-level precision and recall (sketched below). baseNP-L: some experiments are omitted because the dataset is too large. Accuracy vs. chunk representation: SVMs perform well regardless of the chunk representation, since they generalize well and can effectively select the optimal features for the given task.
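
Assuming the slide's omitted accuracy measure is the CoNLL-style F_{beta=1} (the standard metric for this task), a chunk-level sketch:

```python
def f_beta1(gold_spans, pred_spans):
    """Chunk-level F(beta=1): harmonic mean of precision and recall,
    counting a chunk as correct only on exact span-and-type match."""
    gold, pred = set(gold_spans), set(pred_spans)
    correct = len(gold & pred)
    precision = correct / len(pred) if pred else 0.0
    recall = correct / len(gold) if gold else 0.0
    if precision + recall == 0.0:
        return 0.0
    return 2 * precision * recall / (precision + recall)

print(f_beta1([(0, 2, "NP"), (3, 5, "NP")],
              [(0, 2, "NP"), (3, 6, "NP")]))  # 0.5
```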

  12. Experiment Results. Effects of weighted voting: weighted voting achieves higher accuracy than any single-representation system, regardless of the voting weights. The VC bound can predict the error rate on the true test data accurately: its performance is nearly the same as cross validation, while the leave-one-out bound performs worse.
