Understanding Active Learning in Machine Learning


Active Learning (AL) is a subset of machine learning in which a learning algorithm interactively queries a user to label data with the desired outputs. It aims to ease the labeling bottleneck by reaching high accuracy with as few labeled instances as possible, reducing the cost of obtaining labeled data. The presentation covers three scenarios (membership query synthesis, stream-based selective sampling, and pool-based sampling) and query strategies such as Uncertainty Sampling and Query-By-Committee (QBC).


Uploaded on Jul 16, 2024 | 0 Views



Presentation Transcript


  1. Active Learning (AL)

  2. Machine Learning: Supervised Learning (labeled data); Unsupervised Learning (unlabeled data); Semi-supervised Learning (labeled and unlabeled data)

  3. What is AL? Active learning (AL) is a subset of machine learning in which a learning algorithm can interactively query a user to label data with the desired outputs. In the statistics literature, it is sometimes also called query learning or optimal experimental design. The information source is also called the teacher or oracle.

  4. Why AL? Active learning attempts to overcome the labeling bottleneck by asking queries. The goal is to achieve high accuracy using as few labeled instances as possible, minimizing the cost of obtaining labeled data.

  5. How AL? Three Scenarios: (1) membership query synthesis, (2) stream-based selective sampling, (3) pool-based sampling

  6. Membership Query Synthesis: the learner generates a query instance (on this slide, chosen at random) and asks the oracle to label it.

  7. Stream-based Selective Sampling: for each incoming instance, the learner decides whether or not to request its label.
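The decision described on this slide can be sketched as a simple filter over a data stream. This is only an illustration, not the presenter's method: the function name, the `predict_proba` callable (assumed to return P(y = 1 | x) for a binary model), and the `threshold` value are all hypothetical.

```python
def stream_selective_sampling(stream, predict_proba, threshold=0.2):
    """Yield the instances whose labels the learner would request.

    Assumptions (not from the slides): `predict_proba(x)` returns the
    model's probability of the positive class, and an instance counts as
    "worth querying" when that probability is within `threshold` of 0.5.
    """
    for x in stream:
        p = predict_proba(x)
        # Close to 0.5 means the model is unsure, so ask the oracle.
        if abs(p - 0.5) < threshold:
            yield x
```

Instances that fail the check are simply discarded, which is what distinguishes this scenario from pool-based sampling, where the whole pool is ranked before a query is chosen.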

  8. Pool-based Sampling: massive amounts of unlabeled data are available in the real world, and the learner chooses its queries from this pool.
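A minimal sketch of a pool-based loop follows, under stated assumptions: the 1-D threshold "model", the sigmoid scoring, the helper names, and the toy oracle are all hypothetical and chosen only to keep the example self-contained. The seed labels must contain at least one instance of each class.

```python
import math

def fit_threshold(labeled_points):
    """Toy 1-D model: the threshold sits halfway between the largest
    class-0 point and the smallest class-1 point (both must exist)."""
    zeros = [x for x, y in labeled_points if y == 0]
    ones = [x for x, y in labeled_points if y == 1]
    return (max(zeros) + min(ones)) / 2.0

def predict_proba(threshold, x):
    """Probability of class 1 under the toy model (a sigmoid)."""
    return 1.0 / (1.0 + math.exp(-(x - threshold)))

def pool_based_active_learning(pool, oracle, seed_labels, n_queries):
    """Repeatedly retrain, pick the most uncertain pool instance,
    and ask the oracle for its label."""
    labeled = dict(seed_labels)  # pool index -> label
    for _ in range(n_queries):
        unlabeled = [i for i in range(len(pool)) if i not in labeled]
        if not unlabeled:
            break
        threshold = fit_threshold([(pool[i], y) for i, y in labeled.items()])
        # Most uncertain = predicted probability closest to 0.5.
        query = min(unlabeled,
                    key=lambda i: abs(predict_proba(threshold, pool[i]) - 0.5))
        labeled[query] = oracle(pool[query])
    return labeled

# Hypothetical usage: the true boundary lies between 4 and 5.
pool = list(range(10))
oracle = lambda x: int(x >= 5)
labeled = pool_based_active_learning(pool, oracle, {0: 0, 9: 1}, n_queries=3)
```

In this toy setting the queries concentrate around the current decision boundary rather than being spread evenly over the pool.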

  9. How AL?

  10. How AL?

  11. The Core of AL: Query Strategy Frameworks. Uncertainty Sampling, Query-By-Committee (QBC), Expected Model Change, Expected Error Reduction, Variance Reduction, Density-Weighted Methods

  12. Uncertainty Sampling: (1) Least Confident, (2) Margin Sampling, (3) Entropy
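The three measures named on this slide have standard definitions over a model's predicted class probabilities. The sketch below uses those textbook forms; the function names are my own, and the slides' original formulas did not survive extraction, so this may differ in notation from what was shown.

```python
import math

def least_confident(probs):
    """1 - P(most likely class); larger means more uncertain."""
    return 1.0 - max(probs)

def margin(probs):
    """Gap between the two most likely classes; SMALLER means more uncertain."""
    top = sorted(probs, reverse=True)
    return top[0] - top[1]

def entropy(probs):
    """Shannon entropy (natural log); larger means more uncertain."""
    return -sum(p * math.log(p) for p in probs if p > 0)
```

A learner would query the instance that maximizes `least_confident` or `entropy`, or the one that minimizes `margin`. For binary problems all three select the same instance; they can differ when there are three or more classes.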

  13. Query-By-Committee (QBC): Vote Entropy; Average Kullback-Leibler (KL) Divergence
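Both disagreement measures can be sketched with the standard library alone. The definitions below are the usual ones (vote entropy over the committee's hard votes, and the mean KL divergence of each member's distribution from the committee average); the function names and input formats are assumptions made for illustration.

```python
import math
from collections import Counter

def vote_entropy(votes):
    """Entropy of the committee's hard votes for one instance.
    `votes` is a list with one predicted label per committee member."""
    c = len(votes)
    return -sum((n / c) * math.log(n / c) for n in Counter(votes).values())

def average_kl_divergence(member_probs):
    """Mean KL divergence of each member's class distribution from the
    consensus (the committee average). `member_probs` is a list of
    probability lists, one per member, all over the same classes."""
    c = len(member_probs)
    k = len(member_probs[0])
    consensus = [sum(p[j] for p in member_probs) / c for j in range(k)]

    def kl(p, q):
        # Terms with p_i == 0 contribute nothing; q_i > 0 whenever p_i > 0
        # because q is an average that includes p.
        return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)

    return sum(kl(p, consensus) for p in member_probs) / c
```

In both cases the learner queries the instance with the largest value, that is, the one on which the committee disagrees most.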

  14. Density-Weighted Methods [figure with labeled points A and B]
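One common density-weighted method is information density, which scales an instance's uncertainty by its average similarity to the rest of the pool, so that uncertain outliers are ranked below uncertain instances in dense regions. The slide does not say which variant it used, so the sketch below is an assumption; the function name and the `beta` parameter are illustrative.

```python
def information_density(uncertainty, similarities, beta=1.0):
    """Score = uncertainty * (average similarity to the pool) ** beta.

    `uncertainty` is any base informativeness score (for example entropy),
    `similarities` holds the instance's similarity to each pool instance,
    and `beta` controls how strongly density is weighted.
    """
    avg_similarity = sum(similarities) / len(similarities)
    return uncertainty * avg_similarity ** beta

# Hypothetical comparison: an uncertain outlier versus a slightly less
# uncertain instance that resembles many others in the pool.
outlier_score = information_density(0.69, [0.1, 0.1, 0.2])
dense_score = information_density(0.60, [0.8, 0.9, 0.7])
```

With these made-up numbers the dense-region instance scores higher even though its raw uncertainty is lower.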

  15. Experiment: logistic regression on 400 instances sampled from two class Gaussians. With 30 actively queried instances, accuracy is 90%; with 30 randomly labeled instances, accuracy is 70%.

  16. Differences with Semi-Supervised Learning: (1) Definition: active learning actively chooses which data to label. (2) Processing: active learning brings in extra expert knowledge and sends the instances the model is most likely to misjudge to the oracle; semi-supervised learning uses the instances the model is least likely to misjudge.

  17. Applications: recommender systems, NLP, text classification, image classification
