Active Learning in Machine Learning

Slide Note

Active Learning (AL) is a subset of machine learning in which a learning algorithm interactively queries a user to label data with the desired outputs. It aims to ease the labeling bottleneck by reaching high accuracy with as few labeled instances as possible, which reduces the cost of obtaining labeled data. AL is usually set up in one of three scenarios (membership query synthesis, stream-based selective sampling, or pool-based sampling) and relies on query strategies such as Uncertainty Sampling and Query-By-Committee (QBC).

yara

Uploaded on Jul 16, 2024 | 0 Views


Presentation Transcript

Active Learning (AL)

Machine Learning: Supervised Learning (labeled data); Unsupervised Learning (unlabeled data); Semi-supervised Learning (labeled and unlabeled data)

What is AL? Active learning (AL) is the subset of machine learning in which a learning algorithm can interactively query a user to label data with the desired outputs. In the statistics literature, it is sometimes also called query learning or optimal experimental design. The information source is also called the teacher or oracle.

Why AL? It attempts to overcome the labeling bottleneck by asking queries. The goal is to achieve high accuracy using as few labeled instances as possible, which minimizes the cost of obtaining labeled data.

How AL? Three Scenarios: 1. Membership query synthesis 2. Stream-based selective sampling 3. Pool-based sampling

Membership Query Synthesis: the learner constructs its own query instance (for example, by sampling one at random from the input space) and asks the oracle to label it.

Stream-based Selective Sampling: unlabeled instances arrive one at a time, and for each one the learner decides whether or not to request its label.
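The per-instance decision in stream-based selective sampling can be sketched as a simple uncertainty threshold. This is only an illustration under stated assumptions: the function name should_query, the threshold of 0.2, and the stream of predicted probabilities are all hypothetical, and a real learner would also retrain after each newly labeled instance.

```python
def should_query(p_positive, threshold=0.2):
    """Request a label when the predicted probability is within
    `threshold` of 0.5, i.e. when the model is unsure.
    The threshold value is an assumption chosen for illustration."""
    return abs(p_positive - 0.5) < threshold


# Hypothetical model outputs P(y = 1 | x) for instances arriving in order.
stream = [0.95, 0.55, 0.10, 0.48, 0.80]

# Only the instances the model is unsure about are sent to the oracle.
queried = [p for p in stream if should_query(p)]
print(queried)
```

A lower threshold spends fewer labels but risks skipping informative instances; a higher one approaches labeling everything.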

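Pool-based Sampling: a massive pool of unlabeled data is available, as is common in the real world, and the learner selects which instances from the pool to query.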


The Core of AL: Query Strategy Framework. Uncertainty Sampling; Query-By-Committee (QBC); Expected Model Change; Expected Error Reduction; Variance Reduction; Density-Weighted Methods

Uncertainty Sampling: 1. Least Confident 2. Margin Sampling 3. Entropy
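The slide's formulas did not survive in the transcript, so the following is a minimal sketch of the three measures in their commonly used form, not a reconstruction of the slide. The pool of predicted class distributions is hypothetical.

```python
import math


def least_confident(probs):
    """1 minus the probability of the most likely class; higher = more uncertain."""
    return 1.0 - max(probs)


def margin(probs):
    """Gap between the two most likely classes; smaller = more uncertain."""
    top, second = sorted(probs, reverse=True)[:2]
    return top - second


def entropy(probs):
    """Shannon entropy in nats; higher = more uncertain."""
    return -sum(p * math.log(p) for p in probs if p > 0.0)


# Hypothetical predicted class distributions for three unlabeled instances.
pool = {
    "x1": [0.90, 0.05, 0.05],
    "x2": [0.50, 0.45, 0.05],
    "x3": [0.40, 0.30, 0.30],
}

# Each strategy picks the instance it considers most uncertain.
query_lc = max(pool, key=lambda k: least_confident(pool[k]))
query_margin = min(pool, key=lambda k: margin(pool[k]))
query_entropy = max(pool, key=lambda k: entropy(pool[k]))
print(query_lc, query_margin, query_entropy)
```

On this pool, least confident and entropy both select x3, while margin sampling selects x2, which shows that the three measures can disagree when there are more than two classes.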

Query-By-Committee (QBC): Vote Entropy; Average Kullback-Leibler (KL) Divergence
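As a rough sketch of the two disagreement measures, assuming their usual definitions: vote entropy is the entropy of the committee's hard votes, and average KL divergence compares each member's predicted distribution to the committee mean (the "consensus"). The function names and example inputs are illustrative only.

```python
import math
from collections import Counter


def vote_entropy(votes):
    """Entropy of the committee's hard votes for one instance.
    `votes` holds one predicted label per committee member."""
    c = len(votes)
    return -sum((n / c) * math.log(n / c) for n in Counter(votes).values())


def average_kl(member_probs):
    """Mean KL divergence of each member's distribution from the consensus,
    where the consensus is the average of the members' distributions."""
    c = len(member_probs)
    n_classes = len(member_probs[0])
    consensus = [sum(p[i] for p in member_probs) / c for i in range(n_classes)]
    total = 0.0
    for p in member_probs:
        total += sum(
            p[i] * math.log(p[i] / consensus[i])
            for i in range(n_classes)
            if p[i] > 0.0
        )
    return total / c


# Hypothetical committees: one split on the label, one in full agreement.
print(vote_entropy(["a", "b", "a"]), vote_entropy(["a", "a", "a"]))
print(average_kl([[0.9, 0.1], [0.1, 0.9]]), average_kl([[0.7, 0.3], [0.7, 0.3]]))
```

Under either measure, the instance the committee disagrees on most is the one to send to the oracle.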

Density-Weighted Methods (figure with two labeled points, A and B)
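One common density-weighted approach scales an instance's uncertainty by how representative it is of the pool, so that an uncertain outlier is not preferred over an uncertain instance in a dense region. The sketch below assumes a Gaussian similarity measure, a weighting exponent beta, and made-up points and uncertainty values; none of these come from the slide.

```python
import math


def similarity(a, b):
    """Gaussian similarity between two points (an assumed choice of measure)."""
    return math.exp(-sum((ai - bi) ** 2 for ai, bi in zip(a, b)))


def information_density(index, points, uncertainty, beta=1.0):
    """Uncertainty of one point, weighted by its average similarity to the pool.
    The average includes the point's similarity to itself."""
    avg_sim = sum(similarity(points[index], other) for other in points) / len(points)
    return uncertainty[index] * avg_sim ** beta


# Hypothetical pool: four points in a tight cluster and one distant outlier.
points = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.1), (0.1, 0.1), (3.0, 3.0)]
# Hypothetical uncertainty scores; the outlier is the most uncertain on its own.
uncertainty = [0.6, 0.6, 0.6, 0.6, 0.9]

scores = [information_density(i, points, uncertainty) for i in range(len(points))]
best = max(range(len(points)), key=scores.__getitem__)
print(best, scores)
```

Plain uncertainty sampling would query the outlier here; after density weighting, a point from the cluster scores higher.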

Experiment: logistic regression on 400 instances sampled from two class Gaussians. With 30 actively queried instances: accuracy 90%. With 30 randomly labeled instances: accuracy 70%.
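A comparable setup can be sketched in plain Python. This is not the code behind the slide's numbers and makes no claim to reproduce them: the Gaussian centers, the seeds, the learning rate, seeding each run with one labeled instance per class, and measuring accuracy on all 400 instances are all assumptions made for illustration.

```python
import math
import random


def sigmoid(z):
    z = max(-30.0, min(30.0, z))  # clamp to avoid overflow in math.exp
    return 1.0 / (1.0 + math.exp(-z))


def train(X, y, epochs=200, lr=0.5):
    """Fit logistic regression weights [w1, w2, bias] by batch gradient descent."""
    w = [0.0, 0.0, 0.0]
    for _ in range(epochs):
        grad = [0.0, 0.0, 0.0]
        for (x1, x2), label in zip(X, y):
            err = sigmoid(w[0] * x1 + w[1] * x2 + w[2]) - label
            grad[0] += err * x1
            grad[1] += err * x2
            grad[2] += err
        w = [wi - lr * gi / len(X) for wi, gi in zip(w, grad)]
    return w


def predict_proba(w, x):
    return sigmoid(w[0] * x[0] + w[1] * x[1] + w[2])


def accuracy(w, X, y):
    hits = sum((predict_proba(w, x) >= 0.5) == (label == 1) for x, label in zip(X, y))
    return hits / len(X)


def make_data(rng, n_per_class=200):
    """400 two-dimensional points from two class Gaussians (assumed centers)."""
    X, y = [], []
    for label, center in ((0, -1.0), (1, 1.0)):
        for _ in range(n_per_class):
            X.append((rng.gauss(center, 1.0), rng.gauss(center, 1.0)))
            y.append(label)
    return X, y


def run(strategy, X, y, rng, budget=30):
    """Label `budget` instances, chosen actively or at random, then evaluate."""
    # Assumption: start from one labeled instance per class.
    labeled = [
        rng.choice([i for i, l in enumerate(y) if l == 0]),
        rng.choice([i for i, l in enumerate(y) if l == 1]),
    ]
    while len(labeled) < budget:
        unlabeled = [i for i in range(len(X)) if i not in labeled]
        if strategy == "active":
            # Uncertainty sampling: query the instance closest to P = 0.5.
            w = train([X[i] for i in labeled], [y[i] for i in labeled])
            pick = min(unlabeled, key=lambda i: abs(predict_proba(w, X[i]) - 0.5))
        else:
            pick = rng.choice(unlabeled)
        labeled.append(pick)
    w = train([X[i] for i in labeled], [y[i] for i in labeled])
    return accuracy(w, X, y), len(labeled)


X, y = make_data(random.Random(0))
acc_active, n_active = run("active", X, y, random.Random(1))
acc_random, n_random = run("random", X, y, random.Random(1))
print(f"active: {acc_active:.2f} with {n_active} labels")
print(f"random: {acc_random:.2f} with {n_random} labels")
```

Results depend heavily on the data and seeds, so a fair comparison would average over many runs rather than rely on a single one.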

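Differences from Semi-Supervised Learning: 1. Definition: active learning actively chooses which data to label. 2. Processing: active learning brings in extra expert knowledge by sending the instances the model is most likely to misjudge to the oracle; semi-supervised learning instead selects the instances the model is least likely to misjudge.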

Applications: Recommender Systems; NLP; Text classification; Image classification
