A Unified Approach on Active Learning Dual Supervision

Full metadata record

DC Field	Value	Language
dc.contributor.author	Chriswanto, Adrian	en_US
dc.contributor.author	Pao, Hsing-Kuo	en_US
dc.contributor.author	Lee, Yuh-Jye	en_US
dc.date.accessioned	2020-07-01T05:21:49Z	-
dc.date.available	2020-07-01T05:21:49Z	-
dc.date.issued	2019-01-01	en_US
dc.identifier.isbn	978-1-7281-1985-4	en_US
dc.identifier.issn	2161-4393	en_US
dc.identifier.uri	http://hdl.handle.net/11536/154486	-
dc.description.abstract	Active Learning (AL) is a machine learning framework that aims to efficiently select a limited amount of data to label in order to construct an effective model, given a large pool of unlabeled data. Most studies in AL focus on how to select the unlabeled data to be labeled by a human oracle so as to maximize the performance gain of the model with as little labeling effort as possible. In this work, however, we focus not only on how to select appropriate data instances but also on how to select informative features (more specifically, categorical features) to be labeled by the oracle in a unified manner. Unification means that, on each iteration, we select the best item to label, where the item can be either a feature or an instance, using a single scoring function to make the decision. We propose synthesizing new instances that each represent a set of features. By using synthesized instances, we can treat a set of features as if it were a regular instance, so features and instances can be compared on equal footing when the model selects which items are to be labeled by the oracle. The features used to build the synthesized instances must be selected carefully so that the resulting synthesized instances improve the model without introducing contradictory information. We use hierarchical clustering to group features that share similar content. We first pick clusters whose estimated label purity is high. We then score each feature based on how common it is in the cluster and how strongly it is related to the estimated majority label. The top-scoring features are then used to synthesize instances. We demonstrate the effectiveness of the proposed method on several data sets that consist only of categorical features, where feature labeling makes more sense to labeling oracles. The experimental results show that the unified approach clearly benefits model construction, especially in the early stage, where an effective model can be obtained within only a few iterations, compared with using instance labeling alone.	en_US
dc.language.iso	en_US	en_US
dc.subject	active learning	en_US
dc.subject	categorical features	en_US
dc.subject	feature labeling	en_US
dc.subject	hierarchical clustering	en_US
dc.title	A Unified Approach on Active Learning Dual Supervision	en_US
dc.type	Proceedings Paper	en_US
dc.identifier.journal	2019 INTERNATIONAL JOINT CONFERENCE ON NEURAL NETWORKS (IJCNN)	en_US
dc.citation.spage	0	en_US
dc.citation.epage	0	en_US
dc.contributor.department	應用數學系	zh_TW
dc.contributor.department	Department of Applied Mathematics	en_US
dc.identifier.wosnumber	WOS:000530893800025	en_US
dc.citation.woscount	0	en_US
Appears in Collections:	Conferences Paper
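
The selection loop described in the abstract can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the record gives no formulas, so the feature score (commonness in the cluster multiplied by agreement with the estimated majority label), the uncertainty measure, and all function names (`feature_scores`, `synthesize_instance`, `select_item`) are hypothetical. Categorical features are assumed to be one-hot encoded, and the clusters are assumed to come from a hierarchical clustering step that is not shown.

```python
from typing import Callable, Dict, Iterable, List, Sequence, Tuple

Vector = List[int]  # assumed one-hot encoding of categorical features


def feature_scores(cluster: Sequence[Vector], all_rows: Sequence[Vector],
                   est_labels: Sequence[int], majority: int) -> Dict[int, float]:
    """Hypothetical score: (share of cluster rows that have the feature)
    times (share of rows having the feature whose estimated label is `majority`)."""
    scores: Dict[int, float] = {}
    for j in range(len(all_rows[0])):
        commonness = sum(row[j] for row in cluster) / len(cluster)
        holders = [y for row, y in zip(all_rows, est_labels) if row[j]]
        relatedness = (sum(1 for y in holders if y == majority) / len(holders)
                       if holders else 0.0)
        scores[j] = commonness * relatedness
    return scores


def synthesize_instance(feature_ids: Iterable[int], n_features: int) -> Vector:
    """Build a pseudo-instance in which only the chosen features are active."""
    chosen = set(feature_ids)
    return [1 if j in chosen else 0 for j in range(n_features)]


def uncertainty(p: float) -> float:
    """Assumed unified score: 1.0 at p = 0.5, falling to 0.0 at p = 0 or p = 1."""
    return 1.0 - abs(2.0 * p - 1.0)


def select_item(candidates: Sequence[Tuple[str, Vector]],
                predict_proba: Callable[[Vector], float]) -> Tuple[str, Vector]:
    """Pick the next item to label; real and synthesized instances share one score."""
    return max(candidates, key=lambda item: uncertainty(predict_proba(item[1])))


# Toy data: four rows, four one-hot features, labels estimated by the current model.
rows = [[1, 1, 0, 0], [1, 1, 0, 0], [1, 0, 1, 0], [0, 0, 1, 1]]
est_labels = [1, 1, 1, 0]
cluster = rows[:3]  # assumed to be one high-purity cluster with majority label 1

scores = feature_scores(cluster, rows, est_labels, majority=1)
top_features = sorted(scores, key=scores.get, reverse=True)[:2]
synthetic = synthesize_instance(top_features, n_features=4)

# Stand-in for a trained model's positive-class probability (an assumption).
toy_model = lambda x: 0.5 if x == synthetic else 0.9
chosen = select_item([("instance", rows[3]), ("feature_set", synthetic)], toy_model)
```

Because the synthesized vector lives in the same space as ordinary instances, a single `max` over one score is enough to decide between asking the oracle about an instance or about a set of features on each iteration.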