Feature selection methods fall into three categories:
filter, wrapper, and embedded
Filter methods
Wrapper methods
Embedded methods
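To make the difference between the first two categories concrete, here is a minimal sketch in plain Python. It assumes a toy dataset, an absolute Pearson correlation as the filter score, and a nearest-centroid classifier inside the wrapper; none of these choices come from the survey, and the function names are hypothetical. The filter scores each feature without training a model, while the wrapper repeatedly trains and evaluates a model on candidate subsets.

```python
# Toy contrast between a filter score and a wrapper search.
# All names and data here are illustrative assumptions, not from the survey.
from statistics import mean, pstdev


def abs_pearson(xs, ys):
    """Absolute Pearson correlation between one feature column and the labels."""
    mx, my = mean(xs), mean(ys)
    sx, sy = pstdev(xs), pstdev(ys)
    if sx == 0 or sy == 0:
        return 0.0  # a constant column carries no information
    cov = mean((x - mx) * (y - my) for x, y in zip(xs, ys))
    return abs(cov / (sx * sy))


def filter_rank(X, y):
    """Filter: score each feature on its own, without training any model."""
    n_features = len(X[0])
    scores = [abs_pearson([row[j] for row in X], y) for j in range(n_features)]
    return sorted(range(n_features), key=lambda j: scores[j], reverse=True)


def centroid_accuracy(X, y, features):
    """Training accuracy of a nearest-centroid classifier on the chosen features."""
    centroids = {}
    for label in set(y):
        rows = [row for row, lab in zip(X, y) if lab == label]
        centroids[label] = [mean(r[j] for r in rows) for j in features]
    correct = 0
    for row, lab in zip(X, y):
        point = [row[j] for j in features]
        pred = min(
            centroids,
            key=lambda c: sum((p - q) ** 2 for p, q in zip(point, centroids[c])),
        )
        correct += pred == lab
    return correct / len(y)


def wrapper_forward(X, y, k):
    """Wrapper: greedily add the feature that most improves the classifier."""
    selected = []
    remaining = list(range(len(X[0])))
    while remaining and len(selected) < k:
        best = max(remaining, key=lambda j: centroid_accuracy(X, y, selected + [j]))
        selected.append(best)
        remaining.remove(best)
    return selected


# Feature 0 separates the classes; feature 1 is noise; feature 2 is constant.
X = [[0.1, 5.0, 1.0], [0.2, 1.0, 1.0], [0.0, 3.0, 1.0],
     [0.9, 4.0, 1.0], [1.0, 2.0, 1.0], [0.8, 3.0, 1.0]]
y = [0, 0, 0, 1, 1, 1]

print(filter_rank(X, y)[0])      # index of the top-ranked feature
print(wrapper_forward(X, y, 1))  # feature subset chosen by the wrapper
```

An embedded method would instead perform the selection inside model training itself (for example through a sparsity-inducing penalty), which is why it is not shown as a separate search loop here.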
By the availability of class label information, they also fall into three categories:
supervised feature selection, unsupervised feature selection, and semi-supervised feature selection
Semi-supervised feature selection uses both labeled and unlabeled data to select features.
This paper presents a comprehensive survey of semi-supervised feature selection methods, categorizes the methods from two different perspectives, summarizes them with specific details, and describes their advantages and disadvantages.
The second taxonomy follows the taxonomy of semi-supervised learning methods, which divides semi-supervised feature selection methods into five categories:
graph-based semi-supervised feature selection,
self-training based semi-supervised feature selection,
co-training based semi-supervised feature selection,
support vector machine based semi-supervised feature selection,
and other semi-supervised feature selection methods.
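As a rough illustration of the graph-based category, the sketch below computes the Laplacian score on a k-nearest-neighbor graph built from all samples, whether labeled or not. Note the assumptions: the Laplacian score itself is an unsupervised criterion and is shown only as a building block that graph-based methods typically start from, not as any specific method from the survey. The values of `k` and `t`, the toy data, and the function names are hypothetical.

```python
# Laplacian score on a kNN graph built from all samples, labeled or not.
# Pure-Python sketch for tiny data; k, t, and the toy data are assumptions.
import math


def knn_heat_graph(X, k=2, t=1.0):
    """Symmetric kNN graph with heat-kernel weights exp(-||xi - xj||^2 / t)."""
    n = len(X)
    sq = [[sum((a - b) ** 2 for a, b in zip(X[i], X[j])) for j in range(n)]
          for i in range(n)]
    S = [[0.0] * n for _ in range(n)]
    for i in range(n):
        neighbors = sorted((j for j in range(n) if j != i),
                           key=lambda j: sq[i][j])[:k]
        for j in neighbors:
            S[i][j] = S[j][i] = math.exp(-sq[i][j] / t)
    return S


def laplacian_scores(X, k=2, t=1.0):
    """One score per feature; smaller means the feature varies smoothly on the graph."""
    n, m = len(X), len(X[0])
    S = knn_heat_graph(X, k, t)
    deg = [sum(row) for row in S]
    scores = []
    for r in range(m):
        f = [row[r] for row in X]
        # Remove the degree-weighted mean so constant features do not score well.
        mu = sum(d * v for d, v in zip(deg, f)) / sum(deg)
        f = [v - mu for v in f]
        # f^T L f with L = D - S equals half the weighted sum of squared differences.
        num = 0.5 * sum(S[i][j] * (f[i] - f[j]) ** 2
                        for i in range(n) for j in range(n))
        den = sum(d * v * v for d, v in zip(deg, f))
        scores.append(num / den if den > 0 else float("inf"))
    return scores


# Feature 0 follows the two-cluster structure; feature 1 alternates within clusters.
X = [[0.0, 0.0], [0.1, 1.0], [0.2, 0.0],
     [5.0, 1.0], [5.1, 0.0], [5.2, 1.0]]
scores = laplacian_scores(X)
print(scores)  # the first score should be much smaller than the second
```

A semi-supervised variant would additionally use the available labels, for instance when weighting edges or scoring features; that step is left out here because its exact form differs from method to method.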
Semi-supervised learning
Semi-supervised learning learns from a small amount of labeled data and a large amount of unlabeled data.
In semi-supervised learning, certain smoothness assumptions, such as the cluster assumption and the manifold assumption, must hold.
Generative models
Self-training
Co-training
Semi-supervised support vector machines (originally called Transductive Support Vector Machines)
Graph-based methods
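Of the families listed above, self-training is the easiest to sketch: train on the labeled data, pseudo-label the unlabeled points the model is most confident about, add them to the training set, and repeat. The example below is a minimal illustration under stated assumptions: a one-dimensional nearest-centroid model, a confidence measure defined as the gap between the two nearest centroids, and a hypothetical `margin` threshold. It needs at least two classes.

```python
# Toy self-training loop; the 1-D nearest-centroid "classifier", the margin
# threshold, and the data are illustrative assumptions.
def centroids_of(points, labels):
    """Mean of the 1-D points for each class label."""
    out = {}
    for label in set(labels):
        members = [p for p, lab in zip(points, labels) if lab == label]
        out[label] = sum(members) / len(members)
    return out


def self_train(labeled, labels, unlabeled, margin=0.5, max_rounds=10):
    """Repeatedly pseudo-label the unlabeled points the model is confident about."""
    labeled, labels, pool = list(labeled), list(labels), list(unlabeled)
    for _ in range(max_rounds):
        cents = centroids_of(labeled, labels)
        confident, rest = [], []
        for p in pool:
            dists = sorted((abs(p - c), lab) for lab, c in cents.items())
            # Confidence = gap between the nearest and second-nearest centroid.
            if dists[1][0] - dists[0][0] >= margin:
                confident.append((p, dists[0][1]))
            else:
                rest.append(p)
        if not confident:
            break  # nothing new to add; stop early
        for p, lab in confident:
            labeled.append(p)
            labels.append(lab)
        pool = rest
    return centroids_of(labeled, labels), pool


final_centroids, still_unlabeled = self_train(
    labeled=[0.0, 10.0], labels=["a", "b"],
    unlabeled=[1.0, 2.0, 8.0, 9.0, 5.0],
)
print(final_centroids, still_unlabeled)
```

The point at 5.0 sits exactly between the two centroids, so it never passes the confidence threshold and stays unlabeled. This shows the usual trade-off in self-training: a stricter threshold adds fewer pseudo-labels but lowers the risk of reinforcing early mistakes.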
Semi-supervised feature selection
The methods are classified from two perspectives and summarized with specific details.
how they interact with the learning algorithm
This divides them into filter, wrapper, and embedded feature selection methods.
depending on what semi-supervised learning algorithm corresponds to the procedure used in the semi-supervised feature selection method.
According to this taxonomy and literature review,