TY - JOUR
T1 - An efficient unsupervised feature selection procedure through feature clustering
AU - Yan, Xuyang
AU - Nazmi, Shabnam
AU - Erol, Berat A.
AU - Homaifar, Abdollah
AU - Gebru, Biniam
AU - Tunstel, Edward
N1 - Publisher Copyright: © 2019
PY - 2020/3
Y1 - 2020/3
N2 - Due to the scarcity of readily available labels, unsupervised feature selection (UFS) methods are widely adopted in the analysis of high-dimensional data. However, most of the existing UFS methods primarily focus on the significance of features in maintaining the data structure while ignoring the redundancy among features. Moreover, the determination of the proper number of features is another challenge. In this paper, an efficient unsupervised feature selection method through feature clustering (EUFSFC) is proposed to address the redundancy among features, and to determine the size of the final feature subset. The proposed methodology is comprised of two steps: (a) feature cluster analysis, and (b) the selection of the representative features. An extended density-based clustering algorithm is proposed to separate features into an appropriate number of disjoint clusters with no requirement for predefined cluster numbers or radii. The selection of features is performed by choosing the most representative features from those feature clusters. Experiments are conducted to show the effectiveness of the proposed feature selection method.
KW - Feature clustering
KW - Feature redundancy
KW - Unsupervised feature selection
UR - https://www.scopus.com/pages/publications/85078564023
U2 - 10.1016/j.patrec.2019.12.022
DO - 10.1016/j.patrec.2019.12.022
M3 - Article
SN - 0167-8655
VL - 131
SP - 277
EP - 284
JO - Pattern Recognition Letters
JF - Pattern Recognition Letters
ER -