Tune Up Fuzzy C-Means for Big Data: Some Novel Hybrid Clustering Algorithms Based on Initial Selection and Incremental Clustering

Le, Hoang Son; Nguyen, Dang Tien

Full metadata record

DC Field	Value	Language
dc.contributor.author	Le, Hoang Son	-
dc.contributor.author	Nguyen, Dang Tien	-
dc.date.accessioned	2019-07-15T04:01:37Z	-
dc.date.available	2019-07-15T04:01:37Z	-
dc.date.issued	2017	-
dc.identifier.citation	Le, H. S., & Nguyen, D. T. (2017). Tune Up Fuzzy C-Means for Big Data: Some Novel Hybrid Clustering Algorithms Based on Initial Selection and Incremental Clustering. International Journal of Fuzzy Systems, 19(5), 1585-1602.	vi
dc.identifier.uri	http://repository.vnu.edu.vn/handle/VNU_123/64979	-
dc.description.abstract	Data are getting larger, and most of them are necessary for our businesses. The rapid explosion of data brings a number of challenges relating to its complexity and to how the most important knowledge can be captured in reasonable time. Fuzzy C-means (FCM), one of the most efficient clustering algorithms and widely used in pattern recognition, data compression, image segmentation, computer vision and many other fields, also faces the problem of processing large datasets. In this paper, we propose some novel hybrid clustering algorithms based on incremental clustering and initial selection to tune up FCM for the Big Data problem. The first algorithm takes rectangular meshes covering the data points as the representatives, while the second one takes data points that have high influence on others as the representatives. The representatives are then clustered by FCM, and the resulting centers are used as the initial centers for clustering the whole dataset. Theoretical analyses of the new algorithms are presented, including a comparison of solution quality when clustering the set of representatives versus the entire set. The experimental results on both simulated and real datasets show that the total computational time of the new methods, including the time for finding representatives and for clustering, is shorter than that of other relevant algorithms. Clustering quality is also validated. The findings of this paper have great impact and significance for research in the fields of soft computing and Big Data processing. Computing methodologies nowadays face huge amounts of diverse and complex data structures, and speed of processing is the main priority when considering the effectiveness of a specific method. The findings demonstrate practical algorithms and investigate characteristics that other researchers can reference in similar applications. The usefulness and significance of this research are clearly demonstrated within the extent of real-life applications.	vi
dc.language.iso	en	vi
dc.publisher	Taiwan Fuzzy Systems Association and Springer-Verlag Berlin Heidelberg	vi
dc.relation.ispartofseries	International Journal of Fuzzy Systems;	-
dc.subject	Big Data	vi
dc.subject	Hybrid Clustering Algorithms	vi
dc.subject	Initial Selection	vi
dc.subject	Incremental Clustering	vi
dc.title	Tune Up Fuzzy C-Means for Big Data: Some Novel Hybrid Clustering Algorithms Based on Initial Selection and Incremental Clustering	vi
dc.type	Article	vi
dc.identifier.doi	10.1007/s40815-016-0260-3	-
Appears in Collections:	Articles of VNU Hanoi (ĐHQGHN) in Scopus or Web of Science
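
The two-stage idea described in the abstract (pick representatives, cluster them with FCM, then reuse those centers to initialise FCM on the whole dataset) can be illustrated with a short sketch. This is not the authors' algorithm: the record does not give their mesh construction, so the function `grid_representatives` below is a hypothetical stand-in that simply takes the centroid of each occupied cell of a uniform grid, and the names `fcm`, `hybrid_fcm`, `bins` and the default fuzzifier `m=2.0` are assumptions chosen for illustration.

```python
import numpy as np


def fcm(X, c, m=2.0, init_centers=None, max_iter=100, tol=1e-5, seed=0):
    """Plain fuzzy C-means; returns (centers, memberships, iterations)."""
    rng = np.random.default_rng(seed)
    if init_centers is None:
        # Random initial centers drawn from the data (assumption).
        centers = X[rng.choice(len(X), size=c, replace=False)].astype(float)
    else:
        centers = np.asarray(init_centers, dtype=float)
    for it in range(1, max_iter + 1):
        # Squared Euclidean distances, shape (n_points, c).
        d2 = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        d2 = np.maximum(d2, 1e-12)  # avoid division by zero
        # Membership update: u_ik proportional to d_ik^(-2/(m-1)).
        inv = d2 ** (-1.0 / (m - 1.0))
        U = inv / inv.sum(axis=1, keepdims=True)
        # Center update: mean of the points weighted by u_ik^m.
        W = U ** m
        new_centers = (W.T @ X) / W.sum(axis=0)[:, None]
        shift = np.linalg.norm(new_centers - centers)
        centers = new_centers
        if shift < tol:
            break
    return centers, U, it


def grid_representatives(X, bins=10):
    """Hypothetical stand-in for the mesh step: one centroid per occupied cell."""
    lo, hi = X.min(axis=0), X.max(axis=0)
    span = np.where(hi > lo, hi - lo, 1.0)
    cells = np.minimum(((X - lo) / span * bins).astype(int), bins - 1)
    _, inverse = np.unique(cells, axis=0, return_inverse=True)
    inverse = inverse.ravel()
    k = inverse.max() + 1
    counts = np.bincount(inverse, minlength=k)
    sums = np.stack(
        [np.bincount(inverse, weights=X[:, j], minlength=k)
         for j in range(X.shape[1])],
        axis=1,
    )
    return sums / counts[:, None]


def hybrid_fcm(X, c, bins=10, m=2.0):
    """Cluster the representatives first, then seed FCM on the full data."""
    reps = grid_representatives(X, bins)
    rep_centers, _, _ = fcm(reps, c, m=m)
    return fcm(X, c, m=m, init_centers=rep_centers)
```

Calling `hybrid_fcm(X, c)` on an `(n, d)` array returns the final centers, the membership matrix from the last iteration, and the number of iterations run on the full dataset. The sketch assumes the grid yields at least `c` occupied cells; the expected benefit is that the expensive full-data run starts close to a good solution, though any speed-up depends on the data and is not guaranteed by this simplified version.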

3094.pdf

Size : 3,61 MB
Format : Adobe PDF

