Stochastic k-means for efficient higher order clustering

Γεραμούτσου, Βασιλική
Digital technologies have produced a wealth of data across many domains, including the airline and music industries. Analyzed systematically with machine learning models, this data can yield detailed insights into consumer preferences, pop culture trends, and customer satisfaction with airline applications and services. This work introduces a stochastic variant of the widely used k-means clustering algorithm and provides guidelines for implementing it in Python. The stochastic k-means algorithm offers better scalability and computational efficiency than its traditional counterpart, which makes it suitable for large datasets and for handling unknown attributes. The implementation guidelines cover the essential steps: data preprocessing, distance calculation, centroid updating, and convergence criteria. They are intended as a resource for future work, so that researchers and practitioners can apply the stochastic k-means algorithm effectively and develop stochastic clustering algorithms further. Together, data analysis, machine learning, and the stochastic k-means algorithm provide a framework for gaining insight into customer satisfaction and preferences, which organizations in the airline industry can use to make informed decisions, enhance their services, and meet the evolving needs of their customers.
k-means, Classification, Stochastic clustering, Pythonic, Machine learning, Airlines, Spotify
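The abstract names four implementation steps (data preprocessing, distance calculation, centroid updating, convergence criteria) without giving the algorithm itself. As an illustration only, the sketch below shows how those steps could fit together in a generic mini-batch formulation of stochastic k-means with a per-centroid learning rate. It is not the author's implementation: the function name `stochastic_kmeans`, its parameters (`batch_size`, `max_iter`, `tol`, `seed`), the z-score standardization, and the squared Euclidean distance are all assumptions made for this example.

```python
import numpy as np


def stochastic_kmeans(X, k, batch_size=32, max_iter=200, tol=1e-4, seed=0):
    """Hypothetical mini-batch k-means sketch; not the thesis implementation."""
    rng = np.random.default_rng(seed)
    X = np.asarray(X, dtype=float)

    # Step 1, preprocessing (assumed): z-score each feature.
    mean = X.mean(axis=0)
    std = X.std(axis=0)
    std[std == 0] = 1.0  # avoid division by zero for constant features
    Z = (X - mean) / std

    n = Z.shape[0]
    # Initialize centroids from k distinct random samples.
    centroids = Z[rng.choice(n, size=k, replace=False)].copy()
    counts = np.zeros(k)  # number of samples each centroid has absorbed

    for _ in range(max_iter):
        previous = centroids.copy()
        batch = Z[rng.choice(n, size=min(batch_size, n), replace=False)]

        # Step 2, distance calculation: squared Euclidean distance
        # from every batch point to every centroid, shape (batch, k).
        dist = ((batch[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
        nearest = dist.argmin(axis=1)

        # Step 3, centroid update: move the nearest centroid toward each
        # point with a step size that shrinks as the centroid sees more data.
        for x, j in zip(batch, nearest):
            counts[j] += 1
            eta = 1.0 / counts[j]
            centroids[j] = (1.0 - eta) * centroids[j] + eta * x

        # Step 4, convergence criterion (assumed): stop once the centroids
        # barely move between batches; max_iter bounds the run otherwise.
        if np.linalg.norm(centroids - previous) < tol:
            break

    # Final assignment of every sample to its nearest centroid.
    full = ((Z[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
    return centroids, full.argmin(axis=1)
```

Calling `stochastic_kmeans(X, k=2)` on an `(n_samples, n_features)` array returns the centroids in standardized feature space together with one cluster label per sample. Because each iteration touches only `batch_size` points rather than the full dataset, the per-iteration cost does not grow with the dataset size, which is the kind of scalability gain the abstract attributes to the stochastic variant.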