Application of Algorithms with Variable Greedy Heuristics for k-Medoids Problems
DOI:
https://doi.org/10.31449/inf.v44i1.2737Abstract
Progress in location theory methods and clustering algorithms is mainly targeted at improving the performance of the algorithms. The most popular clustering models are based on solving the p-median and similar location problems (k-means, k-medoids). In such problems, the algorithm must find several points called cluster centers, centroids, medoids, depending on the specific problem which minimize some function of distances from known objects to the centers. In the the k-medoids problem, the centers (medoids) of the cluster must coincide with one of the clustered objects. The problem is NP-hard, and the efforts of researchers are focused on the development of compromise heuristic algorithms that provide a fairly quick solution with minimal error. In this paper, we propose new algorithms of the Greedy Heuristic Method which use the idea of the Variable Neighborhood Search (VNS) algorithms for solving the k-medoids problem (which is also called the discrete p-median problem). In addition to the known PAM (Partition Around Medoids) algorithm, neighborhoods of a known solution are formed by applying greedy agglomerative heuristic procedures. According to the results of computational experiments, the new search algorithms (Greedy PAM-VNS) give more accurate and stable results (lower average value of the objective function and its standard deviation, smaller spread) in comparison with known algorithms on various data sets.Povzetek: Avtorji predlagajo nove algoritme za reševanje problema lokacije k-medoidov in grozdenja.References
Kazakovtsev L.A. (2016) The greedy heuristics method for systems of automatic grouping of objects. Diss ... Dr. tech. of science. Krasnoyarsk. Siberian Federal University.
Ghosh J., Acharya A. (2011) Cluster ensembles. Wiley Interdisciplinary Reviews: Data Mining and Knowledge Discovery, Vol. 1(4), pp..305−315.
Rozhnov I., Orlov V., Kazakovtsev L. (2018) Ensembles of clustering algorithms for problem of detection of homogeneous production batches of semiconductor devices. CEUR-WS, Vol.2098, pp.338-348.
Drezner Z., Hamacher H. (2004) Facility location: applications and theory. Berlin:Springer-Verlag.
Struyf A., Hubert M., Rousseeuw P. (1997) Clustering in an Object-Oriented Environment. Journal of Statistical Software, Issue 1 (4), pp.1-30.
Kaufman L., Rousseeuw P.J. (1990) Finding groups in data: an introduction to cluster analysis. New York:Wiley.
Moreno-Perez J.A., Roda Garcia J.L., Moreno-Vega J.M. (1994) A Parallel Genetic Algorithm for the Discrete p-Median Problem. Studies in Location Analysis, Issue 7, pp.131-141.
Wesolowsky, G. (1993) The Weber problem: History and perspectives. Location Science, No.1, pp.5-23.
Drezner Z., Wesolowsky G.O. (1978) A Trajectory Method for the Optimization of the Multifacility Location Problem with lp Distances. Management Science, Vol.24, pp.1507-1514.
Farahani R., Hekmatfar M. (2009). Facility location: Concepts, models, algorithms and case studies. Berlin Heidelberg:Springer-Verlag.
Deza M.M., Deza E. (2013) Metrics on Normed Structures. Encyclopedia of Distances. Berlin Heidelberg: Springer, pp.89-99, DOI: 10.1007/978-3-642-30958-85.
Weiszfeld, E. (1937) Sur le point sur lequel la somme des distances de n points donnes est minimum. Tohoku Mathematical Journal, Vol.43, No.1, pp.335-0386.
Kaufman L. and Rousseeuw P.J. (1987) Clustering by means of Medoids. Statistical Data Analysis Based on the L1-Norm and Related Methods, Springer US. pp. 405-416.
Kochetov Yu., Mladenovic N., Hansen P. (2003) Local search with alternating neighborhoods. Discrete analysis and operations research, Series 2, Vol.10(1), pp.11-43.
Nicholson T. A. J. (1965) A sequential method for discrete optimization problems and its application to the assignment, traveling salesman and tree scheduling problems. J. Inst. Math. Appl, Vol.13, pp.362-375.
Page E. S. (1965) On Monte Carlo methods in congestion problems. I: Searching for an optimum in discrete situations. Oper. Res. Vol.13(2), pp. 291-299.
Kernighan B. W., Lin S. (1970) An efficient heuristic procedure for partitioning graphs. Bell Syst. Tech. J. Vol.49. pp.291-307.
Gastrigin L.А. (1978) Random search - specificity, stages of history and prejudice. Questions of cybernetics. M.: Nauch. Council on the complex problem "Cybernetics" of the USSR Academy of Sciences, Vol. 33, pp.3-16.
Kochetov Yu.А. (2010) Local search methods for discrete placement problems. Dis. Doctors of Physical and Mathematical Sciences. Novosibirsk.
Hansen P., Mladenovic N., Bruke E.K., Kendall G. (2014) Variable Neighborhood Search. Search Methodology. Springer US. P.211-238, doi: 10.1007/0-387-28356-0_8.
Kazakovtsev, L.A., Antamoshkin, A.N. (2014) Genetic Algorithm with Fast Greedy Heuristic for Clustering and Location Problems. Informatica, Vol.38(3), pp.229-240.
Osman I. H., Laporte G. (1996) Metaheuristics: a bibliography. Ann. Oper. Res, Vol.63. pp.513-628.
Mladenovic N., Hansen P. Variable neighborhood search. Comput. Oper. Res, Vol.24, P.1097-1100.
Hansen P., Mladenovic N. (2001) Variable neighborhood search: principles and applications (invited review). European J. Oper. Res, Vol.130(3), pp.449-467.
Lopez F. G., Batista B. M., Moreno-Perez J., Moreno-Vega M. (2000) The parallel variable neighborhood search for the p-median problem. Res. Rep. Univ. of La Laguna, Spain.
Brimberg J., Mladenovic N. (1996) A variable neighborhood algorithm for solving the continuous location-allocation problem. Stud. Locat. Anal. V. 10. pp. 1-12.
Hansen P., Mladenovic N., Perez-Brito D. (2001) Variable neighborhood decomposition search. J. Heuristics, Vol.7 (4), pp. 335-350.
Goldberg D. E. (1989) Genetic algorithms in search, optimization, and machine learning. Reading, MA: Addison-Wesley.
Houck C.R., Joines J. A., G.Kay. M. (1996) Comparison of Genetic Algorithms, Random Restart, and Two-Opt Switching for Solving Large Location-Allocation Problems. Computers and Operations Research, Vol. 23, pp. 587-596.
Alp O., Erkut E., Drezner Z. (2003) An Efficient Genetic Algorithm for the p-Median Problem. Annals of Operations Research. Vol.122, pp.21-42, DOI:10.1023/A:1026130003508. (2003).
Neema M.N., Maniruzzaman K.M., Ohgai A. (2011) New Genetic Algorithms Based Approaches to Continuous p-Median Problem. Netw. Spat. Econ., Vol.11, pp.83-99, DOI:10.1007/s11067-008-9084-5.
UCI Machine Learning Repository [http://archive.ics.uci.edu/ml]. access date 28.03.2019.
Clustering basic benchmark [http://cs.joensuu.fi/sipu/datasets], access date 28.03.2019.
Sheng W., Liu X. (2006) A genetic k-medoids clustering algorithm. Journal of Heuristics. Vol.12, No.6. P. 447-466.
Wilcoxon F. (1945) Individual comparisons by ranking methods. Biometrics Bulletin, Vol.1(6), pp. 80–83, DOI:10.2307/3001968.
Derrac J., Garcia S., Molina D., Herrera F. (2011) A practical tutorial on the use of nonparametric statistical tests as a methodology for comparing evolutionary and swarm intelligence algorithms. Swarm and Evolutionary Computation, Vol. 1(1), pp. 3-18, DOI:j.swevo.2011.02.002.
Downloads
Published
How to Cite
Issue
Section
License
I assign to Informatica, An International Journal of Computing and Informatics ("Journal") the copyright in the manuscript identified above and any additional material (figures, tables, illustrations, software or other information intended for publication) submitted as part of or as a supplement to the manuscript ("Paper") in all forms and media throughout the world, in all languages, for the full term of copyright, effective when and if the article is accepted for publication. This transfer includes the right to reproduce and/or to distribute the Paper to other journals or digital libraries in electronic and online forms and systems.
I understand that I retain the rights to use the pre-prints, off-prints, accepted manuscript and published journal Paper for personal use, scholarly purposes and internal institutional use.
In certain cases, I can ask for retaining the publishing rights of the Paper. The Journal can permit or deny the request for publishing rights, to which I fully agree.
I declare that the submitted Paper is original, has been written by the stated authors and has not been published elsewhere nor is currently being considered for publication by any other journal and will not be submitted for such review while under review by this Journal. The Paper contains no material that violates proprietary rights of any other person or entity. I have obtained written permission from copyright owners for any excerpts from copyrighted works that are included and have credited the sources in my article. I have informed the co-author(s) of the terms of this publishing agreement.
Copyright © Slovenian Society Informatika