References

How to Cite

When using genieclust in research publications, please cite [Gag21] and [GBC16] as specified below. Thank you.

Bibliography

B+13

L. Buitinck and others. API design for machine learning software: Experiences from the scikit-learn project. In ECML PKDD Workshop: Languages for Data Mining and Machine Learning, pages 108–122. 2013.

CMZS15

R.J.G.B. Campello, D. Moulavi, A. Zimek, and J. Sander. Hierarchical density estimates for data clustering, visualization, and outlier detection. ACM Transactions on Knowledge Discovery from Data, 10(1):5:1–5:51, 2015. doi:10.1145/2733381.

CEL+18

R.R. Curtin, M. Edel, M. Lozhnikov, Y. Mentekidis, S. Ghaisas, and S. Zhang. mlpack 3: A fast, flexible machine learning library. Journal of Open Source Software, 3(26):726, 2018. doi:10.21105/joss.00726.

DN09

S. Dasgupta and V. Ng. Single data, multiple clusterings. In Proc. NIPS Workshop Clustering: Science or Art? Towards Principled Approaches. 2009. URL: https://clusteringtheory.org.

DG19

D. Dua and C. Graff. UCI machine learning repository. 2019. URL: http://archive.ics.uci.edu/ml.

EKSX96

M. Ester, H.P. Kriegel, J. Sander, and X. Xu. A density-based algorithm for discovering clusters in large spatial databases with noise. In Proc. KDD'96, pages 226–231. 1996.

FM83

E.B. Fowlkes and C.L. Mallows. A method for comparing two hierarchical clusterings. Journal of the American Statistical Association, 78(383):553–569, 1983.

FMIZ16

P. Fränti, R. Mariescu-Istodor, and C. Zhong. XNN graph. Lecture Notes in Computer Science, 10029:207–217, 2016. doi:10.1007/978-3-319-49055-7_19.

FS18

P. Fränti and S. Sieranoja. K-means properties on six clustering benchmark datasets. Applied Intelligence, 48(12):4743–4759, 2018. doi:10.1007/s10489-018-1238-7.

Gag21

M. Gagolewski. genieclust: Fast and robust hierarchical clustering. SoftwareX, 15:100722, 2021. doi:10.1016/j.softx.2021.100722.

GBC16

M. Gagolewski, M. Bartoszuk, and A. Cena. Genie: A new, fast, and outlier-resistant hierarchical clustering algorithm. Information Sciences, 363:8–23, 2016. doi:10.1016/j.ins.2016.05.003.

G+20

M. Gagolewski and others. Benchmark suite for clustering algorithms – version 1. 2020. URL: https://github.com/gagolews/clustering_benchmarks_v1, doi:10.5281/zenodo.3815066.

GP10

D. Graves and W. Pedrycz. Kernel-based fuzzy clustering and fuzzy clustering: A comparative experimental study. Fuzzy Sets and Systems, 161:522–543, 2010. doi:10.1016/j.fss.2009.10.021.

HA85

L. Hubert and P. Arabie. Comparing partitions. Journal of Classification, 2(1):193–218, 1985. doi:10.1007/BF01908075.

JL05

A.K. Jain and M.H.C. Law. Data clustering: A user's dilemma. Lecture Notes in Computer Science, 3776:1–10, 2005.

KHK99

G. Karypis, E.H. Han, and V. Kumar. CHAMELEON: Hierarchical clustering using dynamic modeling. Computer, 32(8):68–75, 1999. doi:10.1109/2.781637.

KMKM17

A. Kobren, N. Monath, A. Krishnamurthy, and A. McCallum. A hierarchical algorithm for extreme clustering. In Proc. 23rd ACM SIGKDD'17, pages 255–264. 2017. doi:10.1145/3097983.3098079.

Lin73

R.F. Ling. A probability theory of cluster analysis. Journal of the American Statistical Association, 68(341):159–164, 1973.

MRG10

W.B. March, P. Ram, and A.G. Gray. Fast Euclidean minimum spanning tree: Algorithm, analysis, and applications. In Proc. 16th ACM SIGKDD'10, pages 603–612. 2010. doi:10.1145/1835804.1835882.

MHA17

L. McInnes, J. Healy, and S. Astels. hdbscan: Hierarchical density based clustering. The Journal of Open Source Software, 2(11):205, 2017. doi:10.21105/joss.00205.

MNL12

A.C. Müller, S. Nowozin, and C.H. Lampert. Information theoretic clustering using minimum spanning trees. In Proc. German Conference on Pattern Recognition. 2012. URL: https://github.com/amueller/information-theoretic-mst.

Mul13

D. Müllner. fastcluster: Fast hierarchical, agglomerative clustering routines for R and Python. Journal of Statistical Software, 53(9):1–18, 2013. doi:10.18637/jss.v053.i09.

NBMN19

B. Naidan, L. Boytsov, Y. Malkov, and D. Novak. Non-metric space library (NMSLIB) manual, version 2.0. 2019. URL: https://github.com/nmslib/nmslib/blob/master/manual/latex/manual.pdf.

Ols95

C.F. Olson. Parallel algorithms for hierarchical clustering. Parallel Computing, 21:1313–1325, 1995. doi:10.1016/0167-8191(95)00017-I.

P+11

F. Pedregosa and others. Scikit-learn: Machine learning in Python. Journal of Machine Learning Research, 12(85):2825–2830, 2011. URL: http://jmlr.org/papers/v12/pedregosa11a.html.

RF16

M. Rezaei and P. Fränti. Set matching measures for external cluster validity. IEEE Transactions on Knowledge and Data Engineering, 28(8):2173–2186, 2016. doi:10.1109/TKDE.2016.2551240.

Ult05

A. Ultsch. Clustering with SOM: U*C. In Workshop on Self-Organizing Maps, pages 75–82. 2005.

VEB10

N.X. Vinh, J. Epps, and J. Bailey. Information theoretic measures for clusterings comparison: Variants, properties, normalization and correction for chance. Journal of Machine Learning Research, 11(95):2837–2854, 2010. URL: http://jmlr.org/papers/v11/vinh10a.html.