References
How to Cite
When using genieclust in research publications, please cite Gagolewski (2021) [11] and Gagolewski, Bartoszuk, and Cena (2016) [16], as specified in the bibliography below. Thank you.
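For convenience, the two references can be rendered in BibTeX roughly as follows. This is a sketch: the entry keys are illustrative, the full author names are expanded from the initials given in the bibliography below, and all fields should be verified against the publishers' records.

    @article{gagolewski2021genieclust,
      author  = {Gagolewski, Marek},
      title   = {genieclust: Fast and robust hierarchical clustering},
      journal = {SoftwareX},
      year    = {2021},
      volume  = {15},
      pages   = {100722},
      doi     = {10.1016/j.softx.2021.100722},
      url     = {https://genieclust.gagolewski.com/}
    }

    @article{gagolewski2016genie,
      author  = {Gagolewski, Marek and Bartoszuk, Maciej and Cena, Anna},
      title   = {Genie: A new, fast, and outlier-resistant hierarchical
                 clustering algorithm},
      journal = {Information Sciences},
      year    = {2016},
      volume  = {363},
      pages   = {8--23},
      doi     = {10.1016/j.ins.2016.05.003}
    }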
See Also
fastcluster: http://www.danifold.net/fastcluster.html
mlpack: https://www.mlpack.org/
nmslib: https://github.com/nmslib/nmslib/tree/master/python_bindings
scikit-learn: https://scikit-learn.org/stable/modules/clustering.html
Bibliography
Buitinck, L. and others. (2013). API design for machine learning software: Experiences from the scikit-learn project. In: ECML PKDD Workshop: Languages for Data Mining and Machine Learning, pp. 108–122.
Campello, R.J.G.B., Moulavi, D., and Sander, J. (2013). Density-based clustering based on hierarchical density estimates. Lecture Notes in Computer Science, 7819:160–172. DOI: 10.1007/978-3-642-37456-2_14.
Cena, A. (2018). Adaptive hierarchical clustering algorithms based on data aggregation methods. PhD thesis, Systems Research Institute, Polish Academy of Sciences.
Curtin, R.R., Edel, M., Lozhnikov, M., Mentekidis, Y., Ghaisas, S., and Zhang, S. (2018). mlpack 3: A fast, flexible machine learning library. Journal of Open Source Software, 3(26):726. DOI: 10.21105/joss.00726.
Dasgupta, S. and Ng, V. (2009). Single data, multiple clusterings. In: Proc. NIPS Workshop Clustering: Science or Art? Towards Principled Approaches. URL: https://clusteringtheory.org.
Dua, D. and Graff, C. (2019). UCI Machine Learning Repository. URL: http://archive.ics.uci.edu/ml.
Ester, M., Kriegel, H.P., Sander, J., and Xu, X. (1996). A density-based algorithm for discovering clusters in large spatial databases with noise. In: Proc. KDD'96, pp. 226–231.
Fowlkes, E.B. and Mallows, C.L. (1983). A method for comparing two hierarchical clusterings. Journal of the American Statistical Association, 78(383):553–569.
Fränti, P., Mariescu-Istodor, R., and Zhong, C. (2016). XNN graph. Lecture Notes in Computer Science, 10029:207–217. DOI: 10.1007/978-3-319-49055-7_19.
Fränti, P. and Sieranoja, S. (2018). K-means properties on six clustering benchmark datasets. Applied Intelligence, 48(12):4743–4759. DOI: 10.1007/s10489-018-1238-7.
Gagolewski, M. (2021). genieclust: Fast and robust hierarchical clustering. SoftwareX, 15:100722. URL: https://genieclust.gagolewski.com/, DOI: 10.1016/j.softx.2021.100722.
Gagolewski, M. (2022). A framework for benchmarking clustering algorithms. SoftwareX, 20:101270. URL: https://clustering-benchmarks.gagolewski.com/, DOI: 10.1016/j.softx.2022.101270.
Gagolewski, M. (2024). Deep R Programming. Zenodo, Melbourne. ISBN 978-0-6455719-2-9. URL: https://deepr.gagolewski.com/, DOI: 10.5281/zenodo.7490464.
Gagolewski, M. (2024). Minimalist Data Wrangling with Python. Zenodo, Melbourne. ISBN 978-0-6455719-1-2. URL: https://datawranglingpy.gagolewski.com/, DOI: 10.5281/zenodo.6451068.
Gagolewski, M. (2024). Normalised clustering accuracy: An asymmetric external cluster validity measure. Journal of Classification, in press. URL: https://link.springer.com/content/pdf/10.1007/s00357-024-09482-2.pdf, DOI: 10.1007/s00357-024-09482-2.
Gagolewski, M., Bartoszuk, M., and Cena, A. (2016). Genie: A new, fast, and outlier-resistant hierarchical clustering algorithm. Information Sciences, 363:8–23. URL: https://arxiv.org/pdf/2209.05757, DOI: 10.1016/j.ins.2016.05.003.
Gagolewski, M., Bartoszuk, M., and Cena, A. (2021). Are cluster validity measures (in)valid? Information Sciences, 581:620–636. URL: https://arxiv.org/pdf/2208.01261, DOI: 10.1016/j.ins.2021.10.004.
Gagolewski, M., Cena, A., Bartoszuk, M., and Brzozowski, L. (2024). Clustering with minimum spanning trees: How good can it be? Journal of Classification, in press. URL: https://link.springer.com/content/pdf/10.1007/s00357-024-09483-1.pdf, DOI: 10.1007/s00357-024-09483-1.
Graves, D. and Pedrycz, W. (2010). Kernel-based fuzzy clustering and fuzzy clustering: A comparative experimental study. Fuzzy Sets and Systems, 161:522–543. DOI: 10.1016/j.fss.2009.10.021.
Hubert, L. and Arabie, P. (1985). Comparing partitions. Journal of Classification, 2(1):193–218. DOI: 10.1007/BF01908075.
Jain, A.K. and Law, M.H.C. (2005). Data clustering: A user's dilemma. Lecture Notes in Computer Science, 3776:1–10.
Karypis, G., Han, E.H., and Kumar, V. (1999). CHAMELEON: Hierarchical clustering using dynamic modeling. Computer, 32(8):68–75. DOI: 10.1109/2.781637.
Kobren, A., Monath, N., Krishnamurthy, A., and McCallum, A. (2017). A hierarchical algorithm for extreme clustering. In: Proc. 23rd ACM SIGKDD'17, pp. 255–264. DOI: 10.1145/3097983.3098079.
Ling, R.F. (1973). A probability theory of cluster analysis. Journal of the American Statistical Association, 68(341):159–164.
March, W.B., Ram, P., and Gray, A.G. (2010). Fast Euclidean minimum spanning tree: Algorithm, analysis, and applications. In: Proc. 16th ACM SIGKDD'10, pp. 603–612. DOI: 10.1145/1835804.1835882.
McInnes, L., Healy, J., and Astels, S. (2017). hdbscan: Hierarchical density based clustering. The Journal of Open Source Software, 2(11):205. DOI: 10.21105/joss.00205.
Müller, A.C., Nowozin, S., and Lampert, C.H. (2012). Information theoretic clustering using minimum spanning trees. In: Proc. German Conference on Pattern Recognition. URL: https://github.com/amueller/information-theoretic-mst.
Müllner, D. (2013). fastcluster: Fast hierarchical, agglomerative clustering routines for R and Python. Journal of Statistical Software, 53(9):1–18. DOI: 10.18637/jss.v053.i09.
Naidan, B., Boytsov, L., Malkov, Y., and Novak, D. (2019). Non-metric space library (NMSLIB) manual, version 2.0. URL: https://github.com/nmslib/nmslib/blob/master/manual/latex/manual.pdf.
Olson, C.F. (1995). Parallel algorithms for hierarchical clustering. Parallel Computing, 21:1313–1325. DOI: 10.1016/0167-8191(95)00017-I.
Pedregosa, F. and others. (2011). Scikit-learn: Machine learning in Python. Journal of Machine Learning Research, 12(85):2825–2830. URL: http://jmlr.org/papers/v12/pedregosa11a.html.
Rezaei, M. and Fränti, P. (2016). Set matching measures for external cluster validity. IEEE Transactions on Knowledge and Data Engineering, 28(8):2173–2186. DOI: 10.1109/TKDE.2016.2551240.
Ultsch, A. (2005). Clustering with SOM: U*C. In: Workshop on Self-Organizing Maps, pp. 75–82.
Vinh, N.X., Epps, J., and Bailey, J. (2010). Information theoretic measures for clusterings comparison: Variants, properties, normalization and correction for chance. Journal of Machine Learning Research, 11(95):2837–2854. URL: http://jmlr.org/papers/v11/vinh10a.html.