Novel Class Discovery (NCD) is the problem of discovering novel classes in an unlabeled set, given a labeled set of different but related classes. To interpret the results of clustering or NCD algorithms, data scientists need to understand the domain- and application-specific attributes of the tabular data. This is difficult and can often only be done by a domain expert.
With this interface, a domain expert can easily run state-of-the-art NCD algorithms to discover classes in their tabular data, provided as a CSV file. Without writing any code, and with minimal knowledge of data science, they can generate accurate clusters and interpret them in the form of decision trees. Currently, three NCD models and two unsupervised clustering algorithms are implemented: PBN, TabularNCD, a baseline model, k-means and Spectral Clustering.
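To give a rough idea of what "clusters interpreted in the form of decision trees" can look like, here is a minimal sketch. It is not the interface's own code: it assumes pandas and scikit-learn are available, uses k-means as the clustering step, and fits a shallow decision tree on the cluster assignments so that the tree's rules describe each cluster. The CSV content, the column names and the `cluster_and_explain` helper are hypothetical and exist only for illustration.

```python
import io

import pandas as pd
from sklearn.cluster import KMeans
from sklearn.tree import DecisionTreeClassifier, export_text

# Hypothetical CSV content standing in for a user's tabular data.
CSV_TEXT = """height_cm,weight_kg
150,50
152,53
149,48
151,52
190,95
188,92
192,97
189,94
"""


def cluster_and_explain(csv_file, n_clusters=2, max_depth=2, seed=0):
    """Cluster the rows of a CSV file, then describe the clusters with a tree.

    Illustrative helper (not part of the interface): k-means assigns a
    cluster to every row, and a shallow decision tree is trained to
    predict that assignment from the original columns.
    """
    df = pd.read_csv(csv_file)

    # Step 1: unsupervised clustering of the numeric columns.
    kmeans = KMeans(n_clusters=n_clusters, n_init=10, random_state=seed)
    labels = kmeans.fit_predict(df)

    # Step 2: fit a small tree that maps feature values to cluster ids.
    tree = DecisionTreeClassifier(max_depth=max_depth, random_state=seed)
    tree.fit(df, labels)

    # Step 3: render the tree as human-readable if/else rules.
    rules = export_text(tree, feature_names=list(df.columns))
    return labels, tree, rules


labels, tree, rules = cluster_and_explain(io.StringIO(CSV_TEXT))
print(rules)
```

The printed rules are thresholds on the original columns, which is the kind of description a domain expert can check against their own knowledge. Keeping `max_depth` small trades some fidelity to the clustering for rules that are short enough to read.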
This interface was presented in the demo track of the ECML PKDD 2023 conference. A replay of the thesis defence is available on YouTube.