AI

  1. Novel Class Discovery (NCD) is the problem of discovering novel classes in an unlabeled set, given a labeled set of different but related classes. To interpret the results of clustering or NCD algorithms, data scientists need to understand the domain- and application-specific attributes of tabular data. This task is difficult and can often […]

  2. WikiFactDiff is a dataset designed as a resource for performing atomic factual knowledge updates on language models, with the goal of aligning them with current knowledge. It describes the evolution of factual knowledge between two dates, named T_old and T_new, in the form of semantic triples. To enable the evaluation of knowledge algorithms (such […]

  3. Machine learning, across its various tasks from fitting to inference, can be highly energy-intensive and raises growing environmental concerns. This situation has inspired various initiatives fostering a more frugal, greener AI. Beyond implementing good practices, it is pivotal for researchers and data engineers to gather empirical knowledge of energy consumption per task, […]

  4. Marine Detect is an innovative project that leverages Deep Learning technology to advance the detection and identification of marine species. Its models were developed in the context of the Let’s Revive project in partnership with Tēnaka. Tēnaka emphasizes impact measurement through Tēnaka Science, a platform sharing monthly coral ecosystem data. To automate data collection, Orange […]

  5. 3D Gaussian Splatting is a new algorithm for synthesizing photo-realistic 3D scenes from 2D images. The algorithm was developed in 2023 by INRIA in Sophia Antipolis and, in just a few months, has become the reference in the field. The solution is broken down into two parts: a tool for training scenes with PyTorch […]


  6. Speech processing models are computationally expensive, raising environmental concerns because of their high energy consumption. ESSL (Efficient Self-Supervised Learning) addresses this issue, enabling pretraining on a single GPU in only 28 hours. This reduction in computational cost represents up to a two-order-of-magnitude improvement over existing speech models. Its source code is available on […]