  1. TimeStress is a dataset designed to evaluate the temporal representation of facts in large language models (LLMs) by assessing their ability to distinguish between correct and incorrect factual statements contextualized with a date and formatted as questions, such as “In 2011, who was the president of the USA? Barack Obama”. The evaluation principle is that […]
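The evaluation idea above can be sketched in a few lines: the same dated question is paired with a correct and an incorrect answer, and a model is expected to prefer the correct statement. This is a minimal illustration only; the function name and any scoring step are assumptions, not the dataset's actual API.

```python
# Illustrative sketch of TimeStress-style statements (names are hypothetical).

def make_statement(date: str, question: str, answer: str) -> str:
    """Contextualize a factual question with a date, TimeStress-style."""
    return f"In {date}, {question} {answer}"

# A correct and an incorrect statement for the same dated question;
# an LLM under evaluation should assign higher likelihood to the first.
correct = make_statement("2011", "who was the president of the USA?", "Barack Obama")
incorrect = make_statement("2011", "who was the president of the USA?", "George W. Bush")
```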

  2. Microtune is an autonomous agent that dynamically controls the size of the memory cache used by a relational database (tested on MariaDB). Its FinOps objective is to minimize the memory footprint while respecting a maximum latency constraint that must not be exceeded. The agent runs externally to the database and uses a SQL connection to change the […]
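A minimal sketch of the control loop such an agent could run, assuming a latency SLO and a fixed resize step (both hypothetical parameters). The MariaDB variable name `innodb_buffer_pool_size` is real; the connection object and probe are placeholders, not Microtune's actual implementation.

```python
# Hedged sketch of an external cache-sizing loop (not Microtune's real code).

MIN_BYTES = 128 * 2**20   # assumed floor: never shrink below 128 MiB
STEP = 64 * 2**20         # assumed resize granularity: 64 MiB

def next_cache_size(current: int, latency_ms: float, slo_ms: float) -> int:
    """Shrink the cache while the latency SLO holds, grow it back otherwise."""
    if latency_ms > slo_ms:
        return current + STEP              # latency too high: give memory back
    return max(MIN_BYTES, current - STEP)  # SLO respected: reclaim memory

def apply_size(conn, size: int) -> None:
    # External SQL link to the database, as the description suggests.
    conn.execute(f"SET GLOBAL innodb_buffer_pool_size = {size}")
```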

  3. Compose is an application that allows for managing the lifecycle of decentralized applications, including their design and deployment as well as interactions with them. Compose can be deployed on various blockchain networks (e.g., Ethereum, Alastria, Arbitrum). It is possible to design a smart contract directly within the tool […]

  4. This project provides an automated pipeline to generate smart contracts using large language models with Ollama. The pipeline compiles these smart contracts with the Solidity compiler (solc), analyzes them using Slither, and performs unit testing based on the test instructions provided in the prompts. Statistics are then produced to determine the efficiency of each model […]
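The final statistics step could be sketched as a small aggregation, assuming each pipeline run is recorded as a `(model, compiled_ok, tests_passed)` tuple; these field names and the metric names are illustrative, not the project's actual output format.

```python
# Hypothetical per-model statistics for the generation pipeline.
from collections import defaultdict

def model_stats(runs):
    """Fraction of generated contracts that compile and pass tests, per model."""
    totals = defaultdict(lambda: [0, 0, 0])  # [runs, compiled, passed]
    for model, compiled_ok, tests_passed in runs:
        t = totals[model]
        t[0] += 1
        t[1] += compiled_ok
        t[2] += tests_passed
    return {m: {"compile_rate": c / n, "pass_rate": p / n}
            for m, (n, c, p) in totals.items()}

# Toy records (made-up model names and outcomes):
runs = [("llama3", True, True), ("llama3", True, False),
        ("mistral", False, False), ("mistral", True, True)]
stats = model_stats(runs)
```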

  5. PowerDNS-Operator is a Kubernetes operator that facilitates the management of PowerDNS resources, including Zones and RRsets, through the use of Custom Resources. Its primary purpose is to maintain the desired configuration state of PowerDNS. This operator is specifically designed to offer DNS capabilities as a self-service feature. Zone resources are managed at the cluster level, […]

  6. WikiFactDiff is a dataset designed as a resource to perform atomic factual knowledge updates on language models, with the goal of aligning them with current knowledge. It describes the evolution of factual knowledge between two dates, named T_old and T_new, in the form of semantic triples. To enable the evaluation of knowledge algorithms (such […]
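The core record described above, a semantic triple whose object changes between T_old and T_new, can be sketched as a small data structure. Field names and the example values are illustrative; the dataset's actual schema may differ.

```python
# Hypothetical shape of a WikiFactDiff-style update record.
from dataclasses import dataclass

@dataclass(frozen=True)
class FactUpdate:
    subject: str
    relation: str
    object_old: str   # value of the triple at T_old
    object_new: str   # value of the triple at T_new

# Toy example of a fact whose object changed between the two dates:
update = FactUpdate(
    subject="USA",
    relation="president",
    object_old="Donald Trump",
    object_new="Joe Biden",
)

is_replacement = update.object_old != update.object_new
```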

  7. Machine learning, in its various tasks from fitting to inference, can be highly energy-intensive and raises growing environmental concerns. This situation has inspired various initiatives fostering a more frugal, greener AI. Beyond implementing good practices, it appears pivotal for researchers and data engineers to gather empirical knowledge of energy consumption per task, […]

  8. Marine Detect is an innovative project that leverages deep learning to advance the detection and identification of marine species. These models were developed in the context of the Let’s Revive project in partnership with Tēnaka. Tēnaka emphasizes impact measurement through Tēnaka Science, a platform sharing monthly data on coral ecosystems. To automate data collection, Orange […]

  9. 3D Gaussian Splatting is a new algorithm for synthesising photo-realistic 3D scenes from 2D images. It was developed in 2023 by INRIA in Sophia Antipolis and, in just a few months, has become the reference in the field. The solution is broken down into two parts: a tool for training scenes with PyTorch […]

  10. Speech processing models are computationally expensive, generating environmental concerns because of their high energy consumption. ESSL (Efficient Self-Supervised Learning) addresses this issue, enabling pretraining with a single GPU in only 28 hours. The reduction in computational costs represents an improvement of up to two orders of magnitude over existing speech models. Its source code is available on […]