« Retour

Kucharavy Andrei

Professeur-e HES Assistant-e

Compétences principales

Professeur-e HES Assistant-e

Bureau: FOY

HES-SO Valais-Wallis - Haute Ecole de Gestion
Route de la Plaine 2, Case postale 80, 3960 Sierre, CH

Domaine
Economie et services

Filière principale
Economie d'entreprise

Andrei Kucharavy est un professeur associé à la Haute Ecole Spécialisée de la Suisse Occidentale (HES-SO Valais-Wallis). Il est le cofondateur du GenLearning Center, une entité de recherche appliquée qui se concentre sur les questions de sécurité et de déploiement des technologies d'IA générative. Les recherches d'Andrei se concentrent actuellement sur la sécurité, la confidentialité et la distribution de l'apprentissage automatique, en particulier du point de vue de la cybersécurité et des facteurs humains.

Auparavant, il a été collobarateur scientifique à l'IEM de la HEVS, chercheur postdoctoral au Distributed Computing Lab de l'EPFL et est le deuxième lauréat du "Distinguished Cyber-Defence (CYD) Postdoctoral Fellowship" du Cyber-Defence Campus, armasuisse S+T. Il travaille dans le domaine de l'apprentissage machine depuis 2013, de la ML générative depuis 2018, et de l'intersection de la ML générative et de la cybersécurité depuis 2020. Andrei a obtenu son doctorat à l'Université Paris Sorbonne pour des travaux effectués à l'Université Johns Hopkins, Baltimore, MD, USA et au Stowers Institute for Medical Research, Kansas City, MO, USA et est ingénieur de l'Ecole Polytechnique.

BSc HES-SO en Economie d'entreprise - HES-SO Valais-Wallis - Haute Ecole de Gestion

Fundamentals of Algorithms
Mathématiques
Mathematiques Financieres
Pensée Computationelle

En cours

Characterizing and Mitigating Attacks on Large Language Models in Code Generation and Privacy

Rôle: Co-requérant(s)

Financement: Cyber-Defence Campus

Description du projet :

WP1: Evaluation of LLM-generated code to injected vulnerabilities and potential mitigations: Whether allowing non-coders to realize software projects or making programmers more productive and more competent, code-generating LLMs are one of the focuses of the largest LLM providers, from GPT4 with Codex to GitHub with Copilot to Meta with Code-LLaMA. While it is still to be seen if the promises of code-generating LLMs will be realized, preliminary research from the cyber-security community and GenLearning Center has demonstrated that the code LLMs generate is as prone to bugs, making it vulnerable. Unfortunately, unlike code written by humans, the one generated by LLMs can be poisoned through datasets used in their training and cannot be trivially corrected at scale when the vulnerabilities affecting them are discovered and released. This working package focuses on indirect code vulnerability injection through training dataset poisoning, malicious fine-tuning, or adversarial pre-prompting, focusing specifically on less easily detectable vulnerabilities. The end goal of this project would be to investigate detection and mitigation avenues for such malicious tampering.

WP2: Defining vulnerabilities in LLMs and developing guidelines for characterizing, disclosing, and mitigating them: While there is a rising awareness in the cyber-security community that LLMs introduce novel threats in the cyber-space and amplify existing ones, there is still no understanding of how to systematically report them or link to details of LLM-derived tools implementations to allow the mitigation measures to propagate through ecosystem in the same way security patches are propagated in the software ecosystem. This project aims for the GenLearning Center to generalize the definitions of LLM vulnerabilities to cover aspects such as private data leakage, dangerous information generation, and failure in information retrieval and summarization and to establish guidelines for characterizing, disclosing, and mitigating them. This project would allow a compilation of standards to accept or not LLM-based solutions as part of cyber-physical systems used by DDPS, and how to keep them operating safely in a sustained manner, as well as share information about potential failure modes and mitigations with Swiss partners in defense.

WP3: Secure LLM privacy leakage detection: One of the significant issues with LLMs is their uncanny ability to memorize information they saw in their training only once. With the broad deployment of Conversational Agent LLMs and the use of conversations with users to fine-tune conversational agents further, they potentially memorize critical non-public information from such exchanges. The goal of this WP will be to develop a protocol to securely exchange information between LLM developers and entities who want to verify if their private information was leaked in a way that does not require either party to disclose potentially sensitive information. This project will specifically explore the private set intersection approach with additional safeguards from the differential privacy domain.

Equipe de recherche au sein de la HES-SO: Percia David Dimitri , Kucharavy Andrei , Vallez Cyril

Partenaires professionnels: Dr. Ljiljana Dolamic, Cyber-Defence Campus

Durée du projet: 01.01.2024 - 31.12.2024

Montant global du projet: 138'871 CHF

Url du site du projet: https://www.hevs.ch/en/projects/defining-vulnerabilities-in-llms-and-developing-guidelines-for-characterizing-disclosing-and-mitigating-them-208974

Statut: En cours

Quantitative Technology Assessment, Monitoring & Forecasting Models for Cyber-Defense

Rôle: Co-requérant(s)

Financement: Cyber-Defence Campus

Description du projet :

Objectives of the project: The project provides quantitative technology assessment, monitoring & forecasting models for cyber-defence. Such an effort aims to contribute to the Technology Monitoring (TM) portfolio of the Cyber-Defence (CYD) Campus in (i) fulfilling the first measure of the NCS[1] attributed to armasuisse S+T, (ii) writing technical reports on specific cyber-defense technologies for the CYD Campus clients, and (iii) contributing in the development of The Swiss Technology Observatory[2] – all three objectives being attached to the Strategy Cyber DPPS (see figure below). While the project aims to present concrete cybersecurity-technology assessments, monitoring, and forecasting through case studies defined under the TM portfolio needs, we want these results to be backed by coherent, relevant, and solid scientific methodologies that will be published in Q1 journals and A conferences. By applying (i) advanced natural language processing (NLP) methods, (ii) forecasting techniques based on machine-learning models and quantitative analysis (data science), and (iv) algorithmics economics, our work aims to rethink traditional technology mining methodologies by developing dynamic and holistic approaches to provide concrete cyber-defense insights in terms of technology assessment, monitoring, and forecasting.

WP1: Edition and Coordination to the CYD TM “Safety of LLMs in Cybertechnology” Overview Book: Successful public demonstration of high-performance LLMs in late 2022 led to a push to their generalized introduction across a range of software services, including mission-critical systems such as intelligence report generation, retrieval, and summarization or integration with operations systems as user interfaces. Unfortunately, LLMs are still a new technology, and the new risks to the security of the cyber-physical systems they introduce have yet to be discovered. To respond to this risk, the CYD TMM center is preparing the “Safety of LLMs in Cyber” book. While it results from collaboration between dozens of cyber-security and machine learning experts worldwide, their domain expertise until 2023 usually had minimal to no overlap with LLMs. In turn, it means that they rely critically on the information provided in an accessible manner by an LLM expert. As an ex-distinguished CYD Postdoctoral Fellow specializing in generative learning in cyber-security and cyber-defense and current co-leader of the GenLearning Center at HES-SO Valais-Wallis, Dr. Kucharavy is well-suited for this task. In the book's first chapters, Dr. Kucharavy will provide a solid base for others to work off, notably by introducing the principles behind the current generation of LLMs, an overview of existing models and approaches to adapt them to novel tasks, and their fundamental limitations. This will provide other authors with a solid basis for LLM capabilities evaluation to provide their input in their domain of expertise.

WP2: Identification of persistently robust technological monitoring proxies: Developing new defense capabilities fundamentally differs from fundamental or applied research in the civil environment. Because of the length of procurement and lifecycles, the technologies that provide them must still be relevant decades later. This leads to a conundrum. On the one hand, the technologies must be novel enough to be still relevant to the delivery time of the new capabilities. On the other hand, they must be mature enough to be ready for use by then. Errors in either direction are measured in lives lost or tens of billions wasted in procurement. MRAPs and DD-21 (Zumwalt) are recent impressive examples, but cyber-defense is rife with similar failures. In the 2010s, NATO lacked social media information operations defense, and despite repeated promises since the 1980s, expert verification systems have not yet gotten rid of all bugs in software. Quantitative technology monitoring and forecasting tools have been developed to address this problem. They rely on hard-to-falsify proxies, ranging from patent citation structure to bibliometrics, journalistic coverage analysis, and social media conversation sentiment. However, with the recent advances in Generative ML, such proxies are no longer hard to falsify. Given the IP-based investment and addition of AI tools to better evaluate patent analysis by several patent offices worldwide, they are likely to be falsified. To retain the robust technological monitoring capabilities of TMM, this project aims to identify novel robust proxies, notably by examining the novelty of terms and correlations and statement factuality coherence, and to apply it to current novel technology with high long-term novel capabilities potential – quantum technologies.

WP3: Novel NLP methods of evaluating short-term technological convergence potential: Key technologies underlying defensive operations rely on a steady effort and funding to be progressively developed and brought to a maturity level when applicable. However, novel operational capabilities often rely on technological convergence, obtaining a massive synergy from well-developed but previously unconnected technologies, such as DDoS and HTTP/2 parallel page loading assets logic, resulting in an overnight tripling of traffic load available to the attackers. Such convergences present a unique opportunity for offensive use, given that systems able to use them can be developed rapidly and deployed without warning. Because of that, it is critical for the side in a defensive posture, such as Switzerland, to anticipate short-term technological convergence potential and forecast the threat posed by systems resulting from such convergence. This working package aims to investigate how well recently developed NLP methods – notably entailment on scientific texts – could assist technological convergence analysis. Specifically, this WP will support the development of a prototype tool to perform such forecasting on DDoS, in addition to the above-mentioned HTTP/2 convergence previously seen with IoT.

[1] https://www.ncsc.admin.ch/ncsc/en/home/strategie/strategie-ncss-2018-2022.html

[2] https://technology-observatory.ch/

Equipe de recherche au sein de la HES-SO: Percia David Dimitri , Kucharavy Andrei

Partenaires professionnels: Dr. Alain Mermoud, Cyber-Defence Campus

Durée du projet: 01.01.2024 - 31.12.2024

Montant global du projet: 148'002 CHF

Url du site du projet: https://www.hevs.ch/en/projects/novel-nlp-methods-to-evaluate-short-term-technological-convergence-potential-208972

Statut: En cours

Terminés

Fine-Tuning of Generative Language Models On-Premises: Usefulness/Safety Balance of Patient-Facing LLM-Based Conversational Agents

Rôle: Co-requérant(s)

Financement: Axe Transformation Numérique (HES-SO Valais-Wallis)

Description du projet :

Generative language models (GLMs) gained significant attention in late 2002 / early 2023, notably with the introduction of models refined to act consistently with user's expectations of interactions with AI (conversational agents). Conversational fine-tuning revealed the extent of their true capabilities in a real-world environment, and eHealth applications are no exception. This has garnered both industry excitement for their potential applications in eHealth and concerns about their capabilities to assist health professionals and properly preserve patients' data privacy. The privacy-preserving possibilities for ensuring the security of patients' data while using GLMs are a significant concern. Recent research shows that, in the case of GLM usage, even federated learning is not a solution for ensuring privacy protection. In this case, the only plausible solution is to deploy models on-premises, safeguarding data privacy in-house.

In this project, we suggest exploring how effective a methodology could be: applying pre-prompt response fine-tuning combined with the request of personalized databases to leverage GLMs to deploy a tailor-made eHealth solution in-house. Fine-tuning GLMs on-premises is adapting a large pre-trained language model to a specific task or domain using a smaller dataset without sending data to a cloud server. This can be useful for privacy-preserving applications, such as eHealth, involving sensitive personal information. Here, we want to explore how effective fine-tuning of different GLMs on-premises by:

1. Fine-tuning language models from human preferences, where human feedback guides the model towards desired behaviors or styles in natural language tasks such as text continuation and summarization.
2. Fine-tuning language models for zero-shot learning, where natural language instructions teach the model to perform various tasks without any labeled data or examples.
3. Fine-tuning language models for domain-specific generation, where the model is adapted to produce more relevant and coherent text for a particular topic or audience.

Equipe de recherche au sein de la HES-SO: Percia David Dimitri , Kucharavy Andrei , Vallez Cyril

Durée du projet: 03.04.2023 - 31.01.2024

Montant global du projet: 40'000 CHF

Url du site du projet: https://www.hevs.ch/en/projects/fine-tuning-of-generative-language-models-on-premises-usefulness-safety-balance-of-patient-facing-llm-based-conversational-agents-206556

Statut: Terminé

Axe Transformation numérique 2023 - HES-SO Valais-Wallis

AGP

Rôle: Collaborateur/trice

Financement: VS - Direction / Ra&D

Description du projet : Projet inter-disciplinaire pour l'axe Transformation numérique de la HES-SO Valais-Wallis Principalement pour la saisie d'heures pour la coordination et la participation aux séances de l'axe Transformation numérique. Pas de remboursement de frais de déplacement inter-site. Ventilation pour les BSM : W00 - centre de coût 90014 - n°projet (ne pas mettre de pilier 6 dans la ventilation car c'est sur un centre de coûts) Répartition du budget sur les projets socle des instituts au 31.12.2023

, , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , Equipe de recherche au sein de la HES-SO: Rey-Gillioz Anne-Christine Vashdev Dina Antic Dragana Loup Jimmy Martinovic Jelena Bocchi Yann Verloo Henk Mudry Pierre-André Gabioud Dominique Zeder David Schumacher Michael Ignaz Schegg Roland Bürki Florian Zufferey Jeff Widmer Antoine Martinez Anja Manzo Gaetano Calbimonte Jean-Paul Darbellay Anne Hanik Nils Borgeat Rémy Calvaresi Davide Pitteloud Pascal Piguet Jean-Gabriel Félix Valérie Kuhn Alexandre Loloum Tristan Fellay Christophe Percia David Dimitri Delgado Pamela Kucharavy Andrei Imboden Serge Vallez Cyril

Durée du projet: 01.01.2023 - 31.12.2023

Montant global du projet: 110'300 CHF

Statut: Terminé

Cybersecurity-Technology Assessment, Monitoring & Forecasting

Rôle: Co-requérant(s)

Financement: Cyber-Defence Campus

Description du projet :

Objectives of the project: The project provides quantitative technology assessment, monitoring & forecasting models for cyber-defence. Such an effort aims to contribute to the Technology Monitoring (TM) portfolio of the Cyber-Defence (CYD) Campus in (i) fulfilling the first measure of the NCS[1] attributed to armasuisse S+T, (ii) writing technical reports on specific cyber-defense technologies for the CYD Campus clients, and (iii) contributing in the development of The Swiss Technology Observatory[2] – all three objectives being attached to the Strategy Cyber DPPS (see figure below). While the project aims to present concrete cybersecurity-technology assessments, monitoring, and forecasting through case studies defined under the TM portfolio needs, we want these results to be backed by coherent, relevant, and solid scientific methodologies that will be published in Q1 journals and A conferences. By applying (i) Solomonff Bayesian algorithmic probability, (ii) symbolic theory of evolution in Gillespie-Orr formulation, (iii) quantitative analysis (data science), and (iv) algorithmics economics, our work aims to rethink traditional technology mining methodologies by developing dynamic and holistic approaches to provide concrete cyber-defense insights in terms of technology assessment, monitoring, and forecasting.

WP1: Quantification of adoption processes in cyber-security technologies as evolutionary selection: In this working package, we will formalize the equivalence between the Gillespie-Orr model of evolution and innovation process and derive from first principles well-known properties of innovative processes that have not yet been quantified, such as hype curve or iterate-and-pivot innovation model, based on attention and adoption metrics, such as Wikipedia articles views and GitHub repositories creation, modification and starring. We expect our model to allow us to detect and quantify previously ignored phenomena in technological forecastings, such as neutral drift – widespread adoption of technologies offering no benefit due to random factors (e.g., blockchains in a trusted environment). We aim to provide a quantitative evaluation of the time-to-market readiness of technologies relevant to cyber-defense and the likelihood of them making it to the deployment readiness plateau. To validate our approach, we will retrospectively use contact tracing applications as vectors of target identification attacks, as well as generative machine learning in the context of information operations.

WP2: Optimization of time of performance metric re-adjustment: In this working package, we further develop the Gillespie-Orr view on technical innovation in cyber-security to predict the best time for technology performance metric revision. While Goodhart's law is an empirical adage stating that any good measure ceases to be one once it is used to guide decisions, it has fundamental theoretical reasons to exist due to a combination of Rice theorem and properties of evolution processes. We aim to formalize it, leveraging the quantitative cyber-security technology evolution framework, and predict the optimal time to revise them based on signals from quantitative data collection points. This is critical for cyber-security, given the speed of the new technology's emergence and deprecation. A concrete case study of application would be a comparative study of effectiveness saturation of software-mediated email attack vector de-fanging (eg. denial of mails with attachments or with links to outside domains), versus human training-based (phishing exercises), based on the speed of their adoption, attention and initial effectiveness. This would allow the CYD Campus to re-evaluate the performance metrics of deployed cyber-defense technologies before bypasses are feasible. This process will be illustrated with a case study on a specific technology identified by the CYD Campus as a critical future technology.

WP3: Development of a computational model to predict the indirect impact of new technologies on cyber-defense: In this working package, we will develop a formal language to describe critical infrastructures across different scales of granularity. Our goal is to be able to perform a quantitative evaluation of the potential points of vulnerabilities introduced by seemingly unrelated technologies through a semantic knowledge graph. A motivating example is the widespread adoption of smart electricity meters and smart devices, which are becoming a vector of attack on the stability of the electric grid. Specifically, without proper security validation, smart meters can be tricked into thinking that the price of electricity is low, as the load on the electric grid is at its peak and gives a “full electricity usage now” command to a large number of heavy-load intelligent appliances such as electric water boilers or electric cars charging stations. In turn, if the adoption of such meters and appliances is sufficiently widespread, the sudden surcharge can take out the electric grid and trigger the tripping of substations, inducing blackouts that can be exploited for military or informational operations. Our approach is to computationally evaluate a similar impact of new technologies on a grid of controlled vocabulary term relationships to detect similar attack points.

[1] https://www.ncsc.admin.ch/ncsc/en/home/strategie/strategie-ncss-2018-2022.html

[2] https://technology-observatory.ch/

Equipe de recherche au sein de la HES-SO: Percia David Dimitri , Kucharavy Andrei

Partenaires professionnels: Dr. Alain Mermoud, Cyber-Defence Campus

Durée du projet: 02.01.2023 - 29.12.2023

Montant global du projet: 144'753 CHF

Url du site du projet: https://www.hevs.ch/en/projects/quantification-of-adoption-processes-in-cyber-security-technologies-as-evolutionary-selection-208968

Statut: Terminé

Améliorer le transfert de connaissances entre les professeur-e-s et les étudiant-e-s : affinement des modèles de langage génératif en hébergement local

Rôle: Co-requérant(s)

Financement: Guichet permanent sur l'expérimentation digitale du centre de compétences numériques de la HES-SO

Description du projet :

Ce projet vise à développer et tester un nouvel outil pédagogique de transfert de connaissances assisté par l’intelligence artificielle (IA). Le but du projet est d’optimiser l'apprentissage des étudiant-e-s universitaires grâce à l’affinement (fine-tuning) sur mesure de modèles open-source de langage génératif (LLMs). La nouveauté du projet réside dans l'utilisation responsable et consciente des LLMs – affinés sur mesure (selon les matières enseignées) et sur place à l'aide de l’apprentissage par renforcement à partir de retours humains (RLHF). L'outil pédagogique permettra aux professeur-e-s et aux étudiant-e-s de personnaliser les LLMs pour différents cours et sujets, tout en garantissant la protection des données (car les modèles seront hébergés sur site), et la réduction les biais dans les résultats émis par les LLMs.

Equipe de recherche au sein de la HES-SO: Percia David Dimitri , Kucharavy Andrei , Vallez Cyril

Durée du projet: 01.05.2023 - 31.10.2023

Montant global du projet: 15'000 CHF

Url du site du projet: https://www.hes-so.ch/la-hes-so/soutien-a-lenseignement/projets-enseignement/detail-projet/ameliorer-le-transfert-de-connaissances-entre-les-professeur-e-s-et-les-etudiant-e-s-adaptation-des-modeles-de-langage-generatif-llms-en-hebergement-local

Statut: Terminé

Transformation numérique et intelligence artificielle: évaluation des solutions applicables aux processus de l'OIC

Rôle: Co-requérant(s)

Financement: Innosuisse

Description du projet :

Le but de ce mandat est d’évaluer de quelles façons et dans quels secteurs l’adaptation et l’adoption des agents conversationnels (conversational agents – Cas) basés sur des modèles génératifs de langage naturel (generative large language models – LLMs) peut aider l’OIC et ses collaborateurs à être plus efficaces et effectifs dans leur travail et leurs missions. Il s’agit notamment d’évaluer les solutions possibles d’implémentation de ces CAs dans différents domaines d’activité de l’organisation, tels que la gestion des ressources humaines, la comptabilité, le marketing ou le service client.

L’implémentation des CAs basés sur des LLMs permettrait ainsi à l’OIC et à ses collaborateurs d’atteindre les objectifs stratégiques suivants :

Augmenter l’allocation de leur temps de travail dédié au cœur de leur métier/leurs prestations, en diminuant leur temps d’allocation dédié aux tâches “parasites” mais nécessaires (telles que l’administration, les courriels, les formulaires, etc.).
Former les collaborateurs à l’usage des CAs basés sur des LLMs, en leur fournissant des formations adaptées et un support technique sur mesure.
Améliorer leur satisfaction au travail en libérant du temps pour être plus efficace et créatif dans leurs tâches quotidiennes.

Equipe de recherche au sein de la HES-SO: Percia David Dimitri , Kucharavy Andrei , Seppey Sherine

Partenaires professionnels: Robin Zambaz, Organisme Intercantonal de Certification (OIC)

Durée du projet: 01.03.2023 - 30.04.2023

Montant global du projet: 15'000 CHF

Statut: Terminé

2024

Overview of existing LLM families

Chapitre de livre ArODES

Andrei Kucharavy

Dans Kucharavy, Andrei, Lenders, Vincent, Mermoud, Alain, Mulder, Valentin, Plancherel, Octave, Large language models in cybersecurity (pp. 31–44). 2024, Cham : Springer

Lien vers la publication

Résumé:

While the general public discovered Large Language Models (LLMs) with ChatGPT—a generative autoregressive model, they are far from the only models in the LLM family. Various architectures and training regiments optimized for specific usages were designed throughout their development, which were then classified as different LLM families.

Fundamental limitations of generative LLMs

Chapitre de livre ArODES

Andrei Kucharavy

Dans Kucharavy, Andrei, Lenders, Vincent, Mermoud, Alain, Mulder, Valentin, Plancherel, Octave, Large language models in cybersecurity (pp. 3–17). 2024, Cham : Springer

Lien vers la publication

Résumé:

Large Language Models (LLMs) are scaled-up instances of Deep Neural Language Models—a type of Natural Language Processing (NLP) tools trained with Machine Learning (ML). To best understand how LLMs work, we must dive into what technologies they build on top of and what makes them different. To achieve this, an overview of the history of LLMs development, starting from the 1990s, is provided before covering the counterintuitive purely probabilistic nature of the Deep Neural Language Models, continuous token embedding spaces, recurrent neural networks-based models, what self-attention brought to the table, and finally, why scaling Deep Neural Language Models led to a qualitative change, warranting a new name for the technology.

From deep neural language models to LLMs

Chapitre de livre ArODES

Andrei Kucharavy

Dans Kucharavy, Andrei, Lenders, Vincent, Mermoud, Alain, Mulder, Valentin, Plancherel, Octave, Large language models in cybersecurity (pp. 3–17). 2024, Cham : Springer

Lien vers la publication

Résumé:

Adapting LLMs to downstream applications

Chapitre de livre ArODES

Andrei Kucharavy

Dans Kucharavy, Andrei, Lenders, Vincent, Mermoud, Alain, Mulder, Valentin, Plancherel, Octave, Large language models in cybersecurity (pp. 19–29). 2024, Cham : Springer

Lien vers la publication

Résumé:

By themselves, pretrained Large Language Models (LLMs) are interesting objects of study. However, they need to undergo a subsequent transfer learning phase to make them useful for downstream applications. While historically referred to as “fine-tuning,” the range of the tools available to LLMs users to better adapt base models to their applications is now significantly wider than the traditional fine-tuning. In order to provide the reader with an idea of the strengths and weaknesses of each method and allow them to pick one that would suit their needs best, an overview and classification of the most notable methods is provided, specifically the prompt optimization, pre-prompting and implicit prompting (system prompting), model coordination through actor agents, integration with auxiliary tools, parameter-efficient fine-tuning, further model pre-training, from-scratch retraining, and finally domain-specific distillation.

LLM-Resilient Bibliometrics

Article scientifique

Factual Consistency Through Entity Triplet Extraction

Alexander Sternfeld, Kucharavy Andrei, Percia David Dimitri, Alain Mermoud, Julian Jang-Jaccard

Swiss Technology Observatory, 2024

Lien vers la publication

Résumé:

The increase in power and availability of Large Language Models (LLMs) since late 2022 led to increased concerns with their usage to automate academic paper mills. In turn, this poses a threat to bibliometrics-based technology monitoring and forecasting in rapidly moving fields. We propose to address this issue by leveraging semantic entity triplets. Specifically, we extract factual statements from scientific papers and represent them as (subject, predicate, object) triplets before validating the factual consistency of statements within and between scientific papers. This approach heavily penalizes blind usage of stochastic text generators such as LLMs while not penalizing authors who used LLMs solely to improve the readability of their paper. Here, we present a pipeline to extract such triplets and compare them. While our pipeline is promising and sensitive enough to detect inconsistencies between papers from different domains, the intra-paper entity reference resolution needs to be improved to ensure that triplets are more specific. We believe that our pipeline will be useful to the general research community working on the factual consistency of scientific texts.

2023

Fundamentals of Generative Large Language Models and Perspectives in Cyber-Defense

Rapport

Kucharavy Andrei, Zachary Schillaci, Loïc Maréchal, Maxime Würsch, Ljiljana Dolamic, Remi Sabonnadiere, Percia David Dimitri, Alain Mermoud, Vincent Lenders

2023, Cornell : arXiv, 50 p.

Lien vers la publication

Résumé:

Generative Language Models gained significant attention in late 2022 / early 2023, notably with the introduction of models refined to act consistently with users' expectations of interactions with AI (conversational models). Arguably the focal point of public attention has been such a refinement of the GPT3 model -- the ChatGPT and its subsequent integration with auxiliary capabilities, including search as part of Microsoft Bing. Despite extensive prior research invested in their development, their performance and applicability to a range of daily tasks remained unclear and niche. However, their wider utilization without a requirement for technical expertise, made in large part possible through conversational fine-tuning, revealed the extent of their true capabilities in a real-world environment. This has garnered both public excitement for their potential applications and concerns about their capabilities and potential malicious uses. This review aims to provide a brief overview of the history, state of the art, and implications of Generative Language Models in terms of their principles, abilities, limitations, and future prospects -- especially in the context of cyber-defense, with a focus on the Swiss operational environment.

LLMs Perform Poorly at Concept Extraction in Cyber-security Research Literature

Article scientifique

Maxime Würsch, Kucharavy Andrei, Percia David Dimitri, Alain Mermoud

arXiv, 2023 , vol. arXiv:2312.07110v1, no arXiv:2312.07110v1

Lien vers la publication

Résumé:

The cybersecurity landscape evolves rapidly and poses threats to organizations. To enhance resilience, one needs to track the latest developments and trends in the domain. It has been demonstrated that standard bibliometrics approaches show their limits in such a fast-evolving domain. For this purpose, we use large language models (LLMs) to extract relevant knowledge entities from cybersecurity-related texts. We use a subset of arXiv preprints on cybersecurity as our data and compare different LLMs in terms of entity recognition (ER) and relevance. The results suggest that LLMs do not produce good knowledge entities that reflect the cybersecurity context, but our results show some potential for noun extractors. For this reason, we developed a noun extractor boosted with some statistical analysis to extract specific and relevant compound nouns from the domain. Later, we tested our model to identify trends in the LLM domain. We observe some limitations, but it offers promising results to monitor the evolution of emergent trends.

2024

Keynote: Threats And Mitigations Landscape In The Age Of Generative AI

Conférence

Kucharavy Andrei

Insomni'hack, 25.04.2024 - 26.04.2024, Lausanne

PEOPLE@HES-SO Annuaire et Répertoire des compétences

Kucharavy Andrei

Professeur-e HES Assistant-e

Compétences principales

Applied Machine Learning

Large Language Models

Generative ML

Cybersecurity

Data Science

Image processing and Analysis

Computational Systems Biology

Professeur-e HES Assistant-e

PEOPLE@HES-SO
Annuaire et Répertoire des compétences