PEOPLE@HES-SO – Directory and Skills Repository

Wolf Beat

Associate Professor UAS

Main skills

Distributed and Parallel Computing

Programming

Bioinformatics

Machine Learning

Document Analysis

Data Analysis

Anomaly Detection

  • Contact

  • Teaching

  • Research

  • Publications

  • Conferences

Main contract

Associate Professor UAS

Office: HEIA_D20.07

Haute école d'ingénierie et d'architecture de Fribourg
Boulevard de Pérolles 80, 1700 Fribourg, CH
HEIA-FR
Institute
iCoSys - Institute of Complex Systems
BA HES-SO in Architecture - Haute école d'ingénierie et d'architecture de Fribourg
  • Data processing
  • Machine Learning
  • DevOps methodology
MSc HES-SO in Business Administration - HES-SO Master
  • Research Methods 1
MSc HES-SO in Engineering - HES-SO Master
  • New technologies in data analysis, applied machine learning

Ongoing

Other third-party body - ModIA
AGP

Role: Co-applicant

Applicants: FR - EIA - Institut SeSi

Funding: HES-SO Rectorate

Project description: Other third-party body - ModIA

Research team within HES-SO: Robyr Jean-Luc, Von Barnekow Alec, Donzallaz Jonathan, Sivanesan Nirosh, Magnin Vincent, Maillard Philippe, Wolf Beat

Academic partners: FR - EIA - Institut iCoSys; FR - EIA - Institut SeSi

Project duration: 01.07.2024 - 30.06.2025

Total project amount: 15'000 CHF

Status: Ongoing

Foundation model for time series forecasting
AGP

Role: Principal applicant

Funding: HES-SO Rectorate

Project description: Time series analysis is ubiquitous, whether in Industry 4.0 with its sensors, in finance, in medicine, in "smart living", or in many other fields. A common problem across all of these fields is predicting the future values of a univariate or multivariate time series. A recurring difficulty is the availability of historical data to train such predictive models. Training machine learning systems for forecasting is therefore often laborious or results in insufficient performance. In other areas of data analysis, notably text and images, a trend of recent years has been not to train specific models for a given problem, but to create "foundation models" [3], trained on a multitude of problems from the same domain. These foundation models, such as BERT[1], GPT-4[2] (ChatGPT), Stable Diffusion[3], or ResNet[4], enable transfer learning, few-shot learning, or even zero-shot learning on new problems. Much like a human who draws on knowledge from various domains when learning to solve a new problem, foundation models can adapt quickly to a new problem by building on what they have previously learned. This addresses the problem of the amount of data required per problem and at the same time reduces the resources needed to train these models. Our experience with this type of model across various research projects in the text and image domains contrasts with the lack of such models for time series forecasting, a task that is fundamental to other analyses such as anomaly detection.
The transformer-type deep learning architectures used for text must be adapted to time series, among other things to incorporate the notions of trend, seasonality, and even temporality, which are absent from the processing performed by basic transformers. This is the subject of current research to which we wish to contribute. The goal of this project is therefore to answer the question "Is it possible to create foundation models for time series forecasting?". Within the project, feasibility and the best approaches will be explored. A large and diverse dataset, created by reusing public and potentially internal data, will also be assembled. If the research yields a positive result, the model and, as far as possible, the dataset will be published as open source. The idea is for it to become the base model for all these types of analyses, as BERT currently is for text analysis. This research project makes it possible to develop varied skills applicable across a range of data analysis and machine learning domains. Because the goal of the project is not only to reach a conclusion on the question but also to create a "product", that is, a model reusable by everyone, it calls for a stricter and more rigorous approach to development, similar to an industrial project. The skills acquired in predicting future values of time series, whether or not the creation of the foundation model succeeds, will be of great interest to various companies.

Research team within HES-SO: Montet Frédéric, Pasquier Benjamin, Biolley Valentin, Maillard Philippe, Wolf Beat

Academic partners: FR - EIA - Institut iCoSys

Project duration: 01.05.2024 - 30.09.2025

Total project amount: 100'000 CHF

Status: Ongoing

Completed

Monitoring of Power Transformers for Predictive Maintenance
AGP

Role: Collaborator

Applicants: FR - EIA - Institut ENERGY

Funding: HES-SO Rectorate

Project description: The Swiss power grid is highly reliable, thanks in particular to the robustness of its HV (high-voltage) components. These components are expensive, so their service life must be maximized. Service life depends on their design, but also on the stresses to which they are subjected. The power grid is changing because of decentralized feed-in. This alters the grid's dynamics and exposes components to stresses other than those for which they were dimensioned, possibly shortening their service life. For power transformers (TrPs), for example, grid load profiles will be markedly more dynamic and will show larger variations. Techniques exist (diagnostics and monitoring) for assessing the condition of a component, but they require high investment costs and considerable expertise. Existing ageing models, which could potentially avoid the problems of active diagnostics and monitoring, are generally either too simple and inaccurate or too complex and difficult to parameterize. Moreover, since the service life of an HV component spans several decades, it is impossible to validate a model by running ageing tests at 1:1 scale. The objective of this project is to develop an intelligent, cost-effective monitoring technique to optimize maintenance and operating costs and to extend the service life of components, resulting in improved quality of service. The solution will detect anomalies using data from sensors installed on a TrP that is insulated and cooled with mineral oil. This machine-learning-based anomaly detection should ideally run on an embedded system close to the TrP.
Our priority is to focus on TrPs, which are the most expensive and critical components of the grid. However, the solution could be applied to other types of components. To reach our objective, real data are needed. If these prove difficult to obtain in sufficient quantity, simulated data based on experience will supplement them. Regarding real data, a robust and cost-effective system for acquiring the measurable quantities is in the final phase of development: an innovative measurement infrastructure has been developed by a partner company (Gradesens) that measures temperature and power values, as well as non-standard quantities such as the vibration spectrum of the TrP and of the tap changer. A pilot is being installed on a TrP belonging to another partner company (BKW), with potentially more TrPs to follow. These data will thus be available at the start of the project. Given the limited amount of data and the fact that the pilot TrPs will exhibit few anomalies, we will develop a "digital twin". This allows us both to create data for validating our approach with simulated anomalies and to pre-train on these data, followed by transfer learning to the real data. Standard anomaly detection methods will be applied, in particular approaches that model the normal state of the TrP. We can then detect whether the measurements from the real TrP deviate from what our model predicts, and thereby detect anomalies. The final system will be validated on BKW test TrPs (discussions are under way with other grid operators). Note: TrP here refers to TrPs with a rated power S greater than 50 MVA. There is nevertheless potential for lower-power TrPs.
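
The idea of modelling the normal state and flagging measurements that deviate from the model's prediction can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration, not the project's actual method: the normal-state model is a simple least-squares line relating load to oil temperature, the names `fit_normal_model` and `detect_anomalies` are hypothetical, the threshold of three residual standard deviations is arbitrary, and the numbers are invented.

```python
from statistics import mean, stdev

def fit_normal_model(load, temperature):
    """Fit temperature ~ a * load + b on healthy data (illustrative linear model)."""
    mean_x, mean_y = mean(load), mean(temperature)
    sxx = sum((x - mean_x) ** 2 for x in load)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(load, temperature))
    a = sxy / sxx
    b = mean_y - a * mean_x
    # Spread of the residuals on healthy data defines what "normal" deviation looks like.
    residuals = [y - (a * x + b) for x, y in zip(load, temperature)]
    return a, b, stdev(residuals)

def detect_anomalies(model, load, temperature, k=3.0):
    """Flag samples whose deviation from the prediction exceeds k residual std devs."""
    a, b, sigma = model
    return [abs(y - (a * x + b)) > k * sigma for x, y in zip(load, temperature)]

# Invented healthy measurements (load in MVA, oil temperature in degrees Celsius).
healthy_load = [10, 20, 30, 40, 50, 60]
healthy_temp = [30.5, 39.5, 50.5, 59.5, 70.5, 79.5]
model = fit_normal_model(healthy_load, healthy_temp)

# New measurements: the last one is far hotter than the model predicts for its load.
flags = detect_anomalies(model, [25, 45, 55], [45.0, 65.2, 90.0])
print(flags)
```

In a setting like the one described, the linear fit would be replaced by a learned model pre-trained on digital-twin data, but the residual test itself would keep the same shape.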

Research team within HES-SO: Rolle Dominique, Favrat Pierre, Von Barnekow Alec, Junod Charlie, Kissling Simon, El Hayek Joseph, Monnard Jacques, Mack Vincent, Charbon Yann, Corpataux Sam, Nicoulaz Didier, Gobbi Samuele, Vial Maël, Magnin Vincent, Karimian Mahboob, Litzistorf Johann, Maillard Philippe, Wolf Beat, Carpita Mauro

Academic partners: IICT; FR - EIA - Institut ENERGY; FR - EIA - Institut iCoSys; iE

Project duration: 15.01.2023 - 29.11.2024

Total project amount: 220'000 CHF

Status: Completed

GREENum - Digital sneckdown for creating green spaces
AGP

Role: Collaborator

Applicants: FR - EIA - Institut iTEC

Funding: SLL-PR

Project description: At a time when climate upheaval is affecting us all, collectives and neighborhood associations are leading projects to renature sealed surfaces. At the same time, mobility needs are growing, and the quality of infrastructure and of the environment in which users travel is becoming increasingly important. Although the contrast between transport and mobility infrastructure and green spaces seems stark, it is undeniable that some surfaces, at the scale of neighborhoods or cities, would not necessarily need an impermeable covering. To visualize the surfaces actually used by the different modes of travel, the sneckdown technique (améneigement), a term for observing how existing public spaces are used through the accumulation or disappearance of snow, could previously be used. With winters in which, at our latitudes, snow accumulations are smaller or last for shorter periods, compounded in cities where road maintenance services intervene as quickly as possible during snowfall, such observations are becoming more difficult. Based on this observation, we propose to evaluate, in the form of a preliminary study, the feasibility and potential of developing a digital observation tool (machine learning) for identifying the surfaces of public space that are actually used, which could then be deployed whatever the weather conditions.

Research team within HES-SO: Von Barnekow Alec, Vial Maël, Fénart Marc-Antoine, Vanbutsele Séréna, Wolf Beat, Schaffner Estela

Academic partners: FR - EIA - Institut iTEC

Project duration: 15.09.2022 - 31.12.2022

Total project amount: 24'700 CHF

Project website URL: https://swissmoves.ch/index.php/component/content/article/greenum?catid=15&Itemid=123

Status: Completed

Next Generation DNA Sequencing Cloud based platform
AGP

Role: Collaborator

Applicants: FR - EIA - Institut iCoSys

Funding: CTI

Project description: Phenosystems SA has developed GensearchNGS, a framework geared towards diagnostics laboratories which analyses Next Generation Sequencing (NGS) data to detect changes in DNA sequences for the diagnosis of genetic diseases.
Our current software distribution model is based on installing the software at the customers' premises, which requires them to have adequate computing power and storage capacity. Potential new customers thus face an investment in software and, most of the time, in hardware as well. This creates a barrier to starting work with NGS data and might prompt them to try out analytical services offered by competitors.
The GRID & Cloud Computing Group of the HES-SO//Fribourg has been involved for many years in the development of applications and tools for large distributed systems. The group has developed a model, and the associated software tools, for easy object-oriented programming of distributed infrastructures such as clusters, Grids, or Clouds. Two programming tools, called POP-C++ and POP-Java, have been developed. Both are based on an original distributed object-oriented programming model called POP, for Parallel Object Programming. POP-C++ has already been used intensively in several research contexts.
The aim of this project is to study the feasibility of offering GensearchNGS users the possibility of running time- and memory-consuming parts of the framework on Cloud environments of their choice (our customers are based in various countries, so differing legal constraints will require us to run on various Clouds).

Research team within HES-SO: Goetschi Damien, Hennebert Jean, Rial Jonathan, Kuonen Pierre, Wolf Beat

Academic partners: Phenosystems SA; FR - EIA - Institut iCoSys

Project duration: 04.11.2013 - 30.06.2021

Total project amount: 7'500 CHF

Status: Completed

2024

Enabling diffusion model for conditioned time series generation
Scientific article ArODES

Frédéric Montet, Benjamin Pasquier, Beat Wolf, Jean Hennebert

Engineering Proceedings, 2024, vol. 68, no. 1, article no. 25

Link to the publication

Abstract:

Synthetic time series generation is an emerging field of study in the broad spectrum of data science, addressing critical needs in diverse fields such as finance, meteorology, and healthcare. In recent years, diffusion methods have shown impressive results for image synthesis thanks to models such as Stable Diffusion and DALL·E, defining the new state-of-the-art methods. In time series generation, their potential exists but remains largely unexplored. In this work, we demonstrate the applicability and suitability of diffusion methods for time series generation on several datasets with a rigorous evaluation procedure. Our proposal, inspired from an existing diffusion model, obtained a better performance than a reference model based on generative adversarial networks (GANs). We also propose a modification of the model to allow for guiding the generation with respect to conditioning variables. This conditioned generation is successfully demonstrated on meteorological data.

2023

Whole-genome sequencing identified new structural variations in the DMD gene that cause Duchenne muscular dystrophy in two girls
Scientific article ArODES

Natalie Pluta, Arpad von Moers, Astrid Pechmann, Werner Stenzel, Hans-Hilmar Goebel, David Atlan, Beat Wolf, Indrajit Nanda, Ann-Kathrin Zaum, Simone Rost

International Journal of Molecular Sciences, 2023, vol. 24, no. 17, article no. 13567

Link to the publication

Abstract:

Dystrophinopathies are the most common muscle diseases, especially in men. In women, on the other hand, a manifestation of Duchenne muscular dystrophy is rare due to X-chromosomal inheritance. We present two young girls with severe muscle weakness, muscular dystrophies, and creatine kinase (CK) levels exceeding 10,000 U/L. In the skeletal muscle tissues, dystrophin staining reaction showed mosaicism. The almost entirely skewed X-inactivation in both cases supported the possibility of a dystrophinopathy. Despite standard molecular diagnostics (including multiplex ligation-dependent probe amplification (MLPA) and next generation sequencing (NGS) gene panel sequencing), the genetic cause of the girls’ conditions remained unknown. However, whole-genome sequencing revealed two reciprocal translocations between their X chromosomes and chromosome 5 and chromosome 19, respectively. In both cases, the breakpoints on the X chromosomes were located directly within the DMD gene (in introns 54 and 7, respectively) and were responsible for the patients’ phenotypes. Additional techniques such as Sanger sequencing, conventional karyotyping and fluorescence in situ hybridization (FISH) confirmed the disruption of DMD gene in both patients through translocations. These findings underscore the importance of accurate clinical data combined with histopathological analysis in pinpointing the suspected underlying genetic disorder. Moreover, our study illustrates the viability of whole-genome sequencing as a time-saving and highly effective method for identifying genetic factors responsible for complex genetic constellations in Duchenne muscular dystrophy (DMD).

2022

Homozygous inversion on chromosome 13 involving SGCG detected by short read whole genome sequencing in a patient suffering from limb-girdle muscular dystrophy
Scientific article ArODES

Natalie Pluta, Sabine Hoffjan, Frederic Zimmer, Cornelia Köhler, Thomas Lücke, Jennifer Mohr, Matthias Vorgerd, Hoa Huu Phuc Nguyen, David Atlan, Beat Wolf, Ann-Kathrin Zaum, Simone Rost

Genes, 2022, vol. 13, article no. 1752

Link to the publication

Abstract:

New techniques in molecular genetic diagnostics now allow for accurate diagnosis in a large proportion of patients with muscular diseases. Nevertheless, many patients remain unsolved, although the clinical history and/or the muscle biopsy give a clear indication of the involved genes. In many cases, there is a strong suspicion that the cause must lie in unexplored gene areas, such as deep-intronic or other non-coding regions. In order to find these changes, next-generation sequencing (NGS) methods are constantly evolving, making it possible to sequence entire genomes to reveal these previously uninvestigated regions. Here, we present a young woman who was strongly suspected of having a so far genetically unsolved sarcoglycanopathy based on her clinical history and muscle biopsy. Using short read whole genome sequencing (WGS), a homozygous inversion on chromosome 13 involving SGCG and LINC00621 was detected. The breakpoint in intron 2 of SGCG led to the absence of γ-sarcoglycan, resulting in the manifestation of autosomal recessive limb-girdle muscular dystrophy 5 (LGMDR5) in the young woman.

2021

Transcription alignment of historical Vietnamese manuscripts without human-annotated learning samples
Scientific article ArODES

Ann Scius-Bertrand, Michael Jungo, Beat Wolf, Andreas Fischer, Marc Bui

Applied Sciences, 2021, vol. 11, no. 11, article no. 4894

Link to the publication

Abstract:

The current state of the art for automatic transcription of historical manuscripts is typically limited by the requirement of human-annotated learning samples, which are necessary to train specific machine learning models for specific languages and scripts. Transcription alignment is a simpler task that aims to find a correspondence between text in the scanned image and its existing Unicode counterpart, a correspondence which can then be used as training data. The alignment task can be approached with heuristic methods dedicated to certain types of manuscripts, or with weakly trained systems reducing the required amount of annotations. In this article, we propose a novel learning-based alignment method based on fully convolutional object detection that does not require any human annotation at all. Instead, the object detection system is initially trained on synthetic printed pages using a font and then adapted to the real manuscripts by means of self-training. On a dataset of historical Vietnamese handwriting, we demonstrate the feasibility of annotation-free alignment as well as the positive impact of self-training on the character detection accuracy, reaching a detection accuracy of 96.4% with a YOLOv5m model without using any human annotation.

A sparse observation model to quantify species distributions and their overlap in space and time
Scientific article ArODES

Sadoune Ait Kaci Azzou, Liam Singer, Thierry Aebischer, Madleina Caduff, Beat Wolf, Daniel Wegmann

Ecography, 2021, vol. 44, pp. 1-13

Link to the publication

Abstract:

Camera traps and acoustic recording devices are essential tools to quantify the distribution, abundance and behavior of mobile species. Varying detection probabilities among device locations must be accounted for when analyzing such data, which is generally done using occupancy models. We introduce a Bayesian time‐dependent observation model for camera trap data (Tomcat), suited to estimate relative event densities in space and time. Tomcat allows to learn about the environmental requirements and daily activity patterns of species while accounting for imperfect detection. It further implements a sparse model that deals well with a large number of potentially highly correlated environmental variables. By integrating both spatial and temporal information, we extend the notion of overlap coefficient between species to time and space to study niche partitioning. We illustrate the power of Tomcat through an application to camera trap data of eight sympatrically occurring duiker Cephalophinae species in the savanna – rainforest ecotone in the Central African Republic and show that most species pairs show little overlap. Exceptions are those for which one species is very rare, likely as a result of direct competition.

2020

Detecting selection from linked sites using an F-model
Scientific article ArODES

Marco Galimberti, Christoph Leuenberger, Beat Wolf, Sandor M. Szilagyi, Matthieu Foll, Daniel Wegmann

Genetics, 2020, vol. 216, no. 2

Link to the publication

Abstract:

Allele frequencies vary across populations and loci, even in the presence of migration. While most differences may be due to genetic drift, divergent selection will further increase differentiation at some loci. Identifying those is key in studying local adaptation, but remains statistically challenging. A particularly elegant way to describe allele frequency differences among populations connected by migration is the F-model, which measures differences in allele frequencies by population specific FST coefficients. This model readily accounts for multiple evolutionary forces by partitioning FST coefficients into locus and population specific components reflecting selection and drift, respectively. Here we present an extension of this model to linked loci by means of a hidden Markov model (HMM), which characterizes the effect of selection on linked markers through correlations in the locus specific component along the genome. Using extensive simulations we show that the statistical power of our method is up to two-fold that of previous implementations that assume sites to be independent. We finally evidence selection in the human genome by applying our method to data from the Human Genome Diversity Project (HGDP).

Using CNNs to optimize numerical simulations in geotechnical engineering
Scientific article

Beat Wolf, Jonathan Donzallaz, Colette Buchs, Amanda Hayoz, Stéphane Commend, Jean Hennebert

Proceedings of IAPR Workshop on Artificial Neural Networks in Pattern Recognition, ANNPR 2020: Artificial Neural Networks in Pattern Recognition, 2-4 September 2020, Winterthur, Switzerland, 2020

Link to the publication

Abstract:

Deep excavations are today mainly designed by manually optimising the wall’s geometry, stiffness and strut or anchor layout. In order to better automate this process for sustained excavations, we are exploring the possibility of approximating key values using a machine learning (ML) model instead of calculating them with time-consuming numerical simulations. After demonstrating in our previous work that this approach works for simple use cases, we show in this paper that this method can be enhanced to adapt to complex real-world supported excavations. We have improved our ML model compared to our previous work by using a convolutional neural network (CNN) model, coding the excavation configuration as a set of layers of fixed height containing the soil parameters as well as the geometry of the walls and struts. The system is trained and evaluated on a set of synthetically generated situations using numerical simulation software. To validate this approach, we also compare our results to a set of 15 real-world situations in a t-SNE. Using our improved CNN model we could show that applying machine learning to predict the output of numerical simulation in the domain of geotechnical engineering not only works in simple cases but also in more complex, real-world situations.

2019

A comprehensive method protocol for annotation and integrated functional understanding of lncRNAs
Scientific article ArODES

Meik Kunz, Beat Wolf, Maximilian Fuchs, Jan Christoph, Ke Xiao, Thomas Thum, David Atlan, Hans-Ulrich Prokosch, Thomas Dandekar

Briefings in Bioinformatics

Link to the publication

Abstract:

Long non-coding RNAs (lncRNAs) are of fundamental biological importance; however, their functional role is often unclear or loosely defined as experimental characterization is challenging and bioinformatic methods are limited. We developed a novel integrated method protocol for the annotation and detailed functional characterization of lncRNAs within the genome. It combines annotation, normalization and gene expression with sequence-structure conservation, functional interactome and promoter analysis. Our protocol allows an analysis based on the tissue and biological context, and is powerful in functional characterization of experimental and clinical RNA-Seq datasets including existing lncRNAs. This is demonstrated on the uncharacterized lncRNA GATA6-AS1 in dilated cardiomyopathy.

2018

Single CpG hypermethylation, allele methylation errors, and decreased expression of multiple tumor suppressor genes in normal body cells of mutation-negative early-onset and high-risk breast cancer patients
Scientific article ArODES

Julia Böck, Silke Appenzeller, Larissa Haertle, Tamara Schneider, Andrea Gehrig, Jörg Schröder, Simone Rost, Beat Wolf, Claus R. Bartram, Christian Sutter, Thomas Haaf

International Journal of Cancer, 2018, vol. 143, no. 6, pp. 1416-1425

Link to the publication

Abstract:

To evaluate the role of constitutive epigenetic changes in normal body cells of BRCA1/BRCA2‐mutation negative patients, we have developed a deep bisulfite sequencing assay targeting the promoter regions of 8 tumor suppressor (TS) genes (BRCA1, BRCA2, RAD51C, ATM, PTEN, TP53, MLH1, RB1) and the estrogen receptor gene (ESR1), which plays a role in tumor progression. We analyzed blood samples of two breast cancer (BC) cohorts with early onset (EO) and high risk (HR) for a heterozygous mutation, respectively, along with age‐matched controls. Methylation analysis of up to 50,000 individual DNA molecules per gene and sample allowed quantification of epimutations (alleles with >50% methylated CpGs), which are associated with epigenetic silencing. Compared to ESR1, which is representative for an average promoter, TS genes were characterized by a very low (< 1%) average methylation level and a very low mean epimutation rate (EMR; < 0.0001% to 0.1%). With exception of BRCA1, which showed an increased EMR in BC (0.31% vs. 0.06%), there was no significant difference between patients and controls. One of 36 HR BC patients exhibited a dramatically increased EMR (14.7%) in BRCA1, consistent with a disease‐causing epimutation. Approximately one third (15 of 44) EO BC patients exhibited increased rates of single CpG methylation errors in multiple TS genes. Both EO and HR BC patients exhibited global underexpression of blood TS genes. We propose that epigenetic abnormalities in normal body cells are indicative of disturbed mechanisms for maintaining low methylation and appropriate expression levels and may be associated with an increased BC risk.
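
The epimutation definition used in this abstract (alleles with more than 50% methylated CpGs) lends itself to a small worked example. The sketch below is only an illustration under stated assumptions: each sequenced molecule is represented as a list of 0/1 methylation calls, one per CpG, and the function names `is_epimutation` and `epimutation_rate` as well as the toy reads are hypothetical, not taken from the study's pipeline.

```python
def is_epimutation(cpg_calls, threshold=0.5):
    """True if strictly more than `threshold` of the CpGs on this allele are methylated."""
    return sum(cpg_calls) / len(cpg_calls) > threshold

def epimutation_rate(reads, threshold=0.5):
    """Fraction of reads (alleles) classified as epimutations; 0.0 for no reads."""
    if not reads:
        return 0.0
    return sum(is_epimutation(r, threshold) for r in reads) / len(reads)

# Toy data: four molecules covering four CpG sites (1 = methylated, 0 = unmethylated).
reads = [
    [0, 0, 0, 0],  # 0% methylated
    [1, 1, 1, 0],  # 75% methylated, counted as an epimutation
    [1, 0, 1, 0],  # exactly 50%, not counted because the cutoff is ">50%"
    [0, 0, 0, 1],  # 25% methylated
]
print(epimutation_rate(reads))  # 0.25
```

With tens of thousands of molecules per gene and sample, as reported above, the same ratio can resolve rates far below one percent.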

2017

Reducing the complexity of OMICS data analysis
Doctoral thesis

Beat Wolf

2017, Würzburg, Germany: University of Würzburg

Link to the publication

2016

Non-coding RNAs in lung cancer: contribution of bioinformatics analysis to the development of non-invasive diagnostic tools
Scientific article ArODES

Meik Kunz, Beat Wolf, Harald Schulze, David Atlan, Thorsten Walles, Heike Walles, Thomas Dandekar

Genes, 2017, vol. 8, no. 1, article no. 8

Link to the publication

Abstract:

Lung cancer is currently the leading cause of cancer related mortality due to late diagnosis and limited treatment intervention. Non-coding RNAs are not translated into proteins and have emerged as fundamental regulators of gene expression. Recent studies reported that microRNAs and long non-coding RNAs are involved in lung cancer development and progression. Moreover, they appear as new promising non-invasive biomarkers for early lung cancer diagnosis. Here, we highlight their potential as biomarker in lung cancer and present how bioinformatics can contribute to the development of non-invasive diagnostic tools. For this, we discuss several bioinformatics algorithms and software tools for a comprehensive understanding and functional characterization of microRNAs and long non-coding RNAs.

Non-coding RNAs in lung cancer: Contribution of bioinformatics analysis to the development of non-invasive diagnostic tools
Scientific article

Meik Kunz, Beat Wolf

Human Genetics and Genomics, 2016

Link to the publication

2015

DNAseq workflow in a diagnostic context and an example of a user friendly implementation
Scientific article ArODES

Beat Wolf, Pierre Kuonen, Thomas Dandekar, David Atlan

BioMed Research International, 2015, article no. 403497

Link to the publication

Abstract:

Over recent years next generation sequencing (NGS) technologies evolved from costly tools used by very few, to a much more accessible and economically viable technology. Through this recently gained popularity, its use-cases expanded from research environments into clinical settings. But the technical know-how and infrastructure required to analyze the data remain an obstacle for a wider adoption of this technology, especially in smaller laboratories. We present GensearchNGS, a commercial DNAseq software suite distributed by Phenosystems SA. The focus of GensearchNGS is the optimal usage of already existing infrastructure, while keeping its use simple. This is achieved through the integration of existing tools in a comprehensive software environment, as well as custom algorithms developed with the restrictions of limited infrastructures in mind. This includes the possibility to connect multiple computers to speed up computing intensive parts of the analysis such as sequence alignments. We present a typical DNAseq workflow for NGS data analysis and the approach GensearchNGS takes to implement it. The presented workflow goes from raw data quality control to the final variant report. This includes features such as gene panels and the integration of online databases, like Ensembl for annotations or Cafe Variome for variant sharing.

2024

Managing and optimizing a set of PV installations at the low-voltage grid level: a data-driven concept through machine learning techniques
Conférence ArODES

Thibaud Alt, Beat Wolf, Jean-Philippe Bacher, Frédéric Montet

Proceedings of the 7th European Grid Service Market Symposium, 1-2 July 2024, Lucerne, Switzerland

Lien vers la conférence

Résumé:

This paper proposes a data-driven approach to managing and optimizing a set of photovoltaic (PV) installations by exploiting the possibilities of spatio-temporal modeling and machine learning techniques. Given the variable nature of solar energy production, optimizing PV installations for maximum output and efficiency is crucial. The aim is to identify trends, patterns, challenges, and opportunities for improvement in the operation of multi-site PV systems as well as to provide information for optimal management of the low-voltage network. A diverse array of methods is compared to forecast energy production, detect declines in system performance and refine maintenance scheduling. This study contributes to the growing field of renewable energy management by showcasing the effectiveness of ML models in optimizing a set of PV systems. It sets the stage for future progress in incorporating renewable energy sources into the electrical grid.

2023

Character queries: a transformer-based approach to on-line handwritten character segmentation
Conférence ArODES

Michael Jungo, Beat Wolf, Andrii Maksai, Claudiu Musat, Andreas Fischer

Document analysis and recognition ICDAR 2023 ; Proceedings of the 17th International Conference, 21-26 August 2023, San José, CA, USA

Lien vers la conférence

Résumé:

On-line handwritten character segmentation is often associated with handwriting recognition and even though recognition models include mechanisms to locate relevant positions during the recognition process, it is typically insufficient to produce a precise segmentation. Decoupling the segmentation from the recognition unlocks the potential to further utilize the result of the recognition. We specifically focus on the scenario where the transcription is known beforehand, in which case the character segmentation becomes an assignment problem between sampling points of the stylus trajectory and characters in the text. Inspired by the k-means clustering algorithm, we view it from the perspective of cluster assignment and present a Transformer-based architecture where each cluster is formed based on a learned character query in the Transformer decoder block. In order to assess the quality of our approach, we create character segmentation ground truths for two popular on-line handwriting datasets, IAM-OnDB and HANDS-VNOnDB, and evaluate multiple methods on them, demonstrating that our approach achieves the overall best results.
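
The cluster-assignment view described in this abstract can be illustrated with a very small sketch. It shows only a k-means-style assignment step (each sampling point goes to its nearest character center), not the Transformer architecture or learned character queries of the paper. The function name `assign_points` and the use of fixed 2-D centers are assumptions made for illustration.

```python
def assign_points(points, centers):
    """Assign each (x, y) sampling point to the index of its nearest center.

    This is the assignment step of k-means. `centers` stands in for one
    location per character; in the paper the clusters are formed by learned
    queries instead, which this sketch does not attempt to reproduce.
    """
    def sq_dist(p, c):
        # Squared Euclidean distance is enough for nearest-center comparison.
        return (p[0] - c[0]) ** 2 + (p[1] - c[1]) ** 2

    return [
        min(range(len(centers)), key=lambda i: sq_dist(p, centers[i]))
        for p in points
    ]
```

Each returned index names the character a sampling point is attributed to, so the result is a per-point segmentation label.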

Bullingers Briefwechsel zugänglich machen: Stand der Handschriftenerkennung
Conférence ArODES

Philip Ströbel, Tobias Hodel, Andreas Fischer, Anna Scius-Bertrand, Beat Wolf, Anna Janka, Jonas Widmer, Patricia Scheurer, Martin Volk

Digital Humanities im deutschsprachigen Raum 2023 (DHd2023): Open Humanities, Open Culture, 13-17 March 2023, Trier, Germany, and Belval, Luxembourg

Lien vers la conférence

2022

Improving handwriting recognition for historical documents using synthetic text lines
Conférence ArODES

Martin Spoto, Beat Wolf, Andreas Fischer, Anna Scius-Bertrand

Proceedings of the 20th International Conference of the International Graphonomics Society, IGS 2021, Intertwining Graphonomics with Human Movements, -9 June 2022, Las Palmas de Gran Canaria

Lien vers la conférence

Résumé:

Automatic handwriting recognition for historical documents is a key element for making our cultural heritage available to researchers and the general public. However, current approaches based on machine learning require a considerable amount of annotated learning samples to read ancient scripts and languages. Producing such ground truth is a laborious and time-consuming task that often requires human experts. In this paper, to cope with a limited amount of learning samples, we explore the impact of using synthetic text line images to support the training of handwriting recognition systems. For generating text lines, we consider lineGen, a recent GAN-based approach, and for handwriting recognition, we consider HTR-Flor, a state-of-the-art recognition system. Different meta-learning strategies are explored that schedule the addition of synthetic text line images to the existing real samples. In an experimental evaluation on the well-known Bentham dataset as well as the newly introduced Bullinger dataset, we demonstrate a significant improvement of the recognition performance when combining real and synthetic samples.

2021

Annotation-free character detection in historical Vietnamese stele images
Conférence ArODES

Anna Scius-Bertrand, Michael Jungo, Beat Wolf, Andreas Fischer, Marc Bui

Proceedings of the International Conference on Document Analysis and Recognition (ICDAR 2021), 5-10 September 2021, Lausanne, Switzerland

Lien vers la conférence

Résumé:

Images of historical Vietnamese stone engravings provide historians with a unique opportunity to study the past of the country. However, due to the large heterogeneity of thousands of images regarding both the text foreground and the stone background, it is difficult to use automatic document analysis methods for supporting manual examination, especially in view of the labeling effort needed for training machine learning systems. In this paper, we present a method for finding the location of Chu Nom characters in the main text of the steles without the need for any human annotation. Using self-calibration, fully convolutional object detection methods trained on printed characters are successfully adapted to the handwritten image collection. The achieved detection results are promising for subsequent document analysis tasks, such as keyword spotting or transcription.

2020

Using CNNs to optimize numerical simulations in geotechnical engineering
Conférence ArODES

Beat Wolf, Jonathan Donzallaz, Colette Jost, Amanda Hayoz, Stéphane Commend, Jean Hennebert

Lecture Notes in Computer Science ; Proceedings of IAPR Workshop on Artificial Neural Networks in Pattern Recognition, ANNPR 2020 : Artificial Neural Networks in Pattern Recognition, 2-4 September 2020, Winterthur, Switzerland

Lien vers la conférence

Résumé:

Deep excavations are today mainly designed by manually optimising the wall’s geometry, stiffness and strut or anchor layout. In order to better automate this process for sustained excavations, we are exploring the possibility of approximating key values using a machine learning (ML) model instead of calculating them with time-consuming numerical simulations. After demonstrating in our previous work that this approach works for simple use cases, we show in this paper that this method can be enhanced to adapt to complex real-world supported excavations. We have improved our ML model compared to our previous work by using a convolutional neural network (CNN) model, coding the excavation configuration as a set of layers of fixed height containing the soil parameters as well as the geometry of the walls and struts. The system is trained and evaluated on a set of synthetically generated situations using numerical simulation software. To validate this approach, we also compare our results to a set of 15 real-world situations in a t-SNE. Using our improved CNN model we could show that applying machine learning to predict the output of numerical simulation in the domain of geotechnical engineering not only works in simple cases but also in more complex, real-world situations.

2016

GNATY: optimized NGS variant calling and coverage analysis
Conférence ArODES

Beat Wolf, Pierre Kuonen, Thomas Dandekar

Proceedings of the 4th International Conference, Bioinformatics and Biomedical Engineering, IWBBIO 2016, 20-22 April 2016, Granada, Spain

Lien vers la conférence

Résumé:

Next-generation sequencing produces an ever-increasing amount of data, requiring increasingly fast computing infrastructures to keep up. We present GNATY, a collection of tools for NGS data analysis, aimed at optimizing parts of the sequence analysis process to reduce the hardware requirements. The tools are developed with efficiency in mind, using multithreading and other techniques to speed up the analysis. The architecture has been verified by implementing a variant caller based on the Varscan 2 variant calling model, achieving a speedup of nearly 18 times. Additionally, the flexibility of the algorithm is demonstrated by applying it to coverage analysis. Compared to BEDtools 2, GNATY produced the same analysis results in only half the time. The speed increase allows for faster data analysis and more flexibility to analyse the same sample using multiple settings. The software is freely available for non-commercial usage at http://gnaty.phenosystems.com/.
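
For readers unfamiliar with coverage analysis, the following minimal sketch shows the basic computation: how many reads cover each position of a region. It is not GNATY's implementation; the function name `per_base_coverage`, the half-open `(start, end)` interval format and the difference-array technique are assumptions chosen for illustration.

```python
from itertools import accumulate


def per_base_coverage(reads, region_length):
    """Return the number of reads covering each position in [0, region_length).

    `reads` is an iterable of half-open (start, end) intervals; this input
    format is an assumption of the sketch, not a GNATY data structure.
    """
    # Difference array: +1 where a read starts, -1 just past where it ends.
    diff = [0] * (region_length + 1)
    for start, end in reads:
        # Clip each read to the region and skip reads that fall outside it.
        start = max(start, 0)
        end = min(end, region_length)
        if start >= end:
            continue
        diff[start] += 1
        diff[end] -= 1
    # The running sum of the differences is the coverage at each position.
    return list(accumulate(diff[:-1]))
```

The difference array keeps the cost linear in the number of reads plus the region length, instead of touching every covered base of every read.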

2015

Multilevel parallelism in sequence alignment using a streaming approach
Conférence ArODES

Beat Wolf, Pierre Kuonen, Thomas Dandekar

Proceedings of Nesus 2015 workshop, 10-11 September 2015, Krakow, Poland

Lien vers la conférence

Résumé:

Ultrascale computing and bioinformatics are two rapidly growing fields with a big impact right now and even more so in the future. The introduction of next-generation sequencing pushes current bioinformatics tools and workflows to their limits in terms of performance. This forces the tools to become increasingly performant to keep up with the growing speed at which sequencing data is created. Ultrascale computing can greatly benefit bioinformatics in the challenges it faces today, especially in terms of scalability, data management and reliability. But before this is possible, the algorithms and software used in the field of bioinformatics need to be prepared to be used in a heterogeneous distributed environment. For this paper we chose to look at sequence alignment, which has been an active topic of research to speed up next-generation sequence analysis, as it is ideally suited for parallel processing. We present a multilevel stream-based parallel architecture to transparently distribute sequence alignment over multiple cores of the same machine, multiple machines and cloud resources. The same concepts are used to achieve multithreaded and distributed parallelism, making the architecture simple to extend and adapt to new situations. A prototype of the architecture has been implemented using an existing commercial sequence aligner. We demonstrate the flexibility of the implementation by running it on different configurations, combining local and cloud computing resources.
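
The general idea of feeding a stream of reads to parallel workers can be sketched as follows. This is only a single-machine illustration of the pattern, not the architecture from the paper: all names (`batches`, `align_batch`, `streamed_alignment`) are hypothetical, and the "aligner" is a toy exact-match search standing in for a real one.

```python
from concurrent.futures import ThreadPoolExecutor
from itertools import islice


def batches(stream, size):
    """Yield successive lists of up to `size` items from an iterable."""
    it = iter(stream)
    while True:
        batch = list(islice(it, size))
        if not batch:
            return
        yield batch


def align_batch(reference, reads):
    """Toy stand-in for an aligner: exact-match position, or -1 if absent."""
    return [(read, reference.find(read)) for read in reads]


def streamed_alignment(reference, read_stream, batch_size=2, workers=4):
    """Send batches of reads to worker threads and yield results in input order.

    Executor.map submits all batches up front, so this sketch does not bound
    the number of in-flight batches as a real streaming system would. Threads
    are used only to show the structure; in CPython they do not speed up
    pure-Python work.
    """
    with ThreadPoolExecutor(max_workers=workers) as pool:
        results = pool.map(
            lambda batch: align_batch(reference, batch),
            batches(read_stream, batch_size),
        )
        for batch_result in results:
            yield from batch_result
```

For example, `list(streamed_alignment("ACGTACGT", ["ACG", "GTA", "TTT"]))` yields one `(read, position)` pair per read, in the order the reads arrived.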

FriendComputing: organic application centric distributed computing
Conférence ArODES

Beat Wolf, Loïc Monney, Pierre Kuonen

Proceedings of Nesus 2015 workshop, 10-11 September 2015, Krakow, Poland

Lien vers la conférence

Résumé:

Building Ultrascale computer systems is a hard problem that is neither solved nor fully explored. Combining the computing resources of multiple organizations, often in different administrative domains with heterogeneous hardware and diverse demands on the system, requires new tools and frameworks to be put in place. In previous work we developed POP-Java, a Java programming language extension that makes it easy to develop distributed applications in a heterogeneous environment. We now present an extension to the POP-Java language that allows the creation of application-centered networks in which any member can benefit from the computing power and storage capacity of the other members. An accounting system is integrated, allowing the different members of the network to bill the usage of their resources to the other members, if so desired. The system is expanded through a process similar to that seen in social networks, making it possible to use the resources of friends and friends of friends. Parts of the proposed system have been implemented as a prototype inside the POP-Java programming language.

2014

POP-Java: parallélisme et distribution orienté objet
Conférence ArODES

Beat Wolf, Pierre Kuonen, Thomas Dandekar

Actes de la conférence ComPAS'2014 : Parallélisme / Architecture / Système, 23-25 April 2014, Neuchâtel, Switzerland

Lien vers la conférence

Résumé:

This article presents the integration of the POP (Parallel Object Programming) model into the Java programming language. The POP model makes it possible to create objects in a distributed environment and to access them in a parallel manner that is transparent to the programmer. This work builds on earlier work on POP-C++, an implementation of the POP model in C++. Using a concrete example, the performance and features of POP-Java are presented and validated.

2013

A novel approach for heuristic pairwise DNA sequence alignment
Conférence ArODES

Beat Wolf, Pierre Kuonen

Proceedings of the International Conference on Bioinformatics and Computational Biology, BIOCOMP'13, 18 March 2013

Lien vers la conférence

Résumé:

With the ever-increasing speed at which DNA can be sequenced, the available computing power is struggling to keep pace. In contrast to the earlier days of DNA sequencing, it now takes longer to align the data than to sequence it. Thus, new approaches to solving the alignment problem need to be investigated. A perfect sequence alignment might not always be required, especially when available computing power is limited. With that in mind, a purely heuristic approach to sequence alignment is proposed and evaluated. The evaluation shows good performance and quality with the datasets used, but suggests that the algorithm should not be used as the final alignment step, but rather to quickly identify candidate alignment locations.
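
To make the notion of "quickly identifying candidate alignment locations" concrete, here is a minimal sketch of one common heuristic, k-mer seed voting. The abstract does not describe the paper's algorithm, so this is a swapped-in, generic technique and not the authors' method; the names `build_kmer_index` and `candidate_locations` are hypothetical.

```python
from collections import Counter, defaultdict


def build_kmer_index(reference, k):
    """Map every k-mer of the reference to the list of positions where it occurs."""
    index = defaultdict(list)
    for i in range(len(reference) - k + 1):
        index[reference[i:i + k]].append(i)
    return index


def candidate_locations(read, index, k, top=3):
    """Rank reference offsets by how many k-mers of `read` vote for them.

    A k-mer found at reference position `pos` and read position `j` implies
    that the read starts at `pos - j`. Offsets with many votes are candidate
    locations that a slower, exact aligner could then verify.
    """
    votes = Counter()
    for j in range(len(read) - k + 1):
        for pos in index.get(read[j:j + k], ()):
            votes[pos - j] += 1
    return votes.most_common(top)
```

The returned list pairs each candidate start offset with its vote count, highest first, which matches the role the abstract suggests: a fast pre-filter rather than a final alignment step.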

Réalisations
