PEOPLE@HES-SO – Directory and Skills Repository

Fischer Andreas

Full Professor (University of Applied Sciences)

Main skills

Pattern Recognition

Applied Machine Learning

Document Analysis

Handwriting Recognition

Natural Language Processing

Graph Matching

Geometric Deep Learning

  • Contact

  • Teaching

  • Research

  • Publications

  • Conferences

Main contract

Full Professor (University of Applied Sciences)

Phone: +41 26 429 67 34

Office: HEIA_D20.07

Haute école d'ingénierie et d'architecture de Fribourg
Boulevard de Pérolles 80, 1700 Fribourg, CH
HEIA-FR
BSc HES-SO in Computer Science - Haute école d'ingénierie et d'architecture de Fribourg
  • Algorithms and Data Structures
  • Formal Languages
  • IT Project Management

Ongoing

Automatic handwriting recognition for tax form validator (2017-2020)

Role: Principal applicant

Funding: TAINA Technology

Project description:

2017-2020, TAINA Technology

Research team within HES-SO: Fischer Andreas

Status: Ongoing

Automatic translation from Swiss German to High German (2018-2021)

Role: Principal applicant

Funding: Swisscom

Project description:

2018-2021, Swisscom

Research team within HES-SO: Fischer Andreas

Status: Ongoing

Towards graph-based keyword spotting in historical Vietnamese steles (2020-2021)

Role: Principal applicant

Funding: Hasler Foundation

Project description:

2020-2021, Hasler Foundation

Research team within HES-SO: Fischer Andreas

Status: Ongoing

2023

Universality of Büchi automata: analysis with graph neural networks
Scientific article ArODES

Christophe Stammet, Ulrich Ultes-Nitsche, Andreas Fischer

IEEE Access, 2023, vol. 11, pp. 140993-141007

Abstract:

The universality check of Büchi automata is a foundational problem in automata-based formal verification, closely related to the complementation problem, and is known to be PSPACE-complete. This article introduces a novel approach for creating labelled datasets of Büchi automata concerning their universality. We start with small automata, where the universality check can still be algorithmically performed within a reasonable timeframe, and then apply transformations that provably preserve (non-)universality while increasing their size. This approach enables the generation of large datasets of labelled Büchi automata without the need for an explicit and computationally intensive universality check. We subsequently employ these generated datasets to train Graph Neural Networks (GNNs) for the purpose of classifying automata with respect to their (non-)universality. The classification results indicate that such a network can learn patterns related to the behaviour of Büchi automata that facilitate the recognition of universality. Additionally, our results on randomly generated automata, which were not constructed using the transformation techniques, demonstrate the network’s potential in classifying Büchi automata with respect to universality, extending its applicability beyond cases generated using a specific technique.

A hybrid deep learning approach to keyword spotting in Vietnamese stele images
Scientific article ArODES

Anna Scius-Bertrand, Marc Bui, Andreas Fischer

Informatica, vol. 47, no. 3, pp. 361-372

Abstract:

In order to access the rich cultural heritage conveyed in Vietnamese steles, automatic reading of stone engravings would be a great support for historians, who are analyzing tens of thousands of stele images. Approaching the challenging problem with deep learning alone is difficult because the data-driven models require large representative datasets with expert human annotations, which are not available for the steles and costly to obtain. In this article, we present a hybrid approach to spot keywords in stele images that combines data-driven deep learning with knowledge-based structural modeling and matching of Chu Nom characters. The main advantage of the proposed method is that it is annotation-free, i.e. no human data annotation is required. In an experimental evaluation, we demonstrate that keywords can be successfully spotted with a mean average precision of more than 70% when a single engraving style is considered.
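The mean average precision figure quoted in this abstract can be illustrated with a minimal sketch. The function names and the toy rankings are hypothetical and are not taken from the article; the sketch also assumes that each ranked list contains every candidate, so that all relevant items appear somewhere in the ranking.

```python
def average_precision(ranked_relevance):
    """Average precision for one query.

    ranked_relevance: list of booleans, best-ranked candidate first,
    True where the candidate is a correct match for the keyword.
    """
    hits = 0
    precision_sum = 0.0
    for rank, is_relevant in enumerate(ranked_relevance, start=1):
        if is_relevant:
            hits += 1
            precision_sum += hits / rank  # precision at this rank
    return precision_sum / hits if hits else 0.0


def mean_average_precision(rankings):
    """Mean of the per-query average precisions."""
    return sum(average_precision(r) for r in rankings) / len(rankings)


# Toy example with two keyword queries (made-up data).
toy_rankings = [[True, False, True], [False, True]]
toy_map = mean_average_precision(toy_rankings)
```

A value above 0.7, as reported in the abstract, means that correct matches tend to be ranked near the top of the retrieval list for most queries.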

2022

Self-rule to multi-adapt: generalized multi-source feature learning using unsupervised domain adaptation for colorectal cancer tissue detection
Scientific article ArODES

Christian Abbet, Linda Studer, Andreas Fischer, Heather Dawson, Inti Zlobec, Behzad Bozorgtabar, Jean-Philippe Thiran

Medical Image Analysis, 2022, vol. 79, article no. 102473

Abstract:

Supervised learning is constrained by the availability of labeled data, which are especially expensive to acquire in the field of digital pathology. Making use of open-source data for pre-training or using domain adaptation can be a way to overcome this issue. However, pre-trained networks often fail to generalize to new test domains that are not distributed identically due to variations in tissue staining, type, and texture. Additionally, current domain adaptation methods mainly rely on fully-labeled source datasets. In this work, we propose Self-Rule to Multi-Adapt (SRMA), which takes advantage of self-supervised learning to perform domain adaptation, and removes the necessity of fully-labeled source datasets. SRMA can effectively transfer the discriminative knowledge obtained from a few labeled source domain’s data to a new target domain without requiring additional tissue annotations. Our method harnesses both domains’ structures by capturing visual similarity with intra-domain and cross-domain self-supervision. Moreover, we present a generalized formulation of our approach that allows the framework to learn from multiple source domains. We show that our proposed method outperforms baselines for domain adaptation of colorectal tissue type classification in single and multi-source settings, and further validate our approach on an in-house clinical cohort. The code and trained models are available open-source: https://github.com/christianabbet/SRA.

2021

Learning graph edit distance by graph neural networks
Scientific article ArODES

Pau Riba, Josep Lladós, Alicia Fornés, Andreas Fischer

Pattern Recognition, 2021, vol. 120, article no. 108132

Abstract:

The emergence of geometric deep learning as a novel framework to deal with graph-based representations has displaced traditional approaches in favor of completely new methodologies. In this paper, we propose a new framework able to combine the advances in deep metric learning with traditional approximations of the graph edit distance. Hence, we propose an efficient graph distance based on the novel field of geometric deep learning. Our method employs a message passing neural network to capture the graph structure and leverages this information for distance computation. The performance of the proposed graph distance is validated in two different scenarios. On the one hand, in graph retrieval of handwritten words, i.e. keyword spotting, it shows superior performance when compared with (approximate) graph edit distance benchmarks. On the other hand, it demonstrates competitive results for graph similarity learning when compared with the current state of the art on a recent benchmark dataset.

Transcription alignment of historical Vietnamese manuscripts without human-annotated learning samples
Scientific article ArODES

Anna Scius-Bertrand, Michael Jungo, Beat Wolf, Andreas Fischer, Marc Bui

Applied Sciences, 2021, vol. 11, no. 11, article no. 4894

Abstract:

The current state of the art for automatic transcription of historical manuscripts is typically limited by the requirement of human-annotated learning samples, which are necessary to train specific machine learning models for specific languages and scripts. Transcription alignment is a simpler task that aims to find a correspondence between text in the scanned image and its existing Unicode counterpart, a correspondence which can then be used as training data. The alignment task can be approached with heuristic methods dedicated to certain types of manuscripts, or with weakly trained systems reducing the required amount of annotations. In this article, we propose a novel learning-based alignment method based on fully convolutional object detection that does not require any human annotation at all. Instead, the object detection system is initially trained on synthetic printed pages using a font and then adapted to the real manuscripts by means of self-training. On a dataset of historical Vietnamese handwriting, we demonstrate the feasibility of annotation-free alignment as well as the positive impact of self-training on the character detection accuracy, reaching a detection accuracy of 96.4% with a YOLOv5m model without using any human annotation.

The VCG tool: knowledge base and methods
Book chapter ArODES

Andreas Fischer, Michael Keller

AlpLinkBioEco (2021). Creating Bio-based Value in the Alpine Space. Interreg Alpine Space. (pp. 20-26). 2021, Interreg Alpine Space

Bio-based business opportunities unearthed: the VCG software tool
Scientific article ArODES

Michael Keller, Andreas Fischer, Dorian Wessely, Ashna Mudaffer

Working Paper, February 2021, HES-SO//FR HEIA-FR, iCoSys, PICC, INNOSQUARE, Business Upper Austria

Citizen participation & digital tools to improve pedestrian mobility in cities
Scientific article

Sandoz Romain, Ertz Olivier, Scius-Bertrand Anna, Fischer Andreas, Hüsser Olivier, Ghorbel Hatem

The International Archives of Photogrammetry, Remote Sensing and Spatial Information Sciences, 2021, vol. XLVI-4/W1-2021, pp. 29-34

Abstract:

In this work, we present a framework, supported by mobile and web apps, that proposes personalized pedestrian routes matching a user's mobility profile while accounting for mobility impediment factors. We explain how the latter were defined using a pedestrian-centric approach based on travel experiences as perceived in the field by senior citizens. Through workshops, six main factors that may influence pedestrian route choices were revealed: passability, obstacle in path, surface problem, security, sidewalk width, slope. These categories were used to build digital tools and guide a citizen participatory approach to collect geolocated points of obstacle documented with walkability information (picture, category, impact score, free comment). We also involved citizens in evaluating this information, especially senior referents for validation. Finally, we present how we connect these points of obstacle with a pedestrian network based on OpenStreetMap to configure a routing cost function. The framework was partially deployed in 2020 with a limited number of people due to the pandemic. Nonetheless, we share lessons learned from interaction with citizens in the design of such a framework, whose underlying workflow is reproducible. We plan to further assess its relevance and sustainability in the future.

2020

Handwritten historical document analysis, recognition, and retrieval - state of the art and future trends
Book ArODES

Andreas Fischer, Marcus Liwicki

2020, New Jersey : World Scientific, 268 p.

Abstract:

In recent years, libraries and archives all around the world have increased their efforts to digitize historical manuscripts. To integrate the manuscripts into digital libraries, pattern recognition and machine learning methods are needed to extract and index the contents of the scanned images. The unique compendium describes the outcome of the HisDoc research project, a pioneering attempt to study the whole processing chain of layout analysis, handwriting recognition, and retrieval of historical manuscripts. This description is complemented with an overview of other related research projects, in order to convey the current state of the art in the field and outline future trends. This must-have volume is a relevant reference work for librarians, archivists and computer scientists.

Introduction
Book chapter ArODES

Andreas Fischer, Marcus Liwicki, Rolf Ingold

In Fischer, Andreas, Ingold, Rolf, Liwicki, Marcus, Handwritten historical document analysis, recognition, and retrieval - state of the art and future trends (8 p.). 2020, New Jersey : World Scientific

Abstract:

Document image analysis or document recognition refers to the process of extracting valuable information from document images. Although a few optical character reading systems were already available in the 1970s, fundamental research activities on this challenging task mainly emerged with the development of scanner technologies in the 1980s, which allowed affordable document image acquisition. At that time, the main applications were focused on office automation and the interpretation of printed material…

Automatic handwriting recognition in historical documents
Book chapter ArODES

Andreas Fischer

Handwritten historical document analysis, recognition, and retrieval - state of the art and future trends (pp. 67-80). 2020, New Jersey : World Scientific

IAM-HistDB: a dataset of handwritten historical documents
Book chapter ArODES

Andreas Fischer

Handwritten historical document analysis, recognition, and retrieval - state of the art and future trends (pp. 11-23). 2020, New Jersey : World Scientific

Conclusions and future trends
Book chapter ArODES

Andreas Fischer, Marcus Liwicki, Rolf Ingold

Handwritten historical document analysis, recognition, and retrieval - state of the art and future trends (pp. 249-251). 2020, New Jersey : World Scientific

Taking tumour budding to the next frontier: a post International Tumour Budding Consensus Conference (ITBCC) 2016 review
Scientific article ArODES

Linda Studer, Annika Blank, John-Melle Bokhorst, Iris D. Nagtegaal, Inti Zlobec, Alessandro Lugli, Andreas Fischer, Heather Dawson

Histopathology, 2020, vol. 78, no. 4, pp. 476-484

Abstract:

Tumour budding in colorectal cancer, defined as single tumour cells or small clusters containing four or fewer tumour cells, is a robust and independent biomarker of aggressive tumour biology. On the basis of published data in the literature, the evidence is certainly in favour of reporting tumour budding in routine practice. One important aspect of implementing tumour budding has been to establish a standardised and evidence-based scoring method, as was recommended by the International Tumour Budding Consensus Conference (ITBCC) in 2016. Further developments have aimed at establishing methods for automated tumour budding assessment. A digital approach to scoring tumour buds has great potential to assist in performing an objective budding count but, like the manual consensus method, must be validated and standardised. The aim of the present review is to present general considerations behind the ITBCC scoring method, and a broad overview of the current situation and challenges regarding automated tumour budding detection methods.

2019

Modeling 3D movements with the kinematic theory of rapid human movements
Book chapter ArODES

Andreas Fischer, Roman Schindler, Manuel Bouillon, Réjean Plamondon

In Ferrer, Miguel Ángel, Marcelli, Angelo, Plamondon, Réjean, The lognormality principle and its applications in e-security, e-learning and e-health (pp. 327-342). 2019, New Jersey : World Scientific

Abstract:

The Kinematic Theory of rapid human movements analytically describes pen tip movements as a sequence of elementary strokes with lognormal speed. The theory has been confirmed in a large number of experimental evaluations, achieving a high reconstruction quality when compared with observed trajectories and providing pertinent features for biomedical applications as well as biometric identification. So far, the Kinematic Theory has focused on one-dimensional movements with the Delta-Lognormal model and on two-dimensional movements with the Sigma-Lognormal model. In this chapter, we present a model for movements in three dimensions, which naturally extends the Sigma-Lognormal approach. We evaluate our method on two action recognition datasets and an air-writing dataset, demonstrating a high reconstruction quality for modelling rapid 3D movements in all cases.
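The "elementary strokes with lognormal speed" mentioned in this abstract can be sketched as follows. This is only an illustration of the lognormal speed profile of a single stroke, using the parameter names conventional in the Kinematic Theory literature (amplitude D, onset time t0, log-time delay mu, log-response time sigma); the numeric values are made up, and the chapter's 3D model itself is not reproduced here.

```python
import math


def lognormal_speed(t, D, t0, mu, sigma):
    """Speed of one elementary stroke at time t (zero before onset t0)."""
    if t <= t0:
        return 0.0
    x = t - t0
    coeff = D / (sigma * math.sqrt(2.0 * math.pi) * x)
    return coeff * math.exp(-((math.log(x) - mu) ** 2) / (2.0 * sigma ** 2))


# Hypothetical stroke parameters, chosen only for illustration.
D, t0, mu, sigma = 5.0, 0.1, -1.5, 0.3

# Integrating the speed over time recovers the stroke amplitude D,
# checked here with a simple Riemann sum.
dt = 0.001
travelled = sum(lognormal_speed(i * dt, D, t0, mu, sigma) * dt for i in range(5000))
```

A full trajectory is then modelled as a superposition of several such strokes that overlap in time.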

Combining graph edit distance and triplet networks for offline signature verification
Scientific article ArODES

Paul Maergner, Vinaychandran Pondenkandath, Michele Alberti, Marcus Liwicki, Kaspar Riesen, Rolf Ingold, Andreas Fischer

Pattern Recognition Letters, 2019, vol. 125, pp. 527-533

Abstract:

Offline signature verification is a challenging pattern recognition task where a writer model is inferred using only a small number of genuine signatures. A combination of complementary writer models can make it more difficult for an attacker to deceive the verification system. In this work, we propose to combine a recent structural approach based on graph edit distance with a statistical approach based on deep triplet networks. The combination of the structural and statistical models achieves significant improvements in performance on four publicly available benchmark datasets, highlighting their complementary perspectives.

2018

On the impact of using utilities rather than costs for graph matching
Scientific article ArODES

Kaspar Riesen, Andreas Fischer, Horst Bunke

Neural Processing Letters, 2018, vol. 48, pp. 691-707

Abstract:

The concept of graph edit distance constitutes one of the most flexible graph matching paradigms available. The major drawback of graph edit distance, viz. the exponential time complexity, has been recently overcome by means of a reformulation of the edit distance problem to a linear sum assignment problem. However, the substantial speed up of the matching is also accompanied by an approximation error on the distances. The major contribution of this paper is the introduction of a transformation process in order to convert the underlying cost model into a utility model. The benefit of this transformation is that it enables the integration of additional information in the assignment process. We empirically confirm the positive effects of this transformation on five benchmark graph sets with respect to the accuracy and run time of a distance based classifier.

Keyword spotting in historical handwritten documents based on graph matching
Scientific article ArODES

Michael Stauffer, Kaspar Riesen, Andreas Fischer

Pattern Recognition, 2018, vol. 81, pp. 240-253

Abstract:

In the last decades, historical handwritten documents have become increasingly available in digital form. Yet, the accessibility of these documents with respect to browsing and searching has remained limited, as fully automatic transcription is often not possible or not sufficiently accurate. This paper proposes a novel reliable approach for template-based keyword spotting in historical handwritten documents. In particular, our framework makes use of different graph representations for segmented word images and a sophisticated matching procedure. Moreover, we extend our method to a spotting ensemble. In an exhaustive experimental evaluation on four widely used benchmark datasets, we show that the proposed approach is able to keep up with or even outperform several state-of-the-art methods for template- and learning-based keyword spotting.

Graph-based keyword spotting in historical manuscripts using Hausdorff edit distance
Scientific article ArODES

Mohammad Reza Ameri, Michael Stauffer, Kaspar Riesen, Tien D. Bui, Andreas Fischer

Pattern Recognition Letters

Abstract:

Keyword spotting enables content-based retrieval of scanned historical manuscripts using search terms, which, in turn, facilitates the indexation in digital libraries. Recent approaches include graph-based representations that capture the complex structure of handwriting. However, the high representational power of graphs comes at the cost of high computational complexity for graph matching. In this article, we investigate the potential of Hausdorff edit distance (HED) for keyword spotting. It is an efficient quadratic-time approximation of the graph edit distance. In a comprehensive experimental evaluation with four types of handwriting graphs and four benchmark datasets (George Washington, Parzival, Botany, and Alvermann Konzilsprotokolle), we demonstrate a strong performance of the proposed HED-based method when compared with the state of the art, both in terms of precision and speed.
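To give an idea of why a Hausdorff-style matching runs in quadratic time, here is a simplified, node-only sketch. It is not the method evaluated in the article: edge costs are omitted entirely, nodes are assumed to be plain (x, y) points, the substitution cost is assumed to be the Euclidean distance, and halving that cost (because a substitution can be counted once from each graph) is an assumption of this sketch. The function name and the `node_cost` parameter are hypothetical.

```python
import math


def hausdorff_edit_distance(nodes1, nodes2, node_cost=1.0):
    """Simplified node-only Hausdorff-style edit distance between two point sets.

    nodes1, nodes2: lists of (x, y) tuples.
    node_cost: cost of deleting or inserting a single node.
    """

    def directed(src, dst):
        total = 0.0
        for u in src:
            best = node_cost  # option 1: delete (or insert) the node
            for v in dst:
                # option 2: substitute u by its cheapest counterpart (halved cost)
                best = min(best, math.dist(u, v) / 2.0)
            total += best
        return total

    # Each node is matched independently to its cheapest option, so the two
    # nested loops give a running time proportional to len(nodes1) * len(nodes2).
    return directed(nodes1, nodes2) + directed(nodes2, nodes1)


# Made-up toy graphs: two nearly identical nodes plus one unmatched node.
query = [(0.0, 0.0), (1.0, 0.0)]
word = [(0.0, 0.1), (1.0, 0.0), (5.0, 5.0)]
distance = hausdorff_edit_distance(query, word)
```

Because several nodes may pick the same counterpart, such a matching does not describe a valid edit path, which is why it yields an approximation rather than the exact graph edit distance.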

Filters for graph-based keyword spotting in historical handwritten documents
Scientific article ArODES

Michael Stauffer, Andreas Fischer, Kaspar Riesen

Pattern Recognition Letters

Abstract:

The accessibility of handwritten historical documents is often constrained by the limited feasibility of automatic full transcriptions. Keyword Spotting (KWS), which allows arbitrary query words to be retrieved from documents, has been proposed as an alternative. In the present paper, we make use of graphs for representing word images. The actual keyword spotting is thus based on matching a query graph with all document graphs. However, even with relatively fast approximation algorithms, the sheer number of matchings might limit the practical application of this approach. For this reason, we present two novel filters with linear time complexity that substantially reduce the number of graph matchings actually required. In particular, these filters estimate a graph dissimilarity between a query graph and all document graphs based on their node and edge distribution in a polar coordinate system. Eventually, all graphs from the document with distributions that differ too heavily from the query’s node/edge distribution are eliminated. In an experimental evaluation on four different historical documents, we show that about 90% of the matchings can be omitted, while the KWS accuracy is not negatively affected.

Dynamic signature verification system based on one real signature
Scientific article ArODES

Moises Diaz, Andreas Fischer, Miguel A. Ferrer, Réjean Plamondon

IEEE Transactions on Cybernetics, 2018, vol. 48, no. 1, pp. 228-239

Abstract:

The dynamic signature is a biometric trait widely used and accepted for verifying a person's identity. Current automatic signature-based biometric systems typically require five, ten, or even more specimens of a person's signature to learn intrapersonal variability sufficient to provide an accurate verification of the individual's identity. To mitigate this drawback, this paper proposes a procedure for training with only a single reference signature. Our strategy consists of duplicating the given signature a number of times and training an automatic signature verifier with each of the resulting signatures. The duplication scheme is based on a sigma lognormal decomposition of the reference signature. Two methods are presented to create human-like duplicated signatures: the first varies the strokes' lognormal parameters (stroke-wise) whereas the second modifies their virtual target points (target-wise). A challenging benchmark, assessed with multiple state-of-the-art automatic signature verifiers and multiple databases, proves the robustness of the system. Experimental results suggest that our system, with a single reference signature, is capable of achieving a similar performance to standard verifiers trained with up to five signature specimens.

2017

A user-centered segmentation method for complex historical manuscripts based on document graphs
Scientific article ArODES

Angelika Garz, Mathias Seuret, Andreas Fischer, Rolf Ingold

IEEE Transactions on Human-Machine Systems, 2017, vol. 47, no. 2, pp. 181-193

Abstract:

In historical manuscripts, humans can detect handwritten words, lines, and decorations with ease even if they do not know the language or the script. Yet for automatic processing this task has proven elusive, especially in the case of handwritten documents with complex layouts, which is why semiautomatic methods that integrate the human user into the process are needed. In this paper, we introduce a user-centered segmentation method based on document graphs and scribbling interaction. The graphs capture a sparse representation of the document's structure that can then be edited by the user with a stylus on a touch-sensitive screen. We evaluate the proposed method on a newly introduced database of historical manuscripts with complex layout and demonstrate, first, that the document graphs are already close to the desired segmentation and, second, that scribbling allows a natural and efficient interaction.

Signature verification based on the kinematic theory of rapid human movements
Scientific article ArODES

Andreas Fischer, Réjean Plamondon

IEEE Transactions on Human-Machine Systems, 2017, vol. 47, no. 2, pp. 169-180

Abstract:

When using tablet computers, smartphones, or digital pens, human users perform movements with a stylus or their fingers that can be analyzed by the kinematic theory of rapid human movements. In this paper, we present a user-centered system for signature verification that performs such a kinematic analysis to verify the identity of the user. It is one of the first systems that is based on a direct comparison of the elementary neuromuscular strokes which are detected in the handwriting. Taking into account the number of strokes, their similarity, and their timing, the string edit distance is employed to derive a dissimilarity measure for signature verification. On several benchmark datasets, we demonstrate that this neuromuscular analysis is complementary to a well-established verification using dynamic time warping. By combining both approaches, our verifier is able to outperform current state-of-the-art results in on-line signature verification.

Improved quadratic time approximation of graph edit distance by combining Hausdorff matching and greedy assignment
Scientific article ArODES

Andreas Fischer, Kaspar Riesen, Horst Bunke

Pattern Recognition Letters, 2017, vol. 87, no. 1, pp. 55-62

Abstract:

Approximation of graph edit distance in polynomial time enables us to compare large, arbitrarily labeled graphs for structural pattern recognition. In a recent approximation framework, bipartite graph matching (BP) has been proposed to reduce the problem of edit distance to a cubic-time linear sum assignment problem (LSAP) between local substructures. Following the same line of research, first attempts towards quadratic-time approximation have been made recently, including a lower bound based on Hausdorff matching (Hausdorff Edit Distance) and an upper bound based on greedy assignment (Greedy Edit Distance). In this paper, we compare the two approaches and derive a novel upper bound (BP2) which combines advantages of both. In an experimental evaluation on the IAM graph database repository, we demonstrate that the proposed quadratic-time methods perform equally well or, quite surprisingly, in some cases even better than the cubic-time method.

2016

Simple and fast geometrical descriptors for writer identification
Scientific article ArODES

Angelika Garz, Marcel Würsch, Andreas Fischer, Rolf Ingold

Electronic Imaging, 2016, vol. 28

Abstract:

Recent advances in writer identification push the limits by using increasingly complex methods relying on sophisticated preprocessing, or the combination of already complex descriptors. In this paper, we pursue a simpler and faster approach to writer identification, introducing novel descriptors computed from the geometrical arrangement of interest points at different scales. They capture orientation distributions and geometrical relationships of script parts such as strokes, junctions, endings, and loops. Thus, we avoid a fixed set of character appearances as in standard codebook-based methods. The proposed descriptors significantly cut down processing time compared to existing methods, are simple and efficient, and can be applied out-of-the-box to an unseen dataset. Evaluations on widely-used datasets show their potential when applied by themselves, and in combination with other descriptors. Limitations of our method relate to the amount of data needed to obtain reliable models.

2024

Zero-shot prompting and few-shot fine-tuning: revisiting document image classification using large language models
Conference ArODES

Anna Scius-Bertrand, Michael Jungo, Jean-Marc Spat, Andreas Fischer

Proceedings of the 27th International Conference, ICPR 2024, 1-5 December 2024, Kolkata, India, Part XIX

Abstract:

Classifying scanned documents is a challenging problem that involves image, layout, and text analysis for document understanding. Nevertheless, for certain benchmark datasets, notably RVL-CDIP, the state of the art is closing in on near-perfect performance when considering hundreds of thousands of training samples. With the advent of large language models (LLMs), which are excellent few-shot learners, the question arises to what extent the document classification problem can be addressed with only a few training samples, or even none at all. In this paper, we investigate this question in the context of zero-shot prompting and few-shot model fine-tuning, with the aim of reducing the need for human-annotated training samples as much as possible.

Post-correction of handwriting recognition using large language models
Conference ArODES

Jean Pool Pereyra Principe, Andreas Fischer, Anna Scius-Bertrand

Proceedings of SOICT 2024, The 13th International Symposium on Information and Communication Technology, 13-15 December 2024, Danang, Vietnam

Abstract:

Handwriting recognition enables the automatic transcription of large volumes of digitized collections, providing access to the content. However, regardless of the system used, some recognition errors still occur. With the advancement of Large Language Models (LLMs), the question arises whether these models can improve handwriting recognition as a post-processing step. We have developed a method for LLM-based post-correction and evaluated it on three benchmark datasets, namely Washington, Bentham, and IAM. We consistently achieved a character error rate reduction of up to 30%, though we observed significant variability depending on the prompt and the LLM used.
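The character error rate named in this abstract is commonly computed as the edit distance between the recognized text and the reference transcription, divided by the reference length. The sketch below follows that common definition and assumes the reported reduction is relative to the error rate before correction; the function names and sample strings are hypothetical and do not come from the paper.

```python
def edit_distance(a, b):
    """Levenshtein distance between two strings (insert, delete, substitute)."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        cur = [i]
        for j, cb in enumerate(b, start=1):
            cur.append(min(prev[j] + 1,                # delete ca
                           cur[j - 1] + 1,             # insert cb
                           prev[j - 1] + (ca != cb)))  # substitute or match
        prev = cur
    return prev[-1]


def character_error_rate(reference, hypothesis):
    """Edit distance normalized by the length of the reference transcription."""
    return edit_distance(reference, hypothesis) / len(reference)


def relative_reduction(cer_before, cer_after):
    """Fraction of the original error rate removed by post-correction."""
    return (cer_before - cer_after) / cer_before


# Made-up example: the recognizer doubled one letter.
cer = character_error_rate("handwriting", "handwritting")
```

Under this reading, going from a character error rate of 10% to 7% corresponds to a 30% reduction.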

Are layout analysis and OCR still useful for document information extraction using foundation models?
Conference ArODES

Anna Scius-Bertrand, Atefeh Fakhari, Lars Vögtlin, Daniel Ribeiro Cabral, Andreas Fischer

Proceedings of the 18th International Conference, 30 August – 4 September 2024, Athens, Greece

Abstract:

With the advent of end-to-end models and the remarkable performance of foundation models, the question arises regarding the relevance of preliminary steps, such as layout analysis and optical character recognition (OCR), for information extraction from document images. We attempt to provide some answers through experiments conducted on a new database of food labels. The goal is to extract nutritional values from cellphone pictures taken in grocery stores. We compare the results of OCR-free models that take the raw images as input (Donut and GPT-4-Vision) with two-stage systems that first perform OCR and then extract information using large language models (LLMs) from the recognized text (Mistral, GPT-3, and GPT-4). To assess the impact of layout analysis, we applied the same systems to three different views of the image: the original full image, a large manual crop containing the entire food label, and a small crop focusing on the relevant nutrition information. Comparative experiments are also conducted on the CORD database of receipts. Our results demonstrate that although OCR-free models achieve a remarkable performance, they still require some guidance regarding the layout, and two-stage systems achieve better results overall.

2023

Impact of the ground truth quality for handwriting recognition
Conférence ArODES

Michael Jungo, Atefeh Fakhari, Nathan Wegmann, Rolf Ingold, Andreas Fischer, Anna Scius-Bertrand

SOICT '23: Proceedings of the 12th International Symposium on Information and Communication Technology, 7-8 December 2023, Ho Chi Minh, Vietnam

Lien vers la conférence

Résumé:

Handwriting recognition is a key technology for accessing the content of old manuscripts, helping to preserve cultural heritage. Deep learning shows an impressive performance in solving this task. However, to achieve its full potential, it requires a large amount of labeled data, which is difficult to obtain for ancient languages and scripts. Often, a trade-off has to be made between ground truth quantity and quality, as is the case for the recently introduced Bullinger database. It contains an impressive amount of over a hundred thousand labeled text line images of mostly premodern German and Latin texts that were obtained by automatically aligning existing page-level transcriptions with text line images. However, the alignment process introduces systematic errors, such as wrongly hyphenated words. In this paper, we investigate the impact of such errors on training and evaluation and suggest means to detect and correct typical alignment errors.

Lognormality: an open window on neuromotor control
Conférence ArODES

Réjean Plamondon, Asma Bensalah, Karina Lebel, Romeo Salameh, Guillaume Séguin de Broin, Christian O'Reilly, Mickael Begon, Olivier Desbiens, Youssef Beloufa, Aymeric Guy, Daniel Berio, Frederic Fol Leymarie, Simon-Pierre Boyoguéno-Bidias, Andreas Fischer, Zigeng Zhang, Marie-France Morin, Denis Alamargot, Céline Rémi, Nadir Faci, Raphaëlle Fortin, Marie-Noëlle Simard, Caroline Bazinet

Proceedings of the 21st International Conference of the International Graphonomics Society, IGS 2023, 16-19 October 2023, Evora, Portugal ; Graphonomics in Human Body Movement. Bridging Research and Practice from Motor Control to Handwriting Analysis and Recognition

Lien vers la conférence

Résumé:

This invited special session of IGS 2023 presents the works carried out at Laboratoire Scribens and some of its collaborating laboratories. It summarises the 17 talks presented in the colloquium #611 entitled « La lognormalité: une fenêtre ouverte sur le contrôle neuromoteur» (Lognormality: a window opened on neuromotor control), at the 2023 conference of the Association Francophone pour le Savoir (ACFAS) on May 10, 2023. These talks covered a wide range of subjects related to the Kinematic Theory, including key elements of the theory, some gesture analysis algorithms that have emerged from it, and its application to various fields, particularly in biomedical engineering and human-machine interaction.

Towards visuo-structural handwriting evaluation based on graph matching
Conférence ArODES

Anna Scius-Bertrand, Céline Rémi, Emmanuel Biabiany, Jimmy Nagau, Andreas Fischer

Proceedings of the 21st International Conference of the International Graphonomics Society, IGS 2023, 16-19 October 2023, Evora, Portugal ; Graphonomics in Human Body Movement. Bridging Research and Practice from Motor Control to Handwriting Analysis and Recognition

Lien vers la conférence

Résumé:

Judging the quality of handwriting based on visuo-structural criteria is fundamental for teachers when accompanying children who are learning to write. Automatic methods for quality assessment can support teachers when dealing with a large number of handwritings, in order to identify children who are having difficulties. In this paper, we investigate the potential of graph-based handwriting representation and graph matching to capture visuo-structural features and determine the legibility of cursive handwriting. On a comprehensive dataset of words written by children aged from 3 to 11 years, we compare the judgment of human experts with a graph-based analysis, both with respect to classification and clustering. The results are promising and highlight the potential of graph-based methods for handwriting evaluation.

The Bullinger dataset: a writer adaptation challenge
Conférence ArODES

Anna Scius-Bertrand, Phillip Ströbel, Martin Volk, Tobias Hodel, Andreas Fischer

Document analysis and recognition ICDAR 2023 ; Proceedings of the 17th International Conference, 21-26 August 2023, San José, CA, USA

Lien vers la conférence

Résumé:

One of the main challenges of automatically transcribing large collections of handwritten letters is to cope with the high variability of writing styles present in the collection. In particular, the writing styles of non-frequent writers, who have contributed only few letters, are often missing in the annotated learning samples used for training handwriting recognition systems. In this paper, we introduce the Bullinger dataset for writer adaptation, which is based on the Heinrich Bullinger letter collection from the 16th century, using a subset of 3,622 annotated letters (about 1.2 million words) from 306 writers. We provide baseline results for handwriting recognition with modern recognizers, before and after the application of standard techniques for supervised adaptation of frequent writers and self-supervised adaptation of non-frequent writers.

Character queries: a transformer-based approach to on-line handwritten character segmentation
Conférence ArODES

Michael Jungo, Beat Wolf, Andrii Maksai, Claudiu Musat, Andreas Fischer

Document analysis and recognition ICDAR 2023 ; Proceedings of the 17th International Conference, 21-26 August 2023, San José, CA, USA

Lien vers la conférence

Résumé:

On-line handwritten character segmentation is often associated with handwriting recognition and even though recognition models include mechanisms to locate relevant positions during the recognition process, it is typically insufficient to produce a precise segmentation. Decoupling the segmentation from the recognition unlocks the potential to further utilize the result of the recognition. We specifically focus on the scenario where the transcription is known beforehand, in which case the character segmentation becomes an assignment problem between sampling points of the stylus trajectory and characters in the text. Inspired by the k-means clustering algorithm, we view it from the perspective of cluster assignment and present a Transformer-based architecture where each cluster is formed based on a learned character query in the Transformer decoder block. In order to assess the quality of our approach, we create character segmentation ground truths for two popular on-line handwriting datasets, IAM-OnDB and HANDS-VNOnDB, and evaluate multiple methods on them, demonstrating that our approach achieves the overall best results.

DIVA-DAF: a deep learning framework for historical document image analysis
Conférence ArODES

Anna Scius-Bertrand, Paul Maergner, Andreas Fischer, Rolf Ingold

Proceedings of the 7th International Workshop on Historical Document Imaging and Processing (HIP'23), 25-26 August 2023, San José, CA, USA

Lien vers la conférence

Résumé:

Deep learning methods have shown strong performance in solving tasks for historical document image analysis. However, despite current libraries and frameworks, programming an experiment or a set of experiments and executing them can be time-consuming. This is why we propose an open-source deep learning framework, DIVA-DAF, which is based on PyTorch Lightning and specifically designed for historical document analysis. Pre-implemented tasks such as segmentation and classification can be easily used or customized. It is also easy to create one's own tasks with the benefit of powerful modules for loading data, even large data sets, and different forms of ground truth. The applications conducted have demonstrated time savings for the programming of a document analysis task, as well as for different scenarios such as pre-training or changing the architecture. Thanks to its data module, the framework also makes it possible to reduce model training time significantly.

GammaFocus: an image augmentation method to focus model attention for classification
Conférence ArODES

Ana Leni Frei, Amjad Khan, Philipp Zens, Alessandro Lugli, Inti Zlobec, Andreas Fischer

Proceedings of Medical Imaging with Deep Learning (MIDL), 10-12 July 2023, Nashville, USA

Lien vers la conférence

Résumé:

In histopathology, histologic elements are not randomly located across an image but organize into structured patterns. In this regard, classification tasks or feature extraction from histology images may require context information to increase performance. In this work, we explore the importance of keeping context information for a cell classification task on Hematoxylin and Eosin (H&E) scanned whole slide images (WSI) in colorectal cancer. We show that to differentiate normal from malignant epithelial cells, the environment around the cell plays a critical role. We propose here an image augmentation based on gamma variations to guide deep learning models to focus on the object of interest while keeping context information. This augmentation method yielded more specific models and helped to increase the model performance (weighted F1 score with/without gamma augmentation respectively, PanNuke: 99.49 vs 99.37 and TCGA: 91.38 vs. 89.12, p < 0.05).

Local and global feature aggregation for accurate epithelial cell classification using graph attention mechanisms in histopathology images
Conférence ArODES

Ana Leni Frei, Amjad Khan, Linda Studer, Philipp Zens, Alessandro Lugli, Andreas Fischer, Inti Zlobec

Proceedings of the Medical Imaging with Deep Learning (MIDL), 10-12 July 2023, Nashville, USA

Lien vers la conférence

Résumé:

In digital pathology, cell-level tissue analyses are widely used to better understand tissue composition and structure. Publicly available datasets and models for cell detection and classification in colorectal cancer exist but lack the differentiation of normal and malignant epithelial cells, which is important to perform prior to any downstream cell-based analysis. This classification task is particularly difficult due to the high intra-class variability of neoplastic cells. To tackle this, we present here a new method that uses graph-based node classification to take advantage of both local cell features and global tissue architecture to perform accurate epithelial cell classification. The proposed method demonstrated excellent performance on F1 score (PanNuke: 1.0, TCGA: 0.98) and performed significantly better than conventional computer vision methods (PanNuke: 0.99, TCGA: 0.92).

Tumor budding T-cell graphs: assessing the need for resection in pT1 colorectal cancer patients
Conférence ArODES

Linda Studer, John-Melle Bokhorst, Iris Nagtegaal, Inti Zlobec, Heather Dawson, Andreas Fischer

Proceedings of Medical Imaging with Deep Learning (MIDL), 10-12 July 2023, Nashville, USA

Lien vers la conférence

Résumé:

Colon resection is often the treatment of choice for colorectal cancer (CRC) patients. However, especially for minimally invasive cancer, such as pT1, simply removing the polyps may be enough to stop cancer progression. Different histopathological risk factors such as tumor grade and invasion depth currently form the basis for the need for colon resection in pT1 CRC patients. Here, we investigate two additional risk factors, tumor budding and lymphocyte infiltration at the invasive front, which are known to be clinically relevant. We capture the spatial layout of tumor buds and T-cells and use graph-based deep learning to investigate them as potential risk predictors. Our pT1 Hotspot Tumor Budding T-cell Graph (pT1-HBTG) dataset consists of 626 tumor budding hotspots from 575 patients. We propose and compare three different graph structures, as well as combinations of the node labels. The best-performing Graph Neural Network architecture is able to increase specificity by 20% compared to the currently recommended risk stratification based on histopathological risk factors, without losing any sensitivity. We believe that using a graph-based analysis can assist pathologists in making risk assessments for pT1 CRC patients, and thus decrease the number of patients undergoing potentially unnecessary surgery. Both the code and dataset are made publicly available.

Bullingers Briefwechsel zugänglich machen: Stand der Handschriftenerkennung
Conférence ArODES

Philip Ströbel, Tobias Hodel, Andreas Fischer, Anna Scius-Bertrand, Beat Wolf, Anna Janka, Jonas Widmer, Patricia Scheurer, Martin Volk

Digital Humanities im deutschsprachigen Raum 2023 (DHd2023): Open Humanities, Open Culture, 13-17 March 2023, Trier, Germany and Belval, Luxembourg

Lien vers la conférence

2022

Generating synthetic styled Chu Nom characters
Conférence ArODES

Jonas Diesbach, Andreas Fischer, Marc Bui, Anna Scius-Bertrand

Proceedings of International Conference on Frontiers in Handwriting Recognition (ICFHR) 2022, 4-7 December 2022, Hyderabad, India

Lien vers la conférence

Résumé:

Images of historical Vietnamese steles allow historians to discover invaluable information regarding the past of the country, especially about the life of people in rural villages. Due to the sheer amount of available stone engravings and their diverseness, manual examination is difficult and time-consuming. Therefore, automatic document analysis methods based on machine learning could immensely facilitate this laborious work. However, creating ground truth for machine learning is also complex and time-consuming for human experts, which is why synthetic training samples greatly support learning while reducing human effort. In particular, they can be used to train deep neural networks for character detection and recognition. In this paper, we present a method for creating synthetic engravings and use it to create a new database composed of 26,901 synthetic Chu Nom characters in 21 different styles. Using a machine learning model for unpaired image-to-image translation, our approach is annotation-free, i.e. there is no need for human experts to label character images. A user study demonstrates that the synthetic engravings look realistic to the human eye.

Retrieving keywords in historical Vietnamese stele images without human annotations
Conférence ArODES

Anna Scius-Bertrand, Andreas Fischer, Marc Bui

SoICT 2022: The 11th International Symposium on Information and Communication Technology, 1-3 December 2022, Hanoi, Vietnam

Lien vers la conférence

Résumé:

Stone engravings on Vietnamese steles are an invaluable resource for historians to study the life of the villagers in the past. Thanks to pictures taken of stampings of the steles, they can be investigated today in the form of digital images. Automatic keyword spotting is a promising means to access the textual content of the images, making it possible to retrieve steles that contain a certain query term. In this paper, we present a complete pipeline for retrieving Chu Nom characters in Vietnamese steles that operates fully automatically on the original images, without the need for preprocessing, segmentation, or human annotation. It combines a self-calibration approach to character detection using deep convolutional neural networks with a graph-based approach to keyword spotting that compares templates of the search term with detected characters based on structural properties.

Improving handwriting recognition for historical documents using synthetic text lines
Conférence ArODES

Martin Spoto, Beat Wolf, Andreas Fischer, Anna Scius-Bertrand

Proceedings of the 20th International Conference of the International Graphonomics Society, IGS 2021, Intertwining Graphonomics with Human Movements, 7-9 June 2022, Las Palmas de Gran Canaria, Spain

Lien vers la conférence

Résumé:

Automatic handwriting recognition for historical documents is a key element for making our cultural heritage available to researchers and the general public. However, current approaches based on machine learning require a considerable amount of annotated learning samples to read ancient scripts and languages. Producing such ground truth is a laborious and time-consuming task that often requires human experts. In this paper, to cope with a limited amount of learning samples, we explore the impact of using synthetic text line images to support the training of handwriting recognition systems. For generating text lines, we consider lineGen, a recent GAN-based approach, and for handwriting recognition, we consider HTR-Flor, a state-of-the-art recognition system. Different meta-learning strategies are explored that schedule the addition of synthetic text line images to the existing real samples. In an experimental evaluation on the well-known Bentham dataset as well as the newly introduced Bullinger dataset, we demonstrate a significant improvement of the recognition performance when combining real and synthetic samples.

The RPM3D project: 3D kinematics for remote patient monitoring
Conférence ArODES

Alicia Fornés, Asma Bensalah, Cristina Carmona-Duarte, Jialuo Chen, Miguel A. Ferrer, Andreas Fischer, Josep Lladós, Cristina Martin, Eloy Opisso, Réjean Plamondon, Anna Scius-Bertrand, Josep Maria Tormos

Intertwining graphonomics with human movements ; Proceedings of the 20th International Conference of the International Graphonomics Society, IGS 2021, 7-9 June 2022, Las Palmas de Gran Canaria, Spain

Lien vers la conférence

Résumé:

This project explores the feasibility of remote patient monitoring based on the analysis of 3D movements captured with smartwatches. We base our analysis on the Kinematic Theory of Rapid Human Movement. We have validated our research in a real case scenario for stroke rehabilitation at the Guttmann Institute (neurorehabilitation hospital), showing promising results. Our work could have a great impact in remote healthcare applications, improving the medical efficiency and reducing the healthcare costs. Future steps include more clinical validation, developing multi-modal analysis architectures (analysing data from sensors, images, audio, etc.), and exploring the application of our technology to monitor other neurodegenerative diseases.

Annotation-free keyword spotting in historical Vietnamese manuscripts using graph matching
Conférence ArODES

Anna Scius-Bertrand, Linda Studer, Andreas Fischer, Marc Bui

Proceedings of the S+SSPR 2022. IAPR Joint International Workshops on Statistical Techniques in Pattern Recognition (SPR 2022) and Structural and Syntactic Pattern Recognition (SSPR 2022), 26-27 August 2022, Montreal, Canada

Lien vers la conférence

Résumé:

Finding key terms in scanned historical manuscripts is invaluable for accessing our written cultural heritage. While keyword spotting (KWS) approaches based on machine learning achieve the best spotting results in the current state of the art, they are limited by the fact that annotated learning samples are needed to infer the writing style of a particular manuscript collection. In this paper, we propose an annotation-free KWS method that does not require any labeled handwriting sample but learns from a printed font instead. First, we train a deep convolutional character detection system on synthetic pages using printed characters. Afterwards, the structure of the detected characters is modeled by means of graphs and is compared with search terms using graph matching. We evaluate our method for spotting logographic Chu Nom characters on the newly introduced Kieu database, a historical Vietnamese manuscript containing 719 scanned pages of the famous Tale of Kieu. Our results show that search terms can be found with promising precision both when providing handwritten samples (query by example) and when providing printed characters (query by string).

Analyzing Büchi automata with graph neural networks
Conférence ArODES

Christophe Stammet, Prisca Dotti, Ulrich Ultes-Nitsche, Andreas Fischer

Proceedings of the 4th International Workshop on Learning and Automata (LearnAut 2022), 4 July 2022, Paris, France

Lien vers la conférence

Résumé:

Büchi automata on infinite words present many interesting problems and are used frequently in program verification and model checking. Many of these problems on Büchi automata are computationally hard, raising the question of whether a learning-based, data-driven analysis might be more efficient than using traditional algorithms. Since Büchi automata can be represented by graphs, graph neural networks are a natural choice for such a learning-based analysis. In this paper, we demonstrate how graph neural networks can be used to reliably predict basic properties of Büchi automata when trained on automatically generated random automata datasets.

Budding-T-cell score is a potential predictor for more aggressive treatment in pT1 colorectal cancers
Conférence ArODES

Linda Studer, John-Melle Bokhorst, Francesco Ciompi, Andreas Fischer, Heather Dawson

Proceedings of the ECDP 2022 18th European Congress on Digital Pathology, 15-18 June 2022, Berlin, Germany

Lien vers la conférence

2021

Citizen participation & digital tools to improve pedestrian mobility in cities
Conférence ArODES

Olivier Ertz, Andreas Fischer, Hatem Ghorbel, Olivier Hüsser, Romain Sandoz, Anna Scius-Bertrand

The International Archives of the Photogrammetry, Remote Sensing and Spatial Information Sciences ; Proceedings of the 6th International Conference on Smart Data and Smart Cities

Lien vers la conférence

Résumé:

In this work, we present a framework, supported by mobile and web apps, that proposes personalized pedestrian routes matching a user's mobility profile by taking mobility impediment factors into account. We explain how the latter have been defined using a pedestrian-centric approach based on travel experiences as perceived in the field by senior citizens. Through workshops, six main factors that may influence pedestrian route choices were revealed: passability, obstacle in path, surface problem, security, sidewalk width, slope. These categories were used to build digital tools and guide a citizen participatory approach to collect geolocated points of obstacle documented with walkability information (picture, category, impact score, free comment). We also involved citizens to evaluate this information, and especially senior referents for validation. Finally, we present how we connect these points of obstacle with a pedestrian network based on OpenStreetMap to configure a routing cost function. The framework was partially deployed in 2020 with a limited number of people due to the pandemic. Nonetheless, we share lessons learned from interaction with citizens in the design of such a framework whose underlying workflow is reproducible. We plan to further assess its relevance and sustainability in the future.

Annotation-free character detection in historical Vietnamese stele images
Conférence ArODES

Anna Scius-Bertrand, Michael Jungo, Beat Wolf, Andreas Fischer, Marc Bui

Proceedings of the International Conference on Document Analysis and Recognition (ICDAR 2021), 5-10 September 2021, Lausanne, Switzerland

Lien vers la conférence

Résumé:

Images of historical Vietnamese stone engravings provide historians with a unique opportunity to study the past of the country. However, due to the large heterogeneity of thousands of images regarding both the text foreground and the stone background, it is difficult to use automatic document analysis methods for supporting manual examination, especially with a view to the labeling effort needed for training machine learning systems. In this paper, we present a method for finding the location of Chu Nom characters in the main text of the steles without the need for any human annotation. Using self-calibration, fully convolutional object detection methods trained on printed characters are successfully adapted to the handwritten image collection. The achieved detection results are promising for subsequent document analysis tasks, such as keyword spotting or transcription.

Graph convolutional neural networks for learning attribute representations for word spotting
Conférence ArODES

Andreas Fischer, Gernot A. Fink

Proceedings of International Conference on Document Analysis and Recognition (ICDAR 2021), 5-10 September 2021, Lausanne, Switzerland

Lien vers la conférence

Résumé:

Graphs are an intuitive and natural way of representing handwriting. Due to their high representational power, they have shown high performance in different learning-free document analysis tasks. While machine learning is rather unexplored for graph representations, geometric deep learning offers a novel framework that allows for convolutional neural networks similar to the image domain. In this work, we show that the concept of attribute prediction can be adapted to the graph domain. We propose a graph neural network to map handwritten word graphs to a symbolic attribute space. This mapping makes it possible to perform query-by-example word spotting, as was also tackled by other learning-free approaches in the graph domain. Furthermore, our model is capable of query-by-string, which is out of scope for other graph-based methods in the literature. We investigate two variants of graph convolutional layers and show that learning improves performance considerably on two popular graph-based word spotting benchmarks.

Self-rule to adapt: learning generalized features from sparsely-labeled data using unsupervised domain adaptation for colorectal cancer tissue phenotyping
Conférence ArODES

Christian Abbet, Linda Studer, Andreas Fischer, Heather Dawson, Inti Zlobec, Behzad Bozorgtabar, Jean-Philippe Thiran

Proceedings of the Medical Imaging with Deep Learning (MIDL 2021), 7 - 9 July 2021, Lübeck, Germany

Lien vers la conférence

Résumé:

Supervised learning is conditioned by the availability of labeled data, which are especially expensive to acquire in the field of medical image analysis. Making use of open-source data for pre-training or using domain adaptation can be a way to overcome this issue. However, pre-trained networks often fail to generalize to new test domains that are not distributed identically due to variations in tissue stainings, types, and textures. Additionally, current domain adaptation methods mainly rely on fully-labeled source datasets. In this work, we propose Self-Rule to Adapt (SRA), which takes advantage of self-supervised learning to perform domain adaptation and removes the burden of fully-labeled source datasets. SRA can effectively transfer the discriminative knowledge obtained from a sparsely labeled source domain to a new target domain without requiring additional tissue annotations. Our method harnesses both domains' structures by capturing visual similarity with intra-domain and cross-domain self-supervision. We show that our proposed method outperforms baselines across diverse domain adaptation settings and further validate our approach on our in-house clinical cohort.

Classification of intestinal gland cell-graphs using graph neural networks
Conférence ArODES

Linda Studer, Jannis Wallau, Heather Dawson, Inti Zlobec, Andreas Fischer

Proceedings of the 25th International Conference on Pattern Recognition (ICPR), 10-15 January 2021, Milan, Italy

Lien vers la conférence

Résumé:

We propose to classify intestinal glands as normal or dysplastic using cell-graphs and graph-based deep learning methods. Dysplastic intestinal glands can lead to colorectal cancer, which is one of the three most common cancer types in the world. In order to assess the cancer stage and thus the treatment of a patient, pathologists analyse tissue samples of affected patients. Among other factors, they look at the changes in morphology of different tissues, such as the intestinal glands. Cell-graphs have a high representational power and can describe topological and geometrical properties of intestinal glands. However, classical graph-based methods have a high computational complexity and there is only a limited range of machine learning methods available. In this paper, we propose Graph Neural Networks (GNNs) as an efficient learning-based approach to classify cell-graphs. We investigate different variants of so-called Message Passing Neural Networks and compare them with a classical graph-based approach based on approximated Graph Edit Distance and k-nearest neighbours classifier. A promising classification accuracy of 94.8% is achieved by the proposed method on the pT1 Gland Graph dataset, which is an increase of 11.5% over the baseline result.

2020

Effects of graph pooling layers on classification with graph neural networks
Conférence ArODES

Linda Studer, Jannis Wallau, Rolf Ingold, Andreas Fischer

Proceedings of the 7th Swiss Conference on Data Science (SDS), 26 June 2020, Luzern, Switzerland

Lien vers la conférence

Résumé:

With the rise of graph neural networks, sometimes also referred to as geometric deep learning, a range of new types of network layers have been introduced. Since this is a very recent development, the design of new architectures relies a lot on intuition and trial-and-error. In this paper, we evaluate the effect of adding graph pooling layers to a network, which down-sample graphs, and evaluate the performance on three different datasets. We find that especially for smaller graphs, adding pooling layers should be done with caution, as they can have a negative effect on the overall performance.

Automatic creation of text corpora for low-resource languages from the internet: the case of Swiss German
Conférence ArODES

Lucy Linder, Michael Jungo, Jean Hennebert, Claudiu Musat, Andreas Fischer

Proceedings of the 12th Conference on Language Resources and Evaluation (LREC 2020), 11-16 May 2020, Marseille, France

Lien vers la conférence

Résumé:

This paper presents SwissCrawl, the largest Swiss German text corpus to date. Composed of more than half a million sentences, it was generated using a customized web scraping tool that could be applied to other low-resource languages as well. The approach demonstrates how freely available web pages can be used to construct comprehensive text corpora, which are of fundamental importance for natural language processing. In an experimental evaluation, we show that using the new corpus leads to significant improvements for the task of language modeling. To capture new content, our approach will run continuously to keep increasing the corpus over time.

2018

Offline signature verification by combining graph edit distance and triplet networks
Conférence ArODES

Paul Maergner, Vinaychandran Pondenkandath, Michele Alberti, Marcus Liwicki, Kaspar Riesen, Rolf Ingold, Andreas Fischer

Proceedings of Joint IAPR International Workshop, S+SSPR 2018, Beijing, China, 17-19 August 2018

Lien vers la conférence

Résumé:

Biometric authentication by means of handwritten signatures is a challenging pattern recognition task, which aims to infer a writer model from only a handful of genuine signatures. In order to make it more difficult for a forger to attack the verification system, a promising strategy is to combine different writer models. In this work, we propose to complement a recent structural approach to offline signature verification based on graph edit distance with a statistical approach based on metric learning with deep neural networks. On the MCYT and GPDS benchmark datasets, we demonstrate that combining the structural and statistical models leads to significant improvements in performance, profiting from their complementary properties.

Learning graph distances with message passing neural networks
Conférence ArODES

Pau Riba, Andreas Fischer, Josep Lladós, Alicia Fornés

ICPR 2018, the 24th International Conference on Pattern Recognition, 20-24 August 2018, Beijing, China

Lien vers la conférence

Résumé:

Graph representations have been widely used in pattern recognition thanks to their powerful representation formalism and rich theoretical background. A number of error-tolerant graph matching algorithms such as graph edit distance have been proposed for computing a distance between two labelled graphs. However, they typically suffer from a high computational complexity, which makes it difficult to apply these matching algorithms in a real scenario. In this paper, we propose an efficient graph distance based on the emerging field of geometric deep learning. Our method employs a message passing neural network to capture the graph structure and learns a metric with a siamese network approach. The performance of the proposed graph distance is validated in two application cases, graph classification and graph retrieval of handwritten words, and shows a promising performance when compared with (approximate) graph edit distance benchmarks.

Offline signature verification via structural methods: graph edit distance and inkball models
Conference ArODES

Paul Maergner, Nicholas R. Howe, Kaspar Riesen, Rolf Ingold, Andreas Fischer

Proceedings of ICFHR 2018, the 16th International Conference on Frontiers in Handwriting Recognition, 5-8 August 2018, Niagara Falls, USA

Abstract:

For handwritten signature verification, signature images are typically represented with fixed-sized feature vectors capturing local and global properties of the handwriting. Graph-based representations offer a promising alternative, as they are flexible in size and model the global structure of the handwriting. However, they are only rarely used for signature verification, which may be due to the high computational complexity involved when matching two graphs. In this paper, we take a closer look at two recently presented structural methods for handwriting analysis, for which efficient matching methods are available: keypoint graphs with approximate graph edit distance and inkball models. Inkball models, in particular, have never been used for signature verification before. We investigate both approaches individually and propose a combined verification system, which demonstrates an excellent performance on the MCYT and GPDS benchmark data sets when compared with the state of the art.

Seamless GPU evaluation of smart expression templates
Conference ArODES

Baptiste Wicht, Andreas Fischer, Jean Hennebert

Proceedings of the 2018 International Conference on High Performance Computing & Simulation (HPCS 2018), The 16th Annual Meeting, 16-20 July 2018, Orléans, France

Abstract:

Expression Templates is a technique that allows linear algebra code to be written in C++ the same way it would be written on paper. It is also used extensively as a performance optimization technique, especially in the form of Smart Expression Templates, which allow for even higher performance. It has proved to be very efficient for computation on a Central Processing Unit (CPU). However, due to its design, it is not easily implemented on a Graphics Processing Unit (GPU). In this paper, we devise a set of techniques to allow the seamless evaluation of Smart Expression Templates on the GPU. The execution is transparent for the user of the library, who still uses matrices and vectors as if they were on the CPU and profits from the performance and higher multi-processing capabilities of the GPU. We also show that the GPU version is significantly faster than the CPU version, without any change to the code of the user.

Extending the Sigma-Lognormal model of the kinematic theory to three dimensions
Conference ArODES

Roman Schindler, Manuel Bouillon, Réjean Plamondon, Andreas Fischer

Proceedings of ICPRAI 2018 - International Conference on Pattern Recognition and Artificial Intelligence, Celebrating the 30th Anniversary of CENPARMI, 14-17 May 2018 + Public Lecture on 13 May 2018, Concordia University, Montréal, Canada

Abstract:

The Kinematic Theory of rapid human movements and its Sigma-Lognormal model make it possible to model human gestures, in particular complex handwriting patterns such as words, signatures and free gestures. This paper investigates the extension of the theory and its Sigma-Lognormal model from two dimensions to three, taking into account new acquisition modalities (motion capture), multiple subjects, and unconstrained motions. Despite the increased complexity and the new acquisition modalities, we demonstrate that the Sigma-Lognormal model can be successfully generalized to describe 3D human movements. Starting from the 2D model, we replace circular with spherical motions to derive a representation of unconstrained human movements with a new 3D Sigma-Lognormal model. First experiments show a high reconstruction quality with an average signal-to-noise ratio (SNR) of 18.52 dB on the HDM05 dataset. Gesture recognition using dynamic time warping (DTW) achieves similar recognition accuracies when using original and reconstructed gestures, which confirms the high quality of the proposed model.

Graph-based keyword spotting in historical documents using context-aware Hausdorff edit distance
Conference ArODES

Michael Stauffer, Andreas Fischer, Kaspar Riesen

Proceedings of DAS 2018 : 13th IAPR International Workshop on Document Analysis Systems, 24-27 April 2018, Vienna, Austria

Abstract:

Scanned handwritten historical documents are often not well accessible due to the limited feasibility of automatic full transcriptions. Thus, Keyword Spotting (KWS) has been proposed as an alternative to retrieve arbitrary query words from documents of this kind. In the present paper, word images are represented by means of graphs. That is, a graph is used to represent the inherent topological characteristics of handwriting. The actual keyword spotting is then based on matching a query graph with all document graphs. In particular, we make use of a fast graph matching algorithm that considers the contextual substructure of nodes. The motivation for this inclusion of node context is to increase the overall KWS accuracy. In an experimental evaluation on four historical documents, we show that the proposed procedure clearly outperforms diverse other template-based reference systems. Moreover, our novel framework keeps up with or even outperforms many state-of-the-art learning-based KWS approaches.

2017

A structural approach to offline signature verification using graph edit distance
Conference ArODES

Paul Maergner, Kaspar Riesen, Rolf Ingold, Andreas Fischer

Proceedings of 2017 14th IAPR International Conference on Document Analysis and Recognition, 9-15 November 2017, Kyoto, Japan

Abstract:

Graphs provide a powerful representation formalism for handwritten signatures, capturing local properties as well as their relations. Yet, although introduced early for signature verification, only a few current systems rely on graph-based representations. A possible reason is the high computational complexity involved for matching two general graphs. In this paper, we introduce a novel structural approach to offline signature verification using an efficient cubic-time approximation of graph edit distance. We put forward several ways of creating, normalizing, and comparing signature graphs built from keypoints and investigate their performance on three benchmark datasets. The experiments demonstrate a promising performance of the proposed structural approach when compared with the state of the art.

Ensembles for graph-based keyword spotting in historical handwritten documents
Conference ArODES

Michael Stauffer, Andreas Fischer, Kaspar Riesen

Proceedings of the 2017 14th IAPR International Conference on Document Analysis and Recognition (ICDAR), 9-15 November 2017, Kyoto, Japan

Abstract:

Keyword Spotting (KWS) offers a convenient way to improve the accessibility to historical handwritten documents by retrieving search terms in scanned document images. The approach for KWS proposed in the present paper is based on segmented word images that are represented by means of different types of graphs. The actual keyword spotting is based on matching a query graph with a set of document graphs using the concept of graph edit distance. In particular, we propose to employ ensemble methods for KWS with graphs. That is, a query graph is not matched against one but several different graphs representing the same document word. Eventually, we use different strategies to combine these individual graph dissimilarities. In an experimental evaluation on two benchmark datasets, the proposed ensemble methods outperform the individual ensemble members as well as four state-of-the-art reference systems based on dynamic time warping.

Speeding-up graph-based keyword spotting in historical handwritten documents
Conference ArODES

Michael Stauffer, Andreas Fischer, Kaspar Riesen

Lecture Notes in Computer Science ; Proceedings of International Workshop on Graph-based Representations in Pattern Recognition (GbRPR 2017), 16-18 May 2017, Anacapri, Italy

Abstract:

The present paper is concerned with a graph-based system for Keyword Spotting (KWS) in historical documents. This particular system operates on segmented words that are in turn represented as graphs. The basic KWS process employs the cubic-time bipartite matching algorithm (BP). Yet, even though this graph matching procedure is relatively efficient, the computation time is a limiting factor for processing large volumes of historical manuscripts. In order to speed up our framework, we propose a novel fast rejection heuristic. This heuristic compares the node distribution of the query graph and the document graph in a polar coordinate system. This comparison can be accomplished in linear time. If the node distributions are similar enough, the BP matching is actually carried out (otherwise the document graph is rejected). In an experimental evaluation on two benchmark datasets we show that about 50% or more of the matchings can be omitted with this procedure while the KWS accuracy is not negatively affected.
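The rejection idea described in this abstract can be sketched roughly as follows. This is only an illustrative Python sketch, not the authors' implementation: the centroid-based polar histogram, the bin counts, the L1 comparison, the threshold value, and the function names `polar_histogram` and `should_reject` are all assumptions made for the example.

```python
import math


def polar_histogram(nodes, n_sectors=4, n_rings=2):
    """Normalized histogram of node positions in polar coordinates
    around the centroid (bin layout is an assumption of this sketch)."""
    cx = sum(x for x, _ in nodes) / len(nodes)
    cy = sum(y for _, y in nodes) / len(nodes)
    radii = [math.hypot(x - cx, y - cy) for x, y in nodes]
    max_r = max(radii) or 1.0  # avoid division by zero for degenerate graphs
    hist = [0.0] * (n_sectors * n_rings)
    for (x, y), r in zip(nodes, radii):
        angle = math.atan2(y - cy, x - cx) % (2 * math.pi)
        sector = min(int(angle / (2 * math.pi) * n_sectors), n_sectors - 1)
        ring = min(int(r / max_r * n_rings), n_rings - 1)
        hist[ring * n_sectors + sector] += 1.0 / len(nodes)
    return hist


def should_reject(query_nodes, doc_nodes, threshold=0.5):
    """Return True if the node distributions differ too much, so that the
    expensive graph matching can be skipped. Runs in linear time."""
    hq = polar_histogram(query_nodes)
    hd = polar_histogram(doc_nodes)
    return sum(abs(a - b) for a, b in zip(hq, hd)) > threshold
```

In such a setup, the cubic-time matching would only be run for document graphs where `should_reject` returns `False`; a translated copy of the query, for instance, yields the same histogram and is kept.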

A survey on applications of bipartite graph edit distance
Conference ArODES

Michael Stauffer, Thomas Tschachtli, Andreas Fischer, Kaspar Riesen

Lecture Notes in Computer Science ; Proceedings of International Workshop on Graph-based Representations in Pattern Recognition (GbRPR 2017), 16-18 May 2017, Anacapri, Italy

Abstract:

About ten years ago, a novel graph edit distance framework based on bipartite graph matching was introduced. This particular framework allows the approximation of graph edit distance in cubic time. This, in turn, makes the concept of graph edit distance also applicable to larger graphs. In the last decade the corresponding paper has been cited more than 360 times. Besides various extensions from the methodological point of view, we also observe a great variety of applications that make use of the bipartite graph matching framework. The present paper aims at giving a first survey on these applications stemming from six different categories (which range from document analysis, over biometrics to malware detection).

Improved graph edit distance approximation with simulated annealing
Conference ArODES

Kaspar Riesen, Andreas Fischer, Horst Bunke

Proceedings of the International Workshop on Graph-based Representations in Pattern Recognition (GbRPR 2017), 16-18 May 2017, Anacapri, Italy ; Lecture Notes in Computer Science

Abstract:

The present paper is concerned with graph edit distance, which is widely accepted as one of the most flexible graph dissimilarity measures available. A recent algorithmic framework for approximating the graph edit distance overcomes the major drawback of this distance model, viz. its exponential time complexity. Yet, this particular approximation suffers from an overestimation of the true edit distance in general. The overall aim of the present paper is to improve the distance quality of this approximation by means of a post-processing search procedure. The employed search procedure is based on the idea of simulated annealing, which turns out to be particularly suitable for complex optimization problems. In an experimental evaluation on several graph data sets the benefit of this extension is empirically confirmed.

Inkball models as features for handwriting recognition
Conference ArODES

Nicholas R. Howe, Andreas Fischer, Baptiste Wicht

Proceedings of the 2016 15th International Conference on Frontiers in Handwriting Recognition (ICFHR), 23-26 October 2016, Shenzhen, China

Abstract:

Inkball models provide a tool for matching and comparison of spatially structured markings such as handwritten characters and words. Hidden Markov models offer a framework for decoding a stream of text in terms of the most likely sequence of causal states. Prior work with HMM has relied on observation of features that are correlated with underlying characters, without modeling them directly. This paper proposes to use the results of inkball-based character matching as a feature set input directly to the HMM. Experiments indicate that this technique outperforms other tested methods at handwritten word recognition on a common benchmark when applied without normalization or text deslanting.

2016

Deep learning features for handwritten keyword spotting
Conference ArODES

Baptiste Wicht, Andreas Fischer, Jean Hennebert

Proceedings of the 2016 23rd International Conference on Pattern Recognition (ICPR), 4-8 December 2016, Cancun, Mexico

Abstract:

Deep learning had a significant impact on diverse pattern recognition tasks in the recent past. In this paper, we investigate its potential for keyword spotting in handwritten documents by designing a novel feature extraction system based on Convolutional Deep Belief Networks. Sliding window features are learned from word images in an unsupervised manner. The proposed features are evaluated both for template-based word spotting with Dynamic Time Warping and for learning-based word spotting with Hidden Markov Models. In an experimental evaluation on three benchmark data sets with historical and modern handwriting, it is shown that the proposed learned features outperform three standard sets of handcrafted features.

Graph-based keyword spotting in historical handwritten documents
Conference ArODES

Michael Stauffer, Kaspar Riesen, Andreas Fischer

Proceedings of Joint IAPR International Workshops on Statistical Techniques in Pattern Recognition (SPR) and Structural and Syntactic Pattern Recognition (SSPR), 29 November-2 December 2016, Mérida, Mexico

Abstract:

The amount of handwritten documents that is digitally available is rapidly increasing. However, we observe a certain lack of accessibility to these documents especially with respect to searching and browsing. This paper aims at closing this gap by means of a novel method for keyword spotting in ancient handwritten documents. The proposed system relies on a keypoint-based graph representation for individual words. Keypoints are characteristic points in a word image that are represented by nodes, while edges are employed to represent strokes between two keypoints. The basic task of keyword spotting is then conducted by a recent approximation algorithm for graph edit distance. The novel framework for graph-based keyword spotting is tested on the George Washington dataset on which a state-of-the-art reference system is clearly outperformed.

A novel graph database for handwritten word images
Conference ArODES

Michael Stauffer, Andreas Fischer, Kaspar Riesen

Proceedings of Joint IAPR International Workshops on Statistical Techniques in Pattern Recognition (SPR) and Structural and Syntactic Pattern Recognition (SSPR), 29 November-2 December 2016, Mérida, Mexico

Abstract:

For several decades, graphs have acted as a powerful and flexible representation formalism in pattern recognition and related fields. For instance, graphs have been employed for specific tasks in image and video analysis, bioinformatics, or network analysis. Yet, graphs are only rarely used when it comes to handwriting recognition. One possible reason for this observation might be the increased complexity of many algorithmic procedures that take graphs, rather than feature vectors, as their input. However, with the rise of efficient graph kernels and fast approximative graph matching algorithms, graph-based handwriting representation could become a versatile alternative to traditional methods. This paper aims at making a seminal step towards promoting graphs in the field of handwriting recognition. In particular, we introduce a set of six different graph formalisms that can be employed to represent handwritten word images. The different graph representations for words are analysed in a classification experiment (using a distance-based classifier). The results of this word classifier provide a benchmark for further investigations.

Nautilus: real-time interaction between dancers and augmented reality with pixel-cloud avatars
Conference ArODES

Andreas Fischer, Pascal Buchs, Maurizio Caon, Omar Abou Khaled, Elena Mugellini, Sara Grimm, Franziska Meyer, Claudia Wagner, Valentine Bernasconi, Angelika Garz

Proceedings of the 28th Conférence francophone sur l’Interaction Homme-Machine (IHM'16), 25-28 October 2016, Fribourg, Switzerland

Abstract:

Real-time interaction with augmented reality represents a new material that dancers and choreographers can work with in their performances. It allows dancers to go beyond mere synchronization between music and movement and brings new opportunities, such as modifying the audio-visual environment and reacting to its changes. In this article, we present the process and the result of a collaborative work between art and technology, which made it possible to explore this new material in the context of the performance Nautilus. We suggest an approach based on body tracking with a 3D camera and on avatars composed of pixel clouds; this approach allows dancers to interact reliably with augmented reality while retaining freedom of movement.

Approximation of graph edit distance by means of a utility matrix
Conference ArODES

Kaspar Riesen, Andreas Fischer, Horst Bunke

Proceedings of the 7th IAPR TC3 Workshop, Artificial Neural Networks in Pattern Recognition (ANNPR) 2016, 28-30 September 2016, Ulm, Germany

Abstract:

Graph edit distance is one of the most popular graph matching paradigms available. By means of a reformulation of graph edit distance to an instance of a linear sum assignment problem, the major drawback of this dissimilarity model, viz. the exponential time complexity, has been invalidated recently. Yet, the substantial decrease of the computation time is at the expense of an approximation error. The present paper introduces a novel transformation that processes the underlying cost model into a utility model. The benefit of this transformation is that it enables the integration of additional information in the assignment process. We empirically confirm the positive effects of this transformation on three standard graph data sets. That is, we show that the accuracy of a distance based classifier can be improved with the proposed transformation while the run time remains nearly unaffected.

Keyword spotting with convolutional deep belief networks and dynamic time warping
Conference ArODES

Baptiste Wicht, Andreas Fischer, Jean Hennebert

Proceedings of the 25th International Conference on Artificial Neural Networks and Machine Learning (ICANN), 6-9 September 2016, Barcelona, Spain

Abstract:

To spot keywords on handwritten documents, we present a hybrid keyword spotting system, based on features extracted with Convolutional Deep Belief Networks and using Dynamic Time Warping for word scoring. Features are learned from word images, in an unsupervised manner, using a sliding window to extract horizontal patches. For two single writer historical data sets, it is shown that the proposed learned feature extractor outperforms two standard sets of features.

On CPU performance optimization of restricted Boltzmann machine and convolutional RBM
Conference ArODES

Baptiste Wicht, Andreas Fischer, Jean Hennebert

Proceedings of the 7th IAPR TC3 Workshop, Artificial Neural Networks in Pattern Recognition (ANNPR) 2016, 28-30 September 2016, Ulm, Germany

Abstract:

Although Graphics Processing Units (GPUs) seem to currently be the best platform to train machine learning models, most research laboratories are still only equipped with standard CPU systems. In this paper, we investigate multiple techniques to speed up the training of Restricted Boltzmann Machine (RBM) models and Convolutional RBM (CRBM) models on CPU with the Contrastive Divergence (CD) algorithm. Experimentally, we show that the proposed techniques can reduce the training time by up to 30 times for RBM and up to 12 times for CRBM, on a data set of handwritten digits.

Creating ground truth for historical manuscripts with document graphs and scribbling interaction
Conference ArODES

Angelika Garz, Mathias Seuret, Fotini Simistira, Andreas Fischer, Rolf Ingold

Proceedings of 12th IAPR Workshop on Document Analysis Systems (DAS), 11-14 April 2016, Santorini, Greece

Abstract:

Ground truth is both indispensable for training and evaluating document analysis methods and very tedious to create manually. This especially holds true for complex historical manuscripts that exhibit challenging layouts with interfering and overlapping handwriting. In this paper, we propose a novel semi-automatic system to support layout annotations in such a scenario based on document graphs and a pen-based scribbling interaction. On the one hand, document graphs provide a sparse page representation that is already close to the desired ground truth; on the other hand, scribbling facilitates an efficient and convenient pen-based interaction with the graph. The performance of the system is demonstrated in the context of a newly introduced database of historical manuscripts with complex layouts.

2015

Robust score normalization for DTW-based on-line signature verification
Conference ArODES

Andreas Fischer, Moises Diaz, Réjean Plamondon, Miguel A. Ferrer

Proceedings of the 2015 13th International Conference on Document Analysis and Recognition (ICDAR), 23-26 August 2015, Tunis, Tunisia

Abstract:

In the field of automatic signature verification, a major challenge for statistical analysis and pattern recognition is the small number of reference signatures per user. Score normalization, in particular, is challenged by the lack of information about intra-user variability. In this paper, we analyze several approaches to score normalization for dynamic time warping and propose a new two-stage normalization which detects simple forgeries in a first stage and copes with more skilled forgeries in a second stage. An experimental evaluation is conducted on two data sets with different characteristics, namely the MCYT online signature corpus, which contains over three hundred users, and the SUSIG visual sub-corpus, which contains highly skilled forgeries. The results demonstrate that score normalization is a key component for signature verification and that the proposed two-stage normalization achieves some of the best results on these difficult data sets both for random and for skilled forgeries.

Towards an automatic on-line signature verifier using only one reference per signer
Conference ArODES

Andreas Fischer, Moises Diaz, Réjean Plamondon, Miguel A. Ferrer

Proceedings of 2015 13th International Conference on Document Analysis and Recognition (ICDAR), 23-26 August 2015, Tunis, Tunisia

Abstract:

What can be done with only one enrolled real hand-written signature in Automatic Signature Verification (ASV)? Using 5 or 10 signatures for training is the most common case to evaluate ASV. In the scarcely addressed case of only one available signature for training, we propose to use modified duplicates. Our novel technique relies on a fully neuromuscular representation of the signatures based on the Kinematic Theory of rapid human movements and its Sigma-Lognormal model. This way, a real on-line signature is converted into the Sigma-Lognormal model domain. The model parameters are then varied to generate new duplicated signatures.
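For orientation, the speed profile of a single lognormal stroke, the building block of the Sigma-Lognormal model mentioned in this abstract, can be sketched in Python as below. The parameter values used in the note that follows and the helper name `stroke_length` are illustrative assumptions; the sketch covers only the scalar speed of one stroke and not the full signature model or the duplication procedure of the paper.

```python
import math


def lognormal_speed(t, D, t0, mu, sigma):
    """Speed of one stroke at time t: a lognormal in (t - t0) scaled by D.

    D is the stroke amplitude, t0 the onset time, and mu and sigma the
    log-time delay and log-response time of the stroke."""
    if t <= t0:
        return 0.0
    x = t - t0
    return D / (sigma * x * math.sqrt(2 * math.pi)) * math.exp(
        -((math.log(x) - mu) ** 2) / (2 * sigma ** 2)
    )


def stroke_length(D, t0, mu, sigma, t_end=5.0, steps=20000):
    """Integrate the speed over time with the trapezoidal rule.

    For a sufficiently long time window the result approaches D."""
    dt = (t_end - t0) / steps
    total = 0.0
    prev = lognormal_speed(t0, D, t0, mu, sigma)
    for k in range(1, steps + 1):
        cur = lognormal_speed(t0 + k * dt, D, t0, mu, sigma)
        total += 0.5 * (prev + cur) * dt
        prev = cur
    return total
```

With hypothetical values such as `D=10.0, t0=0.1, mu=-1.5, sigma=0.3`, the integrated length comes out very close to `D`. Slightly varying `D`, `t0`, `mu`, or `sigma` changes the shape of the profile, which is the kind of parameter variation the abstract refers to.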

Selecting Autoencoder Features for Layout Analysis of Historical Documents
Conference ArODES

Hao Wei, Mathias Seuret, Kai Chen, Andreas Fischer, Marcus Liwicki, Rolf Ingold

Proceedings of HIP '15: Proceedings of the 3rd International Workshop on Historical Document Imaging and Processing, 22 August 2015, Nancy, France

Abstract:

Automatic layout analysis of historical documents has to cope with a large number of different scripts, writing supports, and digitalization qualities. Under these conditions, the design of robust features for machine learning is a highly challenging task. We use convolutional autoencoders to learn features from the images. In order to increase the classification accuracy and to reduce the feature dimension, in this paper we propose a novel feature selection method. The method cascades adapted versions of two conventional methods. Compared to three conventional methods and our previous work, the proposed method achieves a higher classification accuracy in most cases, while maintaining low feature dimension. In addition, we find that a significant number of autoencoder features are redundant or irrelevant for the classification, and we give our explanations. To the best of our knowledge, this paper is one of the first investigations in the field of image processing on the detection of redundancy and irrelevance of autoencoder features using feature selection.

Clustering historical documents based on the reconstruction error of autoencoders
Conference ArODES

Mathias Seuret, Andreas Fischer, Angelika Garz, Marcus Liwicki, Rolf Ingold

Proceedings of HIP '15: Proceedings of the 3rd International Workshop on Historical Document Imaging and Processing, 22 August 2015, Nancy, France

Abstract:

The term "historical documents" encompasses an enormous variety of document types considering different scripts, languages, writing supports, and degradation degrees. For automatic processing with machine learning and pattern recognition methods, it would be ideal to share labeled learning samples and trained statistical models across similar documents, avoiding a retraining from scratch for every historical document anew. In this paper, we propose using the reconstruction error of autoencoders to compare historical manuscripts with the goal of clustering them according to their visual appearance. A low reconstruction error suggests visual similarity between a new manuscript and a known manuscript, for which the autoencoder was trained in an unsupervised fashion. Preliminary experiments conducted on 10 different manuscripts written with ink on parchment demonstrate the ability of the reconstruction error to group similar writing styles. For discriminating between Carolingian and cursive script, in particular, near-perfect results are reported.

Omega-lognormal analysis of oscillatory movements as a function of brain stroke risk factors
Conference ArODES

Albert Bou Hernandez, Andreas Fischer, Réjean Plamondon

Proceedings of the 17th Biennial Conference of the International Graphonomics Society, International Graphonomics Society (IGS), 21-24 June 2015, Pointe-à-Pitre, Guadeloupe

Abstract:

The development of predictive tools has been commonly utilized as the most effective manner to prevent illnesses that strike suddenly. Within this context, investigations linking fine human motor control with brain stroke risk factors are considered to have a high potential but they are still in an early stage of research. The present paper analyses neuromuscular features of oscillatory movements based on the Omega-Lognormal model of the Kinematic Theory. On a database of oscillatory movements from 120 subjects, we demonstrate that the proposed features differ significantly between subjects with and without brain stroke risk factors. This promising result motivates the development of predictive tools based on the Omega-Lognormal model.

A dissimilarity measure for on-line signature verification based on the sigma-lognormal model
Conference ArODES

Andreas Fischer, Réjean Plamondon

Proceedings of the 17th Biennial Conference of the International Graphonomics Society, International Graphonomics Society (IGS), 21-24 June 2015, Pointe-à-Pitre, Guadeloupe

Abstract:

The Sigma-Lognormal model of the Kinematic Theory of rapid human movements allows us to represent online signatures with an analytical neuromuscular model. It has been successfully used in the past to generate synthetic signatures in order to improve the performance of an automatic verification system. In this paper, we attempt for the first time to build a verification system based on the model parameters themselves. For describing individual lognormal strokes, we propose eighteen features which capture cognitive psychomotor characteristics of the signer. They are matched by means of dynamic time warping to derive a dissimilarity measure for signature verification. Promising initial results are reported for an experimental evaluation on the SUSIG visual sub-corpus, which contains some of the most skilled forgeries currently available for research.
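Dynamic time warping, which this abstract uses to match sequences of stroke features, can be sketched generically as follows. This is a textbook formulation in Python, not the paper's system: the function name `dtw_distance` and the scalar default distance are assumptions, and in the paper each sequence element would be a feature vector describing one lognormal stroke.

```python
def dtw_distance(a, b, dist=lambda x, y: abs(x - y)):
    """Classic dynamic time warping between two sequences.

    d[i][j] holds the minimal accumulated cost of aligning the first i
    elements of a with the first j elements of b."""
    n, m = len(a), len(b)
    inf = float("inf")
    d = [[inf] * (m + 1) for _ in range(n + 1)]
    d[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost = dist(a[i - 1], b[j - 1])
            # Extend the cheapest of the three admissible predecessor cells.
            d[i][j] = cost + min(d[i - 1][j], d[i][j - 1], d[i - 1][j - 1])
    return d[n][m]
```

Passing a vector distance as `dist` (for example a Euclidean distance between per-stroke feature vectors) turns the accumulated cost into a dissimilarity between two signatures.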

Approximation of graph edit distance in quadratic time
Conference ArODES

Kaspar Riesen, Miquel Ferrer, Andreas Fischer, Horst Bunke

Proceedings of International Workshop on Graph-Based Representations in Pattern Recognition ; GbRPR 2015: Graph-Based Representations in Pattern Recognition, 13-15 May 2015, Beijing, China

Abstract:

The basic idea of a recent graph matching framework is to reduce the problem of graph edit distance (GED) to an instance of a linear sum assignment problem (LSAP). The optimal solution for this simplified GED problem can be computed in cubic time and is eventually used to derive a suboptimal solution for the original GED problem. Yet, for large scale graphs and/or large scale graph sets the cubic time complexity remains a severe handicap of this procedure. Therefore, we propose to use suboptimal algorithms – with quadratic rather than cubic time complexity – for solving the underlying LSAP. In particular, we introduce several greedy assignment algorithms for approximating GED. In an experimental evaluation we show that there is great potential for further speeding up the GED computation. Moreover, we empirically confirm that the distances obtained by this procedure remain sufficiently accurate for graph based pattern classification.
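The trade-off described in this abstract, a quadratic-time greedy assignment in place of an optimal assignment, can be illustrated with a small Python sketch. The cost matrix is a made-up example, the function names are hypothetical, and the brute-force reference merely stands in for a cubic-time optimal solver; the specific greedy variants introduced in the paper are not reproduced here.

```python
from itertools import permutations


def greedy_assignment(cost):
    """Assign each row to its cheapest still-unused column, in row order.

    Runs in quadratic time but may miss the optimal assignment."""
    n = len(cost)
    free_cols = set(range(n))
    assignment = []
    total = 0.0
    for row in range(n):
        col = min(free_cols, key=lambda c: cost[row][c])
        free_cols.remove(col)
        assignment.append(col)
        total += cost[row][col]
    return assignment, total


def optimal_assignment(cost):
    """Brute-force reference solution; only viable for tiny matrices."""
    n = len(cost)
    best = min(
        permutations(range(n)),
        key=lambda p: sum(cost[r][p[r]] for r in range(n)),
    )
    return list(best), float(sum(cost[r][best[r]] for r in range(n)))


# Hypothetical 3x3 node-substitution cost matrix.
COST = [
    [1.0, 2.0, 9.0],
    [1.5, 8.0, 9.0],
    [9.0, 9.0, 1.0],
]
```

On this matrix the greedy pass commits the first row to the first column and ends with a total cost of 10.0, whereas the optimal assignment costs 4.5, which shows how the faster procedure can overestimate the assignment cost.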

Improving Hausdorff edit distance using structural node context
Conference ArODES

Andreas Fischer, Seiichi Uchida, Volkmar Frinken, Kaspar Riesen, Horst Bunke

Proceedings of International Workshop on Graph-Based Representations in Pattern Recognition ; GbRPR 2015: Graph-Based Representations in Pattern Recognition, 13-15 May 2015, Beijing, China

Abstract:

In order to cope with the exponential time complexity of graph edit distance, several polynomial-time approximation algorithms have been proposed in recent years. The Hausdorff edit distance is a quadratic-time matching procedure for labeled graphs which reduces the edit distance to a correspondence problem between local substructures. In its original formulation, nodes and their adjacent edges have been considered as local substructures. In this paper, we integrate a more general structural node context into the matching procedure based on hierarchical subgraphs. In an experimental evaluation on diverse graph data sets, we demonstrate that the proposed generalization of Hausdorff edit distance can significantly improve the accuracy of graph classification while maintaining low computational complexity.
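A strongly simplified, node-only sketch of a Hausdorff-style edit distance is given below in Python. It assumes nodes labelled with 2D coordinates, a Euclidean substitution cost that is halved because each substitution is counted from both sides, and a constant insertion/deletion cost `node_cost`; edges and the hierarchical node context proposed in the paper are deliberately left out, so this should not be read as the authors' method.

```python
import math


def hausdorff_edit_distance(nodes1, nodes2, node_cost=1.0):
    """Node-only Hausdorff-style edit distance between two point-labelled graphs.

    Each node is matched independently to its cheapest counterpart in the
    other graph, or deleted/inserted at cost node_cost. Quadratic time."""

    def one_side(src, dst):
        total = 0.0
        for u in src:
            best = node_cost  # cost of deleting (or inserting) u
            for v in dst:
                # Substitution is counted from both sides, hence halved.
                best = min(best, math.dist(u, v) / 2.0)
            total += best
        return total

    return one_side(nodes1, nodes2) + one_side(nodes2, nodes1)
```

Because every node picks its counterpart independently, several nodes may select the same partner; this is what keeps the procedure quadratic, and the result is in general a lower bound rather than an exact edit distance.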
