Phone: +41 26 429 67 34
Role: Principal applicant
Project description:
2017-2020, TAINA Technology
Research team within the HES-SO:
Status: In progress
2020-2021, Hasler Foundation
Christophe Stammet, Ulrich Ultes-Nitsche, Andreas Fischer
To be published.
The universality check of Büchi automata is a foundational problem in automata-based formal verification, closely related to the complementation problem, and is known to be PSPACE-complete. This article introduces a novel approach for creating labelled datasets of Büchi automata concerning their universality. We start with small automata, where the universality check can still be algorithmically performed within a reasonable timeframe, and then apply transformations that provably preserve (non-)universality while increasing their size. This approach enables the generation of large datasets of labelled Büchi automata without the need for an explicit and computationally intensive universality check. We subsequently employ these generated datasets to train Graph Neural Networks (GNNs) for the purpose of classifying automata with respect to their (non-)universality. The classification results indicate that such a network can learn patterns related to the behaviour of Büchi automata that facilitate the recognition of universality. Additionally, our results on randomly generated automata, which were not constructed using the transformation techniques, demonstrate the network’s potential in classifying Büchi automata with respect to universality, extending its applicability beyond cases generated using a specific technique.
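The abstract does not spell out the transformations themselves. As a minimal, hedged illustration of the general idea (the automaton representation and function names below are assumptions, not the paper's construction), one provably safe way to enlarge a Büchi automaton without changing its accepted language, and hence its (non-)universality, is to add a state that is unreachable from the initial state:

```python
def add_unreachable_state(automaton):
    """Return an enlarged copy whose accepted language is unchanged."""
    states = set(automaton["states"])
    new_state = max(states) + 1  # states are assumed to be integers
    enlarged = {
        "states": states | {new_state},
        "alphabet": set(automaton["alphabet"]),
        "initial": automaton["initial"],
        "accepting": set(automaton["accepting"]),
        # transitions: dict (state, symbol) -> set of successor states
        "transitions": {k: set(v) for k, v in automaton["transitions"].items()},
    }
    # The new state may have arbitrary outgoing edges; since no edge leads
    # *into* it, no run from the initial state can ever visit it.
    for symbol in automaton["alphabet"]:
        enlarged["transitions"][(new_state, symbol)] = {new_state}
    return enlarged

# A universal one-state automaton over {a, b}: every run stays accepting.
A = {
    "states": {0},
    "alphabet": {"a", "b"},
    "initial": 0,
    "accepting": {0},
    "transitions": {(0, "a"): {0}, (0, "b"): {0}},
}
B = add_unreachable_state(A)
assert len(B["states"]) == len(A["states"]) + 1
```

Repeating such label-preserving transformations yields arbitrarily large automata whose universality label is known by construction, which is the spirit of the dataset-generation approach described above.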
Anna Scius-Bertrand, Marc Bui, Andreas Fischer
vol. 47, no. 3, pp. 361-372
In order to access the rich cultural heritage conveyed in Vietnamese steles, automatic reading of stone engravings would be a great support for historians, who are analyzing tens of thousands of stele images. Approaching the challenging problem with deep learning alone is difficult because the data-driven models require large representative datasets with expert human annotations, which are not available for the steles and costly to obtain. In this article, we present a hybrid approach to spot keywords in stele images that combines data-driven deep learning with knowledge-based structural modeling and matching of Chu Nom characters. The main advantage of the proposed method is that it is annotation-free, i.e. no human data annotation is required. In an experimental evaluation, we demonstrate that keywords can be successfully spotted with a mean average precision of more than 70% when a single engraving style is considered.
Christian Abbet, Linda Studer, Andreas Fischer, Heather Dawson, Inti Zlobec, Behzad Bozorgtabar, Jean-Philippe Thiran
Medical Image Analysis,
2022, vol. 79, article no. 102473
Supervised learning is constrained by the availability of labeled data, which are especially expensive to acquire in the field of digital pathology. Making use of open-source data for pre-training or using domain adaptation can be a way to overcome this issue. However, pre-trained networks often fail to generalize to new test domains that are not distributed identically, due to variations in tissue staining, type, and texture. Additionally, current domain adaptation methods mainly rely on fully-labeled source datasets. In this work, we propose Self-Rule to Multi-Adapt (SRMA), which takes advantage of self-supervised learning to perform domain adaptation, and removes the necessity of fully-labeled source datasets. SRMA can effectively transfer the discriminative knowledge obtained from a few labeled samples of the source domain to a new target domain without requiring additional tissue annotations. Our method harnesses the structure of both domains by capturing visual similarity with intra-domain and cross-domain self-supervision. Moreover, we present a generalized formulation of our approach that allows the framework to learn from multiple source domains. We show that our proposed method outperforms baselines for domain adaptation of colorectal tissue type classification in single- and multi-source settings, and further validate our approach on an in-house clinical cohort. The code and trained models are available open-source: https://github.com/christianabbet/SRA.
Pau Riba, Josep Lladòs, Alicia Fornés, Andreas Fischer
2021, vol. 120, article no. 108132
The emergence of geometric deep learning as a novel framework to deal with graph-based representations has pushed aside traditional approaches in favor of completely new methodologies. In this paper, we propose a new framework able to combine advances in deep metric learning with traditional approximations of the graph edit distance. Hence, we propose an efficient graph distance based on the novel field of geometric deep learning. Our method employs a message passing neural network to capture the graph structure and then leverages this information for distance computation. The performance of the proposed graph distance is validated in two different scenarios. On the one hand, in graph retrieval of handwritten words, i.e. keyword spotting, it shows superior performance when compared with (approximate) graph edit distance benchmarks. On the other hand, it demonstrates competitive results for graph similarity learning when compared with the current state of the art on a recent benchmark dataset.
Anna Scius-Bertrand, Michael Jungo, Beat Wolf, Andreas Fischer, Marc Bui
2021, vol. 11, no. 11, article no. 4894
The current state of the art for automatic transcription of historical manuscripts is typically limited by the requirement of human-annotated learning samples, which are necessary to train specific machine learning models for specific languages and scripts. Transcription alignment is a simpler task that aims to find a correspondence between text in the scanned image and its existing Unicode counterpart, a correspondence which can then be used as training data. The alignment task can be approached with heuristic methods dedicated to certain types of manuscripts, or with weakly trained systems reducing the required amount of annotations. In this article, we propose a novel learning-based alignment method based on fully convolutional object detection that does not require any human annotation at all. Instead, the object detection system is initially trained on synthetic printed pages using a font and then adapted to the real manuscripts by means of self-training. On a dataset of historical Vietnamese handwriting, we demonstrate the feasibility of annotation-free alignment as well as the positive impact of self-training on the character detection accuracy, reaching a detection accuracy of 96.4% with a YOLOv5m model without using any human annotation.
Andreas Fischer, Michael Keller
AlpLinkBioEco (2021). Creating Bio-based Value in the Alpine Space. Interreg Alpine Space.
(pp. 20-26). 2021,
Interreg Alpine Space
Michael Keller, Andreas Fischer, Dorian Wessely, Ashna Mudaffer
Working Paper, February 2021, HES-SO//FR HEIA-FR, iCoSys, PICC, INNOSQUARE, Business Upper Austria,
Sandoz Romain, Ertz Olivier, Scius-Bertrand Anna, Fischer Andreas, Hüsser Olivier, Ghorbel Hatem
The International Archives of Photogrammetry, Remote Sensing and Spatial Information Sciences, 2021, vol.
In this work, we present a framework supported by mobile and web apps that is able to propose personalized pedestrian routes matching a user's mobility profile while considering mobility impediment factors. We explain how these latter have been defined using a pedestrian-centric approach based on travel experiences as perceived in the field by senior citizens. Through workshops, six main factors that may influence pedestrian route choices were revealed: passability, obstacle in path, surface problem, security, sidewalk width, and slope. These categories were used to build digital tools and guide a citizen participatory approach to collect geolocated points of obstacle documented with walkability information (picture, category, impact score, free comment). We also involved citizens in evaluating this information, with senior referents in particular providing validation. Finally, we present how we connect these points of obstacle with a pedestrian network based on OpenStreetMap to configure a routing cost function. The framework was partially deployed in 2020 with a limited number of participants due to the pandemic. Nonetheless, we share lessons learned from interaction with citizens in the design of such a framework, whose underlying workflow is reproducible. We plan to further assess its relevance and sustainability in the future.
Andreas Fischer, Marcus Liwicki
New Jersey : World Scientific
In recent years, libraries and archives all around the world have increased their efforts to digitize historical manuscripts. To integrate the manuscripts into digital libraries, pattern recognition and machine learning methods are needed to extract and index the contents of the scanned images.
This unique compendium describes the outcome of the HisDoc research project, a pioneering attempt to study the whole processing chain of layout analysis, handwriting recognition, and retrieval of historical manuscripts. This description is complemented with an overview of other related research projects, in order to convey the current state of the art in the field and outline future trends.
This must-have volume is a relevant reference work for librarians, archivists and computer scientists.
Andreas Fischer, Marcus Liwicki, Rolf Ingold
In Fischer, Andreas, Ingold, Rolf, Liwicki, Marcus, Handwritten historical document analysis, recognition, and retrieval - state of the art and future trends
(8 p.). 2020,
New Jersey : World Scientific
Document image analysis or document recognition refers to the process of extracting valuable information from document images. Although a few optical character reading systems were already available in the 1970s, fundamental research activities on this challenging task have mainly emerged with the development of scanner technologies in the 1980s, which allowed affordable document image acquisition. At that time, the main applications were focused on office automation and the interpretation of printed material…
Handwritten historical document analysis, recognition, and retrieval - state of the art and future trends
(pp. 67-80). 2020,
New Jersey : World Scientific
Handwritten historical document analysis, recognition, and retrieval - state of the art and future trends
(pp. 11-23). 2020,
New Jersey : World Scientific
Handwritten historical document analysis, recognition, and retrieval - state of the art and future trends
(pp. 249-251). 2020,
New Jersey : World Scientific
Linda Studer, Annika Blank, John-Melle Bokhorst, Iris D. Nagtegaal, Inti Zlobec, Alessandro Lugli, Andreas Fischer, Heather Dawson
2020, vol. 78, no. 4, pp. 476-484
Tumour budding in colorectal cancer, defined as single tumour cells or small clusters containing four or fewer tumour cells, is a robust and independent biomarker of aggressive tumour biology. On the basis of published data in the literature, the evidence is certainly in favour of reporting tumour budding in routine practice. One important aspect of implementing tumour budding has been to establish a standardised and evidence-based scoring method, as was recommended by the International Tumour Budding Consensus Conference (ITBCC) in 2016. Further developments have aimed at establishing methods for automated tumour budding assessment. A digital approach to scoring tumour buds has great potential to assist in performing an objective budding count but, like the manual consensus method, must be validated and standardised. The aim of the present review is to present general considerations behind the ITBCC scoring method, and a broad overview of the current situation and challenges regarding automated tumour budding detection methods.
Andreas Fischer, Roman Schindler, Manuel Bouillon, Réjean Plamondon
In Ferrer, Miguel Ángel, Marcelli, Angelo, Plamondon, Réjean, The lognormality principle and its applications in e-security, e-learning and e-health
(pp. 327-342). 2019,
New Jersey : World Scientific
The Kinematic Theory of rapid human movements analytically describes pen tip movements as a sequence of elementary strokes with lognormal speed. The theory has been confirmed in a large number of experimental evaluations, achieving a high reconstruction quality when compared with observed trajectories and providing pertinent features for biomedical applications as well as biometric identification. So far, the Kinematic Theory has focused on one-dimensional movements with the Delta-Lognormal model and on two-dimensional movements with the Sigma-Lognormal model. In this chapter, we present a model for movements in three dimensions, which naturally extends the Sigma-Lognormal approach. We evaluate our method on two action recognition datasets and an air-writing dataset, demonstrating a high reconstruction quality for modelling rapid 3D movements in all cases.
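As a point of reference for the lognormal stroke model mentioned above, the following sketch (parameter values are illustrative, not fitted to any dataset) computes the speed profile of elementary strokes with lognormal speed and superposes them, which is the core idea behind the Sigma-Lognormal family of models:

```python
import numpy as np

def lognormal_speed(t, D, t0, mu, sigma):
    """Speed profile of a single elementary stroke (zero before its onset t0)."""
    v = np.zeros_like(t)
    m = t > t0
    x = t[m] - t0
    v[m] = D / (sigma * x * np.sqrt(2 * np.pi)) * np.exp(
        -(np.log(x) - mu) ** 2 / (2 * sigma ** 2))
    return v

# Sigma-Lognormal idea: the overall speed is the superposition of
# overlapping strokes (reduced here to a scalar sum for simplicity;
# the full model superposes velocity vectors).
t = np.linspace(0, 1.5, 300)
profile = lognormal_speed(t, D=5.0, t0=0.0, mu=-1.5, sigma=0.3) \
        + lognormal_speed(t, D=3.0, t0=0.4, mu=-1.6, sigma=0.25)
```

The chapter's 3D extension keeps this speed profile and generalises the stroke directions to three dimensions.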
Paul Maergner, Vinaychandran Pondenkandath, Michele Alberti, Marcus Liwicki, Kaspar Riesen, Rolf Ingold, Andreas Fischer
Pattern Recognition Letters,
2019, vol. 125, pp. 527-533
Offline signature verification is a challenging pattern recognition task where a writer model is inferred using only a small number of genuine signatures. A combination of complementary writer models can make it more difficult for an attacker to deceive the verification system. In this work, we propose to combine a recent structural approach based on graph edit distance with a statistical approach based on deep triplet networks. The combination of the structural and statistical models achieve significant improvements in performance on four publicly available benchmark datasets, highlighting their complementary perspectives.
Kaspar Riesen, Andreas Fischer, Horst Bunke
Neural Processing Letters,
2018, vol. 48, pp. 691-707
The concept of graph edit distance constitutes one of the most flexible graph matching paradigms available. The major drawback of graph edit distance, viz. the exponential time complexity, has been recently overcome by means of a reformulation of the edit distance problem to a linear sum assignment problem. However, the substantial speed up of the matching is also accompanied by an approximation error on the distances. Major contribution of this paper is the introduction of a transformation process in order to convert the underlying cost model into a utility model. The benefit of this transformation is that it enables the integration of additional information in the assignment process. We empirically confirm the positive effects of this transformation on five benchmark graph sets with respect to the accuracy and run time of a distance based classifier.
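To make the reduction concrete, here is a small self-contained sketch (the cost values are invented for illustration) of solving a linear sum assignment on a cost matrix and on a corresponding utility matrix. Subtracting each cost from the maximum is one straightforward cost-to-utility transformation; the paper's actual transformation, which enables integrating additional information, may differ.

```python
import numpy as np
from scipy.optimize import linear_sum_assignment

# Toy cost matrix between local substructures of two graphs.
cost = np.array([[2.0, 5.0, 1.0],
                 [4.0, 1.0, 3.0],
                 [3.0, 2.0, 6.0]])

# One simple cost-to-utility conversion: utilities are costs subtracted
# from the maximum, so maximising utility selects the same assignment
# as minimising cost.
utility = cost.max() - cost

rows_c, cols_c = linear_sum_assignment(cost)                    # minimise cost
rows_u, cols_u = linear_sum_assignment(utility, maximize=True)  # maximise utility
assert (cols_c == cols_u).all()  # both yield the same optimal assignment
```

The approximate edit distance is then derived from the costs of the selected node operations.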
Michael Stauffer, Kaspar Riesen, Andreas Fischer
2018, vol. 81, pp. 240-253
In recent decades, historical handwritten documents have become increasingly available in digital form. Yet, the accessibility of these documents with respect to browsing and searching has remained limited, as fully automatic transcription is often not possible or not sufficiently accurate. This paper proposes a novel, reliable approach for template-based keyword spotting in historical handwritten documents. In particular, our framework makes use of different graph representations for segmented word images and a sophisticated matching procedure. Moreover, we extend our method to a spotting ensemble. In an exhaustive experimental evaluation on four widely used benchmark datasets, we show that the proposed approach is able to keep up with or even outperform several state-of-the-art methods for template- and learning-based keyword spotting.
Mohammad Reza Ameri, Michael Stauffer, Kaspar Riesen, Tien D. Bui, Andreas Fischer
Pattern Recognition Letters,
Keyword spotting enables content-based retrieval of scanned historical manuscripts using search terms, which, in turn, facilitates their indexation in digital libraries. Recent approaches include graph-based representations that capture the complex structure of handwriting. However, the high representational power of graphs comes at the cost of high computational complexity for graph matching. In this article, we investigate the potential of Hausdorff edit distance (HED) for keyword spotting. It is an efficient quadratic-time approximation of the graph edit distance. In a comprehensive experimental evaluation with four types of handwriting graphs and four benchmark datasets (George Washington, Parzival, Botany, and Alvermann Konzilsprotokolle), we demonstrate a strong performance of the proposed HED-based method when compared with the state of the art, both in terms of precision and speed.
Michael Stauffer, Andreas Fischer, Kaspar Riesen
The accessibility of handwritten historical documents is often constrained by the limited feasibility of automatic full transcriptions. Keyword Spotting (KWS), which allows retrieving arbitrary query words from documents, has been proposed as an alternative. In the present paper, we make use of graphs for representing word images. The actual keyword spotting is thus based on matching a query graph with all document graphs. However, even with relatively fast approximation algorithms, the sheer amount of matchings might limit the practical application of this approach. For this reason, we present two novel filters with linear time complexity that allow us to substantially reduce the number of graph matchings actually required. In particular, these filters estimate a graph dissimilarity between a query graph and all document graphs based on their node and edge distribution in a polar coordinate system. Eventually, all graphs from the document whose distributions differ too heavily from the query's node/edge distribution are eliminated. In an experimental evaluation on four different historical documents, we show that about 90% of the matchings can be omitted, while the KWS accuracy is not negatively affected.
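A hedged sketch of the polar filtering idea (the bin counts, normalisation, and threshold below are assumptions, not the paper's settings): node positions of a word graph are binned in a polar coordinate system around the graph's centre, and document graphs whose bin histograms differ too much from the query's are discarded before any costly graph matching.

```python
import numpy as np

def polar_histogram(nodes, n_radial=3, n_angular=8):
    """Normalised node distribution over radial/angular bins around the centroid."""
    pts = np.asarray(nodes, dtype=float)
    centred = pts - pts.mean(axis=0)
    r = np.hypot(centred[:, 0], centred[:, 1])
    theta = np.arctan2(centred[:, 1], centred[:, 0])  # in [-pi, pi]
    r_bins = np.minimum((r / (r.max() + 1e-9) * n_radial).astype(int), n_radial - 1)
    a_bins = ((theta + np.pi) / (2 * np.pi + 1e-9) * n_angular).astype(int)
    hist = np.zeros((n_radial, n_angular))
    for rb, ab in zip(r_bins, a_bins):
        hist[rb, ab] += 1
    return hist / hist.sum()

def passes_filter(query_nodes, doc_nodes, threshold=0.5):
    # Linear-time dissimilarity estimate; cheap to compute for every graph.
    d = np.abs(polar_histogram(query_nodes) - polar_histogram(doc_nodes)).sum()
    return d <= threshold
```

Only the graphs that pass this cheap test are forwarded to the expensive graph matching stage.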
Moises Diaz, Andreas Fischer, Miguel A. Ferrer, Réjean Plamondon
IEEE Transactions on Cybernetics,
2018, vol. 48, no. 1, pp. 228-239
The dynamic signature is a biometric trait widely used and accepted for verifying a person's identity. Current automatic signature-based biometric systems typically require five, ten, or even more specimens of a person's signature to learn intrapersonal variability sufficient to provide an accurate verification of the individual's identity. To mitigate this drawback, this paper proposes a procedure for training with only a single reference signature. Our strategy consists of duplicating the given signature a number of times and training an automatic signature verifier with each of the resulting signatures. The duplication scheme is based on a sigma lognormal decomposition of the reference signature. Two methods are presented to create human-like duplicated signatures: the first varies the strokes' lognormal parameters (stroke-wise) whereas the second modifies their virtual target points (target-wise). A challenging benchmark, assessed with multiple state-of-the-art automatic signature verifiers and multiple databases, proves the robustness of the system. Experimental results suggest that our system, with a single reference signature, is capable of achieving a similar performance to standard verifiers trained with up to five signature specimens.
Angelika Garz, Mathias Seuret, Andreas Fischer, Rolf Ingold
IEEE Transactions on Human-Machine Systems,
2017, vol. 47, no. 2, pp. 181-193
In historical manuscripts, humans can detect handwritten words, lines, and decorations with ease even if they do not know the language or the script. Yet for automatic processing this task has proven elusive, especially in the case of handwritten documents with complex layouts, which is why semiautomatic methods that integrate the human user into the process are needed. In this paper, we introduce a user-centered segmentation method based on document graphs and scribbling interaction. The graphs capture a sparse representation of the document's structure that can then be edited by the user with a stylus on a touch-sensitive screen. We evaluate the proposed method on a newly introduced database of historical manuscripts with complex layout and demonstrate, first, that the document graphs are already close to the desired segmentation and, second, that scribbling allows a natural and efficient interaction.
Andreas Fischer, Réjean Plamondon
IEEE Transactions on Human-Machine Systems,
2017, vol. 47, no. 2, pp. 169-180
When using tablet computers, smartphones, or digital pens, human users perform movements with a stylus or their fingers that can be analyzed by the kinematic theory of rapid human movements. In this paper, we present a user-centered system for signature verification that performs such a kinematic analysis to verify the identity of the user. It is one of the first systems that is based on a direct comparison of the elementary neuromuscular strokes which are detected in the handwriting. Taking into account the number of strokes, their similarity, and their timing, the string edit distance is employed to derive a dissimilarity measure for signature verification. On several benchmark datasets, we demonstrate that this neuromuscular analysis is complementary to a well-established verification using dynamic time warping. By combining both approaches, our verifier is able to outperform current state-of-the-art results in on-line signature verification.
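The stroke-level comparison can be illustrated with a standard edit-distance dynamic program. The sketch below reduces each neuromuscular stroke to a scalar feature and uses unit insertion/deletion costs; both are simplifying assumptions, not the paper's actual feature set or cost model.

```python
def stroke_edit_distance(ref, probe, ins_del_cost=1.0):
    """Levenshtein-style DP over two stroke sequences (here: scalar features)."""
    n, m = len(ref), len(probe)
    d = [[0.0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        d[i][0] = i * ins_del_cost
    for j in range(1, m + 1):
        d[0][j] = j * ins_del_cost
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            sub = abs(ref[i - 1] - probe[j - 1])  # dissimilarity of two strokes
            d[i][j] = min(d[i - 1][j] + ins_del_cost,   # delete a reference stroke
                          d[i][j - 1] + ins_del_cost,   # insert a probe stroke
                          d[i - 1][j - 1] + sub)        # substitute one stroke
    return d[n][m]

# Identical stroke sequences have distance 0; a missing stroke costs one edit.
assert stroke_edit_distance([0.2, 0.5, 0.9], [0.2, 0.5, 0.9]) == 0.0
assert stroke_edit_distance([0.2, 0.5], [0.2]) == 1.0
```

The resulting distance can then be thresholded, or combined with other measures such as dynamic time warping, to accept or reject a signature.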
Andreas Fischer, Kaspar Riesen, Horst Bunke
Pattern Recognition Letters,
2017, vol. 87, no. 1, pp. 55-62
Approximation of graph edit distance in polynomial time enables us to compare large, arbitrarily labeled graphs for structural pattern recognition. In a recent approximation framework, bipartite graph matching (BP) has been proposed to reduce the problem of edit distance to a cubic-time linear sum assignment problem (LSAP) between local substructures. Following the same line of research, first attempts towards quadratic-time approximation have been made recently, including a lower bound based on Hausdorff matching (Hausdorff Edit Distance) and an upper bound based on greedy assignment (Greedy Edit Distance). In this paper, we compare the two approaches and derive a novel upper bound (BP2) which combines advantages of both. In an experimental evaluation on the IAM graph database repository, we demonstrate that the proposed quadratic-time methods perform equally well or, quite surprisingly, in some cases even better than the cubic-time method.
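For illustration, a much-simplified version of the Hausdorff-matching idea (node labels reduced to scalars, unit deletion costs, halved substitution costs; all assumptions of this sketch, not the paper's cost model) shows why it runs in quadratic time: each node independently picks its cheapest counterpart, with no global assignment constraint.

```python
def hausdorff_edit_distance(nodes_g1, nodes_g2, deletion_cost=1.0):
    """Quadratic-time estimate: every node matches its cheapest counterpart."""
    def nearest_cost(u, others):
        # Substitute with the closest node (cost shared between both sides),
        # or delete u, whichever is cheaper.
        return min([abs(u - v) / 2 for v in others] + [deletion_cost])

    cost_g1 = sum(nearest_cost(u, nodes_g2) for u in nodes_g1)
    cost_g2 = sum(nearest_cost(v, nodes_g1) for v in nodes_g2)
    return cost_g1 + cost_g2  # O(n*m) overall

assert hausdorff_edit_distance([1.0, 2.0], [1.0, 2.0]) == 0.0
```

Dropping the one-to-one assignment constraint is exactly what makes the result a lower-bound-style estimate rather than an admissible edit distance.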
Angelika Garz, Marcel Würsch, Andreas Fischer, Rolf Ingold
2016, vol. 28
Recent advances in writer identification push the limits by using increasingly complex methods relying on sophisticated preprocessing, or the combination of already complex descriptors. In this paper, we pursue a simpler and faster approach to writer identification, introducing novel descriptors computed from the geometrical arrangement of interest points at different scales. They capture orientation distributions and geometrical relationships of script parts such as strokes, junctions, endings, and loops. Thus, we avoid a fixed set of character appearances as in standard codebook-based methods. The proposed descriptors significantly cut down processing time compared to existing methods, are simple and efficient, and can be applied out-of-the-box to an unseen dataset. Evaluations on widely-used datasets show their potential when applied by themselves, and in combination with other descriptors. Limitations of our method relate to the amount of data needed to obtain reliable models.
Michael Jungo, Lars Vögtlin, Atefeh Fakhari, Nathan Wegmann, Rolf Ingold, Andreas Fischer, Anna Scius-Bertrand
SOICT '23: Proceedings of the 12th International Symposium on Information and Communication Technology, 7-8 December 2023, Ho Chi Minh City, Vietnam
Handwriting recognition is a key technology for accessing the content of old manuscripts, helping to preserve cultural heritage. Deep learning shows an impressive performance in solving this task. However, to achieve its full potential, it requires a large amount of labeled data, which is difficult to obtain for ancient languages and scripts. Often, a trade-off has to be made between ground truth quantity and quality, as is the case for the recently introduced Bullinger database. It contains an impressive amount of over a hundred thousand labeled text line images of mostly premodern German and Latin texts that were obtained by automatically aligning existing page-level transcriptions with text line images. However, the alignment process introduces systematic errors, such as wrongly hyphenated words. In this paper, we investigate the impact of such errors on training and evaluation and suggest means to detect and correct typical alignment errors.
Réjean Plamondon, Asma Bensalah, Karina Lebel, Romeo Salameh, Guillaume Séguin de Broin, Christian O'Reilly, Mickael Begon, Olivier Desbiens, Youssef Beloufa, Aymeric Guy, Daniel Berio, Frederic Fol Leymarie, Simon-Pierre Boyoguéno-Bidias, Andreas Fischer, Zigeng Zhang, Marie-France Morin, Denis Alamargot, Céline Rémi, Nadir Faci, Raphaëlle Fortin, Marie-Noëlle Simard, Caroline Bazinet
Proceedings of the 21st International Conference of the International Graphonomics Society, IGS 2023, 16-19 October 2023, Evora, Portugal ; Graphonomics in Human Body Movement. Bridging Research and Practice from Motor Control to Handwriting Analysis and Recognition
This invited special session of IGS 2023 presents the works carried out at Laboratoire Scribens and some of its collaborating laboratories. It summarises the 17 talks presented in the colloquium #611 entitled « La lognormalité: une fenêtre ouverte sur le contrôle neuromoteur» (Lognormality: a window opened on neuromotor control), at the 2023 conference of the Association Francophone pour le Savoir (ACFAS) on May 10, 2023. These talks covered a wide range of subjects related to the Kinematic Theory, including key elements of the theory, some gesture analysis algorithms that have emerged from it, and its application to various fields, particularly in biomedical engineering and human-machine interaction.
Anna Scius-Bertrand, Phillip Ströbel, Martin Volk, Tobias Hodel, Andreas Fischer
Document Analysis and Recognition - ICDAR 2023: Proceedings of the 17th International Conference, 21-26 August 2023, San José, CA, USA
One of the main challenges of automatically transcribing large collections of handwritten letters is to cope with the high variability of writing styles present in the collection. In particular, the writing styles of non-frequent writers, who have contributed only few letters, are often missing in the annotated learning samples used for training handwriting recognition systems. In this paper, we introduce the Bullinger dataset for writer adaptation, which is based on the Heinrich Bullinger letter collection from the 16th century, using a subset of 3,622 annotated letters (about 1.2 million words) from 306 writers. We provide baseline results for handwriting recognition with modern recognizers, before and after the application of standard techniques for supervised adaptation of frequent writers and self-supervised adaptation of non-frequent writers.
Michael Jungo, Beat Wolf, Andrii Maksai, Claudiu Musat, Andreas Fischer
On-line handwritten character segmentation is often associated with handwriting recognition and even though recognition models include mechanisms to locate relevant positions during the recognition process, it is typically insufficient to produce a precise segmentation. Decoupling the segmentation from the recognition unlocks the potential to further utilize the result of the recognition. We specifically focus on the scenario where the transcription is known beforehand, in which case the character segmentation becomes an assignment problem between sampling points of the stylus trajectory and characters in the text. Inspired by the k-means clustering algorithm, we view it from the perspective of cluster assignment and present a Transformer-based architecture where each cluster is formed based on a learned character query in the Transformer decoder block. In order to assess the quality of our approach, we create character segmentation ground truths for two popular on-line handwriting datasets, IAM-OnDB and HANDS-VNOnDB, and evaluate multiple methods on them, demonstrating that our approach achieves the overall best results.
Lars Vögtlin, Anna Scius-Bertrand, Paul Maergner, Andreas Fischer, Rolf Ingold
Proceedings of the 7th International Workshop on Historical Document Imaging and Processing (HIP'23), 25-26 August 2023, San José, CA, USA
Deep learning methods have shown strong performance in solving tasks for historical document image analysis. However, despite current libraries and frameworks, programming an experiment or a set of experiments and executing them can be time-consuming. This is why we propose an open-source deep learning framework, DIVA-DAF, which is based on PyTorch Lightning and specifically designed for historical document analysis. Pre-implemented tasks such as segmentation and classification can be easily used or customized. It is also easy to create one's own tasks with the benefit of powerful modules for loading data, even large data sets, and different forms of ground truth. The applications conducted have demonstrated time savings for the programming of a document analysis task, as well as for different scenarios such as pre-training or changing the architecture. Thanks to its data module, the framework also makes it possible to significantly reduce model training time.
Ana Leni Frei, Amjad Khan, Philipp Zens, Alessandro Lugli, Inti Zlobec, Andreas Fischer
Proceedings of Medical Imaging with Deep Learning (MIDL), 10-12 July 2023, Nashville, USA
In histopathology, histologic elements are not randomly located across an image but organize into structured patterns. In this regard, classification tasks or feature extraction from histology images may require context information to increase performance. In this work, we explore the importance of keeping context information for a cell classification task on Hematoxylin and Eosin (H&E) scanned whole slide images (WSI) in colorectal cancer. We show that to differentiate normal from malignant epithelial cells, the environment around the cell plays a critical role. We propose here an image augmentation based on gamma variations to guide deep learning models to focus on the object of interest while keeping context information. This augmentation method yielded more specific models and helped to increase the model performance (weighted F1 score with/without gamma augmentation respectively, PanNuke: 99.49 vs 99.37 and TCGA: 91.38 vs. 89.12, p < 0.05).
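As a rough illustration of gamma-based augmentation (the value range and the application scheme below are assumptions; the paper's exact method of emphasising the cell of interest is not detailed in the abstract), a normalised image patch can be raised to a random power, which perturbs brightness and contrast while leaving the spatial content, and thus the context, in place:

```python
import numpy as np

def gamma_augment(image, gamma_range=(0.5, 1.5), rng=None):
    """Apply a random gamma transform to an image with values in [0, 1]."""
    rng = rng if rng is not None else np.random.default_rng()
    gamma = rng.uniform(*gamma_range)
    img = np.clip(np.asarray(image, dtype=float), 0.0, 1.0)
    return img ** gamma  # gamma < 1 brightens, gamma > 1 darkens

patch = np.full((4, 4), 0.25)  # toy uniform patch standing in for an H&E crop
out = gamma_augment(patch, rng=np.random.default_rng(0))
assert out.shape == patch.shape and 0.0 <= out.min() and out.max() <= 1.0
```

Training under such intensity perturbations discourages the model from relying on absolute staining intensities of the surrounding tissue.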
Ana Leni Frei, Amjad Khan, Linda Studer, Philipp Zens, Alessandro Lugli, Andreas Fischer, Inti Zlobec
Proceedings of the Medical Imaging with Deep Learning (MIDL), 10-12 July 2023, Nashville, USA
In digital pathology, cell-level tissue analyses are widely used to better understand tissue composition and structure. Publicly available datasets and models for cell detection and classification in colorectal cancer exist but lack the differentiation of normal and malignant epithelial cells that are important to perform prior to any downstream cell-based analysis. This classification task is particularly difficult due to the high intra-class variability of neoplastic cells. To tackle this, we present here a new method that uses graph-based node classification to take advantage of both local cell features and global tissue architecture to perform accurate epithelial cell classification. The proposed method demonstrated excellent performance on F1 score (PanNuke: 1.0, TCGA: 0.98) and performed significantly better than conventional computer vision methods (PanNuke: 0.99, TCGA: 0.92).
Linda Studer, John-Melle Bokhorst, Iris Nagtegaal, Inti Zlobec, Heather Dawson, Andreas Fischer
Colon resection is often the treatment of choice for colorectal cancer (CRC) patients. However, especially for minimally invasive cancer, such as pT1, simply removing the polyps may be enough to stop cancer progression. Different histopathological risk factors such as tumor grade and invasion depth currently form the basis for assessing the need for colon resection in pT1 CRC patients. Here, we investigate two additional risk factors, tumor budding and lymphocyte infiltration at the invasive front, which are known to be clinically relevant. We capture the spatial layout of tumor buds and T-cells and use graph-based deep learning to investigate them as potential risk predictors. Our pT1 Hotspot Tumor Budding T-cell Graph (pT1-HBTG) dataset consists of 626 tumor budding hotspots from 575 patients. We propose and compare three different graph structures, as well as combinations of the node labels. The best-performing Graph Neural Network architecture is able to increase specificity by 20% compared to the currently recommended risk stratification based on histopathological risk factors, without losing any sensitivity. We believe that using a graph-based analysis can help to assist pathologists in making risk assessments for pT1 CRC patients, and thus decrease the number of patients undergoing potentially unnecessary surgery. Both the code and dataset are made publicly available.
Philip Ströbel, Tobias Hodel, Andreas Fischer, Anna Scius-Bertrand, Beat Wolf, Anna Janka, Jonas Widmer, Patricia Scheurer, Martin Volk
Digital Humanities im deutschsprachigen Raum 2023 (DHd2023): Open Humanities, Open Culture, 13-17 March 2023, Trier, Germany, Belval, Luxembourg
Jonas Diesbach, Andreas Fischer, Marc Bui, Anna Scius-Bertrand
Proceedings of International Conference on Frontiers in Handwriting Recognition (ICFHR) 2022, 4-7 December 2022, Hyderabad, India
Images of historical Vietnamese steles allow historians to discover invaluable information regarding the past of the country, especially about the life of people in rural villages. Due to the sheer amount of available stone engravings and their diversity, manual examination is difficult and time-consuming. Therefore, automatic document analysis methods based on machine learning could immensely facilitate this laborious work. However, creating ground truth for machine learning is also complex and time-consuming for human experts, which is why synthetic training samples greatly support learning while reducing human effort. In particular, they can be used to train deep neural networks for character detection and recognition. In this paper, we present a method for creating synthetic engravings and use it to create a new database composed of 26,901 synthetic Chu Nom characters in 21 different styles. Using a machine learning model for unpaired image-to-image translation, our approach is annotation-free, i.e. there is no need for human experts to label character images. A user study demonstrates that the synthetic engravings look realistic to the human eye.
Anna Scius-Bertrand, Andreas Fischer, Marc Bui
SoICT 2022: The 11th International Symposium on Information and Communication Technology, 1-3 December 2022, Hanoi, Vietnam
Stone engravings on Vietnamese steles are an invaluable resource for historians to study the life of the villagers in the past. Thanks to pictures taken of stampings of the steles, they can be investigated today in the form of digital images. Automatic keyword spotting is a promising means to access the textual content of the images, allowing to retrieve steles that contain a certain query term. In this paper, we present a complete pipeline for retrieving Chu Nom characters in Vietnamese steles that operates fully automatically on the original images, without the need for preprocessing, segmentation, or human annotation. It combines a self-calibration approach to character detection using deep convolutional neural networks with a graph-based approach to keyword spotting that compares templates of the search term with detected characters based on structural properties.
Martin Spoto, Beat Wolf, Andreas Fischer, Anna Scius-Bertrand
Proceedings of the 20th International Conference of the International Graphonomics Society, IGS 2021, Intertwining Graphonomics with Human Movements, -9 June 2022, Las Palmas de Gran Canaria
Automatic handwriting recognition for historical documents is a key element for making our cultural heritage available to researchers and the general public. However, current approaches based on machine learning require a considerable amount of annotated learning samples to read ancient scripts and languages. Producing such ground truth is a laborious and time-consuming task that often requires human experts. In this paper, to cope with a limited amount of learning samples, we explore the impact of using synthetic text line images to support the training of handwriting recognition systems. For generating text lines, we consider lineGen, a recent GAN-based approach, and for handwriting recognition, we consider HTR-Flor, a state-of-the-art recognition system. Different meta-learning strategies are explored that schedule the addition of synthetic text line images to the existing real samples. In an experimental evaluation on the well-known Bentham dataset as well as the newly introduced Bullinger dataset, we demonstrate a significant improvement of the recognition performance when combining real and synthetic samples.
Anna Scius-Bertrand, Linda Studer, Andreas Fischer, Marc Bui
Proceedings of the S+SSPR 2022. IAPR Joint International Workshops on Statistical Techniques in Pattern Recognition (SPR 2022) and Structural and Syntactic Pattern Recognition (SSPR 2022), 26-27 August 2022, Montreal, Canada
Finding key terms in scanned historical manuscripts is invaluable for accessing our written cultural heritage. While keyword spotting (KWS) approaches based on machine learning achieve the best spotting results in the current state of the art, they are limited by the fact that annotated learning samples are needed to infer the writing style of a particular manuscript collection. In this paper, we propose an annotation-free KWS method that does not require any labeled handwriting sample but learns from a printed font instead. First, we train a deep convolutional character detection system on synthetic pages using printed characters. Afterwards, the structure of the detected characters is modeled by means of graphs and is compared with search terms using graph matching. We evaluate our method for spotting logographic Chu Nom characters on the newly introduced Kieu database, a historical Vietnamese manuscript containing 719 scanned pages of the famous Tale of Kieu. Our results show that search terms can be found with promising precision both when providing handwritten samples (query by example) as well as printed characters (query by string).
Christophe Stammet, Prisca Dotti, Ulrich Ultes-Nitsche, Andreas Fischer
Proceedings of the 4th International Workshop on Learning and Automata (LearnAut 2022), 4 July 2022, Paris, France
Büchi automata on infinite words present many interesting problems and are used frequently in program verification and model checking. Many of these problems on Büchi automata are computationally hard, raising the question of whether a learning-based, data-driven analysis might be more efficient than traditional algorithms. Since Büchi automata can be represented by graphs, graph neural networks are a natural choice for such a learning-based analysis. In this paper, we demonstrate how graph neural networks can be used to reliably predict basic properties of Büchi automata when trained on automatically generated random automata datasets.
Linda Studer, John-Melle Bokhorst, Francesco Ciompi, Andreas Fischer, Heather Dawson
Proceedings of the ECDP 2022 18th European Congress on Digital Pathology, 15-18 June 2022, Berlin, Germany
Anna Scius-Bertrand, Michael Jungo, Beat Wolf, Andreas Fischer, Marc Bui
Proceedings of the International Conference on Document Analysis and Recognition (ICDAR 2021), 5-10 September 2021, Lausanne, Switzerland
Images of historical Vietnamese stone engravings provide historians with a unique opportunity to study the past of the country. However, due to the large heterogeneity of thousands of images regarding both the text foreground and the stone background, it is difficult to use automatic document analysis methods to support manual examination, especially given the labeling effort needed for training machine learning systems. In this paper, we present a method for finding the location of Chu Nom characters in the main text of the steles without the need for any human annotation. Using self-calibration, fully convolutional object detection methods trained on printed characters are successfully adapted to the handwritten image collection. The achieved detection results are promising for subsequent document analysis tasks, such as keyword spotting or transcription.
Andreas Fischer, Gernot A. Fink
Proceedings of International Conference on Document Analysis and Recognition (ICDAR 2021), 5-10 September 2021, Lausanne, Switzerland
Graphs are an intuitive and natural way of representing handwriting. Due to their high representational power, they have shown strong performance in various learning-free document analysis tasks. While machine learning is rather unexplored for graph representations, geometric deep learning offers a novel framework that allows for convolutional neural networks similar to the image domain. In this work, we show that the concept of attribute prediction can be adapted to the graph domain. We propose a graph neural network to map handwritten word graphs to a symbolic attribute space. This mapping allows performing query-by-example word spotting, as it was also tackled by other learning-free approaches in the graph domain. Furthermore, our model is capable of query-by-string, which is out of scope for other graph-based methods in the literature. We investigate two variants of graph convolutional layers and show that learning improves performance considerably on two popular graph-based word spotting benchmarks.
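A symbolic attribute space of this kind can be illustrated with a small sketch in the spirit of pyramidal character attributes (the helper below is a hypothetical simplification, not the authors' exact embedding): a word is encoded as a binary vector indicating which characters occur in which half of the word. Because the same vector can be computed from any string, queries by string become possible once word graphs are mapped into the same space.

```python
def attribute_vector(word, alphabet="abcdefghijklmnopqrstuvwxyz"):
    """Binary attribute vector: for each half of the word, one bit per
    alphabet letter indicating whether it occurs in that half."""
    word = word.lower()
    n = len(word)
    halves = [word[: (n + 1) // 2], word[(n + 1) // 2:]]
    vec = []
    for part in halves:
        vec.extend(1 if c in part else 0 for c in alphabet)
    return vec

# A string query and a word image mapped to the same space can be
# compared directly, e.g. by vector distance.
assert attribute_vector("october") == attribute_vector("October")
```

Real attribute embeddings use several pyramid levels and larger alphabets; this two-level variant only conveys the principle.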
Olivier Ertz, Andreas Fischer, Hatem Ghorbel, Olivier Hüsser, Romain Sandoz, Anna Scius-Bertrand
The International Archives of the Photogrammetry, Remote Sensing and Spatial Information Sciences ; Proceedings of the 6th International Conference on Smart Data and Smart Cities
In this work, we present a framework, supported by mobile and web apps, that proposes personalized pedestrian routes matching a user's mobility profile while considering mobility impediment factors. We explain how the latter have been defined using a pedestrian-centric approach based on travel experiences as perceived in the field by senior citizens. Through workshops, six main factors that may influence pedestrian route choices were revealed: passability, obstacles in the path, surface problems, security, sidewalk width, and slope. These categories were used to build digital tools and guide a citizen participatory approach to collect geolocated obstacle points documented with walkability information (picture, category, impact score, free comment). We also involved citizens, and especially senior referents, in evaluating and validating this information. Finally, we present how we connect these obstacle points with a pedestrian network based on OpenStreetMap to configure a routing cost function. The framework was partially deployed in 2020 with a limited number of participants due to the pandemic. Nonetheless, we share lessons learned from interacting with citizens in the design of such a framework, whose underlying workflow is reproducible. We plan to further assess its relevance and sustainability in the future.
Proceedings of the Medical Imaging with Deep Learning (MIDL 2021), 7 - 9 July 2021, Lübeck, Germany
Supervised learning is conditioned by the availability of labeled data, which are especially expensive to acquire in the field of medical image analysis. Making use of open-source data for pre-training or using domain adaptation can be a way to overcome this issue. However, pre-trained networks often fail to generalize to new test domains that are not distributed identically due to variations in tissue stainings, types, and textures. Additionally, current domain adaptation methods mainly rely on fully-labeled source datasets. In this work, we propose Self-Rule to Adapt (SRA), which takes advantage of self-supervised learning to perform domain adaptation and removes the burden of fully-labeled source datasets. SRA can effectively transfer the discriminative knowledge obtained from a few labeled source domain samples to a new target domain without requiring additional tissue annotations. Our method harnesses both domains' structures by capturing visual similarity with intra-domain and cross-domain self-supervision. We show that our proposed method outperforms baselines across diverse domain adaptation settings and further validate our approach on our in-house clinical cohort.
Linda Studer, Janis Wallau, Heather Dawson, Inti Zlobec, Andreas Fischer
Proceedings of the 25th International Conference on Pattern Recognition (ICPR), 10-15 January 2021, Milan, Italy
We propose to classify intestinal glands as normal or dysplastic using cell-graphs and graph-based deep learning methods. Dysplastic intestinal glands can lead to colorectal cancer, which is one of the three most common cancer types in the world. In order to assess the cancer stage and thus the treatment of a patient, pathologists analyse tissue samples of affected patients. Among other factors, they look at the changes in morphology of different tissues, such as the intestinal glands. Cell-graphs have a high representational power and can describe topological and geometrical properties of intestinal glands. However, classical graph-based methods have a high computational complexity and there is only a limited range of machine learning methods available. In this paper, we propose Graph Neural Networks (GNNs) as an efficient learning-based approach to classify cell-graphs. We investigate different variants of so-called Message Passing Neural Networks and compare them with a classical graph-based approach based on approximate graph edit distance and a k-nearest-neighbours classifier. A promising classification accuracy of 94.8% is achieved by the proposed method on the pT1 Gland Graph dataset, which is an increase of 11.5% over the baseline result.
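One message-passing round can be sketched without any deep learning framework (a plain-Python simplification; real MPNN layers add learned weight matrices and nonlinearities): each node's feature vector is updated with the mean of its own and its neighbours' features, so topological context flows into every node representation.

```python
def message_passing_round(features, edges):
    """One round of mean-aggregation message passing.

    features: dict node -> list of floats
    edges:    list of undirected (u, v) pairs
    Returns updated features: average of own and neighbour features.
    """
    neighbours = {n: [] for n in features}
    for u, v in edges:
        neighbours[u].append(v)
        neighbours[v].append(u)
    updated = {}
    for n, feat in features.items():
        stack = [feat] + [features[m] for m in neighbours[n]]
        updated[n] = [sum(col) / len(stack) for col in zip(*stack)]
    return updated

# On a path graph 0-1-2, node 1 averages itself and both neighbours.
out = message_passing_round({0: [1.0], 1: [0.0], 2: [0.0]}, [(0, 1), (1, 2)])
```

Stacking several such rounds lets information from a gland's wider neighbourhood reach each cell node, which is what distinguishes a GNN from a per-cell classifier.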
Linda Studer, Jannis Wallau, Rolf Ingold, Andreas Fischer
Proceedings of the 7th Swiss Conference on Data Science (SDS), 26 June 2020, Luzern, Switzerland
With the rise of graph neural networks, sometimes also referred to as geometric deep learning, a range of new types of network layers have been introduced. Since this is a very recent development, the design of new architectures relies heavily on intuition and trial and error. In this paper, we evaluate the effect of adding graph pooling layers, which down-sample graphs, to a network, and evaluate the performance on three different datasets. We find that especially for smaller graphs, adding pooling layers should be done with caution, as they can have a negative effect on the overall performance.
Lucy Linder, Michael Jungo, Jean Hennebert, Claudiu Musat, Andreas Fischer
Proceedings of the 12th Conference on Language Resources and Evaluation (LREC 2020), 11-16 May 2020, Marseille, France
This paper presents SwissCrawl, the largest Swiss German text corpus to date. Composed of more than half a million sentences, it was generated using a customized web scraping tool that could be applied to other low-resource languages as well. The approach demonstrates how freely available web pages can be used to construct comprehensive text corpora, which are of fundamental importance for natural language processing. In an experimental evaluation, we show that using the new corpus leads to significant improvements for the task of language modeling. To capture new content, our approach will run continuously to keep increasing the corpus over time.
Paul Maergner, Nicholas R. Howe, Kaspar Riesen, Rolf Ingold, Andreas Fischer
Proceedings of ICFHR 2018, the 16th International Conference on Frontiers in Handwriting Recognition, 5-8 August 2018, Niagara Falls, USA
For handwritten signature verification, signature images are typically represented with fixed-size feature vectors capturing local and global properties of the handwriting. Graph-based representations offer a promising alternative, as they are flexible in size and model the global structure of the handwriting. However, they are only rarely used for signature verification, which may be due to the high computational complexity involved when matching two graphs. In this paper, we take a closer look at two recently presented structural methods for handwriting analysis, for which efficient matching methods are available: keypoint graphs with approximate graph edit distance and inkball models. Inkball models, in particular, have never been used for signature verification before. We investigate both approaches individually and propose a combined verification system, which demonstrates an excellent performance on the MCYT and GPDS benchmark data sets when compared with the state of the art.
Proceedings of Joint IAPR International Workshop, S+SSPR 2018, Beijing, China, 17-19 August 2018
Biometric authentication by means of handwritten signatures is a challenging pattern recognition task, which aims to infer a writer model from only a handful of genuine signatures. In order to make it more difficult for a forger to attack the verification system, a promising strategy is to combine different writer models. In this work, we propose to complement a recent structural approach to offline signature verification based on graph edit distance with a statistical approach based on metric learning with deep neural networks. On the MCYT and GPDS benchmark datasets, we demonstrate that combining the structural and statistical models leads to significant improvements in performance, profiting from their complementary properties.
Pau Riba, Andreas Fischer, Josep Lladós, Alicia Fornés
ICPR 2018, the 24th International Conference on Pattern Recognition, 20-24 August 2018, Beijing, China
Graph representations have been widely used in pattern recognition thanks to their powerful representation formalism and rich theoretical background. A number of error-tolerant graph matching algorithms such as graph edit distance have been proposed for computing a distance between two labelled graphs. However, they typically suffer from a high computational complexity, which makes it difficult to apply these matching algorithms in a real scenario. In this paper, we propose an efficient graph distance based on the emerging field of geometric deep learning. Our method employs a message passing neural network to capture the graph structure and learns a metric with a siamese network approach. The performance of the proposed graph distance is validated in two application cases, graph classification and graph retrieval of handwritten words, and shows a promising performance when compared with (approximate) graph edit distance benchmarks.
Baptiste Wicht, Andreas Fischer, Jean Hennebert
Proceedings of the 2018 International Conference on High Performance Computing & Simulation (HPCS 2018), The 16th Annual Meeting, 16-20 July 2018, Orléans, France
Expression Templates is a technique that allows writing linear algebra code in C++ the same way it would be written on paper. It is also used extensively as a performance optimization technique, especially in its Smart Expression Templates form, which allows for even higher performance. It has proved to be very efficient for computation on a Central Processing Unit (CPU). However, due to its design, it is not easily implemented on a Graphics Processing Unit (GPU). In this paper, we devise a set of techniques to allow the seamless evaluation of Smart Expression Templates on the GPU. The execution is transparent for the user of the library, who still uses matrices and vectors as if they were on the CPU and profits from the performance and higher multi-processing capabilities of the GPU. We also show that the GPU version is significantly faster than the CPU version, without any change to the user's code.
Roman Schindler, Manuel Bouillon, Réjean Plamondon, Andreas Fischer
Proceedings of ICPRAI 2018 - International Conference on Pattern Recognition and Artificial Intelligence, Celebrating the 30th Anniversary of CENPARMI, 14-17 May 2018 + Public Lecture on 13 May 2018, Concordia University, Montréal, Canada
The Kinematic Theory of rapid human movements and its Sigma-Lognormal model make it possible to model human gestures, in particular complex handwriting patterns such as words, signatures, and free gestures. This paper investigates the extension of the theory and its Sigma-Lognormal model from two dimensions to three, taking into account new acquisition modalities (motion capture), multiple subjects, and unconstrained motions. Despite the increased complexity and the new acquisition modalities, we demonstrate that the Sigma-Lognormal model can be successfully generalized to describe 3D human movements. Starting from the 2D model, we replace circular with spherical motions to derive a representation of unconstrained human movements with a new 3D Sigma-Lognormal model. First experiments show a high reconstruction quality with an average signal-to-noise ratio (SNR) of 18.52 dB on the HDM05 dataset. Gesture recognition using dynamic time warping (DTW) achieves similar recognition accuracies when using original and reconstructed gestures, which confirms the high quality of the proposed model.
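The lognormal building block of the model can be written down directly (a minimal sketch of a single stroke's speed profile; the parameter values are illustrative, not taken from the paper): each stroke contributes a velocity magnitude |v(t)| = D / (σ√(2π)(t − t0)) · exp(−(ln(t − t0) − μ)² / (2σ²)), and a full movement is a time-shifted sum of such lognormals.

```python
import math

def lognormal_speed(t, D=1.0, t0=0.0, mu=-1.5, sigma=0.3):
    """Speed profile of a single Sigma-Lognormal stroke at time t.
    D: amplitude, t0: stroke onset time, mu/sigma: log-time scale and shape."""
    if t <= t0:
        return 0.0  # the stroke has not started yet
    x = t - t0
    return D / (sigma * math.sqrt(2 * math.pi) * x) * math.exp(
        -((math.log(x) - mu) ** 2) / (2 * sigma ** 2)
    )

# A standard lognormal fact: the profile peaks at t0 + exp(mu - sigma^2).
peak_t = math.exp(-1.5 - 0.3 ** 2)
```

Fitting the model to recorded trajectories means estimating (D, t0, mu, sigma) plus direction parameters for each stroke, which is where the 2D-to-3D extension (circular to spherical motions) comes in.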
Proceedings of DAS 2018 : 13th IAPR International Workshop on Document Analysis Systems, 24-27 April 2018, Vienna, Austria
Scanned handwritten historical documents are often not well accessible due to the limited feasibility of automatic full transcriptions. Thus, Keyword Spotting (KWS) has been proposed as an alternative to retrieve arbitrary query words from such documents. In the present paper, word images are represented by means of graphs. That is, a graph is used to represent the inherent topological characteristics of handwriting. The actual keyword spotting is then based on matching a query graph with all document graphs. In particular, we make use of a fast graph matching algorithm that considers the contextual substructure of nodes. The motivation for this inclusion of node context is to increase the overall KWS accuracy. In an experimental evaluation on four historical documents, we show that the proposed procedure clearly outperforms diverse other template-based reference systems. Moreover, our novel framework keeps up with or even outperforms many state-of-the-art learning-based KWS approaches.
Paul Maergner, Kaspar Riesen, Rolf Ingold, Andreas Fischer
Proceedings of 2017 14th IAPR International Conference on Document Analysis and Recognition, 9-15 November 2017, Kyoto, Japan
Graphs provide a powerful representation formalism for handwritten signatures, capturing local properties as well as their relations. Yet, although introduced early for signature verification, only a few current systems rely on graph-based representations. A possible reason is the high computational complexity involved for matching two general graphs. In this paper, we introduce a novel structural approach to offline signature verification using an efficient cubic-time approximation of graph edit distance. We put forward several ways of creating, normalizing, and comparing signature graphs built from keypoints and investigate their performance on three benchmark datasets. The experiments demonstrate a promising performance of the proposed structural approach when compared with the state of the art.
Proceedings of the 2017 14th IAPR International Conference on Document Analysis and Recognition (ICDAR), 9-15 November 2017, Kyoto, Japan
Keyword Spotting (KWS) offers a convenient way to improve the accessibility to historical handwritten documents by retrieving search terms in scanned document images. The approach for KWS proposed in the present paper is based on segmented word images that are represented by means of different types of graphs. The actual keyword spotting is based on matching a query graph with a set of document graphs using the concept of graph edit distance. In particular, we propose to employ ensemble methods for KWS with graphs. That is, a query graph is not matched against one but several different graphs representing the same document word. Eventually, we use different strategies to combine these individual graph dissimilarities. In an experimental evaluation on two benchmark datasets, the proposed ensemble methods outperform the individual ensemble members as well as four state-of-the-art reference systems based on dynamic time warping.
Proceedings of the International Workshop on Graph-based Representations in Pattern Recognition (GbRPR 2017), 16-18 May 2017, Anacapri, Italy ; Lecture Notes in Computer Science
The present paper is concerned with graph edit distance, which is widely accepted as one of the most flexible graph dissimilarity measures available. A recent algorithmic framework for approximating the graph edit distance overcomes the major drawback of this distance model, viz. its exponential time complexity. Yet, this particular approximation generally suffers from an overestimation of the true edit distance. The overall aim of the present paper is to improve the distance quality of this approximation by means of a post-processing search procedure. The employed search procedure is based on the idea of simulated annealing, which turns out to be particularly suitable for complex optimization problems. In an experimental evaluation on several graph data sets, the benefit of this extension is empirically confirmed.
Lecture Notes in Computer Science ; Proceedings of International Workshop on Graph-based Representations in Pattern Recognition (GbRPR 2017), 16-18 May 2017, Anacapri, Italy
The present paper is concerned with a graph-based system for Keyword Spotting (KWS) in historical documents. This particular system operates on segmented words that are in turn represented as graphs. The basic KWS process employs the cubic-time bipartite matching algorithm (BP). Yet, even though this graph matching procedure is relatively efficient, the computation time is a limiting factor for processing large volumes of historical manuscripts. In order to speed up our framework, we propose a novel fast rejection heuristic. This heuristic compares the node distribution of the query graph and the document graph in a polar coordinate system. This comparison can be accomplished in linear time. If the node distributions are similar enough, the BP matching is actually carried out (otherwise the document graph is rejected). In an experimental evaluation on two benchmark datasets we show that about 50% or more of the matchings can be omitted with this procedure while the KWS accuracy is not negatively affected.
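The linear-time rejection test can be sketched as follows (bin count and threshold below are illustrative assumptions, not the paper's values): node coordinates are mapped to polar angles around the graph's centroid and histogrammed, and two graphs are compared by histogram distance before any expensive bipartite matching is attempted.

```python
import math

def angle_histogram(nodes, bins=8):
    """Normalized histogram of node angles around the centroid (linear time)."""
    cx = sum(x for x, _ in nodes) / len(nodes)
    cy = sum(y for _, y in nodes) / len(nodes)
    hist = [0] * bins
    for x, y in nodes:
        a = math.atan2(y - cy, x - cx) % (2 * math.pi)
        hist[int(a / (2 * math.pi) * bins) % bins] += 1
    return [h / len(nodes) for h in hist]

def fast_reject(query_nodes, doc_nodes, threshold=0.5):
    """Skip the costly bipartite matching if the node distributions differ."""
    hq, hd = angle_histogram(query_nodes), angle_histogram(doc_nodes)
    l1 = sum(abs(a - b) for a, b in zip(hq, hd))
    return l1 > threshold  # True -> reject without running full matching
```

Because the histograms are computed relative to each graph's centroid, the test is translation-invariant, which matters when the same character appears at different positions on a page.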
Michael Stauffer, Thomas Tschachtli, Andreas Fischer, Kaspar Riesen
About ten years ago, a novel graph edit distance framework based on bipartite graph matching was introduced. This particular framework allows the approximation of graph edit distance in cubic time. This, in turn, makes the concept of graph edit distance applicable to larger graphs as well. In the last decade, the corresponding paper has been cited more than 360 times. Besides various extensions from the methodological point of view, we also observe a great variety of applications that make use of the bipartite graph matching framework. The present paper aims at giving a first survey of these applications, stemming from six different categories (ranging from document analysis, over biometrics, to malware detection).
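The core of the bipartite framework can be sketched as an assignment problem over node edit costs (a toy version with unit costs and node labels only; the real framework also encodes local edge structure in the costs and solves the assignment in cubic time with an LSAP solver such as the Hungarian algorithm, whereas this sketch brute-forces tiny instances):

```python
from itertools import permutations

def approx_ged(labels_a, labels_b, sub_cost=1.0, indel_cost=1.0):
    """Approximate graph edit distance from an optimal node assignment.

    labels_a / labels_b: node label lists of the two graphs. Each side is
    padded with None ("epsilon" nodes) so that deletions and insertions
    become ordinary assignments as well.
    """
    n, m = len(labels_a), len(labels_b)
    size = n + m
    a = labels_a + [None] * m
    b = labels_b + [None] * n

    def cost(la, lb):
        if la is None and lb is None:
            return 0.0          # epsilon to epsilon: free
        if la is None or lb is None:
            return indel_cost   # insertion or deletion
        return 0.0 if la == lb else sub_cost  # (non-)identical substitution

    # Brute-force optimal assignment; fine for toy graphs only.
    return min(
        sum(cost(a[i], b[j]) for i, j in enumerate(perm))
        for perm in permutations(range(size))
    )

assert approx_ged(["x", "y"], ["x", "y"]) == 0.0
```

The assignment's total cost upper-bounds the true edit distance, which is exactly the overestimation that the survey's methodological extensions try to reduce.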
Nicholas R. Howe, Andreas Fischer, Baptiste Wicht
Proceedings of the 2016 15th International Conference on Frontiers in Handwriting Recognition (ICFHR), 23-26 October 2016, Shenzhen, China
Inkball models provide a tool for matching and comparison of spatially structured markings such as handwritten characters and words. Hidden Markov models offer a framework for decoding a stream of text in terms of the most likely sequence of causal states. Prior work with HMM has relied on observation of features that are correlated with underlying characters, without modeling them directly. This paper proposes to use the results of inkball-based character matching as a feature set input directly to the HMM. Experiments indicate that this technique outperforms other tested methods at handwritten word recognition on a common benchmark when applied without normalization or text deslanting.
Proceedings of the 2016 23rd International Conference on Pattern Recognition (ICPR), 4-8 December 2016, Cancun, Mexico
Deep learning had a significant impact on diverse pattern recognition tasks in the recent past. In this paper, we investigate its potential for keyword spotting in handwritten documents by designing a novel feature extraction system based on Convolutional Deep Belief Networks. Sliding window features are learned from word images in an unsupervised manner. The proposed features are evaluated both for template-based word spotting with Dynamic Time Warping and for learning-based word spotting with Hidden Markov Models. In an experimental evaluation on three benchmark data sets with historical and modern handwriting, it is shown that the proposed learned features outperform three standard sets of handcrafted features.
Proceedings of Joint IAPR International Workshops on Statistical Techniques in Pattern Recognition (SPR) and Structural and Syntactic Pattern Recognition (SSPR), 29 November-2 December 2016, Mérida, Mexico
The amount of handwritten documents that is digitally available is rapidly increasing. However, we observe a certain lack of accessibility to these documents especially with respect to searching and browsing. This paper aims at closing this gap by means of a novel method for keyword spotting in ancient handwritten documents. The proposed system relies on a keypoint-based graph representation for individual words. Keypoints are characteristic points in a word image that are represented by nodes, while edges are employed to represent strokes between two keypoints. The basic task of keyword spotting is then conducted by a recent approximation algorithm for graph edit distance. The novel framework for graph-based keyword spotting is tested on the George Washington dataset on which a state-of-the-art reference system is clearly outperformed.
For several decades, graphs have served as a powerful and flexible representation formalism in pattern recognition and related fields. For instance, graphs have been employed for specific tasks in image and video analysis, bioinformatics, or network analysis. Yet, graphs are only rarely used when it comes to handwriting recognition. One possible reason for this observation might be the increased complexity of many algorithmic procedures that take graphs, rather than feature vectors, as their input. However, with the rise of efficient graph kernels and fast approximative graph matching algorithms, graph-based handwriting representation could become a versatile alternative to traditional methods. This paper aims at making a seminal step towards promoting graphs in the field of handwriting recognition. In particular, we introduce a set of six different graph formalisms that can be employed to represent handwritten word images. The different graph representations for words are analysed in a classification experiment (using a distance-based classifier). The results of this word classifier provide a benchmark for further investigations.
Andreas Fischer, Pascal Buchs, Maurizio Caon, Omar Abou Khaled, Elena Mugellini, Sara Grimm, Franziska Meyer, Claudia Wagner, Valentine Bernasconi, Angelika Garz
Proceedings of the 28th Francophone Conference on Human-Computer Interaction (IHM'16), 25-28 October 2016, Fribourg, Switzerland
Real-time interaction with augmented reality represents a new material with which dancers and choreographers can work for their performances. It allows dancers to go beyond mere synchronization between music and movement and opens up new opportunities, such as modifying the audio-visual environment and reacting to its changes. In this article, we present the process and the result of a collaborative work between art and technology, which made it possible to explore this new material in the context of the performance Nautilus. We propose an approach based on body tracking with a 3D camera and on avatars composed of pixel clouds; this approach allows dancers to interact reliably with augmented reality while retaining their freedom of movement.
Proceedings of the 7th IAPR TC3 Workshop, Artificial Neural Networks in Pattern Recognition (ANNPR) 2016, 28-30 September 2016, Ulm, Germany
Graph edit distance is one of the most popular graph matching paradigms available. By means of a reformulation of graph edit distance to an instance of a linear sum assignment problem, the major drawback of this dissimilarity model, viz. the exponential time complexity, has been invalidated recently. Yet, the substantial decrease of the computation time is at the expense of an approximation error. The present paper introduces a novel transformation that processes the underlying cost model into a utility model. The benefit of this transformation is that it enables the integration of additional information in the assignment process. We empirically confirm the positive effects of this transformation on three standard graph data sets. That is, we show that the accuracy of a distance based classifier can be improved with the proposed transformation while the run time remains nearly unaffected.
Pascal Wicht, Andreas Fischer, Jean Hennebert
Proceedings of the 25th International Conference on Artificial Neural Networks and Machine Learning (ICANN), 6-9 September 2016, Barcelona, Spain
To spot keywords on handwritten documents, we present a hybrid keyword spotting system, based on features extracted with Convolutional Deep Belief Networks and using Dynamic Time Warping for word scoring. Features are learned from word images, in an unsupervised manner, using a sliding window to extract horizontal patches. For two single writer historical data sets, it is shown that the proposed learned feature extractor outperforms two standard sets of features.
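The word-scoring step described above can be illustrated with a minimal sketch of classic dynamic time warping over feature sequences. This is not the paper's implementation: the feature extractor there is a Convolutional Deep Belief Network, whereas here the inputs are simply generic per-window feature vectors.

```python
import numpy as np

def dtw_distance(a, b):
    """Classic dynamic time warping between two feature sequences.

    a, b: 2-D arrays of shape (length, n_features), e.g. one feature
    vector per sliding-window position of a word image.
    """
    n, m = len(a), len(b)
    # d[i, j] = cost of the best warping path aligning a[:i] with b[:j]
    d = np.full((n + 1, m + 1), np.inf)
    d[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost = np.linalg.norm(a[i - 1] - b[j - 1])
            d[i, j] = cost + min(d[i - 1, j], d[i, j - 1], d[i - 1, j - 1])
    return d[n, m]
```

For keyword spotting, the DTW distance between a query word's feature sequence and each candidate word image yields a ranking; lower distances indicate better matches.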
Although Graphics Processing Units (GPUs) seem to currently be the best platform to train machine learning models, most research laboratories are still only equipped with standard CPU systems. In this paper, we investigate multiple techniques to speedup the training of Restricted Boltzmann Machine (RBM) models and Convolutional RBM (CRBM) models on CPU with the Contrastive Divergence (CD) algorithm. Experimentally, we show that the proposed techniques can reduce the training time by up to 30 times for RBM and up to 12 times for CRBM, on a data set of handwritten digits.
Angelika Garz, Mathias Seuret, Fotini Simistira, Andreas Fischer, Rolf Ingold
Proceedings of 12th IAPR Workshop on Document Analysis Systems (DAS), 11-14 April 2016, Santorini, Greece
Ground truth is both indispensable for training and evaluating document analysis methods and very tedious to create manually. This especially holds true for complex historical manuscripts that exhibit challenging layouts with interfering and overlapping handwriting. In this paper, we propose a novel semi-automatic system to support layout annotations in such a scenario based on document graphs and a pen-based scribbling interaction. On the one hand, document graphs provide a sparse page representation that is already close to the desired ground truth; on the other hand, scribbling facilitates an efficient and convenient pen-based interaction with the graph. The performance of the system is demonstrated in the context of a newly introduced database of historical manuscripts with complex layouts.
Andreas Fischer, Moises Diaz, Réjean Plamondon, Miguel A. Ferrer
Proceedings of the 2015 13th International Conference on Document Analysis and Recognition (ICDAR), 23-26 August 2015, Tunis, Tunisia
In the field of automatic signature verification, a major challenge for statistical analysis and pattern recognition is the small number of reference signatures per user. Score normalization, in particular, is challenged by the lack of information about intra-user variability. In this paper, we analyze several approaches to score normalization for dynamic time warping and propose a new two-stage normalization which detects simple forgeries in a first stage and copes with more skilled forgeries in a second stage. An experimental evaluation is conducted on two data sets with different characteristics, namely the MCYT online signature corpus, which contains over three hundred users, and the SUSIG visual sub-corpus, which contains highly skilled forgeries. The results demonstrate that score normalization is a key component for signature verification and that the proposed two-stage normalization achieves some of the best results on these difficult data sets both for random and for skilled forgeries.
Proceedings of 2015 13th International Conference on Document Analysis and Recognition (ICDAR), 23-26 August 2015, Tunis, Tunisia
What can be done with only one enrolled real handwritten signature in Automatic Signature Verification (ASV)? Using 5 or 10 signatures for training is the most common setting when evaluating ASV. In the rarely addressed case where only one signature is available for training, we propose to use modified duplicates. Our novel technique relies on a fully neuromuscular representation of the signatures based on the Kinematic Theory of rapid human movements and its Sigma-Lognormal model. This way, a real on-line signature is converted into the Sigma-Lognormal model domain. The model parameters are then varied to generate new duplicated signatures.
Hao Wei, Mathias Seuret, Kai Chen, Andreas Fischer, Marcus Liwicki, Rolf Ingold
Proceedings of HIP '15: Proceedings of the 3rd International Workshop on Historical Document Imaging and Processing, 22 August 2015, Nancy, France
Automatic layout analysis of historical documents has to cope with a large number of different scripts, writing supports, and digitization qualities. Under these conditions, the design of robust features for machine learning is a highly challenging task. We use convolutional autoencoders to learn features from the images. In order to increase the classification accuracy and to reduce the feature dimension, in this paper we propose a novel feature selection method. The method cascades adapted versions of two conventional methods. Compared to three conventional methods and our previous work, the proposed method achieves a higher classification accuracy in most cases, while maintaining a low feature dimension. In addition, we find that a significant number of autoencoder features are redundant or irrelevant for the classification, and we discuss possible explanations. To the best of our knowledge, this paper is one of the first investigations in the field of image processing on the detection of redundancy and irrelevance of autoencoder features using feature selection.
Mathias Seuret, Andreas Fischer, Angelika Garz, Marcus Liwicki, Rolf Ingold
Proceedings of HIP '15: Proceedings of the 3rd International Workshop on Historical Document Imaging and Processing, 22 August 2015, Nancy, France
The term "historical documents" encompasses an enormous variety of document types considering different scripts, languages, writing supports, and degradation degrees. For automatic processing with machine learning and pattern recognition methods, it would be ideal to share labeled learning samples and trained statistical models across similar documents, avoiding a retraining from scratch for every historical document anew. In this paper, we propose using the reconstruction error of autoencoders to compare historical manuscripts with the goal of clustering them according to their visual appearance. A low reconstruction error suggests visual similarity between a new manuscript and a known manuscript, for which the autoencoder was trained in an unsupervised fashion. Preliminary experiments conducted on 10 different manuscripts written with ink on parchment demonstrate the ability of the reconstruction error to group similar writing styles. For discriminating between Carolingian and cursive script, in particular, near-perfect results are reported.
Albert Bou Hernandez, Andreas Fischer, Réjean Plamondon
Proceedings of the 17th Biennial Conference of the International Graphonomics Society, International Graphonomics Society (IGS), 21-24 June 2015, Pointe-à-Pitre, Guadeloupe
Predictive tools are widely regarded as one of the most effective means of preventing illnesses that strike suddenly. Within this context, investigations linking fine human motor control with brain stroke risk factors are considered to have a high potential, but they are still at an early stage of research. The present paper analyses neuromuscular features of oscillatory movements based on the Omega-Lognormal model of the Kinematic Theory. On a database of oscillatory movements from 120 subjects, we demonstrate that the proposed features differ significantly between subjects with and without brain stroke risk factors. This promising result motivates the development of predictive tools based on the Omega-Lognormal model.
The Sigma-Lognormal model of the Kinematic Theory of rapid human movements allows us to represent online signatures with an analytical neuromuscular model. It has been successfully used in the past to generate synthetic signatures in order to improve the performance of an automatic verification system. In this paper, we attempt for the first time to build a verification system based on the model parameters themselves. For describing individual lognormal strokes, we propose eighteen features which capture cognitive psychomotor characteristics of the signer. They are matched by means of dynamic time warping to derive a dissimilarity measure for signature verification. Promising initial results are reported for an experimental evaluation on the SUSIG visual sub-corpus, which contains some of the most skilled forgeries currently available for research.
Andreas Fischer, Seiichi Uchida, Volkmar Frinken, Kaspar Riesen, Horst Bunke
Proceedings of International Workshop on Graph-Based Representations in Pattern Recognition ; GbRPR 2015: Graph-Based Representations in Pattern Recognition, 13-15 May 2015, Beijing, China
In order to cope with the exponential time complexity of graph edit distance, several polynomial-time approximation algorithms have been proposed in recent years. The Hausdorff edit distance is a quadratic-time matching procedure for labeled graphs which reduces the edit distance to a correspondence problem between local substructures. In its original formulation, nodes and their adjacent edges have been considered as local substructures. In this paper, we integrate a more general structural node context into the matching procedure based on hierarchical subgraphs. In an experimental evaluation on diverse graph data sets, we demonstrate that the proposed generalization of Hausdorff edit distance can significantly improve the accuracy of graph classification while maintaining low computational complexity.
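The core idea of Hausdorff edit distance, matching each node greedily against its cheapest counterpart in the other graph rather than solving a global assignment, can be sketched as follows. This is a simplified illustration under assumed unit deletion/insertion costs: it operates on node labels only and omits both the adjacent-edge costs of the original formulation and the hierarchical subgraph context introduced in the paper.

```python
def hausdorff_edit_distance(g1, g2, node_cost, del_cost=1.0):
    """Quadratic-time lower-bound style approximation of graph edit distance.

    g1, g2: lists of node labels (edge structure omitted in this sketch).
    node_cost: substitution cost between two node labels.
    del_cost: assumed cost of deleting or inserting a node.
    """
    def directed(a, b):
        total = 0.0
        for u in a:
            # Each node either matches its cheapest counterpart
            # (charging half the substitution cost to each direction)
            # or is deleted outright.
            best = min((node_cost(u, v) / 2.0 for v in b), default=del_cost)
            total += min(best, del_cost)
        return total

    # Sum the two directed matching costs, as in a Hausdorff-style distance.
    return directed(g1, g2) + directed(g2, g1)
```

Because every node is matched independently, the procedure runs in O(|g1| * |g2|) time, which is what makes the method attractive compared to cubic-time assignment-based approximations.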
Kaspar Riesen, Miquel Ferrer, Andreas Fischer, Horst Bunke
The basic idea of a recent graph matching framework is to reduce the problem of graph edit distance (GED) to an instance of a linear sum assignment problem (LSAP). The optimal solution for this simplified GED problem can be computed in cubic time and is eventually used to derive a suboptimal solution for the original GED problem. Yet, for large-scale graphs and/or large-scale graph sets, the cubic time complexity remains a severe handicap of this procedure. Therefore, we propose to use suboptimal algorithms, with quadratic rather than cubic time complexity, for solving the underlying LSAP. In particular, we introduce several greedy assignment algorithms for approximating GED. In an experimental evaluation we show that there is great potential for further speeding up the GED computation. Moreover, we empirically confirm that the distances obtained by this procedure remain sufficiently accurate for graph-based pattern classification.
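The trade-off described above can be illustrated with a minimal greedy LSAP solver: each row is assigned in order to its cheapest still-free column, giving quadratic instead of cubic runtime at the price of possibly suboptimal assignments. This is a generic sketch, not one of the specific greedy variants studied in the paper, and it assumes a square cost matrix without deletion/insertion entries.

```python
import numpy as np

def greedy_assignment(cost):
    """Greedy suboptimal solver for the linear sum assignment problem.

    Processes rows in order and assigns each to its cheapest still-free
    column: O(n^2) overall, versus O(n^3) for the optimal Hungarian
    algorithm.
    """
    n = cost.shape[0]
    free_cols = list(range(n))
    total = 0.0
    for i in range(n):
        j = min(free_cols, key=lambda col: cost[i, col])
        total += cost[i, j]
        free_cols.remove(j)
    return total
```

Note that the greedy choice can be arbitrarily far from optimal: for the matrix [[1, 2], [1, 100]] it commits row 0 to column 0 and is then forced to pay 100 for row 1, whereas the optimal assignment costs 3. The paper's experiments show that, in practice, such suboptimal assignments still yield distances accurate enough for classification.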