Publications
2024
- VHS: High-Resolution Iterative Stereo Matching with Visual Hull Priors. Markus Plack, Hannah Droege, Leif Van Holland, and 1 more author. arXiv preprint arXiv:2406.02552, 2024
We present a stereo-matching method for depth estimation from high-resolution images using visual hulls as priors, and a memory-efficient technique for the correlation computation. Our method uses object masks extracted from supplementary views of the scene to guide the disparity estimation, effectively reducing the search space for matches. This approach is specifically tailored to stereo rigs in volumetric capture systems, where accurate depth plays a key role in the downstream reconstruction task. To enable training and regression at the high resolutions targeted by recent systems, our approach extends a sparse correlation computation into a hybrid sparse-dense scheme suitable for application in leading recurrent network architectures. We evaluate the performance-efficiency trade-off of our method compared to state-of-the-art methods, and demonstrate the efficacy of the visual hull guidance. In addition, we propose a training scheme for a further reduction of memory requirements during optimization, facilitating training on high-resolution data.
- Kissing to Find a Match: Efficient Low-Rank Permutation Representation. Hannah Droege, Zorah Laehner, Yuval Bahat, and 3 more authors. Advances in Neural Information Processing Systems, 2024
Permutation matrices play a key role in matching and assignment problems across many fields, especially in computer vision and robotics. However, the memory for explicitly representing permutation matrices grows quadratically with the size of the problem, prohibiting large problem instances. In this work, we propose to tackle the curse of dimensionality of large permutation matrices by approximating them using low-rank matrix factorization, followed by a nonlinearity. To this end, we rely on kissing number theory to infer the minimal rank required for representing a permutation matrix of a given size, which is significantly smaller than the problem size. This leads to a drastic reduction in computation and memory costs, e.g., up to 3 orders of magnitude less memory for a problem of size n=20000, represented using 8.4×10^5 elements in two small matrices instead of using a single huge matrix with 4×10^8 elements. The proposed representation allows for accurate representations of large permutation matrices, which in turn enables handling large problems that would have been infeasible otherwise. We demonstrate the applicability and merits of the proposed approach through a series of experiments on a range of problems that involve predicting permutation matrices, from linear and quadratic assignment to shape matching problems.
- Robustness and Exploration of Variational and Machine Learning Approaches to Inverse Problems: An Overview. Alexander Auras, Kanchana Vaishnavi Gandikota, Hannah Droege, and 1 more author. arXiv preprint arXiv:2402.12072, 2024
This paper attempts to provide an overview of current approaches for solving inverse problems in imaging using variational methods and machine learning. A special focus lies on point estimators and their robustness against adversarial perturbations. In this context, results of numerical experiments for a one-dimensional toy problem are provided, showing the robustness of different approaches and empirically verifying theoretical guarantees. Another focus of this review is the exploration of the subspace of data-consistent solutions through explicit guidance to satisfy specific semantic or textural properties.
2023
- Evaluating Adversarial Robustness of Low dose CT Recovery. Kanchana Vaishnavi Gandikota, Paramanand Chandramouli, Hannah Droege, and 1 more author. In Medical Imaging with Deep Learning, 2023
Low dose computed tomography (CT) acquisition using reduced radiation or sparse angle measurements is recommended to decrease the harmful effects of X-ray radiation. Recent works successfully apply deep networks to the problem of low dose CT recovery on benchmark datasets. However, their robustness needs a thorough evaluation before use in clinical settings. In this work, we evaluate the robustness of different deep learning approaches and classical methods for CT recovery. We show that deep networks, including model-based networks encouraging data consistency, are more susceptible to untargeted attacks. Surprisingly, we observe that data consistency is not heavily affected even for these poor quality reconstructions, motivating the need for better regularization for the networks. We demonstrate the feasibility of universal attacks and study attack transferability across different methods. We analyze robustness to attacks causing localized changes in clinically relevant regions. Both classical approaches and deep networks are affected by such attacks, leading to changes in the visual appearance of localized lesions, for extremely small perturbations. As the resulting reconstructions have high data consistency with the original measurements, these localized attacks can be used to explore the solution space of the CT recovery problem.
- On the Confluence of Machine Learning and Model-Based Energy Minimization Methods for Computer Vision. Hannah Droege. Universität Siegen, 2023
Deep learning has achieved great success in the field of computer vision across a wide range of applications. However, learning-based methods still have several limitations, particularly in terms of interpretability and guarantees. In contrast, traditional model-based computer vision techniques, built on explicit models that are derived from our understanding of the specific problem domain, offer a different and interpretable approach to addressing these challenges. In this work, we analyze and further develop hybrid approaches that combine model-based and learning-based methods in computer vision, introducing four different approaches. We analyze the capabilities of both model-based and learning-based methods, discuss the value of deep learning for underdetermined problems, present an extended approach to incorporate learning directly into the optimization process, and address problems where the challenge lies in the intrinsic formulation of the problem itself. Thereby we deal with different application areas in the field of computer vision. We start with studying segmentation problems on a single image, given only user input in the form of scribbles drawn in the color images, and analyze the performance of learning-based methods to incorporate the scribble information, compared to a cleverly designed model-based approach. Further, we address reconstruction problems, focusing on underdetermined computed tomography reconstructions of lung scans. We integrate a learning-based regularizer into the reconstruction process and explore the space of possible data-consistent reconstructions corresponding to various degrees of pathological malignancy. Also, to integrate neural networks into model-based approaches, we build on recent studies, which aim to learn iterative descent directions for minimizing model-based cost functions. By applying Moreau-Yosida regularization, we introduce a method that avoids the need for differentiability. This is a significant improvement over previous approaches, which are limited to continuously differentiable cost functions. For solving matching and assignment problems, we introduce an approach that approximates large permutation matrices and reduces computation and memory costs by non-linear low-rank matrix factorization. We experimentally demonstrate its performance across various model- and learning-based methods.
2022
- Explorable Data Consistent CT Reconstruction. Hannah Droege, Yuval Bahat, Felix Heide, and 1 more author. In British Machine Vision Conference, 2022
Computed Tomography (CT) is an indispensable tool for the detection and assessment of various medical conditions. This, however, comes at the cost of the health risks entailed in the usage of ionizing X-ray radiation. Sparse-view CT aims to minimize these risks, as well as to reduce scan times, by capturing fewer X-ray projections, which correspond to fewer projection angles. However, the lack of sufficient projections may introduce significant ambiguity when solving the ill-posed inverse CT reconstruction problem, which may hinder the medical interpretation of the results. We propose a method for resolving these ambiguities by conditioning image reconstruction on different possible semantic meanings. We demonstrate our method on the task of identifying malignant lung nodules in chest CT. To this end, we exploit a pre-trained malignancy classifier for producing an array of possible reconstructions corresponding to different malignancy levels, rather than outputting a single image corresponding to an arbitrary medical interpretation. The data consistency of all reconstructions produced by our method then facilitates performing a reliable and informed diagnosis (e.g., by a medical doctor).
- Non-smooth Energy Dissipating Networks. Hannah Droege, Thomas Moellenhoff, and Michael Moeller. In IEEE International Conference on Image Processing, 2022
Over the past decade, deep neural networks have been shown to perform extremely well on a variety of image reconstruction tasks. Such networks do, however, fail to provide guarantees about these predictions, making them difficult to use in safety-critical applications. Recent works addressed this problem by combining model- and learning-based approaches, e.g., by forcing networks to iteratively minimize a model-based cost function via the prediction of suitable descent directions. While previous approaches were limited to continuously differentiable cost functions, this paper discusses a way to remove the restriction of differentiability. We propose to use the Moreau-Yosida regularization of such costs to make the framework of energy dissipating networks applicable. We demonstrate our framework on two exemplary applications, i.e., safeguarding energy dissipating denoising networks to the expected distribution of the noise as well as enforcing binary constraints on bar-code deblurring networks to improve their respective performances.
2021
- Learning or Modelling? An Analysis of Single Image Segmentation Based on Scribble Information. Hannah Droege and Michael Moeller. In IEEE International Conference on Image Processing, 2021
Single image segmentation based on scribbles is an important technique in several applications, e.g., for image editing software. In this paper, we investigate the scope of single image segmentation solely given the image and scribble information using both convolutional neural networks as well as classical model-based methods, and present three main findings: 1) Despite the success of deep learning in the semantic analysis of images, networks fail to outperform model-based approaches in the case of learning on a single image only. Even using a pretrained network for transfer learning does not yield faithful segmentations. 2) The best way to utilize an annotated data set is by exploiting a model-based approach that combines semantic features of a pretrained network with the RGB information, and 3) allowing the network's prediction to change spatially and additionally enforcing this variation to be smooth via a gradient-based regularization term on the loss (double backpropagation) is the most successful strategy for pure single image learning-based segmentation.
- Mitral Valve Segmentation Using Robust Nonnegative Matrix Factorization. Hannah Droege, Baichuan Yuan, Rafael Llerena, and 3 more authors. Journal of Imaging, 2021
Analyzing and understanding the movement of the mitral valve is of vital importance in cardiology, as the treatment and prevention of several serious heart diseases depend on it. Unfortunately, large amounts of noise as well as a highly varying image quality make the automatic tracking and segmentation of the mitral valve in two-dimensional echocardiographic videos challenging. In this paper, we present a fully automatic and unsupervised method for segmentation of the mitral valve in two-dimensional echocardiographic videos, independently of the echocardiographic view. We propose a bias-free variant of the robust non-negative matrix factorization (RNMF) along with a window-based localization approach that is able to identify the mitral valve in several challenging situations. We improve the average F1-score on our dataset of 10 echocardiographic videos by 0.18 to an F1-score of 0.56.
2020
- Inverting Gradients - How Easy is it to Break Privacy in Federated Learning? Jonas Geiping, Hartmut Bauermeister, Hannah Droege, and 1 more author. Advances in Neural Information Processing Systems, 2020
The idea of federated learning is to collaboratively train a neural network on a server. Each user receives the current weights of the network and in turn sends parameter updates (gradients) based on local data. This protocol has been designed not only to train neural networks data-efficiently, but also to provide privacy benefits for users, as their input data remains on device and only parameter gradients are shared. But how secure is sharing parameter gradients? Previous attacks have provided a false sense of security, by succeeding only in contrived settings, even for a single image. However, by exploiting a magnitude-invariant loss along with optimization strategies based on adversarial attacks, we show that it is actually possible to faithfully reconstruct images at high resolution from the knowledge of their parameter gradients, and demonstrate that such a break of privacy is possible even for trained deep networks. We analyze the effects of architecture as well as parameters on the difficulty of reconstructing an input image and prove that any input to a fully connected layer can be reconstructed analytically, independent of the remaining architecture. Finally, we discuss settings encountered in practice and show that even averaging gradients over several iterations or several images does not protect the user’s privacy in federated learning applications.