Keynote: Sonali Parbhoo (Imperial College London, UK)
Title: Mind your (Evaluation) Metrics: Towards Actionable Off-Policy Evaluation for Safety-Critical Settings
Abstract: In high-stakes decision-making domains of reinforcement learning such as healthcare, off-policy evaluation (OPE) is necessary for practitioners to understand the performance of a new policy before deploying it. However, existing OPE estimators often exhibit high bias and variance in settings with large, combinatorial action spaces. Such action spaces result in poor sample overlap between the off-policy data and the policy to be evaluated. In this talk, I introduce a family of decomposed OPE estimators designed to utilise an action space factorisation. Improved sample overlap results in these estimators having lower bias and variance than their non-decomposed counterparts, subject to certain assumptions. To facilitate the practical application of decomposed estimators, I will introduce various strategies to derive action space factorisations tailored to achieve low bias and variance on a given OPE problem, and verify these theoretical results through empirical experiments on toy environments, a healthcare-inspired simulator, and a clinical application of treating patients in the ICU with sepsis.
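To give a flavour of why factorising the action space helps, here is a minimal sketch (a toy one-step problem with hypothetical state-free factored policies, not the talk's exact estimators): when both the policies and the reward factorise across sub-actions, each reward component can be reweighted with only its own sub-action's importance weight instead of the full joint weight, which improves sample overlap per factor.

```python
import numpy as np

rng = np.random.default_rng(0)
n, k, A = 5000, 3, 4   # samples, sub-action dimensions, choices per dimension

# Hypothetical factored behaviour and evaluation policies (state-free for brevity):
# pi(a) = prod_j pi_j(a_j), one categorical per sub-action dimension.
b = rng.dirichlet(np.ones(A), size=k)   # behaviour marginals, shape (k, A)
e = rng.dirichlet(np.ones(A), size=k)   # evaluation marginals, shape (k, A)

a = np.stack([rng.choice(A, size=n, p=b[j]) for j in range(k)], axis=1)

# Toy reward that decomposes across sub-actions: r = sum_j r_j(a_j), with r_j(a_j) = a_j here.
r_parts = a.astype(float)
w = np.stack([e[j, a[:, j]] / b[j, a[:, j]] for j in range(k)], axis=1)

joint = np.mean(np.prod(w, axis=1) * r_parts.sum(axis=1))             # standard IS estimate
decomposed = sum(np.mean(w[:, j] * r_parts[:, j]) for j in range(k))  # per-factor IS estimate
print(joint, decomposed)   # same estimand; the decomposed estimate typically has lower variance
```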
Authors: Amirehsan Khorashadizadeh, Valentin Debarnot and Ivan Dokmanić
Title: Deep Local Transformers for Computed Tomography
Abstract: Deep learning has shown promise for CT image reconstruction by regressing the image from back-projections using convolutional neural networks (CNNs). Despite satisfactory results on data similar to the training data, these methods fall short for out-of-distribution samples, which are common in practice: if the test data distribution deviates even slightly from the training data, their performance drops sharply. In this paper, we propose a deep local transformer that extracts, for each pixel, the sinusoidal trace of measurements associated with that pixel from the sinogram and processes it with a tiny multi-layer perceptron to reconstruct the image intensity at that pixel. We will show that the proposed method, which consists only of fully connected layers, achieves performance comparable to highly successful CNN models like U-Net on inlier test data and significantly outperforms them on out-of-distribution data.
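To make the per-pixel idea concrete, here is a minimal sketch under an assumed parallel-beam geometry with the image centred at the origin (the paper's exact geometry, interpolation, and architecture may differ): each pixel's sinusoid s = x cos θ + y sin θ is sampled from the sinogram and fed to a small MLP.

```python
import numpy as np
import torch
import torch.nn as nn

def pixel_trace(sino, x, y, det_spacing=1.0):
    """Gather, for one pixel (x, y), the sinogram values along its sinusoid s = x cos(th) + y sin(th)."""
    n_angles, n_det = sino.shape
    thetas = np.linspace(0.0, np.pi, n_angles, endpoint=False)
    s = x * np.cos(thetas) + y * np.sin(thetas)
    idx = np.clip(np.round(s / det_spacing + n_det // 2).astype(int), 0, n_det - 1)
    return sino[np.arange(n_angles), idx]               # shape: (n_angles,)

n_angles, n_det = 180, 256
sino = np.random.rand(n_angles, n_det).astype(np.float32)   # stand-in sinogram

# Tiny per-pixel MLP: sinusoid of measurements in, one pixel intensity out.
mlp = nn.Sequential(nn.Linear(n_angles, 128), nn.ReLU(),
                    nn.Linear(128, 128), nn.ReLU(),
                    nn.Linear(128, 1))
trace = torch.from_numpy(pixel_trace(sino, x=10.0, y=-5.0))
intensity = mlp(trace)                                  # predicted intensity at pixel (10, -5)
```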
Authors: Jim Zhao, Aurelien Lucchi and Nikita Doikov
Title: Stochastic Coordinate Cubic Regularized Newton for non-convex optimization
Abstract: Gradient-based methods, such as stochastic gradient descent, are popular for optimization due to their simplicity and low per-iteration cost. In contrast, second-order methods, like cubic regularized Newton, are more effective but costly. This paper presents a more efficient stochastic version of cubic regularized Newton that employs coordinate sampling while retaining convergence to critical points, potentially saving computation while maintaining performance.
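The per-iteration saving comes from building the cubic model only on a sampled block of coordinates. A minimal sketch of that idea on a toy non-convex function (my illustrative reading; the paper's sampling scheme and subproblem solver may differ):

```python
import numpy as np
from scipy.optimize import minimize

def cubic_step(g, H, M):
    """Minimise the cubic-regularised model m(s) = g.s + 0.5*s'Hs + (M/6)*||s||^3."""
    model = lambda s: g @ s + 0.5 * s @ H @ s + (M / 6) * np.linalg.norm(s) ** 3
    return minimize(model, np.zeros_like(g)).x

def grad(x): return np.array([x[0] ** 3 - 2 * x[0], 2 * x[1]])
def hess(x): return np.diag([3 * x[0] ** 2 - 2, 2.0])

rng = np.random.default_rng(0)
x = np.array([0.3, 1.0])                        # f(x) = 0.25*x1^4 - x1^2 + x2^2 (non-convex)
for _ in range(50):
    idx = rng.choice(2, size=1, replace=False)  # sample a coordinate block
    g, H = grad(x)[idx], hess(x)[np.ix_(idx, idx)]
    x[idx] += cubic_step(g, H, M=10.0)          # cubic Newton step on the block only
print(x)                                        # approaches the critical point (±sqrt(2), 0)
```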
Authors: Marcus J. Grote, Omar Lakkis and Carina S. Santos
Title: A Posteriori Error Estimates For The Wave Equation
Abstract: A posteriori error estimates are derived for the time-dependent wave equation, fully discretised with explicit time-stepping. The leapfrog method both with and without local time-stepping is considered. Numerical results illustrate the accuracy of the error estimates.
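For reference, a minimal leapfrog discretisation of the 1D wave equation, the scheme whose error the estimates control (the estimator itself is not shown here):

```python
import numpy as np

# Leapfrog for u_tt = c^2 u_xx on [0, 1] with homogeneous Dirichlet boundary conditions.
c, N, T = 1.0, 200, 1.0
x = np.linspace(0.0, 1.0, N + 1)
h = x[1] - x[0]
dt = 0.9 * h / c                        # CFL-stable time step
u_prev = np.sin(np.pi * x)              # u(x, 0)
u = u_prev.copy()                       # zero initial velocity (first-order start)
for _ in range(int(T / dt)):
    lap = np.zeros_like(u)
    lap[1:-1] = (u[2:] - 2.0 * u[1:-1] + u[:-2]) / h ** 2
    u_next = 2.0 * u - u_prev + (c * dt) ** 2 * lap   # leapfrog update in time
    u_next[0] = u_next[-1] = 0.0        # enforce boundary conditions
    u_prev, u = u, u_next
```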
Authors: Marcello Massimo Negri, Fabricio Arend Torres and Volker Roth
Title: Objective Bayesian Lasso with Conditional Flows
Abstract: We propose an Objective Bayesian framework for Lasso regression by defining the approximate posterior on the manifold ||beta||_1 = t. As a key difference from current Bayesian Lasso approaches, we avoid assuming a subjective Laplace prior on the regression coefficients. In particular, we propose to perform variational inference through a Conditional Normalizing Flow defined by construction on the manifold ||beta||_1 = t. By conditioning the flow on t, we can efficiently explore solution paths as a function of the norm value t.
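A minimal sketch of the manifold constraint and the role of the conditioner t (a simple radial projection, not the paper's flow architecture): any unconstrained vector can be mapped onto ||beta||_1 = t, and sweeping t traces a solution path.

```python
import numpy as np

def to_l1_sphere(z, t):
    """Map an unconstrained vector onto the manifold ||beta||_1 = t by radial projection."""
    return t * z / np.abs(z).sum()

rng = np.random.default_rng(0)
z = rng.normal(size=5)                 # e.g. the output of a (conditional) normalizing flow
for t in [0.5, 1.0, 2.0]:              # sweep the conditioning value to explore the path
    beta = to_l1_sphere(z, t)
    print(t, np.abs(beta).sum())       # the L1 norm equals t by construction
```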
Authors: Enrico Giudice, Jack Kuipers and Giusi Moffa
Title: A Bayesian Take on Gaussian Process Networks
Abstract: In this work, we develop a sampling scheme to perform fully Bayesian structure inference on generic continuous-variable networks with potentially non-linear relations among variables. We follow the strategy of Friedman and Nachman and model the functional dependencies between each variable and its parents via Gaussian process (GP) priors. We extend their approach to the Bayesian framework by basing structure inference on the posterior distribution, which also involves treating the hyperparameters of the model as random.
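The Friedman–Nachman ingredient is that a candidate parent set is scored by the GP marginal likelihood of the child given its parents. A minimal sketch with scikit-learn (note: sklearn optimises the hyperparameters to a point estimate, whereas the paper treats them as random; the data below is simulated):

```python
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF, WhiteKernel

def parent_set_score(X_parents, y):
    """Log marginal likelihood of y given a candidate parent set, under a GP prior."""
    gp = GaussianProcessRegressor(kernel=RBF() + WhiteKernel(), normalize_y=True)
    gp.fit(X_parents, y)
    return gp.log_marginal_likelihood_value_

rng = np.random.default_rng(0)
n = 100
x1 = rng.normal(size=(n, 1))
y = np.sin(x1[:, 0]) + 0.1 * rng.normal(size=n)    # non-linear edge x1 -> y
print(parent_set_score(x1, y))                     # compare across candidate parent sets
```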
Keynote: Mahnaz Parian-Scherb (Roche)
Title: Pixels to Patients: Exploring Computer Vision's Impact on Healthcare
Abstract: In this presentation, I will take you on a tour through the innovative applications of computer vision within Roche's Pharma Early Research and Development division. We will discuss the pivotal role of this technology in the early detection and diligent monitoring of challenging neurodegenerative diseases such as Huntington's, as well as various eye diseases. The focus will be on how we utilize video and image analysis to gain clear, objective insights into disease progression, and how we identify subtle changes in features that may indicate the onset of symptoms. This technology is instrumental in our approach to personalized medicine, aiding us in the creation of tailored rehabilitation programs and treatments, specifically designed to align with individual patient progress.
Authors: Sanaullah Sanaullah, Shamini Koravuna, Ulrich Rückert and Thorsten Jungeblut
Title: A Novel Spike Vision Approach for Robust Multi-Object Detection using SNNs
Abstract: In this paper, we propose a novel system that combines computer vision techniques with spiking neural networks (SNNs) for spike-vision-based multi-object detection and tracking. Our system integrates computer vision techniques for robust and accurate detection and tracking, extracts regions of interest (ROIs) for focused analysis, and simulates spiking neurons for a biologically inspired representation. Our approach advances the understanding of visual processing and supports the development of efficient SNN models. Extensive experiments and evaluations, reported in this paper, show that the system achieves state-of-the-art results on visual processing tasks, validating the efficacy of our approach and establishing it as a promising solution in the field of SNNs.
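The spiking-neuron building block the abstract refers to can be illustrated with a standard leaky integrate-and-fire (LIF) model (a generic sketch with simulated ROI intensities; the paper's architecture and encoding are not specified here):

```python
import numpy as np

def lif_response(inputs, tau=10.0, v_th=1.0, v_reset=0.0, dt=1.0):
    """Leaky integrate-and-fire neuron driven by an input current trace; returns spike times."""
    v, spikes = 0.0, []
    for step, i_in in enumerate(inputs):
        v += dt * (-v / tau + i_in)    # leak toward 0, integrate the input
        if v >= v_th:                  # threshold crossing -> spike and reset
            spikes.append(step)
            v = v_reset
    return spikes

# e.g. encode the mean intensity of an extracted ROI over frames as the input current:
roi_intensity = np.clip(np.random.default_rng(0).normal(0.2, 0.05, size=100), 0.0, None)
print(lif_response(roi_intensity))
```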
Authors: Navish Kumar, Aurelien Lucchi, Thomas Möllenhoff and Emtiyaz Khan
Title: Natural gradient square root variational inference
Abstract: This paper investigates the convergence properties of Gaussian variational inference (VI), with a specific emphasis on the Natural Gradient Gaussian Variational Inference (NGVI) framework utilizing the square-root Gaussian parameterization. Specifically, we study variants of the Bayesian learning rule (BLR) and conduct a comparative analysis between our proposed approach and the Bures–Wasserstein SGD (BW-SGD) algorithm, which operates in the Wasserstein geometry, showcasing the benefits of natural geometry. In particular, through extensive experimentation, we demonstrate the advantageous preconditioning effect exhibited by BLR, which contributes to its effectiveness in handling ill-conditioned data. Our evaluation encompasses both synthetic and real-world scenarios, validating the practical applicability of our approach.
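To fix notation, here is plain reparameterization SGD on the ELBO in the square-root parameterization Sigma = S Sᵀ, with a standard-normal target as an illustrative stand-in for a posterior. This is deliberately the naive baseline, not the NGVI/BLR update studied in the paper, which instead preconditions these gradients with the natural-gradient metric:

```python
import numpy as np

rng = np.random.default_rng(0)
d = 5

def grad_log_p(z):                 # target: standard normal (illustrative stand-in)
    return -z

m = rng.normal(size=d)             # variational mean
S = 2.0 * np.eye(d)                # square-root factor, Sigma = S @ S.T
lr = 0.02
for _ in range(300):
    eps = rng.normal(size=d)
    z = m + S @ eps                # reparameterized sample from q = N(m, S S^T)
    g = grad_log_p(z)
    m += lr * g                                        # ascend the ELBO in m
    S += lr * (np.outer(g, eps) + np.linalg.inv(S).T)  # ... and in S (entropy term: S^{-T})
print(np.round(S @ S.T, 2))        # approaches the target covariance (identity here)
```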
Authors: Emanuele Francazi, Aurelien Lucchi and Marco Baity-Jesi
Title: Initial Guessing Bias: How Untrained Networks Favor Some Classes
Abstract: In the realm of classification tasks, our work elucidates how neural network design can predispose the model to prematurely assign data to a particular class, even before any training occurs. Dubbed "Initial Guessing Bias" (IGB), this phenomenon hinges on design choices such as activation functions, network depth, and the presence of max-pooling layers. IGB challenges conventional assumptions and has far-reaching implications as, for example, it can significantly slow down the dynamics of gradient-based optimization methods. Distinguishing itself from prior work, this study takes a unique approach: rather than considering the entire ensemble of weight initializations per input, it fixes the weight initialization and explores behavior through expectations over the data. This approach mirrors practical deployment, where single networks classify data. Moreover, our analysis uncovers a breaking of the permutation symmetry between nodes on the same layer, a symmetry that is a common assumption when analysing untrained multi-layer perceptrons. The study's revelation of this symmetry breaking serves as the foundation for IGB. In particular, by means of our analysis, we are able to access the distributions of the output nodes for a given untrained neural network with weights fixed by initialization, and demonstrate that, when IGB is present, these are no longer identically distributed. Once we know the output distributions, we can derive the density (across different instances of the untrained network) of the fraction f_i of data points allocated to the generic class i. In the absence of IGB, we would anticipate a dataset uniformly distributed across the number of classes N_C. Conversely, in the presence of IGB, the distribution assumes a non-trivial profile. We elucidate the dependence of IGB on the activation function and provide general rules to identify which activation functions give rise to IGB. Remarkably, we can employ this knowledge to modify the definition of a generic activation function in a manner that can trigger or alleviate the emergence of IGB. Furthermore, we demonstrate that IGB is both induced and intensified by max pooling. Lastly, we establish that network depth does not engender IGB, but rather amplifies it if already extant. On a practical level, the insight gained from understanding IGB guides decisions regarding architecture selection and initialization strategies. Furthermore, this bias could be leveraged for innovative algorithmic solutions, e.g. to mitigate class imbalance.
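The setting is easy to probe empirically. A minimal sketch (hypothetical architecture and Gaussian stand-in data; the paper's results are analytical, not a simulation): fix one weight initialization, push a dataset through the untrained network, and inspect the fraction of points assigned to each class.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
net = nn.Sequential(nn.Linear(100, 256), nn.ReLU(),
                    nn.Linear(256, 256), nn.ReLU(),
                    nn.Linear(256, 10))       # untrained: weights fixed at initialization

X = torch.randn(10_000, 100)                  # stand-in dataset
with torch.no_grad():
    preds = net(X).argmax(dim=1)              # class guesses before any training
print(torch.bincount(preds, minlength=10).float() / len(X))
# Without IGB the fractions f_i hover near 1/N_C = 0.1; with IGB some classes dominate.
```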
Authors: Shiva Parsarad and Isabel Wagner
Title: Privacy in Brain Computer Interfaces
Abstract: Brain-computer interfaces (BCI) allow people to control computing devices with their brain, instead of having to rely on input devices like keyboards. BCIs enable not only medical applications, but have also been proposed for entertainment and computer security. However, because BCI systems have access to brain signals, their privacy implications need to be carefully considered: a malicious BCI may be able to expose private preferences, emotions, or even thoughts. In this paper, we explore privacy risks and privacy protections for BCI, show which open gaps remain, and propose a roadmap for addressing them.
Authors: David Lengweiler, Marco Vogt and Heiko Schuldt
Title: Towards a Unified Data Integration, Access and Analysis Platform
Abstract: Interactive notebooks have become the de facto standard for many data analysis tasks. By introducing support for notebooks as part of the data management system Polypheny, we are consolidating various tools into a single, unified platform, simplifying data ingestion, transformation, and analysis. This integration enhances reusability and collaboration by providing a unified environment with comprehensive features, while also enabling advanced optimization strategies.
Author: Alexander Ahn
Title: Evaluating the Performance of Machine Learning Methods for Predicting Mortality in Intensive Care Unit Patients
Abstract: Machine learning methods are increasingly being used to build diagnostic models in clinical settings to identify patients who are at a higher risk of mortality. Recent studies have shown that ensemble tree-based learning methods provide an alternative non-parametric approach to traditional methods for building predictive models on high-dimensional datasets. In this study, we evaluated the performance of logistic regression, random forest, XGBoost, and LGBM (a leaf-wise tree-based learning algorithm) for identifying ICU patients with a 28-day mortality risk at the time of hospital admission. The case study data originates from a subset of publicly available data from the Medical Information Mart for Intensive Care (MIMIC) II database. The performance of the different methods was evaluated using prediction error curves. The results show that the XGBoost classifier achieved the best prediction accuracy for classifying survivors vs. non-survivors (cross-validated area under the curve, AUC = 0.86). The top features for predicting death at the time of ICU admission included age, simplified acute physiology score (SAPS), and serum sodium levels at admission. These results can help predict which patients are likely to die within 28 days of ICU admission so that healthcare professionals can design and implement optimal treatment strategies to improve patient outcomes. All analyses were conducted using the AutoAI tool in IBM Watson Studio.
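A minimal sketch of the evaluation protocol (simulated stand-in data, since the MIMIC-II subset requires access approval; the study itself used AutoAI rather than hand-written code): fit an XGBoost classifier and report the cross-validated AUC.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.model_selection import cross_val_score
from xgboost import XGBClassifier

# Stand-in for the MIMIC-II features (age, SAPS, serum sodium, ...), with class imbalance
# mimicking the survivor/non-survivor split.
X, y = make_classification(n_samples=2000, n_features=20, weights=[0.8], random_state=0)

model = XGBClassifier(n_estimators=200, max_depth=4, eval_metric="logloss")
auc = cross_val_score(model, X, y, cv=5, scoring="roc_auc")
print(auc.mean())   # for comparison, the study reports AUC = 0.86 on its MIMIC-II subset
```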
Authors: Rustem Islamov, Xun Qian, Slavomír Hanzely, Mher Safaryan and Peter Richtárik
Title: Distributed Newton-Type Methods with Communication Compression and Bernoulli Aggregation
Abstract: In this work we consider the distributed optimization problem in ERM form, min_{x \in R^d} f(x) := (1/n) \sum_{i=1}^{n} f_i(x), where d is the (potentially large) number of parameters of the model x \in R^d we aim to train, n is the (potentially large) number of devices in the distributed system, f_i(x) is the loss/risk associated with the data stored on machine i \in [n] := {1, 2, ..., n}, and f(x) is the empirical loss/risk. The goal of Federated Learning is to jointly train a single machine learning model using all devices' local data, which is kept on the devices due to its increasingly large size and for data privacy. Because of these considerations, there has been a steady stream of works studying distributed training with decentralized data. This paradigm of training brings its own advantages and limitations. Scaling the training over multiple devices requires intensive communication between nodes, which is the key bottleneck in distributed systems.
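As a caricature of the setting (my illustrative reading, not the paper's method: toy quadratic local losses and no compression operator), the sketch below has a server taking Newton-type steps while each device refreshes its Hessian on the server only with Bernoulli probability p, so Hessian communication is intermittent:

```python
import numpy as np

rng = np.random.default_rng(0)
d, n_dev, p = 10, 5, 0.5               # parameters, devices, Bernoulli send probability

# Hypothetical local quadratic losses f_i(x) = 0.5 x'A_i x - b_i'x (toy stand-in for ERM terms).
A = [np.diag(rng.uniform(1.0, 3.0, d)) for _ in range(n_dev)]
b = [rng.normal(size=d) for _ in range(n_dev)]
H_srv = [np.eye(d) for _ in range(n_dev)]   # server-side estimates of the local Hessians

x = rng.normal(size=d)
for _ in range(30):
    for i in range(n_dev):
        if rng.random() < p:           # Bernoulli aggregation: device i sends its Hessian
            H_srv[i] = A[i]            # with probability p (uncompressed in this sketch)
    g = sum(A[i] @ x - b[i] for i in range(n_dev)) / n_dev
    H = sum(H_srv) / n_dev
    x = x - np.linalg.solve(H, g)      # Newton-type step with stale/partial Hessian info
```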
Authors: Dilana Irvine Rauch, Florian Klaus Kaiser and Frank Schultmann
Title: Local Price Effects of Flood Exposure and Flood Risk in the Residential Property Market
Abstract: In the face of climate change, real estate is confronted with an increasing risk of natural catastrophe loss events such as fires and floods. Is this risk internalized in the market? With respect to flood risk, the literature presents mixed evidence: the estimated price effect for a property within a floodplain ranges between −75.5 percent and +61.0 percent. However, most studies are conducted within the US, while the only two analyses of Germany are each confined to a single city. We address this research gap and conduct the first analysis covering the whole of Germany.
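The price effects in this literature are typically estimated with a hedonic regression of (log) price on a floodplain indicator plus property controls. A minimal sketch on simulated data with a built-in −5 percent effect (purely illustrative; not the paper's specification or data):

```python
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(0)
n = 1000
floodplain = rng.integers(0, 2, n).astype(float)   # 1 if the property lies in a floodplain
size_m2 = rng.uniform(50.0, 200.0, n)
log_price = 10.0 + 0.01 * size_m2 - 0.05 * floodplain + rng.normal(0.0, 0.1, n)

X = sm.add_constant(np.column_stack([size_m2, floodplain]))
fit = sm.OLS(log_price, X).fit()
print(fit.params[2])   # floodplain coefficient: approximately the proportional price effect
```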