Mathematics of Data Science and AI
The 21st century, often called the data age, is characterized by the immense collection of information from diverse sources such as social media, healthcare, and retail, which highlights the significance of Data Science and AI. These disciplines transform how we analyze and interpret this data deluge, while facing challenges posed by diverse data types and the sheer volume of available data. Sophisticated AI and data science techniques are crucial for the efficient management, analysis, and storage of data.

Mathematics serves as the backbone of these AI and data science methodologies, providing a robust and analyzable foundation. In particular, deep neural networks have outperformed conventional models, marking significant advances, yet their mathematical foundations are still the subject of intensive research. The field melds classical mathematics, such as harmonic analysis, functional analysis, and linear algebra, with AI, fostering innovative solutions to real-world problems. Scientists in this field do not merely formulate mathematical theories; they work in an interdisciplinary and transdisciplinary manner, harnessing the power of AI and data science to tackle such challenges.


Berlin research groups
Berlin is home to a diverse range of research areas within the fields of AI and data science, which collectively contribute to the broader landscape in a unique way:

AI for Dynamical Systems
Artificial Intelligence (AI) has shown significant promise in understanding and predicting dynamical systems. By leveraging mathematical models and algorithms, researchers in Berlin are pioneering methods to integrate AI into the study of complex systems, from molecular systems to biological processes. These methods aim to enhance the accuracy and efficiency of predictions, bridging the gap between theoretical mathematics and real-world applications.

Optimization for AI
Optimization plays a crucial role in the efficiency and effectiveness of AI algorithms. Berlin researchers are at the forefront of developing advanced optimization techniques tailored for AI, ensuring that machine learning models are both accurate and computationally efficient. These techniques are pivotal in refining AI models, making them more adaptable and responsive to diverse datasets.

AI-based Data Analysis
With the explosion of data in various fields, AI-driven data analysis methods are becoming indispensable. Berlin's mathematical community is actively involved in creating robust AI tools that can sift through vast amounts of data, uncovering patterns, anomalies, and insights that traditional methods might miss. The aim is to harness the power of AI to transform raw data into actionable insights, build digital twins, and aid decision-making processes across sectors such as medicine, transportation, or energy.

Foundations of AI
Understanding the fundamental principles of AI is vital given the extensive range of its applications. Mathematical scientists in Berlin are actively exploring the theoretical foundations of AI, ensuring that advances in the field are firmly rooted in robust mathematical principles. Key questions of Berlin-based research lie, for example, at the interface between geometry and AI, in understanding the approximation properties of AI-based methods for high-dimensional problems, or in contrasting “small data” learning challenges with “data-hungry” AI tools such as deep learning.


  Tim Conrad
Mathematical Data Science and AI in Health

Our research focuses on mathematical data science and AI methodologies, including interpretable deep learning for biomarker detection and efficient multi-label classification pipelines for large medical datasets, with the goal of improving health outcomes through technological innovation. We also work on models and efficient frameworks for scenario simulations in epidemics, improving public health strategies by providing a deeper understanding of disease dynamics.

 ZIB

  Martin Eigel

Uncertainty Quantification and Scientific Machine Learning

High-dimensional problems are ubiquitous and require the development of new numerical approaches which exploit inherent structures, in particular different types of regularity. Uncertainty Quantification is a popular research area where such problems occur due to the modelling of physical processes with parametric random PDEs. This includes, in particular, related Bayesian inverse problems to determine parameter distributions from measurements. In addition to techniques from Scientific Machine Learning with neural networks, tensor networks have proven to be a potent tool for representing these problems efficiently.

 WIAS
  Jens Eisert
Quantum Information and Condensed Matter Theory

Our group is concerned with research in quantum information theory, condensed matter theory and the intersection of these fields. We ask what information processing tasks are possible using single quantum systems as carriers of information. We think about the mathematical-theoretical foundations of quantum information, specifically about the theory of entanglement and questions of tomography, but also about ways of realizing topological quantum computing. A main emphasis of our theoretical research is in condensed matter theory, concerning static properties of quantum many-body systems, their efficient numerical simulation, as well as their quantum dynamics in non-equilibrium. Methods of tensor networks play a special role here. We are also involved in identifying quantum optical realizations of such ideas, specifically using light modes or cold atoms in optical lattices. Characteristic of our work is that it is guided by the rigor of mathematical physics while being pragmatically and physically motivated, which often leads to collaborations with experimentalists.

 FU Berlin (Physics)

  Konstantin Fackeldey

Efficient Large Scale Computing in Life Sciences

Our research is focused on mathematical and computational methodologies in the field of life sciences. In an interdisciplinary team we develop mathematical models as well as efficient parallel computing and machine learning algorithms for virtual screening and drug development.

 TU Berlin

  Hanno Gottschalk

AI-based Data Analysis: Mathematical Modeling of Industrial Life Cycles
A modern understanding of mathematical modeling of industrial life cycles involves models for the data-driven economy. Our research group thus investigates the life cycle of large-scale computer vision models, from data production through training and monitoring during inference to the repair of AI models. We also model, predict, and optimize the life span of technical components. As a trained physicist, Hanno Gottschalk has retained an interest in the mathematical foundations of quantum field theory.

 TU Berlin

  Aswin Kannan
Data-Centric Optimization

Problems in machine learning and derivative-free optimization can be inherently multi-objective. We focus on such problems in the context of both energy markets and imaging. Objectives can be manifold, ranging from accuracy and computational time to fairness indicators and sparsity. We study the theoretical and computational behavior of next-generation fusion-type algorithms that aim to improve the quality of the resulting Pareto frontiers. Examples include joint Bayesian and direct-search-type schemes and the fusion of novel hyper-parameter and model-parameter optimization methods. As extensions, some problems arising in unit-commitment applications and l1-type regression are also studied.

 HU Berlin

  Klaus-Robert Müller

Advanced Machine Learning and Applications

Our research focuses on diverse areas within machine learning and its applications, including explainable AI, probabilistic modeling, multimodal learning, and ML for quantum chemistry and physical sciences. We also work on digital pathology, biomedical sensing, computational neuroscience, and digital humanities. Our projects include developing novel probabilistic models, creating machine learning methods for molecular simulations, and advancing wearable neurotechnology for comprehensive brain-body monitoring. Through these efforts, we aim to enhance the understanding and application of machine learning in complex real-world scenarios.

 TU Berlin (Computer Science)

  Frank Noé

Artificial Intelligence for the Sciences

We develop mathematical and deep learning methods to solve fundamental challenges in the natural sciences, in particular in molecular physics applications.

 FU Berlin

  Sebastian Pokutta

Mathematical Optimization and Machine Learning

Our research at the IOL Lab integrates optimization with machine learning, focusing on conditional gradient algorithms, non-smooth optimization, and accelerated methods. We also delve into robust and explainable AI, enhancing the interpretability and robustness of deep neural networks. Our work spans various applications including combinatorial optimization and interactive theorem proving, contributing to advancements in computational mathematics and quantum algorithms.

 TU Berlin and ZIB

  Markus Reiß

Mathematical Statistics

In mathematical statistics we ask about the construction and the properties of statistical methods such as estimators, tests or classifiers. Based on rigorous modelling, the aim is to find optimal procedures among all feasible ones (a class that may be as large as all measurable functions of the data). The statistical problems considered range from modern high-dimensional statistics and machine learning, where the data sets and the unknown parameters are huge, through nonparametric statistics and inverse problems, where the unknowns are functions, to statistics of stochastic processes, where realisations of stochastic dynamics, e.g. described by stochastic (partial) differential equations, generate the data to work with. Modern statistics not only requires advanced knowledge of probability theory, but also has strong links with functional analysis, optimisation and numerical analysis as well as with modern developments in data science and applied fields. A mathematical understanding of statistical problems and methods is key to their correct use and at the same time poses challenging fundamental questions.

 HU Berlin

  Claudia Schillings

Numerical Analysis of (Random) PDEs

Our research focus is on applied and computational mathematics, in particular optimization, inverse problems and uncertainty quantification. We are interested in the theory, development and analysis of methods for the proper treatment of uncertainties in inverse and optimization problems. We work at the interface of applied mathematics and statistics, which is an exciting and fast-growing area of research. Applications include engineering, environmental, physical and biological systems, e.g., groundwater flow problems, shape uncertainties in aerodynamic applications or nano-optics, biochemical networks and finance.

 FU Berlin

  Christof Schütte

AI for Dynamical Systems: Effective Dynamics of Complex Systems

The mathematics of complex systems is a fascinating and challenging field that seeks to capture the intricate dynamics and emergent properties of interconnected elements. The underlying mathematical descriptions range from ODEs and SDEs (mostly in high dimensions) to (S)PDEs, and the associated mathematical research is interdisciplinary, often combining elements of dynamical systems theory, probability and stochastics, numerical analysis, operator theory, and machine learning. On the one hand, fundamental questions such as the existence and structural properties of solutions, their robustness and controllability, or their scaling behavior are essential. On the other hand, key algorithmic aspects include numerical efficiency and error control of large-scale multiscale simulations to overcome the curse of dimensionality, rare event simulation, and model reduction, e.g., by finding the backbone of the effective dynamics. Methodological progress, combined with advancements in computing, data analytics, and artificial intelligence, is making simulations of complex systems more sophisticated, accurate, and capable of describing intricate real-world phenomena, facilitating innovation in various domains, including energy, mobility, health, and sustainable development.

 FU Berlin and ZIB

  Gabriele Steidl

Advanced Image Processing and Inverse Problems

Our research focuses on advanced image processing and inverse problem-solving techniques, with applications in materials science, neural networks, and biomedical imaging. We enhance superresolution of multiscale images in materials science using geometric features, analyze anisotropy effects on deformation processes in nickel alloys through FEM and variational image processing, and explore invertible neural networks for solving inverse problems.

 TU Berlin
  Max von Kleist
Public Health Data Science: Heterogeneous data integration

Two general modelling philosophies are prevalent in data-driven mathematics, each of which in isolation is insufficient to understand relevant mechanisms. In a ‘top-down’ approach, a model is learned from and fitted to available data. This model may lack interpretability in terms of mechanisms and may only perform robustly for scenarios already covered by the data used to derive it. A ‘bottom-up’ approach builds on mechanistic insights derived from surrogate experiments, which can be conducted under controlled conditions but may not fully represent real situations. Our research is focused on developing mathematical and computational methodologies that support heterogeneous data integration in biomedical applications and public health. In an interdisciplinary team we develop multi-scale mathematical models and efficient numerical schemes for stochastic simulation and parameter uncertainty estimation, with the aim of combining top-down and bottom-up modelling.

 FU Berlin


Links to other areas
There are natural, strong collaborative ties to the following groups:

Foundations of AI in RTA 4:
Martin Skutella

Optimization for AI in RTA 6:
Michael Hintermüller


Core Courses
The BMS Core Courses in Area 8 and details about their content can be found under "Course Program":
Area 8 - Core Courses

Advanced Courses
The current advanced courses are listed under "Course Program":
Area 8 - Advanced Courses