Data Science
0590a_MA120-
Ethical Foundations of Data Science
0590aB1.2-
19331102
Practice seminar
Practice Session on Human Centered Data Science (Claudia Müller-Birn)
Schedule: Tue 16:00-18:00 (Class starts on: 2025-04-15)
Location: T9/SR 006 Seminarraum (Takustr. 9)
-
Data Base Systems for Students of Data Science
0590aB1.20-
19301501
Lecture
Database Systems (Agnès Voisard)
Schedule: Tue 14:00-16:00, Thu 14:00-16:00, additional dates: see course details (Class starts on: 2025-04-15)
Location: T9/Gr. Hörsaal (Takustr. 9)
Additional information / Pre-requisites
Requirements
- ALP 1 - Functional Programming
- ALP 2 - Object-oriented Programming
- ALP 3 - Data structures and data abstractions
- OR Informatik B
Comments
Content
Database design with ERM/ERDD. Theoretical foundations of relational database systems: relational algebra, functional dependencies, normal forms. Relational database development: SQL data definitions, foreign keys and other integrity constraints, SQL as an application language: essential language elements, embedding in a programming language. Application programming; object-relational mapping. Security and protection concepts. Transactions: transactional guarantees, synchronization of multi-user operations, fault tolerance features. Applications and new developments: data warehousing, data mining, OLAP.
Project: the topics are deepened in an implementation project for student groups.
Suggested reading
- Alfons Kemper, Andre Eickler: Datenbanksysteme - Eine Einführung, 5th edition, Oldenbourg 2004
- R. Elmasri, S. Navathe: Grundlagen von Datenbanksystemen, Pearson Studium, 2005
-
19301502
Practice seminar
Practice seminar for Database systems (Muhammed-Ugur Karagülle)
Schedule: Mon 12:00-14:00, Mon 14:00-16:00, Mon 16:00-18:00, Tue 08:00-10:00, Tue 10:00-12:00, Tue 12:00-14:00, Wed 10:00-12:00, Wed 12:00-14:00, Wed 14:00-16:00, Thu 08:00-10:00, Thu 10:00-12:00, Thu 12:00-14:00, Thu 16:00-18:00, Fri 10:00-12:00, Fri 14:00-16:00, Fri 16:00-18:00 (Class starts on: 2025-04-14)
Location: T9/SR 006 Seminarraum (Takustr. 9)
-
Mobile Communications
0590aB1.22-
19303901
Lecture
Mobile Communications (Jochen Schiller)
Schedule: Wed 10:00-12:00 (Class starts on: 2025-04-16)
Location: T9/049 Seminarraum (Takustr. 9)
Comments
The Mobile Communications module presents major topics from mobile and wireless communications, the key drivers behind today's communication industry that influence everybody's daily life.
The whole lecture focuses on a system perspective giving many pointers to real systems, standardization and current research.
The format of the lecture is the flipped classroom, i.e., you should watch the videos of a lecture BEFORE participating in the Q&A session. We will then discuss all open issues, answer questions etc. during the Q&A session.
Main topics of the lecture are:
- Basics of wireless transmission: frequencies, signals, antennas, multiplexing, modulation, spread spectrum
- Medium access: SDMA, FDMA, TDMA, CDMA
- Wireless telecommunication systems: GSM, TETRA, IMT-2000, LTE, 5G
- Wireless local area networks: infrastructure/ad-hoc, IEEE 802.11/15, Bluetooth, ZigBee
- Mobile networking: Mobile IP, ad-hoc networks
- Mobile transport layer: traditional TCP, additional mechanisms
- Outlook: 5G to 6G, low power wireless networks
Suggested reading
Jochen Schiller, Mobilkommunikation, Addison-Wesley, 2nd edition, 2003
All materials are available at http://www.mi.fu-berlin.de/inf/groups/ag-tech/teaching/resources/Mobile_Communications/course_Material/index.html
-
Machine Learning in Bioinformatics
0590aB1.30-
19405701
Lecture
Machine Learning in Bioinformatics (Philipp Florian Benner, Hugues Richard)
Schedule: Mon 08:00-10:00 (Class starts on: 2025-04-14)
Location: A6/SR 025/026 Seminarraum (Arnimallee 6)
Comments
This course introduces key machine learning concepts and is accompanied by tutorials and exercises where machine learning methods are applied to actual bioinformatics problems. After a short recap of probability theory, we introduce probabilistic methods for classification and sequence analysis (Naive Bayes, Mixture Models, Hidden Markov Models). We discuss Expectation Maximization (EM) from a probabilistic perspective and use it for sequence analysis. Linear and logistic regression serve as an entry point to more complex machine learning methods, including kernel methods and neural networks. The lecture covers multiple neural network architectures (CNNs, GNNs, Transformers) that are currently used in the bioinformatics community and other research domains. During the tutorials and as part of homework assignments, selected machine learning models are implemented in Python using scikit-learn and PyTorch. The course should enable students to understand all common machine learning techniques and devise state-of-the-art classification strategies that can then be applied to problems in bioinformatics and related fields.
Contents:
- Naive Bayes
- Clustering and Mixture Models
- Hidden Markov Models
- Regression and Partial Least Squares
- Kernel Methods
- Neural Networks and Architectures
- Regularization and Model Selection
Requirements:
- Linear algebra (basic vector and matrix algebra)
- Analysis (mathematical optimization, Lagrange)
- Programming in Python -- including object oriented programming
- A basic understanding or keen interest in molecular biology and bioinformatics applications
-
19405702
Practice seminar
Practice Seminar for Machine Learning in Bioinformatics (Philipp Florian Benner, Hugues Richard)
Schedule: Wed 08:00-10:00 (Class starts on: 2025-04-16)
Location: A7/SR 031 (Arnimallee 7)
-
Complex Systems in Bioinformatics
0590aB1.32-
19405201
Lecture
Complex Systems in Bioinformatics (Martin Vingron, Max von Kleist, Jana Wolf)
Schedule: Tue 12:00-14:00 (Class starts on: 2025-04-15)
Location: A3/SR 120 (Arnimallee 3-5)
Comments
Students have acquired a deeper understanding of fundamental mathematical and algorithmic concepts in the field of modeling, simulation and analysis of complex biological systems against the background of current research trends in systems biology and biotechnology. They are capable of analyzing a given biological or medical problem, selecting a suitable modeling approach, independently developing a solution and assessing and communicating the results.
Content:
Topics from the following areas are considered in depth:
- Network structure analysis
- Graphical modeling
- Modeling of biochemical networks using ordinary differential equations
- Discrete modeling of regulatory networks
- Constraint-based modeling
- Stochastic and hybrid modeling
Suggested reading
To be announced in class.
-
19405202
Practice seminar
Practice seminar for Complex Systems in Bioinformatics (Martin Vingron, Max von Kleist, Jana Wolf)
Schedule: Tue 14:00-16:00 (Class starts on: 2025-04-15)
Location: A3/SR 120 (Arnimallee 3-5)
-
19405211
Seminar
Seminar for Complex Systems in Bioinformatics (Martin Vingron, Max von Kleist, Jana Wolf)
Schedule: Thu 10:00-12:00 (Class starts on: 2025-04-17)
Location: A3/SR 119 (Arnimallee 3-5)
-
Data Science in the Life Sciences
0590aB2.1-
19405606
Seminar-style instruction
Data Science in the Life Sciences (Katharina Jahn)
Schedule: Mon 10:00-14:00 (Class starts on: 2025-04-14)
Location: T9/SR 006 Seminarraum (Takustr. 9)
Comments
This course offers an introduction to various types of data and analysis techniques which are typically used in the life sciences (e.g. omics technologies). The goal is to get a deeper understanding of advanced concepts and data analytical methods in the area of life sciences.
The focus will be on the following topics:
* acquisition and pre-processing of data from the area of life sciences,
* explorative analysis techniques,
* concepts and tools for reproducible research,
* theory and practice of methods and models for the analysis of data from the life sciences (statistical inference, regression models, methods of machine learning),
* introduction to methods of big data analysis.
After successful completion of this course, participants are able to evaluate, plan and conduct investigations in the life sciences using common methods.
-
19405612
Project Seminar
Project seminar for Data Science in the Life Sciences (Katharina Jahn)
Schedule: Wed 10:00-14:00 (Class starts on: 2025-04-16)
Location: A6/SR 031 Seminarraum (Arnimallee 6)
-
Special Aspects of Data Science in Life Sciences
0590aB2.4-
19336901
Lecture
Advanced Data Visualization for Artificial Intelligence (Georges Hattab)
Schedule: Wed 10:00-12:00 (Class starts on: 2025-04-16)
Location: A6/SR 007/008 Seminarraum (Arnimallee 6)
Comments
The lecture on Advanced Data Visualization for Artificial Intelligence is a comprehensive exploration of state-of-the-art techniques and tools to create and validate complex visualizations for communicating data insights and stories, with a specific focus on applications in Natural Language Processing (NLP) and Explainable AI. The lecture will introduce participants to the nested model of visualization, which encompasses four layers: characterizing the task and data, abstracting into operations and data types, designing visual encoding and interaction techniques, and creating algorithms to execute techniques efficiently. This model will serve as a framework for designing and validating data visualizations.
Furthermore, the lecture will delve into the application of data visualization in NLP, emphasizing the visualization of word embeddings and language models to aid in the exploration of semantic relationships between words and the interpretation of language model behavior. In the context of Explainable AI, the focus will be on using visualizations to explain model predictions and feature importance, thereby enhancing the interpretability of AI models. By leveraging the nested model of visualization and focusing on NLP and Explainable AI, the lecture aims to empower participants with the essential skills to design and validate advanced data visualizations tailored to these specific applications, ultimately enabling them to effectively communicate complex data patterns and gain deeper insights from their data.
-
60102501
Lecture
Resampling techniques and their application (Frank Konietschke)
Schedule: Wed 14:00-16:00 (Class starts on: 2025-04-16)
Location: A6/SR 032 Seminarraum (Arnimallee 6)
Comments
In this course, we introduce resampling techniques for analyzing trials with small sample sizes. Special attention will be given to both estimation methods and inference procedures. We will thereby find answers to the questions (1) "How does resampling work?" and (2) "When does resampling work?" Throughout the class we will study one-sample, two-sample and even factorial designs with independent and dependent observations. All algorithms will be presented and illustrated using the R statistical software. Knowledge of the fundamentals of statistical testing and basic skills in R are recommended prerequisites.
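As a rough illustration of how resampling works, the following minimal sketch computes a percentile bootstrap confidence interval for a sample mean. It is written in Python rather than the R used in the course, and the function name `bootstrap_mean_ci`, its parameters, and the example data are all hypothetical; it is not taken from the course material.

```python
import random
import statistics


def bootstrap_mean_ci(sample, n_resamples=2000, alpha=0.05, seed=0):
    """Percentile bootstrap confidence interval for the mean (illustrative sketch)."""
    rng = random.Random(seed)  # fixed seed so repeated calls give the same interval
    n = len(sample)
    means = []
    for _ in range(n_resamples):
        # Draw n observations with replacement from the original sample.
        resample = [sample[rng.randrange(n)] for _ in range(n)]
        means.append(statistics.fmean(resample))
    means.sort()
    # Take the empirical alpha/2 and 1 - alpha/2 quantiles of the resampled means.
    lower = means[int((alpha / 2) * n_resamples)]
    upper = means[int((1 - alpha / 2) * n_resamples) - 1]
    return lower, upper


# Hypothetical small sample, in the spirit of "trials with small sample sizes".
data = [4.1, 5.3, 3.8, 6.0, 5.1, 4.7, 5.9, 4.4]
low, high = bootstrap_mean_ci(data)
print(f"95% bootstrap interval for the mean: [{low:.2f}, {high:.2f}]")
```

With very small samples the percentile interval can be too narrow, which is one aspect of the "when does resampling work" question.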
-
60102701
Lecture
Complex Data Analysis in Physiology (Dorothee Günzel)
Schedule: Mon 14:30-18:30 (Class starts on: 2025-04-14)
Location: not specified
Comments
Joint class taught by the Institute of Clinical Physiology and the Institute of Physiology at the Charité.
Theoretical and practical aspects of data acquisition, real-time data processing and automated pattern recognition in biomedicine. Topics from the following areas are covered in depth:
- Data acquisition and processing of image files in research and clinical settings (e.g. live cell imaging, super-resolution microscopy, medical imaging techniques).
- Electrophysiological methods (e.g. impedance spectroscopy, microarrays, EEG, ECG)
- Methods and application of automated pattern recognition (e.g. automated tumour detection, real-time analysis of biological signals in the brain-computer interface or in retina implants, prediction of individual arrhythmia risks)
The course will be split into two segments: the first seven appointments in the semester will take place at the Institute of Physiology, while the second seven appointments will take place at the Institute of Clinical Physiology.
For further information: http://klinphys.charite.de/bioinfo/ or mail to Dorothee Günzel
-
19336902
Practice seminar
Practice seminar: Advanced Data Visualization for Artificial Intelligence (Georges Hattab)
Schedule: Wed 14:00-16:00 (Class starts on: 2025-04-16)
Location: A6/SR 007/008 Seminarraum (Arnimallee 6)
-
60102502
Practice seminar
Practice Seminar for Resampling techniques and their application (Frank Konietschke)
Schedule: Wed 16:00-18:00 (Class starts on: 2025-04-16)
Location: A6/SR 032 Seminarraum (Arnimallee 6)
-
60102702
Practice seminar
Practice seminar for Complex Data Analysis in Physiology (Dorothee Günzel)
Schedule: see lecture
Location: not specified
-
Special Aspects of Data Science Technologies
0590aB3.3-
19327401
Lecture
Image and video coding (Heiko Schwarz)
Schedule: Mon 14:00-16:00 (Class starts on: 2025-04-14)
Location: T9/053 Seminarraum (Takustr. 9)
Comments
This course introduces the most important concepts and algorithms that are used in modern image and video coding approaches. We will particularly focus on techniques that are found in current international video coding standards.
In a short first part, we introduce the so-called raw data formats, which are used as input and output formats of image and video codecs. This part covers the following topics:
- Colour spaces and their relation to human visual perception
- Transfer functions (gamma encoding)
- Why do we use the YCbCr format?
The second part of the course deals with still image coding and includes the following topics:
- The start: How does JPEG work?
- Why do we use the Discrete Cosine Transform?
- Efficient coding of transform coefficients
- Prediction of image blocks
- Adaptive block partitioning
- How do we make decisions in an encoder?
- Optimized quantization
In the third part, we discuss approaches that make video coding much more efficient than coding all pictures using still image coding techniques:
- Motion-compensated prediction
- Coding of motion vectors
- Algorithms for motion estimation
- Sub-sample accurate motion vectors and interpolation filters
- Usage of multiple reference pictures
- What are B pictures and why do we use them?
- Deblocking and deringing filters
- Efficient temporal coding structures
In the exercises, we will implement our own image codec (in a gradual manner). We may extend it to a simple video codec.
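To give a feel for the transform-and-quantize step behind the still image coding topics above, here is a minimal Python sketch of an orthonormal 1D DCT-II, its inverse, and uniform quantization of the coefficients. It is not the codec developed in the exercises; the function names, the sample row, and the step size are hypothetical.

```python
import math


def dct_ii(x):
    """Orthonormal 1D DCT-II of a list of samples."""
    n = len(x)
    coeffs = []
    for k in range(n):
        scale = math.sqrt(1 / n) if k == 0 else math.sqrt(2 / n)
        total = sum(x[i] * math.cos(math.pi * (2 * i + 1) * k / (2 * n)) for i in range(n))
        coeffs.append(scale * total)
    return coeffs


def idct_ii(coeffs):
    """Inverse of dct_ii (the transpose of the orthonormal transform)."""
    n = len(coeffs)
    samples = []
    for i in range(n):
        total = 0.0
        for k in range(n):
            scale = math.sqrt(1 / n) if k == 0 else math.sqrt(2 / n)
            total += scale * coeffs[k] * math.cos(math.pi * (2 * i + 1) * k / (2 * n))
        samples.append(total)
    return samples


def quantize(coeffs, step):
    """Uniform quantization: map each coefficient to an integer level."""
    return [round(c / step) for c in coeffs]


def dequantize(levels, step):
    """Reconstruct approximate coefficients from the integer levels."""
    return [level * step for level in levels]


# Hypothetical row of 8 pixel values and a hypothetical quantization step size.
row = [52, 55, 61, 66, 70, 61, 64, 73]
step = 4.0
levels = quantize(dct_ii(row), step)
reconstructed = idct_ii(dequantize(levels, step))
print("levels:", levels)
print("reconstructed:", [round(v, 1) for v in reconstructed])
```

For smooth inputs most of the signal energy lands in the first few coefficients, so many quantized levels become zero; a real codec would apply this in two dimensions on image blocks and follow it with entropy coding.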
Suggested reading
- Bull, D. R., “Communicating Pictures: A Course in Image and Video Coding,” Elsevier, 2014.
- Ohm, J.-R., “Multimedia Signal Coding and Transmission,” Springer, 2015.
- Wien, M., “High Efficiency Video Coding — Coding Tools and Specifications,” Springer 2014.
- Sze, V., Budagavi, M., and Sullivan, G. J. (eds.), “High Efficiency Video Coding (HEVC): Algorithm and Architectures,” Springer, 2014.
- Wiegand, T. and Schwarz, H., “Source Coding: Part I of Fundamentals of Source and Video Coding,” Foundations and Trends in Signal Processing, Now Publishers, vol. 4, no. 1–2, 2011.
- Schwarz, H. and Wiegand, T., “Video Coding: Part II of Fundamentals of Source and Video Coding,” Foundations and Trends in Signal Processing, Now Publishers, vol. 10, no. 1–3, 2016.
-
19336901
Lecture
Advanced Data Visualization for Artificial Intelligence (Georges Hattab)
Schedule: Wed 10:00-12:00 (Class starts on: 2025-04-16)
Location: A6/SR 007/008 Seminarraum (Arnimallee 6)
Comments
The lecture on Advanced Data Visualization for Artificial Intelligence is a comprehensive exploration of state-of-the-art techniques and tools to create and validate complex visualizations for communicating data insights and stories, with a specific focus on applications in Natural Language Processing (NLP) and Explainable AI. The lecture will introduce participants to the nested model of visualization, which encompasses four layers: characterizing the task and data, abstracting into operations and data types, designing visual encoding and interaction techniques, and creating algorithms to execute techniques efficiently. This model will serve as a framework for designing and validating data visualizations.
Furthermore, the lecture will delve into the application of data visualization in NLP, emphasizing the visualization of word embeddings and language models to aid in the exploration of semantic relationships between words and the interpretation of language model behavior. In the context of Explainable AI, the focus will be on using visualizations to explain model predictions and feature importance, thereby enhancing the interpretability of AI models. By leveraging the nested model of visualization and focusing on NLP and Explainable AI, the lecture aims to empower participants with the essential skills to design and validate advanced data visualizations tailored to these specific applications, ultimately enabling them to effectively communicate complex data patterns and gain deeper insights from their data.
-
19327402
Practice seminar
Practice seminar for image and video coding (Heiko Schwarz)
Schedule: Mon 12:00-14:00 (Class starts on: 2025-04-14)
Location: T9/053 Seminarraum (Takustr. 9)
-
19336902
Practice seminar
Practice seminar: Advanced Data Visualization for Artificial Intelligence (Georges Hattab)
Schedule: Wed 14:00-16:00 (Class starts on: 2025-04-16)
Location: A6/SR 007/008 Seminarraum (Arnimallee 6)
-
Current Research Topics in Data Science Technologies
0590aB3.4-
19327401
Lecture
Image and video coding (Heiko Schwarz)
Schedule: Mon 14:00-16:00 (Class starts on: 2025-04-14)
Location: T9/053 Seminarraum (Takustr. 9)
Comments
This course introduces the most important concepts and algorithms that are used in modern image and video coding approaches. We will particularly focus on techniques that are found in current international video coding standards.
In a short first part, we introduce the so-called raw data formats, which are used as input and output formats of image and video codecs. This part covers the following topics:
- Colour spaces and their relation to human visual perception
- Transfer functions (gamma encoding)
- Why do we use the YCbCr format?
The second part of the course deals with still image coding and includes the following topics:
- The start: How does JPEG work?
- Why do we use the Discrete Cosine Transform?
- Efficient coding of transform coefficients
- Prediction of image blocks
- Adaptive block partitioning
- How do we make decisions in an encoder?
- Optimized quantization
In the third part, we discuss approaches that make video coding much more efficient than coding all pictures using still image coding techniques:
- Motion-compensated prediction
- Coding of motion vectors
- Algorithms for motion estimation
- Sub-sample accurate motion vectors and interpolation filters
- Usage of multiple reference pictures
- What are B pictures and why do we use them?
- Deblocking and deringing filters
- Efficient temporal coding structures
In the exercises, we will implement our own image codec (in a gradual manner). We may extend it to a simple video codec.
Suggested reading
- Bull, D. R., “Communicating Pictures: A Course in Image and Video Coding,” Elsevier, 2014.
- Ohm, J.-R., “Multimedia Signal Coding and Transmission,” Springer, 2015.
- Wien, M., “High Efficiency Video Coding — Coding Tools and Specifications,” Springer 2014.
- Sze, V., Budagavi, M., and Sullivan, G. J. (eds.), “High Efficiency Video Coding (HEVC): Algorithm and Architectures,” Springer, 2014.
- Wiegand, T. and Schwarz, H., “Source Coding: Part I of Fundamentals of Source and Video Coding,” Foundations and Trends in Signal Processing, Now Publishers, vol. 4, no. 1–2, 2011.
- Schwarz, H. and Wiegand, T., “Video Coding: Part II of Fundamentals of Source and Video Coding,” Foundations and Trends in Signal Processing, Now Publishers, vol. 10, no. 1–3, 2016.
-
19327402
Practice seminar
Practice seminar for image and video coding (Heiko Schwarz)
Schedule: Mon 12:00-14:00 (Class starts on: 2025-04-14)
Location: T9/053 Seminarraum (Takustr. 9)
-
Selected Topics in Data Science Technologies
0590aB3.5-
19326601
Lecture
Markov Chains (Katinka Wolter)
Schedule: Tue 12:00-14:00, Thu 10:00-12:00 (Class starts on: 2025-04-15)
Location: T9/Gr. Hörsaal (Takustr. 9)
Comments
In this course we will study stochastic models commonly used to analyse the performance of dynamic systems. Markov models and queues are used to study the behaviour over time of a wide range of systems, from computer hardware, communication systems, biological systems, epidemics, traffic networks to crypto-currencies. We will take a tour of the basics of Markov modelling, starting from birth-death processes, the Poisson process to general Markov and semi-Markov processes and solution methods for those processes. Then we will look at queueing models and queueing networks with exact and approximate solution algorithms. If time allows we will finally study some of the foundations of discrete event simulation.
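As a small illustration of the birth-death processes mentioned above, the following Python sketch computes the stationary distribution of a finite birth-death chain from the balance equations pi[k] * birth[k] = pi[k+1] * death[k]. The function name and the example rates (a single-server queue with room for three customers) are hypothetical and not taken from the course.

```python
def birth_death_stationary(birth_rates, death_rates):
    """Stationary distribution of a finite birth-death chain.

    birth_rates[k] is the rate from state k to k+1,
    death_rates[k] is the rate from state k+1 to k.
    """
    if len(birth_rates) != len(death_rates):
        raise ValueError("need one death rate per birth rate")
    # Unnormalized weights: w[0] = 1, w[k+1] = w[k] * birth[k] / death[k].
    weights = [1.0]
    for lam, mu in zip(birth_rates, death_rates):
        weights.append(weights[-1] * lam / mu)
    total = sum(weights)
    # Normalize so the probabilities sum to one.
    return [w / total for w in weights]


# Hypothetical queue: arrival rate 1.0, service rate 2.0, at most 3 customers.
arrival, service, capacity = 1.0, 2.0, 3
pi = birth_death_stationary([arrival] * capacity, [service] * capacity)
print([round(p, 4) for p in pi])
```

Because each state only exchanges probability flow with its neighbours, the product form above solves the chain without setting up the full generator matrix; general Markov chains need a linear solve or an iterative method instead.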
Suggested reading
William Stewart. Probability, Markov Chains, Queues and Simulation. Princeton University Press 2009.
-
19326602
Practice seminar
Practice seminar for Markov Chains (Justus Purat)
Schedule: Tue 14:00-16:00 (Class starts on: 2025-04-15)
Location: A6/SR 007/008 Seminarraum (Arnimallee 6)
-
Software Project Data Science
0590aB3.1-
19308312
Project Seminar
Implementation Project: Applications of Algorithms (Mahmoud Elashmawi)
Schedule: Thu 08:30-10:00 (Class starts on: 2025-04-10)
Location: T9/053 Seminarraum (Takustr. 9)
Comments
Contents
We choose a typical application area of algorithms, usually for geometric problems, and develop software solutions for it, e.g., computer graphics (representation of objects in a computer, projections, hidden edge and surface removal, lighting, raytracing), computer vision (image processing, filtering, projections, camera calibration, stereo-vision) or pattern recognition (classification, searching).
Prerequisites
Basic knowledge of the design and analysis of algorithms.
Suggested reading
Depends on the application area.
-
19314012
Project Seminar
Software Project: Semantic Technologies (Adrian Paschke)
Schedule: Wed 14:00-16:00 (Class starts on: 2025-04-16)
Location: A7/SR 031 (Arnimallee 7)
Additional information / Pre-requisites
Further information can be found on the course website
Comments
Mixed groups of master and bachelor students will either implement an independent project or are part of a larger project in the area of semantic technologies. They will gain in-depth programming knowledge about applications of semantic technologies and artificial intelligence techniques in the Corporate Semantic Web. They will practice teamwork and best practices in software development of large distributed systems and Semantic Web applications. The software project can be done in collaboration with an external partner from industry or standardization. It is possible to continue the project as bachelor or master thesis.
Suggested reading
-
19334212
Project Seminar
Software Project: Machine Learning and Explainability for Improved (Cancer) Treatment (Pauline Hiort)
Schedule: Tue 15:00-17:00, additional dates: see course details (Class starts on: 2025-02-26)
Location: T9/K40 Multimediaraum (Takustr. 9)
Comments
In the software project, we will implement, train, and evaluate various machine learning (ML) methods. The focus of the project is on neural networks (NN) and their explainability. We will compare the methods with different baseline models, such as regression models. The various ML methods will be applied to a specific dataset, e.g., for predicting drug combinations for cancer treatment, and evaluated accordingly. The dataset will be prepared by us and analyzed using the implemented methods. Additionally, we will focus on explainability to ensure that the predictions of the ML models are understandable and interpretable. For this purpose, we will integrate appropriate explainability techniques to better understand and visualize the decision-making processes of the models.
The programming language is Python, and we plan to use modern Python modules for ML such as scikit-learn and PyTorch. Good Python skills are required. The goal is to create a Python package that provides reusable code for preprocessing, training ML models, and evaluating results, with documentation (e.g., using Sphinx) for the specific use case. The software project takes place throughout the semester and can also be conducted in English.
-
-
Introduction to Profile Areas 0590aA1.1
-
Statistics for Students of Data Science 0590aA1.2
-
Machine Learning for Data Science 0590aA1.3
-
Programming for Data Science 0590aA1.4
-
Data Science in the Social Sciences 0590aB1.1
-
Mobile Mental Health 0590aB1.10
-
Developing Psychological Online Interventions 0590aB1.11
-
Selected Topics in Data Science in the Social Sciences 0590aB1.12
-
Special Aspects of Data Science in the Social Sciences 0590aB1.13
-
Distributed Systems 0590aB1.21
-
Telematics 0590aB1.23
-
Advanced Analysis 0590aB1.24
-
Computer Security 0590aB1.25
-
Pattern Recognition 0590aB1.26
-
Network-Based Information Systems 0590aB1.27
-
Artificial Intelligence 0590aB1.28
-
Special Aspects of Data Administration 0590aB1.29
-
Research Practice 0590aB1.3
-
Big Data Analysis in Bioinformatics 0590aB1.31
-
Neurocognitive Methods and Programming for Data Science 0590aB1.4
-
Cognitive Neuroscience for Data Science A 0590aB1.5
-
Cognitive Neuroscience for Data Science B 0590aB1.6
-
Differential Psychological Approaches in Data Sciences 0590aB1.7
-
Natural Language Processing 0590aB1.8
-
Introduction to Psychoinformatics 0590aB1.9
-
Selected Topics in Data Science in Life Sciences 0590aB2.5
-