We run a journal club to discuss data science topics every Friday at 2 pm. To join our mailing list and receive notifications, please send an empty email to odsl-subscribe(at)lists.lrz.de or visit this website: https://lists.lrz.de/mailman/listinfo/odsl.
If you have ideas for topics to discuss, feel free to propose them in the following Google Doc: http://bit.ly/odsljc20.
We are organising our next set of block courses from September 6th to 16th, 2021, under the title Practical Inference for Researchers in the Physical Sciences.
This session consists of two one-week courses:
1. Monte Carlo inference methods
Introduction to Bayesian inference with physical models. Parameter uncertainties, degeneracies and knowledge updates. Model comparison and criticism. Modern Monte Carlo algorithms for Bayesian inference in practice and probabilistic computation packages: Importance Sampling, Markov Chain Monte Carlo, Nested Sampling.
2. Bayesian workflow
Bayesian thinking, going from a science question to a generative statistical model, defining sensible priors, verification through simulations, diagnosing problems in models and computation, robust decision making, experiment design.
The courses will be held online and organised by Johannes Buchner and Francesca Capel. We plan to offer credits to both TUM and LMU students.
For more information and to register please visit: https://indico.ph.tum.de/event/6875/
We are opening the second call for proposals from Origins Cluster scientists to collaborate with the ODSL team on data analysis projects. Our team has a wide range of expertise in applied statistics and can offer dedicated support to help you make the most of your data. We are looking for scientific projects with flexible durations, anything from a few weeks to many months.
Our core team currently consists of two postdocs, Francesca Capel and Jakob Knollmüller, and a PhD student. We are also joined by the four ODSL fellows: Johannes Buchner, Philipp Eller, Nahuel Ferreiro and Oliver Schulz. Together we have experience in a variety of data analysis topics, including Bayesian analysis, Monte Carlo methods, hierarchical modelling, machine learning, likelihood-free inference and variational inference, to give some examples.
In return, we expect acknowledgement or authorship on resulting publications, depending on the level of involvement. We want to make it clear that we are not offering help in setting up computing environments or basic software, nor are we a high-performance computing facility.
Proposals are welcome from all Origins member scientists but must be endorsed by an Origins PI. Proposals should include
If you have any questions regarding the proposals, then please contact us at odsl-team(at)origins-cluster.de. We are happy to discuss and help you to formulate your request. A selection committee made up of the ODSL core team, fellows, and Origins scientists from different disciplines will evaluate the proposals promptly after the submission deadline.
The deadline for this call is October 31st 2021.
Please send your proposal in English as a single pdf file to odsl-team(at)origins-cluster.de.
An introductory C++ course will take place from 8.3. to 19.3.2021, with the exam on 19.3.
Moodle page: https://www.moodle.tum.de/course/view.php?id=64027
Students may also register as guests (please contact alice.smith-gicklhorn@origins-cluster.de to get access).
The Moodle page provides information on the course, as well as slides and literature for download.
The course takes place online via Zoom:
https://cern.zoom.us/j/63382034823?pwd=VHF3T1VCOWpybjN4MmRFWUcySU5SQT09
Meeting ID: 633 8203 4823
Passcode: (please contact alice.smith-gicklhorn@origins-cluster.de to get the passcode)
Dates:
8.3.2021-19.3.2021 (exam: 19.3.2021, 3 ECTS credits)
Times:
10:00 to 12:00 or 12:30: lectures (2 lectures, 20 slides each)
14:00 to 16:00 or 17:00: practical part (students program; the lecturer answers their questions)
Lecturer: Sergei Gerassimov
Language: English
We are organizing two block courses from March 1 to 11, 2021. They introduce statistical methods as well as Monte Carlo methods.
The courses, as well as the tutorials, will take place online. The tutorials will be organised through breakout rooms, each assigned to a tutor.
The block courses follow the schedule:
All exercises will be made available before the courses and the deadline to hand in the report is March 31.
For students successfully completing both Block Courses, ECTS points can be awarded.
Lecturer: Prof. Allen Caldwell
Topics: Derivation and application of the most commonly used statistical distributions, Central Limit Theorem, point estimates, confidence intervals, test statistics, p-values and related topics.
Lecturer: Prof. Allen Caldwell
Topics: Variable transformations, accept-reject methods, sample mean, importance sampling, random walks, Markov Chain Monte Carlo and applications.
Registration and further details can be found here: https://indico.ph.tum.de/event/6797/
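As a rough illustration of two of the listed topics, the sketch below applies accept-reject sampling and importance sampling to a toy problem. It is not taken from the course material: the target density p(x) = 2x on [0, 1], the uniform proposal, and all function names are assumptions chosen so that the exact answer (a mean of 2/3) is known.

```python
import random


def target_pdf(x):
    """Toy target density p(x) = 2x on [0, 1] (assumed for illustration)."""
    return 2.0 * x if 0.0 <= x <= 1.0 else 0.0


def accept_reject(n, rng, pdf_max=2.0):
    """Draw n samples from target_pdf using a uniform proposal on [0, 1)."""
    samples = []
    while len(samples) < n:
        x = rng.random()            # candidate from the uniform proposal
        u = rng.random() * pdf_max  # uniform height under the envelope
        if u < target_pdf(x):       # keep the point if it lies under p(x)
            samples.append(x)
    return samples


def importance_mean(n, rng):
    """Estimate E_p[x] from uniform draws weighted by p(x) / q(x)."""
    total = 0.0
    for _ in range(n):
        x = rng.random()            # draw from the proposal q(x) = 1
        weight = target_pdf(x)      # importance weight p(x) / q(x)
        total += weight * x
    return total / n


rng = random.Random(42)  # fixed seed so the run is reproducible
ar_samples = accept_reject(20000, rng)
ar_mean = sum(ar_samples) / len(ar_samples)
is_mean = importance_mean(20000, rng)
print(f"accept-reject mean: {ar_mean:.3f}")
print(f"importance-sampling mean: {is_mean:.3f}")
```

Both estimates should land close to the exact value of 2/3. Accept-reject produces actual samples from the target, whereas importance sampling only reweights proposal draws to estimate an expectation.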
We are opening a call for proposals from Origins Cluster scientists to collaborate with the ODSL team on data analysis projects. Our team has a wide range of expertise in applied statistics and can offer dedicated support to help you make the most of your data. We are looking for scientific projects with flexible durations, anything from a few weeks to many months.
Our core team consists of three postdocs: Francesca Capel, Philipp Eller and Jakob Knollmüller. We are also joined by the two ODSL fellows, Johannes Buchner and Oliver Schulz. Together we have experience in a variety of data analysis topics, including Bayesian analysis, Monte Carlo methods, hierarchical modelling, machine learning, likelihood-free inference and variational inference, to give some examples.
What we offer
In return, we expect acknowledgement or authorship on resulting publications, depending on the level of involvement. We want to make it clear that we are not offering help in setting up computing environments or basic software, nor are we a high-performance computing facility.
Proposal guidelines
Proposals are welcome from all Origins member scientists but must be endorsed by an Origins PI. Proposals should include
If you have any questions regarding the proposals, then please contact us at odsl-team(at)origins-cluster.de. We are happy to discuss and help you to formulate your request. A selection committee made up of the ODSL core team, fellows, and Origins scientists from different disciplines will evaluate the proposals promptly after the submission deadline.
Proposal deadline
The deadline for this call is October 31st 2020. We anticipate regular calls, with the next call early next year.
Submit a proposal
Please send your proposal in English as a single pdf file to odsl-team(at)origins-cluster.de.
UPDATE: The proposals for this period have now been assigned. We will consider further proposals in Spring 2021.
The Origins Data Science Lab (ODSL) is organizing two block courses of three afternoons each on data science topics.
Each block consists of six lectures of one hour, followed by the possibility to work on a set of problems, including small calculations and implementations.
In this course we will introduce the basic concepts of reasoning under uncertainty. After a brief introduction to probability theory and commonly used probability distributions, we discuss inference tasks with various probabilistic models. We conclude by outlining methods to approach more involved inference tasks through approximation or sampling.
Lecturer: Jakob Knollmüller
Prerequisites: Linear Algebra, basic Analysis, a programming language of choice
Skills acquired: basics of probabilistic reasoning and Bayesian inference, probabilistic modelling, model comparison, approximate inference
This course focuses on methods for data processing, optimization and machine learning. First, we will learn the basics of data decorrelation, reduction and optimization algorithms. Building on these skills, we dive into machine learning topics such as clustering, classification and regression with tree-based algorithms and neural networks. In the last part, deep learning models and different architectures will be introduced and explained.
Lecturer: Dr. Philipp Eller
Prerequisites: Linear Algebra, basic Analysis, a programming language of choice
Skills acquired: basic data transformations, knowledge in various optimization algorithms, k-means clustering, decision trees, neural networks, convolutional neural networks, auto-encoders, generative models
It is possible to get a Certificate of Participation or ECTS points for participation in the Block Courses:
To get a Certificate of Participation (for either one of the two blocks or both), you will need to turn in solutions to the exercises that will be assigned during the course and get a passing grade. The certification will be done on a course-by-course basis, and will state that you have successfully completed the Block Course in the respective topic. Please register for the course in advance so we can estimate how much work will be involved in the evaluation of the reports.
To get the 3 ECTS points, you will need to turn in solutions to the exercises for both Block Courses that will be offered this year. The grade for the course will be based on the two sets of exercises, and there will not be an additional exam. The deadline to hand in the report is September 30, 2020. Please register for the courses in advance so we can estimate how much work will be involved in the evaluation of the reports.
For more information and to register please visit https://indico.ph.tum.de/event/4491/
The workshop will cover both introductory and advanced topics in statistical sampling and clustering. In addition to talks on the current state of the art, the workshop will include hands-on and exercise sessions.
The workshop is hosted by the Max Planck Institute for Physics (MPP), organised by INSIGHTS ITN, MPP IMPRS and the ORIGINS Excellence Cluster, and is open to all members of these organisations.
Due to Covid-19, all lectures, exercise sessions and social interactions will be held online.
Depending on how the current situation develops, it may still be possible to travel to the MPP to meet in person.
Participation is free of charge, but all participants should register by September 20, 2020.
More information: https://indico.mpp.mpg.de/event/7494/overview
To reconstruct a good image of a spatially varying quantity, a field, from incomplete and noisy measurement data, the measurements must be combined with knowledge of the field's general physical properties, such as its smoothness, correlation structure, or freedom from divergence. Information field theory uses the elegant formalism of field theories to mathematically derive optimal Bayesian imaging algorithms for a wide variety of measurement situations. These algorithms can be implemented efficiently and generically with the "Numerical Information Field Theory" (NIFTy) software package. Algorithms that use NIFTy are already applied in radio and gamma-ray astronomy, for example. NIFTy is currently developing into a universally applicable tool for imaging problems in astronomy, particle physics, medicine and other fields.
The Bayesian Analysis Toolkit, BAT, is a software package which is designed to help solve statistical problems encountered in Bayesian inference. BAT is based on Bayes' Theorem and is currently realized with the use of Markov Chain Monte Carlo. This gives access to the full posterior probability distribution and enables straightforward parameter estimation, limit setting and uncertainty propagation. Novel sampling methods, optimization schemes and parallelization are example development areas.
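To make the idea of sampling a posterior with Markov Chain Monte Carlo concrete, here is a minimal random-walk Metropolis sketch in plain Python. It does not use BAT and does not reflect BAT's API. The toy data, the flat prior, the Gaussian likelihood with known width, and all function names are assumptions made for illustration only.

```python
import math
import random


def log_posterior(mu, data, sigma=1.0):
    """Log posterior of mu up to a constant: flat prior, Gaussian likelihood."""
    return -0.5 * sum((d - mu) ** 2 for d in data) / sigma ** 2


def metropolis(log_post, start, n_steps, step_size, rng):
    """Random-walk Metropolis sampler for a one-dimensional parameter."""
    chain = []
    current = start
    current_lp = log_post(current)
    for _ in range(n_steps):
        proposal = current + rng.gauss(0.0, step_size)
        proposal_lp = log_post(proposal)
        # Accept with probability min(1, posterior ratio);
        # 1 - random() lies in (0, 1], so the logarithm is always defined.
        if math.log(1.0 - rng.random()) < proposal_lp - current_lp:
            current, current_lp = proposal, proposal_lp
        chain.append(current)
    return chain


data = [4.8, 5.1, 5.3, 4.9, 5.4]  # hypothetical measurements
rng = random.Random(1)            # fixed seed for reproducibility
chain = metropolis(lambda mu: log_posterior(mu, data),
                   start=0.0, n_steps=20000, step_size=0.5, rng=rng)
posterior = sorted(chain[2000:])  # discard the first 2000 steps as burn-in

post_mean = sum(posterior) / len(posterior)          # parameter estimate
upper_95 = posterior[int(0.95 * len(posterior))]     # 95% upper limit
print(f"posterior mean: {post_mean:.2f}")
print(f"95% upper limit: {upper_95:.2f}")
```

Because the chain approximates the full posterior, point estimates and limits are simply summaries of the stored samples. For this toy model, the posterior is a Gaussian centred on the sample mean of the data, which provides a check on the sampler.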
We will build a platform to host and combine overarching information on experimental studies, astronomical observations and theoretical modeling of Dark Matter (DM) to facilitate the combination and cross-correlation of existing and forthcoming data. This data centre will allow tests of DM candidates against all existing benchmarks in cosmology, astro- and particle physics, experiments and theory. We plan to probe for tensions between different data sets and theories that point towards new, hidden properties of DM. The data centre will make the data available to the international community for further global analysis and model benchmarking, following examples in astro- and high-energy physics.