MATA6200 Topics in Mathematics of Data Science (4 ECTS)
Description
Content
This course covers the basics of optimization and computational linear algebra used in Data Science.
(1st part) Introduction: the classification problem and examples (imaging, shape analysis, word mover's distance, etc.). Brief recall of some elements of Linear Algebra. Linear systems: least squares, linear regression, singular value decomposition / principal component analysis, Rayleigh quotients, K-means.
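For instance, a typical problem in this part is the least-squares problem

  \min_{x} \|Ax - b\|_2^2,

where A is a data matrix and b an observation vector; its solutions satisfy the normal equations A^{\top}A\,x = A^{\top}b, and the singular value decomposition A = U\Sigma V^{\top} yields the minimum-norm solution x = V\Sigma^{+}U^{\top}b.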
(2nd part) Convergence: local and global optima. Some elements of Convex Analysis: convexity and smoothness, Lipschitz functions, strong convexity. Gradient descent methods, Newton's method.
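For instance, given a step size \eta > 0, the gradient descent iteration for a differentiable objective f reads

  x_{k+1} = x_k - \eta\,\nabla f(x_k),

while Newton's method replaces \eta by the inverse Hessian, x_{k+1} = x_k - [\nabla^2 f(x_k)]^{-1}\nabla f(x_k); convexity, smoothness and strong convexity are the assumptions under which convergence rates for such iterations are established.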
(3rd part) Metric Learning / Cost functions: distances (Euclidean, Earth Mover's distance) and f-divergences (e.g., Kullback-Leibler). Properties on the real line; study of the particular case of the space of Gaussian distributions in 1d.
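For instance, for two one-dimensional Gaussians the Kullback-Leibler divergence admits the closed form

  D_{\mathrm{KL}}\big(\mathcal{N}(\mu_1,\sigma_1^2)\,\|\,\mathcal{N}(\mu_2,\sigma_2^2)\big) = \log\frac{\sigma_2}{\sigma_1} + \frac{\sigma_1^2 + (\mu_1 - \mu_2)^2}{2\sigma_2^2} - \frac{1}{2},

which makes the space of 1d Gaussian distributions a convenient setting in which to compare such cost functions explicitly.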
Time allowing, further topics can be discussed, for instance: linear programming and the Sinkhorn algorithm, neural networks, generative models, zero-sum games, etc.
Completion methods
We expect the students to be very active in the development of the course. Grades will be based on the homework (1/2) and a final exam (1/2). The students will receive short material to work on before the lectures, and about three or four homework assignments will be provided for personal study after the lectures.
Learning outcomes
• Study convergence of basic algorithms used in Applied Mathematics and Data Science.
• Understand some theoretical issues in Machine Learning algorithms and be aware of open problems in the field.
• Reinforce some concepts studied in Calculus and Linear Algebra by applying them to concrete problems.
• Be familiar with some concepts and theorems in Convex Analysis.
• Be introduced to some mathematical ideas and concepts which are developed further in a master's degree in mathematics.
Further information
The goal of this course is to introduce some mathematical aspects of Data Science to undergraduate students in mathematics. In particular, we aim to develop mathematical tools that are crucial for understanding basic theoretical issues in supervised and unsupervised learning.
This course focuses on fundamental aspects of the theory and the formalism, rather than on different types of algorithms and recipes for solving a specific problem. One of the course goals is to present some applications of Calculus and Linear Algebra (studied in the first year of the mathematics degree) in an applied context.
Programming: No programming skills are required and there will be no coding homework in this course.
Description of prerequisites
Study materials
Part of the course material is inspired by selected sections of the following books.
- On Convex Analysis:
1. Sébastien Bubeck. Convex Optimization: Algorithms and Complexity. Foundations and Trends in Machine Learning, Vol. 8, No. 3-4 (2015), 231-358.
2. Stephen Boyd and Lieven Vandenberghe. Convex Optimization. Cambridge University Press. Available at: https://web.stanford.edu/~boyd/cvxbook/
- On Linear Algebra:
3. Gilbert Strang. Introduction to Linear Algebra, Fifth Edition, 2016.
- Complementary material (selected topics):
4. Lindsay I. Smith. A tutorial on Principal Components Analysis. Available at: https://www.cs.otago.ac.nz/cosc453/student_tutorials/principal_components.pdf
5. Thomas M. Cover and Joy A. Thomas. Elements of Information Theory, 2nd Edition (Wiley Series in Telecommunications and Signal Processing), 2006.