Course Description

Basic and modern topics in unsupervised learning (clustering and feature selection) will be covered in this course. Basic topics include K-means and soft K-means clustering, K-medoids, hierarchical clustering, self-organizing maps, principal components, graph theory, kernel principal components, multidimensional scaling, and validation methods. Possible modern topics include spectral clustering, ISOMAP, locally linear embedding, diffusion maps, Laplacian eigenmaps, neural-network-related methods, density-based clustering, clustering by passing messages, etc. The focus will be on interpretable methods suitable for problems in the natural sciences, rather than on black-box methods.
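
To give a taste of the first basic topic, here is a minimal K-means sketch in Python. The course does not prescribe a language or library; NumPy, scikit-learn, and the synthetic data below are illustrative assumptions only.

```python
# A minimal K-means illustration on two synthetic Gaussian blobs.
import numpy as np
from sklearn.cluster import KMeans

rng = np.random.default_rng(0)
# 50 points around (0, 0) and 50 points around (5, 5) in 2D.
X = np.vstack([rng.normal(0, 1, (50, 2)),
               rng.normal(5, 1, (50, 2))])

km = KMeans(n_clusters=2, n_init=10, random_state=0).fit(X)
print(km.labels_[:10])        # cluster assignment of the first 10 points
print(km.cluster_centers_)    # estimated cluster centres
```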

Class Plan

First meeting: Rm 306, Hus 6, Kräftriket, 10:30am - 12:30pm, Mar 25 (Wed) 2020.

First 1/3 (4 lectures): Basic topics of unsupervised learning given in the form of lectures.

Second 1/3 (4 to 5 meetings, depending on the number of students): Modern methods in unsupervised learning, covered through journal-paper reviews and discussions led by students.

Last 1/3 (4 to 5 meetings, depending on the number of students): Projects carried out using the methods learned in the course. Project presentations will be given by students.

Course Reference

A small part of the book “The Elements of Statistical Learning” by Hastie, Tibshirani & Friedman will be used in the first four classes. A list of recent journal papers on unsupervised learning will be recommended during the course.

Grading 

Grades will be assigned based on students’ performance on the journal reviews and projects. No written exam will be given.

Prerequisites 

Multivariable calculus, linear algebra, basic knowledge of optimization, and basic probability theory.

The first session will be Friday, October 4th, 10-12, in room 306, house 6 (the Cramér room), and the course will then run with one session per week until mid-December.

The main text for the course will be "Statistical Learning with Sparsity: The Lasso and Generalizations" by Hastie, Tibshirani and Wainwright (https://web.stanford.edu/~hastie/StatLearnSparsity/).

The course is planned to cover most of Chapters 1-8 of this book. Additional material covering the basics of convex sets, convex functions and convex optimisation will also be included, as well as additional material on proximal algorithms for solving convex optimisation problems relevant to the course contents.
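
To give a flavour of the proximal-algorithm material, below is a minimal proximal-gradient (ISTA) sketch for the lasso in Python. The language choice, function names, and synthetic data are illustrative assumptions, not the book's or the course's own code.

```python
# ISTA sketch for the lasso: min_b (1/2n)||y - Xb||^2 + lam*||b||_1.
import numpy as np

def soft_threshold(z, t):
    # Proximal operator of t*||.||_1 (elementwise soft-thresholding).
    return np.sign(z) * np.maximum(np.abs(z) - t, 0.0)

def lasso_ista(X, y, lam, n_iter=500):
    n, p = X.shape
    b = np.zeros(p)
    # Step size 1/L, where L = sigma_max(X)^2 / n is the Lipschitz
    # constant of the gradient of the smooth least-squares term.
    step = 1.0 / (np.linalg.norm(X, 2) ** 2 / n)
    for _ in range(n_iter):
        grad = -X.T @ (y - X @ b) / n          # gradient of the smooth part
        b = soft_threshold(b - step * grad, step * lam)
    return b

# Tiny usage example with a sparse ground truth.
rng = np.random.default_rng(0)
X = rng.normal(size=(100, 10))
beta = np.zeros(10)
beta[:3] = [3.0, -2.0, 1.5]
y = X @ beta + 0.1 * rng.normal(size=100)
print(np.round(lasso_ista(X, y, lam=0.1), 2))
```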

The course will have a hands-on perspective, with exercises and computer assignments, rather than plunging deeply into theory.

Prerequisites are multivariable calculus, linear algebra, basic knowledge of optimisation, and statistics including regression and preferably logistic regression.