TIES4200 Natural Language Processing (5 credits)

Study level:
Advanced studies
Grading scale:
0-5
Language of completion:
English
Responsible organisation:
Faculty of Information Technology
Curriculum periods:
2024-2025, 2025-2026, 2026-2027, 2027-2028

Description

Modern natural language processing (NLP) techniques, including high-profile models such as BERT and GPT, use (large-scale) language modelling to create foundation models that can be adapted to different tasks. This course gives a language-modelling-focused introduction to NLP. Practical exercises include implementing scaled-down versions of the algorithms used by these models and making use of high-level NLP libraries. Students also complete a final project of their choice.


The course includes

  • Foundational material on rule-based and traditional statistical approaches to NLP, their drawbacks and limitations, and how they relate to current language-modelling-based methods
  • An introduction to neural sequence models, building up to attention and the transformer architecture
  • Text classification and regression with pretrained language models
  • Material and exercises on the evaluation of NLP systems and language models
  • The link between linear algebra and text: (subword) tokenisation and encoding/decoding
  • Topical material on emerging techniques and issues, which may include one or more of: explainability; reinforcement learning from human feedback for InstructGPT/ChatGPT-style models; curation of massive training corpora for large language models; and prompt engineering

Learning outcomes

On completion of the course, the student will:

  • Understand why current systems have converged on language modelling as a key objective
  • Have some practical know-how for building NLP systems on top of existing library code
  • Be able to modify and reimplement algorithms underlying generative language models
  • Be able to empirically evaluate the performance of NLP systems
  • Have gained some skills for working on and presenting practical projects involving NLP

Description of prerequisites

Basic-to-intermediate programming skills and basic knowledge of Python. High-school-level mathematics and introductory linear algebra, such as vectors, matrices, and their products.

Study materials

Dan Jurafsky and James H. Martin, Speech and Language Processing (3rd ed. draft)

Completion methods

Method 1

Description:
Taught course completed through assignments and project work.
Assessment criteria:
Assignments and project work.
Select all marked parts
Parts of the completion methods

Participation in teaching (5 credits)

Type:
Participation in teaching
Grading scale:
0-5

Teaching