ISEA2013 Data Engineering (5 cr)
Study level:
Intermediate studies
Grading scale:
0-5
Completion language:
English
Responsible organisation:
Faculty of Information Technology
Curriculum periods:
2026-2027, 2027-2028
Description
-
Learning outcomes
By the end of the course, students will be able to:
- Choose an approach that fits the data and the goal. Explain whether a problem needs batch processing (periodic jobs) or streaming (continuous updates), and justify your choice based on data size, speed, and accuracy needs.
- Build a reliable data pipeline from end to end. Ingest raw data, clean and transform it, store the result, and deliver it to users or applications—adding checks and simple monitoring so that mistakes are caught early.
- Store data in formats that make work faster and safer. Pick between row‑based and column‑based layouts, use modern “table” structures common in data lakes and warehouses, and handle changes to the schema without breaking existing users.
- Make databases work for you, not against you. Describe how a database answers a query (for example, using indexes and joins), spot why a query is slow, and make at least one concrete change that speeds it up.
- Process data in real time without losing or double‑counting records. Design a simple streaming job that copes with late arrivals, system restarts, and duplicates so results remain correct.
- Support a small machine‑learning feature and prediction flow. Create features at scale, connect a basic prediction service to your data, and track simple health signals such as accuracy and response time.
- Improve performance and keep costs under control. Measure how your job uses memory, compute, and storage across a cluster; then apply one change that measurably reduces runtime or cost.
- Protect people’s data and document your choices. Apply access controls, remove or mask personal information when needed, and write a short “data sheet” that states assumptions, risks, and how to use the dataset responsibly.
- Collaborate effectively in a small team. Use issues, reviews, and a clear repository structure; communicate results with concise writing, visuals, and short demonstrations.
- Reflect on ethical and societal impacts. Consider how data‑driven features affect people and society, and define responsible deployment criteria for your project.
Completion methods
Method 1
Evaluation criteria:
The grade is based on completed assignments, self-evaluations, and each student's evaluation of the group work.
Select all marked parts
Parts of the completion methods
x
Participation in teaching (5 cr)
Type:
Participation in teaching
Grading scale:
0-5
Completion language:
English