Online: Practical machine learning for spatial data

This course gives a practical introduction to machine learning for spatial data, covering both shallow learning and deep learning models, especially convolutional neural networks (CNN).

The course consists of lectures and hands-on exercises. The exercises use different Python libraries. Scikit-learn will be used for the shallow learning exercises, in Notebooks or on local PCs. Keras and solaris with PyTorch will be used for the deep learning exercises on Puhti-AI.

Learning outcome

After the course, participants should have the skills and knowledge needed to begin applying machine learning and deep learning to different tasks, and to use the GPU resources available at CSC for training and deploying their own neural networks.

Prerequisites

  • Basics of geoinformatics: data types, formats.
  • Basics of Python. The course includes a fair amount of reading and writing Python code, so you should be able to follow Python syntax. If you need to refresh your Python skills, you can go through the materials of the Helsinki University GeoPython course.
  • Very basic Linux commands: cd, ls, mv, cp, rm, chmod, less, tail, echo, mkdir, pwd. If these are unfamiliar, take a look at, for example, the first two modules of LinuxSurvival.

The course is similar to the Practical machine learning for spatial data course held in autumn 2019.

Preliminary program

Monday 9.11.2020

9:00-12:00

Introduction

Lecture: Introduction to machine learning
Lecture: Introduction to exercises, preparing spatial data for machine learning
Exercise 1: Preparing vector data for regression
Exercise 2: Preparing raster data and labels for clustering and classification

12:00-13:00

Lunch break

13:00-16:15

Lecture: Shallow machine learning models
Exercise 3: Shallow regression with scikit-learn
Exercise 4: Image segmentation using k-means with scikit-learn
Exercise 5: Image classification using shallow classifiers, grid search with scikit-learn

Tuesday 10.11.2020

9:00-12:00

Lecture: Introduction to deep learning models
Lecture: Fully connected neural networks
Lecture: Puhti GPUs and batch jobs
Exercise 6: Fully connected regressor with keras
Exercise 7: Fully connected classifier with keras

12:00-13:00

Lunch break

13:00-16:15

Lecture: Convolutional neural networks (CNN)
Exercise 8: CNN based image segmentation with keras

Wednesday 11.11.2020

9:00-12:00

Lecture: GIS software supporting machine learning for spatial data
Lecture: Introduction to solaris
Exercise 9: CNN based image segmentation with solaris
Conclusions

12:00-13:00

Lunch break

13:00-15:00

Reserved for possible delays in the timetable.

There will also be short breaks during the morning and afternoon sessions.

Course exercise materials (under development): https://github.com/csc-training/geocomputing/tree/master/machineLearning

Technical requirements for participants' local computers:

Minimum:

  • Zoom
  • For working with Puhti:
    • In Windows: PuTTY and WinSCP/FileZilla, or some other similar tool.
    • In Mac and Linux: FileZilla, if you are not familiar with moving files with scp.
  • ArcGIS, QGIS or some other GIS tool for viewing the files (GeoTIFF, JPEG 2000, Shapefile, GeoPackage).
  • Spyder, PyCharm or some other Python IDE, or Notepad++ or some other text editor. For Windows users it is important that the tool can change the end-of-line character type from Windows to Linux. (Basic Notepad and Word are not suitable.)
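
If a script edited on Windows has already been saved with Windows line endings, it can also be converted afterwards. The following is only a minimal sketch, not part of the course materials: it assumes a plain text file, and the function names `to_unix_line_endings` and `convert_file` as well as the file name in the usage note are hypothetical.

```python
from pathlib import Path


def to_unix_line_endings(data: bytes) -> bytes:
    """Replace Windows (CRLF) line endings with Linux (LF) line endings."""
    return data.replace(b"\r\n", b"\n")


def convert_file(path: str) -> None:
    """Rewrite the file at `path` in place with Linux line endings.

    Works on raw bytes so that the text encoding of the file is left untouched.
    """
    file_path = Path(path)
    file_path.write_bytes(to_unix_line_endings(file_path.read_bytes()))
```

For example, `convert_file("my_batch_job.sh")` would rewrite a hypothetical batch job script in place. Lines that already end with a plain LF are left unchanged.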

Optional:

Location: Zoom