Material for learning how to perform machine learning using PyTorch
This course was designed by Jack Atkinson (@jatkinson1000) and Jim Denholm (@jdenholm) of ICCS.
The material has been delivered at both the ICCS and NCAS summer schools.
All materials, including slides and videos, are available such that individuals can cover the course in their own time.
The key learning objective from this workshop could be simply summarised as:
'Provide the ability to develop ML models in PyTorch'.
However, more specifically we aim to:
With regard to specific ML content, we cover:
The slides for the workshop can be viewed at the following links:
They are generated from Markdown using Quarto.
The raw Markdown and HTML files can be found in the
Videos from past workshops may be useful if you are following along independently. These can be found on the ICCS YouTube channel under the 2023 Summer School materials.
The practical element of the course consists of 4 exercises demonstrating how to use both ANNs and CNNs to perform classification and regression.
The exercises take the form of partially completed Jupyter notebooks and can be found in the
We recommend the local install approach, especially if you forked the repository, as it is the easiest way to keep a copy of your work and push back to GitHub.
However, if you experience issues with the installation process or are unfamiliar with the terminal/installation process there is the option to run the notebooks in Google Colab.
Navigate to the location on your system where you want to install this repository and clone via HTTPS by running:
git clone https://github.com/Cambridge-ICCS/practical-ml-with-pytorch.git
This will create a directory
Please note that if you have a GitHub account and want to preserve any work you do we suggest you first fork the repository and then clone your fork. This will allow you to push your changes and progress from the workshop back up to your fork for future reference.
Before installing any Python packages it is important to first create a Python virtual environment. This provides an insulated environment inside which we can install Python packages without polluting the operating system's Python environment.
If you have never done this before don't worry: it is *very* good practice, especially when you are working on multiple projects, and easy to do.
python3 -m venv MLvenv
This will create a directory called `MLvenv` containing software for the virtual environment.
To activate the environment run:
source MLvenv/bin/activate
You can now work on Python from within this isolated environment, installing packages as you wish without disturbing your base system environment.
When you have finished working on this project run:
deactivate
to deactivate the venv and return to the system Python environment.
You can always boot back into the venv as you left it by running the activate command again.
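If you are unsure whether the environment is currently active, the following minimal sketch can help. It is not part of the course materials, and the function name `in_virtual_environment` is purely illustrative. It relies on the fact that inside a venv `sys.prefix` points at the venv directory while `sys.base_prefix` still points at the base installation.

```python
import sys


def in_virtual_environment() -> bool:
    """Return True when the running interpreter belongs to a virtual environment."""
    # Inside a venv, sys.prefix is the venv directory, whereas
    # sys.base_prefix remains the base Python installation.
    return sys.prefix != sys.base_prefix


if __name__ == "__main__":
    print(f"Interpreter: {sys.executable}")
    print(f"Inside a virtual environment: {in_virtual_environment()}")
```

Running this with the venv activated should report an interpreter path inside the `MLvenv` directory.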
It is now time to install the dependencies for our code, for example PyTorch. The project has been packaged with a pyproject.toml so it can be installed in one go.
From within the root directory in an active virtual environment run:
pip install .
This will download the relevant dependencies into the venv as well as setting up the datasets that we will be using in the course.
Whilst the workshop should install and run with the latest versions of Python libraries, it has been tested with the following versions of the major dependencies: torch 2.0.1, pandas 2.1.0, palmerpenguins 0.1.4, ipykernel 6.25.2, matplotlib 3.8.0, notebook 7.0.3.
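To see how your installed versions compare with the tested ones, a short check along the following lines may be useful. This is a sketch rather than a script shipped with the repository: the names `TESTED_VERSIONS`, `installed_version`, and `compare_versions` are hypothetical, and it uses only the standard-library `importlib.metadata` module.

```python
from importlib.metadata import PackageNotFoundError, version
from typing import Dict, Optional

# Versions the workshop was tested with, copied from the list above.
TESTED_VERSIONS: Dict[str, str] = {
    "torch": "2.0.1",
    "pandas": "2.1.0",
    "palmerpenguins": "0.1.4",
    "ipykernel": "6.25.2",
    "matplotlib": "3.8.0",
    "notebook": "7.0.3",
}


def installed_version(package: str) -> Optional[str]:
    """Return the installed version of `package`, or None if it is missing."""
    try:
        return version(package)
    except PackageNotFoundError:
        return None


def compare_versions() -> Dict[str, Optional[str]]:
    """Map each tested package name to the version found in this environment."""
    return {name: installed_version(name) for name in TESTED_VERSIONS}


if __name__ == "__main__":
    for name, found in compare_versions().items():
        tested = TESTED_VERSIONS[name]
        if found is None:
            status = "not installed"
        elif found == tested:
            status = "matches tested version"
        else:
            status = f"differs from tested version {tested}"
        print(f"{name:15s} {found or '-':10s} {status}")
```

A differing version is not necessarily a problem, since the workshop is expected to run with newer releases, but it is a sensible first thing to check if a notebook misbehaves.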
From the current directory, launch the Jupyter notebook server:
jupyter notebook
This command should then point you to the right location within your browser to use the notebook, typically http://localhost:8888/.
The following step is sometimes useful if you're having trouble with your Jupyter notebook finding the virtual environment. Before launching the Jupyter notebook run:
python -m ipykernel install --user --name=MLvenv
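To confirm that the kernel was registered, something like the sketch below could be run from the same environment. It assumes the `jupyter_client` package is available (it is normally installed alongside `notebook`), and the helper names `registered_kernels` and `kernel_is_registered` are illustrative only. The comparison is case-insensitive because Jupyter may report kernel names in lower case.

```python
import importlib.util
from typing import List

# The --name value passed to `ipykernel install` above.
KERNEL_NAME = "MLvenv"


def registered_kernels() -> List[str]:
    """Return the names of the Jupyter kernels visible to this environment."""
    # Fall back to an empty list if jupyter_client is not installed.
    if importlib.util.find_spec("jupyter_client") is None:
        return []
    from jupyter_client.kernelspec import KernelSpecManager

    # find_kernel_specs() maps kernel names to their resource directories.
    return sorted(KernelSpecManager().find_kernel_specs())


def kernel_is_registered(name: str = KERNEL_NAME) -> bool:
    """Check, ignoring case, whether a kernel called `name` is registered."""
    return name.lower() in {kernel.lower() for kernel in registered_kernels()}


if __name__ == "__main__":
    print("Registered kernels:", registered_kernels())
    print(f"{KERNEL_NAME} registered: {kernel_is_registered()}")
```

If the kernel is listed, it should also appear in the kernel menu of the notebook interface.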
Running on Colab is useful as it allows you to access GPU resources.
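Whether a GPU is actually visible depends on the runtime you are given, so it can be worth checking from a notebook cell. The sketch below is an assumption-laden illustration rather than part of the exercises: `describe_device` is a hypothetical helper, and it falls back gracefully when `torch` is not installed.

```python
import importlib.util


def describe_device() -> str:
    """Report which device PyTorch would be able to use in this session."""
    # Avoid an ImportError in environments where torch is not installed.
    if importlib.util.find_spec("torch") is None:
        return "torch not installed"
    import torch

    # torch.cuda.is_available() is True only when a CUDA GPU is usable.
    return "cuda" if torch.cuda.is_available() else "cpu"


if __name__ == "__main__":
    print(f"Available device: {describe_device()}")
```

On Colab the result may depend on the runtime type selected for the notebook.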
To launch the notebooks in Google Colab click the following links for each of the exercises:
Notes:
To run the notebooks in Binder click the following link:
Notes:
Worked solutions for all of the exercises can be found in the
If you were working on Colab you can open the worked solutions using the following links:
To get the most out of the session we assume a basic understanding of a few areas and ask that you do some preparation in advance. Expected knowledge is outlined below, along with resources for reading if you are unfamiliar.
Basic mathematics knowledge:
Neural Networks:
The course will be taught in Python using PyTorch.
Whilst no prior knowledge of PyTorch is expected, we assume users are familiar with the basics of Python 3.
This includes:
Unless participating via Colab or Binder you will be expected to know how to:
The workshop from the 2022 ICCS Summer School should provide the necessary knowledge.
We have linked suitable applications for Windows in the above lists.
However, you may wish to refer to Windows' getting-started with Python information for a complete guide to getting set up on a Windows system.
If you require assistance or further information with any of these please reach out to us before a training session.
This workshop has been published in JOSE, the Journal of Open Source Education, with DOI 10.21105/jose.00239. The paper materials can be found in the JOSE_paper/ directory.
If you re-use or build on this material please cite this publication using the information in the CITATION.cff file.
@article{Atkinson2024,
  doi = {10.21105/jose.00239},
  url = {https://doi.org/10.21105/jose.00239},
  year = {2024},
  publisher = {The Open Journal},
  volume = {7},
  number = {76},
  pages = {239},
  author = {Jack Atkinson and Jim Denholm},
  title = {Practical machine learning with PyTorch},
  journal = {Journal of Open Source Education}
}
The code materials in this project are licensed under the MIT license.
The teaching materials are licensed under CC BY-NC-SA 4.0.
If you spot an issue with the materials please let us know by opening an issue on GitHub clearly describing the problem.
If you are able to fix an issue that you spot, or an existing open issue, please get in touch by commenting on the issue thread.
Contributions from the community are welcome. To contribute back to the repository please first fork it, make the necessary changes to fix the problem, and then open a pull request back to this repository clearly describing the changes you have made. We will then perform a review and merge once ready.
If you would like support using these materials, adapting them to your needs, or delivering them please get in touch either via GitHub or via ICCS.