FTorch provides a torch_optim derived type exposing the
functionality of the torch::optim C++ class.
The interface is designed to be familiar to Fortran programmers, whilst retaining strong
similarity with the torch::optim class and the torch.optim Python package.
This includes default values for optional tuning parameters.
This module enables Fortran programmers to apply optimization steps to tensors using familiar optimizers such as SGD, Adam, and AdamW through a Fortran-friendly interface.
The torch_optim type holds a pointer to a PyTorch
optimizer object in C++ (implemented using c_ptr from the iso_c_binding intrinsic module).
This avoids unnecessary data copies and provides direct access to Torch's
optimization capabilities.
FTorch currently provides three optimizer constructors, each corresponding to a popular PyTorch optimization algorithm. All of these are created by a call to a specific subroutine that takes as inputs a torch_optim type to assign the created optimizer to, and an array of torch_tensor objects to optimize.
In addition, each constructor accepts a number of optional tuning parameters, with defaults set to match those of PyTorch. Details of these appear in the API pages linked below.
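As an illustration, the sketch below makes two calls to the SGD constructor: one relying entirely on the PyTorch defaults and one overriding the learning rate. The constructor name and the learning_rate keyword match the usage example later on this page; that the optional arguments may simply be omitted is an assumption based on the defaults described above, and the full set of keywords should be checked against the API pages.

```fortran
! Sketch: constructing an SGD optimizer for an array of tensors.
! Assumes `params` already holds tensors created with gradient
! tracking enabled, as required for optimization.
type(torch_optim) :: optimizer
type(torch_tensor) :: params(1)

! All optional tuning parameters left at their PyTorch defaults
call torch_optim_SGD(optimizer, params)

! The same constructor with an explicit learning rate
call torch_optim_SGD(optimizer, params, learning_rate=0.01D0)
```

The Adam and AdamW constructors follow the same pattern, differing only in which tuning parameters they accept.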
Whilst different subroutines exist for creating different kinds of optimizer, the resulting objects all share a common set of core methods.
torch_optim_zero_grad clears the gradients of all
parameters managed by the optimizer. This should be called at the beginning of each
iteration during training.
The method is implemented as a procedure bound to the
torch_optim type and can be called as:
optimizer%zero_grad().
torch_optim_step performs a single optimization step, updating all parameters managed by the optimizer based on their gradients. It should be called after backpropagation has been performed following a forward pass during a training iteration.
The method is implemented as a procedure bound to the
torch_optim type and can be called as:
optimizer%step().
torch_optim_delete deallocates the memory associated with an optimizer. It is implemented as the finalizer of the torch_optim type, so it will be called automatically when the optimizer goes out of scope. See the Fortran-lang page on object-oriented Fortran for further details about finalization.
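To illustrate, the sketch below scopes an optimizer inside a Fortran block construct; because torch_optim_delete is the type's finalizer, it runs automatically when execution leaves the block. The params array is assumed to exist already.

```fortran
! Sketch: scoping an optimizer so its finalizer runs automatically.
! Assumes `params` is an existing array of torch_tensor objects.
block
  type(torch_optim) :: optimizer
  call torch_optim_SGD(optimizer, params, learning_rate=0.01D0)
  ! ... perform training iterations with optimizer ...
end block
! On leaving the block, optimizer goes out of scope and its finalizer
! (torch_optim_delete) is invoked automatically, freeing the
! underlying C++ optimizer object.
```

There is therefore no need to free optimizers explicitly in typical code.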
The typical usage pattern for FTorch optimizers follows the standard PyTorch training loop:
```fortran
type(torch_tensor) :: tensor, output, target_data, loss
type(torch_optim) :: optimizer
integer :: i, n_epochs

! Create optimizer - here we use SGD
call torch_optim_SGD(optimizer, [tensor], learning_rate=0.01D0)

! Training loop
do i = 1, n_epochs
  ! Zero gradients
  call optimizer%zero_grad()

  ! Forward pass and loss calculation
  call my_forward_pass(tensor, output)
  call torch_tensor_mean(loss, (output - target_data) ** 2)

  ! Backward pass
  call torch_tensor_backward(loss)

  ! Optimization step
  call optimizer%step()
end do
```
For more details on backpropagation and autograd, and use of optimizers as part of the training process, see the online training documentation.
Note
For a concrete example of how to use the various optimizer methods as part of a training loop see the optimizers worked example.