πŸ›  The Learnable Handwriter Complete Tutorial


Need Help?
For any questions regarding data preparation, training, or application to palaeographical data, contact: matenia03[at]gmail[dot]com. For questions related to the architecture, contact: y.siglidis[at]gmail[dot]com.

This page provides a comprehensive guide to using the Learnable Handwriter, an adaptation of The Learnable Typewriter: A Generative Approach to Text Analysis for palaeographical morphological analysis. The tutorial covers preparing data, training and fine-tuning models, and best practices.

Table of Contents

Section 1: Data Preparation & Best Practices
Section 2: Installation & Training

Section 1: Data Preparation & Best Practices

Data Format

Your dataset must follow a specific structure for the system to work correctly. You will need to create a datasets/name-of-your-dataset directory containing extracted images of text lines as well as their annotations in an annotation.json, as illustrated in the image below.

[Figure] Required data structure: folders, images, and annotation.json content
More specifically:

Image Directory Structure:

datasets/<DATASET-NAME>/
        β”œβ”€β”€ annotation.json
        └── images/
            β”œβ”€β”€ <image_id>.png
            └── ...

Annotation Format:

The annotation.json file must contain entries in this exact format:

{
    "<image_id>": {
      "split": "train",                    # {"train" or "val" if you don't want to train on these data but want to finetune on them}
      "label": "your transcription text",  # The ground truth text
      "script": "Script_Type_Name",        # Script category (hand or script type)
      "page": "manuscript_page.jpg"        # (optional) Source page reference, or other metadata
    },
    ...
  }
Attention: Lines with null or empty labels will not be skipped by the Learnable Handwriter and will cause training to fail. Make sure there are no empty label values in your annotation.json.
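As a quick sanity check before training, here is a minimal sketch that flags null or empty labels and missing image files. The dataset path is a placeholder and this helper script is not part of the repository:

import json
from pathlib import Path

dataset = Path("datasets/my-dataset")  # placeholder: point at your dataset folder
annotation = json.loads((dataset / "annotation.json").read_text(encoding="utf-8"))

problems = []
for image_id, entry in annotation.items():
    label = entry.get("label")
    if label is None or not str(label).strip():
        problems.append(f"{image_id}: null or empty label")
    if not (dataset / "images" / f"{image_id}.png").exists():
        problems.append(f"{image_id}: missing images/{image_id}.png")

print(f"checked {len(annotation)} entries, found {len(problems)} problem(s)")
print("\n".join(problems))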

πŸ‘‰ You can download and follow the instructions in this notebook to automatically create a Learnable Handwriter-compatible dataset.
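If you prefer to assemble annotation.json yourself rather than using the notebook, a minimal sketch follows; the image ids, transcriptions, and dataset folder name are placeholders, and the field names match the format shown above:

import json
from pathlib import Path

# (image_id, transcription, script) triples -- replace with your own data
lines = [
    ("line_0001", "In principio erat verbum", "Northern_Textualis"),
    ("line_0002", "et verbum erat apud deum", "Northern_Textualis"),
]

annotation = {
    image_id: {"split": "train", "label": label, "script": script}
    for image_id, label, script in lines
}

out = Path("datasets/my-dataset/annotation.json")  # placeholder dataset folder
out.parent.mkdir(parents=True, exist_ok=True)
out.write_text(json.dumps(annotation, ensure_ascii=False, indent=2), encoding="utf-8")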

Best Practices for Data Preparation

Section 2: Installation & Training

Installation Setup

No GPU? You can still train models using our Google Colab notebook.

πŸ‘‰ Open Training Colab

Or perform inference on pre-trained and fine-tuned models:

πŸ‘‰ Open Inference Colab

⚠️ Platform Compatibility: macOS is not supported due to compatibility issues with PyTorch's affine transforms. We recommend running on a Linux system with CUDA support. GPU usage is strongly advised for training.

Installation Steps

After cloning the repository and entering the base folder:

  1. Create a conda environment:
    conda create --name lhr python=3.10
    conda activate lhr
  2. Install PyTorch:

    Follow the official PyTorch installation guide for your system. Ensure CUDA compatibility if using a GPU (see the quick check after these steps).

  3. Install requirements:
    python -m pip install -r requirements.txt
    Training Visualization: Weights & Biases (wandb) is installed via requirements.txt and is used to visualize and monitor training.
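Before training, it can help to confirm that PyTorch sees your GPU. These are standard PyTorch calls, nothing specific to this repository:

import torch

print(torch.__version__)          # installed PyTorch version
print(torch.cuda.is_available())  # should print True on a CUDA-enabled machine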

Training on Provided Dataset

To get started quickly with our reference dataset from the paper β€œAn Interpretable Deep Learning Approach for Morphological Script Type Analysis” (IWCP 2024):

  1. Download and extract datasets.zip
  2. Run the training script:
    python scripts/train.py iwcp_south_north.yaml

Fine-tuning Options

1. Script-based Fine-tuning (Northern and Southern Textualis):

python scripts/finetune_scripts.py -i runs/iwcp_south_north/train/ \
  -o runs/iwcp_south_north/finetune/ \
  --mode g_theta --max_steps 2500 --invert_sprites \
  --script Northern_Textualis Southern_Textualis \
  -a datasets/iwcp_south_north/annotation.json \
  -d datasets/iwcp_south_north/ --split train

2. Document-based Fine-tuning:

python scripts/finetune_docs.py -i runs/iwcp_south_north/train/ \
  -o runs/iwcp_south_north/finetune/ \
  --mode g_theta --max_steps 2500 --invert_sprites \
  -a datasets/iwcp_south_north/annotation.json \
  -d datasets/iwcp_south_north/ --split all

Training on Your Custom Data

Configuration Setup

The configuration system uses YAML files to define hyperparameters, dataset paths, and training settings. Each experiment requires both a dataset configuration and a main configuration file.

  1. Create dataset config: configs/dataset/<DATASET_ID>.yaml
    DATASET-TAG:
      path: <DATASET-NAME>/      # folder name under datasets/
      sep: ''                    # character separator between tokens in the labels
      space: ' '                 # symbol that represents a space in the labels
    (See the tokenization sketch below for how sep and space are interpreted.)
  2. Create hyperparameter config: configs/<DATASET_ID>.yaml
    For concrete structure reference, see the provided config file for our iwcp_south_north experiment.
[Figure] Configuration files structure for training setup
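To illustrate how the sep and space fields are typically interpreted, here is a minimal tokenization sketch. It mirrors the idea, not the repository's exact parsing code, and the '<space>' token name is illustrative:

def tokenize(label: str, sep: str = '', space: str = ' ') -> list[str]:
    """Split a transcription into per-glyph tokens.

    sep   -- separator placed between tokens in the annotation ('' means none)
    space -- the symbol that stands for a word space in the annotation
    """
    tokens = label.split(sep) if sep else list(label)
    # map the annotation's space symbol to an explicit token
    return ['<space>' if t == space else t for t in tokens]

print(tokenize("holy writ"))
# ['h', 'o', 'l', 'y', '<space>', 'w', 'r', 'i', 't']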

Training Process

  1. Initial Training:
    python scripts/train.py <CONFIG_NAME>.yaml
  2. Fine-tuning by Script Type:
    python scripts/finetune_scripts.py -i runs/<MODEL_PATH>/ \
      -o <OUTPUT_PATH>/ --mode g_theta --max_steps <int> \
      --invert_sprites --script '<SCRIPT_NAME>' \
      -a <DATASET_PATH>/annotation.json \
      -d <DATASET_PATH>/ --split <train or all>
  3. Fine-tuning by Individual Documents:
    python scripts/finetune_docs.py -i runs/<MODEL_PATH>/ \
      -o <OUTPUT_PATH>/ --mode g_theta --max_steps <int> \
      --invert_sprites -a <DATASET_PATH>/annotation.json \
      -d <DATASET_PATH>/ --split <train or all>

πŸ“Œ Technical Implementation Notes

⚠️ The existing disambiguation_table will be used by default if not changed. Make sure it is appropriate for your script type before training.