{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "# 2. Training a Pytorch Lighning model\n", "\n", "In this notebook, we show the training of a simple CNN model using Pytorch Lightning. \n", "We first start with data, then define the model, and finally train it for a HAR task." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Creating KuHar LightningDataModule\n", "\n", "In order to train a model, we must first create a `LightningDataModule`, that will define the data loaders for training, validation and test.\n", "Here, we will use the Standartized KuHar data. Therefore, the data directory may looks like this:\n", "\n", "```\n", "KuHar/\n", " test.csv\n", " train.csv\n", " validation.csv\n", "```\n", "\n", "The `train.csv` file may look like this:\n", "\n", "| accel-x-0 | accel-x-1 | accel-y-0 | accel-y-1 | ... | standard activity code |\n", "|-----------|-----------|-----------|-----------|------|------------------------|\n", "| 0.502123 | 0.02123 | 0.502123 | 0.502123 | ... | 0 |\n", "| 0.6820123 | 0.02123 | 0.502123 | 0.502123 | ... | 0 |\n", "| 0.498217 | 0.00001 | 1.414141 | 3.141592 | ... | 1 |\n", "\n", "As each CSV file contains windowed time signals of two 3-axial sensors, we may use the `MultiModalSeriesCSVDataset` class to handle this data structure.\n", "After it, we must create a `LightningDataModule`, that will define the data loaders for training, validation and test. \n", "The implementation of `LightningDataModule` may look like the snippet below:\n", "\n", "```python\n", "import lightning as L\n", "from torch.utils.data import DataLoader\n", "from ssl_tools.data.datasets import MultiModalSeriesCSVDataset\n", "\n", "class HARDataModule(L.LightningDataModule):\n", " def __init__(self, data_path: Path, batch_size: int):\n", " super().__init__()\n", " self.data_path = data_path\n", " self.batch_size = batch_size\n", " \n", " def train_dataloader(self):\n", " dataset = MultiModalSeriesCSVDataset(self.data_path / 'train.csv')\n", " return DataLoader(dataset, batch_size=self.batch_size, shuffle=True)\n", " \n", " ...\n", "```" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Faciliting the creation of the LightningDataModule with MultiModalHARSeriesDataModule\n", "\n", "If your directory is organized like the one above, the CSVs are a collection of time-windows of signals, and the `LightningDataModule` implementation may looks like the one above, you can use the `MultiModalHARSeriesDataModule` to create a `LightningDataModule` easily for you.\n", "The `train_dataloader` method will use `train.csv`, `val_dataloader` will use `validation.csv` and `test_dataloader` will use `test.csv` to create the `MultiModalSeriesCSVDataset` and encapsulate into `DataLoader`.\n", "\n", "To create a `MultiModalHARSeriesDataModule`, we must pass:\n", "\n", "- `data_path`: the path to the directory containing the CSV files (`train.csv`, `validation.csv` and `test.csv`). We use `standardized_balanced/KuHar` in this case;\n", "- `feature_prefixes`: the prefixes of the features in the CSV files. In this case, we have `accel-x`, `accel-y`, `accel-z`, `gyro-x`, `gyro-y` and `gyro-z`;\n", "- `batch_size`: the batch size for the data loaders; and\n", "- `num_workers`: the number of workers for the data loaders. Essentially, the number of parallel processes to load the data.\n", "\n", "All data loader will share the passed parameters, such as `batch_size`, `num_workers`, and `feature_prefixes`." 
] }, { "cell_type": "code", "execution_count": 1, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "MultiModalHARSeriesDataModule(data_path=/workspaces/hiaac-m4/ssl_tools/data/standartized_balanced/KuHar, batch_size=64)" ] }, "execution_count": 1, "metadata": {}, "output_type": "execute_result" } ], "source": [ "from ssl_tools.data.data_modules.har import MultiModalHARSeriesDataModule\n", "\n", "data_path = \"/workspaces/hiaac-m4/ssl_tools/data/standartized_balanced/KuHar/\"\n", "\n", "data_module = MultiModalHARSeriesDataModule(\n", " data_path=data_path,\n", " feature_prefixes=(\"accel-x\", \"accel-y\", \"accel-z\", \"gyro-x\", \"gyro-y\", \"gyro-z\"),\n", " label=\"standard activity code\",\n", " features_as_channels=True,\n", " batch_size=64,\n", ")\n", "data_module" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "We can test the dataloaders by getting the first batch of each one. Let's do it (only for`train_dataloader`)!. \n", "\n", "> **NOTE**: We use the data_module.train_dataloader() method to get the data loader for the training set. Note that the `.setup()` method must be called before getting the data loaders. If you don't call it, the data loaders will not be created. However, when used to train a model, the Pytorch Lightning `Trainer.fit()` method will automatically call the `.setup()` method for you. So, we put it here just to show how to fetch a data from `train_dataloader` and check if it is working." ] }, { "cell_type": "code", "execution_count": 4, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Inputs shape: torch.Size([64, 6, 60]), Targets shape: torch.Size([64])\n" ] } ], "source": [ "data_module.setup(\"fit\") # We just put it here to test.\n", " # When training a model, the Trainer will \n", " # call this method.\n", "\n", "train_dataloader = data_module.train_dataloader()\n", "\n", "# Pick the first batch to inspect. As batch size is 64, we will have 64 samples.\n", "# Note that dataloader only implement iterator protocol, \n", "# so we can use next() to fetch one batch.\n", "batch = next(iter(train_dataloader))\n", "# Each batch is a 2-element tuple:\n", "# First element is a Tensor with 64 input samples\n", "# and the second is a Tensor with 64 labels.\n", "inputs, targets = batch\n", "\n", "# (B, C, T) = (Batch size, Channels, Time steps) = (64, 6, 60)\n", "print(f\"Inputs shape: {inputs.shape}, Targets shape: {targets.shape}\")" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Training a simple model\n", "\n", "We will create a simple 1D CNN Pytorch Lightning model using the `Simple1DConvNetwork`. The model will be trained to classify the activities in KuHar dataset. \n", "\n", "Pytorch Lightning models must implement the `forward` method, `training_step` and `configure_optimizers` methods. \n", "Also, the `__init__` method is used to define the model.\n", "The `forward` method is the same as the Pytorch `forward` method. \n", "The `training_step` method is the method that will be called for each batch of data during the training. It should return the loss of the batch.\n", "The `configure_optimizers` method is the method that will define the optimizer to be used during the training.\n", "\n", "The `Simple1DConvNetwork` is a simple 1D CNN model, that has 3 convolutional layers and 2 fully connected layers. 
\n", "It is trained using the `Adam` optimizer and the `CrossEntropyLoss` loss function.\n", "\n", "Besides that, Lightning models implemented in this framework, usually logs the training and validation losses." ] }, { "cell_type": "code", "execution_count": 5, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "Simple1DConvNetwork(\n", " (loss_func): CrossEntropyLoss()\n", " (features): Sequential(\n", " (0): Conv1d(6, 64, kernel_size=(5,), stride=(1,))\n", " (1): ReLU()\n", " (2): Dropout(p=0.5, inplace=False)\n", " (3): Conv1d(64, 64, kernel_size=(5,), stride=(1,))\n", " (4): ReLU()\n", " (5): Dropout(p=0.5, inplace=False)\n", " (6): Conv1d(64, 64, kernel_size=(5,), stride=(1,))\n", " (7): ReLU()\n", " )\n", " (classifier): Sequential(\n", " (0): Dropout(p=0.5, inplace=False)\n", " (1): Linear(in_features=3072, out_features=128, bias=True)\n", " (2): ReLU()\n", " (3): Dropout(p=0.5, inplace=False)\n", " (4): Linear(in_features=128, out_features=6, bias=True)\n", " )\n", ")" ] }, "execution_count": 5, "metadata": {}, "output_type": "execute_result" } ], "source": [ "from ssl_tools.models.nets.convnet import Simple1DConvNetwork\n", "\n", "model = Simple1DConvNetwork(\n", " input_shape=(6,60), # (The number of input channels, input size of FC layers)\n", " num_classes=6, # The number of output classes\n", " learning_rate=1e-3, # The learning rate of the Adam optimizer\n", ")\n", "\n", "model" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "To train a Lightning model using Pytorch Lightning, we must create a `Trainer` and call the `fit` method. The `Trainer` is responsible for training the model. \n", "It has several parameters, such as the number of epochs, the number of GPUs/CPUs to use, *etc*. \n", "\n", "We will train our model using the already defined dataloader. \n", "The `fit` method will be responsible for training the model using the training and validation data loaders. \n", "After training, we will test the model using the test data loader and Trainer's `test` method.\n", "\n", "Here, the training will run for 300 epochs (`max_epochs`) and will use only 1 (`devices`) GPU (`accelerator`)." ] }, { "cell_type": "code", "execution_count": 6, "metadata": {}, "outputs": [ { "name": "stderr", "output_type": "stream", "text": [ "GPU available: True (cuda), used: True\n", "TPU available: False, using: 0 TPU cores\n", "IPU available: False, using: 0 IPUs\n", "HPU available: False, using: 0 HPUs\n", "LOCAL_RANK: 0 - CUDA_VISIBLE_DEVICES: [0,1]\n", "\n", " | Name | Type | Params\n", "------------------------------------------------\n", "0 | loss_func | CrossEntropyLoss | 0 \n", "1 | features | Sequential | 43.1 K\n", "2 | classifier | Sequential | 394 K \n", "------------------------------------------------\n", "437 K Trainable params\n", "0 Non-trainable params\n", "437 K Total params\n", "1.749 Total estimated model params size (MB)\n" ] }, { "data": { "application/vnd.jupyter.widget-view+json": { "model_id": "59a3f2507e874233b3fcea3038ad1e81", "version_major": 2, "version_minor": 0 }, "text/plain": [ "Sanity Checking: | | 0/? 
{ "cell_type": "code", "execution_count": 7, "metadata": {}, "outputs": [ { "data": { "text/html": [ "┏━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━━━┓\n", "┃        Test metric        ┃       DataLoader 0        ┃\n", "┡━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━━━┩\n", "│         test_acc          │    0.8333333134651184     │\n", "│         test_loss         │    1.9901254177093506     │\n", "└───────────────────────────┴───────────────────────────┘\n", "\n" ], "text/plain": [ "┏━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━━━┓\n", "┃\u001b[1m \u001b[0m\u001b[1m       Test metric       \u001b[0m\u001b[1m \u001b[0m┃\u001b[1m \u001b[0m\u001b[1m      DataLoader 0      \u001b[0m\u001b[1m \u001b[0m┃\n", "┡━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━━━┩\n", "│\u001b[36m \u001b[0m\u001b[36m        test_acc         \u001b[0m\u001b[36m \u001b[0m│\u001b[35m \u001b[0m\u001b[35m   0.8333333134651184    \u001b[0m\u001b[35m \u001b[0m│\n", "│\u001b[36m \u001b[0m\u001b[36m        test_loss        \u001b[0m\u001b[36m \u001b[0m│\u001b[35m \u001b[0m\u001b[35m   1.9901254177093506    \u001b[0m\u001b[35m \u001b[0m│\n", "└───────────────────────────┴───────────────────────────┘\n" ] }, "metadata": {}, "output_type": "display_data" }, { "data": { "text/plain": [ "[{'test_loss': 1.9901254177093506, 'test_acc': 0.8333333134651184}]" ] }, "execution_count": 7, "metadata": {}, "output_type": "execute_result" } ], "source": [ "trainer.test(model, data_module)" ] },
{ "cell_type": "markdown", "metadata": {}, "source": [ "### Using any other set from the data module" ] },
{ "cell_type": "markdown", "metadata": {}, "source": [ "If we want to test the model using the validation data loader, we can also use the `trainer.test` method, but passing the `val_dataloader` instead. \n", "Remember that, as we are passing a `DataLoader` to the `test` method rather than a `LightningDataModule`, we must call the `setup` method ourselves." ] },
{ "cell_type": "code", "execution_count": 8, "metadata": {}, "outputs": [ { "name": "stderr", "output_type": "stream", "text": [ "LOCAL_RANK: 0 - CUDA_VISIBLE_DEVICES: [0,1]\n" ] }, { "data": { "application/vnd.jupyter.widget-view+json": { "model_id": "bd31957d8c5a40bfa0624948e306396e", "version_major": 2, "version_minor": 0 }, "text/plain": [ "Testing: |          | 0/? [00:00<?, ?it/s]" ] }, "metadata": {}, "output_type": "display_data" }, { "data": { "text/html": [
"┏━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━━━┓\n", "┃        Test metric        ┃       DataLoader 0        ┃\n", "┡━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━━━┩\n", "│         test_acc          │    0.5962441563606262     │\n", "│         test_loss         │    14.916933059692383     │\n", "└───────────────────────────┴───────────────────────────┘\n", "\n" ], "text/plain": [ "┏━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━━━┓\n", "┃\u001b[1m \u001b[0m\u001b[1m       Test metric       \u001b[0m\u001b[1m \u001b[0m┃\u001b[1m \u001b[0m\u001b[1m      DataLoader 0      \u001b[0m\u001b[1m \u001b[0m┃\n", "┡━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━━━┩\n", "│\u001b[36m \u001b[0m\u001b[36m        test_acc         \u001b[0m\u001b[36m \u001b[0m│\u001b[35m \u001b[0m\u001b[35m   0.5962441563606262    \u001b[0m\u001b[35m \u001b[0m│\n", "│\u001b[36m \u001b[0m\u001b[36m        test_loss        \u001b[0m\u001b[36m \u001b[0m│\u001b[35m \u001b[0m\u001b[35m   14.916933059692383    \u001b[0m\u001b[35m \u001b[0m│\n", "└───────────────────────────┴───────────────────────────┘\n" ] }, "metadata": {}, "output_type": "display_data" }, { "data": { "text/plain": [ "[{'test_loss': 14.916933059692383, 'test_acc': 0.5962441563606262}]" ] }, "execution_count": 8, "metadata": {}, "output_type": "execute_result" } ], "source": [ "# As we pass a DataLoader (not the LightningDataModule), we call setup() ourselves\n", "data_module.setup(\"fit\")\n", "validation_dataloader = data_module.val_dataloader()\n", "\n", "trainer.test(model, validation_dataloader)" ] } ], "metadata": { "kernelspec": { "display_name": "Python 3", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.10.6" } }, "nbformat": 4, "nbformat_minor": 2 }