gpu_devel

The gpu_devel partition is intended for GPU code development: writing, testing, and debugging code. To keep response times short, it has relatively low per-user limits, as outlined in the table below:

Resource     Per-user limit
Walltime     6 hours
CPU cores    4
Memory       32 GiB
GPUs         2
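
As a rough illustration of these limits, the batch script below requests the maximum of each resource. It is a sketch rather than a site-provided template: the job name is hypothetical, the untyped --gres=gpu:2 request assumes the partition accepts GPU requests without a type, and --mem=32G is assumed to correspond to the 32 GiB limit.

```shell
#!/bin/bash
# Hypothetical job name, chosen only for this example.
#SBATCH --job-name=gpu-devel-max
#SBATCH --partition=gpu_devel
# Walltime limit: 6 hours.
#SBATCH --time=06:00:00
# CPU core limit: 4.
#SBATCH --cpus-per-task=4
# Memory limit: 32 GiB (assumed to map to 32G).
#SBATCH --mem=32G
# GPU limit: 2 (an untyped request is an assumption).
#SBATCH --gres=gpu:2

# Report where the job landed; the defaults only apply when run outside Slurm.
job_id="${SLURM_JOB_ID:-none}"
cpus="${SLURM_CPUS_PER_TASK:-unset}"
echo "job=${job_id} cpus=${cpus} host=$(hostname)"
```

Requests above any of these values would exceed the per-user limits in the table, so smaller values are the safer choice for short debugging runs.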

MIG

MIG (Multi-Instance GPU) is a technology that partitions a single physical GPU into multiple virtual instances, each with a smaller share of processing power and VRAM. These virtual GPUs, known as MIG instances, enable multiple tasks to run simultaneously on the same GPU, improving resource utilization while ensuring isolation and efficiency.

Misha’s gpu_devel partition includes one A100 node with four A100 cards. Two types of MIG instances are configured for those A100 cards:

  • a100.MIG.10gb – 10GB VRAM
  • a100.MIG.20gb – 20GB VRAM, approximately twice as fast as the a100.MIG.10gb instance

Each A100 GPU is partitioned into two a100.MIG.20gb instances and three a100.MIG.10gb instances. With four cards at five instances each, the gpu_devel partition provides 20 MIG instances in total: 8 a100.MIG.20gb and 12 a100.MIG.10gb.

Requesting a MIG instance is similar to requesting a standard GPU: specify both the partition name and the GPU type. For example, to request one a100.MIG.20gb instance, use the following Slurm options:

 -p gpu_devel --gres=gpu:a100.MIG.20gb:1
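
The same options can also be placed in a batch script. The sketch below requests one a100.MIG.10gb instance and lists the GPU devices the job can see. The 10-minute time limit and the fallback message are illustrative assumptions, not site defaults.

```shell
#!/bin/bash
#SBATCH --partition=gpu_devel
#SBATCH --gres=gpu:a100.MIG.10gb:1
# Illustrative short time limit for a quick check.
#SBATCH --time=00:10:00

# nvidia-smi -L lists the GPU devices visible to the job, including MIG devices.
# The fallback message only appears when the tool is missing or fails to run.
gpu_list=$(nvidia-smi -L 2>/dev/null) || gpu_list="nvidia-smi unavailable on $(hostname)"
echo "${gpu_list}"
```

Submitting the script with sbatch and reading the job's output file is one way to confirm which device was allocated before starting longer tests.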

Open OnDemand

In Open OnDemand, select gpu_devel as the partition name, then select a GPU type or leave that field empty. Also provide the number of GPUs (the default is 1). One is usually enough; please don’t request more than you need.
