How to Clear CUDA Memory in PyTorch

2024-07-27

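Using torch.cuda.empty_cache()

torch.cuda.empty_cache() returns the unused memory blocks held by PyTorch's caching allocator to the GPU driver. It does not free memory that live tensors still occupy, and it does not fix memory leaks. Its main effect is to make cached memory available to other processes and visible as free in tools such as nvidia-smi.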

torch.cuda.empty_cache()

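Using del

The del keyword removes a name's reference to a tensor. Once no references to the tensor remain, its CUDA memory goes back to PyTorch's caching allocator, where later allocations can reuse it. The memory is not returned to the GPU driver until torch.cuda.empty_cache() is called.

tensor = torch.empty(1024, 1024, device="cuda")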

# ...

del tensor
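As a minimal sketch of why a single del may not release anything, the pure-Python snippet below watches an object through a weak reference while its names are deleted one by one. The Buffer class and the deletion_trace helper are hypothetical stand-ins introduced for illustration; a CUDA tensor follows the same reference rules.

```python
import weakref


class Buffer:
    """Hypothetical stand-in for a CUDA tensor."""


def deletion_trace():
    """Return whether the object is still alive after each `del`."""
    first = Buffer()
    second = first                # a second name for the same object
    watcher = weakref.ref(first)  # a weak reference does not keep it alive

    del first
    alive_after_first_del = watcher() is not None

    del second
    alive_after_second_del = watcher() is not None

    return alive_after_first_del, alive_after_second_del


print(deletion_trace())  # (True, False) on CPython
```

In practice the extra reference is often a list of saved outputs, a logged loss that still carries its autograd graph, or a variable left over in a notebook cell.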

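Using gc.collect()

CPython frees most objects as soon as their reference count drops to zero, but objects caught in a reference cycle survive until the cyclic garbage collector runs. Calling gc.collect() runs that collector immediately, which frees any CUDA tensors that were kept alive only by such cycles. It has no effect on tensors that are still referenced.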

import gc

gc.collect()
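The following sketch illustrates the reference-cycle case with plain Python objects, so it runs without a GPU. The Node class and the cycle_trace helper are hypothetical names used only for this demonstration; automatic collection is switched off temporarily so the result does not depend on when the collector happens to run.

```python
import gc
import weakref


class Node:
    """Hypothetical object that can point at another Node."""

    def __init__(self):
        self.other = None


def cycle_trace():
    """Return whether a cyclic pair is alive before and after gc.collect()."""
    gc.disable()  # keep automatic collection from interfering with the demo
    try:
        a, b = Node(), Node()
        a.other, b.other = b, a   # a <-> b form a reference cycle
        watcher = weakref.ref(a)

        del a, b                  # the names are gone, but the cycle remains
        alive_before = watcher() is not None

        gc.collect()              # the cyclic collector frees the pair
        alive_after = watcher() is not None
    finally:
        gc.enable()
    return alive_before, alive_after


print(cycle_trace())  # (True, False) on CPython
```

If either Node held a CUDA tensor, that tensor's memory would stay allocated until the collector ran, which is why calling gc.collect() before torch.cuda.empty_cache() can help.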

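Selecting the CUDA device

torch.cuda.set_device() selects the current CUDA device, that is, the GPU on which tensors created with device="cuda" are placed. It is not a context that can be closed: tensors on any device stay allocated until their last reference goes away, and switching devices frees nothing. By default, tensors created without a device argument live in CPU memory. To change the current device temporarily, use the torch.cuda.device context manager; leaving that block only restores the previous device.

torch.cuda.set_device(0)

# ...

# There is no teardown step; free memory with del and torch.cuda.empty_cache()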

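Using torch.no_grad()

During inference, wrap the forward pass in a torch.no_grad() block so that autograd does not record the computation. Intermediate activations that would otherwise be saved for the backward pass can then be freed as soon as they are no longer needed, which lowers peak memory use.

with torch.no_grad():
    output = model(batch)  # model and batch are placeholders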
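A small sketch of the effect, assuming a toy torch.nn.Linear layer on the CPU so it runs without a GPU (the layer and input sizes are arbitrary): the output computed inside the block carries no autograd graph.

```python
import torch

layer = torch.nn.Linear(4, 2)   # small CPU layer; any module works
x = torch.randn(3, 4)

tracked = layer(x)              # autograd records this forward pass
with torch.no_grad():
    untracked = layer(x)        # no graph is recorded here

print(tracked.requires_grad, tracked.grad_fn is not None)   # True True
print(untracked.requires_grad, untracked.grad_fn is None)   # False True
```

Because no graph is attached to untracked, nothing holds on to the intermediate tensors of the forward pass once it has finished.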

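Setting the CUDA_VISIBLE_DEVICES environment variable

The CUDA_VISIBLE_DEVICES environment variable controls which GPUs a process can see. A process cannot allocate memory on a hidden device, so the variable is useful for keeping a job off GPUs that other work needs. It must be set before the process initializes CUDA, and it does not free memory that is already allocated.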

export CUDA_VISIBLE_DEVICES=0

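Using PyTorch Lightning

PyTorch Lightning is a library that helps structure the training and deployment of deep learning models built with PyTorch. Its Trainer class does not accept a clear_memory_on_every_epoch argument. To empty the CUDA cache at the end of each epoch, register a callback whose on_train_epoch_end hook calls torch.cuda.empty_cache().

import torch
from pytorch_lightning import Callback, Trainer

class EmptyCacheCallback(Callback):  # user-defined, not a Lightning built-in
    def on_train_epoch_end(self, trainer, pl_module):
        torch.cuda.empty_cache()

trainer = Trainer(callbacks=[EmptyCacheCallback()])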

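Using Apex and mixed precision

Apex is NVIDIA's extension library for PyTorch. Its amp module has no GradScaler class; GradScaler belongs to PyTorch's built-in torch.cuda.amp module (recent releases also expose it as torch.amp.GradScaler), which has superseded apex.amp. Running the forward pass under autocast keeps most activations in half precision, which reduces memory use during training, and GradScaler scales the loss so that small float16 gradients do not underflow.

from torch.cuda.amp import GradScaler

scaler = GradScaler()

# ...

scaler.scale(loss).backward()
scaler.step(optimizer)
scaler.update()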

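Removing unused packages

Unused packages can be removed with the pip uninstall command. This frees disk space only: an installed package occupies no GPU memory unless a running process has imported it, so uninstalling packages does not reduce CUDA memory use.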

pip uninstall <package_name>

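Rebooting the system

If CUDA memory is still not released after trying the methods above, check nvidia-smi for other processes that still hold GPU memory, such as a crashed training run or an idle notebook kernel, and terminate them. Rebooting the system is the last resort.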
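Before terminating processes or rebooting, it can help to see where the memory actually sits. The sketch below is one possible diagnostic, not an official recipe; cuda_memory_snapshot is a hypothetical helper name, and it returns None on machines without CUDA.

```python
import torch


def cuda_memory_snapshot(device=0):
    """Return memory figures in bytes, or None when CUDA is unavailable."""
    if not torch.cuda.is_available():
        return None
    free, total = torch.cuda.mem_get_info(device)
    return {
        "allocated": torch.cuda.memory_allocated(device),  # held by live tensors
        "reserved": torch.cuda.memory_reserved(device),    # held by PyTorch's cache
        "free": free,                                      # free on the whole GPU
        "total": total,
    }


snapshot = cuda_memory_snapshot()
if snapshot is None:
    print("CUDA is not available")
else:
    # Rough figure: also includes this process's own CUDA context overhead.
    outside = snapshot["total"] - snapshot["free"] - snapshot["reserved"]
    print(snapshot, "used outside this allocator:", outside)
```

If free is low while reserved is small, most of the memory is held outside this process's allocator, and calling torch.cuda.empty_cache() here will not recover it.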

Example code

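import torch

# Allocate memory
tensor1 = torch.empty(1024, 1024, device="cuda")
tensor2 = torch.empty(1024, 1024, device="cuda")

# Check memory held by tensors and by the caching allocator
print(torch.cuda.memory_allocated(), torch.cuda.memory_reserved())

# Drop the tensors, then return the cached blocks to the driver
del tensor1, tensor2
torch.cuda.empty_cache()

# Check memory again; both figures should now be lower
print(torch.cuda.memory_allocated(), torch.cuda.memory_reserved())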

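import torch

# Allocate memory
tensor = torch.empty(1024, 1024, device="cuda")

# Check memory
print(torch.cuda.memory_allocated())

# Delete the only reference; the block returns to PyTorch's cache
del tensor

# Check memory; memory_allocated() drops, memory_reserved() does not
print(torch.cuda.memory_allocated())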

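import torch
import gc

# Allocate a tensor and trap it in a reference cycle
holder = {"tensor": torch.empty(1024, 1024, device="cuda")}
holder["self"] = holder
del holder  # the cycle keeps the dict and its tensor alive

# Check memory
print(torch.cuda.memory_allocated())

# Run the cyclic garbage collector to free the unreachable cycle
gc.collect()

# Check memory; the tensor's memory should now be released
print(torch.cuda.memory_allocated())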

import torch

# Select the current device
torch.cuda.set_device(0)

# Allocate memory on the current device
tensor = torch.empty(1024, 1024, device="cuda")

# Check memory
print(torch.cuda.memory_allocated())

# There is no context to close; delete the tensor to release its memory
del tensor

# Check memory
print(torch.cuda.memory_allocated())

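import torch

# Allocate a model and an input batch
model = torch.nn.Linear(1024, 1024).cuda()
inputs = torch.randn(1024, 1024, device="cuda")

# Check memory
print(torch.cuda.memory_allocated())

# Disable gradient tracking during the computation
with torch.no_grad():
    outputs = model(inputs)

# Check memory; the output was added, but it carries no autograd graph
print(torch.cuda.memory_allocated())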

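import os

# Set the environment variable before CUDA is initialized
os.environ["CUDA_VISIBLE_DEVICES"] = "0"

import torch

# Allocate memory on the only visible GPU
tensor = torch.empty(1024, 1024, device="cuda")

# Check memory
print(torch.cuda.memory_allocated())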

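import torch
from pytorch_lightning import Callback, LightningModule, Trainer
from torch.utils.data import DataLoader, TensorDataset

# Define the model
class Model(LightningModule):
    def __init__(self):
        super().__init__()
        self.linear = torch.nn.Linear(10, 1)

    def forward(self, x):
        return self.linear(x)

    def training_step(self, batch, batch_idx):
        x, y = batch
        return torch.nn.functional.mse_loss(self(x), y)

    def configure_optimizers(self):
        return torch.optim.SGD(self.parameters(), lr=0.01)

# User-defined callback that empties the CUDA cache after every epoch
class EmptyCacheCallback(Callback):
    def on_train_epoch_end(self, trainer, pl_module):
        torch.cuda.empty_cache()

# Configure the Trainer
trainer = Trainer(max_epochs=1, callbacks=[EmptyCacheCallback()])

# Train the model on random placeholder data
data = DataLoader(TensorDataset(torch.randn(64, 10), torch.randn(64, 1)), batch_size=16)
trainer.fit(Model(), data)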

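import torch
from torch.cuda.amp import GradScaler

# Define the model
class Model(torch.nn.Module):
    def __init__(self):
        super().__init__()
        self.linear = torch.nn.Linear(10, 1)

    def forward(self, x):
        return self.linear(x)

model = Model().cuda()
optimizer = torch.optim.SGD(model.parameters(), lr=0.01)

# Set up gradient scaling
scaler = GradScaler()

# Train the model on random placeholder data
for epoch in range(10):
    x = torch.randn(16, 10, device="cuda")
    y = torch.randn(16, 1, device="cuda")
    optimizer.zero_grad()

    # Run the forward pass in mixed precision
    with torch.autocast(device_type="cuda", dtype=torch.float16):
        loss = torch.nn.functional.mse_loss(model(x), y)

    # Use GradScaler: scale the loss, step the optimizer, update the scale
    scaler.scale(loss).backward()
    scaler.step(optimizer)
    scaler.update()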