This summer (2022), I set up a Windows PC with the following specs.
CPU : AMD Ryzen 7 5800X
RAM : 16GB
OS : Windows 11 Pro
SSD + HDD : 500GB + 6TB
etc : liquid CPU cooler
In the previous post [1], I described setting up an Ubuntu 20.04 environment with WSL2 and installing CUDA and cuDNN.
mirai-tec.hatenablog.com
[Ubuntu environment]
OS : Ubuntu 20.04 on Windows
GPU : GTX 1060-6GB
NVIDIA Driver : 516.94
CUDA : 11.7
cuDNN : 8.5
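After that, I created miniconda3 virtual environments for PyTorch (v1.12) and TensorFlow (v2.10) and tried them out. PyTorch recognized the GPU, but TensorFlow did not and turned out to be running on the CPU only.
In the end, the cause was a cuDNN version mismatch, but here is a summary of what I looked into along the way.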
1. PyTorch[2]
To see how PyTorch recognizes the GPU, I checked the following items.
- Whether a GPU is usable from PyTorch : torch.cuda.is_available()
- Number of GPU devices : torch.cuda.device_count()
- Index of the default (currently selected) GPU : torch.cuda.current_device()
- GPU name : torch.cuda.get_device_name()
- CUDA Compute Capability : torch.cuda.get_device_capability()
$ python
Python 3.9.13 (main, Aug 25 2022, 23:26:10)
[GCC 11.2.0] :: Anaconda, Inc. on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import torch
>>> torch.__version__
'1.12.1+cu116'
>>> torch.cuda.is_available()
True
>>> torch.cuda.device_count()
1
>>> torch.cuda.current_device()
0
>>> torch.cuda.get_device_name()
'NVIDIA GeForce GTX 1060 6GB'
>>> torch.cuda.get_device_capability()
(6, 1)
>>>
PyTorch appears to retrieve the GPU/CUDA information correctly.
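The checks above can also be gathered into a small script. The following is a minimal sketch, not part of the original session: the helper name `describe_cuda` is hypothetical, and it takes the `torch.cuda` module as an argument only so that it can be exercised on a machine without PyTorch. It skips the device queries when CUDA is unavailable, since those calls are only meaningful once a GPU has been detected.

```python
def describe_cuda(cuda):
    """Collect the GPU facts listed above from torch.cuda (or a stand-in object).

    `describe_cuda` is a hypothetical helper name used for illustration.
    """
    info = {"available": cuda.is_available()}
    if not info["available"]:
        # Without a usable GPU the remaining queries are not meaningful.
        return info
    index = cuda.current_device()
    info.update(
        device_count=cuda.device_count(),
        current_device=index,
        name=cuda.get_device_name(index),
        capability=cuda.get_device_capability(index),
    )
    return info


if __name__ == "__main__":
    try:
        import torch
    except ImportError:
        print("PyTorch is not installed in this environment")
    else:
        for key, value in describe_cuda(torch.cuda).items():
            print(f"{key}: {value}")
```

Run inside the PyTorch virtual environment, this should print the same values as the interactive session, one per line.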
2. TensorFlow
Next, let's check TensorFlow, the one with the problem.
- List of device information : tensorflow.python.client.device_lib.list_local_devices()
$ python
Python 3.9.13 (main, Aug 25 2022, 23:26:10)
[GCC 11.2.0] :: Anaconda, Inc. on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import tensorflow as tf
2022-09-24 17:43:19.321941: I tensorflow/core/platform/cpu_feature_guard.cc:193] This TensorFlow binary is optimized with oneAPI Deep Neural Network Library (oneDNN) to use the following CPU instructions in performance-critical operations: AVX2 FMA
To enable them in other operations, rebuild TensorFlow with the appropriate compiler flags.
2022-09-24 17:43:19.662008: E tensorflow/stream_executor/cuda/cuda_blas.cc:2981] Unable to register cuBLAS factory: Attempting to register factory for plugin cuBLAS when one has already been registered
2022-09-24 17:43:20.409488: W tensorflow/stream_executor/platform/default/dso_loader.cc:64] Could not load dynamic library 'libnvinfer.so.7'; dlerror: libnvinfer.so.7: cannot open shared object file: No such file or directory
2022-09-24 17:43:20.409554: W tensorflow/stream_executor/platform/default/dso_loader.cc:64] Could not load dynamic library 'libnvinfer_plugin.so.7'; dlerror: libnvinfer_plugin.so.7: cannot open shared object file: No such file or directory
2022-09-24 17:43:20.409571: W tensorflow/compiler/tf2tensorrt/utils/py_utils.cc:38] TF-TRT Warning: Cannot dlopen some TensorRT libraries. If you would like to use Nvidia GPU with TensorRT, please make sure the missing libraries mentioned above are installed properly.
>>> tf.__version__
'2.10.0'
>>> from tensorflow.python.client import device_lib
>>> device_lib.list_local_devices()
2022-09-24 17:44:10.216362: I tensorflow/core/platform/cpu_feature_guard.cc:193] This TensorFlow binary is optimized with oneAPI Deep Neural Network Library (oneDNN) to use the following CPU instructions in performance-critical operations: AVX2 FMA
To enable them in other operations, rebuild TensorFlow with the appropriate compiler flags.
2022-09-24 17:44:10.390243: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:966] could not open file to read NUMA node: /sys/bus/pci/devices/0000:07:00.0/numa_node
Your kernel may have been built without NUMA support.
2022-09-24 17:44:10.450432: W tensorflow/stream_executor/platform/default/dso_loader.cc:64] Could not load dynamic library 'libcudnn.so.8'; dlerror: libcudnn.so.8: cannot open shared object file: No such file or directory
2022-09-24 17:44:10.450470: W tensorflow/core/common_runtime/gpu/gpu_device.cc:1934] Cannot dlopen some GPU libraries. Please make sure the missing libraries mentioned above are installed properly if you would like to use GPU. Follow the guide at https://www.tensorflow.org/install/gpu for how to download and setup the required libraries for your platform.
Skipping registering GPU devices...
[name: "/device:CPU:0"
device_type: "CPU"
memory_limit: 268435456
locality {
}
incarnation: 3809127935636748211
xla_global_id: -1
]
>>>
Indeed, only the CPU is recognized.
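As a side note, the same check can be made through TensorFlow's public `tf.config.list_physical_devices` API instead of the internal `device_lib` module. The sketch below is an illustration rather than what was run for this post: `gpu_names` is a hypothetical helper, and it receives the `tensorflow` module as a parameter only so that it can be tried without TensorFlow installed.

```python
def gpu_names(tf_module):
    """Return the names of the GPUs TensorFlow can see (empty list if none).

    `gpu_names` is a hypothetical helper; `tf_module` is the imported
    `tensorflow` module or an object exposing the same `config` API.
    """
    devices = tf_module.config.list_physical_devices("GPU")
    return [device.name for device in devices]


if __name__ == "__main__":
    try:
        import tensorflow as tf
    except ImportError:
        print("TensorFlow is not installed in this environment")
    else:
        names = gpu_names(tf)
        print(names if names else "No GPU visible to TensorFlow (CPU only)")
```

An empty list here corresponds to the CPU-only device list shown above.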
To begin with, an error already appears at the point where tensorflow is imported:
E tensorflow/stream_executor/cuda/cuda_blas.cc:2981] Unable to register cuBLAS factory: Attempting to register factory for plugin cuBLAS when one has already been registered
However, this seems to say only that the cuBLAS factory was already registered when TensorFlow tried to register it again.
Searching the web, I found little discussion of this error beyond a suggestion to downgrade to TensorFlow 2.9.2 [3].
For now, I decided to downgrade TensorFlow to 2.9.2.
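Also, while investigating, I noticed that the CUDA/cuDNN versions expected by TensorFlow 2.10 (CUDA 11.2 / cuDNN 8.1) did not match the versions I had installed (CUDA 11.7 / cuDNN 8.5).
(I used to pay close attention to these versions, but this time I had completely forgotten about them.)
For reference, the correspondence between each TensorFlow version and its CUDA/cuDNN versions is listed in the tested build configurations of the official TensorFlow install documentation.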
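The comparison described above can be written down as a small check. This is only a sketch: the table `TESTED_BUILDS` and the function names are made up for illustration, the table holds just the one pair quoted in this post, and the check treats any difference in major.minor as a mismatch, which may be stricter than what actually breaks in practice.

```python
# Version pair quoted in this post; extend it from the official table as needed.
# (TESTED_BUILDS and the function names below are hypothetical.)
TESTED_BUILDS = {
    "2.10": {"cuda": "11.2", "cudnn": "8.1"},
}


def major_minor(version):
    """Reduce a version string such as '8.5.0' to the tuple (8, 5)."""
    return tuple(int(part) for part in version.split(".")[:2])


def find_mismatches(tf_version, installed):
    """Return {library: (expected, installed)} for every differing library."""
    key = ".".join(tf_version.split(".")[:2])
    expected = TESTED_BUILDS[key]
    return {
        name: (expected[name], installed[name])
        for name in expected
        if major_minor(expected[name]) != major_minor(installed[name])
    }


if __name__ == "__main__":
    # Versions installed in the environment described in this post.
    print(find_mismatches("2.10.0", {"cuda": "11.7", "cudnn": "8.5"}))
```

For the environment in this post, both CUDA and cuDNN are reported as differing from the expected pair.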
So I decided to first downgrade cuDNN from 8.5 to 8.1.
If that alone did not work, I would also change CUDA to 11.2.
From NVIDIA's cuDNN Archive [4], under "Download cuDNN v8.1.1 (Feburary 26th, 2021), for CUDA 11.0,11.1 and 11.2", I downloaded "cuDNN Runtime Library for Ubuntu20.04 x86_64 (Deb)" and installed it.
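The earlier log complained that 'libcudnn.so.8' could not be loaded, so after installing it can be useful to confirm which cuDNN the dynamic loader actually finds. The sketch below was not part of the original steps. It loads the library with `ctypes` and calls cuDNN's `cudnnGetVersion()`. The helper names are hypothetical, and the decoding assumes the cuDNN 8.x scheme of major*1000 + minor*100 + patch.

```python
import ctypes


def decode_cudnn_version(raw):
    """Split a cuDNN 8.x version integer (e.g. 8101) into (major, minor, patch)."""
    return raw // 1000, (raw % 1000) // 100, raw % 100


def loaded_cudnn_version(soname="libcudnn.so.8"):
    """Return the version of the cuDNN library the loader finds, or None."""
    try:
        lib = ctypes.CDLL(soname)
    except OSError:
        # Same situation as the "cannot open shared object file" warning.
        return None
    lib.cudnnGetVersion.restype = ctypes.c_size_t
    return decode_cudnn_version(lib.cudnnGetVersion())


if __name__ == "__main__":
    version = loaded_cudnn_version()
    if version is None:
        print("libcudnn.so.8 could not be loaded")
    else:
        print("cuDNN %d.%d.%d" % version)
```

A result of `None` means the library is not on the loader's search path at all, which is a different problem from having the wrong minor version installed.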
$ python
Python 3.9.13 (main, Aug 25 2022, 23:26:10)
[GCC 11.2.0] :: Anaconda, Inc. on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import tensorflow as tf
>>> tf.__version__
'2.9.2'
>>> from tensorflow.python.client import device_lib
>>> device_lib.list_local_devices()
2022-09-24 18:07:14.128639: I tensorflow/core/platform/cpu_feature_guard.cc:193] This TensorFlow binary is optimized with oneAPI Deep Neural Network Library (oneDNN) to use the following CPU instructions in performance-critical operations: AVX2 FMA
To enable them in other operations, rebuild TensorFlow with the appropriate compiler flags.
2022-09-24 18:07:14.251695: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:961] could not open file to read NUMA node: /sys/bus/pci/devices/0000:07:00.0/numa_node
Your kernel may have been built without NUMA support.
2022-09-24 18:07:14.272856: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:961] could not open file to read NUMA node: /sys/bus/pci/devices/0000:07:00.0/numa_node
Your kernel may have been built without NUMA support.
2022-09-24 18:07:14.273250: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:961] could not open file to read NUMA node: /sys/bus/pci/devices/0000:07:00.0/numa_node
Your kernel may have been built without NUMA support.
2022-09-24 18:07:14.792393: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:961] could not open file to read NUMA node: /sys/bus/pci/devices/0000:07:00.0/numa_node
Your kernel may have been built without NUMA support.
2022-09-24 18:07:14.793182: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:961] could not open file to read NUMA node: /sys/bus/pci/devices/0000:07:00.0/numa_node
Your kernel may have been built without NUMA support.
2022-09-24 18:07:14.793213: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1616] Could not identify NUMA node of platform GPU id 0, defaulting to 0. Your kernel may not have been built with NUMA support.
2022-09-24 18:07:14.793569: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:961] could not open file to read NUMA node: /sys/bus/pci/devices/0000:07:00.0/numa_node
Your kernel may have been built without NUMA support.
2022-09-24 18:07:14.793636: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1532] Created device /device:GPU:0 with 4598 MB memory: -> device: 0, name: NVIDIA GeForce GTX 1060 6GB, pci bus id: 0000:07:00.0, compute capability: 6.1
[name: "/device:CPU:0"
device_type: "CPU"
memory_limit: 268435456
locality {
}
incarnation: 17517294614262033770
xla_global_id: -1
, name: "/device:GPU:0"
device_type: "GPU"
memory_limit: 4821352448
locality {
  bus_id: 1
  links {
  }
}
incarnation: 2792029488055711110
physical_device_desc: "device: 0, name: NVIDIA GeForce GTX 1060 6GB, pci bus id: 0000:07:00.0, compute capability: 6.1"
xla_global_id: 416903419
]
>>>
The message "Your kernel may have been built without NUMA support." still appears, but the GPU is now recognized.
Finally, I trained on MNIST with TensorFlow in a Jupyter notebook. The nvidia-smi output during training was as follows.
$ nvidia-smi
Sat Sep 24 20:41:17 2022
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 515.65.01    Driver Version: 516.94       CUDA Version: 11.7     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|                               |                      |               MIG M. |
|===============================+======================+======================|
|   0  NVIDIA GeForce ...  On   | 00000000:07:00.0  On |                  N/A |
| 44%   43C    P2    57W / 120W |   5819MiB /  6144MiB |     58%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+

+-----------------------------------------------------------------------------+
| Processes:                                                                  |
|  GPU   GI   CI        PID   Type   Process name                  GPU Memory |
|        ID   ID                                                   Usage      |
|=============================================================================|
|    0   N/A  N/A        80      G   /Xwayland                       N/A      |
|    0   N/A  N/A      1463      G   /chrome                         N/A      |
|    0   N/A  N/A      1512      G   /chrome                         N/A      |
|    0   N/A  N/A      1756      C   /python3.9                      N/A      |
+-----------------------------------------------------------------------------+
It looks like training is running on the GPU without problems.
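To watch these numbers from a script instead of reading the table by eye, nvidia-smi can emit CSV through its `--query-gpu` option. The following is a rough sketch under that assumption: the selection of fields and the helper names are my own choices for illustration, and fields reported as something like "[N/A]" are returned as `None`.

```python
import subprocess

# Standard nvidia-smi query flags; the field selection is this sketch's choice.
QUERY = [
    "nvidia-smi",
    "--query-gpu=name,utilization.gpu,memory.used,memory.total",
    "--format=csv,noheader,nounits",
]


def to_int(text):
    """Convert a numeric field, or return None for values such as '[N/A]'."""
    try:
        return int(text)
    except ValueError:
        return None


def parse_gpu_line(line):
    """Parse one CSV line produced by QUERY into a dictionary."""
    name, util, used, total = [field.strip() for field in line.split(",")]
    return {
        "name": name,
        "util_percent": to_int(util),
        "memory_used_mib": to_int(used),
        "memory_total_mib": to_int(total),
    }


def query_gpus():
    """Run nvidia-smi and return one dictionary per GPU (empty list on failure)."""
    try:
        result = subprocess.run(QUERY, capture_output=True, text=True, check=True)
    except (OSError, subprocess.CalledProcessError):
        return []
    return [parse_gpu_line(line) for line in result.stdout.splitlines() if line.strip()]


if __name__ == "__main__":
    for gpu in query_gpus():
        print(gpu)
```

Calling `query_gpus()` in a loop while the notebook is training gives a simple view of GPU utilization and memory use over time.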
When installing TensorFlow, pay attention to the CUDA/cuDNN versions!!
----
[1] Building an Ubuntu environment with WSL2 - みらいテックラボ
[2] Checking GPU information in PyTorch (availability, device count, etc.) | note.nkmk.me
[3] TensorFlow 2.10 causes trouble! #47 · google-research/multinerf
[4] cuDNN Archive | NVIDIA Developer