Installing TensorFlow GPU (Ubuntu 22.04)

Notes on installing tensorflow-gpu on Ubuntu 22.04. The NVIDIA driver is the troublesome part, so the steps are recorded here.

Preparing the environment

First install python3 and python3-pip:

apt install python3 python3-pip

Depending on your network, decide whether to use the Tsinghua (TUNA) mirror:

Setting it as the default

Upgrade pip to the latest version (>=10.0.0), then configure it:

pip install --upgrade pip
pip config set global.index-url https://pypi.tuna.tsinghua.edu.cn/simple

If your connection to the default pip index is poor, use this mirror temporarily to upgrade pip:

python3 -m pip install -i https://pypi.tuna.tsinghua.edu.cn/simple --upgrade pip

Reference: PyPI mirror usage help, https://mirrors.tuna.tsinghua.edu.cn/help/pypi/

Installing the system driver

Reference: How to install the NVIDIA driver on Ubuntu 22.04 LTS

Listing the devices

$ ubuntu-drivers devices
== /sys/devices/pci0000:00/0000:00:02.0/0000:03:00.0 ==
modalias : pci:v0000000000000012313sv00000000000000123130000000000
vendor : NVIDIA Corporation
model : GM107 [GeForce GTX 750 Ti]
driver : nvidia-driver-515-server - distro non-free
driver : nvidia-driver-470 - distro non-free
driver : nvidia-driver-390 - distro non-free
driver : nvidia-driver-450-server - distro non-free
driver : nvidia-driver-515 - distro non-free recommended
driver : nvidia-driver-510-server - distro non-free
driver : nvidia-driver-510 - distro non-free
driver : nvidia-driver-418-server - distro non-free
driver : nvidia-driver-470-server - distro non-free
driver : xserver-xorg-video-nouveau - distro free builtin
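
To pick out the recommended package without reading the listing by eye, a small filter can help. This is only a sketch: the function name recommended_driver is made up here, and it assumes the output keeps the "driver : NAME - ... recommended" layout shown above.

```shell
# Print the package name that `ubuntu-drivers devices` marks as recommended.
# The listing is read from stdin, so saved output can be piped in as well.
# (recommended_driver is a hypothetical helper name, not part of any tool.)
recommended_driver() {
  # Keep only "driver" lines flagged as recommended; field 3 is the package name.
  awk '$1 == "driver" && /recommended/ { print $3 }'
}
```

Running ubuntu-drivers devices | recommended_driver on the listing above would print nvidia-driver-515.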

Installation

If you are happy with the recommended version, run:

ubuntu-drivers autoinstall

To install a specific version (for example nvidia-driver-515):

sudo apt install nvidia-driver-515

After installation, reboot the system so the driver is loaded and the GPU can be used.
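
After the reboot it may be worth confirming that the kernel module was really loaded. The sketch below assumes the proprietary module is listed as nvidia in /proc/modules; the function name and its optional file argument are hypothetical, added only to make the check easy to try out.

```shell
# Succeed if the nvidia kernel module appears in the loaded-module list.
# $1 is an optional path to a module list (defaults to /proc/modules).
# (nvidia_module_loaded is a hypothetical helper name.)
nvidia_module_loaded() {
  # Each line of /proc/modules starts with the module name followed by a space,
  # so this matches "nvidia" but not "nvidia_uvm" or "nvidia_drm".
  grep -q '^nvidia ' "${1:-/proc/modules}"
}
```

For example: nvidia_module_loaded && echo "driver loaded". Running nvidia-smi is another common way to check, since it also reports the driver version.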

At the time of writing (2022-07-26) the repository for 22.04 is incomplete, so the 20.04 repository has to be used:

export last_public_key=3bf863cc.pub

wget https://developer.download.nvidia.com/compute/cuda/repos/ubuntu2004/x86_64/cuda-ubuntu2004.pin

sudo mv cuda-ubuntu2004.pin /etc/apt/preferences.d/cuda-repository-pin-600

sudo apt-key adv --fetch-keys https://developer.download.nvidia.com/compute/cuda/repos/ubuntu2004/x86_64/${last_public_key}

sudo add-apt-repository "deb https://developer.download.nvidia.com/compute/cuda/repos/ubuntu2004/x86_64/ /"

sudo apt-get update

sudo apt-get install libcudnn8

sudo apt-get install libcudnn8-dev

For 22.04:

wget https://developer.download.nvidia.cn/compute/cuda/repos/ubuntu2204/x86_64/cuda-ubuntu2204.pin
sudo mv cuda-ubuntu2204.pin /etc/apt/preferences.d/cuda-repository-pin-600
sudo apt-key adv --fetch-keys https://developer.download.nvidia.com/compute/cuda/repos/ubuntu2204/x86_64/3bf863cc.pub
sudo add-apt-repository "deb https://developer.download.nvidia.com/compute/cuda/repos/ubuntu2204/x86_64/ /"
sudo apt-get update
sudo apt-get install cuda
sudo apt-get install libcudnn8
sudo apt-get install libcudnn8-dev
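
Once the packages are installed, one way to see whether the dynamic linker can find the CUDA and cuDNN libraries is to search the ldconfig -p cache listing. The helper below is a sketch with a made-up name (missing_libs); it reads the listing from stdin and prints every requested library name that does not appear in it.

```shell
# Print each library name given as an argument that is absent from the
# linker cache listing supplied on stdin (e.g. the output of `ldconfig -p`).
# (missing_libs is a hypothetical helper name.)
missing_libs() {
  listing=$(cat)
  for lib in "$@"; do
    case "$listing" in
      *"$lib"*) ;;          # found somewhere in the listing: nothing to report
      *) echo "$lib" ;;     # not found: report it
    esac
  done
}
```

Typical use would be ldconfig -p | missing_libs libcudart.so libcudnn.so.8. No output means both names were found in the cache.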

Reference: https://stackoverflow.com/questions/66977227/could-not-load-dynamic-library-libcudnn-so-8-when-running-tensorflow-on-ubun

Installing tensorflow-gpu

Reference: https://www.tensorflow.org/install/pip

pip install --upgrade tensorflow-gpu

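The tensorflow-gpu package may have been removed from the latest pip index. In that case install tensorflow directly, but check whether the GPU is actually being used.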

pip install --upgrade tensorflow
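
To check whether the installed package actually sees the GPU, TensorFlow can be asked for its list of physical GPU devices. A minimal sketch, assuming TensorFlow 2.x is installed for python3; the wrapper name tf_gpu_count is invented for this note.

```shell
# Print the number of GPUs TensorFlow can see; 0 means it would run on the CPU.
# TensorFlow writes its log messages to stderr, so stdout carries only the count.
# (tf_gpu_count is a hypothetical helper name.)
tf_gpu_count() {
  python3 - <<'EOF'
import tensorflow as tf
print(len(tf.config.list_physical_devices("GPU")))
EOF
}
```

For example: [ "$(tf_gpu_count)" -gt 0 ] && echo "GPU visible to TensorFlow".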

Test:

python3 -c "import tensorflow as tf;print(tf.reduce_sum(tf.random.normal([1000, 1000])))"

If the output contains Could not load dynamic library 'libcudart.so.*', the CUDA installation has failed.

In that case, consider installing a CUDA 11 release:

apt install cuda-11-8

Then run the test again.

Output on success:

root@desktop:~# python3 -c "import tensorflow as tf;print(tf.reduce_sum(tf.random.normal([1000, 1000])))"
2022-07-26 13:49:48.585551: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:975] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2022-07-26 13:49:48.748572: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:975] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2022-07-26 13:49:48.748921: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:975] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2022-07-26 13:49:48.754087: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:975] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2022-07-26 13:49:48.754395: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:975] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2022-07-26 13:49:48.754683: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:975] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2022-07-26 13:49:49.974459: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:975] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2022-07-26 13:49:49.974697: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:975] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2022-07-26 13:49:49.974890: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:975] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2022-07-26 13:49:49.975066: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1532] Created device /job:localhost/replica:0/task:0/device:GPU:0 with 1441 MB memory: -> device: 0, name: NVIDIA GeForce GTX 750 Ti, pci bus id: 0000:03:00.0, compute capability: 5.0
tf.Tensor(2178.302, shape=(), dtype=float32)