TensorRT is an inference optimizer from NVIDIA that applies graph optimizations and layer fusion and selects the fastest available implementation for each layer of a deep learning model. In other words, TensorRT optimizes our deep learning model so that inference runs faster than the original (unoptimized) model, often by a factor of 2x to 5x. The bigger the model, the more room TensorRT has to optimize it. Furthermore, TensorRT supports NVIDIA GPU devices across the range, from desktop cards such as the 1080 Ti and Titan Xp to embedded devices such as the Jetson TX1 and TX2.
There are at least two options for optimizing a deep learning model with TensorRT: (i) TF-TRT (TensorFlow to TensorRT) and (ii) the TensorRT C++ API. In this post, we will specifically discuss how to install and set up the first option, TF-TRT. Here is the step-by-step process:
Install NVIDIA drivers, CUDA, cuDNN and the TensorFlow GPU version
I will not discuss this step in detail because there are already plenty of tutorials on the internet for it. In my case, I use CUDA 9, cuDNN 7 and TensorFlow 1.12. TensorFlow has shipped TensorRT support inside tensorflow.contrib since version 1.9.0, but earlier releases had some issues, so it is preferable to use the newest version (1.12 at the time of writing).
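Assuming that step went well, here is a quick sanity check (a sketch, not part of the installation itself) that the TensorFlow GPU build can actually see your GPU; it should print your TensorFlow version and True:
# Quick sanity check: confirm TensorFlow sees the GPU.
import tensorflow as tf
print(tf.__version__)              # expect 1.12.x in this setup
print(tf.test.is_gpu_available())  # True if drivers/CUDA/cuDNN are set up correctly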
Install TensorRT
- Download the TensorRT local repo file that matches the Ubuntu version you are using.
- Install TensorRT from the Debian local repo package.
Note: Before issuing the following commands, you’ll need to replace ubuntu1x04, cudax.x, trt5.x.x.x and yyyymmdd with your specific OS version, CUDA version, TensorRT version and package date (according to the TensorRT deb file you downloaded before).
$ sudo dpkg -i nv-tensorrt-repo-ubuntu1x04-cudax.x-trt5.x.x.x-ga-yyyymmdd_1-1_amd64.deb
$ sudo apt-key add /var/nv-tensorrt-repo-cudax.x-trt5.x.x.x-ga-yyyymmdd/7fa2af80.pub
$ sudo apt-get update
$ sudo apt-get install tensorrt
$ sudo apt-get install uff-converter-tf
- If using Python 2.7:
$ sudo apt-get install python-libnvinfer-dev
- If using Python 3.x:
$ sudo apt-get install python3-libnvinfer-dev
- Verify the installation
$ dpkg -l | grep TensorRT
You should see something similar to the following:
ii graphsurgeon-tf 5.0.2-1+cuda10.0 amd64 GraphSurgeon for TensorRT package
ii libnvinfer-dev 5.0.2-1+cuda10.0 amd64 TensorRT development libraries and headers
ii libnvinfer-samples 5.0.2-1+cuda10.0 amd64 TensorRT samples and documentation
ii libnvinfer5 5.0.2-1+cuda10.0 amd64 TensorRT runtime libraries
ii python-libnvinfer 5.0.2-1+cuda10.0 amd64 Python bindings for TensorRT
ii python-libnvinfer-dev 5.0.2-1+cuda10.0 amd64 Python development package for TensorRT
ii python3-libnvinfer 5.0.2-1+cuda10.0 amd64 Python 3 bindings for TensorRT
ii python3-libnvinfer-dev 5.0.2-1+cuda10.0 amd64 Python 3 development package for TensorRT
ii tensorrt 5.0.2.0-1+cuda10.0 amd64 Meta package of TensorRT
ii uff-converter-tf 5.0.2-1+cuda10.0 amd64 UFF converter for TensorRT package
- Done!
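In addition to dpkg, you can sanity-check the Python side of the installation. This is a minimal sketch assuming the package versions above; both imports should succeed without errors:
# Verify the TensorRT Python bindings and the TF-TRT bridge import cleanly.
import tensorrt as trt                       # installed by python(3)-libnvinfer
print(trt.__version__)                       # e.g. 5.0.2
import tensorflow.contrib.tensorrt as tftrt  # the TF-TRT bridge inside TF 1.x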
To learn how to optimize a deep learning model using TensorRT, you can follow this video series here.
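In the meantime, here is a minimal sketch of the TF-TRT conversion step itself, using the tensorflow.contrib.tensorrt module that ships with TensorFlow 1.12. The file name frozen_graph.pb and the output node name 'logits' are placeholders for your own model:
# Minimal TF-TRT conversion sketch (TensorFlow 1.12).
import tensorflow as tf
import tensorflow.contrib.tensorrt as trt

# Load your frozen model (placeholder file name).
with tf.gfile.GFile('frozen_graph.pb', 'rb') as f:
    graph_def = tf.GraphDef()
    graph_def.ParseFromString(f.read())

trt_graph_def = trt.create_inference_graph(
    input_graph_def=graph_def,
    outputs=['logits'],                # your model's output node name(s)
    max_batch_size=1,
    max_workspace_size_bytes=1 << 30,  # 1 GB workspace for TensorRT
    precision_mode='FP16')             # or 'FP32' / 'INT8'
The resulting trt_graph_def can then be loaded with tf.import_graph_def and run in a session exactly like the original frozen graph.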