EVO ONO Model Acceleration#

Introduction#

FastDeploy is an all-scenario, easy-to-use, flexible, and highly efficient AI inference deployment toolkit that supports cloud, edge, and on-device deployment. It provides out-of-the-box deployment for 160+ text, vision, speech, and cross-modal models, with end-to-end inference performance optimization. It covers dozens of task scenarios such as object detection, OCR, face recognition, portrait matting, multi-object tracking, NLP, Stable Diffusion text-to-image generation, and TTS, meeting developers' multi-scenario, multi-hardware, and multi-platform production deployment needs.

Deployment#

This document uses an NVIDIA Jetson NX as the test platform; the steps are similar on other Jetson platforms.

Before deploying, TensorRT and CUDA must already be installed on the device (they are normally included with JetPack); refer to the official NVIDIA documentation for installation instructions.
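
A quick way to confirm TensorRT is present (a minimal check, assuming the JetPack image ships the TensorRT Python bindings, i.e. python3-libnvinfer):

# Prints the TensorRT version if the JetPack Python bindings are installed
import tensorrt
print("TensorRT version:", tensorrt.__version__)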

Note: all of the following steps are performed on the device.

Download the Source Code#

The commit used in this document is 9689bf5fce94cba1aa732a30793916cde43b4ba1:

git clone https://github.com/PaddlePaddle/FastDeploy.git
cd FastDeploy && git checkout 9689bf5fce94cba1aa732a30793916cde43b4ba1 && cd ..

Configure & Build#

Download the Paddle Inference C++ library matching your JetPack version from the link below and extract it into a directory at the same level as FastDeploy:

https://www.paddlepaddle.org.cn/inference/v2.4/guides/install/download_lib.html#c

C++ Deployment#

cd FastDeploy
mkdir build && cd build 

cmake .. -DBUILD_ON_JETSON=ON \
         -DENABLE_TRT_BACKEND=ON \
         -DENABLE_VISION=ON \
         -DENABLE_TEXT=ON \
         -DENABLE_PADDLE_BACKEND=ON -DPADDLEINFERENCE_DIRECTORY=${PWD}/../paddle_inference_install_dir \
         -DCMAKE_INSTALL_PREFIX=${PWD}/installed_fastdeploy
# build
make -j4
# install
make install

Python Deployment#

cd FastDeploy/python
export BUILD_ON_JETSON=ON
export ENABLE_VISION=ON

# ENABLE_PADDLE_BACKEND & PADDLEINFERENCE_DIRECTORY are optional
export ENABLE_PADDLE_BACKEND=ON
export PADDLEINFERENCE_DIRECTORY=/Download/paddle_inference_jetson

python setup.py build
python setup.py bdist_wheel
pip3 install dist/fastdeploy_gpu_python-0.0.0-cp36-cp36m-linux_aarch64.whl
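
After installation, a quick import check confirms the wheel is usable (the version string may simply read 0.0.0 for a locally built wheel):

import fastdeploy
print(fastdeploy.__version__)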

Model Inference Tests#

Preparation#

Source the environment script from the installed SDK so that the FastDeploy and third-party libraries can be found at runtime:

cd installed_fastdeploy && source fastdeploy_init.sh

YOLOV5#

The model and test image can be downloaded following the instructions in the example's README.

Note: the Paddle format model is used here; the ONNX model does not work properly on this setup.

python#

# Run inference on the GPU with TensorRT
python infer.py --model yolov5s_infer --image 000000014439.jpg --device gpu --use_trt True
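
Under the hood, infer.py builds a RuntimeOption with the TensorRT backend and runs the detector through the FastDeploy Python API. A minimal sketch of the equivalent calls (the model/params file names inside yolov5s_infer are assumed to follow the standard Paddle export layout):

import cv2
import fastdeploy as fd

option = fd.RuntimeOption()
option.use_gpu()
option.use_trt_backend()

# Load the exported Paddle model (file names assumed: model.pdmodel / model.pdiparams)
model = fd.vision.detection.YOLOv5(
    "yolov5s_infer/model.pdmodel",
    "yolov5s_infer/model.pdiparams",
    runtime_option=option,
    model_format=fd.ModelFormat.PADDLE)

im = cv2.imread("000000014439.jpg")
result = model.predict(im)
print(result)  # prints a DetectionResult like the one shown below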

c++#

Build

cd examples/vision/detection/yolov5/cpp/
# build the test code
mkdir build && cd build

cmake .. -DFASTDEPLOY_INSTALL_DIR=../../../../../
make

Run

./infer_demo yolov5s_infer 000000014439.jpg 3

The result is as follows:#

DetectionResult: [xmin, ymin, xmax, ymax, score, label_id]
104.669334,46.087219, 127.901680, 94.499176, 0.856947, 0
157.721054,81.559006, 198.065811, 167.143372, 0.854261, 0
378.751434,40.270676, 395.942841, 83.398239, 0.830164, 0
267.751617,82.657181, 298.819305, 170.920868, 0.828556, 0
503.086304,112.830048, 592.226196, 276.320038, 0.783002, 0
362.641418,57.056015, 382.677429, 114.379959, 0.777170, 0
582.803772,112.910446, 613.015442, 201.026489, 0.766368, 0
327.868683,38.758179, 346.338654, 79.359100, 0.758027, 0
414.737396,89.913757, 504.320709, 285.370178, 0.717283, 0
186.462357,45.325516, 199.852371, 61.371841, 0.545427, 0
2.531250,151.546875, 38.812500, 173.625000, 0.537435, 24
351.980316,44.242645, 367.510651, 95.554779, 0.515579, 0
168.954941,47.224487, 178.287460, 60.941376, 0.496691, 0
163.140625,86.109375, 403.484375, 342.812500, 0.449054, 33
58.218750,153.437500, 102.375000, 174.234375, 0.394597, 24
71.406250,122.562500, 101.718750, 155.343750, 0.358583, 56
24.718750,117.343750, 59.531250, 152.937500, 0.317215, 24
65.265625,134.703125, 87.390625, 153.843750, 0.297299, 24
3.765625,134.781250, 41.828125, 153.421875, 0.269623, 24
465.328796,14.773834, 472.708252, 34.129822, 0.265332, 0

YOLOV7#

The model and test image can be downloaded following the instructions in the example's README.

python#

Run

python infer.py --model yolov7.onnx --image 000000014439.jpg --device gpu --use_trt True

c++#

Build

cd examples/vision/detection/yolov7/cpp/
# build the test code
mkdir build && cd build

cmake .. -DFASTDEPLOY_INSTALL_DIR=../../../../../
make

Run

./infer_demo yolov7.onnx 000000014439.jpg 2
# using the Paddle model
./infer_demo yolov7_infer 000000014439.jpg 3

The result is as follows:

DetectionResult: [xmin, ymin, xmax, ymax, score, label_id]
267.634705,88.168289, 298.606628, 169.180908, 0.894812, 0
414.397614,87.049408, 505.852631, 285.574371, 0.892970, 0
504.979309,112.990097, 594.087708, 271.906555, 0.887881, 0
103.902000,45.589203, 127.782860, 93.685974, 0.885139, 0
349.076599,43.947617, 366.672729, 97.737900, 0.860225, 0
164.047516,81.687592, 198.846161, 165.900269, 0.856663, 0
363.723083,58.837402, 381.935852, 114.391418, 0.852691, 0
327.998444,38.783173, 347.394501, 80.100067, 0.850448, 0
379.129486,39.812363, 395.328766, 84.123154, 0.831067, 0
162.015625,81.828125, 609.968750, 342.750000, 0.823380, 33
581.765564,113.130280, 612.558289, 193.391388, 0.762796, 0
26.318138,117.764587, 64.740952, 153.909332, 0.751199, 0
2.953125,150.718750, 38.109375, 172.796875, 0.629195, 24
75.281250,121.968750, 106.593750, 156.250000, 0.617437, 56
169.509720,47.386780, 178.708725, 61.201477, 0.532830, 0
64.968750,135.203125, 84.312500, 154.921875, 0.516204, 24
187.483490,44.773804, 199.905548, 61.227783, 0.512023, 0
100.765625,152.078125, 119.953125, 168.484375, 0.425604, 24
0.039804,125.777222, 8.523237, 171.827026, 0.398506, 0
279.960571,81.092270, 296.699524, 110.481308, 0.319030, 0
464.447540,15.560196, 471.513275, 33.871872, 0.282174, 0
396.453125,168.343750, 617.531250, 202.843750, 0.260488, 33

resnet#

Build:

cd examples/vision/classification/resnet/cpp
# build the test code
mkdir build && cd build
cmake .. -DFASTDEPLOY_INSTALL_DIR=../../../../../
make

Run the test:

# using the ONNX model
./infer_demo resnet50.onnx ILSVRC2012_val_00000010.jpeg 2

The result is as follows:

ClassifyResult( label_ids: 332,  scores: 0.825349,  )
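
The same classification can also be driven from Python; a minimal sketch, assuming the Python binding fastdeploy.vision.classification.ResNet mirrors the C++ class used by infer_demo:

import cv2
import fastdeploy as fd

option = fd.RuntimeOption()
option.use_gpu()
option.use_trt_backend()

# ONNX is the default model format for this class, so no params file is needed
model = fd.vision.classification.ResNet("resnet50.onnx", runtime_option=option)

im = cv2.imread("ILSVRC2012_val_00000010.jpeg")
result = model.predict(im)
print(result)  # ClassifyResult with the top ImageNet label id and score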

Optimization Tips#

The first run of a demo takes a long time to load because the TensorRT engine is built from scratch; serializing the engine to a cache file reduces the load time on subsequent runs:

int main(int argc, char* argv[]) {
  google::ParseCommandLineFlags(&argc, &argv, true);
  auto option = fastdeploy::RuntimeOption();
  // modify the option: cache the serialized TensorRT engine to cut load time
  option.trt_option.serialize_file = "./picodet.trt";
  if (!CreateRuntimeOption(&option)) {
    PrintUsage();
    return -1;
  }

  auto model = fastdeploy::vision::headpose::FSANet(FLAGS_model, "", option);
  if (!model.Initialized()) {
    std::cerr << "Failed to initialize." << std::endl;
    return -1;
  }
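
The same engine caching can be configured from Python; a minimal sketch, assuming the Python RuntimeOption exposes the same trt_option.serialize_file field as the C++ code above (the cache file name is arbitrary):

import fastdeploy as fd

option = fd.RuntimeOption()
option.use_gpu()
option.use_trt_backend()
# On the first run the built TensorRT engine is written to this file;
# later runs load it instead of rebuilding, which shortens startup considerably.
option.trt_option.serialize_file = "./yolov5s.trt"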