Deep View VisionPack
Deep View VisionPack allows developers to deploy AI Vision models created with eIQ Toolkit to any NXP i.MX 8M device or development platform with ease. By using the available hardware acceleration of the SoC, VisionPack provides a highly configurable, fully optimized, and tested AI Vision pipeline right out of the box.
The AI Application Zoo on the Deep View support site also provides several projects, pretrained on a variety of public datasets and tested with VisionPack, to allow for quick evaluation of the ModelPack workflow and easy deployment to i.MX 8M Plus processors and platforms.
For the latest on Deep View AI Middleware, visit https://support.deepviewml.com where you can download various eIQ Portal example dataset projects, our Model Zoo, and various demo applications showing the capabilities of the i.MX 8M Plus.
Prerequisites
- i.MX 8M Plus EVK (Maivin Guide Coming Soon!)
- NXP Yocto BSP 5.10.72 or newer.
- Refer to the loading Linux on SD card article for instructions on getting the latest Linux BSP onto your EVK.
- Camera connected to the EVK (MINISASTOCSI)
- HDMI Display connected to the EVK
Installation
Deep View VisionPack can be downloaded and installed using the following commands on any supported NXP i.MX8 EVK or compatible third-party i.MX8 platform.
The following should be run from a terminal session on your i.MX8 target.
# wget https://deepviewml.com/vpk/visionpack-latest-armv8.sh
# sh visionpack-latest-armv8.sh
Accept the EULA; the VisionPack trial will then be installed into a directory named visionpack-latest.
NOTE: You can also perform the installation on a desktop Linux or Windows WSL system for convenient access to the documentation and sample source code. Keep in mind that the libraries and pre-compiled binaries will only run on the actual i.MX8 target platforms.
Setup
Once VisionPack has been installed, you need to source the environment script, which configures the libraries and executables for your current session. This must be done every time a new terminal or SSH session is started.
# source visionpack-latest/bin/env.sh
Optionally, you may add the `source visionpack-latest/bin/env.sh` line to your home directory's .bashrc to have the environment configured automatically on login.
VisionPack is now installed and can be used to run your applications.
Samples
Sample applications and source code can be found in the samples folder, while pre-compiled binaries are located in the bin folder. We provide small sample datasets in the datasets folder along with ground truth annotations for each image. Pre-trained models for testing are available in the models folder, along with a README.txt describing the models and how to test them.
The following examples use the included ModelPack for Detection model trained on the COCO dataset.
Detect
The detect example demonstrates a minimal implementation that loads a detection model and acquires bounding boxes from an image. It supports a wide range of detection models such as ModelPack, YuNet (face detection), YOLOv4, YOLOv5, YOLOX, CenterNet, SSD MobileNet, SSD ResNet, and others.
# detect models/modelpack-detection-coco.rtm datasets/coco128/000000000641.jpg
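If you want to sweep the entire sample set rather than a single image, a small wrapper around the same command works. The following is a minimal sketch, assuming the VisionPack environment has been sourced so that detect is on the PATH and that it is run from the visionpack-latest directory; the helper itself is not part of VisionPack.

# Hypothetical batch helper: run the bundled detect binary over every image
# in the coco128 sample set using the same command shown above.
import glob
import subprocess

MODEL = "models/modelpack-detection-coco.rtm"

for image in sorted(glob.glob("datasets/coco128/*.jpg")):
    subprocess.run(["detect", MODEL, image], check=True)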
GStreamer
We provide a detect.sh script which creates a VisionPack detection pipeline using the gst-launch-1.0 command. It demonstrates using the GStreamer deepviewrt, boxdecode, and boxoverlay plugins from VisionPack.
This example runs ModelPack for Detection using the camera attached to the target. The model is trained on COCO, which includes a person class, so if you stand in front of the camera it should detect you.
# detect.sh models/modelpack-detection-coco.rtm
Examine the `detect.sh` script to see how the GStreamer pipeline is constructed, or call `detect.sh -h` for additional usage help.
Applications
We provide a collection of example applications which can be freely downloaded from the Deep View support site at https://support.deepviewml.com/hc/en-us/sections/6739406464269-App-Zoo.
Python Applications
A variety of Python applications are provided within VisionPack, along with the models required to run them. There are examples for people detection, face detection, human pose estimation, and head pose estimation. These can be run as follows:
# python3 peopledetect.py
# python3 facedetect.py
# python3 skeleton.py
# python3 headpose.py
If there is an issue accessing the camera, verify which device node the camera is using, then modify the line below in each of the Python files to use the corresponding device index (0, 1, 2, 3, etc.).
stream = Camera(3, CameraType.Stream)
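If you are unsure which index to use, listing the V4L2 device nodes on the target is a quick way to find it. The snippet below is a small sketch, not part of the samples; it simply prints the available /dev/video* nodes so you can match the number to the Camera(...) call above.

# List the available V4L2 capture nodes; the number in the node name is the
# index to pass to Camera(), e.g. /dev/video3 -> Camera(3, CameraType.Stream)
import glob

for node in sorted(glob.glob("/dev/video*")):
    print(node)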
Validation
We provide a validation library which can measure accuracy and timing benchmarks for a model on target. The validator uses a simple annotation format from the Darknet project, whereby each image.jpg should have a corresponding image.txt file containing one line per object, using the following format to define the expected bounding box for that object.
label_index x y width height
The label_index is the 0-based index of the label, not its string representation; x,y are the coordinates of the center of the object; and width and height give the size of the bounding box. All coordinates and sizes must be normalized to the range 0.0 to 1.0. For example, a 640x480 image with a 200x200 pixel object in the middle has a center point of 320,240, which when normalized gives 0.5,0.5, with a width and height of 0.3125 by 0.4167.
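As a worked example of the arithmetic above, the following sketch (not part of VisionPack) converts a pixel-space box into a normalized Darknet annotation line:

# Convert a pixel-space box (center x, y, width, height) into the normalized
# Darknet annotation format described above.
def to_darknet(label_index, cx, cy, w, h, image_w, image_h):
    return "%d %.4f %.4f %.4f %.4f" % (
        label_index, cx / image_w, cy / image_h, w / image_w, h / image_h)

# 640x480 image with a 200x200 object centered at (320, 240):
print(to_darknet(0, 320, 240, 200, 200, 640, 480))
# -> 0 0.5000 0.5000 0.3125 0.4167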
We provide a subset of 128 images from COCO using this same format in the datasets/coco128 folder. The following example demonstrates validating ModelPack trained on COCO on an NXP i.MX 8M Plus using the NPU.
python3 -m deepview.vaal.validator -e npu -n unsigned -d datasets/coco128 models/modelpack-detection-coco.rtm
VisionPack 1.1.16-23-gb8ea7fc-eval EVALUATION - Copyright 2022 Au-Zone Technologies
Reading annotation: 127: 100%|███████████████████████████████████████████████████████| 128/128 [00:00<00:00, 637.80it/s]
Computing Metrics 128: 100%|██████████████████████████████████████████████████████████| 128/128 [00:05<00:00, 25.34it/s]
===========================================================
VAAL Evaluation Summary
===========================================================
Metric | Value
-----------------------------------------------------------
mAP@0.5 | 45.80
mAP@0.75 | 44.09
mAP@0.5:0.95 | 41.06
Recall@0.5:0.95 | 24.96
-----------------------------------------------------------
Timings Report (milliseconds)
-----------------------------------------------------------
load image | avg: 9.98 min: 4.92 max: 16.03
inference | avg: 17.64 min: 17.57 max: 17.76
box decode | avg: 3.20 min: 2.62 max: 5.88
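As a quick sanity check, the average per-stage latencies above can be converted into an approximate per-stage throughput. This is a simple sketch using the numbers from the report above, not a VisionPack tool:

# Approximate per-stage throughput from the average latencies reported above.
timings_ms = {"load image": 9.98, "inference": 17.64, "box decode": 3.20}

for stage, ms in timings_ms.items():
    print("%-12s ~%.1f iterations/second" % (stage, 1000.0 / ms))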
Documentation
The user manuals for VAAL, VideoStream, and Deep View RT can be found in the doc folder, in both PDF and HTML formats.