The Maivin Detect application includes a ByteTrack implementation, which can be used to track objects across frames and to smooth out bounding boxes. The implementation uses a linear, constant-velocity Kalman filter to predict the location and size of boxes.
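As a rough illustration of what a constant-velocity prediction means, the following Python sketch advances a box by one time step. The state layout (centre, size, and a velocity for each) and the names BoxState and predict are assumptions made for this example, not the Maivin Detect implementation, and the covariance handling of a full Kalman filter is left out.

```python
from dataclasses import dataclass


@dataclass
class BoxState:
    # Hypothetical state layout: box centre, box size, and their rates of change.
    cx: float
    cy: float
    w: float
    h: float
    vcx: float = 0.0
    vcy: float = 0.0
    vw: float = 0.0
    vh: float = 0.0


def predict(state: BoxState, dt: float) -> BoxState:
    """Advance the box by dt seconds, assuming all velocities stay constant."""
    return BoxState(
        cx=state.cx + state.vcx * dt,
        cy=state.cy + state.vcy * dt,
        w=state.w + state.vw * dt,
        h=state.h + state.vh * dt,
        vcx=state.vcx,
        vcy=state.vcy,
        vw=state.vw,
        vh=state.vh,
    )


if __name__ == "__main__":
    # A box drifting to the right at 0.1 (normalized units) per second.
    box = BoxState(cx=0.5, cy=0.5, w=0.2, h=0.4, vcx=0.1)
    print(predict(box, dt=0.5))
```

The predicted box is what incoming detections are compared against when deciding whether they belong to an existing track.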
Configuration
The tracker can be enabled and configured by editing the configuration file for the detect application, which is located at "/etc/default/detect". The relevant section, with its default values, is shown below:
# Turn on the ByteTrack tracker. This is useful for smoothing bounding boxes
# across frames, and for associating multiple detections over time to a single
# object
TRACK = "false"
# The number of seconds a tracked object can be missing for before being removed
# from tracking.
TRACK_EXTRA_LIFESPAN = "2.0"
# The high confidence threshold for the ByteTrack algorithm
TRACK_HIGH_CONF = "0.7"
# The tracking iou threshold for box association. Higher values will require
# boxes to have higher IOU to the predicted track location to be associated
TRACK_IOU = "0.25"
# The tracking update factor. Higher update factor will mean less smoothing
# but more rapid response to change. Use values from 0.0 to 1.0. Values outside this
# range will cause unexpected behaviour
TRACK_UPDATE = "0.25"
The tracker can be turned on by setting TRACK to "true".
TRACK = "true"
When tracking is enabled, several parameters can be used to tune the behaviour of the tracker:
- TRACK_EXTRA_LIFESPAN
  This controls how long a track can be missing its object before the track is removed. This is a floating point number in seconds. A larger number means the tracked object can be missing for a longer period of time before being removed.
- TRACK_HIGH_CONF
  This controls the high confidence threshold used by ByteTrack to create new tracks and to give detections priority during matching. Effectively, it supersedes the detection threshold for track creation: objects detected beneath this threshold do not start new tracks. However, detected objects with scores between the detection threshold and the high confidence threshold can still be used to extend an already existing track. This value should be higher than the value used for THRESHOLD. This is a floating point number between 0.0 and 1.0.
- TRACK_IOU
  This controls how closely a detected box needs to match the box predicted by the tracker's Kalman filter for the detected box to become associated with the track. Smaller and/or faster moving boxes need a smaller value of TRACK_IOU, although a lower value can introduce additional false associations. If an object keeps creating new tracks and changing its ID, it is recommended to lower TRACK_IOU. If distinct objects become merged into the same track, it is recommended to increase TRACK_IOU. This is a floating point number between 0.0 and 1.0.
- TRACK_UPDATE
  This controls the responsiveness of the tracks. A higher update factor makes the tracked box respond faster to changes in the velocity of the associated box, but also reduces the smoothness of the track. For situations where boxes undergo very large changes in velocity, a higher update factor is recommended. For models which produce noisy boxes, a lower update factor is recommended to achieve smoother tracks. This is a floating point number between 0.0 and 1.0.
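To make the role of TRACK_IOU more concrete, here is a minimal Python sketch of an IoU check between a predicted box and a detected box. The iou function is the standard intersection-over-union calculation; the can_associate helper, the (x1, y1, x2, y2) box layout, and the use of ">=" at the boundary are assumptions for illustration and may not match how the detect application performs association.

```python
def iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def can_associate(predicted, detected, track_iou):
    """Hypothetical gate: True when the overlap reaches the TRACK_IOU value."""
    return iou(predicted, detected) >= track_iou


if __name__ == "__main__":
    # A small box that moved by three quarters of its own width between frames.
    predicted = (0.0, 0.0, 2.0, 2.0)
    detected = (1.5, 0.0, 3.5, 2.0)
    print(iou(predicted, detected))
    print(can_associate(predicted, detected, 0.25))
    print(can_associate(predicted, detected, 0.1))
```

In this example the small, fast-moving box overlaps its prediction only slightly, so it would be rejected with a TRACK_IOU of 0.25 but accepted with 0.1, which is the situation the TRACK_IOU description warns about.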
In addition to these settings, when using the tracker, it is recommended to lower the score threshold of the model so that the tracker can receive low score boxes. The threshold can be reduced using the THRESHOLD configuration option.
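The interaction between THRESHOLD and TRACK_HIGH_CONF can be sketched as a simple partition of detections by score. This is only an illustration of the behaviour described above: the split_detections function, the (label, score) tuple layout, and the handling of scores exactly on a boundary are assumptions, not the application's code.

```python
def split_detections(detections, threshold, track_high_conf):
    """Partition (label, score) pairs into high- and low-confidence groups.

    Detections scoring below `threshold` are dropped entirely.
    """
    high = [d for d in detections if d[1] >= track_high_conf]
    low = [d for d in detections if threshold <= d[1] < track_high_conf]
    return high, low


if __name__ == "__main__":
    detections = [("person", 0.9), ("person", 0.5), ("person", 0.1)]
    high, low = split_detections(detections, threshold=0.2, track_high_conf=0.7)
    print("may start or extend a track:", high)
    print("may only extend an existing track:", low)
```

If THRESHOLD were set as high as TRACK_HIGH_CONF, the low-confidence group would always be empty, which is why lowering THRESHOLD is recommended when the tracker is in use.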
Example Configurations
Some example configurations which were found to be useful for the included peopledetect and facedetect models:
peopledetect.rtm
People are usually slower moving, so a larger value of TRACK_IOU = "0.25" is used. The peopledetect model generally produces scores around 0.7 or higher for well detected people, so we used TRACK_HIGH_CONF = "0.7" for the tracker. We chose a lower THRESHOLD = "0.2" for the model. The boxes produced by the peopledetect model have a little bit of flicker and noise, so a lower value was used for TRACK_UPDATE = "0.25" so that the tracker would smooth out the boxes.
# This is the configuration file for the detect systemd service file. When
# running systemctl start detect the service will use these configurations.
# If running detect directly, you must continue to use the command-line options.
# A model is required for the detect application.
MODEL = "/usr/share/detect/peopledetect.rtm"
# An optional decoder model can be used for two-stage models. Leave commented
# out unless required, cannot be empty if defined.
# DECODER_MODEL = ""
# Enables publishing the visualization message
VISUALIZATION = "true"
# Annotation labels can display various information. This controls the string
# annotation published to the visualization topic. The following values are supported.
#
# index - Label index integer as a string.
# score - Floating-point score (0..1) as a string.
# label - Label string, fallback to index if model does not contain labels.
# label-score - The string will be the combination of Label and Score
# from above using the following format: "Label Score".
# track - The UUID of the tracked object. This is only available when tracking
# is enabled. Otherwise will show the score.
LABELS = "label"
# Score threshold sets the minimum detection score before a bounding box is
# generated for the inferred object.
THRESHOLD = "0.2"
# Detection IOU controls the minimum overlap for merging boxes during NMS. A
# larger number will produce more boxes with some overlap while a smaller number
# will generate fewer boxes.
IOU = "0.45"
# Maximum number of boxes which can be generated.
MAX_BOXES = "50"
# The label offset is required for certain models to account for differences
# in background class handling relative to the labels.txt. It should usually
# be zero but some odd configurations will sometimes require 1 or -1 for offset.
LABEL_OFFSET = "0"
# The model can be run using different computation engines. The default engine
# is the i.MX 8M Plus NPU. Other options are the CPU or the GPU.
ENGINE = "NPU"
# Turn on the ByteTrack tracker. This is useful for smoothing bounding boxes
# across frames, and for associating multiple detections over time to a single
# object
TRACK = "true"
# The number of seconds a tracked object can be missing for before being removed
# from tracking.
TRACK_EXTRA_LIFESPAN = "2.0"
# The high confidence threshold for the ByteTrack algorithm
TRACK_HIGH_CONF = "0.7"
# The tracking iou threshold for box association. Higher values will require
# boxes to have higher IOU to the predicted track location to be associated
TRACK_IOU = "0.25"
# The tracking update factor. Higher update factor will mean less smoothing
# but more rapid response to change. Use values from 0.0 to 1.0. Values outside this
# range will cause unexpected behaviour
TRACK_UPDATE = "0.25"
# The NPU is very slow loading initial models because of graph creation times.
# This option will enable the graph caching which significantly speeds up load
# times for models. If you encounter issues set to 0 to disable.
# https://support.deepviewml.com/hc/en-us/articles/4422857692557-NPU-Model-Cache
VIV_VX_ENABLE_CACHE_GRAPH_BINARY = "0"
# Control the NPU graph cache storage location.
VIV_VX_CACHE_BINARY_GRAPH_DIR = "/var/cache"
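To illustrate why a low TRACK_UPDATE suits the slightly noisy peopledetect boxes, the sketch below blends each new measurement into a running estimate using an update factor. Modelling TRACK_UPDATE as a simple blending weight is an assumption made for this example; the actual tracker applies its update inside the Kalman filter, and the smooth function and sample values are hypothetical.

```python
def smooth(values, update):
    """Blend each new measurement into the running estimate.

    Assumed model: estimate += update * (measurement - estimate).
    """
    estimate = values[0]
    out = [estimate]
    for v in values[1:]:
        estimate += update * (v - estimate)
        out.append(estimate)
    return out


if __name__ == "__main__":
    # Hypothetical x-coordinates of a box edge that flickers around 100.
    noisy_x = [100, 104, 97, 103, 98, 102]
    print([round(x, 1) for x in smooth(noisy_x, 0.25)])
    print([round(x, 1) for x in smooth(noisy_x, 0.75)])
```

With the lower factor the output stays much closer to a steady value, at the cost of reacting more slowly when the object really does move.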
facedetect.rtm
Faces are usually smaller and move faster relative to the box size, so a smaller value of TRACK_IOU = "0.1" is used. The facedetect model generally produces scores around 0.85 or higher for well detected faces, so we used TRACK_HIGH_CONF = "0.85" for the tracker and THRESHOLD = "0.4" for the model. The boxes produced by the facedetect model are stable and less noisy, so a high value was used for TRACK_UPDATE = "0.75" so that the tracker would be more responsive to sudden changes in box movement.
# This is the configuration file for the detect systemd service file. When
# running systemctl start detect the service will use these configurations.
# If running detect directly, you must continue to use the command-line options.
# A model is required for the detect application.
MODEL = "/usr/share/detect/facedetect.rtm"
# An optional decoder model can be used for two-stage models. Leave commented
# out unless required, cannot be empty if defined.
# DECODER_MODEL = ""
# Enables publishing the visualization message
VISUALIZATION = "true"
# Annotation labels can display various information. This controls the string
# annotation published to the visualization topic. The following values are supported.
#
# index - Label index integer as a string.
# score - Floating-point score (0..1) as a string.
# label - Label string, fallback to index if model does not contain labels.
# label-score - The string will be the combination of Label and Score
# from above using the following format: "Label Score".
# track - The UUID of the tracked object. This is only available when tracking
# is enabled. Otherwise will show the score.
LABELS = "label"
# Score threshold sets the minimum detection score before a bounding box is
# generated for the inferred object.
THRESHOLD = "0.4"
# Detection IOU controls the minimum overlap for merging boxes during NMS. A
# larger number will produce more boxes with some overlap while a smaller number
# will generate fewer boxes.
IOU = "0.45"
# Maximum number of boxes which can be generated.
MAX_BOXES = "50"
# The label offset is required for certain models to account for differences
# in background class handling relative to the labels.txt. It should usually
# be zero but some odd configurations will sometimes require 1 or -1 for offset.
LABEL_OFFSET = "0"
# The model can be run using different computation engines. The default engine
# is the i.MX 8M Plus NPU. Other options are the CPU or the GPU.
ENGINE = "NPU"
# Turn on the ByteTrack tracker. This is useful for smoothing bounding boxes
# across frames, and for associating multiple detections over time to a single
# object
TRACK = "true"
# The number of seconds a tracked object can be missing for before being removed
# from tracking.
TRACK_EXTRA_LIFESPAN = "2.0"
# The high confidence threshold for the ByteTrack algorithm
TRACK_HIGH_CONF = "0.85"
# The tracking iou threshold for box association. Higher values will require
# boxes to have higher IOU to the predicted track location to be associated
TRACK_IOU = "0.1"
# The tracking update factor. Higher update factor will mean less smoothing
# but more rapid response to change. Use values from 0.0 to 1.0. Values outside this
# range will cause unexpected behaviour
TRACK_UPDATE = "0.75"
# The NPU is very slow loading initial models because of graph creation times.
# This option will enable the graph caching which significantly speeds up load
# times for models. If you encounter issues set to 0 to disable.
# https://support.deepviewml.com/hc/en-us/articles/4422857692557-NPU-Model-Cache
VIV_VX_ENABLE_CACHE_GRAPH_BINARY = "0"
# Control the NPU graph cache storage location.
VIV_VX_CACHE_BINARY_GRAPH_DIR = "/var/cache"
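Before restarting the service with an edited configuration, it can help to check the tracker values against the constraints stated in this article (the 0.0 to 1.0 ranges, and TRACK_HIGH_CONF being higher than THRESHOLD). The following Python sketch is a hypothetical helper, not a tool shipped with the detect application; it assumes every setting is written as KEY = "value" on its own line, as in the listings above.

```python
import re

# Matches lines of the form KEY = "value".
LINE = re.compile(r'^\s*([A-Z0-9_]+)\s*=\s*"([^"]*)"\s*$')


def parse_config(text):
    """Collect KEY = "value" pairs, skipping comments and other lines."""
    values = {}
    for line in text.splitlines():
        if line.lstrip().startswith("#"):
            continue
        match = LINE.match(line)
        if match:
            values[match.group(1)] = match.group(2)
    return values


def check_tracker_settings(values):
    """Return a list of human-readable problems (empty if none were found)."""
    problems = []
    for key in ("TRACK_HIGH_CONF", "TRACK_IOU", "TRACK_UPDATE"):
        if key in values and not 0.0 <= float(values[key]) <= 1.0:
            problems.append(f"{key} should be between 0.0 and 1.0")
    if "TRACK_HIGH_CONF" in values and "THRESHOLD" in values:
        if float(values["TRACK_HIGH_CONF"]) <= float(values["THRESHOLD"]):
            problems.append("TRACK_HIGH_CONF should be higher than THRESHOLD")
    return problems


if __name__ == "__main__":
    sample = '''
    # Deliberately misconfigured sample
    THRESHOLD = "0.8"
    TRACK_HIGH_CONF = "0.7"
    TRACK_UPDATE = "1.5"
    '''
    for problem in check_tracker_settings(parse_config(sample)):
        print(problem)
```

To check a real device, the sample string could be replaced with the contents of "/etc/default/detect".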