Introduction
TFRecord is a file format created by TensorFlow. It acts as a container for storing one or more instances from a dataset. The advantage of this format is that the information is stored in binary form within the files. Creating and reading TFRecord files can be confusing at times, which is why we have created this guide to explain how to train ModelPack on TFRecord datasets.
Dataset Schema
Each TFRecord file relies on a serialization schema. That schema must be defined on both sides: when creating the dataset and when reading it. If the schemas disagree in any way, the reading process will not succeed.
For ModelPack we have created a simple schema capable of handling object detection datasets. The schema is defined by a dictionary that maps the name of each attribute in the instance to its value. Values are always of type tf.train.Feature, and the inner type of each feature is set in its constructor.
feature = {
'image': tf.train.Feature(bytes_list=tf.train.BytesList(...)),
'image_name': tf.train.Feature(bytes_list=tf.train.BytesList(...)),
'width': tf.train.Feature(int64_list=tf.train.Int64List(...)),
'height': tf.train.Feature(int64_list=tf.train.Int64List(...)),
'objects': tf.train.Feature(int64_list=tf.train.Int64List(...)),
'bboxes': tf.train.Feature(float_list=tf.train.FloatList(...))
}
- 'image': contains the serialized bytes-encoded image
- 'image_name': contains the serialized bytes-encoded name of the image on disk
- 'width': contains the serialized int64 image width
- 'height': contains the serialized int64 image height
- 'objects': contains the serialized list of class ids within the dataset (int64 values)
- 'bboxes': contains the list of bounding-box coordinates in YOLO format for each bounding box (float32 values). In this particular case the values are flattened, so they need to be reshaped into (-1, 4) after loading
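The flatten-then-reshape round trip for 'bboxes' can be sketched in a few lines of NumPy (the example boxes are made up for illustration):

```python
import numpy as np

# Two boxes in YOLO format: (x_center, y_center, width, height), normalized.
bboxes = np.array([[0.5, 0.5, 0.7, 0.7],
                   [0.2, 0.3, 0.1, 0.1]], dtype=np.float32)

flat = bboxes.flatten()         # shape (8,) -- this is what gets stored in the record
restored = flat.reshape(-1, 4)  # shape (2, 4) -- recovered after parsing
```

Because each box always has exactly 4 coordinates, reshape(-1, 4) recovers the original layout regardless of how many objects the image contains.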
Example 1: How to write dataset samples
A very common question is how to create a dataset in TFRecord format. It is not complicated at all, but we need to be careful at load time because the writing schema has to match the reading schema.
import cv2
import numpy as np
import tensorflow as tf
import os
def _bytes_feature(value):
    """Returns a bytes_list from a string / byte."""
    if isinstance(value, type(tf.constant(0))):
        value = value.numpy()  # BytesList won't unpack a string from an EagerTensor.
    return tf.train.Feature(bytes_list=tf.train.BytesList(value=[value]))

image_path = "Any Image that could be readable for CV"
img = cv2.imread(image_path)
rgb_img = cv2.cvtColor(img, cv2.COLOR_BGR2RGB)  # needs to be in RGB format, otherwise we have to do the transformation by the time we read it
height, width, _ = rgb_img.shape

bboxes = [[0.5, 0.5, 0.7, 0.7]]  # big box in the center of the image (0.5, 0.5)
class_ids = [1]  # person
bboxes = np.array(bboxes, dtype=np.float32).flatten()

# Encode the image as JPEG bytes so it matches the reading side (tf.io.decode_jpeg).
encoded_image = tf.io.encode_jpeg(rgb_img).numpy()

feature = {
    'image': _bytes_feature(encoded_image),
    'image_name': _bytes_feature(os.path.basename(image_path).encode()),
    'width': tf.train.Feature(int64_list=tf.train.Int64List(value=[width])),
    'height': tf.train.Feature(int64_list=tf.train.Int64List(value=[height])),
    'objects': tf.train.Feature(int64_list=tf.train.Int64List(value=class_ids)),
    'bboxes': tf.train.Feature(float_list=tf.train.FloatList(value=bboxes))
}

example_proto = tf.train.Example(features=tf.train.Features(feature=feature))
serialized_sample = example_proto.SerializeToString()

with tf.io.TFRecordWriter("path to the destination *.tfrecord file") as tfwriter:
    tfwriter.write(serialized_sample)
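Real datasets contain many samples, and each one is written with its own write call. A minimal sketch of that loop, using tiny synthetic in-memory images and an illustrative serialize_sample helper (the file name train.tfrecord and the sample data are placeholders, not part of ModelPack):

```python
import numpy as np
import tensorflow as tf

def serialize_sample(image_bytes, image_name, width, height, class_ids, bboxes):
    """Build one tf.train.Example following the schema described above."""
    feature = {
        'image': tf.train.Feature(bytes_list=tf.train.BytesList(value=[image_bytes])),
        'image_name': tf.train.Feature(bytes_list=tf.train.BytesList(value=[image_name.encode()])),
        'width': tf.train.Feature(int64_list=tf.train.Int64List(value=[width])),
        'height': tf.train.Feature(int64_list=tf.train.Int64List(value=[height])),
        'objects': tf.train.Feature(int64_list=tf.train.Int64List(value=class_ids)),
        'bboxes': tf.train.Feature(float_list=tf.train.FloatList(
            value=np.asarray(bboxes, dtype=np.float32).flatten())),
    }
    return tf.train.Example(features=tf.train.Features(feature=feature)).SerializeToString()

# Hypothetical samples: (jpeg bytes, name, width, height, class ids, boxes).
samples = [
    (tf.io.encode_jpeg(tf.zeros([8, 8, 3], tf.uint8)).numpy(), 'a.jpg', 8, 8,
     [1], [[0.5, 0.5, 0.7, 0.7]]),
    (tf.io.encode_jpeg(tf.zeros([4, 4, 3], tf.uint8)).numpy(), 'b.jpg', 4, 4,
     [0, 2], [[0.1, 0.1, 0.2, 0.2], [0.6, 0.6, 0.3, 0.3]]),
]

# One serialized example per write call; all records land in the same file.
with tf.io.TFRecordWriter('train.tfrecord') as writer:
    for image_bytes, name, w, h, ids, boxes in samples:
        writer.write(serialize_sample(image_bytes, name, w, h, ids, boxes))
```

Keeping the serialization in a single helper makes it much harder for the writing side to drift away from the reading schema.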
Example 2: How to read dataset samples
Reading the file is straightforward; the only requirement is to agree with the writing schema. Now we go in the opposite direction and must define the reading schema. This schema uses tf.io.FixedLenFeature to specify single values, while tf.io.VarLenFeature specifies lists of values. Each schema entry receives the type of the value as a parameter. For attributes that hold a single value (FixedLenFeature), we need to pass an empty shape [] along with the base datatype. For the variable-length ones, we only need to specify the type.
feature_description = {
"image": tf.io.FixedLenFeature([], tf.string),
"image_name": tf.io.FixedLenFeature([], tf.string),
"width": tf.io.FixedLenFeature([], tf.int64),
"height": tf.io.FixedLenFeature([], tf.int64),
"objects": tf.io.VarLenFeature(tf.int64),
"bboxes": tf.io.VarLenFeature(tf.float32),
}
# `example` is one serialized record, e.g. yielded by tf.data.TFRecordDataset.
sample = tf.io.parse_single_example(
    example,
    feature_description
)
In the code section above, sample is a dictionary indexed by the keys defined in the writing schema, and its fields are accessed like any Python dictionary.
img = tf.io.decode_jpeg(sample['image']).numpy()
oH, oW, _ = img.shape  # oH == sample['height'], oW == sample['width']
labels = tf.sparse.to_dense(sample['objects']).numpy().astype(np.int32)
bboxes = tf.sparse.to_dense(sample['bboxes']).numpy().reshape(-1, 4).astype(np.float32)
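Putting both sides together, here is a self-contained round-trip sketch: it writes one synthetic sample with the schema above, then streams the file back through tf.data.TFRecordDataset and parses each record. The file name demo.tfrecord and the zero-filled image are illustrative:

```python
import numpy as np
import tensorflow as tf

feature_description = {
    'image': tf.io.FixedLenFeature([], tf.string),
    'image_name': tf.io.FixedLenFeature([], tf.string),
    'width': tf.io.FixedLenFeature([], tf.int64),
    'height': tf.io.FixedLenFeature([], tf.int64),
    'objects': tf.io.VarLenFeature(tf.int64),
    'bboxes': tf.io.VarLenFeature(tf.float32),
}

def parse_fn(example):
    return tf.io.parse_single_example(example, feature_description)

# Write one synthetic 16x16 sample so the snippet runs on its own.
image_bytes = tf.io.encode_jpeg(tf.zeros([16, 16, 3], tf.uint8)).numpy()
feature = {
    'image': tf.train.Feature(bytes_list=tf.train.BytesList(value=[image_bytes])),
    'image_name': tf.train.Feature(bytes_list=tf.train.BytesList(value=[b'demo.jpg'])),
    'width': tf.train.Feature(int64_list=tf.train.Int64List(value=[16])),
    'height': tf.train.Feature(int64_list=tf.train.Int64List(value=[16])),
    'objects': tf.train.Feature(int64_list=tf.train.Int64List(value=[1])),
    'bboxes': tf.train.Feature(float_list=tf.train.FloatList(value=[0.5, 0.5, 0.7, 0.7])),
}
with tf.io.TFRecordWriter('demo.tfrecord') as writer:
    writer.write(tf.train.Example(
        features=tf.train.Features(feature=feature)).SerializeToString())

# Stream the file back and parse each record with the reading schema.
dataset = tf.data.TFRecordDataset('demo.tfrecord').map(parse_fn)
for sample in dataset:
    img = tf.io.decode_jpeg(sample['image']).numpy()
    boxes = tf.sparse.to_dense(sample['bboxes']).numpy().reshape(-1, 4)
```

In a real training pipeline the same map(parse_fn) stage is typically followed by shuffle, batch, and prefetch before the data reaches the model.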
Conclusions
In this guide we have explained how to properly write and read TFRecord files. TFRecord files in the format above can be used as the input dataset when training with the stand-alone ModelPack package.