GenSynth Documentation

Using GenSynth Augment

GenSynth Augment lets GenSynth generate new synthetic data to improve model accuracy through data augmentation. The data synthesis performed by GenSynth Augment falls into three main groups:

  • Geometric synthesis.

  • Photometric synthesis.

  • Patch-driven synthesis.

Once you select the type of data synthesis you want, GenSynth takes care of the rest.

If you wish to use augmentation, you must insert an augmentation step into the dataset pipeline that you create in your Dataset Entity. Usually you would add augmentation only to your training data, which is the dataset returned by get_train_dataset(). You will need to import gensynth_augment_image in your Dataset Entity; you may also require the SynthesisType enum. Both come from the built-in gensynthlib. Example import:

from gensynthlib import gensynth_augment_image, SynthesisType

You can use gensynth_augment_image in one of two ways:
  1. Apply it directly to a Dataset using Dataset.map(); in this case the structure of the dataset is assumed to be a dictionary containing an "image" key.

  2. Call it with an image tensor to return a new augmented-image tensor. Use this within a function passed to Dataset.map().

Input constraints

Your input image tensors must:

  • Be of type tf.uint8 with values in the range 0 to 255 (inclusive). 

  • Not have a batch dimension: the tensor must have exactly 3 dimensions of height, width, and depth (colour channels). Depth may be either 1 or 3; if the depth is 3, the channels must represent RGB.
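
The constraints above can be restated as a small validation helper. Note that check_augment_input is purely illustrative and not part of gensynthlib; it is shown here with NumPy arrays rather than tensors:

```python
import numpy as np

def check_augment_input(image):
    # Enforce the input constraints for gensynth_augment_image.
    if image.dtype != np.uint8:
        raise TypeError("image must be uint8, got %s" % image.dtype)
    if image.ndim != 3:
        raise ValueError("image must be 3-D (height, width, depth), "
                         "got %d dims" % image.ndim)
    if image.shape[-1] not in (1, 3):
        raise ValueError("depth must be 1 or 3, got %d" % image.shape[-1])

check_augment_input(np.zeros((64, 64, 3), dtype=np.uint8))  # OK: RGB
check_augment_input(np.zeros((64, 64, 1), dtype=np.uint8))  # OK: grayscale
```

A float image with values in [0, 1] can be brought into range beforehand with something like `np.round(img * 255).astype(np.uint8)`.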

Where to apply augmentation 

The best place to use this in the pipeline is after loading the raw images, after sharding, and before batching, scaling, and other data preprocessing steps.
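
This ordering can be sketched with plain Python lists standing in for a tf.data pipeline. None of the helpers below are part of gensynthlib; shard, fake_augment, and batch are stand-ins used only to show where augmentation sits:

```python
def shard(items, num_shards, index):
    # Mirrors tf.data.Dataset.shard: element at position p goes to shard p % num_shards.
    return [x for i, x in enumerate(items) if i % num_shards == index]

def batch(items, size):
    # Group consecutive elements, as Dataset.batch would.
    return [items[i:i + size] for i in range(0, len(items), size)]

def fake_augment(x):
    # Stand-in for gensynth_augment_image; a real pipeline maps it per image.
    return x * 10

raw = list(range(8))                            # 1. load raw images
sharded = shard(raw, num_shards=2, index=0)     # 2. shard -> [0, 2, 4, 6]
augmented = [fake_augment(x) for x in sharded]  # 3. augment each element
batches = batch(augmented, 2)                   # 4. batch last
print(batches)  # [[0, 20], [40, 60]]
```

Augmenting before batching keeps the function operating on single, unbatched images, which is exactly what the input constraints above require.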

How to use gensynth_augment_image

Use gensynth_augment_image in your data pipeline like this:

  1. Load data into a dictionary of {'image': image_data, ...}. The dataset 'image' key must be of type tf.uint8, with values in the range [0, 255]:

    raw_dataset = ...

  2. Apply sharding:

    shard_dataset = raw_dataset.shard(num_shards, shard_index)

  3. Map the augmentation function to all images of the shard:

    augmented_dataset = shard_dataset.map(gensynth_augment_image)

  4. Complete any further processing, including batching.

You may alternatively use gensynth_augment_image one image at a time:

image = gensynth_augment_image(image=image)

If you use this method, you have more control over the augmentation since you can use these parameters:

  • synthesis_to_use (List of SynthesisType; default: all). The set of augmentations GenSynth will use. Options:

      • SynthesisType.GEOMETRICS

      • SynthesisType.PHOTOMETRIC

      • SynthesisType.PATCH

  • with_pre_flip_x (Boolean; default: True). Apply an initial random horizontal flip before the other augmentations.

  • with_pre_flip_y (Boolean; default: False). Apply an initial random vertical flip before the other augmentations.

  • is_grayscale (Boolean; default: False). Set to True if the images are grayscale.

  • filler (int, float, str, or np.ndarray; default: 0). The value or function used to fill image regions left empty when image content has been transformed or removed.
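
The fill behaviour applied by gensynth_augment_image is internal to GenSynth, but the role of filler can be illustrated with a toy transform. Here shift_right is a hypothetical helper, not part of gensynthlib, and a "mean" fill is approximated with the image mean:

```python
import numpy as np

def shift_right(image, pixels, filler=0):
    # Toy geometric transform: shifting right exposes a strip on the left
    # that has no source pixels, so it must be filled with something.
    fill = image.mean() if filler == "mean" else filler
    out = np.empty_like(image)
    out[:, pixels:] = image[:, :-pixels]
    out[:, :pixels] = np.uint8(fill)
    return out

img = np.arange(12, dtype=np.uint8).reshape(3, 4, 1)  # tiny grayscale image
shifted = shift_right(img, 2, filler="mean")
print(shifted[0, :, 0])  # first row: [5 5 0 1] (mean of 0..11 is 5.5 -> 5)
```

With the default filler of 0, the exposed strip would simply be black.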

Example of augmentation using default parameters

In this example implementation of get_train_dataset(), a dataset is read from TF records, producing the raw records. It is then sharded according to the parameters passed to the interface. After shuffling it is then parsed with the load_data function, producing a dataset with an "image" key of uint8 images in the range 0..255. Then the gensynth_augment_image function is mapped across the dataset, which implicitly uses the tensor with the "image" key.

def get_train_dataset(self, num_shards=1, shard_index=0):
    # Reading from TF record files having "image" of uint8 type 0..255
    dataset = tf.data.TFRecordDataset(file_list)
    # Select the current shard, and reduce num_train appropriately.
    dataset = dataset.shard(num_shards, shard_index)
    num_train = (self.num_train // num_shards
                 + int(self.num_train % num_shards > shard_index))
    # Shuffle data with a buffer of 5000
    dataset = dataset.shuffle(5000)
    # Apply the load_data function to the shuffled items
    dataset = dataset.map(load_data)
    # GenSynth augmentation (implicitly using the "image" key)
    dataset = dataset.map(gensynth_augment_image)
    dataset = dataset.map(scale_levels)
    dataset = dataset.batch(TRAIN_BATCH_SIZE)
    return dataset, num_train, TRAIN_BATCH_SIZE
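
The num_train adjustment in the example above computes how many examples land on each shard. This is not a gensynthlib function, just the same arithmetic pulled out so the behaviour is easy to check:

```python
def shard_size(num_examples, num_shards, shard_index):
    # Dataset.shard(n, i) keeps elements at positions p where p % n == i,
    # so the first (num_examples % n) shards each receive one extra element.
    return num_examples // num_shards + int(num_examples % num_shards > shard_index)

# 10 examples across 3 shards: positions 0, 3, 6, 9 land on shard 0.
sizes = [shard_size(10, 3, i) for i in range(3)]
print(sizes)  # [4, 3, 3]; the sizes always sum to num_examples
```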

Example of augmentation using parameters

This advanced method is required if you need to customize the augmentation parameters or if the dataset is not in the format of a dictionary with an "image" key.

In this example implementation of get_train_dataset(), a dataset is read from TF records, producing the raw records. It is then sharded according to the parameters passed to the interface. After shuffling it is then parsed with the load_data function, producing a dataset with an "img" key of uint8 images in the range 0..255. Then the parameter_augment function is mapped across the dataset; within it, the image tensor is augmented.

def get_train_dataset(self, num_shards=1, shard_index=0):
    # Reading from TF record files having "image" of uint8 type 0..255
    dataset = tf.data.TFRecordDataset(file_list)
    # Select the current shard, and reduce num_train appropriately.
    dataset = dataset.shard(num_shards, shard_index)
    num_train = (self.num_train // num_shards
                + int(self.num_train % num_shards > shard_index))
    # shuffle data with a buffer of 5000
    dataset = dataset.shuffle(5000)
    # Apply the load_data function to the shuffled items
    dataset = dataset.map(load_data)
    # GenSynth augmentation with a map function
    dataset = dataset.map(parameter_augment)
    dataset = dataset.map(scale_levels)
    dataset = dataset.batch(TRAIN_BATCH_SIZE)
    return dataset, num_train, TRAIN_BATCH_SIZE

def parameter_augment(ds):
    synthesis = [SynthesisType.GEOMETRICS,
                 SynthesisType.PATCH,
                 SynthesisType.PHOTOMETRIC]
    ds['img'] = gensynth_augment_image(image=ds['img'],
                    synthesis_to_use=synthesis,
                    with_pre_flip_x=False,
                    with_pre_flip_y=True,
                    is_grayscale=False,
                    filler="mean")
    return ds

def load_data(serialized):
    """
    Example load_data that produces a uint8 image from Tf Record    
    """
    feature = {'image': tf.FixedLenFeature([], tf.string),
               'label': tf.FixedLenFeature([], tf.int64)}
    
    features = tf.parse_single_example(serialized, features=feature)
    # Convert the image data from string back to the numbers
    image = tf.decode_raw(features['image'], tf.uint8)
    
    label = tf.cast(features['label'], tf.int64)
    
    # Note: the dictionary has 'img' key
    return {'img': image, 'label': label}