Dataset Builder Packages
Dataset Builder packages provide instructions for downloading a number of popular datasets, along with scripts that prepare those datasets for use with GenSynth. Each package also includes the associated Dataset Interface for easy setup in GenSynth. Like the Dataset packages, the Dataset Builder packages serve as worked examples of how to prepare your own datasets for custom model development in GenSynth. Available Dataset Builder packages include:
Package Name | Description | Applies to: |
google_speech_commands | This package provides a dataset interface for Google Speech Commands, based on the raw data described at https://ai.googleblog.com/2017/08/launching-speech-commands-dataset.html. The dataset contains one-second clips (WAV files), each of which is a single word spoken by one of many different subjects. | 1.11+ |
imagenet_builder | This package provides the instructions and scripts to generate ImageNet TfRecords from the raw ImageNet classification data files, which you must first download. | 1.11+ |
imagenet50_builder | This package provides instructions and scripts to create an image classification dataset from any subset of the ImageNet classes. The included configuration file extracts 50 specific classes that DarwinAI has used for many classification experiments. | 1.11+ |
kitti_car_3d | This package provides the instructions to generate TfRecords and the dataset interface file for the KITTI 3D car dataset. | 1.11+ |
voc0712_tfds | This package shows how to build the VOC 2007/2012 object detection dataset using the TensorFlow Datasets (tfds) library, demonstrating the use of tfds with GenSynth. You must explicitly download the data after the package is installed. | 1.11+ |
voc0712_tfrecords | This package provides a script to create TfRecords for the VOC 2007/2012 object detection dataset from the downloaded raw dataset. | 1.11+ |
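
The class-subset idea behind imagenet50_builder can be illustrated with a short sketch. This is not the script shipped in the package: the function name `select_class_subset`, the one-folder-per-class layout, and the `.JPEG` file extension are all assumptions made for illustration. The sketch keeps only the images whose class folder is in a requested list and remaps those classes to contiguous integer labels, which is the usual preparation step before writing records for a classifier.

```python
from pathlib import Path
import tempfile


def select_class_subset(root, wanted_classes):
    """Collect (image_path, label) pairs for a subset of class folders.

    Hypothetical helper, not part of any GenSynth package. Assumes `root`
    contains one sub-directory per class ID, each holding *.JPEG files.
    Labels are contiguous integers assigned in sorted class-ID order.
    """
    root = Path(root)
    label_of = {cls: i for i, cls in enumerate(sorted(set(wanted_classes)))}
    samples = []
    for cls, label in label_of.items():
        class_dir = root / cls
        if not class_dir.is_dir():
            raise FileNotFoundError(f"missing class directory: {class_dir}")
        for image_path in sorted(class_dir.glob("*.JPEG")):
            samples.append((image_path, label))
    return samples, label_of


if __name__ == "__main__":
    # Build a tiny stand-in directory tree so the sketch runs without ImageNet.
    with tempfile.TemporaryDirectory() as tmp:
        for cls in ["n01440764", "n01443537", "n01484850"]:
            (Path(tmp) / cls).mkdir()
            for i in range(2):
                (Path(tmp) / cls / f"{cls}_{i}.JPEG").touch()

        samples, label_of = select_class_subset(tmp, ["n01484850", "n01440764"])
        print(label_of)
        for path, label in samples:
            print(label, path.name)
```

Sorting the class IDs before assigning labels keeps the label mapping stable across runs, so a model trained on one build of the subset stays compatible with a later rebuild from the same class list.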