Image Trainers

The ml_trainers module provides classes for training machine learning models, each tailored to a specific modality and task. The classes streamline loading a dataset and training a model on it.

Trainer: Image Classification tasks

ImageClassificationTrainer

Bases: MLPattern

Source code in cucaracha/ml_trainers/image_classification_trainer.py
class ImageClassificationTrainer(MLPattern):
    def __init__(self, dataset_path: str, num_classes: int, **kwargs):
        """
        Main constructor for a general image classification trainer.

        Note:
            The `dataset_path` should follow the `cucaracha` dataset folder
            organization. More details about how to organize the dataset can be
            found in the `cucaracha` documentation.

        Info:
            There are many ways to find and build datasets for your machine
            learning models. A simple option is to use the public datasets
            provided in the `cucaracha` Kaggle repository. You can find more
            details at: [https://www.kaggle.com/organizations/cucaracha-project](https://www.kaggle.com/organizations/cucaracha-project)

        Args:
            dataset_path (str): The path to the dataset. This should follow the
                `cucaracha` dataset folder organization.
            num_classes (int): The number of classes in the dataset. This must
                match the classes present in the dataset.
            **kwargs: Additional keyword arguments for configuring the model.
                Possible keys include:

                - 'img_shape' (tuple): The shape of the input images. Default
                  is (128, 128).
                - 'architecture' (object): The model architecture to use. If
                  not provided, a default SmallXception architecture is used.
                - 'batch_size' (int): The batch size to use during training.
                  Default is 64.
                - 'epochs' (int): The number of epochs to train the model.
                  Default is 500.
                - 'model_name' (str): The name to use when saving the trained
                  model. If not provided, a default name is generated.
                - 'loss', 'metrics', 'optimizer': Override the default
                  categorical cross-entropy loss, categorical accuracy
                  metric, and Adam optimizer.
                - 'data_generator' (list): Keras layers to use for data
                  augmentation. If not provided, a default set is used.
                - 'use_data_augmentation' (bool): Whether to apply the data
                  augmentation when loading the dataset. Default is True.

        Raises:
            ValueError: If the provided architecture is not for image
                classification tasks.
        """

        super().__init__(dataset_path)
        check_architecture_pattern(kwargs, 'image_classification')

        self.img_shape = kwargs.get('img_shape', (128, 128))
        self.batch_size = kwargs.get('batch_size', 64)
        self.epochs = kwargs.get('epochs', 500)
        self.num_classes = num_classes

        self.architecture = None
        self.model = None
        # If no architecture is provided, use the default one
        self._initialize_model(kwargs.get('architecture'), kwargs)

        # Set the loss, metrics, and optimizer (user-provided or defaults)
        self._initialize_metrics(kwargs)

        self.data_generator = self._create_data_generator(
            kwargs.get('data_generator')
        )
        self.class_names = {}
        self.class_weights = {}
        self.dataset = self.load_dataset(
            kwargs.get('use_data_augmentation', True)
        )

        # Define the default model name to save
        self._define_model_name(kwargs)

        self.history = None

    def load_dataset(self, use_data_augmentation: bool = True):
        """
        Loads and prepares the image classification dataset for training and
        validation.

        The root path of the dataset should follow the `cucaracha` dataset
        folder organization. The user must have permission to read and write
        in the dataset path folder, because the organized data is created there.

        Note:
            This method is automatically called when the class is instantiated.
            However, the user can call it again to reload the dataset and make
            an internal evaluation.



        This method performs the following steps:

        1. Calls the superclass method to load the dataset.
        2. Loads the cucaracha dataset from the specified path.
        3. Prepares the dataset environment by creating subfolders for each label.
        4. Loads the organized data using `keras.utils.image_dataset_from_directory`.
        5. Maps the training and validation datasets to one-hot encoded labels.

        Returns:
            dict: A dictionary containing the training and validation datasets
            with keys 'train' and 'val'.
        """
        super().load_dataset()

        # Prepare all the dataset environment
        # Create subfolders for each label
        train_dataset, _ = load_cucaracha_dataset(
            self.dataset_path, 'image_classification'
        )

        # Load the organized data using keras.utils.image_dataset_from_directory
        train_ds, val_ds = keras.utils.image_dataset_from_directory(
            train_dataset,
            image_size=self.img_shape,
            batch_size=self.batch_size,
            validation_split=0.2,
            subset='both',
            seed=random.randint(0, 10000),
        )

        self.class_names = {
            i: name for i, name in enumerate(train_ds.class_names)
        }

        if use_data_augmentation:
            train_ds = train_ds.map(
                lambda x, y: (
                    self.data_generator(x),
                    tf.one_hot(y, depth=self.num_classes),
                )
            )
            val_ds = val_ds.map(
                lambda x, y: (
                    self.data_generator(x),
                    tf.one_hot(y, depth=self.num_classes),
                )
            )
        else:
            train_ds = train_ds.map(
                lambda x, y: (
                    x,
                    tf.one_hot(y, depth=self.num_classes),
                )
            )
            val_ds = val_ds.map(
                lambda x, y: (
                    x,
                    tf.one_hot(y, depth=self.num_classes),
                )
            )

        # Calculate class weights based on the proportions of data in the dataset
        dataset = {'train': train_ds, 'val': val_ds}
        self._collect_dataset_class_weigth(dataset)

        return dataset

    def train_model(self, **kwargs):
        """
        Trains the model using the loaded dataset and the current configuration.

        The `epochs`, `loss`, `optimizer`, and `metrics` values are defined in
        the class constructor and are used here to compile and fit the model.
        The dataset is batched with `batch_size` when it is loaded, so a
        different batch size should be passed to the constructor.

        When the training is finished, the trained model is available through
        the `obj.model` attribute, to be saved or inspected by the user.

        Examples:
            >>> from tests import sample_paths as sp
            >>> obj = ImageClassificationTrainer(sp.DOC_ML_DATASET_CLASSIFICATION, 3, batch_size=32) # doctest: +SKIP
            >>> obj.epochs = 10 # doctest: +SKIP
            >>> obj.train_model() # doctest: +SKIP

            After the training, the model can be saved through `obj.model`:
            >>> import os, tempfile # doctest: +SKIP
            >>> with tempfile.TemporaryDirectory() as tmpdirname: # doctest: +SKIP
            ...     obj.model.save(os.path.join(tmpdirname, 'saved_model.keras')) # doctest: +SKIP

        Data augmentation is not configured here. It is set in the
        constructor through the `data_generator` keyword (a list of Keras
        layers, see `_create_data_generator`).

        Args:
            callbacks (list, optional): A list of callback instances to apply
                during training. These can be any of the callbacks provided by
                Keras, such as `EarlyStopping`, `ReduceLROnPlateau`, etc. If
                not provided, a default `ModelCheckpoint` callback saves the
                model with the best `val_acc` to the dataset path folder.
        """
        callbacks = kwargs.get('callbacks', [])
        if not callbacks:
            callbacks = [
                keras.callbacks.ModelCheckpoint(
                    os.path.join(self.dataset_path, self.model_name),
                    monitor='val_acc',
                    save_best_only=True,
                )
            ]

        self.model.compile(
            optimizer=self.optimizer,
            loss=self.loss,
            metrics=self.metrics,
        )

        self.history = self.model.fit(
            self.dataset['train'],
            epochs=self.epochs,
            callbacks=callbacks,
            batch_size=self.batch_size,
            validation_data=self.dataset['val'],
            class_weight=self.class_weights,
        )

    def _create_data_generator(self, layers_list: list = None):
        """
        Create a data generator for data augmentation.

        The augmentation is based on the Keras augmentation layers. If no
        list is provided, a default augmentation is used with the following
        layers: RandomFlip, RandomRotation, RandomZoom, RandomShear,
        RandomTranslation, RandomBrightness, and GaussianNoise.

        Args:
            layers_list (list, optional): The Keras layers to apply, in
                order. Defaults to None, which selects the default layers.

        Returns:
            augmenter: A function that applies the layers to a batch of
            images.

        Raises:
            ValueError: If `layers_list` is not a list of Keras layers.
        """
        if layers_list is not None:
            if not isinstance(layers_list, list) or not all(
                [
                    isinstance(layer, keras.layers.Layer)
                    for layer in layers_list
                ]
            ):
                raise ValueError(
                    'Data generator must be a list of Keras layers.'
                )

        data_aug = layers_list
        if data_aug is None:
            data_aug = [
                keras.layers.RandomFlip(),
                keras.layers.RandomRotation(
                    0.3,
                    fill_mode='constant',
                    fill_value=random.randint(0, 255),
                ),
                keras.layers.RandomZoom(
                    (-0.2, 0.4),
                    fill_mode='constant',
                    fill_value=random.randint(0, 255),
                ),
                keras.layers.RandomShear(
                    0.3,
                    fill_mode='constant',
                    fill_value=random.randint(0, 255),
                ),
                keras.layers.RandomTranslation(
                    (-0.3, 0.3),
                    0.1,
                    fill_mode='constant',
                    fill_value=random.randint(0, 255),
                ),
                keras.layers.RandomBrightness(0.3),
                keras.layers.GaussianNoise(0.6),
            ]

        def augmenter(images):
            for op in data_aug:
                images = op(images)

            return images

        return augmenter

    def collect_training_samples(self, num_samples: int = 30):
        """
        Collects a batch of training samples for visualization purposes.

        Args:
            num_samples (int, optional): The number of samples to collect.
            Defaults to 30.

        Returns:
            np.ndarray: A batch of training samples.
        """
        samples = []
        collected = 0

        # Iterate over the training batches once and stop as soon as enough
        # samples are gathered (avoids re-creating the iterator on each pass)
        for images, _ in self.dataset['train']:
            samples.append(images.numpy())
            collected += len(samples[-1])
            if collected >= num_samples:
                break

        return np.concatenate(samples, axis=0)[:num_samples]

    def _initialize_model(self, architecture: ModelArchitect, kwargs):
        """
        Initialize the model using the provided architecture.

        Args:
            architecture (ModelArchitect): The model architecture to use.
        """
        if kwargs.get('architecture') is None:
            default = SmallXception(
                img_shape=self.img_shape, num_classes=self.num_classes
            )
            self.architecture = default
            self.model = default.get_model()
        else:
            self.architecture = kwargs['architecture']
            self.model = self.architecture.get_model()

    def _initialize_metrics(self, kwargs):
        """
        Initialize the loss, metrics, and optimizer, using the `loss`,
        `metrics`, and `optimizer` keyword arguments when they are given.
        """
        self.loss = kwargs.get('loss', keras.losses.CategoricalCrossentropy())
        self.metrics = kwargs.get(
            'metrics', [keras.metrics.CategoricalAccuracy(name='acc')]
        )
        self.optimizer = kwargs.get(
            'optimizer',
            keras.optimizers.Adam(
                keras.optimizers.schedules.ExponentialDecay(
                    initial_learning_rate=0.001,
                    decay_steps=10,
                    decay_rate=0.5,
                )
            ),
        )

    def _define_model_name(self, kwargs):
        time = datetime.datetime.now().strftime('%d%m%Y-%H%M%S')
        ds_name = os.path.basename(os.path.normpath(self.dataset_path))
        modality = self.architecture.modality
        self.model_name = (
            f'mod-{modality}-dataset-{ds_name}-timestamp-{time}.keras'
        )
        if 'model_name' in kwargs:
            self.model_name = kwargs['model_name']

    def _collect_dataset_class_weigth(self, dataset):
        """
        Collects the class weights based on the dataset proportions.
        This helps to balance the training process when the dataset is
        unbalanced.
        """
        class_counts = np.zeros(self.num_classes)
        for _, labels in dataset['train']:
            class_counts += np.sum(labels.numpy(), axis=0)

        total_samples = np.sum(class_counts)
        self.class_weights = {
            i: total_samples / (self.num_classes * count)
            for i, count in enumerate(class_counts)
        }
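
The weighting rule used by `_collect_dataset_class_weigth` above can be tried in isolation. The sketch below is a pure-Python stand-in, not the library code: `balanced_class_weights` is a hypothetical helper, the per-class counts are made up, and skipping classes with zero samples is an assumption added here to avoid dividing by zero.

```python
def balanced_class_weights(class_counts):
    """Return {class_index: weight} using total / (num_classes * count)."""
    num_classes = len(class_counts)
    total_samples = sum(class_counts)
    weights = {}
    for index, count in enumerate(class_counts):
        if count == 0:
            # Assumption for this sketch: skip empty classes instead of
            # dividing by zero.
            continue
        weights[index] = total_samples / (num_classes * count)
    return weights


# Hypothetical dataset with 10, 30, and 60 training images per class.
weights = balanced_class_weights([10, 30, 60])
for index, weight in weights.items():
    print(index, round(weight, 3))
```

Rare classes receive weights above 1 and frequent classes weights below 1, so each class contributes a similar total amount to the loss.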

__init__(dataset_path, num_classes, **kwargs)

This is the main constructor for a general Image Classification ML method.

Note

The dataset_path should follow the cucaracha dataset folder organization. More details about how to organize the dataset can be found in the cucaracha documentation.

Info

There are many ways to find and build datasets for your machine learning models. A simple option is to use the public datasets provided in the cucaracha Kaggle repository. You can find more details at: https://www.kaggle.com/organizations/cucaracha-project

Parameters:

Name Type Description Default
dataset_path str

The path to the dataset. This should follow the cucaracha dataset folder organization.

required
num_classes int

The number of classes in the dataset. This must match the classes present in the dataset.

required
**kwargs

Additional keyword arguments for configuring the model. Possible keys include:

- 'img_shape' (tuple): The shape of the input images. Default is (128, 128).
- 'architecture' (object): The model architecture to use. If not provided, a default SmallXception architecture is used.
- 'batch_size' (int): The batch size to use during training. Default is 64.
- 'epochs' (int): The number of epochs to train the model. Default is 500.
- 'model_name' (str): The name to use when saving the trained model. If not provided, a default name is generated.

{}

Raises: ValueError: If the provided architecture is not for image classification tasks.
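
When no 'model_name' is given, the source listing above builds the default name from the architecture modality, the dataset folder name, and a timestamp. The following is a minimal sketch of that naming scheme using only the standard library. The function `default_model_name`, the example path, and the modality string 'image_classification' are assumptions made for illustration.

```python
import datetime
import os


def default_model_name(dataset_path, modality, now=None):
    """Build 'mod-<modality>-dataset-<name>-timestamp-<ddmmYYYY-HHMMSS>.keras'."""
    now = now or datetime.datetime.now()
    timestamp = now.strftime('%d%m%Y-%H%M%S')
    # normpath drops a trailing separator so basename returns the folder name
    ds_name = os.path.basename(os.path.normpath(dataset_path))
    return f'mod-{modality}-dataset-{ds_name}-timestamp-{timestamp}.keras'


# Hypothetical path and modality, with a fixed time for a reproducible result.
fixed = datetime.datetime(2024, 2, 1, 3, 4, 5)
name = default_model_name('data/receipts/', 'image_classification', now=fixed)
print(name)
```

Passing 'model_name' to the constructor replaces this generated name entirely.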

collect_training_samples(num_samples=30)

Collects a batch of training samples for visualization purposes.

Parameters:

Name Type Description Default
num_samples int

The number of samples to collect.

30

Returns:

Type Description
np.ndarray

A batch of training samples.
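
The idea behind the method is to draw batches until at least num_samples items are available and then trim the result. A pure-Python sketch of that idea, with lists standing in for image batches, is shown below. `take_samples` is a hypothetical helper, and reading the batches in a single pass is a choice made for this sketch.

```python
def take_samples(batches, num_samples):
    """Collect up to num_samples items from an iterable of batches."""
    collected = []
    for batch in batches:
        collected.extend(batch)
        if len(collected) >= num_samples:
            # Enough items gathered, so stop reading further batches
            break
    return collected[:num_samples]


# Hypothetical batches of size 4; integers stand in for images.
batches = [[0, 1, 2, 3], [4, 5, 6, 7], [8, 9]]
print(take_samples(batches, 6))
```

If the data holds fewer items than requested, this sketch returns everything it found rather than repeating items.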

load_dataset(use_data_augmentation=True)

Loads and prepares the image classification dataset for training and validation.

The root path of the dataset should follow the cucaracha dataset folder organization. The user must have permission to read and write in the dataset path folder, because the organized data is created there.

Note

This method is automatically called when the class is instantiated. However, the user can call it again to reload the dataset and make an internal evaluation.

This method performs the following steps:

  1. Calls the superclass method to load the dataset.
  2. Loads the cucaracha dataset from the specified path.
  3. Prepares the dataset environment by creating subfolders for each label.
  4. Loads the organized data using keras.utils.image_dataset_from_directory.
  5. Maps the training and validation datasets to one-hot encoded labels.
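
Step 5 turns each integer label into a one-hot vector whose length is the number of classes. The sketch below reproduces that encoding in pure Python so that it runs without TensorFlow. The `one_hot` helper and the class names are hypothetical stand-ins for illustration.

```python
def one_hot(label, depth):
    """Return a one-hot list of length depth for an integer label."""
    # Like tf.one_hot, an out-of-range label yields an all-zero vector.
    return [1.0 if index == label else 0.0 for index in range(depth)]


# Hypothetical class folders, in alphabetical order.
class_names = ['invoice', 'other', 'receipt']
index_to_name = {i: name for i, name in enumerate(class_names)}

labels = [2, 0, 1]
encoded = [one_hot(label, depth=len(class_names)) for label in labels]
print(index_to_name)
print(encoded)
```

One-hot labels are what a categorical cross-entropy loss expects, which is why the depth must equal num_classes.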

Returns:

Type Description
dict

A dictionary containing the training and validation datasets with keys 'train' and 'val'.

train_model(**kwargs)

Trains the model using the provided dataset and configuration.

The epochs, loss, optimizer, and metrics values are defined in the class constructor and are used here to compile and fit the model. The dataset is batched with batch_size when it is loaded, so a different batch size should be passed to the constructor.

When the training is finished, the trained model is available through the obj.model attribute, to be saved or inspected by the user.
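
The default optimizer set in the constructor is Adam with a Keras ExponentialDecay schedule (initial_learning_rate=0.001, decay_steps=10, decay_rate=0.5). Assuming the schedule's default non-staircase form, the learning rate at a given optimizer step is initial_learning_rate * decay_rate ** (step / decay_steps). The sketch below evaluates that formula in pure Python. The `exponential_decay` function is a hypothetical stand-in, not the Keras class.

```python
def exponential_decay(step, initial_learning_rate=0.001,
                      decay_steps=10, decay_rate=0.5):
    """Learning rate after `step` optimizer updates (non-staircase form)."""
    return initial_learning_rate * decay_rate ** (step / decay_steps)


# A step is one batch update, not one epoch.
for step in (0, 10, 20, 100):
    print(step, exponential_decay(step))
```

With these defaults the rate halves every 10 batch updates, so it becomes very small after a few hundred steps. A custom 'optimizer' can be passed to the constructor when a slower decay is wanted.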

Examples:

>>> from tests import sample_paths as sp
>>> obj = ImageClassificationTrainer(sp.DOC_ML_DATASET_CLASSIFICATION, 3, batch_size=32)
>>> obj.epochs = 10
>>> obj.train_model()

After the training, the model can be saved through the obj.model attribute:

>>> import os, tempfile
>>> with tempfile.TemporaryDirectory() as tmpdirname:
...     obj.model.save(os.path.join(tmpdirname, 'saved_model.keras'))

The following optional keyword argument is accepted:

- callbacks (list): A list of callback instances to apply during training. This can be any of the callbacks provided by Keras, such as EarlyStopping, ReduceLROnPlateau, etc. If not provided, a default ModelCheckpoint callback is used, which saves the model with the best val_acc.

Data augmentation is not a train_model option. It is configured in the constructor through the data_generator keyword (a list of Keras layers); if none is given, the default augmentation defined in the _create_data_generator method is used.
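
The augmentation pipeline built by _create_data_generator checks that it received a list and then applies each entry in order. A pure-Python sketch of that composition follows. Plain functions stand in for Keras layers so the example runs without Keras, and `make_augmenter`, `flip`, and `brighten` are hypothetical names.

```python
def make_augmenter(operations):
    """Return a function that applies each operation in order."""
    if not isinstance(operations, list) or not all(
        callable(op) for op in operations
    ):
        raise ValueError('operations must be a list of callables.')

    def augmenter(images):
        for op in operations:
            images = op(images)
        return images

    return augmenter


# Hypothetical stand-ins for augmentation layers, acting on a pixel list.
def flip(pixels):
    return list(reversed(pixels))


def brighten(pixels):
    return [min(p + 10, 255) for p in pixels]


augment = make_augmenter([flip, brighten])
print(augment([0, 100, 250]))
```

Because the operations run in sequence, their order in the list matters.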

Parameters:

Name Type Description Default
callbacks list

A list of callback instances to apply during training. These can be any of the callbacks provided by Keras, such as EarlyStopping, ReduceLROnPlateau, etc. If not provided, a default ModelCheckpoint callback saves the model with the best val_acc.

[]
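
The default ModelCheckpoint in the source listing is created with monitor='val_acc' and save_best_only=True, so a file is written only when the validation accuracy improves. The sketch below mimics that decision rule in pure Python. `epochs_that_save` is a hypothetical helper and the accuracy values are made up.

```python
def epochs_that_save(val_acc_per_epoch):
    """Return the 1-based epochs at which a best-only checkpoint would save."""
    best = float('-inf')
    saved = []
    for epoch, val_acc in enumerate(val_acc_per_epoch, start=1):
        # Save only on a strict improvement over the best value so far
        if val_acc > best:
            best = val_acc
            saved.append(epoch)
    return saved


# Hypothetical validation accuracies for five epochs.
history = [0.61, 0.70, 0.68, 0.70, 0.74]
print(epochs_that_save(history))
```

Passing a non-empty callbacks list replaces this default, so a ModelCheckpoint has to be included in the list if saving is still wanted.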

Trainer: Image Semantic Segmentation tasks

ImageSegmentationTrainer

Bases: MLPattern

Source code in cucaracha/ml_trainers/image_segmentation_trainer.py
class ImageSegmentationTrainer(MLPattern):
    def __init__(self, dataset_path: str, **kwargs):
        """
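        Main constructor for a general image segmentation trainer.

        Note:
            The `dataset_path` should follow the `cucaracha` dataset folder
            organization. More details about how to organize the dataset can be
            found in the `cucaracha` documentation.

        Args:
            dataset_path (str): The path to the dataset.
            **kwargs: Optional 'img_shape' (default (160, 160)), 'batch_size'
                (64), 'epochs' (500), 'num_classes' (2), 'architecture'
                (UNetXception by default), and 'model_name'.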
        """
        super().__init__(dataset_path)
        check_architecture_pattern(kwargs, 'image_segmentation')

        self.img_shape = kwargs.get('img_shape', (160, 160))
        self.batch_size = kwargs.get('batch_size', 64)
        self.epochs = kwargs.get('epochs', 500)
        self.num_classes = kwargs.get('num_classes', 2)

        self.architecture = kwargs.get('architecture', None)
        self.model = None
        # If no architecture is provided, use the default one
        self._initialize_model(kwargs.get('architecture'), kwargs)

        # Set up the loss, metrics and optimizer (sparse categorical for any class count)
        self._initialize_metrics()

        self.dataset = self.load_dataset()

        # Define the default model name to save
        self._define_model_name(kwargs)

    def _initialize_model(self, architecture: ModelArchitect, kwargs):
        """
        Initialize the model using the provided architecture.

        Args:
            architecture (ModelArchitect): The model architecture to use.
        """
        if kwargs.get('architecture') is None:
            default = UNetXception(
                img_shape=self.img_shape, num_classes=self.num_classes
            )
            self.architecture = default
            self.model = default.get_model()
        else:
            self.architecture = kwargs['architecture']
            self.model = self.architecture.get_model()

    def _initialize_metrics(self):
        """
        Initialize the metrics based on the number of classes.
        """
        # if self.num_classes == 2:
        #     self.loss = keras.losses.BinaryCrossentropy()
        #     self.metrics = [keras.metrics.BinaryAccuracy(name='acc')]
        # else:
        self.loss = keras.losses.SparseCategoricalCrossentropy()
        self.metrics = [keras.metrics.SparseCategoricalAccuracy(name='acc')]
        self.optmizer = keras.optimizers.Adam(1e-4)

    def _define_model_name(self, kwargs):
        time = datetime.datetime.now().strftime('%d%m%Y-%H%M%S')
        ds_name = os.path.basename(os.path.normpath(self.dataset_path))
        modality = self.architecture.modality
        self.model_name = (
            f'mod-{modality}-dataset-{ds_name}-timestamp-{time}.keras'
        )
        if 'model_name' in kwargs:
            self.model_name = kwargs['model_name']

    def load_dataset(self):
        super().load_dataset()

        # Prepare all the dataset environment
        # Create subfolders for each label
        dataset_path = load_cucaracha_dataset(
            self.dataset_path, 'image_segmentation'
        )

        def load_img_masks(
            input_img_path, target_img_path
        ):  # pragma: no cover
            input_img = tf_io.read_file(input_img_path)
            input_img = tf_io.decode_png(input_img, channels=3)
            input_img = tf_image.resize(input_img, self.img_shape)
            input_img = tf_image.convert_image_dtype(input_img, 'float32')

            target_img = tf_io.read_file(target_img_path)
            target_img = tf_io.decode_png(target_img, channels=1)
            target_img = tf_image.resize(
                target_img, self.img_shape, method='nearest'
            )
            target_img = tf_image.convert_image_dtype(target_img, 'float32')

            # # Ground truth labels are 1, 2, 3. Subtract one to make them 0, 1, 2:
            # target_img -= 1
            return input_img, target_img

        # For faster debugging, limit the size of data
        # if max_dataset_len:
        #     input_img_paths = input_img_paths[:max_dataset_len]
        #     target_img_paths = target_img_paths[:max_dataset_len]

        split_size = int(0.8 * len(dataset_path))
        input_img_paths = [path[0] for path in dataset_path[:split_size]]
        target_img_paths = [path[1] for path in dataset_path[:split_size]]

        train_dataset = tf_data.Dataset.from_tensor_slices(
            (input_img_paths, target_img_paths)
        )
        train_dataset = train_dataset.map(
            load_img_masks, num_parallel_calls=tf_data.AUTOTUNE
        )

        input_img_paths = [path[0] for path in dataset_path[split_size:]]
        target_img_paths = [path[1] for path in dataset_path[split_size:]]

        valid_dataset = tf_data.Dataset.from_tensor_slices(
            (input_img_paths, target_img_paths)
        )
        valid_dataset = valid_dataset.map(
            load_img_masks, num_parallel_calls=tf_data.AUTOTUNE
        )

        return {
            'train': train_dataset.batch(self.batch_size),
            'val': valid_dataset.batch(self.batch_size),
        }

    def train_model(self, callbacks: list = None):
        if not callbacks:
            callbacks = [
                keras.callbacks.ModelCheckpoint(
                    os.path.join(self.dataset_path, self.model_name),
                    monitor='val_acc',
                    save_best_only=True,
                )
            ]

        self.model.compile(
            optimizer=self.optmizer,
            loss=self.loss,
            metrics=self.metrics,
        )

        self.model.fit(
            self.dataset['train'],
            epochs=self.epochs,
            callbacks=callbacks,
            batch_size=self.batch_size,
            validation_data=self.dataset['val'],
        )
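
The train/validation split in `load_dataset` is positional: the first 80% of the (image, mask) path pairs feed the training set and the remainder feeds validation, with no shuffling inside this method. A minimal pure-Python sketch of that slicing follows. It assumes the dataset loader returns a list of `(image_path, mask_path)` pairs; the helper name `split_pairs` and the file names are hypothetical and not part of cucaracha.

```python
def split_pairs(pairs, train_fraction=0.8):
    """Split (image_path, mask_path) pairs by position, without shuffling.

    Hypothetical helper mirroring the slicing done in load_dataset.
    """
    split_size = int(train_fraction * len(pairs))
    train, val = pairs[:split_size], pairs[split_size:]
    return {
        'train': ([p[0] for p in train], [p[1] for p in train]),
        'val': ([p[0] for p in val], [p[1] for p in val]),
    }


# Hypothetical file names, for illustration only.
pairs = [(f'img_{i}.png', f'mask_{i}.png') for i in range(10)]
splits = split_pairs(pairs)

train_imgs, train_masks = splits['train']
val_imgs, val_masks = splits['val']
print(len(train_imgs), len(val_imgs))
```

Because the split depends on order, the validation set is only representative if the pairs arrive in an unbiased order. If the loader returns them sorted (an assumption worth checking), shuffling the list before it reaches the trainer may be advisable.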

__init__(dataset_path, **kwargs)

This is the main constructor for a general Image Segmentation ML method.

Note

The dataset_path should follow the cucaracha dataset folder organization. More details about how to organize the dataset can be
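
All settings other than `dataset_path` are passed as keyword arguments and fall back to defaults when omitted. The sketch below mirrors that lookup in plain Python so the effective configuration can be inspected without loading a dataset. The default values are copied from the constructor source on this page; the names `DEFAULTS` and `resolve_options` are hypothetical and not part of cucaracha.

```python
# Defaults copied from the constructor source shown on this page.
DEFAULTS = {
    'img_shape': (160, 160),
    'batch_size': 64,
    'epochs': 500,
    'num_classes': 2,
    'architecture': None,  # None selects the default UNetXception model
}


def resolve_options(**kwargs):
    """Return the effective settings for a set of keyword arguments.

    Hypothetical helper: it reproduces the kwargs.get(key, default)
    lookups of the constructor and nothing else.
    """
    return {key: kwargs.get(key, default) for key, default in DEFAULTS.items()}


# Equivalent in spirit to:
#   ImageSegmentationTrainer(dataset_path, img_shape=(256, 256), epochs=50)
options = resolve_options(img_shape=(256, 256), epochs=50)
print(options)
```

The constructor also accepts an optional `model_name` keyword, which replaces the generated file name for the saved model.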

Source code in cucaracha/ml_trainers/image_segmentation_trainer.py
def __init__(self, dataset_path: str, **kwargs):
    """
    This is the main constructor for a general Image Segmentation ML method.

    Note:
        The `dataset_path` should follow the `cucaracha` dataset folder
        organization. More details about how to organize the dataset can be
    """
    super().__init__(dataset_path)
    check_architecture_pattern(kwargs, 'image_segmentation')

    self.img_shape = kwargs.get('img_shape', (160, 160))
    self.batch_size = kwargs.get('batch_size', 64)
    self.epochs = kwargs.get('epochs', 500)
    self.num_classes = kwargs.get('num_classes', 2)

    self.architecture = kwargs.get('architecture', None)
    self.model = None
    # If no architecture is provided, use the default one
    self._initialize_model(kwargs.get('architecture'), kwargs)

    # Set up the loss, metrics and optimizer (sparse categorical for any class count)
    self._initialize_metrics()

    self.dataset = self.load_dataset()

    # Define the default model name to save
    self._define_model_name(kwargs)
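
The last constructor step, `_define_model_name`, builds a default file name from the architecture modality, the dataset folder name and a timestamp, unless `model_name` is supplied. A small sketch of that naming scheme, rewritten as a standalone function for illustration: the function name `default_model_name`, the `now` parameter, the example path and the modality string `'image_segmentation'` are assumptions, not cucaracha API.

```python
import datetime
import os


def default_model_name(dataset_path, modality, now=None):
    """Build a model file name in the format used by _define_model_name.

    Hypothetical standalone version; `now` is injectable so the result
    is reproducible.
    """
    now = now or datetime.datetime.now()
    time = now.strftime('%d%m%Y-%H%M%S')
    # normpath drops a trailing separator so basename returns the folder name
    ds_name = os.path.basename(os.path.normpath(dataset_path))
    return f'mod-{modality}-dataset-{ds_name}-timestamp-{time}.keras'


name = default_model_name(
    '/data/cells/',  # hypothetical dataset folder
    'image_segmentation',  # assumed modality value
    now=datetime.datetime(2024, 1, 31, 9, 5, 7),
)
print(name)
```

The timestamp is day-month-year, so names sort by day of month rather than chronologically. Passing an explicit `model_name` avoids this if sorted checkpoints matter.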