python - RAM Overflow Colab, when running model.fit() in Image Classifier of AutoKeras for many images - Stack Overflow

PHOTO EMBED

Mon Sep 25 2023 14:45:05 GMT+0000 (Coordinated Universal Time)

Saved by @Amy1393 #python

I highly recommend you tf.data.Dataset for creating the dataset:

Do all processes (like resize and normalize) that you want on all images with dataset.map.
Instead of using train_test_split, Use dataset.take, datast.skip for splitting dataset.
Code for generating random images and label:

# !pip install autokeras
import tensorflow as tf
import autokeras as ak
import numpy as np

data = np.random.randint(0, 255, (45_000,32,32,3))
label = np.random.randint(0, 10, 45_000)
label = tf.keras.utils.to_categorical(label) 
 Save
Convert data & label to tf.data.Dataset and process on them: (only 55 ms for 45_000, benchmark on colab)

dataset = tf.data.Dataset.from_tensor_slices((data, label))
def resize_normalize_preprocess(image, label):
    image = tf.image.resize(image, (16, 16))
    image = image / 255.0
    return image, label

# %%timeit 
dataset = dataset.map(resize_normalize_preprocess, num_parallel_calls=tf.data.AUTOTUNE)
# 1 loop, best of 5: 54.9 ms per loop
 Save
Split dataset to 80% for train and 20% for test
Train and evaluate AutoKeras.ImageClassifier
dataet_size = len(dataset)
train_size = int(0.8 * dataet_size)
test_size = int(0.2 * len(dataset))

dataset = dataset.shuffle(32)
train_dataset = dataset.take(train_size)
test_dataset = dataset.skip(train_size)

print(f'Size dataset : {len(dataset)}')
print(f'Size train_dataset : {len(train_dataset)}')
print(f'Size test_dataset : {len(test_dataset)}')

clf = ak.ImageClassifier(overwrite=True, max_trials=1)
clf.fit(train_dataset, epochs=1)
print(clf.evaluate(test_dataset))
content_copyCOPY

https://stackoverflow.com/questions/72678293/ram-overflow-colab-when-running-model-fit-in-image-classifier-of-autokeras-fo