Cactus Identification using Neural Nets

Madhu Ramiah
4 min read · Aug 14, 2019


The Cactus Identification problem is a Kaggle competition; you can download the data here. The data consists of 32 x 32 aerial photos of a columnar cactus (Neobuxbaumia tetetzo). The goal of the competition is to detect the presence of a cactus in an image.

Data:

The train data set consists of 17,500 unique images and a label ‘has_cactus’, which is 1 if the image contains a cactus and 0 otherwise. Below is a sample image from the data set.

Cactus image from the train data

As you can see above, it is hard to make out any details because the image dimensions are so small. Surprisingly, neural networks are able to do this with ease and yield very good accuracy: our model reached 99.76% accuracy in detecting the presence of a cactus, on a test set of 4,000 images.
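To get a feel for the data, a sample image can be loaded and displayed. The sketch below is not from the original kernel; it assumes the standard Kaggle layout for this competition (a train.csv labels file next to the train/train/ image folder used later in this post).

import pandas as pd
import matplotlib.pyplot as plt
from skimage import io

# Assumed Kaggle input layout; adjust the paths if your data lives elsewhere
train_df = pd.read_csv("../input/train.csv")
sample_id = train_df['id'].iloc[0]          # file names include the .jpg extension
img = io.imread("../input/train/train/" + sample_id)

plt.imshow(img)
plt.title("has_cactus = {}".format(train_df['has_cactus'].iloc[0]))
plt.show()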

Import necessary libraries:

import os
import numpy as np
import pandas as pd
from skimage import io   # used to read the test images at the end of the post

import keras.backend as K
from keras import layers
from keras.layers import Input, Add, Dense, Dropout, MaxPooling2D, ZeroPadding2D, BatchNormalization, Flatten
from keras.models import Model, load_model
from keras import optimizers
from keras.preprocessing import image
from keras.preprocessing.image import ImageDataGenerator
from keras.utils import layer_utils
from keras.applications import ResNet50, VGG19
from keras.applications.resnet50 import preprocess_input
from keras.callbacks import History,ModelCheckpoint,Callback
from sklearn.metrics import roc_auc_score

Using ImageDataGenerator for loading data:

We used the ImageDataGenerator to load the labels from the dataframe and the images from the directory. Below is the code snippet we used; a quick sanity check on the generator output follows it.

datagen = ImageDataGenerator(rescale=1/255.0)
train_dir = "../input/train/train/"
batch_size = 64
image_size = 32

# train_df is the labels dataframe loaded from train.csv;
# flow_from_dataframe with class_mode='binary' expects string labels
train_df.has_cactus = train_df.has_cactus.astype(str)

train_generator = datagen.flow_from_dataframe(
    dataframe=train_df[:14001], directory=train_dir,
    x_col='id', y_col='has_cactus', class_mode='binary',
    batch_size=batch_size, target_size=(image_size, image_size))

validation_generator = datagen.flow_from_dataframe(
    dataframe=train_df[14001:], directory=train_dir,
    x_col='id', y_col='has_cactus', class_mode='binary',
    batch_size=batch_size, target_size=(image_size, image_size))
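As a quick sanity check (not part of the original kernel), one batch can be pulled from the generator to confirm the expected shapes:

x_batch, y_batch = next(train_generator)
print(x_batch.shape)   # expected: (64, 32, 32, 3)
print(y_batch.shape)   # expected: (64,)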

Using VGG19 for Training:

We used VGG19, a convolutional network pre-trained on ImageNet, for transfer learning. We first tried freezing the weights of the initial layers and training only the remaining ones, which gave us only 90% accuracy. So we trained the initial layers as well. The code segment is below; a sketch of the earlier frozen variant follows it.

num_classes = 1  # a single sigmoid unit for binary classification

def get_model():
    base_model = VGG19(weights='imagenet', include_top=False, input_shape=(32, 32, 3))

    # Keep all layers of the base model trainable (freezing them gave lower accuracy)
    for layer in base_model.layers:
        layer.trainable = True

    # Get the output from the base model
    base_model_output = base_model.output

    # Add our own head at the end: flatten, a dense ReLU layer, and a sigmoid output
    last_layers = Flatten()(base_model_output)
    last_layers = Dense(512, activation='relu')(last_layers)
    last_layers = Dense(num_classes, activation='sigmoid', name='fcnew')(last_layers)
    model = Model(inputs=base_model.input, outputs=last_layers)
    return model
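For reference, the earlier experiment that stopped at about 90% accuracy kept pre-trained layers frozen. A minimal sketch of that variant is below; it freezes the whole VGG19 base, since the post does not say exactly how many of the initial layers were frozen, and get_frozen_model is a hypothetical name used only for illustration.

def get_frozen_model():
    base_model = VGG19(weights='imagenet', include_top=False, input_shape=(32, 32, 3))

    # Freeze the pre-trained convolutional base so only the new head is trained
    for layer in base_model.layers:
        layer.trainable = False

    last_layers = Flatten()(base_model.output)
    last_layers = Dense(512, activation='relu')(last_layers)
    last_layers = Dense(1, activation='sigmoid')(last_layers)
    return Model(inputs=base_model.input, outputs=last_layers)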

We used the stochastic gradient descent (SGD) optimizer with a learning rate of 0.0001, which gave us the best results.

model = get_model()
optimizer = optimizers.SGD(lr=0.0001)
model.compile(loss='binary_crossentropy', optimizer=optimizer, metrics=['accuracy'])
model.summary()

We added a callback that logs the loss and the validation AUC. Callbacks store the history over multiple epochs, so that we can see a trend.

class Loss(Callback):
    def on_train_begin(self, logs={}):
        self.losses = []
        logs['val_auc'] = 0

    def on_epoch_begin(self, epoch, logs={}):
        return

    def on_epoch_end(self, epoch, logs={}):
        self.losses.append(logs['loss'])

        # Compute the AUC over the whole validation set at the end of every epoch
        y_p = []
        y_v = []
        for i in range(len(validation_generator)):
            x_val, y_val = validation_generator[i]
            y_pred = self.model.predict(x_val)
            y_p.append(y_pred)
            y_v.append(y_val)
        y_p = np.concatenate(y_p)
        y_v = np.concatenate(y_v)
        roc_auc = roc_auc_score(y_v, y_p)
        print('\nVal AUC for epoch {}: {}'.format(epoch, roc_auc))
        logs['val_auc'] = roc_auc

We fit the model on the training generator for 30 epochs, using ‘val_loss’ as the metric to select the best model (minimum validation loss).

epochs = 30

loss = Loss()
checkpoint = ModelCheckpoint("best_model.hdf5", monitor='val_loss', verbose=0,
                             save_best_only=True, save_weights_only=True,
                             mode='min', period=1)
history = model.fit_generator(train_generator,
                              steps_per_epoch=train_generator.n//batch_size,
                              validation_data=validation_generator,
                              validation_steps=validation_generator.n//batch_size,
                              epochs=epochs,
                              callbacks=[loss, checkpoint])

Next, we plotted the training and validation accuracy over all 30 epochs. The learning curve shows that the model is learning properly.

import matplotlib.pyplot as plt
plt.plot(history.history['acc'])
plt.plot(history.history['val_acc'])
plt.title("Accuracy for every epoch")
plt.xlabel('epochs')
plt.ylabel('Accuracy')
plt.legend(['train','validation'],loc='lower right')
plt.show()
Training and Validation accuracy
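The Loss callback also writes ‘val_auc’ into the epoch logs, and in Keras 2 the History callback runs after user callbacks, so those values should appear in history.history as well. A small addition (not in the original post) to plot the per-epoch validation AUC under that assumption:

plt.plot(history.history['val_auc'])
plt.title("Validation AUC for every epoch")
plt.xlabel('epochs')
plt.ylabel('AUC')
plt.show()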

Running on test data:

We ran the best model on the test set and got an accuracy of 99.76%.

val_model = get_model()
val_model.load_weights('best_model.hdf5')

test_dir = "../input/test/test/"
test_df = pd.read_csv("../input/sample_submission.csv")

# Predict the cactus probability for every image in the test directory
for _, _, files in os.walk(test_dir):
    i = 0
    for file in files:
        image = io.imread(os.path.join(test_dir, file))
        test_df.iloc[i, 0] = file
        image = image.astype(np.float32)/255.0
        test_df.iloc[i, 1] = val_model.predict(image.reshape((1, 32, 32, 3)))[0][0]
        i += 1

test_df.to_csv("sample_submission.csv", index=False)

Conclusion:

Looking at the image data, we initially thought the problem would be very difficult and that obtaining high accuracy would be a challenge. But once we applied the models, the performance turned out to be better than expected. We also tried ResNet50 and different optimizers (Adam, RMSprop) with different learning rates (0.01, 0.001, 0.0001, 0.00001), but we got the best results with VGG19, the SGD optimizer, and a learning rate of 0.0001.

You can also refer to our Kaggle kernel here.

Thanks for reading through the blog! Hope it was useful. Please leave your comments below, or contact me or my teammate Arvind Ram via LinkedIn.
