Portfolio - Nimesh Mohanakrishnan

Project Details

Motivation

"An apple a day keeps the doctor away" is a well-known proverb that highlights the apple as one of the fruits that keep us healthy. Apples are rich in fiber and antioxidants. Eating them is linked to a lower risk of many chronic conditions, including diabetes, heart disease, and cancer. Apples may also promote weight loss and improve gut and brain health. However, the diseases that attack apple trees degrade the quality of the fruit and, at times, make it inedible. The symptoms appear on the leaves and are characterized by shape, color, and other visual factors. With the help of deep learning and image processing systems, apple growers can detect and classify diseases in real time, limit their spread, and improve the growth of the fruit.

Project Phase

Identifying the Problem

Plant leaf diseases significantly threaten the growth of individual species in agricultural production. The resulting drop in yield can lead to economic losses that are difficult to quantify. Detecting and classifying plant leaf diseases therefore plays a significant role in agricultural production.

We conducted a literature survey to understand the existing machine learning methodologies and approaches for addressing plant leaf diseases. From our study, we drew two key observations:

Among the plant leaf diseases, the detection and classification of apple leaf diseases are the least examined.
Ensemble learning approaches are the least explored.

With the two pieces of information in mind, we framed a problem statement:

How might we develop a deep learning model to increase accuracy in classifying apple leaf diseases?

Note: We focus exclusively on classifying the diseases, not on detecting them. One advantage of pre-trained deep neural networks is that they can learn on their own to locate the disease spots within a given leaf image.

Proposed Solution

Our proposed solution is to develop an ensemble learning model with appropriate image processing techniques. We focus on ensemble modeling because combining several base models generally performs better than any single base model.
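The write-up does not say how the base models' outputs are combined, so the following is only a minimal sketch of one common option, soft voting, in which the class probabilities predicted by each base model are averaged. The function name `soft_vote`, the optional `weights` argument, and the toy probabilities are assumptions made for illustration.

```python
import numpy as np

def soft_vote(prob_list, weights=None):
    """Average class probabilities from several base models.

    prob_list: list of arrays, each shaped (n_samples, n_classes).
    weights:   optional per-model weights (hypothetical; equal by default).
    Returns the averaged probabilities and the predicted class indices.
    """
    probs = np.stack(prob_list, axis=0)          # (n_models, n_samples, n_classes)
    if weights is None:
        weights = np.ones(len(prob_list))
    weights = np.asarray(weights, dtype=float)
    weights = weights / weights.sum()            # normalize so the result stays a distribution
    avg = np.tensordot(weights, probs, axes=1)   # (n_samples, n_classes)
    return avg, avg.argmax(axis=1)

# Toy probabilities for 2 samples and 3 classes from three hypothetical base models.
model_a = np.array([[0.6, 0.3, 0.1], [0.2, 0.5, 0.3]])
model_b = np.array([[0.5, 0.4, 0.1], [0.1, 0.3, 0.6]])
model_c = np.array([[0.2, 0.7, 0.1], [0.2, 0.2, 0.6]])

avg_probs, predictions = soft_vote([model_a, model_b, model_c])
```

In this toy case the first sample is assigned class 1 even though two of the three models individually prefer class 0, because the third model is much more confident. That is the kind of behavior that separates soft voting from a simple majority vote.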

Data Collection

We gathered the dataset from multiple sources, including Kaggle, internet images, and reported images from Khan's lab at Cornell University.

Dataset type: JPG images with a CSV file containing image annotations
Dataset size: 18,632 images
Number of classes: 12
Color scheme: RGB
Image resolution: 2048 x 1035
Types of diseases: healthy, complex, frog_eye_leaf_spot, frog_eye_leaf_spot complex, powdery_mildew, powdery_mildew complex, rust, rust complex, rust frog_eye_leaf_spot, scab, scab frog_eye_leaf_spot, scab frog_eye_leaf_spot complex
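Several of the classes listed above are space-separated combinations of individual conditions (for example, "scab frog_eye_leaf_spot complex"). The sketch below shows one way the annotation CSV could be read and summarized with the standard library. The column names `image` and `labels`, the file names, and the sample rows are hypothetical, since the write-up does not show the file's layout.

```python
import csv
import io
from collections import Counter

# Hypothetical excerpt of the annotation file; the column names "image" and
# "labels" and the rows below are assumptions, not taken from the project.
SAMPLE_CSV = """image,labels
a1.jpg,healthy
a2.jpg,scab frog_eye_leaf_spot complex
a3.jpg,rust
a4.jpg,scab frog_eye_leaf_spot complex
a5.jpg,powdery_mildew complex
"""

def load_annotations(csv_file):
    """Return a list of (image_name, class_label) pairs from an open CSV file."""
    reader = csv.DictReader(csv_file)
    return [(row["image"], row["labels"].strip()) for row in reader]

def split_conditions(class_label):
    """Split a combined class such as 'rust complex' into its conditions."""
    return class_label.split()

annotations = load_annotations(io.StringIO(SAMPLE_CSV))

# Images per class (the 12-class view used for classification).
class_counts = Counter(label for _, label in annotations)

# Images per individual condition (a combined class counts toward each part).
condition_counts = Counter(
    cond for _, label in annotations for cond in split_conditions(label)
)
```

Counting both ways makes class imbalance visible early, which matters when rare combined classes have few examples.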

Data Analysis

We conducted data analysis to study the dataset using a few visual techniques. This step in our project allowed us to understand the dataset better before we proceeded to the pre-processing phase.

After understanding the essential components of the dataset, we analyzed the RGB channels, because it is important to know how the diseases differ from each other in terms of their RGB value distributions.

We used histograms to plot the frequency of pixel intensity values. In an RGB image, each channel value ranges from 0 to 255, where 0 is the lowest intensity and 255 the highest; a pixel with all three channels at 0 is black, and one with all three at 255 is white. Analyzing the histograms helped us understand the brightness, contrast, and intensity distribution of the images.

The red channel values seem to have a roughly normal distribution but a slightly negative skew. This indicates that the red channel tends to be more concentrated at higher values, at around 100. There is a large variation in average red values across images.

The green channel values are more evenly distributed than the red channel values, with a smaller peak. The distribution has a larger mode of about 160 and a right skew (in contrast to red). Given that these pictures are of leaves, it makes sense that green is more prominent than red in them.

Of the three color channels, the blue channel exhibits the most consistent distribution and the least skew (a slight leftward skew). Even so, blue values vary considerably across the dataset's photos.
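The channel analysis above can be sketched roughly as follows, assuming each image is loaded as an H x W x 3 RGB array. The helper names are hypothetical and the images are random synthetic data standing in for leaf photos, so the numbers they produce say nothing about the real dataset.

```python
import numpy as np

def channel_means(image):
    """Mean intensity of each channel for one H x W x 3 RGB image."""
    return image.reshape(-1, 3).mean(axis=0)

def channel_histogram(image, channel):
    """256-bin histogram of pixel intensities for one channel (0=R, 1=G, 2=B)."""
    counts, _ = np.histogram(image[..., channel], bins=256, range=(0, 256))
    return counts

def skewness(values):
    """Sample skewness: negative means a longer tail toward low values."""
    values = np.asarray(values, dtype=float)
    centered = values - values.mean()
    return (centered ** 3).mean() / (centered ** 2).mean() ** 1.5

# Synthetic stand-ins for leaf photos (random data, not the project's images).
rng = np.random.default_rng(0)
images = [rng.integers(0, 256, size=(32, 32, 3), dtype=np.uint8) for _ in range(50)]

# One row per image, one column per channel: the values whose distribution
# is described in the text.
per_image_means = np.array([channel_means(img) for img in images])

green_hist = channel_histogram(images[0], channel=1)
red_skew = skewness(per_image_means[:, 0])
```

Plotting the columns of `per_image_means` as histograms gives the per-channel distributions, and `skewness` puts a number on the "left" or "right" skew read off the plots.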

Image Pre-processing

We experimented with various colormaps from the OpenCV module in image pre-processing. A colormap recolors an image with a specific color scheme, which helps in profiling the images for model training.

We tried the jet, bone, inferno, ocean, rainbow, and HSV colormaps from the OpenCV module. We decided to use the jet colormap because it distinguishes the diseased portions most clearly. As a result, the model should be easier to train and learn more effectively.

Image pre-processing techniques:

rescaling = 1./255
rotation_range = 180
zoom_range = 0.15
width_shift_range = 0.15
height_shift_range = 0.15
horizontal_flip = True
vertical_flip = True
blurring = Gaussian blurring
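The write-up does not name the library used, but the parameter names in the list above match the arguments of Keras's `ImageDataGenerator`, so the sketch below assumes that API. The 5 x 5 blur kernel and the function names are hypothetical. TensorFlow and OpenCV are imported lazily inside the functions, so the settings can be inspected without either installed.

```python
# Settings from the list above, using Keras ImageDataGenerator argument names
# ("rescale" is the Keras spelling of the rescaling factor).
AUGMENTATION_KWARGS = {
    "rescale": 1.0 / 255,
    "rotation_range": 180,
    "zoom_range": 0.15,
    "width_shift_range": 0.15,
    "height_shift_range": 0.15,
    "horizontal_flip": True,
    "vertical_flip": True,
}

def gaussian_blur(image):
    """Blur one H x W x 3 image; the 5x5 kernel size is a hypothetical choice."""
    import cv2  # imported lazily; only needed when the generator runs
    return cv2.GaussianBlur(image, (5, 5), 0)

def build_generator():
    """Create an augmenting generator (assumes TensorFlow's Keras API)."""
    from tensorflow.keras.preprocessing.image import ImageDataGenerator
    return ImageDataGenerator(
        preprocessing_function=gaussian_blur,
        **AUGMENTATION_KWARGS,
    )
```

Keeping the settings in one dictionary makes it easy to build a second generator for validation data that shares only `rescale` and skips the random transforms.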

Colormaps for Images

I used the OpenCV library to render the images with different colormaps before feeding them to model training. Recoloring the images this way helps model learning: the neural networks identify the disease spots more easily, which improves classification accuracy.

Restructured Ensemble Model

Performance Analysis

Name	Training Accuracy	Training Loss	Testing Accuracy	Testing Loss

Front-end Development

Reflections

Classification of foliar diseases in apple leaves is essential for apple cultivation.

Ensemble modeling performance is significantly higher than the single base model performance.

Using colormaps can aid significantly in facilitating feature map generation for model learning.

Streamlit is an easy-to-use tool for developing front-end visualization for machine learning projects.

We could have experimented with all available transfer learning models to deepen our understanding of the performance analysis.

Classification of Foliar Diseases in Apple Tree Leaves using Ensemble CNN

Project Context

Roles

Tools

Team Members