grapeleafGPT

Project Overview

grapeleafGPT is a vision-language model that detects Esca disease in grape leaves and gives vineyard workers actionable advice on safe vineyard practices. Building on the capabilities of MiniGPT-4 and LLaVA, it combines image analysis with interactive language support.

While traditional large vision-language models (LVLMs) excel at recognizing common objects, they often lack domain expertise and the ability to resolve fine-grained details, both of which are critical for specialized tasks like disease detection in viticulture. grapeleafGPT addresses these challenges by fine-tuning the model on the visual signatures of Esca disease, offering a tool for anomaly detection in vineyards. It also eliminates the need for manual threshold setting: the model directly assesses the presence and severity of anomalies and provides detailed advice through a multi-turn dialogue system.

Key Features

• Esca disease detection in grape leaves
• Multi-turn dialogue system for vineyard advice
• No manual threshold setting required
• 94.9% image-level AUC in distinguishing Esca-infected leaves from healthy ones

Lessons learned

The best vineyard tools are built by the people who work the vines. AI makes that possible. See what this project taught me.

Demo Video

Model

grapeleafGPT applies anomaly detection to viticulture, using a pre-trained image encoder and a large language model (LLM) to inform vineyard staff of Esca infection. The model uses an image decoder based on visual-textual feature matching to localize infected areas, and a prompt learner to fine-tune the LVLM for domain-specific applications.
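The source does not describe the decoder's internals, so the following is only a minimal sketch of one common way visual-textual feature matching can produce a localization map: each image patch feature is compared with a "normal" and an "abnormal" text embedding, and a softmax over the two similarities gives a per-patch anomaly probability. The function names, the toy 2-D vectors, and the temperature value are all illustrative assumptions, not the project's actual code.

```python
import math


def cosine(a, b):
    """Cosine similarity between two non-zero vectors given as lists."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def anomaly_map(patch_features, normal_text, abnormal_text, temperature=0.07):
    """Return one anomaly probability per patch.

    Hypothetical helper: the temperature and the two-prompt setup are
    assumptions made for illustration.
    """
    scores = []
    for patch in patch_features:
        s_norm = cosine(patch, normal_text) / temperature
        s_abn = cosine(patch, abnormal_text) / temperature
        # Two-way softmax, shifted by the max for numerical stability.
        m = max(s_norm, s_abn)
        e_norm = math.exp(s_norm - m)
        e_abn = math.exp(s_abn - m)
        scores.append(e_abn / (e_norm + e_abn))
    return scores


# Toy example: three patches in a made-up 2-D feature space.
patches = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
healthy_prompt = [1.0, 0.0]   # stands in for a "healthy leaf" text embedding
esca_prompt = [0.0, 1.0]      # stands in for an "Esca symptoms" text embedding
print(anomaly_map(patches, healthy_prompt, esca_prompt))
```

In a real model the patch features would come from the image encoder and the prompts from a text encoder; reshaping the per-patch scores back to the patch grid yields a coarse heat map of likely infected regions.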

The model was trained using a custom dataset class that converted grape leaf images into tensors, resized each image, and applied normalization. Training focused on a single class, with emphasis on distinguishing healthy leaves from those showing Esca symptoms.

Gradient accumulation steps were set to 16, and the learning rate was controlled by a WarmupDecayLR scheduler, which ramped the learning rate from 0 to 0.0001 over 100 steps and then decayed it over 20,000 steps. Gradient clipping was applied to maintain stability, and mixed precision training utilized FP16 and BF16 to improve computational efficiency.

Masks were generated using HSV color space conversion via OpenCV, with specific color ranges defined for red-brown hues to isolate Esca symptoms.
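The learning-rate schedule described above can be sketched in plain Python. The source does not say which library or warmup shape was used, so this approximation assumes linear warmup and linear decay, and it assumes the 20,000 steps are the total training length including warmup. The function name and defaults are illustrative.

```python
def warmup_decay_lr(step, max_lr=1e-4, warmup_steps=100, total_steps=20_000):
    """Learning rate at a given optimizer step.

    Assumptions (not confirmed by the source): linear warmup from 0 to
    max_lr, then linear decay to 0 at total_steps.
    """
    if step < warmup_steps:
        return max_lr * step / warmup_steps
    remaining = max(total_steps - step, 0)
    return max_lr * remaining / (total_steps - warmup_steps)


# Print the learning rate at a few points along the schedule.
for step in (0, 50, 100, 10_050, 20_000):
    print(step, warmup_decay_lr(step))
```

With gradient accumulation set to 16, one scheduler step would correspond to 16 micro-batches, assuming the scheduler advances once per optimizer update.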

The final model achieved an image-level Area Under the Curve (AUC) of 94.9%. In other words, a randomly chosen Esca-infected leaf receives a higher anomaly score than a randomly chosen healthy leaf 94.9% of the time, which indicates that the model separates the two classes well.
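The ranking interpretation of AUC can be made concrete with a small sketch that counts, over every (infected, healthy) pair, how often the infected leaf gets the higher anomaly score. The scores below are invented for illustration and are not outputs of grapeleafGPT.

```python
def pairwise_auc(infected_scores, healthy_scores):
    """AUC as the fraction of (infected, healthy) pairs ranked correctly.

    Ties count as half a correct ranking.
    """
    wins = 0.0
    for pos in infected_scores:
        for neg in healthy_scores:
            if pos > neg:
                wins += 1.0
            elif pos == neg:
                wins += 0.5
    return wins / (len(infected_scores) * len(healthy_scores))


# Hypothetical anomaly scores: 8 of the 9 pairs are ranked correctly.
infected = [0.9, 0.8, 0.4]
healthy = [0.1, 0.3, 0.5]
print(pairwise_auc(infected, healthy))
```

Because the metric depends only on how scores are ordered, it says nothing about where a decision threshold should sit.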

Capabilities

Disease Detection

Automatically identifies the presence of Esca disease in grape leaf images, highlighting the exact locations of the most prominent infection symptoms.

Multi-turn Dialogue

Engages in multi-turn dialogues with vineyard workers, offering advice on safe practices, treatment options, and preventative measures.

No Manual Threshold Setting

Unlike traditional image classifiers, grapeleafGPT does not require manual threshold setting, streamlining the detection process and improving usability in practical vineyard environments.

Robust Anomaly Detection

Detects anomalies in previously unseen grape leaf samples given only a few normal (healthy) examples, supporting robust performance across diverse conditions.

What I got wrong

I built this project when my AI skills were stronger than my viticulture knowledge. The model works well technically (94.9% AUC), but the implied user workflow has a fundamental problem: the model was trained on midsummer leaf images showing active Esca symptoms, yet the dialogue system advises the user on pruning decisions. Pruning happens in winter dormancy, when there are no leaves to scan. By the time you see Esca symptoms in July, the pruning window closed months ago.

I held a WSET Level 3 certification when I built this and still missed it. I had not yet worked a harvest or spent time in a cellar. This is exactly the kind of gap that appears when technologists build tools for an industry they understand theoretically but have not worked in physically.

The broader lesson is important. The AI and tech community will continue to build impressive-looking agricultural tools that miss practical realities of the people using them. A model that detects a disease you do not have in your region, or recommends an action in the wrong season, is not saving anyone time or money regardless of its accuracy score.

This is why the real value of accessible AI tooling is not in off-the-shelf products built by outside engineers. It is in giving winery and vineyard workers the ability to design and build their own tools, for their own problems, in their own region. You know which diseases matter in your vineyard. You know when your growing season starts. You know which workflows are actually costing you time. That domain knowledge is worth more than any model architecture.

Build tools that solve your problems

I help winemakers and vineyard managers learn to build their own AI tools, designed around the workflows and challenges they actually face.