GrainSpace: A Large-Scale Dataset for Fine-Grained and Domain-Adaptive Recognition of Cereal Grains

Lei Fan1,2*+, Yiwen Ding1+, Dongdong Fan1, Donglin Di3, Maurice Pagnucco2, Yang Song2
CVPR 2022
1Gaozhe Technology, 2CSE, UNSW Sydney, 3Baidu

+Equal contribution

*Corresponding Author

Abstract

Cereal grains are a vital part of human diets and are important commodities for people’s livelihood and international trade. Grain Appearance Inspection (GAI) serves as one of the crucial steps for the determination of grain quality and grain stratification for proper circulation, storage and food processing, etc. GAI is routinely performed manually by qualified inspectors with the aid of some hand tools. Automated GAI has the benefit of greatly assisting inspectors with their jobs but has been limited due to the lack of datasets and clear definitions of the tasks.

In this paper, we formulate GAI as three ubiquitous computer vision tasks: fine-grained recognition, domain adaptation and out-of-distribution recognition. We present a large-scale and publicly available cereal grains dataset called GrainSpace. Specifically, we construct three types of device prototypes for data acquisition and collect a total of 5.25 million images whose labels were determined by professional inspectors. The grain samples, including wheat, maize and rice, are collected from five countries and more than 30 regions. We also develop a comprehensive benchmark based on semi-supervised learning and self-supervised learning techniques. To the best of our knowledge, GrainSpace is the first publicly released dataset for cereal grain inspection.

Dataset Information

The GrainSpace dataset provides a large collection of high-quality images for cereal grain inspection. It includes 5.25 million images of three types of grain: wheat, maize, and rice. The dataset supports three tasks: fine-grained recognition, domain adaptation, and out-of-distribution recognition.

The GrainSpace dataset is licensed under the Creative Commons BY-NC-SA 4.0 license. Note that the data must not be used for commercial purposes.

TRAIN

You can access the dataset using the figshare links below:

  • TRAIN-Wheat: TRAIN-Wheat-G600-1 (18G), TRAIN-Wheat-G600-2 (14G) | TRAIN-Wheat-P600-1 (6G), TRAIN-Wheat-P600-2 (13G) | TRAIN-Wheat-M600 (5G)
  • TRAIN-Maize: TRAIN-Maize-G600 (10G) | TRAIN-Maize-P600 (15G) | TRAIN-Maize-M600 (3G)
  • TRAIN-Rice: TRAIN-Rice-G600 (15G) | TRAIN-Rice-P600-1 (15G), TRAIN-Rice-P600-2 (12G), TRAIN-Rice-P600-3 (6G) | TRAIN-Rice-M600 (3G)

VALIDATION

You can access the dataset using the figshare links below:

  • VAL-Wheat: VAL-Wheat (7G)
  • VAL-Maize: VAL-Maize (3G)
  • VAL-Rice: VAL-Rice (6G)

TEST

You can access the dataset using the figshare or BaiduYun links below:

  • TEST-Figshare: TEST-Maize (3G) | TEST-Rice (6G) | TEST-Wheat (7G)
  • TEST-BaiduYun: TEST [BaiduYun, password: nvvn] (16G)
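
The archive labels in the lists above appear to follow a SPLIT-Grain-Device-Part pattern. The sketch below is not part of the official release; it only illustrates how downloaded archives could be organized by parsing those labels. It assumes that G600, P600 and M600 identify the three acquisition device prototypes (the page does not state this mapping explicitly) and that the listed sizes are rounded to whole gigabytes. The names `parse_archive_name` and `size_by_grain` are hypothetical helpers introduced for illustration.

```python
import re
from collections import defaultdict

# Archive labels and approximate sizes in GB, copied from the TRAIN list above.
TRAIN_ARCHIVES = {
    "TRAIN-Wheat-G600-1": 18, "TRAIN-Wheat-G600-2": 14,
    "TRAIN-Wheat-P600-1": 6, "TRAIN-Wheat-P600-2": 13,
    "TRAIN-Wheat-M600": 5,
    "TRAIN-Maize-G600": 10, "TRAIN-Maize-P600": 15, "TRAIN-Maize-M600": 3,
    "TRAIN-Rice-G600": 15,
    "TRAIN-Rice-P600-1": 15, "TRAIN-Rice-P600-2": 12, "TRAIN-Rice-P600-3": 6,
    "TRAIN-Rice-M600": 3,
}

# Assumed label pattern: SPLIT-Grain[-Device][-Part], e.g. TRAIN-Wheat-G600-1.
# VAL and TEST labels carry no device or part suffix.
NAME_RE = re.compile(
    r"^(TRAIN|VAL|TEST)-(Wheat|Maize|Rice)(?:-([A-Z]\d+))?(?:-(\d+))?$"
)


def parse_archive_name(name):
    """Split an archive label into split, grain, device and part fields."""
    match = NAME_RE.match(name)
    if match is None:
        raise ValueError(f"unrecognized archive label: {name!r}")
    split, grain, device, part = match.groups()
    return {
        "split": split,
        "grain": grain,
        "device": device,  # None when the label has no device code
        "part": int(part) if part else None,  # None for single-part archives
    }


def size_by_grain(archives):
    """Sum the listed archive sizes (GB) per grain type."""
    totals = defaultdict(int)
    for name, size_gb in archives.items():
        totals[parse_archive_name(name)["grain"]] += size_gb
    return dict(totals)


if __name__ == "__main__":
    for grain, size_gb in size_by_grain(TRAIN_ARCHIVES).items():
        print(f"{grain}: about {size_gb} GB to download")
```

Grouping by the parsed `device` field instead of `grain` would give per-device subsets, which may be a convenient starting point for the domain adaptation task, provided the device assumption above holds.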

BibTeX

        @InProceedings{Fan_2022_CVPR,
            author    = {Fan, Lei and Ding, Yiwen and Fan, Dongdong and Di, Donglin and Pagnucco, Maurice and Song, Yang},
            title     = {GrainSpace: A Large-Scale Dataset for Fine-Grained and Domain-Adaptive Recognition of Cereal Grains},
            booktitle = {Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)},
            month     = {June},
            year      = {2022},
            pages     = {21116-21125}
        }