Abstract: Accurate collection of crop planting information over large areas is essential for estimating agricultural productivity and ensuring food security. Different crops have varying growth cycles ...
Abstract: Large Multi-modal Models (LMMs) have made impressive progress in many vision-language tasks. Nevertheless, the performance of general LMMs in specific domains is still far from satisfactory.