| Description: |
Automation in agriculture enhances efficiency and productivity. In tillage, for example, driving tasks such as steering and speed control are already highly automated, which shifts the focus toward automating the tillage process itself. Measuring crop residue coverage - a key factor for erosion resistance, soil structure, and moisture - with semantic segmentation of camera images requires large, accurately annotated datasets. Manual annotation is time-consuming, error-prone, and difficult because of the fine structures of straw and the indistinct boundaries of soil aggregates. To overcome these issues, synthetic training data were generated with the modeling software Blender, which was used to model soil textures, residue distributions, and environmental conditions. Photorealism was subsequently enhanced with the machine learning method ControlNet. The approach was evaluated on three datasets - real-world, Blender-generated, and ControlNet-generated - using the mean Intersection over Union (mIoU) and Fréchet Inception Distance (FID) metrics. The semantic segmentation network PIDNet, trained on real-world data, achieved an mIoU of 75.0 %. Trained on the Blender dataset, the network reached only 52.9 %, which is attributed to limited realism. In contrast, training on ControlNet-generated data achieved 69.3 %, together with improved FID scores relative to the Blender dataset, indicating higher realism and better model performance. Finally, fine-tuning the ControlNet-trained segmentation model with real data raised the mIoU to 75.4 %. These findings indicate that high-quality synthetic data can reduce annotation effort, minimize labeling errors, and, when followed by fine-tuning on real data, slightly outperform training on real data alone. |
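
For readers unfamiliar with the mIoU metric reported above, the following is a minimal sketch of how it can be computed for a pair of label maps. It is an illustration only, not the evaluation code used in the study: the function name `mean_iou`, the two-class labeling (0 = soil, 1 = residue), and the tiny example maps are all assumptions made here for clarity.

```python
from typing import Sequence


def mean_iou(
    pred: Sequence[Sequence[int]],
    target: Sequence[Sequence[int]],
    num_classes: int,
) -> float:
    """Mean Intersection over Union for two label maps of equal shape.

    Classes that occur in neither map are skipped when averaging.
    """
    intersection = [0] * num_classes
    union = [0] * num_classes

    for pred_row, target_row in zip(pred, target):
        for p, t in zip(pred_row, target_row):
            if p == t:
                # Pixel counts toward both intersection and union of its class.
                intersection[p] += 1
                union[p] += 1
            else:
                # A mismatch enlarges the union of both classes involved.
                union[p] += 1
                union[t] += 1

    ious = [i / u for i, u in zip(intersection, union) if u > 0]
    if not ious:
        raise ValueError("no class is present in either label map")
    return sum(ious) / len(ious)


# Hypothetical 2x3 label maps: 0 = soil, 1 = residue (assumed labels).
predicted = [[0, 0, 1],
             [1, 1, 0]]
annotated = [[0, 1, 1],
             [1, 1, 0]]

score = mean_iou(predicted, annotated, num_classes=2)
print(f"mIoU = {score:.3f}")
```

In this toy example, the soil class has an IoU of 2/3 and the residue class an IoU of 3/4, so the mean is 17/24. On a real dataset, intersections and unions are typically accumulated over all images before the per-class ratios are taken.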