With the development of the global economy, the fruit consumption market continues to grow and the variety of fruits keeps expanding. Fruit detection and identification technology therefore has important application value in agricultural production, warehousing and logistics, supermarket retail, and other fields. Traditional fruit detection and identification mainly relies on manual inspection, which is limited by labor cost, identification efficiency, and accuracy. Developing an efficient and accurate automated fruit detection and identification system thus has both research significance and practical value.
In this blog post, we present a deep-learning-based fruit detection and recognition system built on the YOLOv5 algorithm. It detects and identifies common fruits in images, videos, and real-time video streams. As a recent member of the YOLO family, YOLOv5 further improves detection accuracy while maintaining real-time performance.
Users can upload a single image or video file for fruit detection and identification, and the system will identify the fruit in the image and display the corresponding category and confidence level.

Users can also use the camera for real-time fruit detection and identification.
The fruit detection dataset used in this system is manually labeled with 8 categories of fruits: apples, bananas, dragon fruits, guavas, oranges, pears, pineapples, and sakya (sugar apples), for a total of 3030 images. Each category contains many rotations and varied lighting conditions, which helps train a more robust detection model. The dataset is split into 2424 training images, 303 validation images, and 303 test images. Sample images from the dataset are shown in the figure.
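The 2424/303/303 split above corresponds to an 80/10/10 partition of the 3030 images. A minimal sketch of such a split (the function name and seed handling are illustrative, not taken from the system's code):

```python
import random

def split_dataset(paths, train_frac=0.8, val_frac=0.1, seed=0):
    """Shuffle image paths deterministically and split them into
    train/val/test subsets (default 80/10/10)."""
    paths = list(paths)
    random.Random(seed).shuffle(paths)
    n = len(paths)
    n_train = int(n * train_frac)
    n_val = int(n * val_frac)
    return (paths[:n_train],
            paths[n_train:n_train + n_val],
            paths[n_train + n_val:])

# With 3030 images, this split yields the 2424/303/303 counts above.
train, val, test = split_dataset([f"img{i}.jpg" for i in range(3030)])
```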

The deep learning model of this system is implemented in PyTorch and uses the YOLOv5 algorithm for object detection. In the training stage, we initialized the network from a pre-trained model and then optimized the parameters over many iterations to achieve better detection performance. During training, we adopted techniques such as learning rate decay and data augmentation to improve the generalization ability and robustness of the model.
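To make the learning rate decay concrete, here is a minimal sketch of a cosine decay schedule of the kind YOLOv5 uses by default; the function name and the initial/final rates (`lr0`, `lrf`) are illustrative values, not the system's actual hyperparameters:

```python
import math

def cosine_lr(step, total_steps, lr0=0.01, lrf=0.001):
    """Cosine learning-rate decay: starts at lr0 and smoothly
    anneals to lrf over total_steps training steps."""
    cos = (1 + math.cos(math.pi * step / total_steps)) / 2
    return lrf + (lr0 - lrf) * cos

# The rate falls from 0.01 at step 0 to 0.001 at the final step.
schedule = [cosine_lr(s, 100) for s in range(101)]
```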
In the testing phase, we use the trained model to detect fruits in new images, videos, and live video streams. By setting a confidence threshold, we filter out detection boxes whose confidence falls below it, yielding the final detection results. The results can also be saved as images or videos for subsequent analysis and application.
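The thresholding step can be sketched as follows. The tuple layout `(x1, y1, x2, y2, confidence, class_name)` mirrors the per-box fields YOLOv5 reports, but the helper, the sample boxes, and the default threshold of 0.25 are illustrative assumptions:

```python
def filter_detections(detections, conf_thres=0.25):
    """Keep only detection boxes whose confidence meets the threshold.
    Each detection is (x1, y1, x2, y2, confidence, class_name)."""
    return [d for d in detections if d[4] >= conf_thres]

# Hypothetical raw model output for one image.
raw = [(10, 20, 110, 140, 0.91, "apple"),
       (30, 40, 90, 100, 0.12, "pear"),
       (50, 60, 160, 200, 0.47, "banana")]

kept = filter_detections(raw, conf_thres=0.25)  # low-confidence "pear" box is dropped
```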
UI part
In this system, a visual GUI interface is designed using the PyQt5 library, which mainly includes the following six modules:
(1) Image detection module: Users can select a local image for detection, and the detection results will be displayed in real time in the interface. Users can also select multiple images for batch detection and save the result records locally.
(2) Video detection module: Users can select a local video for detection, and the detection results will be displayed in real time in the interface. Users can also select multiple videos for batch detection and save the result records locally.
(3) Real-time detection module: Users can start the camera for real-time detection, and the detection results will be displayed in real time on the interface.
(4) Model switching module: Users can choose among different pre-trained models for detection; the system provides a variety of pre-trained models to choose from.
(5) Result record review module: Users can view historical detection results and search and filter them.
(6) Other modules: The interface also includes auxiliary modules such as login/registration and settings.
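The search-and-filter behavior of the result record module (5) could look like the sketch below. The record fields (`file`, `label`, `confidence`) and the helper name are assumptions for illustration, not the system's actual schema:

```python
def search_records(records, keyword="", min_conf=0.0):
    """Filter saved detection records by a fruit-name keyword
    (case-insensitive substring match) and a minimum confidence."""
    return [r for r in records
            if keyword.lower() in r["label"].lower()
            and r["confidence"] >= min_conf]

# Hypothetical saved history of detection results.
history = [{"file": "a.jpg", "label": "apple", "confidence": 0.93},
           {"file": "b.jpg", "label": "banana", "confidence": 0.58},
           {"file": "c.jpg", "label": "pineapple", "confidence": 0.88}]

# Substring search: "apple" matches both "apple" and "pineapple".
hits = search_records(history, keyword="apple", min_conf=0.8)
```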