
Automatic counting and identification of two Drosophila melanogaster (Diptera: Drosophilidae) morphs with image-recognition artificial intelligence

Published online by Cambridge University Press:  11 December 2024

Aaron Gálvez Salido
Affiliation:
Departamento de Genética, Facultad de Ciencias, Universidad de Granada, Granada, Spain
Roberto de la Herrán
Affiliation:
Departamento de Genética, Facultad de Ciencias, Universidad de Granada, Granada, Spain
Francisca Robles
Affiliation:
Departamento de Genética, Facultad de Ciencias, Universidad de Granada, Granada, Spain
Carmelo Ruiz Rejón
Affiliation:
Departamento de Genética, Facultad de Ciencias, Universidad de Granada, Granada, Spain
Rafael Navajas-Pérez*
Affiliation:
Departamento de Genética, Facultad de Ciencias, Universidad de Granada, Granada, Spain
*
Corresponding author: Rafael Navajas-Pérez; Email: [email protected]

Abstract

Many population biology, ecology, and evolution experiments rely on the accurate classification of individuals and estimation of population size. The visual classification of vinegar fly, Drosophila melanogaster (Diptera: Drosophilidae), morphs is a laborious task usually performed by bench workers. Because of the small size of the flies and the precision needed to distinguish the morphological features on which the classification is based, the work is performed under a dissecting microscope. Here, we describe a method to automate the counting and identification of two types of vinegar flies, white-mutant and wild-type individuals. Our method is based on an image-recognition artificial intelligence (AI) tool, FlydAI (FlyDetector AI), which correctly classified the flies when high-quality images were used, with a success rate of up to 100% in samples containing up to 200 individuals. This is a significant improvement over preexisting approaches in terms of accuracy and the specificity of the morphs detected. Although the tool was trained exclusively for routine lab tasks involving wild-type and white D. melanogaster, the AI can easily be retrained to recognise other vinegar fly mutants and other insects of similar size, and its potential in other areas remains to be explored.

Type
Research Paper
Creative Commons
This is an Open Access article, distributed under the terms of the Creative Commons Attribution-NonCommercial-ShareAlike licence (https://creativecommons.org/licenses/by-nc-sa/4.0/), which permits non-commercial re-use, distribution, and reproduction in any medium, provided the same Creative Commons licence is used to distribute the re-used or adapted article and the original article is properly cited. The written permission of Cambridge University Press must be obtained prior to any commercial use.
Copyright
© The Author(s), 2024. Published by Cambridge University Press on behalf of the Entomological Society of Canada

Introduction

Around 4000 labs worldwide work on Drosophila melanogaster Meigen, 1830 (Diptera: Drosophilidae), the vinegar fly, according to the Bloomington Drosophila Stock Center (BDSC), whose database includes more than 3900 research groups in 72 countries. Although the first documented experimental use of Drosophila was by William Castle's group at Harvard in 1901, this species became a model insect after Thomas H. Morgan and collaborators laid the foundations of the field of genetics in their famous Fly Room at the beginning of the twentieth century (Morgan 1909). Since then, hundreds of mutations have been described in D. melanogaster causing alterations in wings, abnormal body colour, odd-coloured eyes, and strangely formed heads (Chadov et al. 2015). These mutations allow a direct association between phenotypes and genotypes, making them easily scored markers for genetic mapping, population biology, ecology, and evolution studies (Kohler 1994). Drosophila melanogaster enabled seminal genetics discoveries, such as the nature of Mendelian factors, physical mapping, and the effect of X-rays on chromosomal structure (Morgan 1909; Jennings 2011), and was used to test the shotgun genome-sequencing method 11 months before the publication of the human genome (Kohler 1994; Adams et al. 2000; Rubin and Lewis 2000).

The vinegar fly is also the subject of a large number of experiments across a wide range of fields (Hales et al. 2015). It is a key model for regenerative biology and medicine: well-established protocols exist for its genetic modification, and many orthologous genes are available for studying the mechanisms underlying human disease (Yamamoto et al. 2014), including cancer, cardiovascular disease, and neurological diseases (Bellen et al. 2010; Pandey and Nichols 2011). Drosophila melanogaster is also considered a model organism for developmental biology: over the past four decades, it has become a predominant model for understanding how genes direct the development of an embryo (Jennings 2011), and many Drosophila driver lines provide spatiotemporally regulated genetic handles to almost every cell type (Chan et al. 2024). The use of vinegar flies is also common in studies of the influence of genetic and environmental factors on behaviour, such as mating and courtship (Balaban-Feld and Valone 2018), aggression (Baier et al. 2002), and learning and memory (Maggu et al. 2022).

For some analyses, only a count of the total number of individuals is needed, whereas others require detailed counts (i.e., classifying different morphs). Efficient estimation of the number and type of individuals is crucial in experiments that correlate particular phenotypes with specific genotypes, as in genetic mapping (Zhai et al. 2003) or functional ecology studies (Mendes et al. 2021). In other cases, a measure of the raw number of individuals in a population is required, as in fecundity experiments (Nouhaud et al. 2018). The traditional manual process includes a short anaesthetic treatment of the flies, after which they are counted and identified by experienced technicians under dissecting microscopes. In still other cases, such as some behavioural experiments, it is important to track fly movement (Nichols et al. 2012).

Although the short lifecycle of Drosophila makes experiments with large numbers of individuals possible, some methodological bottlenecks remain. The small size of the individuals (about 3 mm long) and their motility often hinder their handling and identification and make it difficult to track movement or to observe certain behaviours (Macartney et al. 2022). Counting large populations of flies is a tedious and time-consuming process, and the total processing time depends on the ability of the human operator and the number of individuals required. Some types of study require analysing high numbers of individuals; in these cases, automated phenotyping is a possible solution. Such methods use deep-learning models to automatically recognise and categorise elements within images and consist of several steps: (1) data collection, typically a digital image; (2) identification and classification of the objects in the image by artificial neural networks; (3) image pattern recognition, in which a predefined class label is assigned to an image, or part of an image, and tagged using object bounding boxes; and (4) segmentation, which is necessary for counting and recognising individual objects (Uchida 2013). In Drosophila, examples include (1) methods to efficiently count large numbers of individuals and (2) methods able to determine offspring size and sex ratio in fly populations; all of them take as input images of anaesthetised flies or of flies on sticky traps. In the first category, the most significant tools are FlyCounter, a program coded in MATLAB (Yati and Dey 2011), and an application for Android devices adapted from a preexisting seed-counting app (Karpova et al. 2020). Other authors, using ImageJ (Nouhaud et al. 2018; Ng'oma et al. 2020) or proprietary software (Waithe et al. 2015), have developed methods to indirectly estimate Drosophila population sizes from the number of eggs laid. In the second category, a cell-phone application, also named FlyCounter (Genaev et al. 2022), is based on a YOLOv4-tiny neural network, and an image-based object detection method uses deep convolutional neural networks to identify D. suzukii individuals (Roosjen et al. 2020). Besides counting flies, these latter two approaches can also differentiate individual flies by sex. None of the methods developed to date has been trained to differentiate Drosophila mutants.

Here, we present FlyDetector artificial intelligence (FlydAI), an image-recognition algorithm based on TensorFlow that can be used to count and classify large populations of wild-type and white-mutant D. melanogaster. FlydAI takes images containing different numbers of flies as input and predicts the total number and morphs of the flies.

Materials and methods

The rationale behind the present experiment was to capture images containing two types of D. melanogaster flies so that they could subsequently be processed by an ad hoc–trained image-recognition AI that automates the counting of individuals and the identification of the different phenotypes. A process map is shown in Fig. 1.

Figure 1. Pipeline of the development and use of image-recognition AI: image capturing, dataset preparation, and AI training and validation.

The workflow of FlydAI is as follows: (1) an image in Joint Photographic Experts Group (JPEG) format is taken as input; (2) if the image exceeds 1094 × 768 pixels, it is cropped into fragments; (3) each fragment is processed individually: individuals are identified, classified by phenotype, and recorded; (4) a summary of the numbers of individuals and phenotypes counted is written to a comma-separated values (CSV) file; and (5) the original image is reconstructed from the interpreted fragments (Fig. 2).
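The sketch below illustrates this crop-detect-summarise structure in Python. The fragment grid, the CSV layout, and the detect_fragment helper are hypothetical stand-ins for FlydAI's internals, not the published implementation:

```python
import csv
from collections import Counter
from PIL import Image

MAX_W, MAX_H = 1094, 768  # maximum fragment size, matching the published criterion

def crop_to_fragments(image):
    """Split an oversized image into tiles no larger than MAX_W x MAX_H."""
    fragments = []
    for top in range(0, image.height, MAX_H):
        for left in range(0, image.width, MAX_W):
            box = (left, top, min(left + MAX_W, image.width),
                   min(top + MAX_H, image.height))
            fragments.append(image.crop(box))
    return fragments

def detect_fragment(fragment):
    """Hypothetical stand-in for the trained detector: returns a list of
    (phenotype, confidence) tuples for one fragment."""
    raise NotImplementedError("replace with inference on the trained model")

def count_flies(image_path, csv_path, cutoff=0.65):
    """Crop, detect per fragment, and write the per-phenotype summary CSV."""
    counts = Counter()
    for fragment in crop_to_fragments(Image.open(image_path)):
        for phenotype, confidence in detect_fragment(fragment):
            if confidence >= cutoff:  # same confidence cutoff as in Fig. 2
                counts[phenotype] += 1
    with open(csv_path, "w", newline="") as handle:
        writer = csv.writer(handle)
        writer.writerow(["phenotype", "count"])
        for phenotype, n in sorted(counts.items()):
            writer.writerow([phenotype, n])
    return counts
```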

Figure 2. Output image reconstructed by FlydAI. Each individual is identified by a red square; the assigned phenotype and a confidence value (a cutoff of 0.65 was applied) are indicated in brackets inside each square.

Image dataset preparation

Wild-type and white-mutant D. melanogaster flies were anaesthetised with ether for 45 seconds, placed onto blue, white, pink, yellow, brown, and gridded cards (see Supplementary material 1) to maximise background subtraction (Uchida 2013), and photographed. The number of individuals per image ranged randomly from 1 to 10. For replicates, individuals were randomly replaced and repositioned to train the model to identify different flies in different positions. A total of 350 flies were used. Images were captured with two different devices: a Canon EOS 70D camera with a 100-mm f/2.8L macro lens (Canon, Ota City, Tokyo, Japan) and a Bysameyee 8-SA-00 digital microscope (Bysameyee, South Korea). A dataset of 495 images in three categories (only white mutants, only wild-type flies, and both white mutants and wild-type flies) was generated.

Development of AI for image recognition

FlydAI training was performed using TensorFlow, a free and open-source software library for machine learning and artificial intelligence (https://www.tensorflow.org). After conducting a mean average precision (mAP) test with COCO (Common Objects in Context, https://cocodataset.org/; Lin et al. 2014) on five different architectures (for details, see Data availability), the EfficientDet-Lite2 architecture (batch size = 16 for training and 1 for validation; 120 epochs) was selected because it performed most efficiently in terms of training and inference times.
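As an illustration, this configuration can be reproduced with the TensorFlow Lite Model Maker library; the following is a minimal sketch, assuming annotations exported from labelImg in Pascal VOC format, and the directory names and label map are hypothetical:

```python
from tflite_model_maker import model_spec, object_detector

# EfficientDet-Lite2 backbone, as selected after the mAP comparison
spec = model_spec.get('efficientdet_lite2')

# labelImg writes Pascal VOC XML, which DataLoader reads directly
train_data = object_detector.DataLoader.from_pascal_voc(
    'images/train', 'annotations/train', label_map={1: 'wild', 2: 'white'})
val_data = object_detector.DataLoader.from_pascal_voc(
    'images/val', 'annotations/val', label_map={1: 'wild', 2: 'white'})

# batch size 16 and 120 epochs, matching the published training settings
model = object_detector.create(train_data, model_spec=spec, batch_size=16,
                               epochs=120, validation_data=val_data,
                               train_whole_model=True)

print(model.evaluate(val_data))       # COCO-style metrics, including mAP
model.export(export_dir='exported')   # writes a .tflite model file
```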

The size of the input images was standardised: images exceeding 1094 × 768 pixels were cropped into smaller fragments to meet this criterion. This facilitated the detection of small targets, and it was the optimal image size with which the TensorFlow architecture was trained. Reducing or rescaling an image to the optimal size is not an option because doing so would reduce the level of detail in the image, dramatically decreasing the model's performance. When images are cropped, flies positioned on a boundary between two fragments may be misclassified or counted twice. To avoid these potential problems, we recommend using a grid card (see Supplementary material 1) and placing the anaesthetised flies well within the squares.

Wild-type and white-mutant flies were manually identified in the images and labelled with labelImg (https://github.com/tzutalin/labelImg). Images were randomly distributed for training (440; 89%) and validation (55; 11%). For training, we aimed to increase the variability of the dataset to improve the AI's resistance to photographic artefacts. An additional set of images was therefore generated with Roboflow (https://roboflow.com; Fig. 3), using the initial 440 images as the source. The Roboflow software reconstructed 921 new pictures from the original images by randomly combining their different parts, including mirror images, and by adding 5% artificial noise. As a result, a final dataset of 1361 images (440 original images + 921 Roboflow-generated images) was used for AI training. Reports for each validation run were generated and manually inspected.

Figure 3. Example of an image reconstructed by combining fragments and adding artificial noise generated by Roboflow.

Only images that had been neither labelled nor previously processed by the AI were used to test it. Sets of images with different fly densities, containing 25, 50, 100, and 200 wild-type and white-mutant flies in different proportions, were reconstructed from the dataset; for that task, different combinations of fragments containing up to 10 individuals were used. In addition, three sets of test images were altered by adding 5, 15, and 30% noise with Photoshop (Adobe, Inc., San Jose, California, United States of America; an example is shown in Supplementary material 2) and then used to test the AI as described above. The whole process, using different sets of images at the same densities (25, 50, 100, and 200) and noise levels (0, 5, 15, and 30%), was repeated 15 times, with the combination of fragments differing between replicates. The run time on Google Colab (Mountain View, California, United States of America), using an NVIDIA Tesla K80 GPU, was recorded.
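The noise treatment can be approximated outside Photoshop; the sketch below is an assumption on our part, since the paper does not specify Photoshop's noise algorithm. It adds uniform noise at a given percentage amplitude using NumPy and Pillow:

```python
import numpy as np
from PIL import Image

def add_noise(image_path, out_path, percent, seed=0):
    """Add uniform noise whose amplitude is `percent` % of the 0-255 range,
    approximating the 5/15/30% treatments applied to the test images."""
    rng = np.random.default_rng(seed)
    pixels = np.asarray(Image.open(image_path), dtype=np.float32)
    amplitude = 255.0 * percent / 100.0
    noise = rng.uniform(-amplitude, amplitude, size=pixels.shape)
    noisy = np.clip(pixels + noise, 0, 255).astype(np.uint8)
    Image.fromarray(noisy).save(out_path)

# Generate the three noise levels used in the tests (file names hypothetical)
for level in (5, 15, 30):
    add_noise('test_image.jpg', f'test_image_noise{level}.jpg', level)
```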

Data availability

A GitHub repository hosts the Python scripts written to automate the process, the compiled programs, a step-by-step protocol, and the dataset used in this study: https://aarongs1999.github.io/Drosophila_AI_Tensorflow/Drosophila_AI_Tensorflow.html. Users can reproduce the analysis presented here using the sample dataset provided, run the model on their own images of white and wild-type Drosophila, or retrain the model with images of different materials.

Human determination of number and phenotype of individuals

Four experienced geneticists were asked to estimate the number of wild-type and white-mutant flies in digital images of different densities (25, 50, 100, and 200 flies). Each geneticist was given 15 images that were randomly selected from the dataset of high-quality images containing a mix of wild and white flies. The geneticists did not see the images before the experiment and did not know the number of individuals in each image in advance. The time to process each image – the amount of time from when the digital image file in question was opened until the count was finished – was recorded.

In addition, three experienced geneticists were asked to estimate the number of wild and white flies in lab samples of different densities (approximately 25, 50, 100, and 200 flies) using dissecting microscopes. Each geneticist was given 10 samples that contained a mix of wild and white flies. The geneticists did not know the number of individual flies in each sample in advance. The time needed to process each sample – the amount of time from when the sample in question was provided until the count was finished – was recorded.

Statistical processing of data

For each image, the success rate of the estimations of three parameters – total number of individuals, number of wild individuals, and number of white individuals – was calculated as follows:

$$\text{Success rate} = \frac{\text{experimental value}}{\text{real value}} \times 100$$

Values greater than 100 indicate overestimation, and those less than 100 indicate underestimation. For example, counting 198 flies in a sample that actually contains 200 gives a success rate of 99.

Results

FlydAI: individual counts, phenotype identification, and processing time

Individual counts

When images with up to 5% noise were input, FlydAI estimated the correct number of individuals with a success rate above 99.8% in all tests performed (densities of 25, 50, 100, and 200) and in all 15 replicates. For images with 15% noise, the efficiency was better than 96%. In these cases, the few undetected flies were out of focus or in positions that hampered their identification, even by a human operator. In images with 30% artificial noise, FlydAI estimated the number of individuals with an efficiency of 75.2%. Across the entire dataset, the AI's average success rate in estimating the number of individual flies was 92.92% (Table 1).

Table 1. Success rate (SR) of estimation of number and phenotype of individuals by AI and run time for the different densities and image sources

Phenotype identification

Regarding phenotype identification, for images taken under controlled focus and light conditions, all individuals detected by the AI were assigned the correct phenotype (wild-type or white) at all densities and in all replicates (Table 1). When images with 5% artificial noise were input, all individuals were assigned the correct phenotype except for a single wild-type fly that the AI phenotyped as white. For images with 15% artificial noise, 82.3% of white individuals were assigned the correct phenotype; wild-type individuals were overestimated by 15.6%, with some white individuals incorrectly assigned to the wild-type phenotype. For images with 30% artificial noise, wild-type individuals were overestimated by 47.74%, with most white individuals incorrectly assigned to the wild-type phenotype; using these images as the source, FlydAI correctly phenotyped only 3.67% of white-mutant flies (Table 1).

Processing time

The overall mean run time for FlydAI was 75.83 seconds per image and did not differ significantly between replicates, regardless of fly density or noise level (Table 1). The execution time is directly related to the complexity of the TensorFlow architecture; the architecture selected for the present study (EfficientDet-Lite2) proved the most efficient of the options tested.

Human workers: individual counts, phenotype identification, and processing time

Individual counts

Human workers were asked to count the white and wild-type individuals in both digital images and lab samples of different densities (25, 50, 100, and 200). For digital images, the overall average success rate in estimating individual fly numbers was almost 100% (Table 2); for lab samples, it was 99.6% (Table 3).

Table 2. Success rate (SR) of estimation of number and phenotype of individuals from digital images by human workers and processing time

Table 3. Success rate (SR) of estimation of number and phenotype of individuals from lab samples by human workers and processing time

Phenotype identification

The average percentages of correct phenotype identifications in digital images and lab samples were equal, both being 99.6%.

Processing time

For both individual counts and phenotype identification, the mean time required to complete each task increased with fly density. For digital images, the mean processing time was 51.13, 67.45, 84.14, and 153.09 seconds per image for densities of 25, 50, 100, and 200, respectively. For lab samples, it was 49, 95.8, 197.33, and 513.7 seconds per sample for the same densities. The overall average processing time was 88.95 seconds per digital image and 214 seconds per lab sample (Tables 2 and 3).

Discussion

Using AI for counting and identifying Drosophila melanogaster individuals

FlydAI is an AI tool that automates the counting and identification of large populations of wild-type and white morphs of D. melanogaster, minimising the labour of the manual procedure. Following the approach presented here, we counted and identified wild-type and white morph flies in samples of up to 200 individuals: 99.89% of the individuals were counted, all of them were assigned the correct phenotype, and the average run time was 78.81 seconds. Based on these results, FlydAI could help reduce the labour time for these routine tasks in a D. melanogaster lab.

When input images are larger than 1094 × 768 pixels, FlydAI divides them into smaller fragments, which the AI processes separately, a few individuals at a time. We tested different combinations of fragments as inputs to reach the different densities and found that accuracy and processing time were not significantly affected by sample size (see Table 1), indicating that the number of flies to be counted and identified can be increased without hindering these parameters. As noted in the section Development of AI for image recognition, a fly that spans two fragments as a result of the cropping process can be misclassified or counted twice. The grid card provided (Supplementary material 1) lets the user place the anaesthetised flies well within the squares so that all of them are classified correctly; a software-side complement is sketched below. The print-ready sample card can easily be made into a sticky card to facilitate image capture.
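As a software-side complement to the grid card (not part of FlydAI as described here, but a common mitigation in tiled object detection), fragments could be cropped with a small overlap and near-duplicate detections merged after mapping their boxes back to full-image coordinates. A minimal sketch:

```python
def iou(a, b):
    """Intersection-over-union of two boxes in (x1, y1, x2, y2) full-image coordinates."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0, ix2 - ix1) * max(0, iy2 - iy1)
    if inter == 0:
        return 0.0
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    return inter / (area_a + area_b - inter)

def merge_detections(detections, threshold=0.5):
    """Keep the highest-confidence box from each cluster of near-duplicate
    detections produced by overlapping fragments; `detections` is a list of
    dicts with 'box' and 'score' keys (a hypothetical format)."""
    kept = []
    for det in sorted(detections, key=lambda d: d['score'], reverse=True):
        if all(iou(det['box'], k['box']) < threshold for k in kept):
            kept.append(det)
    return kept
```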

Efficiency of human workers versus FlydAI

For experienced human workers, the rate of accurate estimation of the number and phenotype of individuals was almost 100% in all cases, although, when viewing digital images, workers tended slightly to overestimate the number of individuals in samples of up to 100 individuals and to underestimate it in samples of 200 individuals. Under the microscope, workers showed the opposite tendency, undercounting flies at lower densities and overcounting them at the highest density (200 flies). All counting errors by workers were minor, however.

The preparation of the samples to be processed by FlydAI and by human operators does not differ significantly: in both cases, it is necessary to anaesthetise the flies, place them on a solid surface (cardboard grid or watch glass, depending on the case), and position them for observation (under a DSLR camera or a dissecting microscope, respectively). Beyond that, human processing time at low and moderate densities was similar to FlydAI's for digital images (Table 2) and even shorter for lab samples (Table 3). As the number of individuals per sample increased, human processing time increased dramatically, whereas the AI run time remained constant. On average, a human worker was 14.75% slower than FlydAI when processing digital images and 64.6% slower when processing lab samples. For the highest-density samples used in this study (200 flies), processing times were 513.7, 153.09, and 76 seconds per image for lab samples, digital images, and the AI, respectively. A human worker would thus take twice as long as FlydAI to process a digital image and almost seven times as long to process a lab sample (see Tables 1–3). It is reasonable to expect that, as the number of samples processed over a workday increases, the efficiency of a human worker would decline due to fatigue, whereas the AI would maintain steady performance. This makes FlydAI an effective alternative to processing by bench workers.

Automated recognition methods

Most traditional manual methods to estimate insect population sizes are trap-based (pitfall traps, emergence traps, malaise traps, pan traps, light traps, etc.). In other cases, insects become fixed to a physical surface (e.g., spot cards, sticky traps, and fly ribbons; Gerry 2020) and are then counted by a human worker. In automated approaches, the surface bearing the insects or insect marks is photographed, and an AI tool processes the data (Gerry et al. 2011; Ding and Taylor 2016; Zhong et al. 2018; Zhu et al. 2018). These image-recognition methods are commonly used to count flies (Gerry 2020), which may be difficult to identify and count when observed alive because of their small size and unpredictable, erratic motility: these characteristics complicate identification, increase the likelihood that an individual fly is counted multiple times, and make obtaining clear pictures of flies in flight difficult. Although video-monitoring devices are normally intended to count larger animals (Pérez-Escudero et al. 2014; Bentley et al. 2023), some early efforts have detected insects in motion (Bjerge et al. 2021, 2023; Kirkeby et al. 2021; Tannous et al. 2023; Roy et al. 2024).

For pest control, the introduction of electronic and automatic traps that combine hardware to collect insects with software to send and process data offsite is increasingly common (Potamitis et al. 2018; Preti et al. 2020; Plá et al. 2021; Diller et al. 2023). Some such tools have already been commercialised: for example, Trapview (https://trapview.com/) has developed a device that automatically collects and sends data to be analysed by AI, and RapidFLY (https://rapidaim.io/) provides a trap equipped with a sensor able to detect the presence of fruit flies.

In recent years, various AI tools have emerged. Some, such as Google's Vertex AI, have powerful image-recognition capabilities. However, they all still have drawbacks: they operate under pay-as-you-go models whose costs escalate quickly, they offer limited customisability, and they depend on computational resources or Internet availability.

Ferreira Lima et al. (2020), Høye et al. (2021), and Schneider et al. (2023) show that many applications exist to automate the estimation of population size and insect type. However, most methods cited by those authors discriminate only among individuals belonging to different orders, genera, or, at best, species, and none has been trained to differentiate Drosophila morphs. Although that level of resolution may suffice for pest control, it lacks the accuracy needed for laboratory work. FlydAI, on the other hand, accurately recognises and counts D. melanogaster mutants from a digital image. Once the individuals are fixed onto a surface, only a device able to capture macro images of at least 1094 × 768 pixels is needed.

Criticisms

It may be argued that the high success rate of FlydAI in the present study is due to the high-quality images used for the experiments. The images were taken under controlled light and focus conditions, simulating a feasible lab situation. To test the AI's usability and replicability under different conditions, we altered a set of images: adding noise increases the AI's resilience to artefacts similar to those produced under poor lighting. Our aim was to train a tool that can be used even when image-acquisition conditions are suboptimal, for example, during field trips or in facilities where specialised equipment is unavailable. In our tests, FlydAI's success rate in counting flies was near 100% at all densities tested for images with up to 15% added artificial noise and dropped to 75% for images with 30% noise. No significant differences in success rate were observed as the density of individuals increased. Almost all flies detected in images with up to 5% noise were phenotyped correctly. As the noise level increased, the AI tended to misidentify white individuals as wild-type flies: it overestimated wild-type flies by 15.2% and 47.74% for images with 15% and 30% noise, respectively, and underestimated white flies by 17.77% and 96.33%, respectively.

To prevent the model from classifying only flies presented at a single scale and to increase the AI's efficiency, different image sources (a DSLR camera and a digital microscope, with different levels of magnification and detail) were used for training. Regardless of the technical and environmental conditions under which our tests were conducted, it is highly unlikely that users in real Drosophila experiments will feed the model images with the level of artificial noise we added for the present study: most commonly used photographic devices produce high-quality images. In fact, a growing body of research demonstrates the use of images taken with mobile phone cameras in diverse experimental contexts (Ozcan 2014; Switz et al. 2014).

To further test the efficiency of FlydAI, we ran it on a set of images downloaded from the Internet. The lack of a sufficient number of images containing exclusively wild-type and white D. melanogaster flies did not permit robust conclusions, but FlydAI detected both phenotypes with the expected accuracy, even when drawings of D. melanogaster were used (see Data availability).

Final remarks

The data presented here indicate that FlydAI is an accurate, fast, and reliable tool for the automated phenotyping of two D. melanogaster morphs (wild-type and white) and for counting large numbers of individuals, and that using it significantly reduces the processing time for counting and identification. In terms of efficiency and time, the tool surpassed both other methods and experienced human operators. FlydAI is thus a viable alternative to human workers for routine lab tasks, especially when image quality is high or moderate (up to 15% noise) and the number of individuals is high (200 or more). The protocol presented in this study could benefit researchers in fields such as population biology, ecology, and evolution, where large numbers of individuals must be handled. FlydAI is also customisable for different phenotypes and species and requires minimal supervision by specialised operators. Programs, scripts, sample images, and the protocol are available in a GitHub repository, and users can easily customise the methodology by training the AI with their own images. A step-by-step protocol and the AI can be accessed via a Google Colab virtual machine. Furthermore, FlydAI is affordable: TensorFlow Lite is specifically designed for deploying models on microcontrollers and other mobile and edge devices. This allows FlydAI to run on a 2.2-GHz processor, making it possible to use a low-cost Raspberry Pi (model 4B, 8-GB RAM) compact computer (Raspberry Pi OS 11, 64-bit; Raspberry Pi Foundation, Cambridge, United Kingdom) to process the AI-generated files. The methodology presented here is therefore portable and can be used outdoors or in facilities where no specialised instruments are available.
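As an illustration of such edge deployment, an exported model can be loaded with the lightweight tflite-runtime interpreter on a Raspberry Pi. The following is a minimal sketch: the model file name is hypothetical, and the output tensor order should be checked against get_output_details() for the specific export:

```python
import numpy as np
from PIL import Image
from tflite_runtime.interpreter import Interpreter

interpreter = Interpreter(model_path='flydai.tflite')  # hypothetical file name
interpreter.allocate_tensors()
inp = interpreter.get_input_details()[0]

# Resize one fragment to the model's expected input size and run inference
height, width = inp['shape'][1], inp['shape'][2]
fragment = Image.open('fragment.jpg').resize((width, height))
tensor = np.expand_dims(np.asarray(fragment, dtype=inp['dtype']), axis=0)
interpreter.set_tensor(inp['index'], tensor)
interpreter.invoke()

# Detection exports emit boxes, classes, scores, and a detection count,
# but their order varies by export, so inspect the output details
for detail in interpreter.get_output_details():
    print(detail['name'], interpreter.get_tensor(detail['index']).shape)
```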

Prospective work

The protocol provided in the present study (see Data availability) allows the AI to be easily trained ad hoc, not only to recognise other D. melanogaster mutants but also to recognise a wide range of other animal species. Although FlydAI was designed exclusively to facilitate routine lab tasks involving counts of white and wild-type D. melanogaster individuals, with the proper adjustments it could be used in the field or in other types of facilities to routinely record fly activity. The second section of the online protocol (see Data availability) outlines how to train and use the model with images of other materials or insects.

The success of the proposed methodology relies on the availability of images containing a representative sample of the individuals in a population. The workflow allows multiple operators to capture the images onsite, with data processing centralised offsite, reducing the travel time and expense of specialised workers. In principle, the digital images could instead be processed by human workers; however, as shown in the present paper, FlydAI is the more efficient solution.

Supplementary material

The supplementary material for this article can be found at https://doi.org/10.4039/tce.2024.36.

Acknowledgements

The authors thank the Fundación Española para la Ciencia y la Tecnología, FECYT [FCT-21-17334] for funding this work. Funding for open access for this paper was provided by the Universidad de Granada.

Competing interests

The authors declare that they have no competing interests.

References

Adams, M.D., Celniker, S.E., Holt, R.A., Evans, C.A., Gocayne, J.D., Amanatides, P.G., et al. 2000. The genome sequence of Drosophila melanogaster. Science, 287: 2185–2195.
Baier, A., Wittek, B., and Brembs, B. 2002. Drosophila as a new model organism for the neurobiology of aggression? Journal of Experimental Biology, 205: 1233–1240.
Balaban-Feld, J. and Valone, T.J. 2018. Changes in courtship behaviour following rejection: the influence of female phenotype in Drosophila melanogaster. Ethology, 124: 149–154.
Bellen, H., Tong, C., and Tsuda, H. 2010. 100 years of Drosophila research and its impact on vertebrate neuroscience: a history lesson for the future. Nature Reviews Neuroscience, 11: 514–522.
Bentley, I., Kuczynska, V., Eddington, V.M., Armstrong, M., and Kloepper, L.N. 2023. BatCount: a software program to count moving animals. PLOS One, 18: e0278012. http://doi.org/10.1371/journal.pone.0278012.
Bjerge, K., Alison, J., Dyrmann, M., Frigaard, C.E., Mann, H.M.R., and Høye, T.T. 2023. Accurate detection and identification of insects from camera trap images with deep learning. PLOS Sustainability and Transformation, 2: e0000051. https://doi.org/10.1371/journal.pstr.0000051.
Bjerge, K., Mann, H.M.R., and Høye, T.T. 2021. Real-time insect tracking and monitoring with computer vision and deep learning. Remote Sensing in Ecology and Conservation, 8: 315–327. https://doi.org/10.1002/rse2.245.
Chadov, B.F., Fedorova, N.B., and Chadova, E.V. 2015. Conditional mutations in Drosophila melanogaster: on the occasion of the 150th anniversary of G. Mendel's report in Brünn. Mutation Research/Reviews in Mutation Research, 765: 40–55. https://doi.org/10.1016/j.mrrev.2015.06.001.
Chan, I.C.W., Chen, N., Hernandez, J., Meltzer, H., Park, A., and Stahl, A. 2024. Future avenues in Drosophila mushroom body research. Learning & Memory, 31: a053894.
Diller, Y., Shamsian, A., Shaked, B., Altman, Y., Danziger, B.C., Manrakhan, A., et al. 2023. A real-time remote surveillance system for fruit flies of economic importance: sensitivity and image analysis. Journal of Pest Science, 96: 611–622. http://doi.org/10.1007/s10340-022-01528-x.
Ding, W. and Taylor, G. 2016. Automatic moth detection from trap images for pest management. Computers and Electronics in Agriculture, 123: 17–28.
Ferreira Lima, M.C., Damascena de Almeida Leandro, M.E., Valero, C., Pereira Coronel, L.C., and Gonçalves Bazzo, C.O. 2020. Automatic detection and monitoring of insect pests: a review. Agriculture, 10: 161.
Genaev, M.A., Komyshev, E.G., Shishkina, O.D., Adonyeva, N.V., Karpova, E.K., Gruntenko, N.E., et al. 2022. Classification of fruit flies by gender in images using smartphones and the YOLOv4-tiny neural network. Mathematics, 10: 295. http://doi.org/10.3390/math10030295.
Gerry, A.C. 2020. Review of methods to monitor house fly (Musca domestica) abundance and activity. Journal of Economic Entomology, 113: 2571–2580. http://doi.org/10.1093/jee/toaa229.
Gerry, A.C., Higginbotham, G.E., Periera, L.N., Lam, A., and Shelton, C.R. 2011. Evaluation of surveillance methods for monitoring house fly abundance and activity on large commercial dairy operations. Journal of Economic Entomology, 104: 1093–1102.
Hales, K.G., Korey, C.A., Larracuente, A.M., and Roberts, D.M. 2015. Genetics on the fly: a primer on the Drosophila model system. Genetics, 201: 815–842. http://doi.org/10.1534/genetics.115.183392.
Høye, T.T., Ärje, J., Bjerge, K., Hansen, O.L.P., Iosifidis, A., Leese, F., Mann, H.M.R., et al. 2021. Deep learning and computer vision will transform entomology. Proceedings of the National Academy of Sciences, 118: e2002545117. https://doi.org/10.1073/pnas.2002545117.
Jennings, B.H. 2011. Drosophila: a versatile model in biology and medicine. Materials Today, 14: 190–195.
Karpova, E.K., Komyshev, E.G., Genaev, M.A., Adonyeva, N.V., Afonnikov, D.A., Eremina, M.A., and Gruntenko, N.E. 2020. Quantifying Drosophila adults with the use of a smartphone. Biology Open, 9: bio054452. http://doi.org/10.1242/bio.054452.
Kirkeby, C., Rydhmer, K., Cook, S.M., Strand, A., Torrance, M.T., Swain, J.L., et al. 2021. Advances in automatic identification of flying insects using optical sensors and machine learning. Scientific Reports, 11: 1555. http://doi.org/10.1038/s41598-021-81005-0.
Kohler, R.E. 1994. Lords of the Fly: Drosophila Genetics and the Experimental Life. University of Chicago Press, Chicago, Illinois, United States of America.
Lin, T.Y., Maire, M., Belongie, S.J., Bourdev, L.D., Girshick, R.B., Hays, J., et al. 2014. Microsoft COCO: common objects in context. ArXiv. http://doi.org/10.48550/arXiv.1405.0312.
Macartney, E., Pottier, P., Burke, S., and Drobniak, S. 2022. Quantifying between-individual variation using high-throughput phenotyping of behavioural traits in the fruit fly (Drosophila melanogaster). EcoEvoRxiv. https://doi.org/10.32942/X22S39.
Maggu, K., Kapse, S., Ahlawat, N., Arun, M.G., and Prasad, N.G. 2022. Finding love: fruit fly males evolving under higher sexual selection are inherently better at finding receptive females. Animal Behaviour, 187: 15–33.
Mendes, M.F., Gottschalk, M.S., Corrêa, R.C., and Valente-Gaiesky, V.L.S. 2021. Functional traits for ecological studies: a review of characteristics of Drosophilidae (Diptera). Community Ecology, 22: 367–379. http://doi.org/10.1007/s42974-021-00060-9.
Morgan, T.H. 1909. What are "factors" in Mendelian explanations? American Breeders Association Reports, 5: 365–368.
Ng'oma, E., Williams-Simon, P.A., Rahman, A., and King, E.G. 2020. Diverse biological processes coordinate the transcriptional response to nutritional changes in a Drosophila melanogaster multiparent population. BMC Genomics, 21: 1. http://doi.org/10.1186/s12864-019-6419-1.
Nichols, C.D., Becnel, J., and Pandey, U.B. 2012. Methods to assay Drosophila behavior. Journal of Visualized Experiments, 61: 3795. https://doi.org/10.3791/3795.
Nouhaud, P., Mallard, F., Poupardin, R., Barghi, N., and Schlötterer, C. 2018. High-throughput fecundity measurements in Drosophila. Scientific Reports, 8: 4469. http://doi.org/10.1038/s41598-018-22777-w.
Ozcan, A. 2014. Mobile phones democratize and cultivate next-generation imaging, diagnostics and measurement tools. Lab on a Chip, 14: 3187–3194. http://doi.org/10.1039/C4LC00010B.
Pandey, U.B. and Nichols, C.D. 2011. Human disease models in Drosophila melanogaster and the role of the fly in therapeutic drug discovery. Pharmacological Reviews, 63: 411–436.
Pérez-Escudero, A., Vicente-Page, J., Hinz, R.C., Arganda, S., and de Polavieja, G.G. 2014. idTracker: tracking individuals in a group by automatic identification of unmarked animals. Nature Methods, 11: 743–748. http://doi.org/10.1038/nmeth.2994.
Plá, I., García de Oteyza, J., Tur, C., Martínez, M.Á., Laurín, M.C., Alonso, E., et al. 2021. Sterile insect technique programme against Mediterranean fruit fly in the Valencian community (Spain). Insects, 12: 415. http://doi.org/10.3390/insects12050415.
Potamitis, I., Rigakis, I., Vidakis, N., Petousis, M., and Weber, M. 2018. Affordable bimodal optical sensors to spread the use of automated insect monitoring. Journal of Sensors, 2018: 3949415. http://doi.org/10.1155/2018/3949415.
Preti, M., Verheggen, F., and Angeli, S. 2020. Insect pest monitoring with camera-equipped traps: strengths and limitations. Journal of Pest Science, 94: 203–217. http://doi.org/10.1007/s10340-020-01309-4.
Roosjen, P.P.J., Kellenberger, B., Kooistra, L., Green, D.R., and Fahrentrapp, J. 2020. Deep learning for automated detection of Drosophila suzukii: potential for UAV-based monitoring. Pest Management Science, 76: 2994–3002. https://doi.org/10.1002/ps.5845.
Roy, D.B., Alison, J., August, T.A., Bélisle, M., Bjerge, K., Bowden, J.J., et al. 2024. Towards a standardized framework for AI-assisted, image-based monitoring of nocturnal insects. Philosophical Transactions of the Royal Society B, 379: 20230108. https://doi.org/10.1098/rstb.2023.0108.
Rubin, G.M. and Lewis, E.B. 2000. A brief history of Drosophila's contributions to genome research. Science, 287: 2216–2218.
Schneider, S., Taylor, G.W., Kremer, S.C., and Fryxell, J.M. 2023. Getting the bugs out of AI: advancing ecological research on arthropods through computer vision. Ecology Letters, 26: 1247–1258. https://doi.org/10.1111/ele.14239.
Switz, N.A., D'Ambrosio, M.V., and Fletcher, D.A. 2014. Low-cost mobile phone microscopy with a reversed mobile phone camera lens. PLOS One, 9: e95330. https://doi.org/10.1371/journal.pone.0095330.
Tannous, M., Stefanini, C., and Romano, D. 2023. A deep-learning-based detection approach for the identification of insect species of economic importance. Insects, 14: 148.
Uchida, S. 2013. Image processing and recognition for biological images. Development, Growth and Differentiation, 55: 523–549. https://doi.org/10.1111/dgd.12054.
Waithe, D., Rennert, P., Brostow, G., and Piper, M.D.W. 2015. QuantiFly: robust trainable software for automated Drosophila egg counting. PLOS One, 10: e0127659. http://doi.org/10.1371/journal.pone.0127659.
Yamamoto, S., Jaiswal, M., Charng, W.L., Gambin, T., Karaca, E., Mirzaa, G., et al. 2014. A Drosophila genetic resource of mutants to study mechanisms underlying human genetic diseases. Cell, 159: 200–214. http://doi.org/10.1016/j.cell.2014.09.002.
Yati, A. and Dey, S. 2011. FlyCounter: a simple software for counting large populations of small clumped objects in the laboratory. BioTechniques, 51: 348–349. http://doi.org/10.2144/000113753.
Zhai, R.G., Hiesinger, P.R., Koh, T.W., Verstreken, P., Schulze, K.L., Cao, Y., et al. 2003. Mapping Drosophila mutations with molecularly defined P element insertions. Proceedings of the National Academy of Sciences, 100: 10860–10865. https://doi.org/10.1073/pnas.1832753100.
Zhong, Y., Gao, J., Lei, Q., and Zhou, Y. 2018. A vision-based counting and recognition system for flying insects in intelligent agriculture. Sensors, 18: 1489. http://doi.org/10.3390/s18051489.
Zhu, C., Wang, J., Liu, H., and Mi, H. 2018. Insect identification and counting in stored grain: image processing approach and application embedded in smartphones. Mobile Information Systems, 2018: 5491706. http://doi.org/10.1155/2018/5491706.