Uncovering the Microbiota of Bagworm Metisa plana (Lepidoptera: Psychidae) in Oil Palm Plantations in Malaysia

Bagworm Metisa plana is one of the major pests in Malaysia’s oil palm plantation, with infestation resulting in huge economical loss. Currently, the microbial profile of the bagworm has yet to be study. Understanding the biology of the pest such as the bacterial community is crucial as bacteria associated with insects often provide benefits to the insect, giving the insect host a better chance of survival. Here, 16S amplicon sequencing was used to identify the bacteria community of M. plana. Additionally, two comparisons were made, the bacterial communities between two larval stages (early instar stage and late instar stage) from outbreak area; the bacterial communities of late instar stage larvae from non-outbreak between outbreak areas. From this study, it was found that the bacterial community of M. plana consisted of Proteobacteria, Actinobacteria, Bacterioidetes, Firmicutes and other minor phyla, with Proteobacteria being the most dominant phylum. Furthermore, bacterial genera of M. plana consisted of Pantoea, Curtobacterium, Pseudomonas, Massilia and other minor genera, with Pantoea being the most dominant. It was also found that the alpha and beta diversity in both comparisons were not significantly different. We present our data as a first insight towards the bacterial community of M. plana, paving a way towards understanding the biology of the bagworm M. plana.


INTRODUCTION
The Lepidoptera is a vastly diverse insect order, with many species considered as major pests of agricultural importance (González-Serrano et al. 2020). The Lepidopteran pest bagworm is the most serious and economically important pests in the oil palm plantations in Malaysia (Cheong & Tey 2012;Kamarudin & Wahid 2007;Kok et al. 2011;Salim et al. 2015;Wood 1968). The bagworm outbreak can result in a terrible yield loss which can translate into millions of Ringgit Malaysia (Malaysia's local currency) (Ahmad Ali et al. 2011;Salim et al. 2015). Of the common species of bagworm found in the oil plantations (Mahasena corbetti, Pteroma pendula and Metisa plana), the M. plana is the most serious leaf defoliator (Ahmad Ali et al. 2011;Sankaran 1970;Wood 1968). Although there are available and effective control measures (Salim et al. 2015;Salim & Hamid 2012;Wood 2019;Yap 2000) the outbreak and infestation of the bagworm is still an occurring problem due to the lack of understanding of the pests (Cheong & Tey 2012;Kok et al. 2011).
Huge ranges of microorganisms colonise the insects, from the largest of fungi to the smallest of virus. The microbiota composition of the insects differs greatly and are affected by different factors such as insect developmental stages, environments, and even diet (Chaturvedi et al. 2017;Hammer et al. 2014;Mereghetti et al. 2017;Voirol et al. 2018). Often times, these microorganisms provide various benefits to the wellbeing of the insect, but sometimes may be pathogenic (De Smet et al. 2018;Douglas 2015;Morimoto et al. 2019;Voirol et al. 2018). An example of benefits from insect-bacteria interaction is the acquisition of nutrients. Chewing insects that feed on leaves would not have enough nitrogen solely from their diet. This insufficient nitrogen obtained from the diet would be supplemented by bacterial symbionts which can fix nitrogen and convert it into appropriate nitrogen-containing compounds (Hansen et al. 2020;Nardi et al. 2002;Voirol et al. 2018). Some symbiotic bacteria could also protect the host against pathogens. In a separate study, Shao showed that the dominant symbiotic bacterium Enterococcus mundtii actively secretes bacteriocin against 187 bacterial invaders. This interaction protects the host from other invading bacteria and at the same time, provides the bacterium an advantage which contributed to its dominance (Shao et al. 2017).
The bacterial community of the M. plana bagworm to the best of the authors' knowledge has yet to be explored. The current study therefore aims at identifying and compare the bacterial community of the insect host. This knowledge can help to further understand the biology of the pest, and could potentially be used to improve on the integrated pest management methods such as using microbes as a biocontrol agent (Charles et al. 1996;Federici 2005;Köhl et al. 2019). Here, we used 16S rRNA amplicon sequencing to investigate the bacterial community of M. plana and to see whether there is any difference in the bacterial community: (1) between the early instar stage and late instar stage M. plana larvae from the outbreak area; and (2) between the late instar stage M. plana larvae from nonoutbreak area and outbreak area.

Samplings
The M. plana larvae of late instar stage was collected in the month of August 2020, from non-outbreak area located in Felda Jengka 7, Jengka, Pahang, Malaysia. M. plana larvae of both early instar stage (1st instar to 3rd instar) and late instar stage (4th instar to 6th instar) were collected in the month of September 2019 from outbreak area located in Felda Gunung Besout 02/03, Trolak, Perak, Malaysia. The instar stage of M. plana larvae was determined by the length and morphology of the case as described by Kok et al. (2011). The outbreak area is categorized by the persistent infestation of bagworm larvae of more than the economic threshold level (ETL), which is five larvae per frond (Salim et al. 2015).

Ethics Statements
This species is a pest and is not protected by law. Bagworm was declared a dangerous pest under the Malaysia Act 167, Plant Quarantine Act 1976 (Kamarudin et al. 2017). Sampling was performed with proper protective equipment to ensure no contamination from and to the bagworm samples.

Total DNA Extraction
Genomic DNA (gDNA) was extracted using Qiagen DNeasy Blood and Tissue Kit (Cat No. / ID: 69506) with slight modifications in 4 replicates for each group (late instar stage larvae from non-outbreak area, early instar stage and late instar stage larvae from outbreak area). For each replicate, 20 whole bagworms were removed from their bags and surface sterilised before being placed in 1.5 mL microcentrifuge tube before adding 180 µL of ATL buffer. The samples were then kept at -20°C for 30 min before being homogenised using micropipette tips. Twenty microlitre (20 µL) of proteinase K was added to the sample and mixed by vortexing before the samples were incubated at 56°C for 10 min. The samples were then vortexed for 15 sec before adding 200 µL of AL buffer. The samples were mixed by vortexing and incubated at 56°C for 10 min. Ice-cold absolute ethanol of 200 µL was added to the samples and mixed. The samples were centrifuged at 6,000 × g for 1 min and the supernatant were transferred to DNeasy Mini spin column. The spin columns were then centrifuged at 6,000 × g for 1 min. The spin columns were placed in a new 2 mL collection tubes and 500 µL of Buffer AW1 was added before centrifuging for 1 min at 6,000 × g. The spin columns were again placed in new 2 mL collection tubes and added with 500 µL of Buffer AW2 before centrifuging at 13, 200 × g for 8 min. The spin columns were placed in new 1.5 mL microcentrifuge tubes and 50 µL of Buffer AE was added directly to the spin columns' membranes. They were then incubated for 3 min at room temperature before centrifuging at 6,000 × g for 1 min. The eluates were pipetted back into the spin column's membrane and incubated for 3 min before centrifuging at 6,000 × g for 1 min. Gel electrophoresis was performed and the results were visualised under ultraviolet light.

Library Preparation and 16S Amplicon Sequencing
The extracted gDNA were sent to the sequencing service provider, Apical Scientific Sdn Bhd (https://apicalscientific.com/) for library preparation and sequencing. V3-V4 variable regions of the 16S ribosomal RNA gene was amplified using the forward primer (5' CCTACGGGNGGCWGCAG) and reverse primer (5' GACTACHVGGGTATCTAATCC). After passing the quality check, the V3-V4 variable region were amplified using locusspecific sequence primers with overhang adapters (forward overhang 5' TCGTCGGCAGCGTCAGATGTGTATAAGAGACAG-[locus-specific sequence]; reverse overhang 5' GTCTCGTGGGCTCGGAGATGTGTATAAGAGACAG-[locus-specific sequence]). All the PCR reactions were carried out with Q5® Hot Start High-Fidelity 2X Master Mix.

Sequence analysis
The analysis was done using Mothur software (v.1.44.3) (Schloss et al. 2009) with adaptations from MiSeq standard operating procedure (SOP) (https://mothur. org/wiki/miseq_sop/) (Kozich et al. 2013). The forward reads and reverse reads were merged, and primers were removed. Sequences that were longer than 440 base pair (bp), but shorter than 406 bp, and with any ambiguities were removed. Duplicates sequences and sequences that only appeared once were also removed. A customized reference targeting the V3-V4 region of the 16S rRNA gene was made from SILVA Seed v132 (Quast et al. 2013). Unique sequences were then aligned to the customised refence. Sequences that start before position 2 and ends after 17012, with homopolymer more than 8 as well as a length shorter than 406 bp were removed before removing gap characters. The sequences were pre-clustered, and chimeras were removed. The remaining sequences were classified to SILVA reference database using Bayesian classifier at 80% confidence threshold. Sequences that were classified into "Chloroplast", "Mitochondria", "Unknown", "Archaea" and "Eukaryote" were removed. The sequences with similarity of 97% were then clustered into operational taxonomical units (OTU).

Bacterial community analysis
As the samples showed unequal sampling depth, we investigated the alpha and beta diversity of the bacterial communities using rarefied OTU tables. To access the alpha-diversity, we calculated the Shannon diversity index, observed species richness (Sobs) and Shannon evenness index. Wilcoxon test was performed using to see whether the alpha diversity as well as beta-diversity were significantly different. Principle Coordinate Analysis (PCoA) was plotted to visualise the cluster separation of the bacterial community's structure. Analysis of Molecular Variance (AMOVA) was performed to see whether the centre of the cluster representing each group were significantly different. We performed Homogeneity of Molecular Variance (HOMOVA) to see whether the variation in each group were significantly different from each other. All statistical tests were performed with significance at adjusted p-value at 0.05.

Overview of the Bacterial Community in M. plana larvae
From the results of the study, it was observed that the bacterial community of M. plana was dominated by Proteobacteria, followed by Actinobacteria, Bacterioidetes, Firmicutes and other phyla which constitute a minor percentage of the bacterial community (Appendix A). At the bacterial family level, the most dominant family was the Enterobacteriaceae, followed by Microbacteriaceae, Burkholderiaceae, Pseudomonadaceae, Sphingobacteriaceae and other bacterial families, constituting a minor percentage in the bacterial community (Appendix B). At the genera level, the Pantoea genus was the dominant genus, followed by unclassified genus in the Enterobacteriaceae family, Curtobacterium, Pseudomonas, Massilia, and other minor genera (Appendix C).

Comparison Between Early Instar and Late Instar Stage
To obtain the bacterial community composition of the M. plana larvae at early instar and late instar stage, the V3 and V4 region of the bacterial 16S rRNA gene was amplified. A total of 2,738,727 sequences were obtained from 8 samples. After quality checks and removing unwanted sequences, a total of 385,297 sequences with 3,757 unique sequences were obtained. The sequences were then clustered at 97% similarity into 959 Operational Taxonomical Units (OTUs). The rarefaction curve did not completely plateau (Fig. 1), suggesting the sequencing depth was insufficient to capture the entire bacterial community. The bulk of the bacteria were of Proteobacteria (82.36%), Actinobacteria (14.8%), Bacteroidetes (1.48%), Firmicutes (1.01%) and remaining individual phyla consisting of less than 1% ( Fig. 2a and Appendix A). Wilcoxon test showed no significant difference in relative abundance in any of the bacterial phyla between the two development stages. At family level, the Enterobacteriaceae was the dominant family (75.37%), followed by Microbacteriaceae (13.63%), Burkholderiaceae (3.4%), Pseudomonadaceae (2.56%), Sphingobacteriaceae (1.09%) and the remaining families individually having less than 1% relative abundance ( Fig. 2b and Appendix B). Result showed no significant difference in relative abundance between the bacterial families (Appendix B). At genera level, the bacterial community was dominated by Pantoea with 60.57% average relative abundance, followed by unclassified Enterobacteriaceae, Curtobacterium, Pseudomonas, Massilia and remaining genera individually having less than 1% relative abundance ( Fig. 2c and Appendix C). After performing Wilcoxon test, there were no significantly different bacterial genera (Appendix C). Shannon diversity index, observed species richness and Shannon evenness were calculated to estimate the diversity of the bacterial community, the number of species and the evenness of the bacterial community (Table 1). However, result showed that the Shannon diversity index, sobs and evenness between the early instar stage and late instar stage were all not significantly different. The PCoA was ordinated to visualise the cluster separation of the bacterial community. However, the ordination ( Fig. 3) did not show clear separation between the early instar stage and late instar stage. AMOVA test was done on the samples to test whether the cluster of the early instar and late instar stage was significantly different. The result (Table 2) revealed that the observed separation in the early instar and late instar stage was not significantly different.  We also wanted to know whether the variation of the bacterial community in the early instar stage larvae was significantly different from that of the late instar stage. This was done by performing HOMOVA with the result (Table 3) showing no significant difference in the variation with the early instar stage and late instar stage.

Comparison Between Non-Outbreak Area and Outbreak Area
The V3 and V4 region of the bacterial 16S rRNA gene was amplified using late instar stage larvae from the non-outbreak area and outbreak area. A total of 2,848,936 sequences were obtained from eight samples. After quality checks and removing unwanted sequences, a total of 271,821 sequences with 2,471 unique sequences were obtained. The sequences were then clustered at 97% similarity into 796 Operational Taxonomical Units (OTUs). The rarefaction curve did not plateau (Fig. 4), suggesting the sequencing depth was insufficient to capture the entire bacterial community. The most abundant phyla consisted of Proteobacteria (51.30%) followed by Actinobacteria (45.22%), Bacteroidetes (1.98%) and the rest of the phyla individually consisting of less than 1% in relative abundance ( Fig. 5a and Appendix D). After performing Wilcoxon test, we observed no significantly different bacterial phyla (Appendix D). The most abundant families consisted of Enterobacteriaceae (43.54%), followed by Microbacteriaceae (41.67%), Pseudomonadaceae (4.18%), Burkholderiaceae (2.4%), Sphingobacteriaceae (1.74%), Kineosporiaceae (1.24%) and other families individually having less than 1% relative abundance pooled as "Others" ( Fig. 5b and Appendix E). We again compared the relative abundance of families between the two areas and found that there were no significantly difference bacterial families (Appendix E). The most dominant bacterial genera that can be found were the Curtobacterium (40.24%) and Pantoea (37.29%) ( Fig. 5c and Appendix F) but statistical test showed no significantly different bacterial genera. Shannon diversity index, observed species richness and Shannon evenness were calculated but result showed that the Shannon diversity index, sobs and evenness between the early instar stage and late instar stage were all not significantly different (Table 4). From the PCoA (Fig. 6), we observed a clear separation between the samples from non-outbreak area and outbreak area. AMOVA test (Table 5) showed separation between the two areas was significantly different. This meant that the bacterial community structure was different from one another.  The HOMOVA test (Table 6) showed that there was no significant difference in the variation of bacterial community between the two areas. The non-outbreak area has a higher variation (0.063) compared to the outbreak area (0.027).

DISCUSSION
At present, the microbiota of M. plana has yet to be uncovered. From the results, it was observed that the microbiota of M. plana was diverse but dominated by the phylum Proteobacteria and Actinobacteria, with a dominance of more than 97%. Nonetheless, the dominant phyla and other minor phyla such as Actinobacteria, Bacterioidetes and Firmicutes could be found in other lepidopteran such as silkworm Bombyx mori (Chen et al. 2018), oriental fruit moth Grapholita molesta (Yuan et al. 2021), cotton leafworm Spodoptera littoralis (Chen et al. 2016) and many other lepidopteran species compiled by (Voirol et al. 2018;Snyman et al. 2016). The presence of Enterobacteriaceae, Microbacteriaceae, Burkholderiaceae, Pseudomonadaceae and Sphingobacteriaceae were also observed in different lepidopteran studies (Jones et al. 2019;Robinson et al. 2010;Xia et al. 2013;Voirol et al. 2018). In terms of bacterial genus, Pantoea, Curtobacterium, Pseudomonas and Massilia genera found in the study were also found in other lepidopteran species (Robinson et al. 2010;Chen et al. 2018;Voirol et al. 2018;Jones et al. 2019).
Focusing on the most dominant genus found in this study, the Pantoea, could the dominance of this genus have any effect on the host M. plana? From literatures, a wide range of insect were observed to have relationship with different Pantoae species (Akhoundi et al. 2012;Aly et al. 2008;Asis & Adachi 2004;Azad et al. 2000;Walterson & Stavrinides 2015), with some relationship being mutualistic or commensalistic (Pinto-Tomás et al. 2009;Maccollom et al. 2009;Walterson & Stavrinides 2015). It was also reported that the P. agglomerans with another bacteria Klebsiella pneumoniae were able to mend the gut of irradiated Mediterranean fruit fly Ceratitis capitata and influencing the fitness of fruit fly 197 fitness in a positive way (Niyazi et al. 2004;Maccollom et al. 2009). Furthermore, it was reported that P. agglomerans could fix atmospheric nitrogen (Vorwerk et al. 2007;Walterson & Stavrinides 2015). As previously mentioned, chewing insects that feed on leaves such as bagworm could not depend solely on their diet to get enough nitrogen (Hansen et al. 2020;Nardi et al. 2002;Voirol et al. 2018) and this nitrogen deficiency may be supplemented by Pantoae which can fix nitrogen and convert it into appropriate nitrogen-containing compounds. These examples of the benefits of Pantoea may have helped the bagworm to survive in the oil palm plantations.
Shifting the focus onto the next abundant genus, the Curtobacterium, it is said that the habitat of the Gram-positive, obligate aerobic chemoorganotrophs (Finn et al. 2014) mainly associated with plants and notably the phyllosphere (Behrendt et al. 2002;Chase et al. 2016;Komagata et al. 1965). In the genus, only the C. flaccumfaciens is linked to plant pathogenesis, while there are indications of other ecological roles performed by the other species of the genus such as endophytic symbionts (Bulgari et al. 2009), stimulate plant defence responses (Bulgari et al. 2011), reduce plant disease symptoms (Lacava et al. 2007), and even promote plant growth (Sturz et al. 1997). However, as the bacterium is mainly associated with plants, we believed that the bacterium does not contribute to the survivability of the bagworm and the bagworm merely obtain the bacterium from their diet without any benefits although further research is needed to prove this.
Nevertheless, there could also be a possibility that the larvae obtained these bacteria solely from their environment or diet but provided little or no benefit. Phalnikar et al. (2018) observed that the most common and abundant OTUs in butterflies were also common in different insect-associated microbiomes. This led them to hypothesise that the insect-bacterial co-occurrence may indicate evolved functional relationships, or it could merely act as ecological or dietary roles. The latter hypothesis might be due to absence or presence of very little resident bacteria found in caterpillar such as in a study done by Hammer et al. (2017) and is in agreement with Phalnikar et al. (2018) where they found a substantial overlap of bacterial communities from larval and dietary resources which indicated that bacterial communities in larval are mainly influenced by passive procurement of bacteria from dietary resources (Phalnikar et al. 2018). Furthermore, a study showed that insects that feed on foliar obtained their microbiomes from the soil (Hannula et al. 2019). The authors in the mentioned study stated that the microbiome of the caterpillar that fed on intact plant had a more distinct microbiome and the microbiome resembled the soil microbiomes. In another study, (Gomes et al. 2020) found that the caterpillar's bacterial communities resembled the local soil microbiomes in which the host plant was growing. Nevertheless, it is important to note that the microbiome varies greatly across Lepidopteran species and even within species (Voirol et al. 2018). As the entire larvae were sampled, there was no trace as to where exactly these bacteria reside, although some studies had found that the bacterial communities from the whole insect can be similar to the bacterial communities sampled from the gut (Hammer et al. 2014;Voirol et al. 2018;Sabree et al. 2012;Sudakaran et al. 2012). Further studies to compare the microbiota of oil palm leaves and the bagworm microbiota is recommended in order to confirm if the bacteria found in this study is resident bacteria of bagworm.
In this study, we compared the bacterial community of bagworm of two developmental stages in outbreak area, and the bacterial community of bagworm from different areas. However, we did not observe any significant difference in the alpha and beta diversity for both comparisons. This phenomenon was also observed in some Lepidopteran species such as Plodia interpunctella and Plutella xylostellai, where their bacterial community did not change across developmental stages (Mereghetti et al. 2017;Ng et al. 2018;Voirol et al. 2018;Xia et al. 2018). The similarity in the bacterial community between developmental stage could also be attributed to the larvae having the same host plant (oil palm tree Elaeis guineensis), as different diet might influence bacterial communities in different ways such as promoting differential bacterial growth (Staudacher et al. 2016;Vorholt 2012;Yang et al. 2001). In regard to the comparison between areas where we observed no significant difference in alpha and beta diversity, a study found high consistency of the most dominant bacterial amplicon sequence variant (ASV) were detected in all the monophagous caterpillar Tyria jacobaeae across habitats regardless of their size (Gomes et al. 2020). In their study, Gomes et al. (2020) suggested that the fairly stable internal bacterial composition is possibly affected by the physiology of the caterpillar or an adaption to the exclusive diet of ragwort plants as well as phytochemicals. Following their observation, we could hypothesise that the same situation could have happen to the bacterial community of the bagworm from different areas.
This study provides a first insight to the bacterial community of the M. plana larvae and the information here may be of use for future management of the bagworm such as the use of biocontrol to control the outbreak. Nonetheless, it is still at the stage where more research is needed such as determining whether the bagworm microbiota was obtained from their diet or influenced by soil microbiota, which could be important if we wish to use biocontrol to target the resident microbiota of bagworm larvae. Furthermore, a metatranscriptomic analysis on the bacteria of the bagworm allows us to observe the gene expression profile of the complex microbial communities. This would allow us to see how the microbiome respond in the bagworm.

CONFLICT OF INTEREST
The authors declare no competing interests.

FUNDINGS
This study was not funded by any specific grant.

AUTHOR CONTRIBUTIONS
AT contributed to the conceptualisation, methodology, formal analysis, investigation, visualisation, writing (original draft), and writing (review and editing) of the project. CMRZA contributed to the conceptualisation, methodology and investigation, resources of the project. NHH contributed to the conceptualisation, methodology and investigation, resources of the project. GA contributed to the conceptualisation, methodology, resources, writing (review and editing), supervision, project administration, and funding acquisition of the research. HS contributed to the conceptualisation, methodology, resources, writing (review and editing), supervision, project administration, and funding acquisition of the research.

AVAILABILITY OF DATA AND MATERIALS
The datasets generated from next-generation sequencing are available in the NCBI Sequence Read Archive (SRA) repository, BioProject ID: PRJNA718136.

APPENDICES Appendix A
Bacterial phyla of early instar stage and late instar stage of M. plana larvae from the outbreak area.