Post-genomic analysis of Monosporascus cannonballus and Macrophomina phaseolina - potential target selection

Monosporascus cannonballus Pollack & Uecker and Macrophomina phaseolina Tassi (Goid) are phytopathogenic fungi responsible for causing "root rot and vine decline" in melon (Cucumis melo L.). Currently, cultural management practices are predominantly employed to control these pathogens, as the use of pesticides not only has detrimental environmental impacts but has also proven ineffective against them. These fungi have already undergone molecular characterization, and their genomes are now available, enabling the targeted search for protein targets. Therefore, this study aimed to identify novel target proteins that can serve as a foundation for the development of fungicides for effectively managing these pathogens. The genomes of M. cannonballus (assembly ASM415492v1) and M. phaseolina (assembly ASM2087553v1) were subjected to comprehensive analysis, filtration, and comparison. The proteomes of both fungi were clustered based on functional criteria, including putative and hypothetical functions, cell localization, and function-structure relationships. The selection process for homologs in the fungal genomes included a structural search. In the case of M. cannonballus, a total of 17,518 proteins were re-annotated, and among them, 13 candidate targets were identified. As for M. phaseolina, 30,226 initial proteins were analyzed, leading to the identification of 10 potential target proteins. This study thus provides new insights into the molecular functions of these potential targets; experimental validation of inhibitors against them should further expand knowledge in this area.


Introduction
Developing effective, environmentally friendly, low-cost pesticides to meet the demands of a growing global population remains a challenge. To avoid fungicide resistance and minimize environmental impact, chemical disease control methods that are safe for humans have been pursued. Consequently, extensive research has focused on understanding the mode of action, resistance risks, specificity, and target knowledge (Umetsu & Shirai, 2020).
Genomics has emerged as a crucial high-throughput approach for investigating plant-pathogen interactions. These advancements have facilitated the design of new inhibitors and assays, enabling the targeting of novel disease factors for potential chemical control (Martins et al., 2016). Additionally, genomic analysis has proven effective and cost-efficient in studying fungi such as Fusarium graminearum (Atasanova, Bresso, Maigret, Martins, & Richard-Forget, 2022). The availability of complete pathogen genomes, combined with bioinformatic analysis and structure-based drug design, has facilitated a rational search for protein targets. Structural criteria-based selection of potential targets has been successfully applied to various plant pathogens (Bresso et al., 2016; Martins et al., 2016).
Monosporascus cannonballus Pollack and Uecker (1974), and Macrophomina phaseolina Tassi (Goid) are ascomycete phytopathogenic fungi of great interest in the northeast region of Brazil, as they cause severe root diseases in melon (Cucumis melo L.) (Sales Júnior et al., 2012). M. cannonballus is a major pathogenic agent associated with Monosporascus root rot and vine decline (MRRVD) in melon, affecting melon production in 22 countries and causing significant losses (Sales Júnior, Negreiros, Beltrán, & Armengol, 2018; Yan, Zang, Huang, & Wang, 2016; Markakis et al., 2018; Negreiros, Sales Júnior, Rodrigues, León, & Armengol, 2019). This thermophilic fungus, with an optimal growth temperature ranging from 25 to 35 °C, is well-adapted to semi-arid and arid conditions, thus thriving in the northeast region of Brazil (Sales Júnior et al., 2018). Despite the ongoing challenges in controlling this pathogen, there are currently no registered products for its management, and only limited reports describe fludioxonil and fluazinam inhibiting M. cannonballus mycelial growth (Sales Júnior et al., 2018; Cavalcante et al., 2020; Tavares et al., 2023). Integrated management combining various control techniques appears to be the best approach for disease control (Sales Júnior et al., 2018). However, the mode of action of fludioxonil and fluazinam remains only partially understood, and the emergence of resistance is a looming concern (Jampilek, 2016; Bersching & Jacob, 2021). M. cannonballus causes substantial economic losses by infecting plant roots, primarily secondary and tertiary roots, with secondary symptoms appearing towards the end of the growth cycle, including root necrosis and small root lesions.
Therefore, the development of novel compounds that can effectively target plant roots is crucial to improve disease control. M. phaseolina, the causal agent of gray stem rot in various plant species, infects over 500 botanical hosts, including economically important crops such as beans, cotton, sorghum, soybeans, beets, peanuts, and melons (Sales Júnior et al., 2020). It causes damage to roots, stems, seedlings, and seeds, employing microsclerotia as resistant structures that allow long-term survival in the soil and serve as the primary inoculum source. The losses incurred include root rot, collar rot, damping-off in seedlings, and seed infections (Gupta, Sharma, & Ramteke, 2012). While it is distributed in different climatic zones worldwide, its incidence is more pronounced in tropical and subtropical regions. In semi-arid regions like the productive melon fields of Northeast Brazil, the pathogen's impact is amplified during periods of drought, water stress, and high temperatures, which facilitate its survival and development (Radwan, Rouhana, Hartman, & Korban, 2014).
Although there are 12 registered products to control M. phaseolina, ranging from microbiological fungicides like Trichoderma afroharzianum strain Th2RI99, which primarily acts through antibiosis, to fungicides like fludioxonil, with a mechanism of action that is still being elucidated, the available options include highly hazardous products for the environment and human health. Therefore, developing efficient and sustainable methods of disease control is of utmost importance (Melo et al., 2021).
Both M. cannonballus and M. phaseolina have their genomes available in public databases. The University of New Mexico (USA) sequenced isolates of M. cannonballus, with the summary of this sequencing published in 2020, while the genome of M. phaseolina was first sequenced by a research group in Bangladesh in 2012 (Islam et al., 2012). Exploiting this genomic information, post-genomic studies can be conducted to identify target genes for the development of new drugs. Hence, this study aimed to identify, annotate, and select potential targets for fungicide development to effectively control these pathogens.

Material and methods
In silico genomic analysis is a recent strategy that uses genomic and transcriptomic data to explore genes and their interactions with specific proteins, in search of answers to biological questions and identification of potential targets for new compounds. Molecular targets are often determined through protein alignments against various databases, considering specific characteristics (Martins et al., 2016).
The genomes of M. cannonballus (assembly ASM415492v1) and M. phaseolina (assembly ASM2087553v1) were downloaded from GenBank to generate a comprehensive sequence dataset for gene annotation and protein selection. Following several annotation and functional attribution steps, the proteins were clustered based on functional criteria, including putative and hypothetical proteins, nuclear proteins, membrane proteins, receptors and transporters, actins, mitochondrial proteins, and proteins involved in DNA and RNA interactions. Subsequently, a structural analysis was conducted to identify homologs in the Protein Data Bank (PDB) database, facilitating homology modeling.
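The functional clustering described above can be illustrated with a minimal Python sketch. It is not the pipeline used in this study: the category names follow the text, but the keyword lists, the function name `classify`, and the whole-word matching rule are assumptions made for illustration only.

```python
import re

# Category names follow the text; the keyword lists are illustrative assumptions.
CATEGORY_KEYWORDS = {
    "hypothetical/putative": {"hypothetical", "putative"},
    "nuclear": {"nuclear"},
    "membrane": {"membrane"},
    "receptor/transporter": {"receptor", "transporter"},
    "actin": {"actin"},
    "mitochondrial": {"mitochondrial"},
    "DNA/RNA interaction": {"dna", "rna"},
}


def classify(description: str) -> str:
    """Return the first category with a keyword that appears as a whole word."""
    words = set(re.findall(r"[a-z0-9]+", description.lower()))
    for category, keywords in CATEGORY_KEYWORDS.items():
        if words & keywords:
            return category
    return "other"


if __name__ == "__main__":
    for text in ("Hypothetical protein", "Mitochondrial ferrochelatase", "Mevalonate kinase"):
        print(text, "->", classify(text))
```

Whole-word matching is used so that a keyword such as "actin" does not match unrelated words like "interacting"; a real annotation workflow would rely on curated ontologies rather than keywords.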

Dataset configuration
To curate a protein dataset and identify potential fungicidal targets from the genomes of M. cannonballus and M. phaseolina, a systematic multistep selection process was conducted, as depicted in Figure 1. The initial step involved grouping and compiling the translated proteins from both genomes, resulting in the creation of a comprehensive dataset. Protein function attribution and genome annotation were performed using predefined parameters outlined in Table 1. To ensure comprehensive annotation, all unigenes were subjected to gene annotation against various databases, including non-redundant GenBank and Reference Genes, following the approach described by Martins et al. (2016).

Target selection strategy
The systematic criteria for target selection were defined as follows: i) protein annotation, phenotypic description, and expression during plant infection; ii) discarding proteins with low structural similarity to the Protein Data Bank (PDB); iii) prediction of cell localization and accessibility to chemical compounds; iv) considering the number of gene copies in the genome and a molecular size between 400 and 600 amino acids; and v) ensuring the absence of orthologs in non-target organisms such as insects, plants, and humans.
The initial step involved protein annotation and phenotypic characterization obtained from the Pathogen Host Interactions database (PHI-base). Redundancy was eliminated in the subsequent stage, followed by filtering for proteins localized in the cytoplasm and accessible to chemicals. A BLAST search against the PDB was conducted for candidate selection. Manual curation was performed in the third step to identify genes with one or two copies in the genome and a protein size ranging from 400 to 600 amino acids. Finally, proteins with orthologs in non-target species were filtered out using a sequence identity threshold of above 60% (Martins et al., 2016).
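The selection criteria above can be expressed as a single filter. The Python sketch below is an illustration under stated assumptions, not the procedure actually run: the `Candidate` record, its field names, and the function `passes_filters` are hypothetical, and the 60% cutoff is interpreted here as "reject when identity to a non-target ortholog exceeds 60%".

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    """Hypothetical record holding the properties used by the criteria."""
    name: str
    length: int                    # protein length in amino acids
    gene_copies: int               # copies of the gene in the genome
    has_pdb_homolog: bool          # BLAST hit against the PDB
    cytoplasmic: bool              # predicted cytoplasmic localization
    max_nontarget_identity: float  # best % identity to a non-target ortholog


def passes_filters(c: Candidate, min_len: int = 400, max_len: int = 600,
                   max_copies: int = 2, identity_cutoff: float = 60.0) -> bool:
    """Apply the structural, localization, size, copy-number, and ortholog criteria."""
    return (
        c.has_pdb_homolog
        and c.cytoplasmic
        and min_len <= c.length <= max_len
        and 1 <= c.gene_copies <= max_copies
        and c.max_nontarget_identity <= identity_cutoff
    )


if __name__ == "__main__":
    # Toy values for illustration only; they are not data from this study.
    toy = Candidate("toy_protein", 450, 1, True, True, 35.0)
    print(toy.name, passes_filters(toy))
```

Keeping the thresholds as keyword arguments makes it easy to test how sensitive the final candidate list is to each cutoff.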

Molecular modeling of target proteins
In the absence of experimentally resolved 3D structures, computational methods were employed to predict 3D protein models and obtain information regarding protein structure and functions. To generate accurate 3D models of the selected targets, we used the Modeller program hosted on the SWISS-MODEL server (https://swissmodel.expasy.org/). Once the most suitable template for comparative modeling of a target protein was identified, a multiple sequence alignment was conducted using standard parameters to confirm sequence similarity and validate the conservation of structural features (Martins et al., 2016).
Homology modeling, a specific technique within comparative modeling, involves several steps. These steps include: i) identifying template proteins to serve as structural references, ii) aligning the sequences of the target protein and the template proteins, iii) copying coordinates for confidently aligned regions, iv) constructing coordinates for missing atoms in the target structure, and v) refining the model and assessing its quality (Bresso et al., 2016).
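Step ii, the target-template alignment, is usually summarized by the percent sequence identity, which gives a first indication of how reliable a homology model is likely to be. The Python sketch below shows one common way to compute it from two already aligned sequences; the function name `percent_identity` and the choice to ignore gapped columns are assumptions, and the toy sequences are not from this study.

```python
def percent_identity(aligned_target: str, aligned_template: str) -> float:
    """Percent identity over alignment columns where neither sequence has a gap."""
    if len(aligned_target) != len(aligned_template):
        raise ValueError("aligned sequences must have the same length")
    # Keep only columns in which both sequences carry a residue.
    columns = [
        (a, b)
        for a, b in zip(aligned_target.upper(), aligned_template.upper())
        if a != "-" and b != "-"
    ]
    if not columns:
        return 0.0
    matches = sum(1 for a, b in columns if a == b)
    return 100.0 * matches / len(columns)


if __name__ == "__main__":
    # Toy aligned sequences for illustration only.
    print(round(percent_identity("MKTAYIAK", "MRTAY-AK"), 1))
```

Other conventions divide by the full alignment length or by the length of the shorter sequence, so the denominator should always be reported together with the value.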

Target selection from Monosporascus cannonballus genome
Through automatic reannotation of the M. cannonballus genome, a total of 17,518 proteins were identified (Figure 2). Among these proteins, the main gene families included Protein kinase-like, Alpha/Beta hydrolases, MFS transport proteins, the NAD(P)-binding domain superfamily, and the triphosphate hydrolase superfamily. Additionally, various other protein families corresponding to known metabolic pathways and genes were also identified. In a recent study, Robinson, Natvig, and Chain (2020) conducted comparative genomic analyses and explored functional gene content and synteny within Monosporascus isolates in a search for genes associated with fungal-plant interactions and genomic regions with a lack of synteny between Monosporascus variants. Their annotation revealed genetic content similar to other Sordariomycetes and Xylariales members, regardless of genome size. A similar lack of information was observed in our study, necessitating genome reannotation to facilitate target selection from the fungus genome. The available genome dataset of M. cannonballus (Robinson et al., 2020) has posed challenges in terms of sequencing and annotation. Robinson et al. (2020) also observed a significant presence of bacterial contigs from Ralstonia pickettii in the predicted gene assemblies, possibly indicating endosymbionts. This observation aligns with biological and ecological studies suggesting the involvement of bacterial and actinomycete components in inducing the germination of M. cannonballus ascospores (Sales Júnior et al., 2018). New species belonging to the Monosporascus genus, which represents a group of significant plant pathogens, have been reported to be widely distributed in natural arid ecosystems. Among the nine described species, M. cannonballus and M. eutypoides specifically infect Cucurbitaceae roots in agricultural environments. In Brazil, recent surveys have also described five new Monosporascus species (M. brasiliensis, M. caatinguensis, M. mossoroensis, M. nordestinus, and M. semiaridus) (Negreiros et al., 2019). This biological diversity within this group underscores the need for further genomic investigations.
From the initial set of 17,518 proteins, the screening criteria for molecular target selection narrowed the set down to 7,630 candidate protein sequences. Further searches for known structures resulted in the identification of 13 candidate targets from the M. cannonballus genome (Table 2). Table 2 provides information on these 13 candidate targets, including the sequence codes for target search and identification, their functions, alignment parameters (e-value and score), segment similarity, and references to research conducted on these target proteins outside the field of phytopathogenic fungi. The first target listed, Thioredoxin reductase Trr1/Trr2, plays distinct roles in the redox system involving cysteine synthesis and host infection (Dankai, Pongpom, & Vanittanakom, 2018). The second target, Mannose-1-phosphate guanylyltransferase, affects cell growth and morphology by altering cell membrane permeability (Taj et al., 2022). The third target, Mitochondrial ferrochelatase, is associated with the inner mitochondrial membrane and has an active site facing the matrix (Ferreira et al., 1995). The fourth target, Sterol 24-C-methyltransferase, is involved in ergosterol biosynthesis and homeostasis (Nes et al., 2018). The fifth target, homoaconitase LysF, leads to attenuated virulence at a low dose (Liebmann et al., 2004). The sixth target, 2-alpha-mannosyltransferase, catalyzes the transfer of an alpha-D-mannosyl residue from GDP-mannose to a lipid-linked oligosaccharide (Schutzbach, Springfield, & Jensen, 1980). The seventh target, Phosphatidate cytidylyltransferase, regulates membrane phospholipid synthesis via phosphatidylserine synthase (Carman & Han, 2018). The eighth target, UDP-N-acetyl-glucosamine-1-P transferase Alg7, catalyzes the initial step of N-glycosylation (Hernández-Elvira et al., 2019). The ninth target, Stearic acid desaturase (SdeA), negatively regulates thermotolerance by modifying saturated fatty acid levels (Zhan et al., 2021). The tenth target, Mevalonate kinase, is involved in isoprenoid biosynthesis (Hogenboom et al., 2004).

Target selection from Macrophomina phaseolina
The selection process from the M. phaseolina genome resulted in the identification of ten target proteins (Table 3). Out of the initial 30,226 proteins, we filtered for sequences larger than 400 amino acids, resulting in 12,745 proteins. Further filtering excluded hypothetical, putative, nuclear, membrane, receptor/transport, actin, mitochondrial, DNA, and RNA binding proteins, leaving us with 3,897 sequences. These sequences were then searched against the Protein Data Bank using BLASTP, resulting in 243 sequences with similar structures. However, only 10 of these proteins were considered potential targets based on factors such as the number of copies and homology with non-target organisms.
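The selectivity of each filtering stage can be read off the counts reported above. The short Python sketch below computes, for every stage, the share of sequences retained relative to the previous stage and to the initial set; the counts come from the text, while the step labels and the function name `funnel_report` are assumptions introduced for illustration.

```python
def funnel_report(steps):
    """Return (label, count, % of previous step, % of initial set) for each step."""
    initial = steps[0][1]
    previous = initial
    rows = []
    for label, count in steps:
        rows.append((label, count, 100.0 * count / previous, 100.0 * count / initial))
        previous = count
    return rows


# Counts as reported in the text for M. phaseolina; labels are paraphrased.
M_PHASEOLINA_STEPS = [
    ("initial proteins", 30226),
    ("larger than 400 amino acids", 12745),
    ("after functional exclusion", 3897),
    ("similar structures in the PDB", 243),
    ("selected targets", 10),
]

ROWS = funnel_report(M_PHASEOLINA_STEPS)

if __name__ == "__main__":
    for label, count, pct_prev, pct_init in ROWS:
        print(f"{label:32s} {count:6d} {pct_prev:7.2f}% {pct_init:7.3f}%")
```

The table this prints shows that the structural search against the PDB and the final copy-number and ortholog checks are the most restrictive stages of the funnel.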
Table 3 provides a list of the selected proteins as potential molecular targets from the M. phaseolina genome. All candidate proteins have known functions and are essential for the fungus's life cycle. Designing specific inhibitors against these targets can interfere with crucial cellular processes and impede the growth of the fungus in plants. Notable proteins such as thioredoxin reductase, mevalonate kinase, and glutamyl-tRNA synthetase were identified. Molecular homology modeling revealed the globular structures of these enzymes. The genome of M. phaseolina was sequenced and assembled, estimated to be approximately 49 Mb in size, and organized into 15 super-scaffolds, with a coverage of 92.83%. The genome annotation predicted a total of 14,249 open reading frames (ORFs), of which 9,934 proteins were validated through transcriptomic data (Islam et al., 2012). The annotated genome revealed an abundance of oxidases, peroxidases, and hydrolytic enzymes, indicating the secretion of proteins involved in the degradation of cell wall polysaccharides and lignocellulosic materials during host tissue infection. To counteract plant defense responses, M. phaseolina encodes a significant number of P450s, MFS-like membrane transporters, glycosidases, transposases, and secondary metabolites compared to other sequenced ascomycete species (Islam et al., 2012). Notably, the M. phaseolina genome exhibits a distinct set of carbohydrate esterases (CE).
In contrast to M. cannonballus, the genomic data of M. phaseolina is well-structured. This is likely due to the broader host range of M. phaseolina, which characterizes it as a polyphagous fungus capable of infecting over 500 plant species, including economically important crops like soybeans (Glycine max L.) and corn (Zea mays L.) (Ishikawa, Ribeiro, Oliveira, Almeida, & Balbi-Peña, 2018).
More recently, high-throughput sequencing of M. phaseolina isolates revealed 22 contigs with an N50 of 4,257,441 bp, and 99.3% completeness in terms of reference and universal single-copy orthologs, encompassing 14,471 genes (Purushotham et al., 2020). This information is valuable for post-genomic analysis and facilitates targeted searches for genes present in multiple or specific fungal genomes. It expands the potential for fungicide design against a wide range of organisms or enables the development of a "magic bullet" specifically targeting a particular pathogen. Additionally, comparing sequences with gene databases of non-target organisms helps identify potential unwanted or toxic effects (Martins et al., 2016).
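For readers unfamiliar with the N50 statistic quoted above, it is the contig length at which contigs of that length or longer contain at least half of the assembled bases. The Python sketch below implements this standard definition; the function name `n50` is an assumption, and the toy contig lengths are hypothetical, not the lengths of the M. phaseolina assembly.

```python
def n50(contig_lengths):
    """Length L such that contigs of length >= L hold at least half of all bases."""
    if not contig_lengths:
        raise ValueError("at least one contig length is required")
    total = sum(contig_lengths)
    running = 0
    # Walk from the longest contig down until half of the assembly is covered.
    for length in sorted(contig_lengths, reverse=True):
        running += length
        if 2 * running >= total:
            return length


if __name__ == "__main__":
    # Hypothetical contig lengths in bp, for illustration only.
    print(n50([2, 3, 4, 5, 6]))
```

A high N50 relative to the genome size, as reported for this assembly, indicates that most of the sequence lies in a few long contigs.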

Molecular modeling of the targets
The 3D structures of the selected M. cannonballus and M. phaseolina targets were predicted to facilitate further analysis of their active sites and of the mode of action of potential inhibitors (Table 4). In terms of developing fungicidal agents, there is a wide range of possibilities for potential drug targets encoded in genomes. These targets include membrane receptor proteins, host interaction factors, permeases, enzymes involved in intermediary metabolism, replication and transcription systems, DNA repair, and many more. Exploring these possibilities comprehensively allows for developing multifaceted control strategies that effectively combat plant diseases (Bresso et al., 2016; Martins et al., 2016).
In industry, modern chemical methods, including molecular modeling, are increasingly employed as powerful tools for studying structure-function relationships. The integration of in silico (computational) and experimental methods has led to an enhanced understanding of intermolecular recognition. By combining these approaches, experimental validation can elucidate mechanisms and suggest improvements in the effectiveness of new molecules (Bresso et al., 2016; Martins et al., 2016).

Conclusion
Through bioinformatics analysis, a total of 17,518 proteins from Monosporascus cannonballus and 30,226 proteins from Macrophomina phaseolina were re-annotated. The genomic analysis of these two fungi identified 23 new potential target proteins, with 13 targets for M. cannonballus and 10 targets for M. phaseolina. This study has identified promising protein targets for the development of potential fungicides against M. cannonballus and M. phaseolina, the causal agents of disease in melon. Further investigations are needed to validate the potential of these targets through enhanced in silico simulations and in vitro bioassays. This work represents an initial step towards the development of fungicides and opens new possibilities for controlling the diseases caused by M. cannonballus and M. phaseolina.

Figure 2. Automatic superfamily annotation of Monosporascus cannonballus sequences using InterProScan results.
The genome sequencing of M. cannonballus, conducted by the University of New Mexico in the United States, reported approximately 11,800 predicted genes and an estimated total genome size of 70 Mb. The automatic annotation of the genome indicated a genetic content comparable to other members of the Sordariomycetes and Xylariales. However, Robinson, Natvig, and Chain (2020) found no correlation between genome size and predicted gene number among Monosporascus clusters or within the Xylariales. Syntenic comparisons between Monosporascus and other Xylariales genomes revealed both regions of synteny and large regions lacking similarity.

Table 4. Molecular modeling of some Macrophomina phaseolina and M. cannonballus targets.