Novel G-protein-coupled receptor-like proteins in the plant pathogenic fungus Magnaporthe grisea

An analysis of the Magnaporthe grisea genome and comparison with other fungi identified homologs of known G protein-coupled receptor-like proteins and a novel class of GPCR-like receptors in M. grisea that are specific to filamentous ascomycete fungi.

G proteins. The G proteins in turn activate different signaling pathways, such as those activated by adenylate cyclase and phospholipase C. These GPCRs are of immense importance as they are major targets for drug discovery [4].
A classification scheme that encompasses all GPCRs is the grouping into classes A-E [5]. A-C are the main classes present in animals: class A is the largest and comprises the rhodopsin-like receptors, class B comprises the secretin-like receptors and class C the metabotropic glutamate/pheromone receptors. Class D is unique to fungi and comprises fungal pheromone receptors. Class E contains cAMP receptors, such as the cAMP receptors of Dictyostelium. Other classes include frizzled/smoothened, adhesion receptors and the insect-specific chemosensory receptors [6,7]. Sequence conservation between GPCR classes is limited, however, with each receptor class exhibiting specific identifiable characteristics [6,8]. The secretin and the adhesion receptors are characterized by conserved cysteine residues or by known cysteine-rich domains resembling the epidermal growth factor (EGF) domain at their amino termini.
GPCRs form the largest family of receptors in animals, with more than 600 members in the human genome [9,10]. Only a handful of GPCRs have been identified in fungal genomes, however. In Saccharomyces cerevisiae and Schizosaccharomyces pombe only three and four receptors, respectively, are well characterized [1,11-16]. In the Neurospora crassa genome a total of 10 receptors is predicted [17]. A recent report for Aspergillus nidulans identified GPCRs similar to the yeast pheromone receptors, the glucose-sensing receptor GPR1, the nitrogen-starvation sensing STM1, and the Dictyostelium discoideum cAMP receptors [18]. Given the prevalence and significance of GPCRs in higher eukaryotes, their relative paucity in the kingdom Fungi warranted further investigation. To see if we could find additional families, we searched the predicted proteome of the rice blast fungus Magnaporthe grisea.
The fungal plant pathogen M. grisea is a powerful model system to study the pathogenicity determinants required for plant cell-surface recognition and production of an appressorium, a specialized structure required to penetrate the plant surface [19,20]. The fungus causes rice blast disease, the most destructive disease of rice worldwide. M. grisea is amenable to molecular genetic manipulation and is the subject of large-scale genome-wide functional studies following the recent completion of a draft genome sequence [21]. Infection begins when a conidium, attached to the plant surface, sends forth a germ tube that differentiates to form a highly melanized appressorium. Turgor pressure inside the appressorium results in a penetration hypha breaching the cell wall and invasion of the plant tissues. This developmental program, which is accompanied by a number of biochemical and developmental changes, is a result of perception by the fungus of appropriate environmental and plant cell-surface signals and induction of a cascade of signaling pathways.
Cell-surface receptors that perceive signals at critical times in the life cycle of M. grisea and other pathogenic fungi are strongly implicated as pathogenicity determinants. Signaling plays a key role in appressorium formation and infection in M. grisea. The cAMP-dependent and pheromone response, as well as other mitogen-activated protein kinase (MAPK)-, phospholipase- and calmodulin-dependent pathways, are essential for pathogenicity and are likely to involve perception of signals through GPCRs [22-24]. The three identified G-protein alpha subunits, required for different aspects of development and pathogenicity, possibly transduce perceived signals to the above-mentioned pathways [25]. The M. grisea G proteins probably receive signals from receptors such as PTH11, an integral membrane protein required for pathogenicity [26]. As animal GPCRs are important targets for drug discovery, identifying fungal receptors would be equally important for understanding and controlling M. grisea and other fungal pathogens.
Identification of new GPCR classes is difficult because of low sequence similarity; even within related classes, sequence conservation is limited to the membrane-spanning regions [8]. There are also large variations in the type and number of receptors in classes that show no sequence or structural similarities to each other. We therefore carried out an exhaustive analysis to mine the proteome of the sequenced genome of the rice blast fungus M. grisea for GPCR-like proteins. Homologs of known fungal GPCRs were found in the M. grisea proteome, including the pheromone receptors STE2 and STE3 and the glucose-sensing receptor GPR1. In total, 76 GPCR-like proteins were identified in the present study, of which 61 represent a large novel class related to PTH11, a receptor implicated in fungal development and pathogenicity and proposed to act upstream of the cAMP-dependent pathway. Many of these novel receptors will have roles in known pathways or may define new pathways involved in fungal development.

Identification of novel classes of GPCR-like proteins in M. grisea
We searched the M. grisea proteome for GPCR-like proteins on the basis of their similarity to known receptors. GPCR sequences, including all those present in the GPCR database (GPCRDB [5]), were used as a query in a BLASTP search against the M. grisea predicted protein set [21]. The proteins retrieved in this search were used to BLAST the M. grisea proteins again to find all related sequences (Table 1). A total of 14 GPCR-like proteins were found. These included homologs of characterized fungal GPCRs (GPR1, STM1, and the STE2- and STE3-like pheromone receptors). Other proteins identified were similar to the cAMP receptors and to

Table 1. Predicted G-protein-coupled receptor-like proteins in M. grisea. Classes of GPCR-like protein in M. grisea were subdivided on the basis of BLASTP analysis and shared domains, as described in Materials and methods. They were clustered into paralogous families if the proteins showed 30% identity and 80% overlap over the complete length of the protein.

In total we identified 76 receptors, including members of known classes as well as novel classes. Sixty-one represented a novel class that included PTH11. All other receptors identified were assigned to different classes on the basis of their similarity to known receptors using BLASTP against the GenBank (nonredundant) and Swiss-Prot databases, and their conserved domain characteristics. We found three members of the mPR class and one (MG0532.4) with weak similarity to animal GPCRs. No members of these classes have been reported previously in fungi. Within each class, members were assigned to paralogous families (Table 1). Many of the genes in Table 1

Shared and unique GPCR-like protein classes in M. grisea
M. grisea GPCRs were compared with published fungal genome sequence databases to identify proteins belonging to the same GPCR classes. A BLASTP search against the genome of the closely related filamentous fungus N. crassa [17], using all the M. grisea GPCR-like proteins as query, revealed the presence of similar proteins in N. crassa, including PTH11 homologs (Table 2 and Additional data file 2). No PTH11 homologs were found in S. cerevisiae and S. pombe. Further analysis revealed putative homologs of the mPR-1 class in both yeasts, in which they had not previously been identified.
In addition, we found no evidence for cAMP receptor-like GPCRs in either yeast, unlike both M. grisea and N. crassa. The cAMP, STM1, and mPR receptors are shared between fungi and other eukaryotic species. However, the fungal pheromone receptors (class D) and GPR1-like receptors appear to be fungus-specific.
Members of the large class of PTH11-related receptors were restricted to a fungal subphylum. BLASTP of all the PTH11 class members, and PSI-BLAST using conserved regions, against the GenBank (nonredundant) and Swiss-Prot databases and publicly available fungal genomes retrieved matches in members of the subphylum Pezizomycotina within the Ascomycota, including Podospora anserina, Blumeria graminis, Fusarium graminearum and Aspergillus species. Other fungi belonging to the Ascomycota but not to this subphylum, such as S. cerevisiae, S. pombe, Candida albicans and Pneumocystis carinii lacked PTH11-related sequences. Also, no PTH11-related sequences were found in the genomes of the Basidiomycetes Cryptococcus neoformans, Ustilago maydis and Phanerochaete chrysosporium. No matches were found in plant, animal or prokaryotic genomes.

Phylogenetic analysis of PTH11-related GPCR-like proteins in M. grisea and N. crassa
PTH11-related receptors from M. grisea and N. crassa were classified into paralogous families (Additional data file 2). We also identified any that were orthologs between these two species. PTH11-related receptors in M. grisea and N. crassa and other sequences from P. anserina and B. graminis were aligned to determine any relationships. The region containing the conserved PTH11-domain was used to build a phylogenetic tree (Figures 1, 2a). Our analysis indicated that PTH11-related proteins form a large and divergent protein family that evolved before the divergence of M. grisea and N. crassa. M. grisea and N. crassa orthologs occurred in the same clades (Figure 1). Many different clades on the tree may represent paralogous sequences. The tree supports the putative orthologs and paralogs we identified (see Additional data file 2). Even though only the PTH11 domain was used to build the tree (the amino-terminal CFEM domain seen in a few proteins was not included), the 13 CFEM domain-containing proteins occurred together in one clade, indicating that the sequences are closely related. The phylogeny also revealed that within certain clades there was a marked expansion of the PTH11-related proteins in M. grisea compared to N. crassa. This is particularly notable for the CFEM domain-containing proteins. There were six M. grisea members containing the CFEM domain in a paralogous family (Table 1 and Figure 1; a total of 12 related CFEM and seven-span proteins), but only one from N. crassa. We found 38 PTH11-related proteins in A. nidulans with an E-value less than 1e-09. Further characterization of these proteins will be required to define the number of seven-span PTH11-related proteins in this genome. Preliminary analysis shows that only two seven-span proteins contain the CFEM domain in A. nidulans. These observations could represent either expansion since speciation of the CFEM-containing PTH11 relatives in M. grisea, or loss of these proteins in the other fungal species.

New domain signatures as defined by conserved regions in homologous classes of identified receptors
Members of each class of M. grisea GPCR-like proteins described above (for example, the cAMP-like, STM1-like and PTH11-related receptors) have domains that are conserved within each class. Sequence alignments from the BLASTP searches revealed specific regions containing shared residues for each of these classes of receptors. Figure 2 shows an alignment for some of the sequences that belong to classes other than the better-studied pheromone and glucose-sensing receptors. In all the PTH11-related members the region towards the amino terminus was conserved (Figure 2a, PTH11_dom). The extreme amino-terminal and the carboxy-terminal sequences flanking this region were divergent. Conserved residues occurred within the seven-span regions for all of these proteins. This is consistent with other observations that sequence conservation is typically limited to the transmembrane regions in GPCRs. The M. grisea protein MG06738.4, which has similarity to the cAMP receptors, shared conserved amino-acid residues between positions 81-179 with MG06797.4, MG00326.4, MG06257.4, related N. crassa proteins and other cAMP receptors (cAMP_dom, Figure 2b). Other proteins with weak similarity to cAMP receptors (MG00258.4 and MG10544.4) also shared residues within this domain (data not shown). MG04698.4 shared two domains between amino-acid residues 22-101 and 244-327 with STM1, MG02855.4 and related proteins from different eukaryotic species (stm1_dom, Figure 2c). MG05072.4 shared residues within the region of 56-277 with MG09091.4 (residues 18-228), MG04679.4 (residues 260-497) and other proteins that were retrieved in the BLAST search, including mPR receptors (mPR_dom, Figure 2d).
The proteins containing the PFAM GPCR domains are indicated in Table 1. It is worth noting that the low scores for the PFAM domains that we observed may be due to the need to update these domain alignments by adding many new proteins, including those we discovered. For example, MG06452.4 contains a putative STE3 domain; the alignment score (E-value) is low, however. With the new fungal genomes being sequenced, more STE3 homolog sequences are available and inclusion of these in the seed alignment defining the STE3 domain will make the domain more representative for fungal STE3 domains.
Each class of receptors contained specific conserved regions within the membrane-spanning topology. A representative example of each class, showing the location of the conserved region within the membrane topology, is illustrated in Figure 3. For fungus-specific receptors, the conserved domain spanned almost the entire length of the seven transmembrane regions. When other eukaryotic receptors were included in the class, however, only shorter conserved domains were discerned. These conserved residues may reflect functional constraints and may be valuable for studying the structure-function relationships of these proteins.

Distinct classes of GPCR-like proteins identified in M. grisea
Figure 1. Gene phylogeny based on the conserved membrane-spanning PTH11-domain. The tree shown was constructed using parsimony methods. Numbers on branches represent bootstrap values based on 100 random dataset simulations. Open ovals indicate putative paralogs and filled ovals the M. grisea-N. crassa orthologs. For sequences other than the ones predicted from M. grisea and N. crassa genome sequences the GenBank accession numbers are indicated. The abbreviations for species names are indicated in parentheses after the accession numbers as follows: BG, Blumeria graminis; PA, Podospora anserina; NC, N. crassa.

The product of the gene PTH11 was referred to as Pth11p in the original report. Subsequently it has been referred to as PTH11. We refer to this gene product as PTH11 in this paper and would like to propose revision of its name from Pth11p to PTH11.

Fungi respond to a variety of signals from the environment that regulate cellular metabolism and development as well as host-pathogen interactions. Cell-surface receptors perceive these signals and relay them to intracellular signaling pathways. We searched the proteome of M. grisea for GPCR-like proteins and identified a total of 76 sequences (Table 1). This is the largest number of GPCR candidates identified for any fungal species. The identified proteins in M. grisea include homologs of known fungal receptors and a few other eukaryotic receptors. Putative orthologs of fungal STE2- and STE3-like pheromone receptors required for the mating responses in yeast were identified. A homolog of GPR1, which is involved in pseudohyphal differentiation in S. cerevisiae, and two proteins that share similarities with STM1 from S. pombe were also found [11,13,16]. Six proteins shared similarities with cAMP receptors from Dictyostelium. In Dictyostelium the cAMP receptors are involved in establishing polarity during chemotaxis [3].
All the above M. grisea proteins can be annotated as putative GPCRs on the basis of homology to known receptors. It is likely that they respond to similar ligands, such as pheromones, nutrients and cAMP (Table 1). Response to fungal mating pheromones and the existence of pheromone receptors in M. grisea was first suggested by the observation that M. grisea responded to S. cerevisiae pheromones in a mating-type-specific manner [22]. Intracellular cAMP, produced by adenylate cyclase, is a critical factor regulating appressorium development in M. grisea. Lee and Dean found that the fungus will respond to exogenously added cAMP by development of appressoria, although the concentrations required are high [31]. They noted that the cell wall and cell membrane should be relatively impermeable to cAMP, and thus any responses to extracellular cAMP will be due to cAMP receptors. Further research will be required to learn about the mechanism of perception of exogenous cAMP and other ligands and their targets within the cell. PTH11-related proteins share a number of characteristics diagnostic of GPCRs and define a new class of GPCR-like proteins. The predicted membrane topology suggests a seven-span protein with an amino terminus outside the cell that could respond to extracellular signals, and a cytoplasmic carboxy-terminal domain that could interact with G proteins. All the PTH11-related proteins shared conserved residues within the membrane spans, as observed in other GPCR classes [8]. A subclass of the PTH11 receptors showed another characteristic that is seen in a few classes of human GPCRs: they have an amino-terminal cysteine-rich EGF-like CFEM domain. The animal secretin receptors are characterized by six conserved cysteines at the amino terminus, with cysteine bridges implicated in ligand binding. Some of the adhesion receptors have cysteine-rich EGF-like domains at their amino termini [6,8].
CFEM-domain-containing proteins, which are smaller in size and lack the seven transmembrane regions, may interact with the CFEM-containing GPCR-like proteins (Additional data file 3 and [29]). The CFEM-containing proteins have a signal peptide and/or a glycosylphosphatidylinositol (GPI) anchor. Thus they are either secreted from the cell or are anchored to the cell membrane. They may be similar to the odorant-binding proteins, which also have cysteine-rich domains and have been proposed to interact with odorant-GPCRs [32].

Unique classes of fungal G-protein-coupled receptors with ancient origins
Because fungi and animals diverged approximately 1,460 million years ago (Mya) [33], it is not surprising that fungi have classes of GPCRs that are distinct from those of animals. The class D fungal pheromone receptors define a fungus-specific class of receptors. We found the GPR1-like receptors to be fungus-specific as well. Classes of receptors specific to a group of species also occur in animals. For example, some of the GPCRs in Anopheles gambiae constitute an insect-specific class of chemosensory receptors [7]. Insects are estimated to have diverged from other animals nearly 1,000 Mya. Thus, we would expect to find novel fungal GPCRs with no similarities to ones present in other eukaryotic kingdoms. The largest class of M. grisea GPCR-like proteins we identified is the novel PTH11-related class. It is interesting that we only found homologs of PTH11 in fungi belonging to the subphylum Pezizomycotina within the Ascomycota (this subphylum has an estimated divergence date of 1,140 Mya). None was found in fungi belonging to other subphyla in Ascomycota or in Basidiomycota, estimated to have diverged from each other 1,210 Mya. This indicates that these proteins are extremely ancient in origin, having possibly evolved to serve specialized functions in a specific subgroup of fungi. They are either unique to this fungal group or have diverged so far in other groups as to be unrecognizable.

Relationships between the PTH11-related proteins
Figure 2. Alignment of GPCR-like proteins. Domains conserved in (a) PTH11-, (b) cAMP-, (c) STM1- and (d) mPR-related classes are shown. Representative sequences from each class were aligned using T_Coffee [39]. The alignment was analyzed using GenDoc. We used the default setting using the conservative shading mode with similarity groups enabled. Black and the dark and light gray represent 80% or greater conserved, 60% or greater conserved, and less than 60% conserved, respectively. Conservative substitutions were counted as a single residue type. The GenBank or Swiss-Prot (SP) accession numbers or the accession numbers of the predicted proteins in the M. grisea or N. crassa genome databases are indicated on the left [21,42]. The boundaries of each sequence used in the alignment are indicated on the right.

The PTH11-related proteins form a large and divergent protein family, as suggested by the similarity between the proteins and the phylogenetic tree (Table 1, Figure 1). This gene family may have evolved before the divergence of M. grisea and N. crassa. There are a few orthologs between these species; however, it is apparent that this family has undergone considerable expansion in M. grisea compared to N. crassa, with the largest subclass in M. grisea being the CFEM-containing proteins. Many of the PTH11-related genes are located in close proximity to each other on the genome (data not shown), whereas none of the other GPCR-like proteins, except a pair of cAMP-receptor-related proteins, occurs in close proximity. A paralogous pair, MG07553.4 and MG07565.4, occurs close together on linkage group III, indicating that these genes may have arisen as a result of duplication. We blasted these sequences against each other and observed that they show 30% identity with an E-value of 7e-54.
This suggests that even if these genes are a result of duplication, they have diverged sufficiently and are not incorrect duplicate predictions of the same gene due to sequencing or assembly errors. Both these genes contain the CFEM domain and also occur in the same clade on the phylogenetic tree (Figure 1). Another pair of CFEM-containing proteins is located in close proximity (LGI, group 1). The above examples of relative expansions within the PTH11-related proteins, as compared to N. crassa, are an indication that gene duplication may still be occurring in M. grisea. In N. crassa it is believed that because of the phenomenon of repeat-induced point mutation (RIP), gene duplications are not maintained [17]. There is evidence of RIP in M. grisea, but the present study provides an example that has escaped the RIP process [34]. Other possibilities are that these genes duplicated before the evolution of RIP or have escaped RIP because M. grisea rarely undergoes meiosis in the wild.

Regulation of the activity of GPCR-like proteins by differential expression and interaction with different signal transducers
Differential expression and interaction with different signal transducers could be a way to regulate specific signaling pathways. Results from genome-wide microarray experiments (Table 1). In addition to differential regulation of the GPCR-like proteins, their interaction with different G proteins could channel various signals to different pathways. As well as the well-studied interactions with G proteins, it has been proposed that the seven-span receptors may also interact with other signal transducers and receptor-interacting proteins to transmit the signal to different cellular pathways.

Conclusion
The number of classes of GPCR-like proteins identified in the present study is the largest reported in fungi. Further research on these receptors will help delineate potentially novel signaling pathways with which they interact. The new class of PTH11-related receptors, specific to an Ascomycota subphylum and relatively numerous in M. grisea, is particularly interesting. PTH11 is an integral membrane protein localized to the cell membrane and is required for pathogenicity [26]. It is proposed to act upstream of the cAMP pathway as a receptor that channels signals to this pathway. PTH11 does not have an ortholog in N. crassa. Also, as discussed earlier, only one CFEM-containing seven-span protein is present in N. crassa compared to 12, including PTH11, in M. grisea. It remains to be determined whether other members of this expanded class of PTH11-related proteins are involved in different aspects of pathogenicity. The subphylum Pezizomycotina includes the majority of known ascomycete species, and includes pathogens and mutualists. Because PTH11-related GPCR-like proteins are present in non-pathogens, many members of this class are likely to be involved in functions not related to pathogenesis. All the seven-span receptors and their characteristic domain signatures we discovered (Figures 2, 3) will be valuable in the identification and comparative studies of new receptors in the many fungal genomes being sequenced.

Identification of GPCR-like proteins in Magnaporthe grisea
Known GPCR sequences, including ones present in the GPCRDB [5], were BLASTed against the predicted M. grisea proteome to identify homologs in M. grisea [21]. The database containing 7,900 GPCR sequences (updated 28 May 2003) was used as a query in a BLASTP search against the M. grisea predicted proteins with an E-value limit of 1e-09. Results from an InterPro scan of the M. grisea proteins were searched for domains containing the following terms: 'GPCR' and 'G-protein-coupled receptors' [27]. M. grisea PTH11, a GPCR-like protein (see Results), was also used in a BLASTP search against the M. grisea proteome. BLAST and PfamA searches and related sequence analysis were done using Genomax (Informax (now Invitrogen)).
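The E-value filtering step described above can be sketched in a few lines of Python. This is an illustration only, not the pipeline used in the study (which ran inside Genomax): it assumes the hits have been exported in the 12-column tabular layout produced by BLAST+ with -outfmt 6, and the identifiers and scores in the toy input are invented.

```python
import csv
import io

# Column order of BLAST+ tabular output (-outfmt 6). This format is an
# assumption; the paper does not state which output format was used.
FIELDS = ["qseqid", "sseqid", "pident", "length", "mismatch", "gapopen",
          "qstart", "qend", "sstart", "send", "evalue", "bitscore"]

EVALUE_CUTOFF = 1e-09  # threshold stated in the text


def filter_hits(handle, cutoff=EVALUE_CUTOFF):
    """Return {subject_id: best_evalue} for hits at or below the cutoff."""
    best = {}
    for row in csv.reader(handle, delimiter="\t"):
        if not row or row[0].startswith("#"):
            continue  # skip blank lines and comment lines
        hit = dict(zip(FIELDS, row))
        evalue = float(hit["evalue"])
        if evalue > cutoff:
            continue
        subject = hit["sseqid"]
        if subject not in best or evalue < best[subject]:
            best[subject] = evalue
    return best


# Toy input with made-up identifiers and scores, for illustration only.
example = io.StringIO(
    "gpcr_A\tprot_1\t35.0\t300\t180\t5\t1\t300\t10\t309\t2e-40\t160\n"
    "gpcr_B\tprot_1\t31.0\t280\t190\t6\t5\t284\t12\t291\t5e-12\t70\n"
    "gpcr_A\tprot_2\t28.0\t90\t60\t2\t1\t90\t40\t129\t1e-03\t35\n"
)
candidates = filter_hits(example)
```

Keeping only the best E-value per subject protein gives one entry per candidate, which is convenient when the same protein is hit by many of the query GPCR sequences.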

Characterization of the GPCR-like proteins and identification of additional members in M. grisea and other fungi
GPCR-like sequences were evaluated for seven transmembrane regions by TMPRED, Phobius and TMHMM [35-37]. Default settings were used. In nearly all cases at least two of the algorithms predicted the seven-span helix topology (Additional data file 1). A BLASTP search using the seven-span polypeptide sequences as query against the M. grisea protein set was also done to identify any other similar members. The set of identified seven-span proteins was then subjected to BLASTP analysis against GenBank and Swiss-Prot to confirm sequence similarity to GPCRs. This exercise also allowed identification of other members that were similar to these sequences. The M. grisea seven-span proteins identified as above were used as a query in a BLAST search against the N. crassa predicted proteins [17] to identify homologs. The M. grisea and N. crassa proteins were placed into clusters using the blastclust program [38]. All M. grisea and N. crassa proteins that had at least 30% identity and 80% overlap over the length of the proteins were clustered together. Members of the same species within a cluster were considered paralogs. Orthologs were defined as proteins that had bidirectional best BLAST hits. A TBLASTN search using the seven-span-containing sequences as query against the GenBank EST database was performed to identify any identical matches in the M. grisea ESTs (or other closely related fungal sequences). The GPCR-like sequences identified in M. grisea were used as query in BLASTP searches (cutoff < 1e-09) against the S. cerevisiae and S. pombe genomes and other completely sequenced fungal genomes to identify putative homologs in these species.
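The two grouping rules in this section (clusters at 30% identity and 80% overlap, and orthologs as bidirectional best hits) can be illustrated with a short Python sketch. It is not a reimplementation of blastclust: it assumes single-linkage clustering and reads "80% overlap" as the alignment covering at least 80% of both proteins. All protein names and numbers below are hypothetical.

```python
from collections import defaultdict

MIN_IDENTITY = 30.0   # percent identity, from the text
MIN_OVERLAP = 0.80    # fraction of protein length covered, from the text


def cluster_proteins(pairs, lengths):
    """Single-linkage clustering of proteins with union-find.

    pairs: iterable of (id_a, id_b, percent_identity, aligned_length).
    lengths: {protein_id: sequence_length}.
    A pair is linked when identity >= 30% and the alignment covers at least
    80% of BOTH proteins (one reading of "80% overlap"; an assumption).
    """
    parent = {p: p for p in lengths}

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    for a, b, identity, aln_len in pairs:
        covered = min(aln_len / lengths[a], aln_len / lengths[b])
        if identity >= MIN_IDENTITY and covered >= MIN_OVERLAP:
            parent[find(a)] = find(b)

    clusters = defaultdict(set)
    for p in lengths:
        clusters[find(p)].add(p)
    return sorted(sorted(c) for c in clusters.values())


def reciprocal_best_hits(best_a_to_b, best_b_to_a):
    """Return pairs (a, b) where a's best hit is b and b's best hit is a."""
    return sorted((a, b) for a, b in best_a_to_b.items()
                  if best_b_to_a.get(b) == a)


# Hypothetical proteins: mg_* from one species, nc_* from the other.
lengths = {"mg_1": 400, "mg_2": 410, "mg_3": 500, "nc_1": 395}
pairs = [("mg_1", "mg_2", 34.0, 380),   # linked
         ("mg_2", "nc_1", 31.0, 350),   # linked
         ("mg_1", "mg_3", 45.0, 120)]   # high identity but short overlap
clusters = cluster_proteins(pairs, lengths)
orthologs = reciprocal_best_hits({"mg_1": "nc_1", "mg_2": "nc_1"},
                                 {"nc_1": "mg_2"})
```

Under single linkage, mg_1 and nc_1 end up in the same cluster through mg_2 even though they were never compared directly; within that cluster, mg_1 and mg_2 would be treated as paralogs, and mg_2 and nc_1 as the ortholog pair.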

Alignments and phylogenetic relationships between the predicted GPCR sequences
The sequences within related classes shown in Figure 2 were aligned using T_Coffee [39], and minor edits based on the BLAST alignments were made using GenDoc. For phylogenetic analysis, the conserved PTH11-domain that spans the membrane-spanning regions was used. Sequences were aligned using ClustalW version 1.81 [40]. The phylogenetic tree was constructed using PAUP by both neighbor-joining and parsimony methods followed by bootstrap analysis (100 bootstrap replications). A tree was also constructed using the neighbor-joining method implemented in the software package MEGA 2.1 [41]. All methods showed similar relationships between the proteins.
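The resampling step behind bootstrap analysis can be sketched as follows. This only shows how pseudo-replicate alignments are generated by drawing columns with replacement; tree building on each replicate (done here with PAUP and MEGA) is not reproduced. The function name and the toy sequences are invented for illustration.

```python
import random


def bootstrap_alignment(alignment, rng):
    """Resample alignment columns with replacement (one bootstrap replicate).

    alignment: {name: aligned_sequence}, all sequences the same length.
    rng: a random.Random instance, passed in so runs can be reproduced.
    """
    names = list(alignment)
    n_cols = len(alignment[names[0]])
    if any(len(alignment[n]) != n_cols for n in names):
        raise ValueError("aligned sequences must have equal length")
    # Draw as many column indices as the alignment has columns.
    cols = [rng.randrange(n_cols) for _ in range(n_cols)]
    return {n: "".join(alignment[n][c] for c in cols) for n in names}


# Toy alignment with invented sequences, for illustration only.
toy = {"seq1": "MKT-LLV", "seq2": "MRTALLV", "seq3": "MKSALIV"}
rng = random.Random(0)  # fixed seed so the replicates are reproducible
replicates = [bootstrap_alignment(toy, rng) for _ in range(100)]
```

A tree is then inferred from each replicate, and the bootstrap value on a branch is the percentage of replicate trees in which that grouping appears.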

Additional data files
The following additional data is available with the online version of this paper: additional data file 1 is a