Hidden localization motifs: naturally occurring peroxisomal targeting signals in non-peroxisomal proteins

Functional but silent peroxisomal targeting signals have been found in non-peroxisomal proteins. This discovery has important implications for sequence-based signal prediction and for evolution.


Background
For an increasing number of otherwise uncharacterized protein sequences from genome-sequencing projects, function assignment is attempted solely with in silico prediction methods, as reliable and cost-effective large-scale experimental methods are not available. In addition to sequence homology and annotation transfer considerations [1], these function assignments increasingly rely on algorithms that recognize protein-sequence features responsible for posttranslational modifications, subcellular localization and interactions with specific domains of other proteins.
Although considerable effort has been invested in achieving low false-positive prediction rates, our experience with tools for recognizing glycosyl phosphatidylinositol (GPI) lipid [2,3] and myristoyl [4][5][6] anchor attachment sites and for predicting potential targets for PTS1-dependent translocation to peroxisomes [7] shows that a small but noticeable number of proteins without appropriate biological context (for example with contradictory subcellular localization or in taxa without the modifying enzyme or receptor) are systematically hit by these tools. For example, we found more than a dozen metazoan lysozymes [7,8], known extracellular proteins, that are predicted to have carboxyl termini with a functional peroxisomal targeting signal 1 (PTS1) region.
Are these false-positive predictions? All three of the sequence-analysis tools mentioned above check query sequences for a recognition pattern that is explicitly described in terms of its physical properties and it is possible to check the concordance between pattern descriptions and query sequence individually. Nevertheless, this visual inspection is frequently unable to rationalize the findings as false-positive predictions, as all known components of the pattern appear to be present. Even in the case of high accuracy of the prediction tool, an erroneous prediction cannot be excluded. Alternatively, these predicted sequence motifs may occur by chance and be functional in an appropriate test system, but still have no biological meaning because the necessary cellular context is absent in vivo. Only experimental tests can resolve this contradiction. As a case study, we report the results of an experimental analysis that demonstrates the existence of naturally occurring peroxisomal targeting signals in several known non-peroxisomal proteins. We also discuss the evolutionary perspective of functional localization signals in unrelated proteins as well as the consequences for experimental localization determination and function prediction from sequence.
The major mechanism for targeting proteins to the matrix of peroxisomes, which are membrane-bounded organelles [9] of eukaryotic cells, is initiated in the cytoplasm by interaction of the receptor protein peroxin 5 (PEX5) with the carboxy-terminal signal PTS1 on the target protein [10,11]. This signal consists of three regions of sequence comprising approximately 12 residues [12,13]. It is composed of the most carboxy-terminal tripeptide (classically, the -SKL terminus), preceded by a region of around four residues (which interact with the surface at the mouth of the PEX5 binding cavity), and a solvent-accessible (or easily unfoldable) stretch of around five residues further upstream. The PTS1-prediction program 'PTS1' [14] identifies PTS1 signals in query protein sequences by evaluating their carboxy-terminal ends with respect to features necessary for interaction with the tetratricopeptide repeats of PEX5. The predictor's scoring function searching for this motif within the 12 carboxy-terminal residues achieves an estimated sensitivity of 90% and a selectivity above 99% [7].
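The three-region layout of the PTS1 signal described above can be illustrated with a minimal sketch. It assumes fixed region lengths of 5, 4 and 3 residues, whereas the text gives the two upstream lengths only as approximate ("around five", "around four"). The names `PTS1Regions` and `split_pts1_regions` are hypothetical and are not part of the PTS1 predictor [7,14].

```python
from typing import NamedTuple


class PTS1Regions(NamedTuple):
    """Carboxy-terminal 12-mer split into the three regions named in the text."""
    upstream: str    # ~5 residues, solvent-accessible or easily unfoldable stretch
    adjacent: str    # ~4 residues contacting the mouth of the PEX5 binding cavity
    tripeptide: str  # most carboxy-terminal tripeptide (classically SKL)


def split_pts1_regions(sequence: str) -> PTS1Regions:
    """Split the last 12 residues of a protein sequence into 5 + 4 + 3.

    The fixed lengths are an assumption made for illustration; the
    boundaries in the text are approximate.
    """
    seq = sequence.strip().upper()
    if len(seq) < 12:
        raise ValueError("at least 12 residues are needed")
    tail = seq[-12:]
    return PTS1Regions(upstream=tail[:5], adjacent=tail[5:9], tripeptide=tail[9:])


# Carboxy-terminal 16-mer of the DHFR-SKL construct quoted in the Results
regions = split_pts1_regions("EKGIKYKFEVYEKSKL")
print(regions)
```

A split like this only names the segments. It says nothing about whether the carboxyl terminus is accessible to PEX5 in the folded protein.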

Results
The carboxyl termini of several non-peroxisomal proteins interact with PEX5
Screening of SWISS-PROT [15] entries with the PTS1 predictor identified proteins from several families that are clearly not peroxisomal but score highly and are predicted as PEX5 targets [7,8]. We were not able to rationalize these results as false predictions as the proteins' carboxyl termini did not deviate from the generalized PTS1 sequence pattern [13]. To verify whether these proteins could indeed interact with PEX5, we tested the carboxyl termini of seven representative proteins in a yeast two-hybrid system: hen egg-white lysozyme (P00698, secreted); dog lysozyme C from milk (P81708); tyrosinase from human (P14679, a melanosomal type I membrane protein); frog tyrosinase (Q04604); Drosophila sevenless (P13368, a large transmembrane protein required for photoreceptor development); precursor of lysosomal bovine cathepsin D (P80209); and a mitochondrial ribosomal protein from yeast (P12687). We also examined the carboxyl terminus of a mouse dihydrofolate reductase construct with an added SKL peptide, which has been shown not to be imported into yeast peroxisomes [16,17].
Depending on their taxonomic origin, the carboxyl termini of the eukaryotic sequences were assayed for interaction with the tetratricopeptide repeat domains of either human or yeast PEX5 using published methodologies [12]. The query sequences, along with prediction scores and measured β-galactosidase activities, are summarized in Table 1. The results show that all peptide sequences interact with the PTS1-receptor PEX5 in the two-hybrid system. Hence, the carboxy-terminal sequences of these assayed non-peroxisomal proteins fulfill the requirements to function as PTS1 signals.

The accessibility of the PTS1-like carboxyl terminus is critical
The fact that the peroxisomal translocation machinery fails to import naturally occurring mature proteins carrying PTS1 signals into peroxisomes in vivo could be explained by the non-accessibility of their carboxyl termini. These could either be hidden in the native structure of the mature protein or of its functional complexes, or competing translocation machineries could lead to a removal of the respective proteins from the cytosol before their recognition by PEX5.
The first possibility is exemplified by DHFR-SKL. The carboxy-terminal 16 residues of the DHFR-SKL construct (EKGIKYKFEVYEKSKL, sequences appended to DHFR are in bold type, see results in Table 1) interact with yeast PEX5 in the two-hybrid test but in vivo the complete construct is not imported into peroxisomes, thus confirming the prediction [16,17]. For comparison, it should be noted that two other DHFR-derived constructs with slightly longer carboxyl termini (IKYKFEVYEKGGKSKL and IKYKFEVYEK-KNIESKL) are predicted to be peroxisomally targeted. Their scores calculated with the PTS1 predictor [7] are 13.2 and 9.9, respectively (compare with data in Table 1). They were experimentally shown [17] to be translocated to peroxisomes. In the native three-dimensional structure of DHFR [18], the carboxyl terminus is part of a β-sheet that is buried in the fold, deprived of flexibility and accessibility. Seemingly, this structure prevents the carboxy-terminal appended residues SKL in the construct from entering the PEX5 binding cavity, whereas slightly longer carboxyl termini may do. In our two-hybrid test system, the carboxy-terminal 16-mers are always considered exposed as, in the non-native sequence environment of the carboxyl terminus of the GAL4 activation domain, they are free from interfering or blocking structural features. Thus, DHFR-SKL fails to be imported into peroxisomes because its carboxyl terminus is sequestered in the structure of the mature protein.

Competing targeting signals prevent translocation into peroxisomes despite the presence of PTS1-like carboxyl termini
Alternatively, functional PTS1 signals can be overruled by other localization signals [7]. For instance, distribution of the mammalian alanine-glyoxylate amino transferase (AGT) between peroxisomes and mitochondria is regulated by the variable occurrence of an amino-terminal mitochondrial targeting signal in the mature protein (depending on the usage of two alternative transcription initiation sites) [19,20].
Does a naturally occurring PTS1-like carboxyl terminus of a clearly non-peroxisomal protein that is capable of interacting with PEX5 indeed lead to in vivo import of the respective protein, provided that a potentially overruling sequence signal is eliminated? A set of three target proteins with amino-terminal leader sequences was chosen from Table 1. Chicken lysozyme (SWISS-PROT id P00698), a secreted enzyme, is one of the best characterized proteins and has an apparently accessible carboxyl terminus as deduced from its three-dimensional structure (Protein Data Bank (PDB) number 1H6M [21]). The corresponding carboxy-terminal 16-mer produces moderate β-galactosidase activity in the yeast two-hybrid assay (most of the other proteins in Table 1 appear to interact even more strongly with PEX5). Human tyrosinase (P14679) is a melanosomal marker protein that functions in the formation of pigments such as melanins. Yeast 60S ribosomal protein L2 (P12687), or MRP7, is a component of the large subunit of the mitochondrial ribosome.

Table 1
All eukaryotic target sequences (characterized by species, SWISS-PROT or NCBI-Refseq accession number, score from the PTS1 predictor [7], carboxy-terminal sequence and description) were tested for interaction with the tetratricopeptide (TPR) repeat domain of human PEX5, except for P12687 and DHFR-SKL where the corresponding TPR domains were derived from yeast PEX5. The prokaryotic proteins were assayed using PEX5 from both yeast and human. As the estimated length of the PTS1 signal is 12 carboxy-terminal residues [13], we chose the carboxy-terminal 16-mers to be sure that we have included the complete motif-carrying segment. *A PTS1 prediction score above zero is considered predictive of a functional PTS1 signal; a score between -10 and 0 is considered a 'twilight zone' prediction. It should be noted that the negative score for the DHFR-SKL carboxyl terminus in its context is generated by the PTS1 predictor [7] solely by terms that evaluate its potential accessibility for PEX5. †A yeast two-hybrid assay is considered positive if the measured β-galactosidase activity is clearly greater than zero. Experience from previous test series suggests a lower limit of around 10 Miller Units per mg protein [12] for the detection of a productive interaction. The measured β-galactosidase activities (including standard deviations) range from weak (P81708, P13368) to strong (P80209, P12687).
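The decision thresholds quoted in the Table 1 legend (score above zero, twilight zone between -10 and 0, and a lower limit of around 10 Miller Units per mg protein for the two-hybrid assay) can be written down as a small sketch. The function names are hypothetical, and the treatment of scores lying exactly on a boundary (0 or -10) is an assumption, since the legend does not specify it.

```python
# Approximate lower limit for a productive interaction, in Miller Units
# per mg protein; the text gives this value only as "around 10".
MILLER_UNIT_FLOOR = 10.0


def classify_pts1_score(score: float) -> str:
    """Map a PTS1 predictor score onto the three classes used in the text.

    Assumption: a score of exactly 0 is counted as twilight zone and a
    score of exactly -10 as not predicted.
    """
    if score > 0:
        return "predicted"
    if score > -10:
        return "twilight zone"
    return "not predicted"


def interaction_detected(beta_gal_activity: float,
                         floor: float = MILLER_UNIT_FLOOR) -> bool:
    """Return True if the beta-galactosidase activity reaches the floor."""
    return beta_gal_activity >= floor


# Scores reported in the text for the two longer DHFR-derived constructs
for score in (13.2, 9.9):
    print(score, classify_pts1_score(score))
```

The prediction class and the assay result are kept separate on purpose: as DHFR-SKL shows, a carboxyl terminus can interact with PEX5 in the two-hybrid test while the predictor assigns a negative score on accessibility grounds.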
Green fluorescent protein (GFP) was appended to the amino terminus of each of the selected proteins. It can be assumed that translocation into the endoplasmic reticulum (ER) or mitochondria is disrupted by the resulting shift of the signal peptide from the amino terminus to the center of the protein.
The resulting molecules are expected to be redirected into peroxisomes if their carboxyl termini can act as PTS1 signals.
Targeting of the GFP-constructs in vivo was indeed confirmed by co-localization with a peroxisomal DsRed2-SKL construct in COS7 cells for the metazoan enzymes (Figure 1) and with DsRed-SKL in yeast cells for the Saccharomyces cerevisiae protein (Figure 2). Thus, the PTS1 signals at the carboxyl termini of the assayed proteins are normally suppressed by alternative amino-terminal targeting sequences. A similar mechanism can be inferred for other eukaryotic SWISS-PROT proteins listed in Table 1, although steric carboxy-terminal accessibility or other factors might also play a role.

Functional PTS1 sequences can occur in organisms without peroxisomes
The occurrence of silent PTS1s without a targeting role raises the question of whether such signals can also evolve in organisms that do not carry peroxisomes. To test this hypothesis, we extended Table 1 with a set of four predicted carboxyl termini from prokaryotic enzymes: Escherichia coli glutamate-1-semialdehyde 2,1-aminomutase (P23893), E. coli transaldolase A (P78258), Methanopyrus kandleri riboflavin synthase (NCBI-Refseq accession NP_613646) and Archaeoglobus fulgidus 2-nitropropane dioxygenase (NCBI-Refseq accession NP_070998). Indeed, these proteins harbor carboxyl termini that qualify as PTS1 signals (lower part of Table 1). As confirmation, for the bacterial protein glutamate-1-semialdehyde 2,1-aminomutase (GSA) we used the same methodology for subcellular localization determination as for yeast MRP7. The resulting GFP-GSA construct is also imported into peroxisomes (Figure 2), demonstrating that its PTS1-like carboxyl terminus is functional in the mature protein.

Discussion
In families of orthologous proteins, peroxisomal location and its targeting signal in the amino-acid sequence are not necessarily conserved. For example, in plants the five enzymes of the glyoxylate cycle are localized to peroxisomes, but in S. cerevisiae three of the five (aconitase, isocitrate lyase, and the respective malate dehydrogenase isoform) could not be found in peroxisomes [22]. Thus, it is not surprising to find sporadically occurring PTS1 signals in protein families (see some examples in Table 1).
In dually localized proteins such as AGT [23], the PTS1 signal has a biological role as a targeting signal. However, the carboxyl termini of the proteins from Table 1 do not seem to fulfill any specific targeting function. We suggest that these PTS1 signals occur as a result of neutral mutation. The presence of a functional PTS1 signal would not lead to evolutionary pressure in this context because mislocalization is prevented by overriding the function of these sequences either by alternative exposure of amino-terminal signals or by steric carboxy-terminal inaccessibility.

Figure 1. Targeting of GFP-tyrosinase and GFP-lysozyme to peroxisomes in human cells.
The case of lysozyme is particularly noteworthy because a large number of homologous proteins were systematically hit when performing a SWISS-PROT screen using the prediction tool (30 cases with putative PTS1s and 46 other lysozyme carboxyl termini are shown in Figure 3). Because of the close relationship of the originating species and the occurrence of several isozymes, the lysozyme sequences in the multiple alignment share a high degree of similarity. The PTS1 carboxyl termini seem to be a mimicry of the sequence needed to support structural features of the protein. The cysteine at the antepenultimate position, which is present as part of a disulfide bridge [21] in the final secreted form of lysozyme, happens to fulfill the need for a small residue at the respective PTS1 location. The PTS1 is mostly functional, with a positively charged or amidic penultimate amino acid and the correct hydrophobic carboxy-terminal residue, which is the case for a large proportion of the lysozymes. Note that the disulfide bridge will not be formed in our GFP-lysozyme test case because translocation of the fusion protein into the endoplasmic reticulum is prevented.
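The three tripeptide properties named above (a small antepenultimate residue, a positively charged or amidic penultimate residue, and a hydrophobic carboxy-terminal residue) can be checked with a crude sketch. This is not the scoring function of the PTS1 predictor [7], which evaluates all 12 carboxy-terminal residues. The residue sets below are assumptions chosen for illustration, and the function names are hypothetical.

```python
# Residue classes are assumptions for illustration only.
SMALL = set("ACGS")                # small side chains
POSITIVE_OR_AMIDIC = set("KRHNQ")  # positively charged (K, R, H) or amidic (N, Q)
HYDROPHOBIC_CTERM = set("LMIVF")   # hydrophobic carboxy-terminal residue


def tripeptide_features(sequence: str) -> dict:
    """Report which of the three tripeptide properties a sequence satisfies."""
    seq = sequence.strip().upper()
    if len(seq) < 3:
        raise ValueError("at least 3 residues are needed")
    antepenultimate, penultimate, last = seq[-3:]
    return {
        "small_antepenultimate": antepenultimate in SMALL,
        "positive_or_amidic_penultimate": penultimate in POSITIVE_OR_AMIDIC,
        "hydrophobic_last": last in HYDROPHOBIC_CTERM,
    }


def looks_like_pts1_tripeptide(sequence: str) -> bool:
    """True only if all three tripeptide properties hold."""
    return all(tripeptide_features(sequence).values())


# Two lysozyme carboxy-terminal 20-mers from the Figure 3 alignment:
# L. leucomelana (P24364, marked '+') and H. sapiens (P61626, marked '-')
for name, seq in [("L. leucomelana", "AWRNRCKGTDVSVWTRGCRL"),
                  ("H. sapiens", "AWRNRCQNRDVRQYVQGCGV")]:
    print(name, tripeptide_features(seq))
```

In both sequences the conserved cysteine satisfies the small-residue requirement, so the outcome depends on the last two positions. Some lysozymes ending in -CKL pass this tripeptide check yet fall in the twilight zone of the predictor, which is one reason the upstream residues have to be scored as well.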
We conclude that a PEX5-interacting sequence can evolve simply by mutational alterations in the carboxy-terminal region of a protein. In non-globular regions of proteins, sequences that code for targeting to other subcellular compartments, or for posttranslational modifications, might appear in similar ways during evolution. For example, the sequence motif coding for amino-terminal N-myristoylation of glycines behaves as an exchangeable functional module, as protein families do exist where it has been substituted by alternative sequence determinants that facilitate membrane association [6]. This is exemplified by the Arabidopsis thaliana Rab5 ortholog Ara7 and its paralog Ara6. Ara7 is geranylgeranylated on carboxy-terminal cysteines just as Rab5 is in other species. However, the closely related paralog Ara6 lacks the carboxy-terminal cysteines and has an experimentally verified amino-terminal myristoylation motif [24].

Figure 2. Targeting of GFP-MRP7 and GFP-GSA to peroxisomes in yeast cells.

[Figure 3, alignment panel: carboxy-terminal 20-mers of vertebrate lysozymes, each row giving the SWISS-PROT accession number, species, sequence and prediction mark: (+), (#) or (-).]
Many of these signals seem to remain silent under normal physiological conditions (as is the case for the PTS1 signal in some metazoan lysozymes) but have the potential to become important in some future evolutionary scenarios or in pathological situations. Alternatively, the PTS1 signal might have become obsolete and the corresponding sequence segment is now subject to evolutionary alterations. Apparently, the cell exploits only a fraction of the potential molecular capabilities of its proteins.
Furthermore, subcellular targeting is organized in a hierarchy of cellular recognition mechanisms. The co-translational sorting into the ER serves as a first decision node. Posttranslational processes such as interaction with chaperones, folding, and covalent modifications are concomitant with the appropriate exposure of targeting signals. The amino-terminal signals are made first and are therefore favored when it comes to recognition by receptors. PEX5 needs only to categorize the remaining unsorted proteins with accessible carboxyl termini into 'stay here' or 'let's go into peroxisomes'. This might also explain why the PTS1 signal is comparatively short and permissive for a wide range of residues.
Clearly, the fact that functional sequences for subcellular targeting occur in unrelated proteins needs to be considered for prediction-tool development. The construction of a negative learning set (sequences without the specific localization signal) on the basis of proteins with differing cellular localization is problematic. For example, a set of non-peroxisomal but organellar localized [25], viral [26] or bacterial sequences might contain a considerable number of proteins that potentially interact with PEX5. Thus, such a set does not directly qualify for automated learning procedures or the assessment of false-positive prediction [27,28].
Surprisingly, when Maurer-Stroh and Eisenhaber applied their myristoylation site predictor for eukaryotic proteins to bacterial proteomes [5], systematic hits were found despite the absence of known amino-terminal N-myristoyltransferases (NMT) in bacteria. Are these false-positive predictions? A literature search revealed that myristoylation by host NMTs has physiological relevance for several secreted proteins of intracellular bacterial parasites [5]. Thus, the sequence motif coding for amino-terminal N-myristoylation is typical for eukaryotes but occurs also in bacteria. In many cases, it remains without phenotypic effect for bacteria but may become evolutionarily important in the case of host-parasite interactions.
In the case of the endothelin-converting enzyme 1 and the neprilysin-like zinc metallopeptidase family, the carboxy-terminal CXAW motif is a valid prenylation motif. This carboxyl terminus is functionally hidden because the protein is exported to the extracellular side of the cytomembrane and the carboxy-terminal residues are apparently involved in folding and enzyme function [29].
Clearly, the accessibility of the recognition motif in the substrate protein to the respective receptor or protein-modifying enzyme is a major issue. For PTS1 signal prediction from the amino-acid sequence, carboxy-terminal exposure needs to be assessed both from the steric point of view as well as in the context of competing translocation mechanisms. Analyzing only the carboxy-terminal dodecamer peptide [7,13] might not suffice for reliable prediction of accessibility to the receptor, but a full solution would require sufficiently accurate three-dimensional structure prediction.
In databases, it should also be routine to flag proteins that contain several competing targeting signals with differing priority. Finally, silent localization signals might become active in mutant protein constructs and lead to non-native localizations, an issue that needs to be assessed especially in localization screens of proteins with uniformly incorporated fluorescent dyes such as GFP. It cannot be excluded that the subcellular location of a considerable number of proteins has not been correctly determined in published large-scale studies that rely on this methodology [30,31].
To conclude, sequence segments coding for subcellular targeting or for posttranslational modifications can occur in proteins that are not substrates in either of these processes. Accurate prediction techniques reveal candidate proteins carrying hidden sequence signals. Many of these can be experimentally confirmed. In the case of the PTS1 predictor program, there is no reasonable argument to assume a difference in prediction accuracies for real and hidden PTS1s as, in both cases, productive interaction of the carboxyl terminus with PEX5 is the criterion for a functional PTS1.

Figure 3. Multiple alignment of lysozyme carboxyl termini. A screen of the SWISS-PROT database [15] for proteins that harbour PTS1 signals produced a set of lysozymes, well characterized secreted enzymes that are not usually found in peroxisomes. Rather than occurring sporadically, a large fraction of the known sequences from this family was obtained using the PTS1 prediction tool [7]. Moreover, these hits could not be rationalized as false positives as they did not deviate from the PTS1 sequence motif [11][12][13]. The multiple alignment shows intact vertebrate lysozyme carboxy-terminal 20-mers (with accession number and species name) retrieved from the SWISS-PROT database. From a total of 76 entries, 23 have predicted PTS1s (score > 0; at the top, marked with '+'), seven are in the twilight zone (-10 < score < 0; in the middle, marked with '#') and 46 are not predicted (score < -10; at the bottom, marked with '-'). There appears to be an overlap between the PTS1 motif and sequence variability within the lysozyme family. For example, the absolutely conserved cysteine near the carboxyl terminus is needed for the formation of a disulfide bridge in the mature protein [21]. This cysteine also meets the requirement for a small residue at the antepenultimate position of the PTS1 sequence.

Cloning procedures
Oligonucleotides were purchased from MWG Biotech (Munich, Germany). The E. coli strain DH5α (Bethesda Research Laboratories) was used for all transformations and plasmid isolations. For the yeast two-hybrid assay, the hybridized oligonucleotide pairs coded for the carboxy-terminal 16-mers of the selected proteins flanked by BamHI (5') and EcoRI (3') restriction sites. Each oligonucleotide pair was introduced into a BamHI-EcoRI-digested pGAD.GH fragment, generating plasmids containing the Gal4p activation domain in addition to the desired carboxy-terminal 16-mer extension (Gal4pAD-16mer). All pGAD.GH constructs were sequenced (VBC Genomics, Vienna, Austria). The plasmids pAH987 and hP87 contain the binding domain of Gal4p fused to the TPR domain of S. cerevisiae or Homo sapiens PEX5, respectively (Gal4pBD-TPR) [12].
Chicken cDNA for the amplification of lysozyme was generated from chicken oviduct using Tripure (Invitrogen) according to the manufacturer's instructions. Reverse transcription was performed using RNA-PCR Core Kit (Applied Biosystems) following the manufacturer's instructions. For the amplification of tyrosinase, we used cDNA from the melanoma cell line 29 WUBI (generous gift of Walter Berger, Vienna). The coding regions of lysozyme and tyrosinase were obtained by PCR (for oligonucleotide primers see Table 2) using the Advantage cDNA Polymerase Mix kit from Clontech and the GeneAmp PCR-system from Perkin Elmer. The PCR fragments were cloned into the pCR2.1 vector (Invitrogen) by T/A cloning and sequenced as a control (VBC Genomics). The fragments containing the lysozyme or tyrosinase coding regions were excised with EcoRI/BamHI and ligated into pEGFP-C1 (Clontech). The DsRed2-SKL construct was obtained by PCR using Pfu-polymerase (Promega) and the plasmid pDsRed2-C1 (Clontech) as template (for oligonucleotides, see Table 2). The PCR fragment and the plasmid were both cut with Eco47-3/XhoI and the PCR fragment encoding the carboxy-terminal SKL was introduced to replace the original DsRed2 end sequence. The final plasmid encodes the DsRed2-SKL protein under the control of the cytomegalovirus promoter.
Standard procedures were used for cloning of the GFP-MRP7 and GFP-GSA constructs including control sequencing (VBC Genomics). The plasmids expressing GFP and GFP-SKL under control of the MLS1 promoter were described previously [32]. The DNA fragment coding for DsRed-SKL was obtained by PCR (for oligonucleotides, see Table 2; template pDsRed, Clontech) and cloned (BamHI and, partially, PstI) after the MLS1 promoter in the vector YEplac181. DNA fragments coding for MRP7 and GSA were obtained by PCR (see Table 2 for oligonucleotide sequences) and cloned (BamHI-SphI) in-frame with GFP to give rise to the expression of GFP-MRP7 and GFP-GSA, respectively, all of them under the control of the MLS1 promoter.

In vivo localization study in COS7 cells
COS7 cells were transfected with the pEGFP-C1-constructs and DsRed2-SKL by electroporation using 920 µF and 220 mV (Gene pulser II, Bio-Rad), grown on coverslips for 36 h, washed, fixed with 0.5% formaldehyde in PBS for 15 min and covered with geltol. Cells were analyzed using the Olympus BX51 fluorescence microscope (60 × enlargement).

Table 2
Oligonucleotides used for the amplification of the GFP-constructs

In vivo localization study in yeast cells
The yeast strain used in this study is S. cerevisiae CB80 (MATa, ura3-52, leu2-1, trp1-63, his3-200). Yeast transformants were selected and grown on minimal medium containing 0.67% yeast nitrogen base without amino acids (Difco Laboratories), 2% glucose and amino acids (20-150 µg/ml) as required (SC-leu-ura). For fluorescence microscopy, yeast cells were grown at 30°C with shaking in selective media with 0.5% glucose as sole carbon source until the glucose concentration was very low (0.05%, usually 16 h), harvested by centrifugation and resuspended in the original volume of induction medium containing 0.67% yeast nitrogen base without amino acids, 0.1% yeast extract, 30 mM potassium phosphate pH 6.0, 0.125% oleate, 0.2% Tween-80 and amino acids as required. Cells were grown for 16 h in induction medium and observed live for fluorescence. Briefly, cells were collected by centrifugation and washed twice in water. Cell pellets were resuspended in induction medium without oleate and aliquots were spotted onto multitest slides (ICN Biochemicals) previously coated with concanavalin A (6 mg/ml, Sigma). Cells were allowed to attach for 5 min at room temperature and the slides were washed twice with induction medium and a coverslip applied for observation. Fluorescence was viewed with a Zeiss Axioplan 2 fluorescence microscope using a 63 × (1.4 NA) lens. Digital images were captured with a Quantix CCD camera using Lightview software without further modification. The pictures were mounted and false-color overlays were made in Adobe Photoshop.