Received 27 Oct. 2003 Accepted 25 Mar. 2004
Supported by the Local Key Project of China in Fujian Province (2000H004, 2001Z127).
* Contributed equally with the first author to this article.
** Author for correspondence. Tel (Fax): +86 (0)591 3769704; E-mail:
http://www.chineseplantscience.com
Acta Botanica Sinica
植 物 学 报 2004, 46 (10): 1135-1140
Cloning and Sequencing the γ Subunit of R-Phycoerythrin
from Corallina officinalis
WANG Sheng, ZHONG Fu-Di*, WU Zu-Jian, LIN Qi-Ying, XIE Lian-Hui**
(Key Laboratory for Biopesticide and Chemical Biology, Ministry of Education; Institute of Plant Virology,
Fujian Agriculture and Forestry University, Fuzhou 350002, China)
Abstract: The full-length cDNA of the γ subunit of R-phycoerythrin from Corallina officinalis L. was
cloned by the rapid amplification of cDNA ends (RACE) method and sequenced. The full-length cDNA is
2 308 bp long, consisting of a 5′ untranslated region (UTR) of 1 203 bp, an open reading frame (ORF) of
960 bp that encodes 320 amino acids, and a 3′ UTR of 145 bp. The mature γ polypeptide contains two unique
internal repeat domains, as reported by Apt et al. (2001). Sequence analysis of different clones revealed
different 3′-end sequences for the γ subunit transcript. The difference between the 3′-end sequences
suggests that the γ subunit gene may be present in more than one copy, or that its transcripts undergo
different post-transcriptional modifications. By comparing the DNA and cDNA sequences, we found that the
γ subunit gene contains no introns. This is the first report of the γ subunit gene of R-phycoerythrin
from C. officinalis.
Key words: Corallina officinalis; R-phycoerythrin; γ subunit; rapid amplification of cDNA ends (RACE)
The major light-harvesting antennae in red algae and prokaryotic cyanobacteria are macromolecular
complexes called phycobilisomes (Glazer, 1989). Phycobilisomes are highly ordered structures composed
of two functionally different protein types: the major pigmented phycobiliproteins, which are directly
involved in light absorption, and the linker polypeptides, which serve a structural role and help
maintain efficient energy transfer within the complex. There are three prominent classes of
chromophorylated proteins in phycobilisomes: allophycocyanin (APC; Amax 650 nm), phycocyanin
(PC; Amax 620 nm), and phycoerythrin (PE; Amax 565 nm) (Apt and Grossman, 1993).
Phycoerythrin has three dissimilar subunits: the α, β, and γ polypeptides. The genes encoding the α
and β subunits, the main components of phycoerythrin, are located on the plastid genome (Shivji, 1991)
and have been cloned from several red algae, such as Rhodella reticulata (Thomas and Passaquet, 1999),
R. violacea (Bernard et al., 1992), Porphyra purpurea (Reith and Munholland, 1995), P. tenera
(Kim et al., 1997), P. yezoensis (Kim and Fujita, 1997), Polysiphonia boldii (Roell and Morse, 1993),
Griffithsia monilis, Aglaothamnion neglectum (Apt et al., 1993), and Gracilaria lemaneiformis
(Sui and Zhang, 2000).
In red algae, the PE linker polypeptides, referred to as γ subunits, are chromophorylated and form a
stable complex with the α and β subunits of PE (Bernard et al., 1996; Ritter et al., 1999). The γ
subunits from several red algae have been partially characterized (Apt and Grossman, 1993). Their genes
are located in the nuclear genome (Egelhoff and Grossman, 1983). The number of γ subunits associated
with PE varies among red algae. Callithamnion byssoides, C. roseum, Audouinella saviana, Gastroclonium
coulteri, and A. neglectum all have two γ subunits, while Porphyridium purpureum has three γ subunits
and Rhodella violacea has only one (Apt and Grossman, 1993). So far, the gene encoding the γ subunit
has been cloned only from the red alga A. neglectum (Apt and Grossman, 1993; Apt et al., 2001). At the
protein level, only a few γ subunit sequences have been determined: part of the R. violacea γ33 subunit
(Lichtle et al., 1996), the N-terminal and C-terminal sequences of a Porphyridium purpureum γ subunit
(Sidler, 1994), and the chromophore-binding sites of G. coulteri (Klotz and Glazer, 1985).
As noted above, very limited sequence data are available on the gene encoding the γ subunit of
phycoerythrin. In this study, we report the cloning of the γ subunit gene from C. officinalis using
RACE and PCR methods.
1 Materials and Methods
1.1 Algal materials
Samples of Corallina officinalis L. were collected from Putian Bay, Fujian, China. They were rinsed
with distilled water, frozen in liquid nitrogen, and stored at -50 ℃ until use.
.Rapid Communication.
1.2 Purification of R-phycoerythrin and N-terminal sequence determination
R-phycoerythrin was purified as previously described (Wang et al., 2002). The phycoerythrin was
denatured, and its polypeptide components were resolved by SDS-PAGE. The resolved polypeptides were
then electroblotted onto PVDF membranes, which were stained with 0.1% Coomassie Brilliant Blue G-250.
The target band of the γ subunit was cut out of the PVDF membrane, and its N-terminal sequence was
determined at the Laboratory of Protein Chemistry, Hunan Normal University, Changsha, China.
1.3 RNA isolation and first-strand cDNA synthesis
Total RNA was extracted from C. officinalis using the UNIQ-10 Spin Column Total RNA Isolation Kit
(Shanghai Sangon Biological Engineering Technology and Service Co., Ltd., China). First-strand cDNAs
were synthesized according to the manufacturer’s protocol, using the 5′-CDS and 3′-CDS primers and the
SMART II A Oligo provided with the SMART™ RACE cDNA Amplification Kit (BD Bioscience Clontech
Company, USA).
1.4 cDNA amplification
Based on the N-terminal amino acid sequence of the γ subunit, we designed the degenerate
oligonucleotide primers R-Gsp1 and R-Gsp2 for 3′ RACE of the γ subunit; based on the 3′ RACE results,
we then designed the sequence-specific primers R-Gsp3 and R-Gsp4 for 5′ RACE (Table 1). The 3′-end of
the cDNA was amplified by nested PCR, first with R-Gsp1 and UPM and then with R-Gsp2 and NUP. The
5′-end of the cDNA was amplified in the same way, first with R-Gsp3 and UPM and then with R-Gsp4 and
NUP. The UPM-long, UPM-short, and NUP primers were provided with the SMART™ RACE cDNA Amplification
Kit (BD Bioscience Clontech Company, USA).
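The relationship between the degenerate primers and the N-terminal peptide can be checked with a short script. The following is a minimal sketch, not the procedure used in this study. It assumes that the notation X(Y) in Table 1 means "X or Y" at that position and that N means any base, and it uses the standard genetic code. The function names (parse_primer, compatible, matching_offsets) are illustrative only.

```python
import re

# Standard genetic code, indexed by codon in T, C, A, G order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}

NTERM = "GGFDAASVEYPNAPSFAGKY"  # N-terminal sequence reported in section 2.1
R_GSP1 = "GGNTTT(C)GAT(C)GCNGCNA(T)G(C)NGTNGAG(A)T"  # from Table 1
R_GSP2 = "AAT(C)GCNCCNA(T)G(C)NTTT(C)GCNGGNAAG(A)T"  # from Table 1


def parse_primer(text):
    """Assumed notation: 'X(Y)' is X or Y, 'N' is any base. One set per position."""
    positions = []
    for base, alt in re.findall(r"([ACGTN])(?:\(([ACGT])\))?", text):
        allowed = set("ACGT") if base == "N" else {base}
        if alt:
            allowed.add(alt)
        positions.append(allowed)
    return positions


def compatible(positions, peptide):
    """True if every residue has at least one codon permitted by the primer."""
    for i, residue in enumerate(peptide):
        window = positions[3 * i:3 * i + 3]  # the last window may be a partial codon
        if not any(
            aa == residue and all(codon[k] in window[k] for k in range(len(window)))
            for codon, aa in CODON_TABLE.items()
        ):
            return False
    return True


def matching_offsets(primer, protein):
    """0-based residue offsets at which the primer is compatible with the protein."""
    positions = parse_primer(primer)
    n_res = -(-len(positions) // 3)  # residues spanned by the primer, rounded up
    return [
        i for i in range(len(protein) - n_res + 1)
        if compatible(positions, protein[i:i + n_res])
    ]


print("R-Gsp1 matches at residue offsets:", matching_offsets(R_GSP1, NTERM))
print("R-Gsp2 matches at residue offsets:", matching_offsets(R_GSP2, NTERM))
```

Under these assumptions, R-Gsp1 is compatible with the peptide starting at its second residue (GFDAASVEY) and R-Gsp2 with the stretch NAPSFAGKY further downstream, which is the arrangement expected for an outer and a nested primer. The check only shows that a permitted codon exists for each residue; it does not imply that every expansion of the degenerate primer encodes the peptide.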
1.5 DNA isolation and PCR amplification
Total DNA was extracted from C. officinalis using the UNIQ-10 Spin Column DNA Isolation Kit (Shanghai
Sangon Biological Engineering Technology and Service Co., Ltd., China). The target DNA fragment was
PCR-amplified with primers RD1 and RD2, which were designed from the full-length cDNA sequence
(Table 1). PCR amplification was performed for 35 cycles of 95 ℃ for 1 min, 55 ℃ for 45 s, and 72 ℃
for 2 min (Whatman Biometra Thermal Cycler Model T3, Germany). The PCR products were gel-purified
using UltraClean™ 15 according to the manufacturer’s protocol (MO BIO Laboratories, Inc., USA).
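As a rough plausibility check on the 55 ℃ annealing step, the melting temperatures of RD1 and RD2 can be estimated with the Wallace rule of thumb, Tm = 2(A+T) + 4(G+C). The sketch below is an illustration only: the source does not state how the annealing temperature was chosen, and the Wallace rule is a coarse approximation intended for short oligonucleotides. The cycling time counts only the three steps listed above and ignores ramping and any initial denaturation or final extension.

```python
def wallace_tm(primer):
    """Wallace rule estimate in degrees Celsius: 2 per A/T plus 4 per G/C."""
    seq = primer.replace(" ", "").upper()
    at = seq.count("A") + seq.count("T")
    gc = seq.count("G") + seq.count("C")
    return 2 * at + 4 * gc


RD1 = "AAC TCG CAC AAC CAT GGA CAG"  # from Table 1
RD2 = "GCT CTA GTA CCT GCA GAA CG"   # from Table 1

# Programmed hold time for 35 cycles of 1 min, 45 s and 2 min.
CYCLES = 35
STEP_SECONDS = (60, 45, 120)
total_minutes = CYCLES * sum(STEP_SECONDS) / 60

print("RD1 Tm estimate:", wallace_tm(RD1))
print("RD2 Tm estimate:", wallace_tm(RD2))
print("Programmed cycling time (min):", total_minutes)
```

With these inputs the estimates are 64 ℃ for RD1 and 62 ℃ for RD2, so an annealing temperature of 55 ℃ lies several degrees below both, and the programmed hold time is 131.25 min.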
1.6 Ligation of PCR products into the vector, cloning and sequencing
The purified PCR products were cloned into the pMD18-T Vector (TaKaRa Biotechnology Co., Ltd., Dalian,
China) and screened by blue-white selection (Wang and Fang, 1998). The clones were confirmed by
restriction digestion and then sent to BioAsia Company (Shanghai, China) for sequence analysis.
2 Results
2.1 N-terminal amino acid sequence determination
Twenty amino acids were determined as follows: GGFDAASVEYPNAPSFAGKY. This sequence has been
submitted to the SWISS-PROT protein knowledgebase under accession number P83592.
2.2 The full-length cDNA
The full-length cDNA sequence was obtained by analysis of the 3′-end and 5′-end sequences of the cDNA
clones. It is 2 308 bp long, consisting of a 5′ untranslated region (UTR) of 1 203 bp, an open reading
frame (ORF) of 960 bp that encodes 320 amino acids, and a 3′ UTR of 145 bp (accession number
AY209894). The N-terminal sequence of the mature γ subunit polypeptide starts at position 72 of the
amino acid sequence deduced from the cDNA sequence, indicating that the first 71 amino acids are
removed, probably by post-translational processing. The presequence is rich in hydroxylated amino acids
such as serine and threonine, similar to those reported by Apt and Grossman (1993) and Apt et al.
(2001). The predicted molecular mass of the mature protein, including the four tetrapyrrole
chromophores, is 29.2 kD, which is close to the observed value (about 30 kD). The predicted pI of the
mature polypeptide is 9.09. The GC content at the third codon position of the coding sequence (62%)
differs markedly from that of ANγ33 in A. neglectum (99%).
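The reported lengths can be cross-checked with simple arithmetic. The sketch below takes the figures in the text at face value (in particular, that the 960 bp ORF corresponds to 320 codons) and derives the length of the mature polypeptide. The variable names are illustrative.

```python
# Lengths reported for the full-length cDNA (bp).
UTR5, ORF, UTR3 = 1203, 960, 145
total = UTR5 + ORF + UTR3          # should equal the reported 2 308 bp

# The ORF length must be a multiple of three to be read in frame.
assert ORF % 3 == 0
codons = ORF // 3                  # 320, matching the reported 320 amino acids

# The mature N-terminus starts at residue 72, so residues 1-71 form the presequence.
MATURE_START = 72
presequence_len = MATURE_START - 1
mature_len = codons - presequence_len

print("Total cDNA length (bp):", total)
print("Presequence length (aa):", presequence_len)
print("Mature polypeptide length (aa):", mature_len)
print("5' UTR longer than ORF:", UTR5 > ORF)
```

The three regions sum to 2 308 bp, and removing the 71-residue presequence from 320 residues leaves a mature polypeptide of 249 residues.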
Table 1 Primer sequences
Name of primer    Sequence of primer (5′→3′)
R-Gsp1            GGNTTT(C)GAT(C)GCNGCNA(T)G(C)NGTNGAG(A)T (N = A, C, G, or T)
R-Gsp2            AAT(C)GCNCCNA(T)G(C)NTTT(C)GCNGGNAAG(A)T (N = A, C, G, or T)
R-Gsp3            CGAGGAAAGTAGCAGCTGAGACGGGCA
R-Gsp4            GCTTCTGTGCCTGACGGAATGCCATCG
RD1               AAC TCG CAC AAC CAT GGA CAG
RD2               GCT CTA GTA CCT GCA GAA CG
The ORF was predicted using ATG as the initiation codon and TAG as the stop codon. A Kozak sequence
(ribosome binding site), ACCATGG, was identified beginning 3 bp upstream of the initiation codon of the
γ subunit. However, neither the polyadenylation signal sequence AATAAA nor an AU-rich element was found
in the 3′ UTR, which suggests that polyadenylation and degradation of algal mRNA may differ from those
in other organisms. It is noteworthy that the 5′ UTR of this cDNA is longer than the coding region.
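Motif searches of this kind amount to exact substring scans. The full cDNA sequence is not reproduced in this article, so the minimal sketch below runs the scan on primer RD1 from Table 1, which was designed from the cDNA and happens to contain the ACCATGG motif. The helper name find_motif is illustrative, and treating the ATG inside RD1 as the initiation codon is an inference from the motif, not a statement made in the text.

```python
def find_motif(seq, motif):
    """Return all 0-based start positions of motif in seq (overlaps allowed)."""
    seq = seq.replace(" ", "").upper()
    hits = []
    start = seq.find(motif)
    while start != -1:
        hits.append(start)
        start = seq.find(motif, start + 1)
    return hits


RD1 = "AAC TCG CAC AAC CAT GGA CAG"  # from Table 1

kozak_hits = find_motif(RD1, "ACCATGG")
atg_hits = find_motif(RD1, "ATG")
polya_hits = find_motif(RD1, "AATAAA")

print("ACCATGG starts at:", kozak_hits)
print("ATG starts at:", atg_hits)
print("AATAAA starts at:", polya_hits)
```

In RD1 the motif begins 3 bases before the ATG, matching the description above. The same function applied to a 3′ UTR sequence would return an empty list when AATAAA is absent.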
2.3 Comparison with known γ PE subunit sequences
The C. officinalis γ subunit shares varying degrees of amino acid sequence homology with other PE γ
subunits, ranging from 6% to 62% (Fig.1). The highest degree of homology is with the previously
characterized A. neglectum PE γ33 polypeptide. The C. officinalis γ subunit has no identifiable
similarity to functionally related polypeptides present in cyanobacterial phycobilisomes or to any
other polypeptides in the databases.
By comparison with the chromophore-binding sites of GCBS, ANγ31, and ANγ33, we identified four
potential chromophore-binding sites at cysteine residues 96 (PUB3), 135 (PUB1), 212 (PEB), and 299
(PUB2) in the COγ polypeptide (Fig.2), where PUB denotes phycourobilin and PEB phycoerythrobilin.
The amino acid sequence derived from the nucleotide sequence, together with the N-terminal sequence of
the mature protein, suggests a presequence of 71 amino acids. Like chloroplast transit peptides of
higher plants, the presequence has a high content of serine, threonine, alanine, valine, arginine, and
lysine (42 out of 71 amino acids) (Schleiff and Soll, 2000). Furthermore, this presequence shows
similarity to the transit peptides of ANγ33 and ANγ31. The COγ subunit may therefore use the same
plastid import signal as the ANγ subunits. However, the COγ transit peptide is much longer than those
of the ANγ subunits (Fig.3). The function of the additional sequence remains to be investigated.
Fig.1. Amino acid sequence comparison between the Corallina officinalis PE γ31 (COγ) and Aglaothamnion
neglectum PE γ33 (ANγ33) (Apt and Grossman, 1993), PE γ31 (ANγ31) (Apt et al., 2001), the partial
sequence of the Rhodella violacea γ33 subunit (RVγ33) (Lichtle et al., 1996), and the N-terminal
(PCγNT) and C-terminal (PCγCT) sequences of a Porphyridium purpureum γ subunit (Sidler, 1994).
Amino acids that are identical between two sequences at a given position are in capital letters.
Interestingly, the mature COγ polypeptide has a large internal repeat like that of ANγ31 (Apt et al.,
2001). The first repeat begins at amino acid 109 and ends at amino acid 202, while the second repeat
begins at amino acid 222 and ends at amino acid 317 (Fig.4). Sequence comparison among the repeat
domains from all γ subunits indicates that the greatest similarity among these sequences lies next to
conserved cysteine, tyrosine, and arginine residues. Since each γ PE subunit contains two domains that
are homologous to each other, Apt et al. (2001) considered that these domains probably resulted from a
gene duplication followed by a fusion event. The differences between the two internal repeat domains
of γ PE subunits may indicate that they have diverged since that event.
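The residue coordinates given in this section can be laid out against one another. The sketch below is a bookkeeping aid, not an analysis from the original study. It uses only the positions stated in the text, labels the two repeats A and B after Fig.4, and takes the 320-residue precursor and 71-residue presequence from section 2.2.

```python
PRECURSOR_LEN = 320
PRESEQUENCE_LEN = 71
MATURE_LEN = PRECURSOR_LEN - PRESEQUENCE_LEN

# Coordinates as stated in the text.
REPEATS = {"A": (109, 202), "B": (222, 317)}
CYS_SITES = {"PUB3": 96, "PUB1": 135, "PEB": 212, "PUB2": 299}


def repeat_of(pos):
    """Name of the repeat containing residue pos, or None if it lies outside both."""
    for name, (start, end) in REPEATS.items():
        if start <= pos <= end:
            return name
    return None


repeat_lengths = {name: end - start + 1 for name, (start, end) in REPEATS.items()}
site_location = {site: repeat_of(pos) for site, pos in CYS_SITES.items()}

# Coordinates above MATURE_LEN cannot be numbered from the mature N-terminus.
uses_precursor_numbering = max(end for _, end in REPEATS.values()) > MATURE_LEN

# Positions renumbered from the mature N-terminus (residue 72 becomes 1).
mature_positions = {site: pos - PRESEQUENCE_LEN for site, pos in CYS_SITES.items()}

print(repeat_lengths)
print(site_location)
print(uses_precursor_numbering)
print(mature_positions)
```

On these figures the repeats are 94 and 96 residues long; PUB1 falls in repeat A and PUB2 in repeat B, while PUB3 and PEB lie outside both. Because residue 317 exceeds the 249-residue length of the mature polypeptide, the stated coordinates appear to be numbered from the start of the precursor.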
2.4 Comparison between different 3′-end sequences of COγ
Based on restriction enzyme mapping of the 3′ RACE products, four different recombinant clones were
sequenced, of which three proved to be correct. A remarkable difference was found upstream of the
poly(A) sequence: compared with COγ3, the 3′-end sequence of clone COγ2 has a 26 bp insertion, while
that of clone COγ4 has an 18 bp insertion (Fig.5). The difference between the 3′-end sequences suggests
that the γ subunit gene may be present in more than one copy, or that its transcripts undergo different
post-transcriptional modifications.
The secondary structures of the different 3′-end sequences were predicted using RNAdraw (Fig.6). A
remarkable difference was found in the size and number of their stem-loops. The predicted secondary
structures of the 3′ UTRs of clones COγ2 and COγ3 have five stem-loops, whereas that of COγ4 has only
four. This difference in secondary structure suggests that the 3′-end sequences may have different
effects on mRNA stability and may therefore affect expression and regulation.
2.5 Analysis of DNA sequence of COγ
The COγ subunit DNA was PCR-amplified with primers RD1 and RD2. Sequence analysis indicates that the
COγ subunit gene contains no introns. This DNA sequence has been submitted to the GenBank database
under accession number AY308999.
3 Discussion
The PE protein of red algae has three distinct subunits, α, β, and γ. The α and β subunit genes have
been well characterized in comparison with the γ subunit gene, and only a few PE γ gene sequences have
been analyzed to date. Until now, the only species for which the full-length sequence of a γ subunit
had been obtained was Aglaothamnion neglectum (Apt and Grossman, 1993; Apt et al., 2001). Here, the
full-length cDNA and an intronless genomic DNA fragment of the γ subunit of R-phycoerythrin from
C. officinalis were cloned and sequenced.
Fig.2. Sequence comparison of the chromophore-binding sites of the Corallina officinalis PE γ31 (COγ),
Aglaothamnion neglectum PE γ33 (ANγ33) (Apt and Grossman, 1993) and PE γ31 (ANγ31) (Apt et al., 2001),
and the chromophore-binding sites characterized for G. coulteri (GCBS) (Klotz and Glazer, 1985).
Identical amino acids at a given position are marked with asterisks.
Fig.3. Transit peptide sequence comparison between the Corallina officinalis PE γ31 (COγ) and
Aglaothamnion neglectum PE γ33 (ANγ33) (Apt and Grossman, 1993) and PE γ31 (ANγ31) (Apt et al., 2001).
Identical amino acids at a given position are marked with asterisks. The mature protein sequences are
underlined.
Fig.4. Amino acid comparison between the domain regions A and B of the PE γ subunits shown in Fig.3.
Identical amino acids at a given position are marked with asterisks.
The presence of the γ subunit transcript in the poly(A)+ RNA fraction confirms that this polypeptide is
nuclear encoded in C. officinalis, which is consistent with previous reports (Apt and Grossman, 1993;
Apt et al., 2001). In red algae, the genes encoding the α and β subunits are located on the plastid
genome (Shivji, 1991). Thus, the synthesis of red algal phycobilisomes requires coordinated expression
of genes located in two different cellular compartments. A 71-amino acid transit peptide is predicted
at the amino terminus of the primary translation product of the C. officinalis γ subunit, and this
transit peptide would probably target the protein to the plastid, because phycobilisomes are located on
the thylakoid membranes. The 71-amino acid presequence has considerable similarity to the transit
peptide of ANγ33, so the two peptides may use the same signal for import into plastids. However, the
transit peptide of the COγ subunit is 31 amino acids longer than that of ANγ33, and it is not clear
what information is contained in the additional 31 amino acids.
The number of γ subunits associated with PE varies from one to three among red algae (Apt and Grossman,
1993), and the molecular mass of the γ subunit is also variable. C. officinalis has only one γ subunit
(COγ), with a molecular mass of about 31 kD. In A. neglectum, there are two different γ subunits, one
with an apparent molecular mass of 31 kD (ANγ31) and the other of 33 kD (ANγ33) (Apt and Grossman,
1993). The two kinds of γ subunits are serologically distinguishable and are spatially separated within
the phycobilisomes (Apt et al., 2001). Although the molecular mass of COγ equals that of ANγ31, the
highest degree of homology is with the ANγ33 polypeptide. Hence, γ subunits cannot be properly
classified on the basis of their sizes.
A number of important questions remain unanswered regarding the γ subunits of red algae. We still do
not understand the evolutionary origins of these subunits, the way in which they interact with the α
and β PE subunits, or the mechanism that coordinates the plastid-encoded α and β PE subunits with the
nuclear-encoded γ PE subunits. The answers will require more sequence data and more detailed
biochemical research.
Acknowledgements: We thank Dr. DUAN Yong-Ping (Institute of Plant Virology, Fujian Agriculture and
Forestry University) and Prof. HU Fang-Ping (Department of Plant Pathology, Fujian Agriculture and
Forestry University) for a thoughtful review of the manuscript.
References:
Apt K E, Grossman A R. 1993. Characterization and transcript
analysis of the major phycobiliprotein subunit genes from
Aglaothamnion neglectum (Rhodophyta). Plant Mol Biol, 21:
27-38.
Apt K E, Hoffman N E, Grossman A R. 1993. The γ subunit of
R-phycoerythrin and its possible mode of transport into the
plastid of red algae. J Biol Chem, 268: 16208-16215.
Fig.5. Comparison between different 3′-end sequences of COγ. COγ2, COγ3, and COγ4 denote the different
recombinant clones. Identical sites at a given position are marked with asterisks.
Fig.6. Comparison of secondary structures between different 3′-end sequences of COγ. COγ2, COγ3, and
COγ4 denote the different recombinant clones.
(Managing editor: ZHAO Li-Hui)
Apt K E, Metzner S, Grossman A R. 2001. The γ subunits of
phycoerythrin from a red alga: position in phycobilisomes
and sequence characterization. J Phycol, 37: 64-70.
Bernard C, Etienne A L, Thomas J C. 1996. Synthesis and binding
of phycoerythrin and its associated linkers to the
phycobilisomes in Rhodella violacea (Rhodophyta): compared
effect of high light and translation inhibitors. J Phycol, 32:
265-271.
Bernard C, Thomas J C, Mazel D, Mousseau A, Castets A M,
Tandeau M N, Dubacq J P. 1992. Characterization of the
genes encoding phycoerythrin in the red alga Rhodella violacea:
evidence for a splitting of the rpeB gene by an intron. Proc
Natl Acad Sci USA, 89: 9564-9568.
Egelhoff T, Grossman A. 1983. Cytoplasmic and chloroplast
synthesis of phycobilisome polypeptides. Proc Natl Acad
Sci USA, 80: 3339-3343.
Glazer A N. 1989. Light guides. J Biol Chem, 264: 1-4.
Kim B K, Chung G H, Fujita Y. 1997. Phycoerythrin-encoding
gene from Porphyra tenera. Plant Physiol, 115: 1287.
Kim B, Fujita Y. 1997. Phycoerythrin encoding gene from
Porphyra yezoensis. Plant Physiol, 113: 1003.
Klotz A V, Glazer A N. 1985. Characterization of the bilin
attachment sites in R-Phycoerythrin. J Biol Chem, 260: 4856-4863.
Lichtle C, Garnier F, Bernard C, Zabulon G, Spliar A, Thomas J,
Etienne A L. 1996. Differential transcription of
phycobiliprotein components in Rhodella violacea. Plant
Physiol, 112: 1045-1054.
Reith M E, Munholland J. 1995. Complete nucleotide sequence
of the Porphyra purpurea chloroplast genome. Plant Mol Biol
Rep, 13: 333-335.
Ritter S, Hiller R G, Wrench P M, Welte W, Diederichs K. 1999.
Crystal structure of a phycourobilin-containing phycoerythrin
at 1.90-Å resolution. J Struct Biol, 126: 86-97.
Roell M K, Morse D E. 1993. Organization, expression and nucleotide
sequence of the operon encoding R-phycoerythrin alpha
and beta subunits from the red alga Polysiphonia boldii.
Plant Mol Biol, 21: 47-58.
Schleiff E, Soll J. 2000. Traveling of protein through membranes:
translocation into chloroplast. Planta, 211:449-456.
Shivji M S. 1991. Organization of the chloroplast genome in the
red alga Porphyra yezoensis. Curr Genet, 19: 49-54.
Sidler W A. 1994. Phycobilisome and phycobiliprotein structures.
Bryant D A. The Molecular Biology of Cyanobacteria.
Dordrecht: Kluwer Academic Publishers. 139-216.
Sui Z-H, Zhang X-C. 2000. Cloning and analysis of phycoerythrin
genes in Gracilaria lemaneiformis (Rhodophyceae). Chin
J Oceanol Limnol, 18: 42-46.
Thomas J C, Passaquet C. 1999. Characterization of a phycoerythrin
without α-subunits from a unicellular red alga. J Biol
Chem, 274: 2472-2482.
Wang G-L, Fang H-Y. 1998. Principle and Technique of Plant
Genetic Engineering. Beijing: Science Press. 552. (in Chinese)
Wang S, Wu Z-J, Lin Q-Y, Xie L-H. 2002. Isolation and purification
of phycoerythrin from Corallina officinalis and its spectrum
characteristics. J Fujian Agr Forest Univ (Nat Sci), 31:
495-499. (in Chinese with English abstract)