Comparative Genomics of Phylogenetically Diverse Unicellular
Eukaryotes Provide New Insights into the Genetic Basis
for the Evolution of the Programmed Cell Death Machinery
Aurora M. Nedelcu
Received: 20 August 2008 / Accepted: 12 January 2009 / Published online: 10 February 2009
© Springer Science+Business Media, LLC 2009
J Mol Evol (2009) 68:256-268

Abstract Programmed cell death (PCD) represents a significant component of normal growth and development in multicellular organisms. Recently, PCD-like processes have been reported in single-celled eukaryotes, implying that some components of the PCD machinery existed early in eukaryotic evolution. This study provides a comparative analysis of PCD-related sequences across more than 50 unicellular genera from four eukaryotic supergroups: Unikonts, Excavata, Chromalveolata, and Plantae. A complex set of PCD-related sequences that correspond to domains or proteins associated with all main functional classes (from ligands and receptors to executors of PCD) was found in many unicellular lineages. Several PCD domains and proteins previously thought to be restricted to animals or land plants are also present in unicellular species. Notably, the yeast Saccharomyces cerevisiae, used as an experimental model system for PCD research, has a rather reduced set of PCD-related sequences relative to other unicellular species. The phylogenetic distribution of the PCD-related sequences identified in unicellular lineages suggests that the genetic basis for the evolution of the complex PCD machinery present in extant multicellular lineages was established early in the evolution of eukaryotes. The shaping of the PCD machinery in multicellular lineages involved the duplication, co-option, recruitment, and shuffling of domains already present in their unicellular ancestors.

Keywords Unicellular eukaryotes · Apoptosis · Programmed cell death · Metacaspase · Domain shuffling · Domain recruitment · Lateral gene transfer

Electronic supplementary material The online version of this article (doi:10.1007/s00239-009-9201-1) contains supplementary material, which is available to authorized users.

A. M. Nedelcu
Department of Biology, University of New Brunswick, P.O. Box 4400, Fredericton, NB, Canada E3B 5A3
e-mail: abpy2r@r.postjobfree.com

Introduction

Programmed cell death (PCD), an active process resulting in the controlled elimination of unwanted or damaged cells, has long been recognized as a significant component of normal growth and development in multicellular organisms, both animals and plants (e.g., Jacobson et al. 1997; Lam 2004). Recently, PCD-like processes (i.e., diagnostic features such as protoplast shrinking, accumulation of reactive oxygen species, DNA-laddering, externalization of phosphatidylserine, caspase-like activity) have been reported in several unicellular groups, including dinoflagellates, green algae, diatoms, yeasts, kinetoplastids, apicomplexans, and amoebozoans (e.g., Cornillon et al. 1994; Ameisen et al. 1995; Madeo et al. 1999; Vardi et al. 1999; Al-Olayan et al. 2002; Arnoult et al. 2002; Segovia et al. 2003; Bidle and Falkowski 2004; Nedelcu 2006; Moharikar et al. 2006; Bidle et al. 2007; Zuppini et al. 2007; Deponte 2008a; Bidle and Bender 2008), suggesting that some components of the PCD machinery existed early in the evolution of the eukaryotic lineage. Nevertheless, the mechanistic basis for PCD in single-celled organisms and the evolutionary relationships between these PCD-like processes and the better understood forms of PCD in multicellular lineages are still to be deciphered. Several homologues of genes involved in the most studied form of animal PCD, apoptosis, have been identified in unicellular lineages, and their involvement in PCD-like processes addressed (e.g., Madeo et al. 2002; Fahrenkrog et al. 2004; Wissing et al. 2004; Walter et al.
2006; Buettner et al. 2007; Moharikar et al. 2007), yet many others are reportedly missing (Koonin and Aravind 2002).

The increasing availability of genome sequences from various eukaryotic groups provides an opportunity (i) to explore the degree of conservation of the PCD machinery across evolutionarily distant lineages, (ii) to infer which elements might have been present early in evolution, and (iii) to investigate the evolutionary processes (e.g., gene duplication and diversification, co-option, loss or replacement, domain recruitment and shuffling, lateral gene transfer) responsible for the shaping of the PCD machinery in specific lineages. Several recent studies have addressed the early evolution of the PCD machinery and the potential bacterial origin of some of the genes involved in PCD (e.g., Aravind et al. 1999; Koonin and Aravind 2002). However, these studies were based on information from a very limited number of unicellular lineages (mostly yeast and several unicellular lineages thought to be early-branching eukaryotes) and suggested that unicellular lineages possess a very limited PCD-related gene toolkit. For instance, of the 33 domains and proteins involved in apoptosis and related pathways investigated by Koonin and Aravind (2002), only 13 are indicated as having potential homologues in unicellular lineages, either eukaryotes or prokaryotes. Furthermore, because some PCD-related sequences appeared to be missing in unicellular eukaryotes but had potential homologues in prokaryotes, it was proposed that an influx of bacterial genes occurred in the multicellular ancestor of the eukaryotic crown group (note that the term "crown group" is obsolete; current evidence supports the independent evolution of multicellular fungi, plants, and animals from distinct unicellular ancestors; see, e.g., Embley and Martin 2006).

To address the issues discussed above, this study (i) provides a comparative analysis of PCD-related sequences [employing a domain-centered approach (Aravind et al. 2001)] from phylogenetically diverse unicellular lineages and (ii) indicates potential mechanisms involved in the early evolution of the eukaryotic PCD machinery. Although our understanding of the eukaryotic tree has improved greatly in the last decade, the exact relationships among the major eukaryotic groups (in particular, those including unicellular lineages) are uncertain; in addition, the monophyly of some of these groups as well as the root of the eukaryotic tree are still debated (e.g., Keeling et al. 2005; Embley and Martin 2006; Yoon et al. 2008). Five or six major eukaryotic supergroups that diverged from each other early in the evolution of eukaryotes are recognized to date: the Unikonts (i.e., Opisthokonta and Amoebozoa), Chromalveolata, Plantae, Rhizaria, and Excavata, the latter four groups also being known as Bikonts (Cavalier-Smith 2002; Stechmann and Cavalier-Smith 2003; Keeling et al. 2005).

As many as 37 PCD-related sequences have been identified in unicellular species from at least one of the four eukaryotic supergroups investigated in this study (i.e., Unikonts, Excavates, Chromalveolates, and Plantae; at this time, genomic information from the Rhizaria is not available). The phylogenetic distribution of these sequences suggests that the potential (i.e., the genetic basis) for the evolution of the complex PCD machinery present in multicellular lineages was established early in the evolution of eukaryotes. Compared to their counterparts in multicellular lineages, many PCD-related domains in single-celled eukaryotes are found in single-domain proteins or in unique domain combinations, indicating that the early shaping of the PCD machinery in multicellular lineages involved the duplication, co-option, recruitment, and shuffling of domains already present in their unicellular ancestors.

Methods

Several protein databases (e.g., Interpro http://www.ebi.ac.uk/interpro/; Pfam http://www.sanger.ac.uk/Software/Pfam/; Prosite http://www.expasy.org/prosite/; Uniprot http://www.pir.uniprot.org/; Superfamily http://supfam.cs.bris.ac.uk/), as well as genome and EST databases (Joint Genome Institute http://www.jgi.doe.gov/; NCBI http://www.ncbi.nlm.nih.gov/; Protist EST Program http://amoebidia.bcm.umontreal.ca/pepdb/), were searched for PCD-related sequences [in particular, the "domains of death" described and used in previous comparative analyses (Aravind et al. 1999, 2001; Koonin and Aravind 2002)]. Initial searches employed (i) text searches using PCD-related keywords and Interpro/Pfam accession numbers corresponding to PCD-related domains, and (ii) Blast searches [tblastn, blastp, psi-Blast (Altschul et al. 1990, 1997)] using sequences from the closest species as queries. Gene and protein sequences retrieved in this manner were checked for the presence of the corresponding PCD-specific domains using SMART, Pfam, and InterProScan (http://smart.embl-heidelberg.de/; http://www.sanger.ac.uk/Software/Pfam/search.shtml; http://www.ebi.ac.uk/InterProScan/); only sequences with domains confidently predicted [using the default cutoffs specific for each domain; see http://www.ebi.ac.uk/interpro/documentation.html and Schultz et al. (1998)] were included in this study. Sequences were aligned with Muscle [http://www.drive5.com/muscle/ (Edgar 2004)]. Phylogenetic analyses (gaps and unalignable regions excluded) were performed using MrBayes v3.0B4 (http://mrbayes.csit.fsu.edu/; mixed amino acid model; 3,500,000 generations; 100 sample frequency; 5000 burn-in) and PhyML (http://atgc.lirmm.fr/phyml/; 200 replicates; four-category gamma distribution; proportion of variable sites
estimated from the data; best-fit amino acid model indicated by ProtTest).

Results and Discussion

A Complex Set of PCD-Related Sequences in Phylogenetically Diverse Unicellular Lineages

Several protein as well as genome and EST databases (see Supplementary Table 1) have been searched for domains and proteins known to be associated with PCD in animals and/or land plants (see Methods). As genomic information for many unicellular groups is still limited, the inability to detect a particular PCD-related protein or domain in the available sequence data cannot be taken, at this time, as indicating that the sequence is absent in that group. On the other hand, because most of the PCD-related domains included in Fig. 1 were inferred using domain prediction tools (see Methods), they require functional confirmation.

Nevertheless, many PCD-related sequences were found in more than 50 unicellular genera from four eukaryotic supergroups: Unikonts (amoebozoans, choanoflagellates, fungi), Chromalveolata (cryptomonads, pelagophytes, oomycetes, diatoms, haptophytes, ciliates, dinoflagellates, apicomplexans), Plantae (glaucophytes, red algae, green algae), and Excavata (kinetoplastids, euglenoids, jakobids, diplomonads, trichomonads, heteroloboseans). Figure 1 provides a list of 37 PCD-related domains [i.e., "domains of death" (Koonin and Aravind 2002)] and proteins associated with all main functional classes (from ligands and receptors to executors of PCD) found in unicellular species from at least one eukaryotic supergroup; a succinct discussion of their role in PCD and their phylogenetic distribution is provided later in this section.

Overall, among the unicellular lineages investigated in this study, the choanoflagellates, Amoebozoa, and Excavata appear to possess the largest number of PCD-related sequences (Fig. 1). Of the unicellular species with an available genome sequence, the choanoflagellate Monosiga brevicollis, considered to be a close relative of Metazoa (King 2004), the amoebozoan Dictyostelium discoideum, the excavate Naegleria gruberi, and the green alga Chlamydomonas reinhardtii have the largest PCD gene complements.

Notably, the yeast Saccharomyces cerevisiae, used as a model system for PCD research, has a rather reduced set of PCD-related sequences relative to other unicellular lineages. This finding is consistent with earlier reports (Aravind et al. 2000) that ca. 300 genes, most of which belong to functionally connected groups, have been lost (and ca. 300 other genes have diverged beyond recognition) in the lineage leading to S. cerevisiae (a hemiascomycete/budding yeast) after its divergence from the lineage leading to Schizosaccharomyces pombe (an archiascomycete/fission yeast). Losses of sets of genes in some lineages can be understood in terms of lineage-specific differences in their biology and/or ecology; it is possible that some of the PCD-related sequences missing in S. cerevisiae (and other budding yeasts) were involved in pathways that have been lost (or reshaped) during this lineage's adaptation to its unique lifestyle and/or mode of growth and reproduction.

The finding that the closest unicellular relatives of multicellular animals and plants, the choanoflagellates and the green algae, respectively, have such a complex PCD-related set of sequences (including some sequences thought to be restricted to animals or plants; see discussion in next sections) suggests that the evolution of the complex PCD machinery known in multicellular lineages involved the co-option of sequences already present in their unicellular ancestors. Because (i) genomic information from many unicellular eukaryotic groups is still limited, and (ii) the relationships among the four eukaryotic supergroups as well as the monophyly of some of the major eukaryotic groups are still debated (e.g., Keeling et al. 2005; Yoon et al. 2008), the eukaryotic ancestral set of PCD-related sequences cannot be inferred at this time. However, based on the available information, several conclusions can be drawn.

Of the 37 entries in Fig. 1, as many as 23 PCD-related sequences appear to be shared by all four eukaryotic supergroups and, thus, are likely to have been present in their last common ancestor. In addition, eight other PCD-related sequences are shared by three of the four supergroups. These include the BAG and API5 domains shared by Unikonts, Excavates, and Plantae, to the exclusion of Chromalveolates; the NB-ARC, NACHT, and MDM35 domains shared by Unikonts, Chromalveolates, and Plantae, to the exclusion of Excavates; and the DEATH, DED, and sestrin domains shared by Unikonts, Excavates, and Chromalveolates, to the exclusion of Plantae. Thus, depending on the phylogenetic relationships among and within these four eukaryotic supergroups, and in the absence of lateral gene transfer, between 23 and 31 PCD-related sequences can be hypothesized to have been present in their last common ancestor. For instance, if the root of the eukaryotic tree were between the Unikonts and the Bikonts, as proposed by Stechmann and Cavalier-Smith (2003), the 31 sequences that are shared between Unikonts and Bikonts would be hypothesized to have been present in the last common ancestor of eukaryotes.

The remaining six PCD-related sequences included in Fig. 1 appear to have specifically evolved in the Unikont or Plantae lineages, from sequences already present in their unicellular ancestors. These include (i) the tumor suppressor p53, to date reported only in animals, choanoflagellates, and the amoebozoan Entamoeba histolytica (Mendoza et al.
2003; Nedelcu and Tan 2007; see discussion below); (ii) the programmed cell death protein 10 (PDCD10), found only in animals and choanoflagellates; (iii) the paracaspases, present only in animals and amoebozoans, and the CARD domain, present only in animals and possibly Amoebozoa; and (iv) type II metacaspase and a family of plant-specific cell death inhibitors, the Mlo family, found only in plants and their algal relatives (Fig. 1).

Fig. 1 Comparative analysis of PCD-related domains and proteins (in italics) across four eukaryotic supergroups: Unikonts, Excavata, Chromalveolata, and Plantae (see Supplementary Table 1 for species). Only sequences identified in at least one unicellular lineage are included. CF, choanoflagellates; Sc, Saccharomyces cerevisiae; Apicompl., Apicomplexa; RA, red algae; GA, green algae. Numbers in brackets indicate the number of PCD-related domains or proteins (of the total 37 PCD-related sequences included in this analysis) identified in each lineage. A question mark denotes cases in which the finding of a domain/protein is restricted to one instance (although sequence information from several species is available), thus allowing for the possibility of a prediction artifact, contamination, or lateral gene transfer event; asterisks indicate known cases of lateral gene transfer from the Plantae lineage (see text for discussion). See text for full names, descriptions, and references to Interpro accession numbers for each domain or protein

Several additional proteins that are involved in PCD but function in other conserved vital cellular activities as well, and thus are ubiquitous among eukaryotic lineages (Ekert and Vaux 2005; Modjtahedi et al. 2006), were not incorporated in this analysis; these include, for instance, Bit1, a Bcl-2 inhibitor of transcription/precursor of mitochondrial peptidyl-tRNA hydrolase 2; PIG3, a p53-induced gene/proline oxidase; Beclin1, a Bcl-2 interacting protein/autophagy protein; PDCD11, programmed cell death protein 11/ribosomal RNA biogenesis protein; and several mitochondrial proteins that are also involved in bioenergetic and redox metabolism, such as cytochrome c, the apoptosis inducing factor (a FAD-dependent oxidoreductase), and the components of the permeability transition pore complex. Bax Inhibitor-1 (BI-1), a cell death suppressor in animals and plants, might also be added to this list, as BI-1-like sequences (though without a canonical BI-1 domain) have been found in many unicellular lineages [e.g., yeast, green algae, amoebozoans, apicomplexans (Huckelhoven 2004)], and at least the yeast BI-1 is able to block Bax-induced cell death (Chae et al. 2003).

Ligands, Receptors, and Adaptors

In mammals, apoptosis can be induced via the activation of death-inducing signaling complexes at the plasma membrane. These include: ligands (e.g., the tumor necrosis factor, TNF); death receptors, such as Fas, TNFR1, and TNFR2 (which contain multiple copies of a cysteine-rich extracellular domain, TNFR, and an intracellular Death domain); and adaptors, such as TRAF, MATH, CARD, DED, DD, TIR, TRADD, and FADD.

TNF and TNFR-like proteins are thought to be specific to metazoans (Koonin and Aravind 2002). Nevertheless, putative TNF-like domains or signatures (IPR008983) and TNFR/NGFR cysteine-rich region signatures (IPR001368) have been predicted by Superfamily or Prosite in several unicellular lineages (e.g., the excavates Giardia lamblia, Trichomonas vaginalis, Leishmania spp.; the amoebozoans D. discoideum and Entamoeba dispar; the choanoflagellate M. brevicollis; the ciliates Paramecium tetraurelia and Tetrahymena thermophila; the apicomplexan Theileria parva; the oomycetes Phytophthora spp.; the green algae Ostreococcus spp., Chlorella sp., and C. reinhardtii; and the red alga Cyanidioschyzon merolae) (Fig. 1).

TRAFs (TNF receptor-associated factors) are adaptor proteins that interact with TNF receptors; they comprise three structural domains: a RING-type Zn finger (IPR001841), one to seven TRAF-type zinc fingers (Znf_TRAF; IPR001293), and a MATH (Meprin and TRAF homology; IPR002083) domain. Although legitimate TRAF proteins (i.e., possessing a recognizable TRAF domain; IPR012227) could not be found outside Metazoa, TRAF zinc finger and MATH domains were predicted by Pfam in multiple instances in all four eukaryotic supergroups in Fig. 1 (in some cases together, e.g., in fungi and in the amoebozoan D. discoideum, and even coupled with a RING domain in D. discoideum and the excavates Leishmania spp.).

Because of their sequence similarity and similar α-helical fold, it was suggested that three of the other adaptors found only in animal apoptosis proteins, CARD (caspase recruitment domain; IPR001315), DED (death effector domain; IPR001875), and DD (death domain; IPR000488), have evolved from a common ancestor before the divergence of the extant animal lineages (Aravind et al. 2001). Notably, proteins containing putative CARD (IPR001315), DED (IPR001875), or Death (IPR000488) domains were found in unicellular taxa from four of the five eukaryotic supergroups (Fig. 1), suggesting a possible earlier origin and diversification for this family. These include putative CARD domains predicted by Prosite in the amoebozoan E. histolytica and the excavate Leishmania major; putative Death domains predicted by Prosite and ProfileScan in the ciliate P. tetraurelia, the excavates L. major and Trypanosoma cruzi, and the oomycete P. ramorum; and putative DED domains, predicted by Prosite and ProfileScan, in the ciliates T. thermophila and P. tetraurelia and the excavates T. vaginalis and Naegleria gruberi.

Similarly, as the TIR (Toll/IL-1R homologous region; IPR000157) domain was not detected in fungi or early-branching eukaryotes, it was hypothesized that TIR could have been acquired either from the mitochondrial precursor (and later lost in multiple eukaryotic lineages) or through lateral gene transfer from bacteria to the multicellular ancestor of the crown eukaryotes (Koonin and Aravind 2002). Nevertheless, putative TIR domains are predicted by Pfam, Prosite, or Smart in several unicellular lineages, including the excavate T. vaginalis, the ciliate P. tetraurelia, the apicomplexan P. falciparum, the choanoflagellate M. brevicollis, the amoebozoans Dictyostelium spp., the oomycetes Phytophthora spp., the green algae C. reinhardtii and Micromonas spp., and two chromalveolate species that belong to lineages not included in Fig. 1 (i.e., the haptophyte Emiliania huxleyi and the pelagophyte Aureococcus anophagefferens). The identification of TIR-containing proteins among unicellular lineages from phylogenetically diverse groups (Fig. 1) is consistent with an early acquisition for this domain. Last, although not associated with CARD and/or DD domains as in metazoans, the NB-ARC domain (IPR002182), a signaling motif shared by plant resistance gene products and regulators of cell death in animals (van der Biezen and Jones 1998), was also
predicted by Pfam in several fungal and chromalveolate (i.e., the diatoms Thalassiosira pseudonana and Phaeodactylum tricornutum, the pelagophyte A. anophagefferens, and the haptophyte E. huxleyi) proteins (Fig. 1).

PCD Regulators: Proapoptotic

Cell death is controlled by many regulators, which either have an inhibitory effect on PCD (antiapoptotic) or block the protective effect of inhibitors (proapoptotic). Bcl-2 is a family that contains both pro (e.g., Bax)- and anti (e.g., Bcl-2)-apoptotic proteins with BH1, BH2, BH3, or BH4 motifs. Notably, while Bcl-2-like proteins have not been found outside Metazoa, BH2 and BH3 motif signatures (PS01258, PS01259) are predicted by Prosite in several fungi, dinoflagellates (Gonyaulax polyedra, Alexandrium spp., Pyrocystis spp.), excavates, and plants.

Similarly, the NACHT (NAIP, CIIA, HET-E and TPI) domain (IPR007111) is a nucleoside triphosphatase (NTPase) domain found in animal apoptosis proteins (both antiapoptotic, the neuronal apoptosis inhibitor protein, NAIP; and proapoptotic, CARD4) as well as in a protein, HET-E, responsible for vegetative incompatibility (a form of PCD) in the fungus Podospora anserina (Koonin and Aravind 2000). The presence of NACHT domains in PCD-related proteins in both animals and fungi was seen as evidence for an ancient role of NACHT in PCD, preceding the radiation of animals and fungi (Koonin and Aravind 2000). Interestingly, putative NACHT domains (associated with WD40 repeats, as in nematode proteins) have been predicted by Prosite in ciliates, amoebozoans, and green algae (Fig. 1), suggesting that the NACHT domain could be even older than previously proposed, possibly preceding the Chromalveolata/Plantae/Unikont divergence. Notably, NACHT NTPases are a sister group of another family of ATPases, the AP-ATPases, which include the human apoptotic effector APAF-1 and numerous plant proteins involved in stress and disease responses (Koonin and Aravind 2002).

Another group of pro-apoptotic regulators consists of the mammalian CAS (cellular apoptosis susceptibility; IPR005043) proteins, which are homologous to the yeast chromosome-segregation protein, CSE1 (Brinkmann et al. 1995); they are involved in both cellular apoptosis and proliferation, presumably by facilitating the nuclear import of proteins (such as p53 and other transcription factors) (Brinkmann et al. 1995). A conserved function for these proteins is supported by their presence in all four eukaryotic supergroups in Fig. 1. Likewise, GRIM-19 (gene associated with retinoic-interferon-induced mortality 19; IPR009346), described as a death regulator that interacts with Stat3 (a transcription factor with important roles in cell growth and antiapoptosis in humans) (Lufei et al. 2003), was found to be widely distributed across eukaryotes, though apparently missing in yeast (Fig. 1).

Several programmed cell death (PDCD) proteins have also been shown to be expressed or up-regulated during apoptosis in animals. Programmed cell death protein 2 (PDCD2) is expressed during apoptosis of lymphoid and myeloid cells and thus may play an important role in cell death and/or in regulation of cell proliferation (Vaux and Hacker 1995). PDCD2 proteins contain a PDCD2_C terminal domain (IPR007320) and a zinc finger, MYND-type (IPR002893). Interestingly, proteins containing the PDCD-2 C terminal domain were found in all four eukaryotic supergroups (Fig. 1). However, in most unicellular lineages, only the PDCD-2 C terminal domain is present in these proteins (for exceptions, see discussion below).

Another programmed cell death protein, PDCD5 or TFAR19 (TF-1 cell apoptosis-related gene 19 protein), was shown to be up-regulated in tumor cells undergoing apoptosis (Liu et al. 1999). Notably, the DNA-binding TFAR19 domain (IPR002836) was found in lineages from all four major eukaryotic groups in Fig. 1. PDCD protein 6, or ALG-2 (apoptosis-linked gene 2), is a calcium-binding protein of the penta-EF-hand family that is also essential for the execution of apoptosis (Jung et al. 2001; Krebs et al. 2002); Alix/AIP1 (ALG-2-interacting protein X/apoptosis-linked gene 2-interacting protein 1), an adaptor protein that contains a BRO1 domain, can bind to ALG-2 and regulate caspase-dependent and -independent cell death (Sadoul 2006). ALG2-like proteins and proteins containing the BRO1 domain (IPR004328) were found in all major eukaryotic groups (Fig. 1). In contrast, PDCD protein 10 (PDCD10 or TFAR15; IPR009652), of unknown function, was found only in metazoans and their unicellular relatives, the choanoflagellates (Fig. 1).

Finally, LSD1 is a putative Zn finger (IPR005735) thought to play a role in the regulation of transcription (via either repression of a prodeath pathway or activation of an antideath pathway) in response to signals emanating from cells undergoing pathogen-induced hypersensitive cell death (a form of PCD) in plants (Lam 2004). Although previously thought to be specific to land plants, proteins containing one or more LSD1 Zn fingers were also found in excavates, ciliates, green algae, and choanoflagellates (for the latter, see discussion below).

PCD Regulators: Antiapoptotic

Among the many described antiapoptotic regulators, the defender against death (DAD) proteins can cause apoptosis if mutated (Nakashima et al. 1993). Proteins with a putative DAD domain (IPR003038) were predicted by Pfam in unicellular lineages from all four eukaryotic supergroups (Fig. 1). Notably, the dad1 homologue in C. reinhardtii
restricted to metazoans, two tumor suppressor p53-like
was recently shown to be down-regulated with the onset of
sequences were found in the choano agellate, M. brevi-
PCD in UV-exposed cells (Moharikar et al. 2007). Another
collis (Nedelcu and Tan 2007). As one of the two M.
rather conserved antiapoptotic factor is the apoptosis
brevicollis p53-like sequences contains a SAM domain
antagonizing transcription factor (AATF) a protein that
(IPR001660) which is associated with the p63/73 mem-
contains a Traub domain (IPR012617), which also appears
bers of the p53 family (Yang et al. 2002) these ndings
to be widely distributed across eukaryotes (Fig. 1).
suggest an early duplication and diversi cation of this gene
Several other proteins that act as antiapoptotic regulators
family, before the evolution of Metazoa (Nedelcu and Tan
are known. The inhibitors of apoptosis proteins (IAPs) are
2007). Furthermore, a diverged p53-like sequence has also
a family of polypeptides that contain the BIR domain
been reported in E. histolytica (Mendoza et al. 2003),
(baculovirus inhibitor of apoptosis protein repeat; or pro-
teinase inhibitor I32, inhibitor of apoptosis IPR001370). tracing back the origin of this important tumor suppressor
Although initially described in metazoans, putative BIR family to Amoebozoa. Noteworthily, although no p53-like
domains were also predicted by Pfam and Smart in fungi sequences have been identi ed outside unikonts, p53-like-
(including yeast), choano agellates, ciliates, excavates mediated responses have been reported in green algae
(N. gruberi), and apicomplexans (Plasmodium spp.) (Nedelcu 2006), and homologues of several p53-induced
(Fig. 1). Likewise, BAG proteins also have antiapoptotic genes are found in many unicellular lineages (see below).
activity, by increasing the anti-cell-death function of Bcl-2 In multicellular organisms, PCD and cell cycle regula-
(Doong et al. 2002). Notably, while Bcl-2 proteins have not tion re ect the two opposing options faced by a cell during
been identi ed outside metazoans, putative BAG domains development: death and proliferation (Aravind et al. 1999).
(IPR003103) are predicted by Pfam in plants, fungi (including yeast),
excavates (N. gruberi), and green algae, and by Prosite in the amoeba
E. dispar (Fig. 1).

The apoptosis inhibitory protein 5 (API5) is an additional antiapoptotic
factor, which in humans prevents PCD induced by the deprivation of growth
factors (Tewari et al. 1997). Interestingly, API5 domains (IPR008383) were
also predicted by Pfam in plants, as well as in several unicellular lineages
from the Excavata and Unikonts groups (Fig. 1).

Similarly, A20 is known as an inhibitor of cell death in animals (DeValck
et al. 1996); its N-terminal half interacts with the conserved C-terminal
TRAF domain of TRAF1 and TRAF2, while its C-terminal domain mediates
inhibition of NF-κB activation (Song et al. 1996). Putative A20-type zinc
fingers (IPR002653) were also found (alone or in association with another
zinc finger, AN1; IPR000058) not only in animals, but also in plants and
among unicellular lineages from all four eukaryotic groups (Fig. 1).

Finally, several plant-specific cell death inhibitors are known. The Mlo
family includes integral membrane proteins whose deficiency is thought to
lower the threshold required to trigger the cascade of events that result in
plant cell death (Devoto et al. 1999; Kim et al. 2002). While not reported in
animals, fungi, excavates, and chromalveolates, proteins with predicted Mlo
domains (IPR004326) were, nevertheless, found in unicellular green algae
(Fig. 1).

Nuclear Factors

The tumor suppressor, p53, is a transcription factor that plays the leading
role in malignancy and in maintaining the genome's integrity and stability,
by orchestrating various responses to DNA damage, including cell cycle arrest
and PCD (Helton and Chen 2007). Although believed to be

In animals, in addition to p53, this decision is mediated by the
transcription factors E2F-1/DP-1 and the retinoblastoma (Rb) protein, with
the latter being antiapoptotic and sequestering the former. Until recently,
the transcription factors that link cell cycle control to apoptosis were
thought to be restricted to animals (Aravind et al. 1999). However, although
missing in yeast, the E2F_TDP domain (IPR003316) was identified in many
unicellular lineages (Fig. 1). Likewise, the two domains associated with
retinoblastoma-like and retinoblastoma-associated proteins (IPR002719 and
IPR002720), also missing in yeast, were nevertheless found in unicellular
groups (Fig. 1).

p53-Induced Genes

Although p53 homologues have only been reported in two unicellular lineages
(discussed above), several genes that are known to be p53 targets in animals
are found in many unicellular lineages. For instance, the human p53CSV (a
member of the MDM35 family: mitochondrial distribution and morphology family
35) is a transcriptional target for p53 that mediates cell survival in
response to genotoxic stress by inhibiting the activation of procaspase-3 and
-9 (Park and Nakamura 2005). In addition to the MDM35 protein reported in
yeast [which is essential for maintenance of normal mitochondrial
distribution and morphology (Dimmer et al. 2002)], proteins with putative
MDM35 domains (IPR007918) were also predicted by Pfam in apicomplexans,
amoebozoans, and choanoflagellates (Fig. 1). Similarly, the LPS-induced tumor
necrosis factor α factor (LITAF) is known as a p53 target (p53-induced gene
7, or PIG7) in mammalian cells following treatment with lipopolysaccharide,
and proteins with a LITAF domain (IPR006629) were predicted in several
unicellular lineages (Fig. 1).
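
As a hedged illustration, the yeast-related observations reported above for
the nuclear factors and p53-induced genes can be written down as a small
lookup table. The Python sketch below is not part of the original analysis:
the record type, field names, and helper function are hypothetical, the
labels are informal rather than official InterPro names, and the entries only
restate what the text says (the E2F_TDP domain and the two
retinoblastoma-associated domains are missing in yeast, whereas an MDM35
protein is reported in yeast).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DomainRecord:
    """One domain observation transcribed from the text (hypothetical type)."""
    interpro_id: str
    label: str               # informal label, not an official InterPro name
    reported_in_yeast: bool  # True if the text reports the domain in yeast


# Entries restate the text only; no additional data are assumed.
RECORDS = [
    DomainRecord("IPR003316", "E2F_TDP", reported_in_yeast=False),
    DomainRecord("IPR002719", "retinoblastoma-associated", reported_in_yeast=False),
    DomainRecord("IPR002720", "retinoblastoma-associated", reported_in_yeast=False),
    DomainRecord("IPR007918", "MDM35", reported_in_yeast=True),
]


def missing_in_yeast(records):
    """Return sorted InterPro IDs of domains the text describes as missing in yeast."""
    return sorted(r.interpro_id for r in records if not r.reported_in_yeast)


if __name__ == "__main__":
    print(missing_in_yeast(RECORDS))
```

Under these assumptions, the helper lists the InterPro entries that the text
describes as missing in yeast yet present in other unicellular lineages.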
Sestrin (PA26 p53-induced protein; IPR006730) was described as a novel p53
target gene, differentially induced by genotoxic stress (UV, γ-irradiation,
and cytotoxic drugs) in a p53-dependent manner (Dimmer et al. 2002);
interestingly, although apparently missing in yeast, proteins with predicted
sestrin domains were found in several unicellular lineages from three
eukaryotic supergroups (Fig. 1). Likewise, PIG8 (p53-induced gene 8; a.k.a.
EI24, etoposide-induced 2.4) is induced by p53 in cells treated with the
cytotoxic drug etoposide (Lehar et al. 1996). Notably, putative EI24 domains
(IPR009890) were predicted in fungi (though missing in yeast), land plants,
and many unicellular lineages (Fig. 1), and an EI24-like protein appears to
be induced during PCD in green algae (Nedelcu 2006).

Fig. 2 a Partial alignment of representative type I and type II metacaspase
predicted sequences from red algae (Porphyra yezoensis; Py), green algae
(Chlamydomonas reinhardtii, Cr; Volvox carteri, Vc), vascular plants
(Arabidopsis thaliana; At), excavates (Trypanosoma cruzi, Tc; Leishmania
braziliensis, Lb), diatoms (Thalassiosira pseudonana, Tp; Phaeodactylum
tricornutum, Pt), haptophytes (Emiliania huxleyi; Eh), pelagophytes
(Aureococcus anophagefferens; Aa), and yeasts (Schizosaccharomyces pombe, Sp;
Saccharomyces cerevisiae, Sc), showing the conservation of the
cysteine-histidine dyad and the insertion characteristic of plant type II
metacaspases (for more sequences and a full alignment see Supplementary
Fig. 1). Numbers following species abbreviations are Uniprot IDs, if composed
of both letters and numbers, or JGI IDs, if consisting of only numbers; the
Porphyra yezoensis cluster is based on several GenBank overlapping ESTs
(AU189679, AU189520, AU186857, AU188368, AU194902, AV433034). b Bayesian
analysis (58 taxa; 122 sites; numbers represent posterior probability
distributions of trees) of selected type I and II metacaspases from Plantae
(red algae, in red; green algae, in dark green; plants, in light green),
Chromalveolata (diatoms, in purple; haptophytes, in orange; pelagophytes, in
pink), Excavata (in blue), and Unikonts (fungi, in brown). Maximum likelihood
analyses predict similar relationships (bootstrap values for key nodes are
indicated in italics, below the posterior probability values)

Executors
The essential executors in metazoan apoptosis are caspases, a class of
cysteine proteases that catalyze peptide bond cleavage at aspartyl residues
in their substrates. While homologues of caspases have not been found outside
Metazoa, two related cysteine protease families have been described
previously: paracaspases, present in metazoans and the amoebozoan
Dictyostelium, and metacaspases, reported in plants, fungi, and some
protozoans (Uren et al. 2000). Metacaspases share with caspases the presence
of a conserved catalytic dyad composed of a cysteine and a histidine residue
[although several exceptions have been reported (Mottram et al. 2003)].
However, in contrast to caspases, which are specific for acidic residues,
metacaspases appear to prefer basic residues (Gonzalez et al. 2007;
Vercammen et al. 2007; Deponte 2008b). Remarkably, in addition to fungi and
protozoans, metacaspase-like sequences were found in many unicellular
lineages (Fig. 1), and the presence of the conserved catalytic dyad argues
for their performing similar proteolytic activities (Fig. 2a). Nevertheless,
as metacaspases are also known to be involved in PCD-unrelated functions
(e.g., Helms et al. 2006; Vercammen et al. 2007; Ambit et al. 2008),
functional studies are needed to address the involvement of these sequences
in PCD-like processes.

Two types of metacaspases, types I and II, have been reported in land plants,
the main difference being the presence of an N-terminal extension in type I
metacaspases and of an insertion between the p20- and the p10-like subunits
in type II metacaspases (Uren et al. 2000). Notably, metacaspases displaying
the insertion characteristic of type II metacaspases (Fig. 2a) were also
found in green algae; furthermore, although no metacaspase sequences could be
found in the available red algal genomes, a putative red algal type II
metacaspase (based on several Porphyra ESTs in GenBank) was also identified
(Fig. 2a). Phylogenetic analyses do support the inclusion of these algal
sequences in the type II metacaspase group (Fig. 2), indicating that the
diversification of the metacaspase family started early in the evolution of
the Plantae lineage. These analyses also indicate that independent
lineage-specific expansions involving type I metacaspases took place in
several unicellular groups, including trypanosomatids, diatoms, and
haptophytes (Fig. 2b).

Interestingly, type I and type II metacaspases have also been found in the
closest unicellular relatives of animals, the choanoflagellates (Fig. 1), but
are thought to have been acquired from a green algal lineage early in the
evolution of the choanoflagellates (Nedelcu et al. 2008). This scenario is
supported by the presence of an LSD1-type Zn finger (discussed above) at the
N-terminus of the Monosiga type I metacaspase; this specific association is
only known in land plants (and possibly their close green algal ancestors),
and although both LSD1 Zn fingers and type I metacaspases are present in many
unicellular lineages (Fig. 1), the two domains are found together only in
Monosiga (Nedelcu et al. 2008). A lateral gene transfer event is also
consistent with the absence of type II metacaspases as well as LSD1 Zn
fingers from the Unikont lineage (Fig. 1). If the unicellular ancestors of
Metazoa possessed metacaspases, they must have been lost and/or replaced by
caspases early in the evolution of Metazoa, as an early-diverged metazoan,
the cnidarian Nematostella vectensis, already contains a diversified family
of caspases (as well as a putative paracaspase; see
http://genome.jgi-psf.org/Nemve1/Nemve1.home.html).

The specific internucleosomal fragmentation of DNA (DNA-laddering) is
considered to be a diagnostic feature of PCD. Several endonucleases involved
in apoptotic DNA
fragmentation have been identified. Among them, mitochondrial endonucleases
of the EndoG type (containing a DNA/RNA nonspecific endonuclease domain;
IPR001604) have been shown to participate in this process in both mammals and
yeast (Li et al. 2001; Buettner et al. 2007), and a proapoptotic nuclease
activity for EndoG was recently reported in trypanosomatids. Interestingly,
EndoG-like sequences appear to be absent in the land plant lineage, and
alternative endonucleases are responsible for DNA fragmentation in plants
(Balk et al. 2003). However, proteins with a predicted DNA/RNA nonspecific
endonuclease domain were found in green algae, suggesting that EndoG-like
sequences were present in the unicellular ancestors of Viridiplantae and were
later lost or replaced in the lineage leading to land plants. Putative
mitochondrial endonucleases of the EndoG type were also found in many other
unicellular lineages (Fig. 1). Notably, a DNA-laddering effect during
PCD-like processes was observed in some of these unicellular lineages [e.g.,
in Chlamydomonas (Nedelcu 2006; Moharikar et al. 2006)].

Finally, among the proteins involved in the cytoskeletal rearrangements
required for phagocytosis of apoptotic cells, the mammalian ELMO1 and its
Caenorhabditis elegans orthologue, CED-12, are required for

Trypanosoma spp., and Leishmania spp. or as three copies in C. reinhardtii).

In some cases, the domains present in complex multidomain PCD-related
proteins are present in unicellular lineages both as single-domain proteins
and in multidomain proteins encompassing some or all of the domains found in
specific PCD proteins. For instance, human TRAF proteins are composed of
three domains (RING, TRAF, and MATH), and all three domains are present in
unicellular lineages, either as single-domain proteins or in multidomain
proteins with a TRAF-like domain organization (e.g., TRAF in green algae and
ciliates; RING-TRAF in choanoflagellates; RING-TRAF, RING-TRAF-TRAF, and
RING-TRAF-TRAF-TRAF in ciliates; RING-TRAF-MATH and RING-TRAF-TRAF-MATH
combinations in some Leishmania and Dictyostelium proteins).

In other cases, PCD domains found in complex PCD proteins in multicellular
lineages are found in unique combinations in unicellular lineages, suggesting
that, in addition to duplication and recruitment, domain shuffling was also
important in the early evolution of the PCD machinery. This is the case for
the NB-ARC domain, found in combination with TIR and LRR domains in plant
disease resistance proteins, and with CARD and WD40 domains in