CURRICULUM VITAE
of
Saba Amsalu Teserra
a. CONTACT INFORMATION
Address : *** ******** ***. #*, **** Alto, CA 94306
Cell phone : 1-404-***-****
Email : ***********@*****.***
US citizen
b. EDUCATION
• PhD. in Computational Linguistics (Natural Language Processing), Bielefeld University
(Germany), April 2004 - June 2007
- Dissertation Title : Bilingual word and chunk alignment: a hybrid system for
Amharic and English (Advisor: Dafydd Gibbon)
- Grade: summa cum laude (with highest honor)
- Award: Dissertation award of the Westfaelisch-Lippische Universitaetsgesellschaft
2007.
Supplement to Thesis:
- Pattern recognition approach of word alignment,
- Stochastic POS Tagging (Conditional Random Fields and Markov Random Fields),
- Spelling-checker for non-standard languages.
• MSc. in Information Science, Addis Ababa University, Faculty of Informatics, August
2001
- Thesis title: The Application of Information Retrieval Techniques to Amharic
Documents on the Web [Grade: Excellent]
c. RECENT JOBS
Advanced Technology Architect/ NLP, TIBCO, Palo Alto, CA : Research, Design,
Prototyping and evaluation of -
• Sentiment Analysis of social media data (Twitter, Facebook, Review)
• Feature identification through research of grammatical dependency in sentences
(Stanford Parser, NLTK)
• Developed decision tree learning system for topic identification
• Developed PMI/LSI aspect identification and categorization system
• Developed algorithms for normalizing Twitter data: resolving subject pronoun drop,
g-drop, question identification, handling Twitter lingo, and dissecting and cleaning
tweet parts for dependency parsing.
Design Team (E-Learning System): CEISMC, Georgia Tech., Spring-Summer 2012
Associated Visiting Scholar: Pembroke College, Oxford University, [Statistical analysis
of big data] Autumn 2011 -
Drupal Consultant: Street Grace (April 2011 - Aug 2011)
d. ACADEMIC JOBS
• Mobility Grant by the Academy of Finland (# 129127): Collaborated with Dr. Anssi
Yli-Jyrä on SMT of Amharic, 2008-2009.
• Guest Professor, University of Gondar, Dep. of Computer Science, 2008.
• Visiting Professor, Georgia Tech., School of Mathematics, 2007-2008.
• Lecturer, Addis Ababa University, Faculty of Informatics, 2001 - 2004
e. LANGUAGE SKILLS
Amharic, English, German
f. SOFTWARE ENGINEERING
1. Software Tools Developed
a) A system that aligns parallel text (Model I & Model II (Platform: C
b) Finite-State Morphological Analyzer for Amharic, including evaluation software
(Platform: XFST).
c) Miscellaneous - Concordance generator, Font converter, etc. (Platform: Linux shell
and C
d) Payroll system study and design: Addis Ababa University (as coordinator and system
analyst).
e) Class Scheduling System: Addis Ababa University (as a programmer).
2. Proficiency in Programming Languages:
a) Low level: C, C++
b) Web programming: PHP, JavaScript (jQuery, Ajax), CSS
c) Database systems (MySQL)
d) Markup: XML (including DTD, XML Schema design, and XSLT), HTML
g. PEDAGOGICAL EXPERIENCE
1. Teaching
- University of Gondar: Courses to Computer Science students, graduating class
a) Scripting in XML: DTDs, XML Schema, XSLT, Summer 2008
b) Introduction to Scripting in Perl, Summer 2008
- Georgia Tech.: Introduction to Probability and Statistics, Math 3215, Fall 2007
- Bielefeld University: Verfahren der Verarbeitung sprachlichen Wissens (Methods
of Processing Language based Knowledge), Summer 2006
- Addis Ababa University: Courses to Master and Bachelor students in Informatics:
a) Modern Information Storage and Retrieval, 50 students, Fall & Spring 2001/02/03,
b) Information Systems Analysis and Design, 100 x 2 students, Fall & Spring 2002/03,
c) Programming in C, C++, Visual Basic, 100 x 4 students, Fall & Spring 2001/02/03,
d) Data structures and Algorithm Analysis, 30 students, Fall 2004
e) Information Theory, 30 students, Fall 2002/03
f) Statistical Data Analysis (SPSS), 30 students, Spring 2003
g) Introduction to Artificial Intelligence, 100 x 4 students, Fall 2004.
2. Supervision of Master's Theses on:
a) Application of Case-Based Reasoning for Amharic Legal Precedent Retrieval: A Case
Study with the Ethiopian Labour Law (Ethiopia Tadesse), 2002.
b) N-gram Based Automatic Indexing for Amharic Text (Bethlehem Mengistu), 2002.
c) Text Retrieval Using Self-Organized Map: The case of ILRI Digital Library (Mulugeta
Bayeh), 2002.
d) Amharic Text Retrieval: An experiment Using Latent Semantic Indexing with Singular
Value Decomposition (Tewodros Hailemeskel), 2003.
e) Application of WEBSOM to Amharic Text Retrieval (Bizuneh Mamuye), 2003.
f) Supervision of Final year Bachelor of Information System Studies: Mostly projects
of Software development for Business - such as sales management, payroll, inventory
control systems, etc.
h. PUBLICATIONS
1. Saba Teserra, Decision tree learning for sentiment topic identification (2013), in
preparation.
2. Saba Teserra and Raghu Thiagarajan, Normalizing tweet language for natural language
processing (2013), in preparation.
3. Saba Teserra, Evaluation of Entity/Aspect identification methods (2013), in preparation.
4. Dieter Metzing and Saba Amsalu Teserra, Conjunctive coordination in Amharic: some
typological approaches, Form and Function in Language Research, Papers in Honour of
Christian Lehmann, Berlin (2009), Pages 283 - 300, ISBN: 978-3-11-021612-7.
5. S. Amsalu, H. Matzinger and M. Vachkovskaia, Thermodynamical Approach to the Longest
Common Subsequence Problem, in Journal of Statistical Physics (2008), Vol. 131, No.6.
6. S. Amsalu, S. Popov and H. Matzinger, Macroscopic non-uniqueness and transversal
fluctuation in optimal random sequence alignment, in ESAIM: P&S (2007), Vol. 11, pp.
281-300.
7. Teserra, Saba Amsalu, Bilingual word and chunk alignment: a hybrid system for Amharic
and English, Universität Bielefeld, Fach: Linguistik, Fakultät für Linguistik und
Literaturwissenschaft, Graduiertenkolleg: Aufgabenorientierte Kommunikation (DFG GK 256),
2007.
8. Saba Amsalu, Scaling up from word to phrasal alignments of Amharic-English parallel
corpora, In proceedings of the 9th Nordic Conference on Bilingualism, Joensuu, (2006).
9. Saba Amsalu, Maximum Likelihood Alignment of Translation Equivalents, In proceedings
of the 5th International Conference on Natural Language Processing, Turku, (2006).
10. Saba Amsalu and Girma A. Demeke, Induction of Amharic Verb Stem lexicon for Finite-
State Morphological Analysis, In proceedings of World Congress of African Linguistics,
Addis Ababa, (2006).
11. Saba Amsalu and Girma A. Demeke, Non-concatenative Finite-state Morphotactics of
Amharic Simple Verbs, In Journal of Ethiopian Language Studies (2006).
12. Saba Amsalu and Sisay Fissaha Adafre, Machine Translation for Amharic, where we are,
SALTMIL, Genoa, (2006).
13. Saba Amsalu, Data-driven Amharic-English Bilingual Lexicon Acquisition, In proceedings
LREC, Genoa, (2006).
14. Saba Amsalu and Dafydd Gibbon, Methods of Bilingual Lexicon Extraction from Amharic-
English Parallel Corpora, In proceedings of World Congress of African Linguistics, Addis
Ababa, (2006).
15. Saba Amsalu and Dafydd Gibbon, A complete FS model for Amharic morphographemics,
Proceedings of FSMNLP, Helsinki, (2005).
16. Saba Amsalu and Dafydd Gibbon, Finite state morphology of Amharic, Proceedings of
RANLP, Borovets, Bulgaria, (2005), p. 47 - 51.
17. Saba Amsalu, The Application of Information Retrieval Techniques to Amharic
Documents in the Web, Master Thesis, Department of Information Science, Addis Ababa
University, (2001).
i. TALKS
1. On Amharic Spell-checker, University of Gondar, Dep. of Computer Science, (June 2008).
2. Constructing a spelling-checker where there is no standard spelling: The case of Amharic,
ACAL/ALTA, Florida, (March 2007).
3. A Hybrid Word Alignment System, Oxford University Computing Laboratory, Oxford,
(January 2007).
4. Data-driven Amharic-English Bilingual Lexicon Acquisition, LREC, Italy, (May 2006).
5. Maximum Likelihood Alignment of Translation Equivalents, The 5th International
Conference on Natural Language Processing, Turku, (August 2006).
6. Scaling up from word to phrasal alignments of Amharic-English parallel corpora, The 9th
Nordic Conference on Bilingualism, Joensuu, (August 2006).
7. Induction of Amharic Verb Stem lexicon for Finite-State Morphological Analysis, World
Congress of African Linguistics, Addis Ababa, (August 2006).
8. Methods of Bilingual Lexicon Extraction from Amharic-English Parallel Corpora, World
Congress of African Linguistics, Addis Ababa, (August 2006).
j. RECENTLY REVIEWED PAPERS
1. Part-of-Speech Tagging of Amharic, for the journal of Language Resources and Evaluation
(2010)
2. A Morphological Processor for Amharic and Tigrinya, for the journal of Language
Resources and Evaluation (2010)
3. Finite State Morphology of the Nguni Language Cluster: Modelling and Implementation
Issues, for FSMNLP (2009)
4. Automatic training of lemmatization rules that handle morphological changes in pre-, in-
and suffixes alike, for EACL (March 30 - April 3, 2009)
5. Arabic finite-state morphological processing, for EACL (March 30 - April 3, 2009)