Sumitra Sarada Nair
Personal Details
Nationality: Indian
Center for Environmental Implications of Nanotechnology (CEIN)
California NanoSystems Institute
University of California
Los Angeles, CA 90095-7277
Email: abike8@r.postjobfree.com
Tel: 001 310-***-****
Education
09/03 - 02/07 PhD, Department of Automatic Control & Systems Engineering,
The University of Sheffield, UK. Project: Function Estimation using
Kernel Methods for Large Data Sets.
Background
Master of Technology (Computer & Information Science)
Cochin University of Science & Technology, India
Master of Science (Mathematics)
Cochin University of Science & Technology, India
Work Experience
04/09 - present Postdoctoral Research Fellow, The Center for Environmental
Implications of Nanotechnology (CEIN), University of California, Los
Angeles, CA.
Project: Learning the Effect of Nanoparticles on the Environment using
Property Structure Activity Relationship Model
The aim of the study is to develop and assess a Property-Activity
Relationship (PAR) model that predicts the toxicity of engineered
nanoparticles (ENPs) from their physicochemical properties. Interactions
of ENPs with biological systems may activate toxicity mechanisms that
lead to nanomaterial-induced damage. In vitro and in vivo toxicity
screening provides valuable information for characterizing nanoparticle
activity. Experimental data collected using high-throughput screening
techniques were used for the analysis. Because the data were highly
imbalanced, they were analyzed using kernel algorithms designed for
imbalanced data.
07/08 - 03/09 Bioinformatics Scientist, Terry Fox Laboratory, British
Columbia Cancer Research Center, Vancouver, Canada.
Project: Analysis of Genomics data using Machine learning techniques
The main aim of the project was to develop an automated learning model for
measuring the telomere length of blood cell populations. Telomeres
consist of repetitive DNA at the ends of chromosomes. Abnormal shortening
of telomeres is observed in some diseases, including cancer, and telomere
length measurements are used as disease indicators. Blood cell data
collected using flow cytometry were used for the analysis. The learning
model I developed used a Kohonen self-organizing map (KSOM) to cluster the
cell populations of interest. Various filtering techniques were also
incorporated into the algorithm to remove noisy data. The learning model
showed excellent performance in terms of accuracy and computational
efficiency.
03/07 - 05/08 Postdoctoral Research Fellow, INSERM/U887, UFR STAPS,
University of Burgundy, Dijon, France.
Project: Analysis of Gait data using Machine learning techniques
The project consisted of developing applications of neural networks and
kernel methods to the analysis of gait data. The study was complicated by
several problems common to the analysis of biological data: missing data,
the identification of outliers and, last but not least, too much or too
little data. The last of these arose during the classification of abnormal
gait in arthritic patients using electromyographic (EMG) patterns.
While the number of subjects was limited due to strict selection criteria,
the quantity of data associated with each subject was very high. Kernel
conjugate gradient algorithms that I developed were applied to the
classification of normal and arthritic patients. They were found to give
performance vastly superior to that of supervised and unsupervised neural
network methods.
09/03 - 02/07 Research Fellow, Department of Automatic Control and Systems
Engineering, The University of Sheffield, UK.
The work focused on the reproducing kernel Hilbert space (RKHS) approach
to modeling large data sets. I developed computationally efficient
algorithms, namely iterative RKHS methods (kernel conjugate gradient and
steepest descent), supervised pre-clustering methods for sparse
regression, and online methods for RKHS learning.
Prior to 2003 Lecturer, Engineering college, India.
During this job, I supervised students' projects. I was also involved in
curriculum development and the preparation of new modules.
Computing and Language Skills
Experience of programming in C, Visual Basic, R, Java and Matlab on
various platforms including Linux, Unix and Windows XP.
Language Skills
Fluent in oral and written English, Hindi and Malayalam.
Positions of Responsibility
. Student Ambassador, The University of Sheffield
Worked at various events aimed at raising awareness of higher education
among school children. This involved giving tours and describing the
university to groups of school children.
Honours and Awards
. Passed GATE-98 with a percentile score of 94.78 and received a GATE
scholarship during the M.Tech course in Computer & Information Science.
GATE is an all-India examination conducted jointly by the IITs and IISc,
India.
. Passed the Kerala State Level Lectureship Eligibility Examination in
Mathematics.
Professional Activities
. Reviewer for IEEE Transactions on Systems, Man, and Cybernetics, Part B.
. Reviewer for International Journal of Systems Science.
Research Interests
My research interests span several aspects of computational data
modeling, namely machine learning, bioinformatics, computational
nanotechnology, data mining and signal processing. My educational
qualifications include a PhD in machine learning and master's degrees in
computer science and mathematics, a strong combination for pursuing a
research career in computational data modeling. My postdoctoral work
applied machine learning techniques to nanoparticle, genomics and
arthritis data, and my PhD work developed theoretical models and
algorithms for studying large data sets using kernel theory. Together,
these research experiences have made me a strong candidate in the area of
data modeling.
My PhD work focused on the study of large data sets using kernel
methods. Kernel algorithms, a recent generation of machine learning
algorithms, have a strong theoretical foundation and have become a popular
tool for data modeling because of their guaranteed convergence and good
generalization capacity. The main disadvantage of kernel methods is their
computational complexity, which scales as O(N^3), where N is the number of
training points. To make the algorithms practical for large data sets, I
approached the problem using three different techniques, namely iterative
methods, pre-clustering and online techniques.
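To illustrate the iterative idea, the following is a minimal sketch and not
the algorithm developed in the thesis. It assumes one-dimensional inputs, a
Gaussian kernel and a small ridge term, none of which are specified above,
and all function names are hypothetical. It solves the kernel system
(K + ridge*I) alpha = y by conjugate gradient, so each iteration costs one
O(N^2) matrix-vector product rather than the O(N^3) of a direct solve.

```python
import math


def gaussian_kernel(x, z, width=1.0):
    # Assumed kernel choice; the text does not name a specific kernel.
    return math.exp(-((x - z) ** 2) / (2.0 * width ** 2))


def gram_matrix(xs, width=1.0):
    # N x N kernel (Gram) matrix over the training inputs.
    return [[gaussian_kernel(a, b, width) for b in xs] for a in xs]


def matvec(K, v, ridge):
    # Computes (K + ridge * I) v in O(N^2) operations.
    return [sum(k * vj for k, vj in zip(row, v)) + ridge * vi
            for row, vi in zip(K, v)]


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def kernel_cg(K, y, ridge=1e-2, tol=1e-12, max_iter=None):
    # Conjugate gradient for (K + ridge * I) alpha = y, starting from alpha = 0.
    n = len(y)
    if max_iter is None:
        max_iter = 10 * n
    alpha = [0.0] * n
    r = list(y)          # residual y - A alpha, with alpha = 0
    p = list(r)          # initial search direction
    rs = dot(r, r)
    for _ in range(max_iter):
        if rs < tol:     # squared residual norm is small enough
            break
        Ap = matvec(K, p, ridge)
        step = rs / dot(p, Ap)
        alpha = [a + step * pi for a, pi in zip(alpha, p)]
        r = [ri - step * api for ri, api in zip(r, Ap)]
        rs_new = dot(r, r)
        p = [ri + (rs_new / rs) * pi for ri, pi in zip(r, p)]
        rs = rs_new
    return alpha


def predict(xs, alpha, x_new, width=1.0):
    # Kernel expansion f(x_new) = sum_i alpha_i k(x_i, x_new).
    return sum(a * gaussian_kernel(x, x_new, width) for a, x in zip(alpha, xs))
```

For example, with xs = [0.0, 0.5, 1.0, 1.5, 2.0] and ys = [math.sin(x) for x
in xs], alpha = kernel_cg(gram_matrix(xs), ys) gives the expansion
coefficients, and predict(xs, alpha, 0.75) evaluates the fitted function at
a new point.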
As part of my postdoctoral work, I had the opportunity to analyze gait
data using machine learning techniques. The study involved human subjects
suffering from two forms of arthritis, namely rheumatoid arthritis and hip
osteoarthritis. We compared the electromyographic (EMG) data collected
from arthritis patients with those of healthy subjects with no
musculoskeletal disorder by modeling the task as a classification
problem. We were also able to determine the major differences between the
gait of normal subjects and arthritic patients.
I have experience in analyzing genomics data. I was involved in a
project on the automation of flow-FISH analysis. Flow-FISH is a technique
for measuring the telomere length of blood cell populations; it combines
multi-parameter measurements in flow cytometry (FCM) with fluorescence in
situ hybridization (FISH). Flow-FISH analysis involves manual gating of
cell populations and, because flow-FISH data are multidimensional, the
manual analysis is time-consuming. The automated model we developed was
able to calculate telomere length in a consistent, stable and accurate
fashion.
My current project involves the study of nanoparticles. Nanomaterials are
manufactured for their unique properties that enable new applications and
products. These specific properties may lead to different interactions with
and impacts on ecological receptors and the environment, resulting in
effects that can be significantly different from those known for bulk
materials. In this regard, in vitro and in vivo toxicity screenings
provide valuable information to characterize the effects of nanoparticles
(NPs) on ecological receptors and the environment. The aim of the study is
to develop a classification-based model that relates nanoparticle toxicity
to basic physicochemical nanoparticle properties.
Whilst significant progress has been made towards developing new algorithms
for data modeling and applying them to practical problems, there is
considerable scope for further work. Iterative algorithms such as the
Kalman filter, Newton's method, and quasi-Newton and Gauss-Newton methods
can be used to solve the learning model derived using kernel theory.
Pre-clustering techniques can be applied to active learning, novelty
detection and outlier mining. Fields such as bioinformatics and signal
processing usually generate huge amounts of information in the form of
complex non-vectorial data such as strings, text and graphs. The design of
kernels that exploit the nature and geometry of complex non-vectorial data
is a developing field of research. Semi-supervised learning, a
fast-growing field, generally makes use of designed kernels. Other
interesting areas are computer vision, data visualization methods,
ensemble learning and Bayesian methods. Hands-on experience in machine
learning, bioinformatics and nanoinformatics, together with a strong
mathematical and computer science background, would help me to bring
further progress to the field of data modeling.
Publications
Journals
S. Nair, R. French, D. Laroche and E. Thomas. Application of Machine
Learning Algorithms to the analysis of Electromyographic patterns from
Arthritic patients. IEEE Transactions on Neural Systems & Rehabilitation
Engineering, 18, pp. 174-184, 2010.
T. J. Dodd, S. Nair and R. F. Harrison. The Effect of the Order of
Parameterisation in Gradient Learning for Kernel Methods. IET Control
Theory & Applications. 2010. In press.
S. Nair and T. J. Dodd. Supervised pre-clustering for sparse regression
(2010). Submitted to IEEE Transactions on Systems, Man, and Cybernetics,
Part B.
S. Nair, R. Liu, R. Robert, S. George, A. E. Nel and Y. Cohen (2010). A
property activity relationship model for learning the effect of
nanoparticles on biological cells. In preparation for submission to Small.
Conferences
R. Robert, R. Liu, S. Nair, S. George, A. E. Nel and Y. Cohen (2010). Data
Mining of High Throughput Screening Toxicity of Engineered Nanoparticles.
AIChE Annual Meeting (accepted).
S. Nair, R. Liu, R. Robert, S. George, A. E. Nel and Y. Cohen (2010).
Learning the effect of nanoparticles on biological cells using property
activity relationship model. ICEIN, Los Angeles.
T. J. Dodd, S. Nair and R. F. Harrison (2005). Gradient Based Methods:
Functional vs Parametric Forms. Proceedings of the 16th IFAC World
Congress, Prague.
Referees
Dr. Elizabeth Thomas
INSERM/U887
UFR STAPS, University of Burgundy, Dijon, France
abike8@r.postjobfree.com
Dr. T. J. Dodd
Department of Automatic Control & Systems Engineering
The University of Sheffield, Sheffield S1 3JD, U.K.
t.j.dodd@sheffield.ac.uk
Dr. Yoram Cohen
Chemical and Biomolecular Engineering Department
5531 Boelter Hall
University of California, Los Angeles
Los Angeles, CA 90095-1592
abike8@r.postjobfree.com