Data Project

Location:
Morgantown, WV, 26505
Posted:
October 26, 2010

Resume:

Sumitra Sarada Nair

Personal Details

Nationality: Indian

Center for the Environmental Implications of Nanotechnology (CEIN)

California NanoSystems Institute

*** ******** *****, **** ****

University of California

Los Angeles, CA 90095-7277

Email: abike8@r.postjobfree.com

Tel: 001 310-***-****

Education

09/03 - 02/07 PhD, Department of Automatic Control & Systems Engineering,

The University of Sheffield, UK. Project: Function Estimation using

Kernel Methods for Large Data Sets.

Background

Master in Technology (Computer & Information Science)

Cochin University of Science & Technology, India

Master in Science (Mathematics)

Cochin University of Science & Technology, India

Work Experience

04/09 - present Postdoctoral Research Fellow, The Center for Environmental Implications

of Nanotechnology (CEIN), University of California, Los Angeles, CA.

Project: Learning the Effect of Nanoparticles on the Environment

using Property Structure Activity Relationship Model

The aim of the study is to develop and assess a Property-Activity

Relationship (PAR) model suitable for predicting the toxicity of engineered

nanoparticles (ENPs) from their physicochemical properties. The interactions

of ENPs with biological systems may result in the

activation of toxicity mechanisms that lead to nanomaterial-induced damage. In

vitro and in vivo toxicity screening provides valuable information to

characterize nanoparticle activity. The experimental data collected using

high-throughput screening techniques were used for the analysis. The data

were highly imbalanced and were therefore analyzed using kernel algorithms for

imbalanced data.

07/08 - 03/09 Bioinformatics Scientist, Terry Fox Laboratory, British Columbia

Cancer Research Center, Vancouver, Canada.

Project: Analysis of Genomics Data using Machine Learning Techniques

The main aim of the project was to develop an automated learning model for

measuring the telomere length of blood cell populations. Telomeres

consist of repetitive DNA at the ends of chromosomes. Abnormal shortening of

telomeres is observed in some diseases, including cancer, and methods to

measure telomere length are used as disease indicators. The blood cell data

collected using flow cytometry were used for the analysis. The learning

model I developed for measuring telomere length used a Kohonen

self-organizing map (KSOM) for clustering the cell populations of

interest. Various filtering techniques were also fitted to the algorithm for

removing noisy data. The learning model showed excellent performance

in terms of accuracy and computational efficiency.

03/07 - 05/08 Postdoctoral Research Fellow, INSERM/U887, UFR STAPS,

University of Burgundy, Dijon, France.

Project: Analysis of Gait Data using Machine Learning Techniques

The project consisted of developing applications of neural networks and

kernel methods to the analysis of gait data. The study of the data was

plagued with several problems that are common to the analysis of biological

data: missing data, the identification of

outliers and, last but not least, the problems of too much or too little

data. The latter problem arose during the classification of abnormal

gait in arthritic patients using electromyographic (EMG) pattern data.

While the number of subjects was limited due to strict selection criteria,

the quantity of data associated with each subject was very high. Kernel

conjugate gradient algorithms that I developed were applied to the

classification of normal and arthritic subjects. They were found to give a

performance vastly superior to that of supervised and unsupervised neural

network methods.

09/03 - 02/07 Research Fellow, Department of Automatic Control and Systems

Engineering, The University of Sheffield, UK.

The work was focused on the reproducing kernel Hilbert space (RKHS)

approach to modeling large data sets. I developed computationally efficient

algorithms, viz, iterative RKHS methods (kernel conjugate gradient and

steepest descent), supervised pre-clustering methods for sparse regression

and online methods for RKHS learning.

Prior to 2003 Lecturer, Engineering college, India.

During this job, I supervised students' projects. I was also involved in

curriculum development and the preparation of new modules.

Computing and Language Skills

Experience of programming in C, Visual Basic, R, Java and MATLAB on

various platforms including Linux, Unix and Windows XP.

Language Skills

Fluent in oral and written English, Hindi and Malayalam.

Positions of Responsibility

. Student Ambassador, The University of Sheffield

Worked at various events aimed at raising awareness of higher

education among school children. This involved giving tours and

describing the university to groups of school children.

Honours and Awards

. Passed GATE-98 with a percentile score of 94.78 and received a GATE

scholarship during the M. Tech course in Computer & Information Science.

GATE is an all-India examination conducted jointly by the IITs and IISc,

India.

. Passed Kerala State Level Lectureship Eligibility Examination in

Mathematics

Professional Activities

. Reviewer of IEEE Transactions on Systems, Man, and Cybernetics, Part

B.

. Reviewer of International Journal of Systems Science.

Research Interests

My research interests span various aspects of computational data

modeling, namely, machine learning, bioinformatics, computational

nanotechnology, data mining and signal processing. My educational

qualifications include a PhD in machine learning and master's degrees in computer

science and mathematics, an ideal combination for pursuing a

research career in the field of computational data modeling. The study of

nanoparticles, genomics data and arthritis data using machine learning

techniques during my postdoctoral work, and the development of theoretical models

and algorithms for studying large data sets by applying kernel theory as

part of my PhD work, have molded me into a good

candidate in the area of data modeling.

My PhD work was focussed on the study of large data sets using kernel

methods. Kernel algorithms, the new phase of machine learning algorithms,

have a strong theoretical foundation and have become a popular tool for data

modeling because of their guaranteed convergence and good generalization

capacity. The main disadvantage of kernel methods is their computational

complexity, which scales as O(N^3), where N is the number of training

points. To make the algorithms computationally efficient, I approached the problem

using three different techniques, namely, iterative methods,

pre-clustering and on-line techniques.
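
As an illustration of the iterative idea only, the sketch below solves a regularized kernel system (K + ridge * I) alpha = y by conjugate gradient, so that each iteration costs one O(N^2) matrix-vector product instead of a direct O(N^3) solve. It is a generic textbook formulation, not the algorithms developed in the PhD work; the Gaussian kernel, the ridge term and all function names are assumptions made for the example.

```python
import math


def rbf_kernel(x, z, width=1.0):
    """Gaussian (RBF) kernel between two scalar inputs (assumed kernel choice)."""
    return math.exp(-((x - z) ** 2) / (2.0 * width ** 2))


def gram_matrix(xs, width=1.0):
    """N x N kernel matrix K with K[i][j] = k(x_i, x_j)."""
    return [[rbf_kernel(a, b, width) for b in xs] for a in xs]


def matvec(K, v, ridge):
    """Compute (K + ridge * I) v; this is the O(N^2) step of each iteration."""
    return [sum(k_ij * v_j for k_ij, v_j in zip(row, v)) + ridge * v_i
            for row, v_i in zip(K, v)]


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def kernel_conjugate_gradient(K, y, ridge=1e-2, tol=1e-12, max_iter=None):
    """Solve (K + ridge * I) alpha = y with standard conjugate gradient."""
    n = len(y)
    max_iter = 10 * n if max_iter is None else max_iter
    alpha = [0.0] * n
    r = list(y)            # residual y - A alpha for the starting point alpha = 0
    p = list(r)            # first search direction
    rs_old = dot(r, r)
    for _ in range(max_iter):
        if rs_old < tol:   # squared residual norm small enough
            break
        Ap = matvec(K, p, ridge)
        step = rs_old / dot(p, Ap)
        alpha = [a + step * p_i for a, p_i in zip(alpha, p)]
        r = [r_i - step * ap_i for r_i, ap_i in zip(r, Ap)]
        rs_new = dot(r, r)
        p = [r_i + (rs_new / rs_old) * p_i for r_i, p_i in zip(r, p)]
        rs_old = rs_new
    return alpha


def predict(xs, alpha, x_new, width=1.0):
    """Evaluate the fitted function f(x) = sum_i alpha_i k(x_i, x)."""
    return sum(a * rbf_kernel(x, x_new, width) for a, x in zip(alpha, xs))


if __name__ == "__main__":
    xs = [0.0, 0.5, 1.0, 1.5, 2.0]
    ys = [math.sin(x) for x in xs]
    alpha = kernel_conjugate_gradient(gram_matrix(xs), ys)
    print(predict(xs, alpha, 1.0), math.sin(1.0))
```

Stopping the loop early gives an approximate solution, which is the usual way such iterative schemes trade accuracy for computation on large N.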

As part of my postdoctoral work, I had the opportunity to analyze gait

data using machine learning techniques. The study involved human subjects

suffering from two forms of arthritis, namely, rheumatoid arthritis and hip

osteoarthritis. We compared the electromyographic (EMG) data collected from

arthritis patients with those of healthy subjects with no musculoskeletal

disorder by modeling the task as a classification problem. We were also able to

determine the major differences between the gait of normal subjects and

arthritic patients.

I have experience in analyzing genomics data. I was involved in a

project which dealt with the automation of flow-FISH analysis. Flow-FISH

is a technique for measuring the telomere length of blood cell

populations that makes use of multi-parameter measurements in flow

cytometry (FCM) and fluorescence in situ hybridization (FISH). Flow-FISH

analysis involves manual gating of cell populations and, as flow-FISH

data are multidimensional, the manual analysis is a time-consuming

process. The automated model we developed was able to calculate telomere

length in a consistent, stable, and accurate fashion.

My current project involves the study of nanoparticles. Nanomaterials are

manufactured for their unique properties that enable new applications and

products. These specific properties may lead to different interactions with

and impacts on ecological receptors and the environment, resulting in

effects that can be significantly different from those known for bulk

materials. In this regard, in vitro and in vivo toxicity screenings

provide valuable information to characterize the effects of nanoparticles

(NPs) on ecological receptors and the environment. The aim of the study is

to develop a classification-based model

that relates nanoparticle toxicity to basic physicochemical nanoparticle

properties.

Whilst significant progress has been made towards developing new algorithms

for data modeling and applying them to practical problems, there is

considerable scope for further work. Iterative algorithms like the Kalman

filter, Newton's method, quasi-Newton and Gauss-Newton methods can be used to

solve the learning model derived using kernel theory. Pre-clustering techniques

can be applied to active learning, novelty detection and outlier mining. Fields

like bioinformatics and signal processing usually generate huge amounts of

information in the form of complex non-vectorial data like strings, text

and graphs. The design of kernels that make use of the nature and the

geometry of data for complex non-vectorial points is a developing field of

research. Semi-supervised learning, which is a fast-growing field,

generally makes use of designed kernels. Other interesting areas are

computer vision, data visualization methods, ensemble learning and Bayesian

methods. Hands-on experience in machine learning,

bioinformatics and nanoinformatics and a strong mathematical and computer

science background would help me to bring further progress to the field of

data modeling.

Publications

Journals

S. Nair, R. French, D. Laroche and E. Thomas. Application of Machine

Learning Algorithms to the Analysis of Electromyographic Patterns from

Arthritic Patients. IEEE Transactions on Neural Systems & Rehabilitation

Engineering, 18, pp. 174-184, 2010.

T. J. Dodd, S. Nair and R. F. Harrison. The Effect of the Order of

Parameterisation in Gradient Learning for Kernel Methods. IET Control

Theory & Applications. 2010. In press.

S. Nair and T. J. Dodd. Supervised pre-clustering for sparse regression

(2010). Submitted to IEEE

Transactions on Systems, Man, and Cybernetics, Part B.

S. Nair, R. Liu, R. Robert, S. George, A. E. Nel and Y. Cohen (2010). A

property activity relationship model for learning the effect of

nanoparticles on biological cells. In preparation for submission to Small.

Conferences

R. Robert, R. Liu, S. Nair, S. George, A. E. Nel and Y. Cohen (2010). Data

Mining of High Throughput Screening Toxicity of Engineered Nanoparticles.

AIChE Annual Meeting (accepted).

S. Nair, R. Liu, R. Robert, S. George, A. E. Nel and Y. Cohen (2010).

Learning the effect of nanoparticles on biological cells using property

activity relationship model. ICEIN, Los Angeles.

T. J. Dodd, S. Nair and R. F. Harrison (2005). Gradient Based Methods:

Functional vs Parametric Forms. Proceedings of the 16th IFAC World

Congress, Prague.

Referees

Dr. Elizabeth Thomas

INSERM/U887

UFR STAPS, University of Burgundy, Dijon, France

abike8@r.postjobfree.com

Dr. T. J. Dodd

Department of Automatic Control & Systems Engineering

The University of Sheffield, Sheffield S1 3JD, U.K.

t.j.dodd@sheffield.ac.uk

Dr. Yoram Cohen

Chemical and Biomolecular Engineering Department

5531 Boelter Hall

University of California, Los Angeles

Los Angeles, CA 90095-1592

abike8@r.postjobfree.com
