
Data Developer

Location:
Fair Lawn, NJ
Posted:
March 15, 2014


Resume:

SKILLS SUMMARY

A versatile researcher and data analyst with a Ph.D. in Physics & Mathematics and over 20 years of practical experience.

Extensive work on statistical estimation of protein phases resulted in 261 solved structures and 45 scientific publications (detailed citations are available upon request).

EXPERIENCE

COLUMBIA UNIVERSITY, New York, NY 2002-present

Data Scientist

• Strong background in mathematical and statistical methodology.

• Experience with the main Hadoop ecosystem projects: Pig, Hive, HBase.

• Developed Pig Latin scripts and used the Hive query language for data analysis.

Handled importing of data from different data sources and transformed it with Hive and MapReduce.

• Working experience using Sqoop to import data from RDBMS into HDFS.

Exported the analyzed data to databases using Sqoop for visualization.

• Experience in Hadoop administration: installation and configuration of clusters using Apache and Cloudera.

• Installation and configuration of Hive, Pig, Sqoop, Flume, and Oozie on the Hadoop cluster.

• Developed MapReduce jobs using Hive and Pig.

• Optimized MapReduce jobs on HDFS, improving performance by using different compression mechanisms.

• Good experience streaming data into Apache HBase using Apache Flume.

• Experience working with NoSQL databases.

• Good experience with the R software environment for statistical computing and graphics on UNIX and Windows platforms.

• Experience using SAS, R, and MATLAB in a professional capacity.

• Good experience with SQL Server 2012 Management Studio Express.

SOUTHERN RESEARCH INSTITUTE, Birmingham, AL 1999-2002

Research Scientist II

• Developed an application for data processing.

Scaling of X-ray intensities for heavy-atom derivatives based on non-linear regression analysis.

• Designed an application for protein phase determination based on a maximum-likelihood algorithm.

Refinement of protein structures by maximum likelihood.

THE UNIVERSITY OF CONNECTICUT, Storrs, CT 1991-1999

Postdoctoral Fellow, Molecular & Cell Biology Department

• Worked as a senior developer. Designed an application based on the OS/390 architecture.

• Unix/Linux administrator working with the Protein Data Bank (PDB).

Structure validation and quality assessment of PDB entries: MolProbity, PROCHECK, ProSA-web, WHAT_CHECK, WHAT IF.

BAYLOR COLLEGE OF MEDICINE, Houston, TX 1990-1991

Visiting Assistant Professor

• Worked as a senior developer in Fortran. Designed an application for binary tape conversion.

Data conversion and formatting from IBM mainframe tapes for use on PC or UNIX computers.

EDUCATION

SHEMYAKIN INSTITUTE OF BIOORGANIC CHEMISTRY OF THE U.S.S.R. ACADEMY OF SCIENCES, Moscow,

Russia (Ph.D. in Physics and Mathematics).

INSTITUTE OF PROTEIN OF THE U.S.S.R. ACADEMY OF SCIENCES, Pushino-na-Oke,

Russia (Pre-graduate Student Trainee).

LENINGRAD STATE UNIVERSITY, Department of Physics, Leningrad,

Russia (Diploma – Master’s equivalent – in Physics).

TECHNICAL SKILLS

Programming Languages/Software/Technology Platforms: Hadoop – Big Data analysis; MapReduce; Data Science; Java, Python, Perl, Fortran; C-Shell, bash; Windows.

Operating Systems: Linux/Ubuntu/SuSE/Red Hat, UNIX/IRIX, OS/390, Windows 7, RDOS, DOS.

REFERENCES

Available upon request.


