Bikash Sharma
CONTACT ** ***** ****.,***# ***-B Cell: 814-***-****
State College E-mail: ******@***.***.***
Pennsylvania-16803, USA WWW: www.cse.psu.edu/ bus145
RESEARCH
INTERESTS Distributed systems performance, power and fault management
Cloud computing
Virtualization
MapReduce
EDUCATION
The Pennsylvania State University August 2007 to Present
PhD candidate in Computer Science and Engineering
CGPA: 3.82/4.0
Advisor: Dr. Chita R. Das
Relevant Courses: Machine learning & Data mining, Distributed Systems, Operating
System, Cloud Computing, Algorithms Design & Analysis, Data Structures,
Programming Languages, Computer Networks, Computer Security, Applied Statistics,
Introduction to Regression Analysis, Information Retrieval & Search Engines,
Computer Architecture, Database Systems.
NIT Rourkela, India August 2003 to May 2007
Bachelor of Technology in Computer Science and Engineering
CGPA: 9.01/10.0
Second best computer science undergraduate
PUBLICATIONS
Bikash Sharma, Tim Wood, Chita R. Das, HybridMR: A Hierarchical Scheduler
for MapReduce Clusters, under submission to USENIX HotCloud 2012.
Bikash Sharma, Ramya Prabhakar, Seung-Hwan Lim, Mahmut T. Kandemir,
Chita R. Das, MROrchestrator: A Dynamic Resource Orchestration Framework
for Hadoop MapReduce Clusters, under submission to IEEE Cloud 2012.
Bikash Sharma, Victor Chudnovsky, Joseph L. Hellerstein, Rasekh Rifaat, Chita
R. Das, Modeling and Synthesizing Task Placement Constraints in Google Compute
Clusters, in ACM SOCC 2011.
Seung-Hwan Lim, Bikash Sharma, Byung Chul Tak, Chita R. Das, A Dynamic
Energy Management In Multi-tier Data Centers, in IEEE ISPASS 2011.
Seung-Hwan Lim, Bikash Sharma, Gunwoo Nam, Eun Kyoung Kim, Chita R.
Das, MDCSim: A Multi-tier Data Center Simulation Platform, in IEEE Cluster
2009.
Gunwoo Nam, Pushkar Patankar, Seung-Hwan Lim, Bikash Sharma, Chita R.
Das, A Resilient and Collaborative Replacement Framework for Per-Flow Monitoring,
in IEEE ICDCS 2009.
Bikash Sharma, Applications of Data Mining in the Management of Performance
and Power in Data Centers, Technical Report, Department of Computer Science
and Engineering, The Pennsylvania State University, December 2009.
PROJECTS Research Projects
Resource management in Hadoop MapReduce Clusters
Developed heuristics for efficient allocation of resources in MapReduce clusters
The goal is to improve cluster resource utilization and application performance
Performance and power management of multi-tier data centers
Developed a multi-tier data center simulation platform.
Analyzed the effects of high speed interconnects like IBA, 10GigE on optimizing
throughput, latency and energy consumption of data centers.
Privacy Preserving Data Mining (B-Tech Thesis)
Implemented various randomization algorithms for data perturbation to preserve
privacy in statistical databases.
Analyzed approximate regeneration of the distorted data for secure data
mining applications using cryptographic and stochastic techniques.
Web server cache prefetching
Implemented various web server cache prefetching algorithms to determine the
optimal prefetching technique given specific browser and Internet traffic conditions.
Software Projects
Hospital management system in rural areas
Developed a software system that could enable rural people to query and gather
information on medical status during emergencies like epidemics.
The project was selected as a Microsoft best student project in the penultimate stage.
CiteseerX - Scientific Literature Digital Library and Search Engine
Developed the People Search module in MyCiteseerx, the personal portal for Citeseerx
members, using the Java Spring framework and the Lucene search engine library.
Instant Messenger
Developed an instant messenger distributed system using the Qt cross-platform
application framework and C++.
Virtual Machine System Security
Analyzed the XSM infrastructure in the Xen hypervisor.
Implemented policies for the management of security domains in Xen.
Chord
Implemented Chord, a distributed lookup protocol, in C.
Network Protocols Simulation
Implemented various network protocols like TCP, UDP using C socket
programming
User-level thread library
Implemented various user-level thread functionalities in C including
synchronization and inter-thread communication primitives
Developed a MapReduce-based distributed application using this thread package
INTERNSHIP IBM Research (Bangalore, India)
Research Intern (May 2011 - August 2011)
Fault Diagnosis for Shared Dynamic Clouds
Worked on techniques for efficient and accurate fault diagnosis in virtualized,
consolidated environments like public utility clouds.
Google Inc. (Seattle, USA)
Software Engineering Intern (May 2010 - December 2010)
Characterization of constraints present in Google compute clusters workload
Conducted a detailed study of the different job and machine constraints present in
the Google workload. Developed statistical models to characterize these constraints.
Implemented a simulation tool based on the Google scheduler to validate these models
using real workload from Google. Based on the characterization models, developed a
synthetic workload generator for these constraints.
Indian Institute of Science (Bangalore, India)
Summer Research Intern (May 2006-August 2006)
Awarded Young Engineering Fellowship Program (YEFP) scholarship for summer
internship at IISc, Bangalore. The YEFP scholarship is offered annually to only
twenty computer science undergraduates in India to increase research awareness
among undergraduate students.
Implemented various web server cache prefetching algorithms for web browsers.
Tata Consultancy Services (Mumbai, India)
Summer Software Intern (May 2005-August 2005)
Worked in a team developing the project McKinsey Consultancy, People Link
Developed a client response system module in Java.
RESEARCH
CONTRIBUTIONS Fault manager for clouds: Design of CloudPD, a problem detection and
diagnosis framework for shared virtualized dynamic clouds. CloudPD can differentiate
cloud-related faults (manifestations of virtualization techniques like live migration
and dynamic resizing) from application-level performance anomalies with high
accuracy and low false positives. CloudPD is a completely automated end-to-end
system for fault detection and classification.
Resource management in data analytics systems like Hadoop MapReduce clusters.
The focus is on increasing the resource utilization of the clusters and reducing the
completion times of MapReduce jobs. A second dimension is the design of a
hierarchical scheduler called HybridMR for hybrid MapReduce clusters
consisting of native and virtualized machines, with the objective of maximizing the
QoS of interactive services running alongside batch workloads like MapReduce.
Modeling and synthesis of an important cloud workload property called task placement
constraints. Analyzed and quantified the performance impact of constraints
in terms of their effect on task scheduling delays. Proposed benchmarking algorithms to
incorporate representative task placement constraints and machine properties from
Google clusters into existing performance benchmarks.
Dynamic energy management in multi-tier data centers: Proposed two techniques
to reduce the energy consumption of multi-tier data centers while fulfilling the
user-agreed SLA. The first heuristic uses dynamic provisioning to find the minimum
number of servers in each tier needed to satisfy the SLAs. The second heuristic
works at the granularity of an individual server and determines the CPU speeds
that provide optimal energy efficiency, coupled with exploiting the sleep states
of servers to achieve further energy savings.
MDCSim: A Multi-tier Datacenter Simulator, a simulation platform for analyzing
the performance and power usage of multi-tier data centers. MDCSim can simulate
large data centers consisting of thousands of servers and can experiment with
various server cluster interconnects. This simulation framework provides useful
guidelines in designing efficient and cost-effective data centers.
Improvement in the per-flow monitoring system of networks by preventing the
overflow of the flow table. Using a dynamic framework, we estimate the number of
active flows and heavy hitters to reduce the overhead of vast data exchanges across
networks.
AWARDS
Robert M. Owens Memorial Scholarship in Computer Science and Engineering at
the Pennsylvania State University in 2007.
Young Engineering Fellowship from IISc, Bangalore in 2006.
All Orissa ICSE School of Associations Scholarship award.
Microsoft student project award for Hospital Management System in rural areas.
Oracle Certified Associate (OCA).
Second prize in the best website design competition at the tech.fest of NIT Rourkela.
TECHNICAL
SKILLS Languages: C, C++, Java, Python
Operating System: Linux, Windows, Solaris
Databases: MySQL, Oracle, MS Access
Tools & IDE: Vim, Emacs, Microsoft Visual Studio, Java Netbeans, Eclipse
Others: Socket programming, Shell scripting, R, CSIM, Qt, C++ STL, GDB debugger,
Matlab programming, HTML, LaTeX
TEACHING
EXPERIENCE Teaching Assistant, Pennsylvania State University
Introduction to Digital Systems
Responsibilities included holding student help sessions, grading, and presenting
some lectures on behalf of the instructor.
Teaching Assistant, Pennsylvania State University
Intermediate C++ programming
Responsibilities included managing labs, grading and help sessions.
Teaching Assistant, Pennsylvania State University
Object-Oriented Programming with JAVA
Responsibilities included managing labs, grading and help sessions.
REFERENCES Available on request.