Ahmed Radwan
*.********@*****.***
Cell# (786) 543-7772
INTERESTS
My interests span cloud computing, data/metadata management, semantics, and data integration, with an
emphasis on applying advances in these areas to build solutions that are useful for customers.
WORK EXPERIENCE
Cloudera Inc., Nov. 2010 - present, Palo Alto, CA
Senior R&D Engineer, Platforms
Designed and developed solutions for efficient transfer and processing of massive amounts of structured and
unstructured data on distributed/cloud computing environments. This work involved challenging problems in
terms of devising efficient techniques for data extraction/loading using optimized import/export interfaces
supported by the different databases and Enterprise Data Warehouses. Other challenges included the metadata
management and data integration problems across such autonomous distributed systems, and the analysis and
optimization of the performance and scalability of such solutions. Investigated techniques for better resource
utilization in MapReduce jobs through resource management and scheduling strategies.
Contributions to various open-source projects including Apache Hadoop Common, MapReduce and YARN, Apache
Sqoop and Apache Flume.
Apache Sqoop committer and member of the Project Management Committee (PMC). Sqoop is an open-source
tool designed for efficiently transferring bulk data between Hadoop and structured datastores such as
relational databases.
Apache Flume committer and PMC member. Flume is an open-source distributed, reliable, and available service
for efficiently collecting, aggregating, and moving large amounts of log data.
Yahoo! Inc., Nov. 2008 - Nov. 2010, Sunnyvale, CA
Senior Software Engineer, Cloud Computing
Built solutions for managing massive amounts of structured and unstructured data on distributed/cloud
computing environments. Designed efficient models and techniques for metadata management, data processing
and performance optimization on the cloud.
Conceived, designed and led the development of MapReduce-Legos, a data processing abstraction layer on top of
Hadoop MapReduce. This layer provides a refined model for MapReduce jobs, enabling an optimized way of
describing and running Extract, Transform and Load (ETL) workflows on Hadoop MapReduce clusters. The
project is used by production systems to process petabytes of data on a daily basis at Yahoo! Inc. The work was
also published and demoed at a number of internal Yahoo! conferences. A peer-reviewed article describing this
achievement was published in the International Journal of Cloud Computing.
Designed and developed a declarative SQL query engine on top of Hadoop MapReduce and distributed file system.
IBM Research, May - Aug. 2006 and May - Aug. 2007, San Jose, CA
Graduate Research Intern, Information Integration (IBM Almaden Research Center).
Designed a novel similarity measure and top-k enumeration algorithm used to quantify the distance between
schema concepts in the schema integration problem to efficiently calculate the best k candidate integrated
schemas. This work has significant importance in metadata management in cloud computing systems as it
facilitates the process of federating data from multiple autonomous data sources and generating a unified
non-redundant representation of the data. This work led to the publication of an article in the ACM SIGMOD
conference.
Studied the problem of expressing Extract, Transform and Load (ETL) dataflows using declarative mapping
semantics, and vice versa. This work is adopted by IBM and is being productized as the FastTrack component of
IBM Information Server. These contributions led to the submission of a patent disclosure describing this work.
The work was also demoed at the IBM Information On Demand (IOD) conference, and an article detailing the
work was published at IEEE ICDE.
University of Miami, Jan. 2005 - Oct. 2008, Miami, FL
Research/Teaching Assistant, Electrical and Computer Engineering
Conducted research on information integration in a grid environment with applications in bioinformatics.
Designed a web services-based data federation architecture for bioinformatics applications. The system is called
BioFederator and was awarded the prestigious IBM Faculty Award in 2009. Based on collaborations with
bioinformatics researchers, several domain-specific data federation challenges and needs were identified. The
BioFederator addresses such challenges and provides an architecture that incorporates a series of utility services.
These address issues like automatic workflow composition, domain semantics, and the distributed nature of the
data. It also incorporates a series of data-oriented services that facilitate the actual integration of data. The
BioFederator is deployed on a grid environment over the web. An article describing the design was published in
the AAAI IIWeb; additional details and applications were presented in a book chapter published by IOS Press.
Studied and developed novel data integration and processing techniques for data-intensive applications. This
work was applied to a bioscience study where, for the first time, we presented a whole-genome prediction of
nucleosome exclusion regions for the human genome. The details of this work were published in an article in the
BMC Genomics journal in 2008 and featured in the 57th Annual Meeting of the American Society of Human
Genetics (ASHG). The output results were also made available to the scientific community as part of the
University of California at Santa Cruz (UCSC) Genome Browser custom data tracks.
Research team member of the Latin American Grid (LAGrid-BioGrid) project. BioGrid addresses research
issues for enabling grid computing technologies in bioinformatics applications. My work focused on studying,
designing and developing Grid/Web services for bioinformatics applications.
Electronics Research Institute (ERI), Feb. 1999 - Aug. 2004, Cairo, Egypt
Researcher, Electrical and Computer Engineering
Member of the Parallel Processing Team and team leader for a number of projects sponsored by the National
Science Foundation (NSF) U.S.A. and the European Union.
Designed parallel/distributed texture segmentation techniques that were applied to real-time distributed
surface inspection systems. I developed simple yet powerful parallel algorithms that advanced real-time
distributed inspection systems. These scientific contributions led to publishing an article in the IEEE SMC
conference, and another detailed article in the Elsevier Pattern Recognition Letters (PRL) journal.
Mentor Graphics Corporation, Jul. 2001 - Jul. 2002, Cairo, Egypt
R&D Engineer, Modeling and Interconnectix (ICX)
Studied problems in Electronic Design Automation (EDA) and designed and developed EDA tools and
simulation packages. These tools were used in the IBIS-to-SPICE converter to generate SPICE models from IBIS
data sheet files. My studies and designs enhanced the modeling process in terms of time and accuracy.
PUBLICATIONS
Peer-reviewed Book Chapters:
Rosa Badia, Gargi Dasgupta, Onyeka Ezenwoye, Liana Fong, Howard Ho, Sawsan Khuri, Yanbin Liu, Steve Luis,
Anthony Praino, Jean-Pierre Prost, Ahmed Radwan, Seyed Masoud Sadjadi, Shivkumar Shivaji, Balaji
Viswanathan, Patrick Welsh, and Akmal Younis, "Innovative Grid Technologies Applied to Bioinformatics and
Hurricane Mitigation," chapter in High Performance Computing and Grids in Action, pp. 436-462, IOS Press,
ISBN 978-1-58603-839-7, Amsterdam, 2008.
Peer-reviewed Articles in Journals:
Ahmed Radwan, Akmal Younis, Santhosh Srinivasan and Abhay Gupta, "MR-LEGOS: A Refined MapReduce Model,"
International Journal of Cloud Computing (IJCC) 1(1), 2011, pp. 58-80.
Ahmed Radwan, Akmal Younis, Peter Luykx and Sawsan Khuri, "Prediction and analysis of nucleosome exclusion
regions in the human genome," BMC Genomics, 9:186, 2008.
Ahmed Abouelela Radwan, Hazem M. Abbas, Hesham Eldeeb, Abdelmonem A. Wahdan and Salwa M. Nassar,
"Automated Vision System for Localizing Structural Defects in Textile Fabrics," Elsevier Pattern Recognition
Letters, 26, 2005, pp. 1435-1443.
Peer-reviewed Articles in Conferences:
Ahmed Radwan, Lucian Popa, Ioana Roxana Stanoi, Akmal A. Younis, "Top-k generation of integrated schemas
based on directed and weighted correspondences," ACM SIGMOD Conference, 2009, pp. 641-654.
Stefan Dessloch, Mauricio A. Hernandez, Ryan Wisnesky, Ahmed Radwan, Jindan Zhou, "Orchid: Integrating
Schema Mapping and ETL," IEEE International Conference on Data Engineering (ICDE), 2008, pp. 1307-1316.
Ahmed Radwan, Akmal Younis, Mauricio Hernandez, Howard Ho, Lucian Popa, Shivkumar Shivaji, and Sawsan
Khuri, "BioFederator: A Data Federation System for Bioinformatics on the Web," Proc. AAAI Sixth Int. Workshop
on Information Integration on the Web (IIWeb) 2007, pp. 92-97.
A. Abouelela Radwan, H. Abbas, H. Eldeeb, S. Nassar, "A statistical approach for textile fault detection," Proc. IEEE
Conference on Systems, Man, and Cybernetics (SMC), 2000, pp. 2857-2861.
Presentations, Abstracts and Posters:
Ahmed Radwan, Santhosh Srinivasan and Kalyan Ayloo, MR-LEGOS: A Data Warehousing ETL Toolkit, Yahoo!
TechPulse conference, 2010.
Ahmed Radwan, Ryota Egashira, Brian Keefe, A MapReduce Approach for Efficient Data Extraction from
Database Management Systems, Yahoo! TechPulse conference, 2010.
Ahmed Radwan and Abhay Gupta, Lotus MapReduce Legos, Yahoo! TechPulse conference, 2009.
Sawsan Khuri, Ahmed Radwan, Peter Luykx and Akmal Younis, Nucleosome Exclusion Regions across the
Human Genome, American Society of Human Genetics (ASHG) 57th Annual Meeting, San Diego, California,
23-27 October 2007.
Ahmed Radwan, Lucian Popa and Ioana R. Stanoi, Calculating Confidences and A Cost Function for Ranking
Schema Integration Alternatives, IBM Almaden Research Center Intern Showcase, 2007.
Ahmed M. Radwan, Ryan Wisnesky, Jindan Zhou, Didier Garcia, Bo Shao, Stefan Dessloch, Mauricio A. Hernandez,
Lucian Popa and Howard Ho, Orchid: ETL Mapping Transformation with Clio, IBM Almaden Research Center
Intern Showcase, 2006.
EDUCATION
Doctor of Philosophy (Ph.D.) in Electrical and Computer Engineering.
Thesis title: Information Integration in a Grid Environment - Applications in the Bioinformatics Domain.
University of Miami, U.S.A., Dec. 2010, GPA: 4.0.
Master of Science (MS) in Electrical and Computer Engineering.
Thesis title: Image processing - Statistical Approach for Texture segmentation - An implementation on a parallel
inspection system. Ain Shams University, Cairo, Egypt, Aug. 2002.
Bachelor of Science (BS) in Electrical and Computer Engineering.
Ain Shams University, Cairo, Egypt, 1998. Graduation Project: V-CAD: An FPGA based Design Flow (Grade:
Distinction). An electronic design automation tool including a schematic capture tool, a VHDL netlister, an
automatic test pattern generator and a PLA synthesis tool. The tool was featured in the Design Automation &
Test in Europe (DATE) conference in 1999, and was developed using Visual C++.
HONORS
- Membership of the Eta Kappa Nu (HKN) International Honor Society for Electrical Engineers (2006-Present).
- Membership of the Institute of Electrical and Electronics Engineers (IEEE) (2008-Present).
- Membership of the Association for Computing Machinery (ACM) (2009-Present).
- My BioFederator research work was awarded the prestigious IBM Faculty Award in 2009.
SERVICE TO PROFESSION
Conference Reviewer: VLDB 2007, the 33rd Very Large Data Bases Conference.
Conference Reviewer: ICDE 2008, the IEEE 24th International Conference on Data Engineering.
Conference Reviewer: ICMT 2010, the International Conference on Model Transformation.
TECHNICAL SKILLS
Programming using Java, C/C++, Visual C++, Pascal, Prolog, x86 assembly and network programming using
sockets.
Familiar with the following programming environments: MS Win95/2000/XP, Solaris UNIX, Red Hat Linux,
PARIX (Parallel UNIX), and PVM (Parallel Virtual Machine).
Special purpose languages: VHDL, JavaCC, Lex & Yacc, SQL, XQuery, SPARQL, PHP, JSP, and MATLAB.
ETL and data warehousing tools (IBM DataStage).
Eclipse, Rational Rose, UML, EMF, Apache Axis, web services, HTML, XML, RDF and OWL semantic web
technologies, Hadoop MapReduce, Pig Latin.