EDWARD COHEN
Email: *********@********.***
Address: ** *** *** ******
City: GORDONVILLE
State: TX
Zip: 76245
Country: USA
Phone: 903-***-****
Skill Level: Experienced
Salary Range: $208,000
Willing to Relocate
Primary Skills/Experience:
See Resume
Educational Background:
See Resume
Job History / Details:
DR. EDWARD I. COHEN
39 Sky Way Circle
Gordonville, Texas 76245
"Microsoft Word Resume Available on Request"
PERSONAL SUMMARY
Dr. Cohen is a senior Information Technology expert, with broad experience in the areas of system design, development, test, and implementation. He has provided leadership in many projects in both senior technical and project management roles. He frequently works on system infrastructure design in complex, heterogeneous environments. He has deep experience in system performance analysis, performance modeling, capacity planning, and load testing. He has experience across many system architectures, including distributed heterogeneous systems, client/server, and traditional mainframe environments.
Dr. Cohen has been the technical lead in many system management process engineering projects. In particular, he has worked on the design, development, and implementation of 1) Change Management, 2) Problem Management, and 3) Disaster Recovery.
WORK EXPERIENCE
(03/2009-10/2012) Implemented and executed the Capacity Planning Process for Frito-Lay (PepsiCo) that was defined in fall 2008. The basic responsibilities were advanced data analysis, modeling, and defining the plan for server upgrades based on workload growth trends. Recent results allowed Frito-Lay to delay a $5 million system upgrade from 2010 to 2012. See the description of this process below in the item for (08/2008-01/2009).
(10/2009-01/2010) Hyperformix Performance Optimizer (P.O.) and its custom modeling solution for SAP applications were used at PepsiCo to model a major implementation of the core corporate process for Order Management (ECC). The official deployment plan and schedule were used to establish calendar checkpoints and expected workload growth. Each major checkpoint was modeled with both HP Superdome UNIX servers and IBM AIX servers to determine required server configurations through full rollout in 2013.
(01/2009-03/2009) Led the "Stress Test and Customer Acceptance Analysis" efforts for a new web-based, multi-tier client-server application. The implementation was in Java, and it runs on a set of VMware Linux servers hosted on IBM X-server platforms. The "Virginia Education Wizard" provides guidance and assistance to high school students in selecting their career path and college education plans. It includes videos, online aptitude tests, information about colleges and universities, and information on available financial resources. Responsibilities included training and mentoring the young test team in designing, coding, and debugging a set of 11 test scripts using the Rational (IBM) Robot and Performance Tester suite of test tools. The team was trained in proper testing methodology, Application Performance Analysis, and System Capacity Planning. Technical responsibilities included application analysis, executing test scenarios, monitoring the server CPU statistics using the EM7 performance appliance, and analyzing network packet traces using Compuware "APPVANTAGE". The initial results indicated that the system was constrained and unstable, even at low loads. At the start, no more than 15 active users (the requirement was hundreds to thousands of users) could be supported on the planned hardware configuration, which was particularly constrained at the DB server tier. The full load requirement couldn't be met even with the maximum server configurations possible. As a result, crash efforts were put into motion to address various aspects of the capacity shortfall. The Oracle DB was fine-tuned, and the Java application was scrubbed repeatedly to reduce the resource usage of many application functions and sub-functions. In addition, the "Memcache" look-aside database buffering product was added to the Application Servers to reduce the DB calls for static table data.
As a result of our efforts, at first availability the production configuration could support thousands of concurrent users. The "WIZARD" deployment went off without a hitch on schedule and has not hit any performance or capacity issues. The network packet trace analysis showed that the team had accomplished an 80% reduction in network traffic between the application servers and database server, which is also a measure of the reduction in server loads.
(08/2008-01/2009) Worked on two consulting engagements for Frito-Lay, a large snack food manufacturer and a division of PepsiCo Corp. In the first project the current workload profiles of nine LPARs across five IBM Mainframe systems were analyzed, and the workload growth history over 18 months was studied. Results identified a significant capacity constraint on the largest system that needed immediate attention. Database parameter changes were recommended and implemented, providing some relief. Improved workload balancing across systems was also recommended, allowing a CPU upgrade to be deferred for four quarters.
In the second engagement a complete "Proactive Capacity Planning Process" was developed to ensure that system capacity management would be done on an ongoing basis. The process is general and can be used for all corporate platforms (IBM Mainframes, UNIX servers, and WINTEL servers). The process consists of five phases: 1) Gathering Capacity Data, 2) Retrieving Capacity Data, 3) Converting Capacity Data, 4) Capacity Model Runs, and 5) Generating Charts and Reports. The process was defined and documented at three levels: Process Overview, Design Documentation for each Phase, and Implementation Specifications. A data archive was designed for all related Capacity Planning data files, charts, and reports. Deliverables included: Full Process Documentation, Data Conversion Script, Capacity Planning Simulation Model, Capacity Planning Data Archive, an Automated Script to drive the process, and a standard documentation package (using Crystal Reports).
(10/2007-06/2008) Worked on two consulting engagements for ING, a large financial management corporation. They are planning to migrate their IBM Mainframe applications to a new release of a major application package that will also require migration of their database to DB2 from the current non-relational database technology. The application currently has problems meeting its batch SLAs at both month-end and quarter-end. The analysis approach was to build a simulation model of the application capacity requirement after the application/database migration, two years of projected workload growth, and other workload changes. The results fell within a range that reinforced the customer expectations of required CPU capacity across a 24-month post-deployment period. Additional results were obtained by doing a sensitivity analysis across a range of values quantifying the increased CPU requirements of the new application release.
The vendor's performance claim was used as the default hypothesis for this phase of the modeling. The planned database migration has typically been found to cause a 25% CPU increase for database access. A sensitivity study was done to determine the potential bottom-line impact across a range of application database intensities. During the first engagement described above I noticed symptoms of seriously inefficient database I/O in the current application environment. This led to a second engagement to study the current I/O pattern, which would be used for another 2 years. In this study I found that the database I/O rate was much higher than it should have been, particularly in the database index components. I found that it was the typical problem of not keeping the database structure up to date and not using the most efficient access pattern. I developed new guidelines in these areas and trained a group of customer application support personnel to go through each file used by each job to correct what I call "Performance Bugs". The customer wanted quick-hit improvements that would not require code changes and the resulting extra testing. Dozens of key batch jobs were selected for testing in top-down order based on total run time (2-20 hours) and total I/O counts. Easy-to-implement changes were installed in the test environment, and the resulting improved jobs were benchmarked on the same data as the prior month-end run. The I/O count was typically reduced by 30% or more, frequently by over 50%, and in some key huge jobs (10-20 hours) the savings exceeded 90% of the I/O load.
(03/2007-07/2007) Traveled to Johannesburg, South Africa to work on an IBM engagement as a "Senior IT Testing Architect" at Standard Bank of South Africa, a major South African bank. The server environment ran on Sun/Solaris and included both a variety of Sun servers and Fujitsu servers providing Sun server images through virtualization and partitioning. Performed an audit of their entire cradle-to-grave testing process. The results were viewed as a negative evaluation of their current efforts because their infrastructure and application testing was found to be incomplete, immature, and dangerous. They don't sufficiently test new infrastructure and applications before deploying them world-wide. The danger is that what should be the final phase of testing (load, stress, and performance testing) occurs in production using their customers as test subjects. The organization and executives run hot and cold on testing. Just after a new application disaster they are hot to improve their testing, but after things settle down they decide that they really can't afford to increase the overall testing budget. Two major recommendations were considered: 1) implementation of a new test lab for load, stress, and performance testing done by a dedicated staff of testing experts and 2) combining "Integration Test" and "User Acceptance Test" into one test phase to reduce cost, overhead, and schedule impacts.
(08/2006-12/2006) Used Hyperformix Performance Optimizer simulation models to analyze the capacity of system servers for Motorola, the largest world-wide manufacturer of cell phones. The goal was to determine if the existing Sun/Solaris server configuration was capable of supporting the expected workload during the peak holiday season. The applications on these servers supported cell phone assembly lines in Europe, South America, and Asia. IPS Simulation Models were written for the major online applications and their associated background batch components. Available current data for resource consumption and assembly line workload was gathered and used to generate a base workload profile. The model was then cranked up to the workload levels expected for the holiday season to determine if the workload could be supported on the configured set of servers. In some cases the identification of a significant fixed background load led to the conclusion that the installed hardware would be sufficient. In one case the modeling results showed that there would be a problem. We recommended that the server in danger be reconfigured to increase its capacity.
(11/2004-06/2006) Worked as a Senior Consultant in the Performance and Capacity Planning department of a large financial corporation. Overall configuration consisted of five major computing centers spread across the country. Environment at each location included a large IBM mainframe configuration, multiple large AIX LPAR server configurations, and an even larger configuration of powerful Windows servers. The I/O architecture was based totally on a large configuration of high-end SAN subsystems. Major projects included the early study of a major new three-tier IBM Websphere-based corporate application. This application is a new system for insurance agents providing quotations for fire insurance policies. Responsibilities included the design and execution of a Performance Stress Test and then using the test data to drive an IPS simulation model to predict the performance and capacity of the application across the expected workload levels for the next ten years. Results were very encouraging, and the Websphere application appears to be very efficient and scales linearly. Responsibilities included high level script design and analysis as input to the Silk scripting group.
In another project a major database conversion program was reviewed, and design improvements were formulated and tested. The database was scattered across multiple MS SQL-Server platforms, and consolidation on a single AIX DB2 database was recommended. The runtime of the conversion program initially was projected to consume 560 hours of CPU time on the most powerful IBM mainframe. The redesigned program was projected to run in 4 hours or less, using more efficient I/O processing, more aggressive I/O Memory Caching, and exploiting concurrency and multiprocessing. Capacity Planning models (Excel) were implemented for a variety of web-based workloads, including MS Active Directory and a network of banking ATMs. In another project a sequence of ten IPS simulation models was used to study the capacity of a proposed large distributed corporate-wide application. The models used a high level of design abstraction to study the characteristics of the interaction of all application phases to identify the most problematic mix of application programs.
(01/2003-09/2004) Produced the Network System Management infrastructure (NSM) for Caterpillar Tractor. The overall distributed system architecture of NSM was defined, and the initial set of processes was selected for design and implementation. Processes for Change Management and Problem Management were defined through many levels of detail. Personnel roles and responsibilities were identified, a corporate-wide NSM organizational structure was developed, and the initial staff was trained for their new roles. After the processes were defined, a "Vendor Tool Selection" was performed based on the support needs of the defined processes. Approximately 20 vendors were studied and analyzed, including product demonstrations. The Remedy Action Request (AR) system was selected to manage all system changes and problems. The last phase of the project was a pilot implementation of the above processes and tools with a team of staff members and vendor consultants. In Phase 2, a Global Configuration Repository for a three-tier client/server environment was designed, and a Configuration Management process was defined to control the accuracy of the repository. Relationships to Change Management and Problem Management were identified, and their functional interface requirements were defined.
(01/2002-10/2002) Analyzed the performance and capacity of a new SAP-based application for Ford Motor Company. Used the Hyperformix IPS (Integrated Performance Suite) product set to model the new customer environment. Used IPS Navigator to manage the overall modeling methodology. Used IPS Profiler to profile existing applications and to produce estimated profiles of the new application from design documentation. Used IPS AMG (Application Model Generator) to define the modeled applications. Executed simulations of the new application running on a variety of distributed client-server hardware configurations using the IPS Infrastructure Optimizer. Directed single-user testing to gather Network Sniffer Traces in the customer's SAP environment. Worked on the development of a new SAP application profiling methodology using SAP transaction log data. In a follow-on phase of the project an SAP User Exit program was designed, coded in the SAP application language ABAP, tested on Windows and AIX servers, and finally run on a large Solaris-based production environment covering eight different SAP servers. File transaction log data, externalized through an ABAP User Exit, was used to define application transaction resource usage profiles. A successful Proof of Concept was achieved when these profiles were then used to populate a Hyperformix AMG model of the SAP production environment.
(09/1998-12/2001) Led a small team of senior technical consultants in attacking the problems of poor performance leading to system instability at Target, a major national retail chain. Worked as both project manager and technical lead consultant. The team identified, analyzed, and fixed multiple problems across many system components including CICS transaction flows, DB2 database, VSAM files, I/O configuration, and network management. Over a three month period system performance and stability were improved sufficiently to handle the busy holiday shopping season.
Led a major stress test of the client's new Y2K application suite to identify functional problems and verify performance and capacity planning estimates. Performance and capacity goals were exceeded for all key workload components in terms of transactions per second and response time. Many functional problems were discovered and corrected with cooperation from the application development team. These problems typically required a high workload level or specific application conditions to manifest themselves and could only be found in a stress test environment. The subsequent smooth rollout of the Y2K applications was the result of this intensive test effort. LoadRunner was a key component of the test environment.
Dr. Cohen also worked with the client's support team during the application rollout and the post-rollout transition period. The solid performance and stability of the new application suite and upgraded system configuration allowed the client's team to quickly move on to follow-up application components. He then worked with application development on additional components such as tuning the overnight batch workload to improve its capacity and shorten its runtimes.
(11/1996-06/1998) Analyzed the need to consolidate multiple workloads for efficiency and to support a significant increase in workload processing capacity for the largest online and batch application systems at a mid-size mortgage processing company. This problem was attacked on all fronts, including significant hardware upgrades, software upgrades, operational improvements, and application re-engineering. A roadmap leading to a 10X increase in workload capacity was formulated. All major hardware components, including CPU, memory, channels, control units, DASD, tape transports, and tape libraries, were upgraded to larger configurations of more powerful units. Key applications were redesigned to take advantage of the new hardware environment and to exploit concurrency, parallelism, and the concept of I/O avoidance through Data in Memory (DIM). The programming staff was retrained to focus on parallelism and I/O efficiency in their program designs. A state-of-the-art workload manager was added to the environment to increase the productivity of the user community. Custom program interfaces between the workload manager and the legacy system were designed and implemented for all critical online and batch application systems.
(01/1996-11/1996) Worked on a Systems Management engagement at Caterpillar, a large earth-moving equipment manufacturer. My major design responsibilities covered the areas of Problem Management and Change Management. The key deliverables were detailed process definitions including all required support documentation and staffing/training recommendations. In addition I led a pilot implementation of Network Management, Event Management, and Configuration Management using the Tivoli suite of products. Multiple system platforms were included in the corporate environment, and the key ones were HP-UNIX, SUN-SOLARIS, and Oracle DB.
(09/1995-12/1995) Worked on an engagement designing detailed processes to manage the multiple aspects of system management at Mutual of Omaha, a large insurance company. Processes included Performance Engineering, Capacity Planning, Problem Management, Change Management, Network Management, Configuration Management, and Disaster Recovery.
(05/1995-07/1995) Designed detailed System Management processes for NationsBank. Key topics covered include 1) Problem Management, 2) Change Management, 3) Configuration Management, 4) Data Management / Recovery, and 5) Disaster Recovery. The key deliverables were detailed process definitions including all required support documentation.
(03/1995-05/1995) Analyzed the information technology requirements for a large retail company to move from a mainframe host-centric environment to a client/server architecture including an IBM MVS mainframe as an enterprise server, RS/6000 AIX systems with Sybase as database servers, Windows/NT LAN Servers, and a combination of Windows and MAC desktop workstations. This is a key function in driving their corporate financial applications, including billing, customer order entry, and a credit authorization application which interfaces to a credit-checking service bureau for large contractor orders. Specific technical focus areas included data backup/recovery, network management, performance and capacity planning, help desk function, and workstation standards and services.
(06/1994-12/1994) Led a team of application experts at the U.S. Naval Academy to implement a complete replacement for its administrative applications. The new system environment was a cluster of UNIX SMP (Symmetric Multiprocessor) servers running ORACLE database software. The configuration also includes High Availability RAID data storage. The development environment for the new applications utilizes a Client-Server Architecture and the ORACLE CASE methodology. Responsibilities included System Architecture, Project Management, and Project Planning. A long term plan for the full life cycle of the project was developed including User Requirements, System Design, Application Implementation, Testing, and System Management. The life cycle includes transition from development to production, application deployment across diverse user groups, and the management of the environment after deployment.
(08/1993-04/1994) Led a combined consultant and staff team in an engagement to design and implement "System Management Strategies for A Distributed Client/Server Complex" at Panhandle Eastern. Key areas included 1) Change Management, 2) Problem Management, and 3) Capacity Planning. The key deliverables were detailed process definitions including all required support documentation.
(04/1993-07/1993) Designed the Architecture of a ground data center for Motorola's Iridium Satellite Phone System. High availability was the key goal of the architecture. It basically consisted of fully cross-connected dual SUN servers running SunOS and Oracle. It also used Data General High Availability Clariion RAID disk subsystems.
(09/1991-12/1992) Managed all aspects of a large project in an airline service company to replace an IMS mainframe database system with a dedicated server platform using UNIX and the ORACLE relational database product set. The project was in support of a corporate strategy of total out-sourcing of all I/S functions. The project team consisted of 35-40 professionals, including programmers, testers, user reps, and project administrators. Responsibilities included project planning, recruiting, budget and manpower, vendor and subcontractor negotiations, overall system architecture and application design, hardware selection and installation, facilities management contracts, and user training and documentation. In support of selecting the server platform, an "Analytical System Selection Methodology" was developed to quantify and compare over 40 UNIX vendors to select an optimal out-sourcing business partner. This methodology can be customized to find the optimum match for any given set of user requirements across a range of factors, including performance and capacity, connectivity capabilities, software function, dedication to standards, reliability, system administration functionality, and vendor financial stability. System selection included negotiating bids with vendors based on the "Total Cost of System Ownership" over a 5-year horizon to obtain the highest value per price.
(05/1978-07/1991) Dr. Cohen was responsible for the performance of large systems for the IBM corporation, including operating systems, subsystems (e.g., IMS, CICS, DB2, VSAM), and system hardware (processors, control unit, DASD, etc.). Managed a group of senior computer scientists, programmers, and engineers. Prime focus areas included overall system directions, coordination of multiple hardware and software product plans, and design and performance analysis of specific functions such as Storage Hierarchies (Expanded Storage, DASD, and Processor Caching Structures), Multi-System Coupling, Cooperative Processing, and Enhanced I/O Architectures. Was department manager of a large test organization responsible for system-wide functional and performance test.
Technical accomplishments include developing performance definitions, metrics, and sound system performance measurement methodologies. Established the use of proper "Scientific Method" in performance evaluation and experimentation. Designed the first successful implementation of Symmetric Multiprocessing, which is still used in all IBM and IBM-compatible mainframe systems. Discovered and quantified Large System Effects which degrade system performance at high transaction loads. Worked on a variety of ad-tech projects which have led to the introduction of Expanded Storage, Advanced Systems Coupling, and the Data in Memory (DIM) strategy.
EDUCATION
B.S., Physics, Rensselaer Polytechnic Institute, Troy, N.Y.
M.S., Computer Science, The Ohio State University, Columbus, Ohio
Ph.D., Computer Science, The Ohio State University, Columbus, Ohio
HONORS AND AWARDS
IBM Management Excellence Award for both technical and people management.
IBM Systems Research Institute, Guest Faculty.
IBM Outstanding Technical Achievement Award (OTAA) for "Exploitation of Processor Storage".
IBM Outstanding Technical Achievement Award (OTAA) for "MVS/XA True Ready Queue Design".
IBM Outstanding Technical Achievement Award (OTAA) for pioneering work in the field of "Large System Performance Analysis".
Transaction Processing Performance Council (TPPC), Technical Advisor on "Performance Methodologies".
PATENTS
"DYNAMIC QUEUING METHOD" - (U.S. and International)
"A DISPATCHER SWITCH FOR A DYNAMIC SYSTEM PARTITIONER" - (U.S. and International)
MAJOR PUBLICATIONS
"Storage Hierarchies", IBM Systems Journal
"Coupled Systems for Performance", SHARE European Association (SEAS) Conference
"The IBM Large System Performance Reference (LSPR)", IBM Customer Education
"MVS/XA Performance Considerations", SHARE User Group Conference Proceedings
"An Analytical System Selection Methodology", CMG International Conference Proceedings
"Storage Hierarchies - A Natural System Structure", CMG International Conference Proceedings
"A Domain Strategy for Computer Program Testing", IEEE Transactions on Software Engineering
MAJOR PRESENTATIONS
"Large Systems: Where We Have Been and Where We Are Going", Invited Keynote Address at Large System Performance Conference
"Coupled Systems for Performance", SHARE European Association (SEAS) Conference
"MVS/XA Performance Considerations", SHARE User Group Conference
"Storage Hierarchies - A Natural System Structure", CMG International Conference
"Semiconductor Storage Systems", DATAQUEST Conference
"An Analytical System Selection Methodology", CMG International Conference
"MVS/XA Values & Performance Expectations", IBM Worldwide Customer Show
REFERENCES
References will be provided upon request.