
Design Engineer, Embedded System, VLSI, Verification, Test, C, C++

Location:
Los Angeles, CA
Posted:
January 25, 2019


Resume:

SOHEIL SHABABI
www.shababi.us
*******@***.***
323-***-****
3025 Royal St. #Apt 344B, Los Angeles, CA 90007

Objective: Seeking a full-time position in CPU/ASIC/CAD/Physical Design and Verification, starting May 2019.

EDUCATION

University of Southern California, Master of Science in Computer Engineering, Los Angeles, CA, Expected May 2019
Courses: Digital System Design (EE560), Advanced Computer Architecture (EE557), VLSI System Design (EE577b, EE577a), Diagnosis and Design of Reliable Digital Systems (EE658), Computer Aided Design (EE681), Mathematical Foundations for System Design (EE599), Computer System Organization (EE457)

Shahid Beheshti University, Bachelor of Science, Computer Engineering, Tehran, Iran, May 2017
Thesis: Hardware Acceleration of Image Edge Detection on FPGA-SoC platform using OpenCL, demo: http://www-scf.usc.edu/~shababi/OpenCL.html

WORK EXPERIENCE

Hardware Design Engineer, Naminic Corporation, Tehran, Iran May 2015 - Sep 2016

• Designed and implemented a CNC controller on FPGA platforms, using the Nios II IP core

• Developed the application layer of network management software that connects a network of ARM processors for Laser-Tag products, using the Qt platform

TECHNICAL SKILLS

Processors Knowledge: Out of Order (Tomasulo), Multicore (CMP), VLIW, Superscalar, SMT (HTT), Vector Processors
Verification: UVM, Formal Verification, SystemVerilog Assertions
Test Knowledge: DFT, Boundary Scan, ATPG, Fault Simulation
Data Transmission Protocols: PCIe, AXI, APB, DDRx
Cache Coherency Protocols: MOESI (Snooping Based), ccNUMA 3-Leg (Directory Based)
Hardware Description Languages: SystemVerilog, Verilog, VHDL, OpenCL
Scripting Languages: TCL, Bash, Python, Perl, Make
Software Development Languages: C/C++, Java
Version Controls: Git, GitHub

Tools:

ASIC 45nm design flow: Design Compiler (Synthesis), Innovus (PnR), PrimeTime (STA), Cadence Virtuoso (Layout), ABC (Technology Mapping, Logic Optimization)
Simulation: NCSim, QuestaSim, ModelSim, ISim, HSpice
FPGA design flow: Vivado, Xilinx ISE (Nexys4), Altera Qsys/Quartus (DE2, DE1SoC)
Architecture Design Tools: SimpleScalar, Cacti, PIN

PUBLICATION

Deep Learning-based Circuit Recognition Framework using Sparse Mapping and Level Dependent Decaying Sum Circuit Representation
Arash Fayyazi, Soheil Shababi, Pierluigi Nuzzo, Shahin Nazarian, Masoud Pedram
Accepted for the DATE 2019 Conference, Florence, Italy

ACADEMIC PROJECTS

NoC: Developed a Chip Multiprocessor (CMP) on a ring-router NoC, from RTL design to layout. Connected processor cores to router nodes by implementing a network interface component (NIC). Performed RTL coding, RTL simulation, synthesis, post-synthesis simulation, place and route (PnR), post-PnR simulation, and Static Timing Analysis (STA) on the CMP. Achieved clock periods of 5.12 ns, 11.4 ns, and 15 ns for the pre-synthesis, post-synthesis, and post-PnR phases, respectively, and a layout area of 686,000 µm² in 45nm technology.

CAD: Designed and implemented a VLSI placement CAD tool optimized for low-power, high-speed technologies such as AQFP, using C++ and shell scripts. Developed a detailed placer for level-based placement. Solved a Linear Assignment Problem with the Hungarian algorithm to find the optimal placement for each cell. Implemented a global placer to minimize wire length, and a legalization step to remove overlaps between cells.
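To illustrate the assignment step named above, here is a minimal Python sketch of the Hungarian algorithm for a cell-to-slot assignment. It is not the code of the original C++ tool. The function name `hungarian` and the reading of `cost[i][j]` as the cost of placing cell i in slot j are assumptions made for illustration only.

```python
def hungarian(cost):
    """Solve a linear assignment problem (rows <= columns).

    cost[i][j] is the (assumed) cost of putting cell i in slot j.
    Returns (minimum total cost, assignment), where assignment[i]
    is the slot chosen for cell i.
    """
    n, m = len(cost), len(cost[0])
    INF = float("inf")
    u = [0] * (n + 1)      # row potentials
    v = [0] * (m + 1)      # column potentials
    p = [0] * (m + 1)      # p[j] = row matched to column j (0 = unmatched)
    way = [0] * (m + 1)    # back-pointers along the augmenting path

    for i in range(1, n + 1):
        p[0] = i
        j0 = 0
        minv = [INF] * (m + 1)
        used = [False] * (m + 1)
        while True:
            used[j0] = True
            i0 = p[j0]
            delta = INF
            j1 = 0
            # Find the unused column with the smallest reduced cost.
            for j in range(1, m + 1):
                if not used[j]:
                    cur = cost[i0 - 1][j - 1] - u[i0] - v[j]
                    if cur < minv[j]:
                        minv[j] = cur
                        way[j] = j0
                    if minv[j] < delta:
                        delta = minv[j]
                        j1 = j
            # Update potentials so the chosen edge becomes tight.
            for j in range(m + 1):
                if used[j]:
                    u[p[j]] += delta
                    v[j] -= delta
                else:
                    minv[j] -= delta
            j0 = j1
            if p[j0] == 0:
                break
        # Flip the matching along the augmenting path.
        while True:
            j1 = way[j0]
            p[j0] = p[j1]
            j0 = j1
            if j0 == 0:
                break

    assignment = [0] * n
    for j in range(1, m + 1):
        if p[j]:
            assignment[p[j] - 1] = j - 1
    total = sum(cost[i][assignment[i]] for i in range(n))
    return total, assignment
```

For a 3x3 example, `hungarian([[4, 1, 3], [2, 0, 5], [3, 2, 2]])` assigns cell 0 to slot 1, cell 1 to slot 0, and cell 2 to slot 2.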

UVM: UVM-based constrained-random verification of an AMBA APB slave. Designed a Sequence Generator to produce TLM sequence items, a Driver to convert TLM sequence items to APB pin-level signals, and Monitors to convert APB pin activity back to transactions.

Ethernet: Constrained-random verification of an Ethernet switch. Generated constrained-random packets and used mailboxes to transfer packets between the driver, monitor, and packet generator. Placed Monitors on all ports of the DUT, with a Checker that compares the packets from the Monitors to maintain the score.

PCIe: A 2-lane, 2-link PCIe architecture on a Xilinx Artix-7 FPGA. RTL design and synthesis of an “Elastic Buffer” to detect SKP ordered sets (SOS), a “Deskew Buffer” to adjust delays between lanes, and a “10b/8b Decoder” to track running disparity and avoid DC coupling noise. Verified the FPGA prototype by measuring relative skew between lanes and adding trigger conditions in ChipScope ILA.

AXI: Implemented the Advanced eXtensible Interface (AXI) protocol with five channels to connect 4 processors to 4 memories in a 2x4 mesh interconnect with deterministic “x first, y next” routing logic. Managed out-of-order packets using reorder buffers. Wrote TCL scripts to verify the functionality of the AXI implementation by comparing output results with golden files.
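The “x first, y next” rule mentioned above can be sketched in a few lines of Python. This is an illustration only, not the RTL of the project. The function name `xy_route` and the (x, y) coordinate convention for mesh nodes are assumptions.

```python
def xy_route(src, dst):
    """Return the list of mesh nodes visited from src to dst.

    Nodes are (x, y) tuples (an assumed convention). The packet first
    travels along x until the column matches, then along y, so every
    source/destination pair always uses the same path.
    """
    x, y = src
    path = [(x, y)]
    while x != dst[0]:          # resolve the x offset first
        x += 1 if dst[0] > x else -1
        path.append((x, y))
    while y != dst[1]:          # then resolve the y offset
        y += 1 if dst[1] > y else -1
        path.append((x, y))
    return path
```

Because the path depends only on the source and destination, the routing is deterministic, which is the property the project description names.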

OoO: Out of Order (OoO) Tomasulo processor that executes instructions speculatively beyond branches, using dynamic instruction scheduling and dynamic branch prediction with a Branch Prediction Buffer (BPB) and Return Address Stack (RAS). RTL design, synthesis, and FPGA implementation of modules such as reservation tables (to avoid CDB conflicts), dispatch unit, FRAT, RRAT, FRL, PRF, and CFCs.

CMP: Chip Multiprocessor (CMP) with a 7-stage, 16-threaded CPU (4 threads x 4 cores), private L1 caches, and a shared L2 cache. Maintained cache coherency and sequential consistency across the L1 caches of the 4 cores by implementing a non-blocking cache with the MOESI protocol, SCU, CCU, MSHR, WSHR, FSHR, SC, LL, and cache maintenance instructions. Accelerated matrix multiplication by 16X by utilizing 4 cores.

CDC: Clock Domain Crossing (CDC) in a dual-clock FIFO, with double-flop synchronizers and Gray-coded pointers to control metastability. Used a dual-ported flow-through and pipelined SSRAM as the FIFO memory, and a small register-based memory as the consumer's internal FIFO with the FWFT (First-Word Fall-Through) technique. Performed FIFO width and depth expansion using the ping-pong method.
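The reason Gray codes suit pointers that cross clock domains is that consecutive values differ in exactly one bit, so a synchronizer that samples mid-transition sees either the old or the new pointer, never an unrelated value. A minimal Python sketch of the conversions follows. The function names are illustrative, and this is not the project's RTL.

```python
def bin_to_gray(b):
    """Convert a non-negative binary count to its Gray-code value."""
    return b ^ (b >> 1)


def gray_to_bin(g):
    """Convert a Gray-code value back to the binary count."""
    b = 0
    while g:
        b ^= g      # fold in each right-shifted copy of g
        g >>= 1
    return b


def one_bit_apart(a, b):
    """True if a and b differ in exactly one bit position."""
    return bin(a ^ b).count("1") == 1
```

For a 4-bit pointer, every step of the count, including the wrap from 15 back to 0, changes a single bit of the Gray-coded value.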

SRAM: Designed a 1Kb 6T SRAM with 2 banks in Cadence Virtuoso. Applied a super buffer to adjust the input capacitance of the SRAM address decoders. Used pre-decoders to speed up the address/column decoders. Designed the sense amplifier, precharge circuit, SRAM cell, SRAM banks, address/column decoders, and read/write circuitry. Achieved a frequency of 1 GHz in 90nm CMOS technology.

CERTIFICATIONS

TCL Programming from Novice to Expert, Udemy, Certification Link: https://www.udemy.com/certificate/UC-3DYSCLRW/

Master Git and Github, Udemy (ongoing course), Course Link: https://www.udemy.com/github-ultimate/

GNU Programmer (with a focus on makefiles and gdb debugging), Udemy, Certification Link: https://www.udemy.com/certificate/UC-8VYAWMCQ/


