
Signal Processing Assistant

Location:
Houston, TX
Posted:
January 30, 2013


Resume:

Guohui Wang

PhD student, ECE Department, Rice University, Houston, Texas

Webpage: www.GuohuiWang.com Email: ***@****.***

Research Interests

Mobile computing, GPGPU computing, parallel processing, wireless communication.

Education

Rice University 2008 - present

PhD candidate in Electrical Engineering (GPA: 4.12)

Chinese Academy of Sciences, Beijing, China 2005 - 2008

M.S. in Computer Science

Peking University, Beijing, China 2001 - 2005

B.S. major in Electrical Engineering

B.S. minor in Economics

Work Experience

Qualcomm, San Diego, CA

Intern (Received Qualstar Diamond Award) Summer 2012

GPGPU computing research on mobile GPU

Studied general-purpose computing on mobile GPUs using the OpenCL framework. Implemented and optimized Qclbenchmark (Qualcomm OpenCL Benchmark) on Snapdragon MSM8960/APQ8064 mobile chips.

Optimized computer vision and image processing algorithms on the Adreno 320 GPU and built Android demos to showcase the capability of GPGPU on mobile devices.

National Instruments R&D Austin, TX

Research Intern Summer 2011

1Gbps 8x8 MIMO LTE-Advanced transceiver prototype

Implemented an FPGA-based LTE-Advanced prototype that achieves close to 1 Gbps data rate and was demonstrated in the keynote presentation at the NIWeek 2011 conference. Developed high-performance channel estimation and MIMO detection using LabVIEW FPGA. Fixed-point simulations were written in C and MathScript. The whole design was synthesized with Xilinx ISE. The channel estimation and MIMO detection modules consume 91.1% of the slices and 52.5% of the DSP48s on a Xilinx Virtex-5 XC5VSX95T FPGA when targeting 100 MHz.

Research Experience

Rice University, Houston, TX

Research Assistant May 2010 - present

Research on multicore mobile platform

Studied mobile CPU-GPU co-design and workload partitioning for augmented reality applications to reduce power consumption on mobile devices. OpenGL ES, C/C++ and Java were used to develop benchmarks on the Android platform for an NVIDIA Tegra 2 device.

GPGPU parallel computing

(Related publications: ASILOMAR 2012, SASP 2011, ASILOMAR 2011, JSPS 2011)

Studied massively parallel accelerators on GPGPU for high-performance DSP algorithms such as error correction codes and MIMO detection. Focused on algorithm mapping onto GPGPU architecture and performance optimization. Techniques used to improve performance include parallelism optimization, memory optimization, and adaptive thread/thread block configuration. For example, the GPGPU-based LDPC decoder achieves over 100 Mbps throughput on an NVIDIA Fermi GPU.

Rice University Houston, TX

Research Assistant August 2008 - present

Algorithms and architecture for high performance communication systems

(Related publications: ISCAS 2013, ASILOMAR 2012, ASAP 2011, ISCAS 2011, ASILOMAR 2009)

Studied and improved the algorithms and architectures for 4G wireless communication systems such as channel decoders and MIMO detectors. Designed and implemented very high throughput, high efficiency, low complexity ASIC architectures using Verilog HDL.

Tools: MATLAB simulation, fixed-point simulation in C, Verilog HDL.

Proposed and implemented a flexible router architecture to eliminate memory conflicts in parallel decoding systems and enable a high-throughput multi-standard interleaver.

Designed a flexible Turbo decoder supporting HSPA+, LTE and WiMAX standards.

Designed the VLSI architecture of a high-throughput multi-layered LDPC decoder.

Designed and implemented an FPGA prototype of a 3GPP LTE uplink receiver.

Rice University Houston, TX

Research Assistant September 2010 - May 2011

Used High-Level Synthesis (HLS) tools to design DSP accelerators.

Implemented ASIC accelerators for several key modules in wireless communication systems such as QR decomposition, CORDIC module and fully parallel matrix multiplication. Mentor Graphics Catapult C HLS tool and Design Compiler were used.

Chinese Academy of Sciences Beijing, China

Research Assistant September, 2005 - June, 2008

VLSI architecture for 2K HD cinema system

(Related publications: High Technology Letters 2008)

Designed and implemented a VLSI architecture for a high-throughput 2K high-definition digital cinema (DCI-compliant) playback system. Developed a high-throughput buffer system to handle concurrent multi-channel streams and to achieve real-time video-audio synchronization. Implemented color space conversion, buffering systems and a package control module using Verilog HDL. The system can decode 250 Mbps JPEG2000 data and output dual-channel 1.8 Gbps digital video.

Developed MXF package parsing tools and JPEG2000 decoding software.

Publications

[Book chapter]

Y. Sun, G. Wang, B. Yin, J. R. Cavallaro and T. Ly, High-level Design Tools for Complex

DSP Applications, DSP for Embedded and Real-Time Systems: Expert Guide, Elsevier,

2012.

[Journal Papers]

G. Wang, Y. Sun, J. R. Cavallaro and Y. Guo, High-Throughput Low-Complexity Interleaver Architecture Solving Memory Contention Problem for Parallel Turbo Decoder, in preparation to submit to Journal of Signal Processing Systems.

Y. Sun, G. Wang and J. R. Cavallaro, A 1.2Gbps 3GPP LTE Turbo Decoder, in preparation to submit to IEEE Transactions on VLSI Systems.

M. Wu, Y. Sun, G. Wang, and J. R. Cavallaro, Implementation of a High Throughput

3GPP Turbo Decoder on GPU, Journal of Signal Processing Systems (JSPS), 2011.

G. Wang, Z. Zhu, K. Zhang, Z. Wang, A Novel Design Of the High Speed Buffer and Video/audio Synchronization in High Resolution Digital Cinema System, High Technology Letters (In Chinese), Vol.9, 2008.

[Conference Papers]

G. Wang, Y. Xiong, J. Yun, and J. R. Cavallaro, Accelerating Computer Vision Algorithms Using OpenCL Framework on the Mobile GPU - A Case Study, Submitted to IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2013.

B. Rister, G. Wang, M. Wu and J. R. Cavallaro, A Fast and Efficient SIFT Detector using the Mobile GPU, Submitted to IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2013.

G. Wang, A. Vosoughi, H. Shen, J. R. Cavallaro, and Y. Guo, Parallel Interleaver Architecture with New Scheduling Scheme for High Throughput Configurable Turbo Decoder, To appear at IEEE International Symposium on Circuits and Systems (ISCAS), 2013.

G. Wang, H. Shen, B. Yin, Y. Sun and J. R. Cavallaro, High Performance Efficient Parallel Nonbinary LDPC Decoding on GPU, 46th Asilomar Conference on Signals, Systems, and Computers (ASILOMAR), 2012.

B. Yin, M. Wu, G. Wang, and J. R. Cavallaro, Low Complexity Opportunistic Decoder for Network Coding, 46th Asilomar Conference on Signals, Systems, and Computers (ASILOMAR), 2012.

G. Wang, M. Wu, Y. Sun and J. R. Cavallaro, GPGPU Accelerated Scalable Parallel

Decoding of LDPC Codes, 45th Asilomar Conference on Signals, Systems, and Computers

(ASILOMAR), 2011.

G. Wang, Y. Sun, J. R. Cavallaro and Y. Guo, High-Throughput Contention-Free Concurrent Interleaver Architecture for Multi-Standard Turbo Decoder, IEEE International Conference on Application-specific Systems, Architectures and Processors (ASAP), 2011.

G. Wang, M. Wu, Y. Sun, J. R. Cavallaro, A Massively Parallel Implementation of QC-LDPC Decoder on GPU, IEEE Symposium on Application Specific Processors (SASP), 2011.

Y. Sun, G. Wang and J. R. Cavallaro, Multi-Layer Parallel Decoding Algorithm and VLSI

Architecture for Quasi-Cyclic LDPC Codes, IEEE International Symposium on Circuits and

Systems (ISCAS), 2011.

G. Wang, B. Yin, K. Amiri, Y. Sun, M. Wu and J. R. Cavallaro, FPGA Prototyping of

A High Data Rate LTE Uplink Baseband Receiver, 43rd Asilomar Conference on Signals,

Systems and Computers (ASILOMAR), 2009.

[Patents]

G. Wang, A. Vosoughi, H. Shen, J. R. Cavallaro, and Y. Guo, System and method for parallel interleaver for high data rate turbo decoder. U.S. Patent Application. Filed in July, 2012.

G. Wang, Y. Sun, J. R. Cavallaro, and Y. Guo, System and Method for Contention-Free Memory Access in an Interleaver. U.S. Patent Application. Filed in Nov, 2010.

A. Vosoughi, G. Wang, H. Shen, J. R. Cavallaro, and Y. Guo, Scalable interleaved address generation for UMTS/HSPA+ turbo decoder. U.S. Patent Application. Filed in July, 2012.

G. Wang, Z. Wang, Z. Wei, Z. Zhu, The Method, System and Device to Implementing Video/audio Synchronization, filed in August, 2007; China Patent No.200710120585.0.

G. Wang, Z. Wei, Z. Wang, A Fast and High Performance Method for Multimedia Video Zooming, filed in November, 2007; China Patent No.200710178188.9.

Z. Wei, G. Wang, Z. Wang, A method of Watermark Generation and Detection for digital cinema Copyright Protection, filed in April, 2008; China Patent ZL200810103472.4.

Z. Zhu, Z. Wang, X. Wang, Z. Wei, G. Wang, A copyright protection method and system for audio and video contents in digital cinema, filed in June, 2008; China Patent ZL200810114749.3.

Course Work

Computer Architecture, Parallel Computing, Operating Systems

Stochastic Process, Numerical Analysis, Information Theory

Advanced VLSI Design, VLSI System Test, Arch. of Wireless Comm.

Comm. Theory & Systems, Error Correcting Codes, Communication Network

Digital System Design, Digital Image Processing, Computer Vision

Teaching Experience

Course Lab Instructor, ECE Department, Rice University

Teach lab sessions, prepare and conduct weekly three-hour lectures reviewing the week's course material and explaining lab project materials, grade homework and projects.

ELEC 220: Fundamentals of Computer Engineering (Spring 2009, 2010, 2011, 2012)

Teaching Assistant, ECE Department, Rice University

ELEC 303: Random Signals (Fall 2009)

ELEC 522: Advanced VLSI Design (Fall 2010)

Professional Services

Paper reviewer

IEEE Transactions on Circuits and Systems for Video Technology (TCSVT), 2012

IEEE Communications Letters, 2012

Journal of Computer Science and Technology (JCST) 2012

Springer Frontiers of Computer Science Journal (FCSJ), 2012

IEEE Computer Architecture Letters (CAL), 2012

EURASIP Journal on Wireless Communications and Networking 2011

IEEE International Symposium on Circuits and Systems (ISCAS) 2011, 2012, 2013

European Signal Processing Conference (EUSIPCO) 2011

IEEE International Conference on Communications (ICC) 2011, 2013

Great Lakes Symposium on VLSI (GLSVLSI) 2011

IEEE Workshop on Signal Processing Systems (SiPS) 2012

International Symposium on Information Theory and its Applications (ISITA) 2010

IEEE International Conference on Application-specific Systems, Architectures and Processors (ASAP) 2009, 2012

Activities

Committee member of Rice Center for Engineering Leadership 2009 - 2012

Graduate Student Mentor Program in ECE Depart., Rice Univ. 2009 - 2012

References

Available upon request.
