**** Toledo Ave, Santa Clara, CA 95051
Email: ****.*******@*****.***
Home: 408-***-****
Mobile: 408-***-****
PROFILE
Self-motivated professional with in-depth experience in embedded kernel
debugging, interrupt and trap processing, virtual memory management,
multiprocessor threading, hardware interfacing, networking, driver
debugging, graphics, flash memory drivers, JTAG debugging, uBoot, and
Build systems (AOSP, BuildRoot,
Worked primarily with major corporations and a few start-up companies in
both employee and contractor roles. This included stints with Sun
Microsystems, MIPS, Hewlett Packard, Silicon Graphics, Tensilica, VMWare,
and Auspex.
SOFTWARE EXPERIENCE
Kernel Hardware
Sun, Hewlett Packard, Auspex, Stratus, SGI/MIPS, Siemens, AMD-64, Intel
(ia32 & ia64), Xtensa, ARM.
Extensive work on Embedded Linux, Android, uBoot, GIT, Gerrit, JTAG, NAND
Flash, LCD, WiFi.
OS/Kernel Software
Linux (X86, MIPS, ia64, SPARC, Xtensa), Android, FreeBSD, NetBSD, Solaris,
SunOS, HP-UX, IRIX.
Kernel Debugging, VM, Networking, Context switching, File Systems, Drivers
(Disk, SCSI, TTY, Mem), Interrupt and Trap processing, memory
initialization, architecture development, cache aliasing, JTAG/OCD.
Extensive experience with Linux Networking and Embedded/Android kernels
at UNM, Tensilica, and MIPS/Imagination. Android, uBoot, Buildroot, Linux-
From-Scratch (LFS), and OpenWrt builds.
C Compilers
Building GNU GCC and LLVM Compilers; studied Stabs and DWARF formats.
AT&T (Johnson and Ritchie) and Honeywell Compiler Development. Compiling for
ARM and executing on MIPS via libportable.
Kernel and System Debugging
GNU GDB (kgdb, Crash, DDD), SGI LKCD, Sun (kdbx, kgdb), HPUX (KWDB, DDE,
LanDDB), RedHat Crash Utility (Network Dump), Linux KGDB Stub, JTAG-OCD
based gdb (xocd, openocd), Android adb.
Network Storage
RAID, IP SANs {ISCSI}, NAS {Network Lock Management, Distributed Multi-cast
NFS}
Graphics
OpenGL, MPEG-2, X11 Server, Touch Screen Mouse, Open-Windows, Sunview
Raster-op, PowerVR.
ARPA Internet Protocols
TCP, IP, UDP, ICMP, GGP, EGP, RDP, HMP, SLIP, PPP, Corba R3P, NFS, RPC,
XDR, NFS Lock Manager, Port Mapper, PPPOE.
Programming Languages
C, C++, Java, PostScript, Perl, SPARC, AMD X86_64, Intel IA32, IA64,
MIPS, Xtensa, and ARM Assembler, Corba IDL, Verilog, HDL, TIE.
Development Tools
Gerrit, rpm, bugzilla, dbx, kdbx, gdb, kgdb, ddd, adb, kadb, kdb, ssh,
repo, git, qgit, cvs, cvssup, cscope, modinfo, mkinitrd,
Ethereal/WireShark, POSIX Threads, tcpdump, netdump, xocd, openocd, Android
adb.
Interests
Embedded Linux/Android Development, OpenOCD, ARM/OMAP/Panda-Board, Ubuntu,
Ubiquiti Routers, Maintaining Xtensa Linux, Quantum Mechanics,
Biochemistry, Optics, and financial market mathematics.
WORK EXPERIENCE 1994-2015
Self Employed Nov 2014 - Feb 2015: Finished Living room and Kitchen
Remodel
Silver Peak (MIPS) June-2014 - Nov 2014:
Developed and debugged network acceleration appliances with the goal of
supporting them under AWS and OpenStack and running within a newer Linux
kernel and Fedora root.
Worked on using hardware accelerated Ethernet I/O (SR-IOV) on our Amazon AWS
based WAN Optimization appliance. I documented the detailed steps for
building our appliance on Open Xen, created a pair of appliance instances
on AWS in California and Virginia, and ran transcontinental heavy load
testing.
We had a problem with the SR-IOV driver not allowing MAC-address-spoofing
use by our appliance without a kernel upgrade at AWS and perhaps for the
appliance as well.
Re-integrated our SVN based kernel source back into Linus's GIT repo and
used it to debug kernel problems with kgdb via UART and the USB debug port.
Studied the zero copy (ZC) kernel driver and looked for ways to prevent
and detect corruption of kernel data structures that were whacked by bugs
in the appliance during development. The application had a set of FIFOs
that were mapped to the kernel ZC driver to share skbuff data, similar to
the ZC driver we used at Bluelane.
We had a problem where a few appliances were locking up during early boot,
so I set up scripts to build the appliance with KGDB and KDB enabled for
local development at Silver Peak, in an attempt to diagnose the hung
appliances if the hang was caused by software. Hoped to use the new
USB debug port, which starts very early in the boot process, but was having
a problem resuming the process.
I was going to use this early KGDB serial environment to fix the kernel so
that the kernel crash dump facility could be used on Xen hypervisors. KDUMP
isn't currently supported by Xen and Microsoft's hypervisors.
I proposed integrating all of the kernel source code back together as a GIT
branch, re-basing the code to a current kernel release, and using Gerrit
and Jenkins for code reviews and testing.
I was following the Open-Stack Gerrit list for a few SR-IOV related
changes. I had prepared my workstation to try out the Juno Open-Stack
release, try out the SR-IOV facility, and promote changes to our appliance
to make it more useful in an Open-Stack environment.
Imagination Technologies (MIPS) June-2012 - June 2014:
Worked in the MIPS Android development and support group. We maintained
core MIPS features in Android, the QEMU simulator, Goldfish kernel, and
supported MIPS customers on their Android platforms.
Maintained Android on MIPS Processors:
Ported Android NDK Based Mozilla to MIPS - Mozilla linker updated to update
GOT table for DLLs.
Ported Linux Test Project (LTP) to Android - Used to verify ARM to MIPS
portability library.
Ported BusyBox to Android and pushed upstream - used for debugging and
needed by LTP.
Developed Android LibPortable Mechanism - Achieved 100% LTP compliance.
https://android.googlesource.com/platform/development/+log/master/ndk/sources/android/libportable
Converted early Bionic crt startup code from asm to C code.
Helped fix a QEMU bug involving self modifying code. It turned out that
the fix was already in the mainstream QEMU code but hadn't been ported to
Android's QEMU.
Ran Android CTS test suites to locate and fix MIPS issues:
Problems with displaying pop-ups due to tablet screen size.
Problems in framework with media codecs.
Modifying CTS scripts to bring out a Render-script SMP cache problem.
Worked with the team on the Android-4.4 PDK.
Worked on WIFI wpa_supplicant daemon config changes required in Android-4.4
PDK.
Started investigation into Dalvik Java Debug Wire Protocol problem with JDB
debugger.
Breakpoints are not being temporarily removed while continuing/stepping
past them.
Maintained and Developed Android on Ingenic XB4780 Tablets and Development
Boards:
Pushed Changes to Kernel, Device, Hardware, Codec, and Xboot Git
Repositories - Worked with Ingenic Engineers to root cause stability,
functionality, and CTS compliance problems.
Worked with EJTAG Proxy developer to debug Ingenic kernel and Xboot bugs
with gdb.
Prototyped JTAG and extended debugging support on the Ingenic devices from
circuit diagrams.
Fixed a few kernel and xboot bugs using gdb via JTAG.
Re-based kernel between a few git repo's with different roots; integrated
bug fixes and PVR Updates.
Added memory based logging to X-boot to fix USB timing problems with UART
based logging.
Extracted log with gdb via jtag.
Greatly extended USB logging details and added automatic indentation at
function entry and exit.
Upgraded PowerVR kernel driver and library from 1.9 > 1.12; Needed for
Android-4.4 PDK.
Back-ported and tested changes from master to earlier releases (Android-
4.3, 4.2).
Worked on compiling Ingenic Codec libraries from source code and
integration into build.
Worked with Google NDK Maintainer on GCC compiler update to support Ingenic
DSP instructions.
Worked on the root cause of a problem with screen rotations. Turned out to
be that multiply-add (MADD) instructions are not working correctly in all
instruction sequences. Fixed build flags to drop madd instructions.
Upgraded Imagination Creator CI20 (URBOARD) from Android-4.1 to
Android-4.4.
See website for details:
http://community.imgtec.com/developers/creator-ci20/
http://elinux.org/CI20_Distros#Android
Updated Boards from a 4-Gbyte NAND chip to an 8-Gbyte chip: We were having
flash space problems with CTS and the new ART runtime replacing Dalvik on
the Ingenic tablet. With most set-top boxes coming to the market
providing 8-Gbyte flash memories, it appeared desirable to upgrade the
production CI20/URBOARD to 8 Gbytes. I found a hardware equivalent chip
and updated the kernel, boot-loader (xboot), and build files to enable
using the 8-Gbyte chip. In a half dozen places integers had to be changed
to long long integers. I needed to support the older boards, at least for a
while, and extended the build environment to support existing combinations
of UART ports, NAND Manager Versions, NAND chip sizes, and partition tables
easily. The 8-Gbyte choice made my CI20/URBOARD Over The Air (OTA) update
implementation much easier, and left flash space for future
Android releases and user data.
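The integer-to-long-long change above can be illustrated with a minimal
sketch. This is not the original kernel or xboot code; the page size and
function names are hypothetical, chosen only to show why a 32-bit byte
offset stops working once a flash offset reaches 4 Gbytes.

```c
#include <stdint.h>

/* Hypothetical page size; the source does not give the chip geometry. */
#define NAND_PAGE_SIZE 8192u

/* Byte offset of a page computed in 32 bits: the product wraps once it
 * reaches 4 Gbytes, so high pages alias low ones. */
uint32_t page_offset_32(uint32_t page)
{
    return page * NAND_PAGE_SIZE;
}

/* Same computation, widened to long long before multiplying. */
unsigned long long page_offset_64(uint32_t page)
{
    return (unsigned long long)page * NAND_PAGE_SIZE;
}
```

With these assumed numbers, page 524288 sits exactly at the 4-Gbyte
boundary: the 32-bit version returns 0 (aliasing page 0), while the
widened version returns the correct offset.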
Worked with Serge Vakulenko on EJTAGproxy workarounds for Ingenic JTAG:
http://code.google.com/p/ejtagproxy/
Got an EJTAG header added to the production board - works great, even with
DDD.
UART port 3 conflicted with EJTAG - switched board and code to use port
4.
Updated UR-BOARD Android Build Environment to address various issues:
Fixed HDMI, Audio, Ethernet, and WiFi problems;
Ex: Upgraded to newer Broadcom WiFi chip.
Codec problems with H.264 - Ingenic had mixed xb4775 PADDR changes into
the xb4780 code. The URBOARD xb4780 CPU has a DMA controller with an MMU and
thus doesn't need to use physical addresses while passing around surface
flinger data.
Tensilica Corporation Aug-2009 - Apr 2012:
Maintained and Developed Tensilica's Xtensa Linux Kernel and Development
Tools:
(See: http://www.linux-xtensa.org, SMP code being pushed upstream by Max,
Oct 2013)
Implemented SMP cache coherency and stabilized SMP code - Studied the ARM
and SPARC MMU kernel code and added cache and TLB flushing code to maintain
memory coherency in a normal environment with cache aliases. When I started,
Tensilica had just received a huge contract to produce an SMP version of
the LX Xtensa processors, the MX. The 2.6.24 kernel didn't boot
with cache aliases at the time. Within six months I had the 2.6.29 kernel
running rock solid, with heavy audio, memory, and LTP stress tests
experiencing less than one percent errors for almost a month.
Modified the kernel to be easily debugged, including compiling it
easily without any optimization (-O0), and made the back-trace totally
accurate in the presentation of local and formal variables on the stack
across multiple exceptions. This made it easy to clearly see the soft IRQ
layer as well as the interrupt context. For example, this increased
visibility made the need for an additional spin-lock in the Ethernet
driver obvious.
Unfortunately the 2010 recession caused the customer to cancel its
interest in the SMP Linux project. At that time the RTL verification team
hadn't finished verifying the cache coherency hardware enhancements. The MX
release was canceled, and Tensilica's support for the Xtensa architecture
was dropped by Sales, though it continued to be sold.
Created an Xtensa Linux Development Environment - To support the
development of Codecs for the Tensilica HiFi-2 Processors I updated the
Kernel to support the Lazy saving of the Co-Processor Registers and dealing
with process migration in an SMP environment. I also updated U-Boot to the
latest mainline code, set it up for Tensilica's largest FPGA board
(LX200) and clearly documented the procedure for setting it up as well as
the Development environment.
http://wiki.linux-xtensa.org/index.php/SMP_HiFi_2_Development_Board
http://wiki.linux-xtensa.org/index.php/Setting_up_U-Boot
This environment provided Codec developers the normal set of tools
needed for software development, including the GCC compiler, gdb, make,
autoconf, vim, git, ssh, strace, and binutils tools (nm, ld,
Enhanced the VM code to Support the Kernel HIGHMEM mechanism - Modified
X86 and ARM HIGHMEM code to support the Xtensa architecture. Initialized
first level page tables to reserve a region of virtual memory for
dynamically mapping the kernel memory above 128 Mbytes that's being used by
a process. Modified the page structures to make viewing the page tables
easy and set up PTEs during early memory initialization to be used for
mapping memory past the 128 Mbytes accessible via the V2 MMU TLB entries.
Also helped Chris Zankel with his Extended Memory implementation and
worked with the Lintronics folks on their auto-loading PTE implementation of
extended memory. All three extended memory mechanisms involve significant
changes in early boot memory allocation, page miss handling in the
exception handler, and tricky code dealing with cache aliases.
Added JTAG Verification of Tensilica's Debugger (XOCD) - Added code to
verify JTAG operations so that Systems on a Chip (SoC) with memory access
congestion could be debugged. This, for example, was needed on a very
important customer's SoC which had a very large number of processors. The
problem was that a JTAG operation might not complete if there was serious
congestion on the memory bus (PIF). The simple solution to this overrun
problem was reading the JTAG status register after every memory/JTAG
operation to verify that the operation had completed successfully.
Unfortunately this verification took a very long time and made download
times unbearably long when enabled. I subsequently fixed this problem with
a new probe via Asynchronous JTAG operations. I also eased maintenance
of XOCD by adding internal function tracing to make the internal structure
and operation of XOCD easy to observe and understand.
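The verify-after-every-operation idea can be sketched roughly as follows.
This is an illustration only, not XOCD code: the probe is a simulated
stand-in, and every name here (fake_probe, verified_write, the retry
limit) is an assumption made for the example.

```c
#include <stdbool.h>
#include <stdint.h>

#define FAKE_MEM_WORDS 16u

/* Hypothetical stand-in for a JTAG probe: an operation may be dropped
 * while the memory bus is congested, and a status read reports whether
 * the last operation completed. */
struct fake_probe {
    int busy_ops;                  /* upcoming operations that will overrun */
    bool last_op_done;             /* what the status register would report */
    uint32_t mem[FAKE_MEM_WORDS];  /* tiny simulated target memory */
    unsigned status_reads;         /* number of status reads issued */
};

/* Issue one memory write; it silently fails while the bus is congested. */
static void probe_write(struct fake_probe *p, unsigned idx, uint32_t val)
{
    if (p->busy_ops > 0) {
        p->busy_ops--;
        p->last_op_done = false;
    } else {
        p->mem[idx] = val;
        p->last_op_done = true;
    }
}

/* Read the simulated status register. */
static bool probe_status_ok(struct fake_probe *p)
{
    p->status_reads++;
    return p->last_op_done;
}

/* Verified write: check status after every operation, retry on overrun. */
bool verified_write(struct fake_probe *p, unsigned idx, uint32_t val,
                    int max_retries)
{
    if (idx >= FAKE_MEM_WORDS)
        return false;
    for (int attempt = 0; attempt <= max_retries; attempt++) {
        probe_write(p, idx, val);
        if (probe_status_ok(p))
            return true;
    }
    return false;
}
```

Each write costs at least one extra status round trip, which is
consistent with the long download times described above for the
synchronous approach.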
Participated in Xtensa Debug Module (XDM) Specification and a Bit of ARM
Development - our group upgraded the JTAG hardware for Xtensa to work
better with ARM cores. The new debug module will provide an Access Port
that an ARM DAP can access and will allow the Tensilica Trace module (TRAX)
output to be written to the current Trace RAM or through the ARM Trace
Funnel to the ATB bus. Helped resolve some minor issues with the TRAC
interface and communication bugs. Discussed heterogeneous debugging issues
like stopping and starting various cores synchronously, as well as the
possibility of using a clock signal from the XDM for hardware JTAG flow
control to prevent overruns.
Developed a FTDI FT2232 Driver for XOCD and ML605 Board - Initially
researched and then proposed the addition of a FT2232H chip to the
Tensilica ML605 FPGA Development Board. I subsequently developed a
multi-probe, multi-target controller for this popular and cost effective
FT2232 family of probes supported by OpenOCD. The OpenOCD implementation is
Asynchronous, providing much greater performance when reading status and/or
data back from the JTAG TAP. I implemented an Asynchronous (ASYNC),
two-pass approach with minimal changes to the existing Synchronous design.
The new family of probes decreases the Linux Kernel download time from
three minutes, with the MacGregor and Byte-tools probes, to 25 seconds with
the Flyswatter2 FT2232H probes, which cost five times less. With the ASYNC
mechanism memory verification was only twice as slow as without
verification, which is about ten times better than the non-ASYNC MacGregor
and Byte-tools probes. Kernel upload time was decreased by 70X with ASYNC
and display of typical kernel data structures was about 3X faster. I made
the FT2232 driver rock solid for a variety of Windows and Linux development
environments, with a BitRock installer making installation a piece of cake.
I'm currently using this probe on Xtensa and ARM Kernel Development Boards.
Added Xtensa V3 MMU Support to the Linux Kernel and U-Boot - Tensilica's
latest MMU has configurable TLB entries that initially come up mapped
Virtual == Physical. I brought up support in U-Boot and the Linux kernel
to re-map the Linux kernel during very early startup code. This approach
makes it easy to debug the kernel when started directly with GDB or via U-
Boot. I modified the kernel linker scripts so that GDB could step through
the code while it was re-mapping itself and keep GDB in sync with the
source code. This made it easy for our customers to understand and modify
the code for their particular needs.
Developed and Maintained the Kernel and Build-Root - Used the environment
I set up for the HiFi-2 (Diamond Core 330) Codec Development to root cause
problems in the kernel and uClibc/Pthreads library. Similar to Ubuntu
running on Beagle and Panda Boards for ARM, this comfortable environment
made root causing problems relatively easy. Good tools, like having a clear
back trace of the uClibc library in applications with gdb, strace, and
oProfile, are very valuable in development, as is having standard build
tools like wget, tar, vim, Automake and Autoconf.
Upgraded Audio driver to work on new Avnet LX110 board - Modified audio
driver to deal with new I2C, DAC, and interrupt processing as well as
improving performance and ease of debugging.
Added support for cross compiling the Linux Kernel with Tensilica's FLIX
Compiler (XCC) - The Kernel has a feature where parts of the kernel are
compiled with different options for different versions of the Intel C
compiler. I used an identical mechanism to extract the version of XCC and
then added workarounds for a dozen or so deficiencies I found in the
compiler. We fixed a few of these deficiencies in each release, and the
workarounds are automatically disabled in future releases. The
complete Kernel can now be safely compiled from no optimization (-O0) to
maximum optimization (-O3) with both XCC and GCC with simple selections
via the kernel menu configuration. This included dynamically changing the
stack size to compensate for larger stacks encountered when many additional
stack frames are added for static in-line functions that are no longer
compiled in-line for -O0 kernels. These kernels make it much easier for
Tensilica's customers to diagnose problems. Linker relaxation of long vs
short branches is now also handled automatically in the kernel makefiles. I
also worked with Marc to keep the literals close to associated text
sections; see comment in arch/xtensa/kernel/Makefile for details.
Provided consulting support to Tensilica customers via contract and the
linux-xtensa.org mailing list.
Implemented an MP implementation of AES using Tensilica's CPU Design
Language, TIE.
Bluelane Technology (Now part of VMWare) July 2005 - Aug 2009 (Employee):
Helped develop and maintain a TCP proxy that's used as the foundation for
a network in-line patch proxy. Developed a SSL proxy that uses the Cavium
encryption accelerator with their Turbo OpenSSL code. Modified the Linux
Kernel to boot an encrypted root using keys based on hardware information
(Ex: mother board serial numbers). Modified tracing code to be much more
detailed, indented, and including the stock TCP code.
Merged our 2.4.12 kernel up to 2.6.16 and updated system packages to match
the LFS/BLFS (Linux From Scratch) versions suggested for the 2.6.16 kernel.
Resolved kernel bugs in both our Hardware and VMWare based products.
Migrated our LFS repository from CVS to GIT and set it up to build kernels
with the Source Forge KGDB patch configured, as well as configuring the
Linux standard KEXEC/DUMP to generate core dumps.
Set up Dave Anderson's Crash code to generate a detailed crash analysis
that can be sent back to Bluelane for analysis. Used KGDB+DDD as well as
the network tracing to fix many bugs in our code. Some were due to a new
back pressure algorithm that I developed to handle noise in our proxy when
filtering data from a customer's server back to a client.
One interesting bug was due to earlier developers not trimming off the
receive queue when it gets filled and there are holes in the received
sequence number space. Another was the lack of accurate time stamps to
allow the congestion avoidance algorithms to be fully utilized requiring
workarounds with the congestion window.
A bug in the VMware tools Ethernet driver and the ESX server was
particularly fun. ESX was writing junk into a new kernel when the symbols
in the old kernel had changed their physical address; ESX
apparently didn't notice that we had written into the text segment during
the reboot and didn't purge its emulation cache. I root caused ESX's bug
by check summing the text segment, making it read only, and proving that
our code wasn't whacking the kernel text.
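The check-summing step can be sketched as below. This is a user-space
illustration only; the source does not say which checksum was used, so
the rotate-and-add sum and both function names are assumptions.

```c
#include <stddef.h>
#include <stdint.h>

/* Rotate-and-add checksum over a memory region (hypothetical choice of
 * algorithm; any checksum sensitive to single-byte changes would do). */
uint32_t region_checksum(const unsigned char *start, size_t len)
{
    uint32_t sum = 0;
    for (size_t i = 0; i < len; i++) {
        sum = (sum << 1) | (sum >> 31);   /* rotate left by one bit */
        sum += start[i];
    }
    return sum;
}

/* Returns 1 if the region no longer matches the checksum recorded earlier. */
int region_was_modified(const unsigned char *start, size_t len,
                        uint32_t recorded)
{
    return region_checksum(start, len) != recorded;
}
```

Recording the checksum once the text is final and re-checking it later
narrows down when the region changed; making the region read only then
rules out ordinary stores from our own code as the cause.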
Universal Network Machines (UNM) July 2004 - May 2005 (Consultant):
Modified TCP stack to do asynchronous DMA. The UNM kernel lacked a
conventional scheduler where a process can sleep. The existing
tcp_sendmsg code was spinning in a loop waiting for DMA to complete. I
split tcp_sendmsg into two parts. The first part set up a DMA transfer to
socket buffers (skbuffs) on a new DMA write queue, preserving push(PSH)
marks. I modified the DMA handler to send a hardware msg to the thread
maintaining the socket and then execute the second half of tcp_sendmsg,
moving the data to the write queue and starting the forwarding of the data
down the rest of the IP stack. I modified the DMA hardware interface from
initiating a single DMA transfer to following a chain of DMA transfers that
I placed in the unused skb_shared_info frags array. Since the skbuff
structure was already being flushed this saved on the cache flushes needed
to pass the DMA vectors from the current CPU to the processor scheduling
transfers with two DMA hardware engines.
Made extensive modifications to the embedded kernel's debugging code,
unifying the syntax and indenting traces of TCP, IP, routing, and
(network and host) interface code. The style was similar to an example I
presented at a USENIX conference in Europe on an implementation of OSI
IP and OSI Transport (TP4) that I developed. In both cases the trace syntax
was 'C' like to make it easy to read and used the classic SunOS NFS/RPC/XDR
ten trace levels. I used a gcc compiler profile hook to maintain
stack depth information because the compiler didn't support frame pointers,
making stack back traces difficult (even GDB back traces were often
truncated). This enhanced trace covered all of the Linux kernel network
code that we were using (NetLink, TCP, IP, ICMP, Net-Filter, Routing, and
Network Interface) as well as UNM's Host and Embedded RPC code. Example
traces available upon request.
Modified the use of the socket reference count (refcnt) from a simple count
to a bit field. The 2.4 Linux refcnt code had previously been hacked from
a straightforward concept to something that no one in our group could any
longer explain. Since the UNM processors were not preemptive and CPU
performance was important, I dropped most of the socket holds and puts
and used bit fields to identify the holder of a reference in the few
remaining places where permanent references were being made or dropped. All
but one type of the new references required only a single bit. I modified
our 2.4 Linux {TCP, IP, PPPOE, and NetLink} Kernel code to use this new
paradigm, which fixed problems that we were experiencing in stress tests
that involved large numbers of sockets sleeping in TIME-WAIT.
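A minimal sketch of the one-bit-per-holder idea follows. It is not the
UNM code: the holder names and the struct are hypothetical, and the one
reference type that needed more than a single bit is not modeled.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical holder identities; each long-lived reference owns one bit. */
enum sock_ref {
    SOCK_REF_HASH  = 1u << 0,  /* socket is in a lookup table */
    SOCK_REF_TIMER = 1u << 1,  /* a timer (e.g. TIME-WAIT) points at it */
    SOCK_REF_USER  = 1u << 2   /* owned by the application side */
};

struct sock_sketch {
    uint32_t refs;   /* bit set means that holder has a reference */
};

/* Take a reference; returns false if this holder already had one,
 * which a plain counter could not detect. */
bool sock_hold(struct sock_sketch *sk, enum sock_ref who)
{
    if (sk->refs & (uint32_t)who)
        return false;
    sk->refs |= (uint32_t)who;
    return true;
}

/* Drop a reference; returns true when no holder remains and the socket
 * can be freed. */
bool sock_put(struct sock_sketch *sk, enum sock_ref who)
{
    sk->refs &= ~(uint32_t)who;
    return sk->refs == 0;
}
```

The sketch uses plain loads and stores with no atomics, which is only
reasonable under the condition stated above: the processors were not
preemptive. A leaked reference also becomes self-describing, since the
remaining bits name the holder that never let go.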
Redesigned host interface code to be stable for long periods of time and
under all conditions. For example, I added a CANCEL msg to handle host side
interrupts and module unloading. I modified the code from spinning to using
wakeups and added debug tracing. Fixed a few race conditions and increased
reliability for our offloaded SSL from 20 minutes to 48 hours and from 20
TPS to 800 TPS.
Rewrote the loop back driver to send packets directly to target sockets.
The original code tried to make use of the ASIC features that didn't work
and mistakenly shared skbuffs between processors.
Added debug code to detect skbuffs being used by other processors and fixed
a case in the neighbor code.
Worked with UNM's kgdb developers on getting UNM kgdb working well with the
UNM processors. For example adding SMP support, GUI support with DDD, and
support of multiple breakpoints. Disabled the use of in-line functions in
debug kernels to facilitate debugging.
Worked with UNM's C compiler developers on enhancing gdb functionality.
Used George Anzinger and Amit Kale's kgdb patches on the host Linux kernel
for development of UNM's driver as well as fixing related bugs. Used
Amit's link time shell script for dynamic kgdb module support.
Fixed TCP protocol problems with the ANVL test suite utility with the
tracing code that I had developed. UNM's ASIC has a half dozen processors
on it for enhanced processing power; this necessitated it not supporting
hardware cache coherency between the processors. Since all of the stock
Linux networking code assumes cache coherency we had to modify the code to
support this paradigm. The sockets were bound to a processor whereas
routing, ICMP, and ARP code needed extensive changes. I worked on
stabilizing this code, including an enhancement made by one of our earlier
developers that sent multi-cast IP packets to each processor for ARP, ICMP,
and routing support.
StoneFly Networks July 2003 - March 2004 (Employee):
Worked with George Anzinger on back porting his 2.6 kgdb patch back to
2.4.18.
Fixed bugs and wrote documentation on how to easily set up kgdb in the
StoneFly environment.
Modified environment to only compile modules used and to quickly link them
into the kernel for easy debugging with gdb via ddd.
Profiled the kernel with SGI's gprof, resolved problems with kgdb, and
studied the kernel hot spots while running our ISCSI server.
In addition, developed a patch for the Linux kernel TCP stack to support
faster I/O by avoiding the copying of data between the kernel and
applications. Initially working with the patch implemented by Thomas
Ogrisegg:
http://lwn.net/Articles/18906/
Studied the new 2.6 kernel sendfile code and the FreeBSD Zero Copy Sockets
implementation:
http://kerneltrap.org/node/294
http://people.freebsd.org/~ken/zero_copy/
and the Linux VM copy on write (COW) code.
Worked with our architects and team leader to implement a zero copy
implementation which used tokens to coordinate the sharing of pages between
the kernel and user applications. I used cmsg(3) structures with the
sendmsg(2) and recvmsg(2) system calls to associate tokens with single or
multiple segments referenced by the msghdr scatter/gather iov array during
existing I/O system calls.
Modified tcp_sendmsg to map in the user's pages and then pass each segment
and its associated token down to do_tcp_sendpages, and modified
tcp_recvmsg to return multiple tokens while receiving data or, if
configured, while waiting for data. I kept tokens on one of three queues
that I added to the socket. While the segments associated with a token
were being sent, the token was kept on an in-use-queue. Once a token's
page reference count dropped to zero the token was moved to a return_queue,
and once the tokens were returned to the user the memory used for the token
was put on a recycle-queue for rapid reallocation on future I/O on the
socket.
Maintained a reference pointer to the tokens in socket buffer frag
structures and bumped the reference with the existing get_page and
put_page function calls used by the networking stack. I released the
tokens when the socket was closed and made the tokens visible to the
application as select(2) exceptions. I put zero copy configuration under
sysctl(8) control and made the zero copy token queues and configuration
visible to system administrators via the /proc interface.
Note: By recycling tokens it is possible for incoming data to be delivered
to the token pages that had been used on completed output and thus easily
delivered to an application using the zero copy enabled socket.
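The token life cycle described above (in-use, return, recycle) can be
sketched as a small state machine. This is an illustration under
assumptions, not the StoneFly patch: a state field stands in for the
three per-socket queues, and all names are hypothetical.

```c
/* Which of the three queues a token would be on. */
enum token_state { TOKEN_IN_USE, TOKEN_RETURN, TOKEN_RECYCLE };

struct zc_token {
    int page_refs;            /* page references still held by the stack */
    enum token_state state;   /* stands in for queue membership */
};

/* Start sending the segments tied to a token. */
void token_start_send(struct zc_token *t, int pages)
{
    t->page_refs = pages;
    t->state = TOKEN_IN_USE;
}

/* Called when the stack drops one page reference; the last drop moves
 * the token to the return queue. */
void token_put_page(struct zc_token *t)
{
    if (t->state == TOKEN_IN_USE && t->page_refs > 0 && --t->page_refs == 0)
        t->state = TOKEN_RETURN;
}

/* Hand the token back to the application; returns 1 on success, after
 * which the token's memory is available for reuse. */
int token_return_to_user(struct zc_token *t)
{
    if (t->state != TOKEN_RETURN)
        return 0;
    t->state = TOKEN_RECYCLE;
    return 1;
}
```

A token cannot be returned to the application while any page reference
is outstanding, which is the property that keeps the user from reusing
pages the stack may still transmit or retransmit.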
Modified the TCP test program, ttcp, demonstrated a 15 fold increase in
TCP performance over the loop back driver with 64 kilobyte writes, and made
Stonefly's ISCSI server rock solid while using this enhanced system call
interface.
Worked with Linux Kernel RAID and Broadcom drivers, set up lab DNS server
(including reverse maps) and annex like access for kgdb and telnet.
Upgraded kernel with 2.6 PCI hardware ID tables and upgraded to the 2.6
Ethernet driver.
Silicon Graphics November 2001 - July 2002 (Contractor):
Developed Linux Kernel Crash Dump (LKCD) and RedHat crash analysis tools
for ia64 NUMA systems. Crash tools supported the ia64 and ia32
architectures. The Mission Critical crash tool is now part of the RedHat
Advanced Server Project's Network Dump facility and is a front end to gdb.
Tools were developed at:
http://lkcd.sourceforge.net/
Dealt with ia64 stack frame unwinding, NUMA changes (MMU and scheduler),
and saving context in trap and machine check code.
Used the SGI Medusa ia64 simulator, worked on a gdb stub for the kdb kernel
debugger, and was responsible for the LKCD integration, testing and RPM
distributions.
Studied, and used, the Intel ia64 specifications, David Mosberger's ia64
Linux Kernel book, and lots of the 2.4 Linux kernel code. David Mosberger's
book has been particularly helpful in understanding the details of the
Linux ia64 MMU and low memory implementation.
Modified the SGI XSCSI drivers to support a polling interface and modified
the LKCD dump driver