Multi-Paradigm Scheduling for Distributed Real-Time Embedded Computing
Christopher D. Gill and Ron K. Cytron
fcdgill, *******@**.*****.***
Department of Computer Science
Washington University, St. Louis

Douglas C. Schmidt
*******@***.***
Electrical & Computer Engineering
University of California, Irvine
Abstract

Increasingly complex requirements, coupled with tighter economic and organizational constraints, are making it hard to build complex distributed real-time embedded (DRE) systems entirely from scratch. The proportion of DRE systems made up of commercial-off-the-shelf (COTS) hardware and software is therefore increasing significantly. There are relatively few systematic empirical studies, however, that illustrate how suitable COTS-based hardware and software have become for mission-critical DRE systems.

This paper provides the following contributions to the study of real-time quality of service (QoS) assurance and performance in COTS-based DRE systems: (1) it presents evidence that flexible configuration of COTS middleware mechanisms, and the operating system settings they use, allows DRE systems to meet critical QoS requirements over a wider range of load and jitter conditions than statically configured systems, (2) it shows that in addition to making critical QoS assurances, non-critical QoS performance can be improved through flexible support for alternative scheduling strategies, and (3) it presents an empirical study of three canonical scheduling strategies, specifically the conditions that predict success of a strategy for a production-quality DRE avionics mission computing system. Our results show that applying a flexible scheduling framework to COTS hardware, operating systems, and middleware improves real-time QoS assurance and performance for mission-critical DRE systems.

Keywords: Middleware and APIs, Quality of Service Issues, Distributed Real-time and Embedded Systems, Mission Critical Systems, Dynamic Scheduling Algorithms and Analysis.

This work was supported in part by Boeing, DARPA ITO, DARPA contract F33615-00-C-1697 (PCES) and AFRL contracts F3615-97-D-1155/DO (WSOA) and F33645-97-D-1155 (ASTD/ASFD).

I. INTRODUCTION

A. Emerging System Demands

Distributed, real-time, and embedded (DRE) systems are becoming increasingly widespread and important. Examples of DRE systems include telecommunication networks, e.g., wireless phone services, tele-medicine, e.g., remote surgery, manufacturing process automation, e.g., hot rolling mills, and defense systems, e.g., avionics mission computing systems. Although there are many types of DRE systems, they have one thing in common: the right answer delivered too late becomes the wrong answer. More specifically, DRE systems have the following types of requirements:
• As distributed systems, DRE systems require capabilities to manage connections and data transfer between separate computers.
• As real-time systems, DRE systems require predictable and efficient control over end-to-end system resources.
• As embedded systems, DRE systems have weight, cost, and power constraints that limit their computing and memory resources.

Designing DRE systems that implement their required capabilities, are dependable, and are parsimonious in their use of limited computing resources is hard; building them on time and within budget is even harder. A particularly essential task is supporting the quality of service (QoS) demands of mission-critical DRE systems that possess a mix of hard and soft real-time requirements, such as avionics mission computing systems [1], mission-critical distributed audio/video processing [2], [3], and real-time robotic systems [4].

B. Key Challenges: Flexibility and QoS Assurance

DRE systems have historically been custom developed in an ad hoc and inflexible manner. While many operational systems have been built this way, this development process failed to address the following challenges adequately:

Reducing total ownership costs: Custom software development and evolution is labor-intensive and error-prone for complex DRE systems, and can represent a substantial fraction of system lifecycle costs. Moreover, incommensurate lifetimes between long-lived DRE systems ( = 20 years) and COTS platforms and tools (2-5 years) lead to pervasive software obsolescence that multiplies total ownership costs by requiring periodic software redevelopment and COTS refresh.

Portable QoS management: Modern DRE systems must invest an ever-increasing proportion of functionality and QoS management in software. Rapidly emerging technologies and flexibility required for diverse operational contexts force deployment of multiple software versions on various platforms, while simultaneously preserving key QoS properties, such as real-time response and end-to-end priority preservation.

Dependence on rigid assumptions: Custom DRE systems are scheduled inflexibly so that if assumptions about the total resource load are violated, critical real-time constraints may be violated. Unfortunately this leads to provisioning of resources at levels that are both (1) excessive compared to what is needed to assure the minimum critical system requirements and (2) unrecoverable to improve average case performance.

Insufficient responsiveness to varying operating environments: Custom DRE systems make rigid assumptions about system load and load jitter that can, in unexpectedly varying environments, lead to (1) a violation of critical QoS requirements, and/or (2) reduced performance in meeting non-critical QoS requirements. While static scheduling might be replaced with dynamic scheduling in some systems, any single-paradigm approach will naturally suffer these same limitations.

Some aspects of the total ownership cost challenges outlined above are being addressed for business applications by COTS software, such as SOAP/.NET and J2EE. Until recently, however, little has been done to simultaneously meet all of these challenges for mission-critical DRE systems.

C. A Promising Approach: Real-time CORBA Middleware

Over the past several years, a promising solution to many of the challenges outlined above has emerged in the form of distributed object computing (DOC) middleware. DOC middleware is systems software that resides between the applications and the underlying operating systems, network protocol stacks, and hardware [5].
Its primary role is to allow clients to invoke operations on target object implementations without concern for where the object resides, what language the object implementations are written in, the OS/hardware platform, or the types of communication protocols, networks, and buses used to interconnect distributed applications [6].

Real-time CORBA [7] is a DOC middleware standard that adds QoS control capabilities to the original CORBA specification by (1) improving system predictability and bounding priority inversions and (2) managing system resources end-to-end. At the heart of Real-time CORBA is an Object Request Broker (ORB) that provides run-time support to automate many DRE computing tasks, such as connection management, marshaling/demarshaling, demultiplexing, language and OS independence, resource scheduling and load balancing, error handling and fault-tolerance, and security.

First-generation ORBs did not provide features or optimizations to support DRE systems with stringent QoS requirements. To better meet these requirements, researchers at Washington University, St. Louis and the University of California, Irvine have developed a second-generation ORB called TAO [8], which is an open-source implementation of Real-time CORBA that supports efficient, predictable, and flexible DRE computing. Prior work on TAO has explored many dimensions of high-performance and real-time ORB design and performance, including scalable event processing [9], request demultiplexing [10], I/O subsystem [11] and protocol [12] integration, connection architectures [13], asynchronous [14] and synchronous [15] concurrent request processing, adaptive load balancing [16], meta-programming mechanisms [17], and IDL stub/skeleton optimizations [18].

TAO isolates DRE systems from platform-specific QoS enforcement mechanisms by encapsulating a robust QoS framework for managing end-to-end resources within a standard set of CORBA interfaces. TAO also reduces DRE system dependence on rigid assumptions by enabling alternative policies and mechanisms to be plugged into its QoS framework. In fact, the Real-time CORBA 1.0 specification and its implementation in TAO address all the DRE system challenges outlined in Section I-B except for insufficient responsiveness to varying operational environments. The reason for this omission is that no single scheduling paradigm performs best in all environments, which motivates our research in this paper on the design and performance of flexible scheduling frameworks for DRE middleware and applications.

D. An Inclusive Solution: Multi-paradigm Scheduling

This paper extends our previous work on static [8] and dynamic [1] scheduling for Real-time CORBA by incorporating a strategized scheduling framework called Kokyu¹ as a service atop TAO. Kokyu enables the configuration and empirical evaluation of multiple scheduling paradigms, including:
• Static scheduling strategies, e.g., rate monotonic scheduling (RMS) [19],
• Dynamic scheduling strategies, e.g., earliest deadline first (EDF) [19] and minimum laxity first (MLF) [4], and
• Hybrid static/dynamic scheduling strategies, e.g., maximum urgency first (MUF) [4] and RMS+MLF [20].

¹ Kokyu is a Japanese word meaning literally breath, but also implying timing and coordination.

Kokyu is applicable to an important class of demanding real-world DRE systems, which includes avionics mission computing [21], [22], mission-critical distributed audio/video processing [2], [3], and real-time robotic systems [4]. To maintain scheduling assurances and simplify testing for these types of systems, we have enhanced our prior work [1], [8] to focus on DRE systems with the following characteristics:
• Bounded execution time, where the use of resources during each execution of a resource request stays within the limit of its specified duration.
• Bounded rates, where resource requests arrive within a specified period.
• Known operations, where all operations are visible to the scheduler prior to scheduling, or are reflected entirely within the execution times of other specified operations.
• Critical and non-critical operations, where deadlines of all critical operations must be assured, and non-critical deadlines should be met to the extent possible.

Real-time QoS requirements of DRE systems with these characteristics have been addressed historically by scheduling tasks within a single paradigm, such as:
• Static scheduling, which assigns priorities to all tasks statically and ensures the task with the highest fixed priority always runs [19], [23], or
• Dynamic scheduling, which orders all tasks dynamically and ensures the task with the highest dynamic priority is dispatched preferentially [19], [4].

Static scheduling can minimize overhead stemming from, e.g., dispatching and admission control mechanisms, while dynamic scheduling requires less a priori knowledge of operation characteristics, e.g., rates of execution. However, using either of these scheduling paradigms alone imposes the following limitations:
1) It does not isolate critical and non-critical load,
2) It is brittle in the face of total load in excess of the feasible limit, even though critical load is below that limit, and
3) It is thus insufficiently responsive to variations in demands by the application or operating environment.

A hybrid static/dynamic scheduling paradigm used by the MUF [4] and RMS+MLF [20] strategies has been proposed to (1) partition critical and non-critical resource utilization using static mechanisms such as thread priorities, and then (2) dynamically schedule operations within one [20] or more [4] partitions. The hybrid static/dynamic scheduling paradigm can therefore assure feasible critical deadlines will be met, even when total load is infeasible. When the total load is feasible, however, the additional overhead imposed by hybrid static/dynamic scheduling means that fewer non-critical deadlines can be met than in static scheduling.

To alleviate the drawbacks of single-paradigm scheduling while still preserving its key benefits, our work with the Kokyu framework described in this paper allows DRE systems to specify multi-paradigm scheduling strategies that trade a small additional amount of overhead for increased flexibility in (1) assuring critical QoS requirements and (2) enhancing the availability of resources to improve non-critical performance. In particular, we present foundational work towards strategies that can enforce each preferred single-paradigm strategy along the entire range of resource utilization.

Fig. 1. Ideal, Static, Dynamic, and Hybrid Paradigms
[Plot of resource utilization (vertical axis) against non-critical load (horizontal axis), with curves labeled Ideal, Static, Dynamic, Hybrid Static/Dynamic, and Kokyu Multi-Paradigm, and an Overhead annotation.]

Figure 1 illustrates the benefits of the Kokyu multi-paradigm approach. The upper solid curved line shows a hypothetical ideal utilization of resources as system load increases. The solid square line illustrates static single-paradigm strategies, such as RMS, that can approach the ideal under certain conditions, but may miss critical assurances beyond a certain limit, which is illustrated by the utilization value dropping to zero. Similarly, purely dynamic approaches may offer feasibility improvements under special cases, e.g., when rates are non-harmonic, yet the additional overhead they impose may result in missed critical assurances at an even lower level of load. Hybrid static-dynamic approaches, in contrast, offer feasibility along the length of the load axis (as long as the critical load is feasible), and exhibit overhead that is intermediate between purely static and purely dynamic approaches.
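As a rough illustration of the static, dynamic, and hybrid paradigms discussed above, the following Python sketch expresses each strategy as an ordering key over pending operations. It is not Kokyu's implementation: the `Op` record, its fields, the key functions, and the sample operations are hypothetical names and values chosen only for this example, and dispatching overhead, preemption, and tie-breaking are ignored.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Op:
    """Hypothetical operation record; the field names are illustrative only."""
    name: str
    rate_hz: float       # invocation rate
    exec_ms: float       # execution time the operation still needs
    deadline_ms: float   # absolute deadline
    critical: bool       # True for critical operations


def laxity(op, now_ms):
    """Slack left before the deadline if the operation started right now."""
    return op.deadline_ms - now_ms - op.exec_ms


def rms_key(op, now_ms):
    """Static (RMS): higher rate first; the key never changes at run-time."""
    return (-op.rate_hz,)


def edf_key(op, now_ms):
    """Dynamic (EDF): earliest absolute deadline first."""
    return (op.deadline_ms,)


def mlf_key(op, now_ms):
    """Dynamic (MLF): smallest laxity first."""
    return (laxity(op, now_ms),)


def muf_key(op, now_ms):
    """Hybrid (MUF): static criticality partition first, then laxity."""
    return (0 if op.critical else 1, laxity(op, now_ms))


def dispatch_order(ops, key, now_ms=0.0):
    """Return operation names in the order the given key would dispatch them."""
    return [op.name for op in sorted(ops, key=lambda op: key(op, now_ms))]


# Assumed sample workload (not taken from the paper).
OPS = [
    Op("hud", rate_hz=40, exec_ms=5, deadline_ms=25, critical=False),
    Op("nav", rate_hz=20, exec_ms=35, deadline_ms=50, critical=True),
    Op("weapon", rate_hz=10, exec_ms=10, deadline_ms=100, critical=True),
    Op("route", rate_hz=5, exec_ms=20, deadline_ms=200, critical=False),
]

if __name__ == "__main__":
    for label, key in [("RMS", rms_key), ("EDF", edf_key),
                       ("MLF", mlf_key), ("MUF", muf_key)]:
        print(label, dispatch_order(OPS, key))
```

With these assumed values the static key orders purely by rate, whereas the hybrid key places both critical operations ahead of the higher-rate non-critical one, which is the isolation property that single-paradigm scheduling lacks.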
The dashed curve in Figure 1 shows how multi-paradigm scheduling can approximate the best single-paradigm approach at each point along the horizontal load axis. Due to mode switches or other adaptation mechanisms, multi-paradigm approaches may incur more overhead than static and hybrid static/dynamic single-paradigm approaches. They are better suited than single-paradigm approaches, however, to approximate the ideal performance curve over its length.

This paper shows how the Kokyu framework supports alternative scheduling strategies implemented using COTS OS and middleware mechanisms. By doing so, Kokyu increases adaptability across product families, operating systems, and most importantly environmental conditions, while preserving the rigorous scheduling guarantees and testability offered by prior work on statically scheduled CORBA operations [8], [21], [22].

E. Paper Organization

The remainder of this paper is organized as follows: Section II describes the application, middleware, OS, and hardware configurations that comprise the open experimentation platform used for our empirical studies; Section III describes how our experiments quantitatively evaluate the suitability of COTS-based hardware and software for mission-critical DRE systems; Section IV presents the empirical results obtained on our open experimentation platform; Section V summarizes the observations and recommendations based on our results; Section VI compares our research on Kokyu with related work; and Section VII presents concluding remarks.

II. OPEN EXPERIMENTATION PLATFORM

The work in this paper focuses on a mission-critical system that is representative of an important class of DRE systems: the operational flight program (OFP) in an avionics mission computing system. An OFP manages sensors and operator displays, navigates the aircraft's course, and controls on-board equipment. The avionics system used for this paper consists of OFP components hosted on a domain-specific middleware infrastructure called Bold Stroke, which in turn is built using the distribution middleware capabilities and common middleware services provided by the TAO Real-time CORBA ORB [8].

Figure 2 illustrates the interactions between the Kokyu framework and OFP application and middleware components. Along with Figure 3 in Section II-A, this figure shows how the OFP application components were hosted on an open experimentation platform consisting of the following layers:
• An OS/hardware platform consisting of the VxWorks real-time operating system on embedded hardware, which is described in Section II-A.
• The ACE ORB (TAO) [8], the TAO real-time event channel [9], and the Kokyu strategized scheduler [1] middleware, which is described in Section II-B.
• The Bold Stroke avionics domain-specific middleware [21], [22], which is described in Section II-C.
• The OFP application components used for the studies, which are described in Section II-D.

The remainder of this section describes these layers of the open experimentation platform.² Sidebar 1 defines key terminology used throughout the paper.

² This platform, and the studies conducted on it, were supported under the Adaptive Software Flight Demonstration (ASFD) program hosted by the Boeing Phantom Works Open Systems Architecture organization. This work was administered by the Embedded Systems Branch of the Information Directorate, Air Force Research Labs (AFRL), Wright-Patterson Air Force Base, Dayton, Ohio. Portions of the TAO ORB and the Bold Stroke open experimentation platform were developed under support from DARPA ITO.

Fig. 2. Application and Middleware Layers
[Diagram labels: Air Frame, Nav, and sensor proxy components; Kokyu services, Kokyu scheduler, and Kokyu dispatch module; RT-ARM; event channel; object request broker. Numbered interactions: 1: register operation characteristics; 2: register for events; 3: register to get periodic timeouts, send events; 4: register dependencies; 5: (re)assign rate, prio; 6: (re)configure; 7: periodic push; 8: filter, correlate; 9: prio dispatch.]

A. Overview of OS/Hardware Configurations

Figure 3 shows the COTS hardware and operating system used in the experiments described in Section III, consisting of a commercial VME-64 chassis with four commercial processor cards, a desktop computer running Windows NT 4.0, and a portable UNIX workstation. The desktop computer gathered metrics data and presented visualizations of processor utilization and deadline successes, failures, and cancellations. The UNIX workstation loaded the executable programs onto the boards in the VME chassis and provided a file server for the digital map display.

Two COTS processor cards, a Dy4-783 and a Dy4-177, performed the map display function. The Dy4-783 card had a memory-mapped display processor and the Dy4-177 card hosted an application component that ran the map display algorithms. The OFP system was distributed across the remaining two processor cards. The first system card was a 200 MHz, PowerPC 604, Motorola card, which ran the experimental system described in Section II-D on the VxWorks [24] 5.3.1 real-time operating system. The second system card was a 100 MHz, PowerPC 603, Dy4-177 card. This card contained a MIL-STD-1553 MUX bus interface card and the Ethernet interface for the VME chassis. All external communication, e.g., over the 1553 bus to avionics remote terminals, or over the VME backplane to diagnostic and debug systems, went through this card. This card also controlled timing for frame sequencing and display updates, upon which operation rates on the Motorola card depended.
Sidebar 1: Terminology

For clarity, we define the following terms used in the discussion of the Bold Stroke open experimentation platform:

Operation: A single short-lived computation run each time an event is pushed to its component.

Cancellation: Interdiction of the event push to an operation so that it will not be invoked. We denote scheduling strategies using cancellation by a "c" annotation in Section IV.

Load chain: A sequence of operations, where each operation itself (except the last one) pushes an event to invoke the next operation in the chain. Subsequent events have precedence dependencies on prior events in the chain, and cancelling an operation in the chain amounts to shedding the rest of the chain from that operation onward.

Route leg: A segment of a navigation route computed in one operation invocation. Computing route legs was implemented as a load chain in our experiments, with each route segment successfully completed requesting the next segment, up to the length of the chain. In particular, a realistic system might declare the computation of the first one or two legs to be critical operations, which must be completed on time and cannot be cancelled, while subsequent route legs might be declared non-critical.

Replication service: A middleware service provided by the Boeing Bold Stroke infrastructure for replicating data across mission-computing processors. Operation deadlines in the experimental system correspond to the points in time when their respective output values must be delivered and flushed to the replication service.

Remote terminals: Connected sensors and actuators in the aircraft. In the open experimentation platform, emulation software for these was connected to the mission computer by a MIL-STD-1553 hardware bus, to simulate the inputs of actual sensors. The experimental system, middleware, and hardware were demonstrated in an AV-8B flight simulator at Boeing, which included an AV-8B cockpit and hardware remote terminals.

Fig. 3. Hardware and Software Configuration
[Diagram labels: Unix workstation and NT desktop connected by Ethernet; a VME backplane carrying Dy4-177, Dy4-783, Motorola, and Dy4-177 cards, grouped as Map Display Processing and OFP Processing; a 1553 bus; software layers on the cards labeled OFP component, Bold Stroke infrastructure, scheduler, event channel, TAO ORB core, and VxWorks RTOS.]

B. Overview of DOC Middleware Configurations

The COTS distributed object computing middleware used for the ASFD demonstration was based on the TAO 1.2 implementation of Real-time CORBA [8], [7]. Real-time CORBA allows DRE developers to configure and control the following system resources:
• Processor resources via thread pools, priority mechanisms, intra-process mutexes, and a global scheduling service for real-time systems with fixed priorities,
• Communication resources via protocol properties and explicit bindings to server objects using priority bands and private connections, and
• Memory resources via buffering requests in queues and bounding the size of thread pools.

As shown in Figure 2, the TAO Real-time Event Channel [9] is a publish/subscribe service that mediates communication between components acting as proxies for (1) remote terminals that interact with the physical environment and (2) the operations that process the data. Sensor proxies flush relevant data to the replication service and then push events through the Real-time Event Channel to the processing operations.

Figure 2 also shows the Kokyu scheduling framework, which is a CORBA service that provides scheduling and dispatching services to TAO's Real-time Event Channel. Kokyu is responsible for (1) isolating critical processing from non-critical processing and (2) making the remaining CPU time available to non-critical processing. Kokyu provides these services via a scheduling strategy with which it is configured to (1) assign priorities to operations and (2) specify the queueing discipline used at each priority level. By configuring the TAO Real-time Event Channel according to the specified set of priorities and queue disciplines, the middleware services described above enforce the mission computing system's real-time QoS assurances and performance.

C. Overview of the Bold Stroke Platform

The open experimentation platform for our work is based on the Bold Stroke domain-specific middleware [21], [22]. Bold Stroke uses COTS hardware and middleware to produce a standards-based component architecture for military avionics mission computing capabilities, such as navigation, data link management, and weapons control. A driving objective of Bold Stroke is to support reusable product-line applications, leading to a highly configurable application component model and supporting reusable middleware services, such as a replication service.

Bold Stroke has been developed and deployed using DOC middleware components and services based on the TAO Real-time ORB and Real-time Event Channel, and the Kokyu framework described in Section II-B. Figure 2 illustrates the middleware components in Bold Stroke. As shown in this figure, Bold Stroke uses the TAO Real-time Event Channel atop the TAO ORB to communicate between components (1) on the same endsystem and (2) distributed across different endsystems. The Kokyu scheduler maintains information required for priority-preserving dispatching, which in the experimental framework described in Section III was performed in dispatching queues within the TAO Real-time Event Channel.

D. Overview of the OFP Application

The OFP application used as the basis of our multi-paradigm scheduling experiments provides avionics mission computing capabilities for an AV-8B (Harrier) aircraft. The baseline version evolved from
1) An AV-8B OFP written in assembly language, to
2) A single-board C/C++ OFP, and subsequently to
3) A distributed OFP using the Boeing AV-8 Open Systems Core Avionics Requirements airframe and the Boeing Bold Stroke domain-specific middleware described in Section II-C.

All major OFP components were implemented as periodically invoked operations, executed by event consumers. Operations were divided into two equivalence classes:
• Hard real-time (HRT) for critical operations: Critical operations in the HRT class are those whose failure to meet any given deadline has potentially significant consequences for the correctness of the application.
• Soft real-time (SRT) for non-critical operations: Deadline success for the non-critical SRT operations is desirable but not strictly mandatory.

There were five pre-defined rates of execution in the system: 40 Hz, 20 Hz, 10 Hz, 5 Hz, and 1 Hz. Each operation runs at one of these rates. For the ASFD open experimentation platform, new 20 Hz SRT functions were added to the OFP, including routes and steering components, as well as a digital map display.

III. EXPERIMENTAL FRAMEWORK TO EVALUATE MULTI-PARADIGM SCHEDULING

Section II outlined the Bold Stroke architecture and the OFP application components for avionics mission computing. This section describes the design of experiments that empirically evaluate the suitability of COTS-based hardware and software for these types of mission-critical DRE systems. We focus on three canonical scheduling strategies, Rate Monotonic Scheduling (RMS) [19], Maximum Urgency First (MUF) [4], and RMS+Minimum Laxity First (MLF) [20], to determine which performs better under representative environmental conditions with varying load and load jitter.

A. OFP Application Design and Implementation Challenges

Challenges addressed by Bold Stroke: The Bold Stroke architecture addresses the following key design and implementation challenges confronted by OFP applications:

a) Scheduling assurance of critical operations is required prior to run-time: In OFP applications, as in many other DRE systems, the consequences of missing a deadline at run-time can be catastrophic. For example, failure to process an input from the pilot by a specified deadline can be disastrous in an avionics application, e.g., during navigation through a dense threat environment. It is therefore essential to assure prior to run-time that even in the worst-case scenario(s), all critical processing deadlines will be met. Bold Stroke has historically addressed this challenge through static scheduling and extensive testing and validation.

b) Severe resource limitations: Like many other DRE systems, OFP applications must perform efficient processing due to strict resource constraints, such as cost, weight, and power consumption restrictions. In particular, it is desirable to provision only the resources needed to meet worst-case critical processing requirements. Bold Stroke has historically addressed this challenge by clustering operations within an OFP application into a set of coarse-grain mutually exclusive modes, and provisioning resources for the worst-case mode.

c) Adaptability across product families: Some DRE real-time systems are custom-built for specific product families. Development and testing costs can be reduced if critical and non-critical resource requirements can be shown to be isolated. In addition, validation and certification of components can be shared across product families, which amortizes development time and effort. Bold Stroke addresses this challenge by using CORBA to separate interfaces from implementations and support component reuse [8].

Challenges addressed by Kokyu: We apply the Kokyu scheduling framework to the Bold Stroke architecture to address the above challenges in a broader range of contexts, as described in Section IV. Furthermore, Kokyu addresses the following design and implementation challenges confronted by OFP applications, but not addressed historically by the Bold Stroke platform itself:

d) Robust performance under widely varying environmental conditions: As noted in Section I, next-generation DRE systems must respond flexibly to variations in load and load jitter imposed by the external environment. For example, next-generation avionics mission computing applications implement features, such as on-demand imagery download [2] and decision aiding systems [25], whose resource demands (1) vary total load at longer time-scales across a series of stable epochs of operation, according to inputs from the environment and/or human users and (2) produce different degrees of load jitter in invocation-to-invocation demands across shorter time-scales within each epoch according to relevant factors, such as progress of a navigation computation in a rapidly evolving threat environment.

e) Safe addition of non-critical processing: To more fully occupy under-utilized resources in non-worst-case scenarios, it is desirable to perform additional non-critical processing. While missing a non-critical operation's deadline does not compromise system correctness, reduced or even zero value accrues to the application for that operation's use of the resources. It is crucial, however, to assure that non-critical processing does not interfere with critical processing and cause critical deadlines to be missed.

These design and implementation challenges addressed by Bold Stroke and Kokyu are also fundamental to many other DRE systems with similar requirements and constraints. Our previous work [1] described the design and implementation challenges we addressed to apply Kokyu to Real-time CORBA and thus integrate Kokyu within the Bold Stroke architecture. This paper extends our earlier work by presenting empirical studies that show how Kokyu can then meet the above open challenges not historically addressed by Bold Stroke. The results in this paper can be generalized to a broader class of DRE systems that perform both critical and non-critical processing and that operate in dynamically varying environments.

B. Experimental Design

We have applied the open experimental platform described in Section II to determine the degree to which the challenges described in Section III-A can be met (1) using commercial off-the-shelf (COTS) hardware, operating systems, and middleware (i.e., using Dy4 and Motorola cards, the VxWorks OS, and the TAO, TAO Real-time Event Channel, and Kokyu middleware) and (2) across a range of environmental conditions. The remainder of this section describes the hypotheses tested, the variables that were controlled, and the variables that were measured in our studies.

1) Hypotheses: The hypotheses explored in these studies are shown in Table I. This table also notes which challenges described in Section III-A are addressed by each hypothesis.

TABLE I
HYPOTHESES STUDIED AND CHALLENGES ADDRESSED

Hypothesis: Multi-paradigm scheduling is needed to both (1) maintain QoS assurances for DRE systems while (2) increasing performance beyond levels achievable by single-paradigm approaches.
Challenges: A, B, and D

Hypothesis: Infrastructure factors, such as dynamic queue overhead, may influence both the ability to enforce critical processing assurances, and the ability to improve non-critical processing performance.
Challenges: C and E

To test these hypotheses, and to study the potential benefits and consequences of (1) supporting alternative scheduling strategies and (2) working toward the ability to perform beneficial adaptation among them at run-time, we ran identical trials using each of the following canonical scheduling strategies:
• RMS [19], which is a purely static strategy that assigns priorities in rate order and manages requests at each priority level in first-in-first-out (FIFO) order.
• MUF [4], which is a hybrid static/dynamic strategy that assigns static priorities by operation criticality, and schedules within each static priority by minimum laxity.
• RMS+MLF [20], which first schedules critical operations according to rate and then non-critical operations at lower priority according to laxity.

We selected these strategies since they are most applicable to OFP application requirements to support both hard real-time (HRT) and soft real-time (SRT) operations under a range of load and load jitter conditions.

2) Controlled Variables: To examine effects of varying load and load jitter in the production-quality avionics mission computing environment described in Section III-A, many next-generation DRE systems must satisfy resource demands that
• Vary overall at longer time-scales across a series of stable epochs of operation and
• Produce different degrees of jitter in invocation-to-invocation demands across shorter time-scales within each epoch.

To model variation in both load and load jitter imposed by these types of demands, we added operations to a sequence of twelve epochs of operation, each representing a distinct operating region [2] numbered 0-11, as shown in Figure 4.

[Figure 4 plot: operating regions arranged by non-critical load (horizontal axis) and mean jitter (vertical axis); rows from highest to lowest mean jitter: 2, 6, 10 / 3, 7, 11 / 1, 5, 9 / 0, 4, 8.]

TABLE II
LOADS FOR EACH OPERATING REGION

Region   Variable HRT Execution   SRT Load Chain Length
0        0 msec                   1 route leg
1        0 to 5 msec              1 route leg
2        5 to 10 msec             2 route legs
3        0 to 10 msec             3 route legs
4        0 msec                   4 route legs
5        0 to 5 msec              5 route legs
6        5 to 10 msec             6 route legs
7        0 to 10 msec             7 route legs
8        0 msec                   8 route legs
9        0 to 5 msec              9 route legs
10       5 to 10 msec             10 route legs
11       0 to 10 msec             11 route legs

• Regions 0, 4, and 8 have fixed HRT event consumer loads, with no additional variability.
• Regions 1, 5, and 9 have variability of between 0 msec and 5 msec for each of the 10 Hz, 5 Hz, and 1 Hz rates, for a total variability of between 0 and 80 msec of each 1 Hz frame, i.e., between 0 and 8 percent variability.
• Regions 2, 6, and 10 have variability of between 5 msec and 10 msec for each of the 10 Hz, 5 Hz, and 1 Hz rates, for a total variability of between 80 and 160 msec of each 1 Hz frame, i.e., between 8 and 16 percent variability.
• Regions 3, 7, and 11 have variability of between 0 msec and 10 msec for each of the 10 Hz, 5 Hz, and 1 Hz rates, for a total variability of between 0 and 160 msec of each 1 Hz frame, i.e., between 0 and 16 percent variability.
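The variability totals quoted above can be checked with a short calculation. The Python sketch below assumes, as the stated totals imply but the text does not say outright, that exactly one variable-time operation runs at each of the 10 Hz, 5 Hz, and 1 Hz rates, giving 16 invocations per 1 Hz frame. The function and constant names are hypothetical; the per-region values are those listed in Table II.

```python
VARIED_RATES_HZ = (10, 5, 1)   # rates whose HRT execution time was varied
FRAME_MS = 1000                # duration of one 1 Hz frame in msec

# Per-invocation variability range (msec); Table II repeats every four regions.
_JITTER_BY_PHASE = {0: (0, 0), 1: (0, 5), 2: (5, 10), 3: (0, 10)}


def invocations_per_frame():
    """Assumes one varied operation per rate: 10 + 5 + 1 invocations."""
    return sum(VARIED_RATES_HZ)


def total_variability_ms(per_invocation_ms):
    """Total added execution time per 1 Hz frame for a per-invocation value."""
    return invocations_per_frame() * per_invocation_ms


def percent_of_frame(ms):
    """Express a duration in msec as a percentage of one 1 Hz frame."""
    return 100 * ms / FRAME_MS


def region_load(region):
    """Return (min jitter msec, max jitter msec, route legs) per Table II."""
    if not 0 <= region <= 11:
        raise ValueError("operating regions are numbered 0 through 11")
    low, high = _JITTER_BY_PHASE[region % 4]
    legs = max(1, region)   # regions 0 and 1 both use a single route leg
    return low, high, legs


if __name__ == "__main__":
    for region in range(12):
        low, high, legs = region_load(region)
        print(region, legs,
              percent_of_frame(total_variability_ms(low)),
              percent_of_frame(total_variability_ms(high)))
```

Under that assumption, a 5 msec per-invocation variation yields 16 x 5 = 80 msec per frame (8 percent) and a 10 msec variation yields 160 msec (16 percent), matching the figures in the list.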
Fig. 4. Operating Regions

Total variability was thus lowest in regions 0, 4, and 8, higher in