RTOS Image Building for Different Target Platforms Lecture Five
Porting of RTOS, Configuring RTOS for Minimizing RAM Consumption and Increasing
Throughput, Building RTOS Image for Target Platforms
Porting of RTOS:
Product development cycles are market driven, and market demands often require
vendors to compress development schedules. One approach to this is to simultaneously
develop similar products, yet with varying levels of product complexity. However, scheduling
pressures coupled with increased product complexity can be a recipe for disaster, resulting in
slipped schedules and missed opportunities. Consequently, vendors are always on the alert for
silver bullets, yet as developers, we know that they don't exist. That said, it is still in our best
interest to seek better ways of compressing development cycles, and one way to do this is to
port existing products to new hardware platforms, adding new features along the way. This
is the approach we used to demonstrate a proof of concept when porting a legacy security
application to a new hardware platform.
Our firm was hired to make enhancements to the client's existing 6502-based product,
and we quickly realized that this platform was running out of steam. Specifically, the proposed
features would significantly impact performance. Consequently, we proposed three options
for fixing this problem:
➢ Completely rewriting the application on the current hardware.
➢ Rewriting the application on new, higher-performance hardware.
➢ Migrating portable portions of the application to the new hardware.
After considering the options, we decided to port to new hardware.
RTXC Overview
The Real-Time eXecutive kernel (RTXC) supports three kinds of priority-based task
scheduling: preemptive (the default), round-robin, and time-slice. RTXC is robust; it supports
hard deadlines, changeable task priorities, time and resource management, and intertask
communication. It also has a small RAM/ROM code footprint, a standard API, and has been
implemented on many processors. RTXC is divided into nine basic components: tasks,
mailboxes, messages, queues, semaphores, resources, memory partitions, timers, and
Interrupt Service Routines (ISRs). These components are further subdivided into three
groups used for intertask communication, synchronization, and resource management.
Component functionality is accessed via the standard API.
Porting RTXC to the new board involved four activities: understanding the system
architecture, identifying the files to change, making those changes, and writing test code.
The first activity is design related, while the others are implementation related. Moreover,
the last three activities require an understanding of the new hardware: knowing the
specifics of what needs to happen to make the RTOS interact with the board.
System Architecture:
The best way to identify hardware components is to study the board's schematics.
Examining the NPE-167 board revealed that the I/O ports would be key for this project.
Why? Because this board uses the processor's general-purpose ports to read the switches
that control CAN bus operation and the board's operating mode, to drive the LED outputs,
and to select memory pages. The I/O cards were controlled via the SPI bus, rather than
I/O ports.
Ports can be configured as either inputs or outputs. Examination of the NPE-167 board
showed that 17 ports are used. Eleven ports are used as switch inputs. From the schematic
we saw that switches 1-7 set the MAC address for the CAN device. CAN bus speed is
controlled by switches 8-9, while the board operating mode is controlled by switches 11-12.
Switch 10 is not used. Four ports control the LEDs. There are three LEDs in total: one
green, one red, and one bicolor. The bicolor LED needs two outputs, so four output ports
are required to drive the three LEDs. Finally, two output ports are used as page selection
for extended memory.
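The switch-to-function mapping above can be sketched in C. The bit positions and struct layout here are invented for illustration (the real assignments come from the NPE-167 schematic), and real code would read the raw value from the processor's port registers:

```c
#include <stdint.h>

/* Hypothetical decoding of the switch inputs into board settings.
   Bit 0 corresponds to switch 1, bit 11 to switch 12; switch 10
   (bit 9) is unused, matching the board description above. */
typedef struct {
    uint8_t can_mac;   /* switches 1-7:   CAN MAC address  */
    uint8_t can_speed; /* switches 8-9:   CAN bus speed    */
    uint8_t mode;      /* switches 11-12: operating mode   */
} board_config_t;

board_config_t decode_switches(uint16_t raw)
{
    board_config_t cfg;
    cfg.can_mac   = (uint8_t)(raw & 0x7F);         /* low 7 bits  */
    cfg.can_speed = (uint8_t)((raw >> 7) & 0x03);  /* next 2 bits */
    cfg.mode      = (uint8_t)((raw >> 10) & 0x03); /* bits 10-11  */
    return cfg;
}
```

On target, `raw` would be assembled from the port input registers set up in the startup code.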
The NPE board addresses up to 512K of memory before having to make use of the page-
selection ports. Although we configured the page-selection ports during the port, we didn't
need to use them because the total code footprint of the kernel plus test code is 107K:
RTXC's kernel is about 76K, and the porting test code fits within another 31K. In short, we
used only about one-fifth of the default memory to validate the porting process.
The last component to pin down for the port was which timer to use as the master time
base. Timers are internal to the C167 processor, so they don't show up on the schematic.
We had two options: choose a timer and write the code for that timer, or use the BSP's
default timer. A trick that simplifies the initial porting process is to use the timer the BSP
already uses. Reviewing the BSP documentation, we discovered that it uses timer 6 as the
master timer. Once we had determined the components associated with the porting process,
we could turn our attention to figuring out which files needed to be changed.
Changing Files
We knew from the previous step that 11 ports were used for input and six for output.
Because these are general-purpose I/O ports, they need to be initialized to work as either
inputs or outputs. This told us where the NPE-specific initialization code needed to go: the
code to set up these ports belongs in the startup code. For this project, initialization code
is located in the cstart.a66 file in the Porting directory. Listing One is the code that
configures the NPE-167 board I/O. Once configured, the I/O can be used by higher-level
RTOS and API functions. With the I/O changes located, we turned our attention to
discovering and setting up the master timer.
The BSP set up the master timer for us because we were using the default timer 6. Setup
code for this timer is located in cstart.a66 and rtxcmain.c. Listing Two is a snippet of the
RTXC-specific code. After analyzing the architecture requirements, we found that the only
file that had to change to port to the NPE-167 board was cstart.a66. Granted, we knew we
would have to change other files as well, but those files are application specific.
This brought us to the third step, which was straightforward because we knew what
needed to be changed and where. Recall that all changes for basic porting functionality
occurred in cstart.a66. We also needed to write the initialization code. We wrote code to
initialize the switches that configure CAN, but no other CAN-handling code, because CAN
is not used in the basic port. For specifics, look at cstart.a66 and search for the npe and
rtxc labels to find code changes specific to this port. When porting to new hardware, you
may want to adopt a similar strategy of partitioning the code into hardware-specific and
RTOS-specific changes: partitioning code with labels helps with code maintainability.
Test Code
Finally, we needed to create some test code to test our port. Building the test code
application was a two-step process:
• We compiled the RTXC kernel into a library object (rtxc.lib).
• We compiled the test code and linked in rtxc.lib to create the executable.
There are two directories for generating the test code, and they are stored at the same
level in the hierarchy. All files for creating rtxc.lib are located in the kernel directory,
while the test code-specific files are located in the Porting directory.
The RTXCgen utility creates a set of files corresponding to each RTOS component.
For instance, application queues are defined in three files: cqueue.c, cqueue.h, and
cqueue.def. The same holds true for tasks, timers, semaphores, mailboxes, and the rest.
Changes to the number of RTOS components are handled by this utility. For example, if we
wanted to change the number of tasks used by the test code, we would use RTXCgen to do
it. Figure 2 shows the contents of the task definition file for the test code application. Test
code files created by RTXCgen are placed in the Porting directory. Once RTXCgen has
defined the system resources, we are ready to build the project.
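To give a feel for what RTXCgen produces, here is a hypothetical task definition table in the style of such generated files. The type names, fields, and values are invented for illustration; the actual files created by RTXCgen (such as the task set shown in Figure 2) will differ:

```c
#include <stddef.h>

/* Invented stand-in for an RTXCgen-style task definition table. */
typedef void (*task_entry_t)(void);

typedef struct {
    const char  *name;
    task_entry_t entry;
    int          priority;    /* lower number = higher priority */
    size_t       stack_size;  /* bytes */
} task_def_t;

static void user_task_a(void) { /* task body omitted */ }
static void user_task_b(void) { /* task body omitted */ }

static const task_def_t task_table[] = {
    { "USERA", user_task_a, 2, 256 },
    { "USERB", user_task_b, 3, 256 },
};

unsigned num_user_tasks(void)
{
    return (unsigned)(sizeof(task_table) / sizeof(task_table[0]));
}

int task_priority(unsigned idx)
{
    return task_table[idx].priority;
}
```

The point of generating such tables from a tool is that adding or removing a task never requires hand-editing kernel data structures.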
Creating the executable test code requires the build of two subprojects—the kernel
and test code. We performed builds using the Keil Microvision IDE (http://www.keil.com/).
Keil uses project files (*.prj files) to store its build information. RTXC kernel creation
consists of building the code using the librtxc.prj file located in the kernel directory.
Invoking the librtxc project compiles, links, and creates a librtxc object in the kernel
directory.
Building the test code is accomplished using the NpeEg.prj file stored in the Porting
directory. Invoking the NpeEg project compiles and links files in the Porting directory, and
links the librtxc object in the kernel directory. The resulting executable is then placed in the
Porting directory as well. Once the test code was fully built, we were ready to test the board
port.
The test code is a simple application used to validate the porting process. Most of the
test code is located in main.c located in the Porting directory. The application works by
starting five tasks: two user tasks and three system tasks. The user tasks execute
alternately, while the system tasks execute in the background. One user task begins
running. It outputs data via one of the system tasks to the console. Next, it signals the
other user task to wake up and puts itself to sleep, waiting for the other task to signal it
to wake up again.
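The handshake between the two user tasks can be illustrated with a minimal single-threaded simulation. Real RTXC code would use kernel services (semaphores and the console output task) rather than the turn variable used here; everything below is an illustrative stand-in:

```c
#include <string.h>

/* Records which task "printed", standing in for console output
   performed by a system task. */
static char log_buf[64];

static void console_task(const char *msg)
{
    strcat(log_buf, msg);
}

/* Each round, the running task outputs via the console task, wakes
   the other task, and sleeps; the turn variable models the
   signal/wait handshake. */
void run_alternation(int rounds)
{
    int turn = 0;   /* 0 = task A runs, 1 = task B runs */
    for (int i = 0; i < rounds; i++) {
        if (turn == 0) {
            console_task("A");
            turn = 1;   /* A signals B, then sleeps */
        } else {
            console_task("B");
            turn = 0;   /* B signals A, then sleeps */
        }
    }
}
```

Running four rounds produces the alternating output "ABAB", which is exactly the behavior the test application validates on the board.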
Broadly speaking, there is read only memory (ROM – nowadays that is usually flash
memory) and read/write memory (RAM). ROM is where the code and constant data is
stored; RAM is used for variables. However, to improve performance, it is not uncommon
to copy code/data from ROM to RAM on boot up and then use the RAM copy. This is
effective because RAM is normally faster to access than ROM. So, when thinking about the
RTOS footprint, you need to consider both ROM and RAM size, including the possibility of
a RAM copy.
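The boot-time ROM-to-RAM copy described above can be sketched as follows. On real hardware the source address, destination address, and length would come from linker-defined symbols (whose names vary per toolchain); plain arrays stand in here so the sketch is self-contained:

```c
#include <stdint.h>
#include <string.h>

#define CODE_SIZE 4

/* Stand-ins for a code/data image in flash and its destination in
   RAM; on target these would be linker-defined regions. */
static const uint8_t rom_image[CODE_SIZE] = { 0x12, 0x34, 0x56, 0x78 };
static uint8_t ram_copy[CODE_SIZE];

/* Performed once at boot; afterwards execution and data accesses
   use the faster RAM copy. */
void copy_rom_to_ram(void)
{
    memcpy(ram_copy, rom_image, CODE_SIZE);
}
```

Note the cost: this technique doubles the memory requirement for the copied region, since both the ROM original and the RAM copy must exist.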
The issue can become more complex. There may be on-chip RAM and external
memory available. The on-chip storage is likely to be faster, so it may be advantageous to
ensure that RTOS code/data is stored there, as its performance will affect the whole
application. In a similar fashion, code/data may be locked into cache memory, which tends
to offer even higher performance.
Compiler optimization
When building code such as an RTOS, the optimization settings applied to the compiler
affect both size and execution speed. Most of the time, code built for highest performance
(i.e., fastest) will be bigger; code optimized to be smaller will run slower. An RTOS would
normally be built for performance rather than size, although an RTOS vendor wanting to
emphasize the small size of their product might make a different choice.
RTOS configuration
Real-time operating systems tend to be very configurable, and that configuration can
vary the RTOS size drastically. Most RTOS products are scalable, so the memory footprint
is determined by the actual services used by the application. The granularity of such
scalability varies from one product to another. In some cases, each individual service is
optional; in others, whole service groups are included or excluded – i.e. if support for a
particular type of RTOS object (e.g. semaphore) is required, all the relevant services are
included. On a larger scale, other options, like graphics, networking and other connectivity,
will affect the code size, as these options may or may not be needed/included.
Fig. 5.1: RTOS Configuration
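The effect of scalability on footprint can be illustrated with a toy estimator. The service groups and byte counts below are invented for illustration, not taken from any particular RTOS:

```c
/* Invented configuration switches: each service group can be
   excluded, removing its entire code size from the ROM footprint. */
typedef struct {
    int use_semaphores;
    int use_mailboxes;
    int use_networking;
} rtos_cfg_t;

/* Toy footprint estimate: scheduler core is always present, each
   included service group adds its (made-up) code size. */
int rom_footprint_bytes(const rtos_cfg_t *cfg)
{
    int size = 20000;                      /* scheduler core */
    if (cfg->use_semaphores) size += 3000;
    if (cfg->use_mailboxes)  size += 4000;
    if (cfg->use_networking) size += 40000; /* large optional stack */
    return size;
}
```

The large jump when networking is enabled mirrors the point above: big optional subsystems such as graphics or connectivity dominate the footprint when included.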
Runtime library
Typically, a runtime library will be used alongside an RTOS; this code needs to be
accommodated. Again, the code, being a library, may scale well according to the needs of a
particular application.
Data size
Apart from a baseline amount of storage for variables, the RAM requirements of an RTOS
can similarly be affected by a number of factors:
Compiler optimization
As with code, the compiler's optimization settings can affect how much RAM data occupies.
RTOS objects
The number of RTOS objects (tasks, mailboxes, semaphores, etc.) used by the
application will affect the RTOS RAM usage, as each object needs some RAM space.
Stack
Normally, the operating system has a stack and every task has its own stack; these
must all be stored in RAM. Allocation of this space may be done differently in each RTOS,
but it can never be ignored.
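A common static allocation scheme, with the kernel stack and one stack per task reserved in RAM, might look like the following sketch; the task count and stack sizes are illustrative, not from any particular RTOS:

```c
#include <stdint.h>

#define NUM_TASKS          3
#define TASK_STACK_WORDS   128
#define KERNEL_STACK_WORDS 256

/* All stacks live in RAM: one for the kernel, one per task. */
static uint16_t kernel_stack[KERNEL_STACK_WORDS];
static uint16_t task_stacks[NUM_TASKS][TASK_STACK_WORDS];

/* Total stack RAM the linker must reserve, in bytes. */
unsigned total_stack_ram(void)
{
    return (unsigned)(sizeof(kernel_stack) + sizeof(task_stacks));
}
```

With 16-bit stack words, this configuration reserves (256 + 3 × 128) × 2 = 1280 bytes of RAM for stacks alone, before any other RTOS data is counted.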
Dynamic memory
RTOS for Image Processing:
Typical registration process stages include identifying movement vectors between two
relative images, performing alignment, and applying further correction/enhancement filters
to improve image and stream quality. In defense applications, sensor-based components use
registration from ground systems to a variety of aerial systems. Adding to its complexity,
defense applications require very high-performance computations (high resolutions and
frame rates) and have limited space for hardware, dictating a small system size. This requires
a solution with good heat dissipation and the ability to consistently operate at low power.
The key points in RTOS for image processing applications are:
• The need for speed
• Power and size constraints
• Real-Time threat detection
• Faster-than-real-time processing
The quality and the size of image data (especially 3D medical data) is constantly
increasing. Fast and, optimally, interactive post-processing of these images is a major
concern: segmentation, morphing of different images, sequence analysis, and measurement
are difficult tasks to perform. For segmentation and morphing purposes especially, level
set methods play an important role.
(Figure: segmentation of a human brain.)
In the case of image segmentation, f is a force that pushes the interface towards the
boundary of a segment region in an image. Usually f equals one in homogeneous regions of
the image, whereas f tends to zero close to the segment boundary. The discretization of the
level set model is performed with finite differences on a uniform quadrilateral or
octahedral grid. A characteristic of image processing methods is the multiple iterative
processing of data sets described above. Due to the possible restriction on number
precision, it is possible to work on integer data sets with a restricted range of values, i.e.,
an application-specific word length. Furthermore, it is possible to incorporate parallel
execution of the update formulas.
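A toy version of one update sweep, using integer data with an application-specific word length as described, might look as follows. The one-sided-difference scheme and the L1 gradient approximation here are stand-ins chosen for simplicity, not the discretization of any particular paper:

```c
#include <stdlib.h>

/* One illustrative sweep of a level-set-like update on a uniform
   w-by-h grid: out = phi - f * |grad phi|, with the gradient
   approximated by forward differences (clamped at the border) and
   |.| replaced by the L1 norm.  All data is integer, modeling a
   restricted application-specific word length. */
void level_set_step(const int *phi, const int *f, int *out, int w, int h)
{
    for (int y = 0; y < h; y++) {
        for (int x = 0; x < w; x++) {
            int i = y * w + x;
            int dx = (x + 1 < w) ? phi[i + 1] - phi[i] : 0;
            int dy = (y + 1 < h) ? phi[i + w] - phi[i] : 0;
            int grad = abs(dx) + abs(dy);  /* L1 stand-in for |grad phi| */
            out[i] = phi[i] - f[i] * grad;
        }
    }
}
```

Each nodal update is an identical short sequence of primitive operations, which is exactly the structure the FPGA discussion below exploits: the whole sequence can be fused into one compound operation and applied to several nodes in parallel.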
A hardware accelerated system for data Processing: Software and
Hardware Modules
Image processing algorithms as described consist of a complex sequence of primitive
operations, which have to be performed on each nodal value. By combining the complete
sequence of primitive operations into a compound operation, it is possible to reduce the
loss of performance caused by the synchronous design approach.
This design approach, which is common to CPU designs as well as FPGA designs, is based
on the assumption that all arithmetic operations will converge well in advance of the clock
tick that causes the results to be post-processed. Therefore, the maximum clock speed of
such systems is defined by the slowest combinatorial path. In a CPU this leads to "waiting
time" for many operations. Furthermore, there is no need for command fetching in FPGA
designs, which removes another bottleneck of CPU-based algorithms. Additionally, it is
possible to do arbitrary parallel data processing in an FPGA, so that several nodal values
can be updated simultaneously.
The input data rate for CPU-based and FPGA-based applications is determined by
the bandwidth of the available memory interface. A 256x256 image results in 64k words,
i.e., 768 kbit of data at 12-bit resolution. A CPU with a 16-bit wide data access would
need 128 kbyte to store the original image data, without taking the memory for
intermediate results into account. The discussion is not restricted to the information
described in this section; there are many real-time examples that can be considered as
case studies for image processing in an RTOS.
The process for generating a target image for the QNX CAR platform is described below.
Fig. 5.3: Procedure to generate a QNX CAR platform target image
As part of the installation process for the QNX CAR platform, a workspace was created for
you that contains the scripts and configuration files you'll be using. These files are located
in the following locations:
Scripts:
For Linux: $QNX_CAR_DEPLOYMENT/deployment/scripts/
For Windows: %QNX_CAR_DEPLOYMENT%\deployment\scripts
where QNX_CAR_DEPLOYMENT is install_location/qnx660/deployment/qnx-car/.
Configuration files:
For Linux: $QNX_CAR_DEPLOYMENT/boards/<platform>/etc/
For Windows: %QNX_CAR_DEPLOYMENT%\boards\<platform>\etc
2. Extract a BSP. For detailed instructions, see “Building a BSP”.
3. Create an output directory where you want to have the image generated.
You must specify a valid directory name; the directory must exist prior to running
the mksysimage.py script, otherwise the image won't be generated.
The mksysimage.py utility generates images for various configurations. For example, for
SABRE Lite, image files are created for SD and SD/SATA:
imx61sabre-dos-sd-sata.tar
imx61sabre-dos-sd.tar
imx61sabre-os.tar
imx61sabre-sd-sata.img
imx61sabre-sd.img