Real time and determinism. RTOS documentation and the literature
often define the term "real-time" as the property of reporting certain
events (usually interrupts and task switches) within some guaranteed
period of
time. While this is a requirement in many applications, the maximum
latency is very hard to determine precisely, because most modern CPUs
are themselves non-deterministic. An RTOS vendor may attempt to measure
reporting times and document them, but the figures are valid only under
the strict circumstances in which they were measured, and not
necessarily under the circumstances created by the customer's code.
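To make the limits of such measurements concrete, here is a minimal sketch in C of the bookkeeping a latency measurement typically keeps. The names lat_stats, lat_init, lat_record and lat_mean are hypothetical and are not taken from SeptemberOS or any other RTOS. The point to notice is that the recorded maximum is only the largest latency observed so far; nothing in it bounds the latency under a different workload.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical bookkeeping for latency samples (illustration only). */
struct lat_stats {
    uint64_t min;    /* smallest latency observed, in CPU cycles */
    uint64_t max;    /* largest latency observed, in CPU cycles  */
    uint64_t sum;    /* sum of all samples, used for the mean    */
    uint64_t count;  /* number of samples recorded               */
};

/* Reset the statistics before a measurement run. */
void lat_init(struct lat_stats *s)
{
    assert(s != NULL);
    s->min = UINT64_MAX;
    s->max = 0;
    s->sum = 0;
    s->count = 0;
}

/* Record one measured latency, given in CPU cycles. */
void lat_record(struct lat_stats *s, uint64_t cycles)
{
    assert(s != NULL);
    if (cycles < s->min)
        s->min = cycles;
    if (cycles > s->max)
        s->max = cycles;
    s->sum += cycles;
    s->count++;
}

/* Mean of the recorded samples, rounded down. */
uint64_t lat_mean(const struct lat_stats *s)
{
    assert(s != NULL && s->count > 0);
    return s->sum / s->count;
}
```

Feeding the invented samples 40, 25 and 310 into lat_record gives a minimum of 25, a maximum of 310 and a mean of 125. A document built from this run would quote 310 cycles as the maximum, although a code path that the run never exercised could take longer.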
CPU features that impact OS determinism are:
Caches. Most modern CPUs have at least separate L1 caches for code and
data (I-cache and D-cache respectively). Many have a unified or split L2
cache, and some have an L3 cache. Cache timing is non-deterministic:
a cache miss introduces a delay (the data must be brought in from
external memory), and a cache replacement introduces an even longer one
(a new datum or instruction has the bad luck of arriving when all
suitable cache lines are filled; previously cached data, if modified,
must then be written back to external memory, and only after that can
the new data be brought in from external memory to fill the cache line).
Write buffers. When code writes to external memory, the time the write
takes may depend on the state of the write buffers. Cache write-back
may be affected by the state of the write buffers too.
Internal pipeline state and out-of-order execution. Modern CPUs have
long pipelines, and some high-performance CPUs execute instructions
out of order internally. This is done to optimize instruction
throughput, but the same features prevent deterministic instruction
timing: the same instructions take a different number of CPU cycles
depending on the state of the internal pipeline and execution units at
the time they enter the CPU. (Pipelines affect software task switching
procedures more than interrupt reporting, as CPUs with long pipelines
complete all outstanding instructions and write their results before
starting execution of the interrupt handler.)
MMU context switch. When an OS employs an MMU and maps different
address spaces for different tasks, it must partially or completely
flush the MMU state during a task switch. Complete flushing hurts
performance too much, and on-demand flushing introduces another,
delayed form of non-determinism: the MMU translation cache is flushed
as necessary, and the number of flushes depends on the previously
cached translations and the hit pattern of the new translations.
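The cache behaviour described in the list above can be illustrated with a toy model. The following is a minimal sketch in C of a direct-mapped, write-back cache that returns the cost of each access in cycles. The line size, number of lines and all cycle costs are assumptions chosen for readability, not figures for any real CPU, and the names cache_access, struct cache and struct line are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Toy model of a direct-mapped, write-back cache. All sizes and cycle
 * costs below are assumptions for illustration, not real CPU data. */
enum {
    LINE_SIZE      = 32,  /* bytes per cache line (assumed)              */
    NUM_LINES      = 4,   /* lines in the toy cache (assumed)            */
    COST_HIT       = 1,   /* cycles for an access that hits (assumed)    */
    COST_FILL      = 20,  /* cycles to fetch a line from memory          */
    COST_WRITEBACK = 20   /* cycles to write a modified line back first  */
};

struct line {
    bool     valid;  /* line holds data               */
    bool     dirty;  /* line was modified since fill  */
    uint32_t tag;    /* which memory block it holds   */
};

struct cache {
    struct line lines[NUM_LINES];
};

/* Perform one access and return its cost in cycles. */
unsigned cache_access(struct cache *c, uint32_t addr, bool is_write)
{
    assert(c != NULL);

    uint32_t block = addr / LINE_SIZE;
    struct line *l = &c->lines[block % NUM_LINES];
    uint32_t tag = block / NUM_LINES;
    unsigned cost = COST_HIT;

    if (!l->valid || l->tag != tag) {   /* miss */
        if (l->valid && l->dirty)
            cost += COST_WRITEBACK;     /* evict modified data first */
        cost += COST_FILL;              /* then fetch the new line   */
        l->valid = true;
        l->dirty = false;
        l->tag = tag;
    }
    if (is_write)
        l->dirty = true;
    return cost;
}
```

Starting from an empty cache, a write to address 0x000 costs 21 cycles (a miss), a read from 0x004 costs 1 cycle (a hit in the same line), and a read from 0x080 costs 41 cycles, because it maps to the same line and the modified data must be written back before the fill. The same load instruction therefore costs 1, 21 or 41 cycles in this model depending only on what ran before it.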
Many dedicated embedded RTOS try to address CPU non-determinism by
not using at least those features that are not vital for an embedded
application. The first and obvious candidate is the MMU - many embedded
RTOS don't employ MMU translations (SeptemberOS is among them). Caches,
however, improve overall performance too much to be sacrificed for
determinism and disabled. Other internal CPU mechanisms cannot even be
disabled.
We saw that it is very hard to determine precisely the "guaranteed
response time" of an RTOS. From a theoretical point of view, the
"guaranteed" times may be deduced by adding up the maximum possible
delays on a given CPU. However, such figures are "guaranteed" to look
bad on marketing sheets. Usually the documented figures, not only
"response times" but also "response CPU clock counts", are taken from
measurements of a particular application; this means that they say
nearly nothing about the numbers the customer's application will show
for the same parameters.
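The theoretical bound mentioned above, the sum of the maximum possible delays, can be sketched in a few lines of C. The structure delay_source, the function worst_case_cycles and every number in example_path are hypothetical and invented for illustration; real figures would have to come from the CPU and memory system documentation.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* One source of delay on the response path, with its worst-case cost. */
struct delay_source {
    const char *name;
    uint32_t    worst_cycles;  /* assumed worst case, in CPU cycles */
};

/* Theoretical bound: every delay reaches its maximum at the same time. */
uint32_t worst_case_cycles(const struct delay_source *src, size_t n)
{
    assert(src != NULL || n == 0);

    uint32_t total = 0;
    for (size_t i = 0; i < n; i++)
        total += src[i].worst_cycles;
    return total;
}

/* Hypothetical figures, invented for illustration only. */
const struct delay_source example_path[] = {
    { "handler code under ideal conditions",   30 },
    { "I-cache misses on handler code",       120 },
    { "D-cache misses with write-back",       200 },
    { "write buffer drain",                    40 },
    { "pipeline completion",                   25 },
};
```

With these invented figures, worst_case_cycles over the five entries of example_path returns 415 cycles, against 30 cycles under ideal conditions. The gap between the two numbers is the reason a summed worst case looks so much worse than a measured typical case.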
On the other hand, this whole discussion is relevant to RTOS with very
fast response times, on the order of dozens of CPU cycles. Some OS have
extremely long (relative to CPU cycles) latencies; if the figures say
that guaranteed response times run to thousands of CPU cycles
(typically more than 3 microseconds on today's mid-range CPU), they may
be considered reliable (the non-determinism discussed here will be
negligible by comparison).
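The arithmetic behind this comparison can be sketched as follows. This is a minimal illustration in C; the function names cycles_to_ns and jitter_percent are hypothetical, and the 1 GHz clock and the jitter of 50 cycles used in the example are assumptions.

```c
#include <assert.h>
#include <stdint.h>

/* Convert a cycle count to nanoseconds for a CPU clock given in MHz. */
uint64_t cycles_to_ns(uint64_t cycles, uint32_t cpu_mhz)
{
    assert(cpu_mhz > 0);
    return cycles * 1000u / cpu_mhz;
}

/* Jitter as a percentage of the nominal response time, rounded down. */
uint32_t jitter_percent(uint32_t jitter_cycles, uint32_t response_cycles)
{
    assert(response_cycles > 0);
    return (uint32_t)((uint64_t)jitter_cycles * 100u / response_cycles);
}
```

Assuming a 1 GHz clock, cycles_to_ns(3000, 1000) gives 3000 ns, that is 3 microseconds. An assumed jitter of 50 cycles is about 1% of a 3000-cycle response, but 83% of a 60-cycle response, which is why the non-determinism matters only for the fastest systems.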