OpenRADIUS is built around some design ideas that determine most of how
it works; if you build any server using these principles, then a lot of
its architecture will already be defined.
These are not novel ideas at all; those who are comfortable with the Unix
programming environment will probably already be familiar with them:
- modularity is based on pipes and processes, not on libraries or a
  hybrid form of a library and a process such as some DLL designs;
- concurrency is achieved by offloading work to multiple other
  processes, not by multithreading within the same process.
This means that any operation that takes a significant amount of time to
complete is done in a subprocess (module). The server itself has only a
single thread, which processes a request until the response to the client is
ready or until a module is invoked.
All this is done on the assumption that it makes no sense for the main server
to preempt any of its own data processing, which is CPU bound; it's better to
get that over with as soon as possible. But all I/O operations are fully
non-blocking; as soon as a job is queued for transmission to a subprocess, the
server can handle new requests, (partial) answers from subprocesses, anything.
This also means, indeed, that if you run only a single main server,
without running any external modules, you won't take any advantage of a
multi-CPU machine. But who would want to waste that CPU power on such a
simple job? The bottleneck is the network in that case. If it's not,
because you run multiple high-performance ethernet cards in the machine,
you can run multiple instances of the main server, each bound to a
different network interface - thus taking advantage of the extra CPUs
and other hardware.
The module subprocesses themselves can be very simple too - they don't
need to keep track of multiple outstanding requests; that is done by the
server. A subprocess only needs to handle one request at a time and can
block as long as needed.
This doesn't block other requests that call the same interface, because
multiple subprocesses may be started per interface.
That way, you can write a simple module that e.g. introduces a latency of
a full second, but by having the server run 10 of them, you can still
achieve a throughput of 10 requests per second.
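The arithmetic behind that example, as a minimal Python sketch. The
function name is made up for illustration, and the result is an upper
bound that ignores the server's own processing time.

```python
def pool_throughput(subprocesses, latency_seconds):
    """Upper bound on requests per second for a pool of blocking modules.

    Each subprocess handles one request at a time, so it completes at most
    1 / latency requests per second; the pool multiplies that by its size.
    """
    if subprocesses < 0 or latency_seconds <= 0:
        raise ValueError("need a non-negative pool size and a positive latency")
    return subprocesses / latency_seconds


# Ten modules that each block for a full second per request.
print(pool_throughput(10, 1.0))
```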
Alternatively, a module interface can be configured to operate asynchronously,
meaning that any subprocess associated with it is allowed to receive a number
of requests before having to answer any of them. The number of outstanding
requests per subprocess and the attribute that is used to match responses to
requests can be configured per module interface.
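The bookkeeping this implies can be sketched as follows. This is only an
illustration of matching answers to requests by an attribute value, not
the server's actual data structure; the class name, its methods and the
attribute name 'Example-Id' are hypothetical.

```python
class AsyncChannel:
    """Outstanding jobs for one subprocess of an asynchronous interface."""

    def __init__(self, max_outstanding, match_attribute):
        self.max_outstanding = max_outstanding  # configured per interface
        self.match_attribute = match_attribute  # configured per interface
        self.pending = {}                       # match value -> job

    def can_accept(self):
        # The subprocess may hold several requests before answering any.
        return len(self.pending) < self.max_outstanding

    def send(self, job, request_pairs):
        if not self.can_accept():
            raise RuntimeError("too many outstanding requests")
        self.pending[request_pairs[self.match_attribute]] = job

    def receive(self, answer_pairs):
        # Answers may arrive in any order; the attribute value finds the job.
        return self.pending.pop(answer_pairs[self.match_attribute])


channel = AsyncChannel(max_outstanding=2, match_attribute="Example-Id")
channel.send("job-a", {"Example-Id": 1})
channel.send("job-b", {"Example-Id": 2})
print(channel.receive({"Example-Id": 2}))  # answered out of order
```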
The difference from the 'classic' Unix server design is that the server doesn't
spawn a child for each request as it comes in; instead, it pre-spawns a pool of
children that keep running and communicates with them through pipes. This
removes the overhead of creating and destroying processes; see Apache's FastCGI
module for another example of this type of interface.
2. The interface
When the server is started, it creates two pipes for each subprocess,
one for sending requests to it and one for receiving answers, and spawns
the module as a child of the main process.
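A small Python sketch of that arrangement: one pipe towards the child's
file descriptor 0 and one back from its file descriptor 1. The stand-in
child simply echoes text lines, which is not one of the real message
formats, and the helper names are invented for this example. Unlike the
real server, the sketch blocks while waiting for each answer.

```python
import subprocess
import sys

# Stand-in module: echoes each line read on fd 0 back on fd 1.
CHILD_SOURCE = (
    "import sys\n"
    "for line in sys.stdin:\n"
    "    sys.stdout.write('echo ' + line)\n"
    "    sys.stdout.flush()\n"
)


def spawn_module():
    """Start a long-running child with a request pipe and an answer pipe."""
    return subprocess.Popen(
        [sys.executable, "-c", CHILD_SOURCE],
        stdin=subprocess.PIPE,    # requests: parent -> child
        stdout=subprocess.PIPE,   # answers: child -> parent
        text=True,
    )


def call(child, line):
    """Send one request and wait for its answer (blocking, unlike the server)."""
    child.stdin.write(line + "\n")
    child.stdin.flush()
    return child.stdout.readline().rstrip("\n")


child = spawn_module()
first = call(child, "one")    # the same child serves both requests
second = call(child, "two")
child.stdin.close()           # end of input lets the stand-in child exit
child.wait()
```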
Each subprocess is associated with a certain interface. You could have
an interface to a certain database, which uses three subprocesses; two of
them may connect to the same database, and one may connect to another
database which contains the same information, but that just happens to
run on another machine.
It doesn't matter; they're all part of the same interface because they
all give the same answer when asked the same question. The reason you'd
want to run multiple subprocesses is for load sharing among database
connections and, as said above, to be able to have multiple outstanding
requests on the same interface, even though each simple subprocess can
handle only one request at a time.
After a child is started, it waits for a request to come in on file
descriptor 0, and answers it on file descriptor 1. It must keep running
indefinitely, or until it is terminated by a signal from the server.
Children that terminate unexpectedly are automatically restarted after a
short interval, to avoid consuming unnecessary resources if something
causes the subprocess to immediately exit after starting.
2.1. Interface calls
When an interface is called by the behaviour expression, the server
builds a message from the list of A/V pairs currently on the job's
request list. (This list initially contains all information from and
about the original request, but attributes may have been added, changed
or removed by the behaviour expression.)
If no send ACL is specified in the configuration file for the interface,
the request message will include all attributes present on the request
list; otherwise it will only have the ones listed in the ACL.
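The filtering rule can be sketched in a few lines of Python. The function
name and the representation of pairs as (name, value) tuples are
assumptions made for the example; the read ACL described further on
applies the same rule to answers.

```python
def apply_acl(pairs, acl=None):
    """Select the A/V pairs to put in a message.

    acl=None stands for 'no ACL configured': every pair is included.
    Otherwise only pairs whose attribute is listed in the ACL are kept.
    """
    if acl is None:
        return list(pairs)
    allowed = set(acl)
    return [(attr, value) for attr, value in pairs if attr in allowed]


request_list = [("User-Name", "alice"), ("NAS-Port", 7)]
print(apply_acl(request_list))                  # no ACL: everything is sent
print(apply_acl(request_list, ["User-Name"]))   # ACL: only listed attributes
```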
If the interface has a channel to a subprocess that's currently idle (the
server tries to find one in round-robin fashion), that subprocess takes
the job immediately: the server puts the message in the channel's sending
ring buffer and puts the channel in the 'sending' state. From then on,
each time the operating system indicates that data can be sent through
the pipe to that process, the server sends as much from the ring as the
pipe can take, until the whole message has been transmitted to the
subprocess.
The request, with its associated state (together called a job here), is
put on a receive queue associated with that subprocess, so that when
a (partial) answer arrives from it, the server knows which question the
answer is for. Certain modules, such as the RADIUS client, may want to
answer module queries out of order, and in those cases a configurable
attribute is used to match queries and answers.
In case no channel to a subprocess was found idle, the job is put on a shared
queue, which is associated with the interface, not with any particular
subprocess. Then, as soon as the server finishes sending a request message to
any of that interface's subprocesses, the server checks the shared queue of the
interface the subprocess belongs to for pending jobs. If there are any, it
immediately takes one off and hands it to the subprocess that has become idle.
This is more efficient than selecting the subprocess that gets the
message as soon as the interface is called, because it allows a
subprocess that happens to finish its work first to immediately continue
doing useful work. Thus, no process ever sits idle if there are any jobs
waiting for the interface it belongs to.
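The queueing policy described above can be sketched as follows. This is a
simplified model rather than the server's implementation: channels are
plain names, the idle list doubles as the round-robin order, and the
class and method names are hypothetical.

```python
from collections import deque


class Interface:
    """Hands jobs to idle channels, or parks them on a shared queue."""

    def __init__(self, channel_names):
        self.idle = deque(channel_names)  # channels with nothing to do
        self.shared_queue = deque()       # jobs waiting for any channel
        self.assigned = []                # (channel, job) in hand-out order

    def call(self, job):
        if self.idle:
            # An idle subprocess takes the job immediately.
            self.assigned.append((self.idle.popleft(), job))
        else:
            # Nobody is idle: queue on the interface, not on one channel.
            self.shared_queue.append(job)

    def channel_idle(self, channel):
        # Whichever channel frees up first takes the oldest waiting job.
        if self.shared_queue:
            self.assigned.append((channel, self.shared_queue.popleft()))
        else:
            self.idle.append(channel)


iface = Interface(["a", "b"])
for job in ("j1", "j2", "j3"):
    iface.call(job)
iface.channel_idle("b")   # b finishes first, so b gets j3
print(iface.assigned)
```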
As said, the server never waits for a single specific external event before
doing anything else. All external I/O is multiplexed. After preparing a
message to be sent to a subprocess and attempting the first write, the server
waits for whichever comes first: a new request coming in, another (part of an)
answer arriving from any subprocess, a subprocess that it is sending data to
becoming ready to receive more, and so on.
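A minimal Python illustration of this kind of multiplexing, using the
standard selectors module on Unix pipes. Two pipes stand in for answers
coming from two subprocesses; the labels and helper names are made up,
and the real server also watches for writability and for network
requests, which the sketch leaves out.

```python
import os
import selectors

sel = selectors.DefaultSelector()


def watch(fd, label):
    """Register a non-blocking descriptor for read readiness."""
    os.set_blocking(fd, False)
    sel.register(fd, selectors.EVENT_READ, label)


def poll_once(timeout=1.0):
    """Return (label, data) for every source that has data right now."""
    ready = []
    for key, _events in sel.select(timeout):
        ready.append((key.data, os.read(key.fd, 4096)))
    return ready


r1, w1 = os.pipe()
r2, w2 = os.pipe()
watch(r1, "module-1")
watch(r2, "module-2")

# Only the second source has something to say; the first is never waited on.
os.write(w2, b"partial answer")
events = poll_once()
print(events)
```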
When an answer has fully arrived from a subprocess, the server gets the
job that was waiting for this answer from that subprocess' pending job
list and adds the A/V pairs from the answer to the reply list of the
job. If no read ACL is defined for the interface, all pairs in the
answer are added; otherwise only the ones listed in the ACL are.
Then, the server continues processing the behaviour expression on the
job until it is done or until it makes another interface call.
2.2. Requests from modules
Besides receiving requests as RADIUS packets from the network or, when running
as a module itself, as binary interface messages from stdin, the server may
also receive requests as A/V pairs from modules.
This way, changed requests can be processed from the top as if a real request
was received, or RADIUS or EAP requests decrypted by an EAP-TLS module can be
processed by the same databases and logic used by other requests.
Modules that need this feature may only send RADIUS requests to the module
interface after receiving a request from the main server, never untriggered.
The server distinguishes a module's requests from its responses using the
message header, which is only available when the binary interface is used.
3. Message layout
Before looking at this, it helps to first understand how the server
handles information about a request. See the file
doc/packet_handling for more details about attribute spaces, fixed
fields, etc.; see doc/language for more information about manipulating
the request and reply lists.
The important things to keep in mind are: a. that *all* information that
is available to the behaviour expression and the subprocesses is represented
as attribute/value pairs; b. that each job has two lists of those A/V pairs
associated with it: the request list and the reply list; c. that a full
attribute specification consists of an attribute space number, a vendor number
and an attribute number; and d. that values can have various types: integer,
string, IP address or date.
As explained earlier, when an interface call is made, the server transmits
the job's current request list in a specially formatted message to
one of the interface's subprocesses. The subprocess prepares its answer
in the same format, and upon receipt, the server adds the A/V pairs
that are present in the answer to the job's reply list.
These messages can either be in binary format, with length specifiers
for all variable-length fields to facilitate modules written in C, Java
or C++; or in ASCII, with field and record separators to facilitate
modules written in text-oriented languages such as Perl, awk, or even
an ordinary Unix shell.
3.1. Binary messages
Binary-type interface messages are used if you don't specify the 'Ascii'
flag for an interface. They are formatted as follows:
Magic:
    a value in network order (MSB first), used to detect framing errors
    and to distinguish module replies from module requests. As of 0.9.11,
    the value is 0xbeefdead for requests (server to module or module to
    server) and 0xdeadbeef for responses (module to server or server to
    module); previously the value was always 0xdeadbeef.
Length:
    the total length of the message, incl. the Magic and Length
    fields, in network order.
A/V pairs:
    a list of attribute/value pairs. Each pair starts at an even
    4-octet boundary, and is formatted as follows:
    Space:
        number that identifies the attribute's space, in network
        order. See doc/packet_handling and the dictionary for some
        more information about spaces.
    Vendor:
        the vendor's Private Enterprise Code as assigned by the IANA
        if the attribute is vendor-specific, or 0 if it's a standard
        attribute. In network order.
    Attribute:
        the attribute number, in network order.
    Length:
        the value's real length, i.e. the number of 8-bit characters
        in a string or the number of significant bytes in an ordinal
        value, in network order.
        If you round this length up to a multiple of 4 octets and add
        16 octets for the value's own offset, you get the total
        length for this A/V pair entry - i.e. the relative offset of
        the next pair.
    Value:
        is formatted depending on the value's type:
        string: any number of octets as specified by the Length
            field, padded on the right up to a multiple of 4 octets.
        ordinal: a value sized between 1 and 4 octets inclusive
            (between 1 and 8 octets inclusive on 64 bit platforms),
            as specified by the Length field, zero-padded on the left
            to a multiple of 4 octets, in network order.
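A minimal Python sketch of an encoder for this layout. It assumes that
every header field (the message's magic value and length, and each
pair's space, vendor, attribute and length) is a 32-bit unsigned integer,
which is consistent with the 16-octet offset mentioned above but is not
stated explicitly, and that padding octets are zero. Ordinals are always
written with a length of 4. The function names and the sample space and
attribute numbers are made up for illustration.

```python
import struct

MAGIC_REQUEST = 0xBEEFDEAD    # requests, as of 0.9.11
MAGIC_RESPONSE = 0xDEADBEEF   # responses


def round_up4(n):
    """Round n up to a multiple of 4 with one add and one and."""
    return (n + 3) & ~3


def encode_pair(space, vendor, attribute, value):
    """Encode one A/V pair; ints are ordinals, bytes are strings."""
    if isinstance(value, int):
        raw = struct.pack("!I", value)   # 4 significant octets, network order
        padded = raw
    else:
        raw = bytes(value)
        padded = raw + b"\0" * (round_up4(len(raw)) - len(raw))  # pad right
    header = struct.pack("!IIII", space, vendor, attribute, len(raw))
    return header + padded


def encode_message(magic, pairs):
    """Prefix the encoded pairs with the magic value and total length."""
    body = b"".join(encode_pair(*pair) for pair in pairs)
    return struct.pack("!II", magic, 8 + len(body)) + body


message = encode_message(MAGIC_REQUEST, [(0, 0, 1, b"alice"), (0, 0, 5, 7)])
print(len(message), message.hex())
```

The 5-octet string occupies 16 + 8 octets and the ordinal 16 + 4, so with
the 8-octet message header the total length comes to 52 octets.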
If the server detects any anomaly in a message from a subprocess, such
as a wrong magic number, an A/V pair that would extend past the number
of bytes specified by the message length field, or anything else that
would indicate a framing error, it will terminate and restart the
subprocess, as that's the only way to re-synchronise the stream.
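The kind of checking involved can be sketched in Python as follows. This
is not the server's code: it assumes 32-bit header fields as implied by
the layout above, checks only the anomalies named in the text, and uses
invented names. Values are returned as the padded octets together with
their real length, since telling a string from an ordinal requires the
dictionary.

```python
import struct

VALID_MAGICS = (0xBEEFDEAD, 0xDEADBEEF)


class FramingError(ValueError):
    """The stream can no longer be trusted to be in sync."""


def decode_message(data):
    """Return (magic, pairs); each pair is (space, vendor, attr, length, octets)."""
    if len(data) < 8:
        raise FramingError("message shorter than its header")
    magic, total = struct.unpack_from("!II", data, 0)
    if magic not in VALID_MAGICS:
        raise FramingError("wrong magic number")
    if total < 8 or total > len(data):
        raise FramingError("impossible message length")
    pairs = []
    offset = 8
    while offset < total:
        if offset + 16 > total:
            raise FramingError("truncated pair header")
        space, vendor, attr, length = struct.unpack_from("!IIII", data, offset)
        # Round the value length up to a multiple of 4 and add the 16-octet
        # pair header to find where the next pair starts.
        next_offset = offset + 16 + ((length + 3) & ~3)
        if next_offset > total:
            raise FramingError("pair extends past the message length")
        pairs.append((space, vendor, attr, length, data[offset + 16:next_offset]))
        offset = next_offset
    return magic, pairs
```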
In most cases, this indicates a bug in the subprocess, so if the server
logs that it restarted the subprocess because of an invalid received
message, it is recommended to take a good look at the subprocess' code.
The alignment specifications above may seem a bit complex, but
they make traversing the list and accessing the ordinal values much
faster. In the end, it only takes an 'add' and an 'and' operation to
round up the lengths when traversing the list. This is much cheaper than
accessing non-aligned values, especially on non-Intel platforms.
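From a module's point of view, the binary interface then comes down to a
simple blocking loop: read a header, read the rest of the message, answer.
The Python sketch below assumes 32-bit Magic and Length fields, treats the
pairs as an opaque block of octets, and returns at end of input, which is
a choice made for the example. The names read_exact, serve and handle are
hypothetical.

```python
import struct

MAGIC_RESPONSE = 0xDEADBEEF


def read_exact(stream, count):
    """Block until exactly count octets have been read."""
    data = b""
    while len(data) < count:
        chunk = stream.read(count - len(data))
        if not chunk:
            raise EOFError("stream closed")
        data += chunk
    return data


def serve(inp, out, handle):
    """Answer requests one at a time; handle maps request pairs to answer pairs."""
    while True:
        try:
            header = read_exact(inp, 8)
        except EOFError:
            return
        _magic, total = struct.unpack("!II", header)
        request_pairs = read_exact(inp, total - 8)
        answer_pairs = handle(request_pairs)   # may block as long as needed
        out.write(struct.pack("!II", MAGIC_RESPONSE, 8 + len(answer_pairs)))
        out.write(answer_pairs)
        out.flush()
```

A real module would call this as serve(sys.stdin.buffer,
sys.stdout.buffer, handler), so that requests arrive on file descriptor 0
and answers leave on file descriptor 1.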
3.2. ASCII messages
ASCII messages are selected when the 'Ascii' interface flag is
specified. Their format can be influenced to a large extent by other
flags, so that the subprocesses need to do as little text mangling as
possible before being able to do useful work.
The messages look like this (optional components are enclosed by square
brackets, angle brackets denote ASCII control characters):
Thus, A/V pairs are separated by <LF> characters, and an empty pair
indicates the end of the message.
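The framing rule alone is easy to sketch in Python. The function below
only splits a stream into messages and leaves the pair lines unparsed; it
assumes the terminating empty pair is a completely empty line. The
function name and the sample pair lines are illustrative only.

```python
import io


def read_ascii_message(stream):
    """Collect pair lines up to the empty line that ends the message."""
    lines = []
    while True:
        line = stream.readline()
        if line == "":
            # readline() returns '' only at end of input.
            raise EOFError("stream ended before the terminating empty line")
        line = line.rstrip("\n")
        if line == "":
            return lines
        lines.append(line)


stream = io.StringIO('User-Name = "alice"\nNAS-Port = 7\n\n')
print(read_ascii_message(stream))
```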
The appearance of the leading <TAB> is controlled by the Add-Tab
interface flag; similarly, the appearance of the spaces around the
equals sign is controlled by the Add-Spaces flag.
The space and vendor names that qualify the attribute are present by
default; to turn them off, use the Short-Attr flag. The data type
specification is only present if you specify the Add-Type flag.
The individual fields have the following meaning and format:
Space:
    the name of the attribute space as associated with its
    number by the dictionary. See doc/packet_handling.
Vendor:
    the name of the vendor as specified by the dictionary for
    this vendor's Private Enterprise Code.
Attribute:
    the name of this attribute as specified by the dictionary.
Type:
    one of the type names 'Integer', 'String', 'IPAddr' or
    'Date' that indicate the value's data type.
Value:
    formatted according to the interface flags and the value's
    data type.
When the 'Hex-Value' flag is set for this interface, all value fields
are written out in hexadecimal, regardless of their actual data type.
This makes parsing by modules extremely easy, but if the module has to
output any attributes in a human-oriented form, it may not be the best
option.
If the Hex-Value flag is not set, the format for the Value field is as
follows, depending on the data type:
Integers:
    If the flag 'Named-Const' is set for this interface, and
    the value corresponds to one of the constants defined in the
    dictionary for this attribute, the constant's name is used.
    In all other cases, the value is written as a decimal number.
Strings:
    Sent by the server between double quotes; non-printable
    characters (<32 or >126), single and double quotes that are part
    of the string itself and backslashes are all written using hex
    escapes, as in \x3f. If the flag 'Double-Backslash' is set, the
    backslash is output twice, to better accommodate shell script
    modules.
    The server is a bit more liberal in what it accepts when receiving
    pairs: the quotes are optional for strings that do not contain
    whitespace; quotes may be single (') as well as double ("); single
    quotes don't have to be escaped in double quoted strings and vice
    versa, and in addition to hex escapes (\xff) the following C-style
    escapes are supported:
        \n - linefeed
        \r - carriage return
        \NNN - character with octal value NNN
        \C - the character C, without the backslash.
    These rules are the same as those used in the behaviour language
    for string-type terms, and are documented in some more detail
    here.
IP addrs:
    Written in standard dotted decimal notation, as in 1.2.3.4.
Dates:
    Written in decimal as the number of seconds since January 1, 1970.
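The output rules for the four data types can be sketched in Python as
follows. This mirrors the description above rather than the server's
code; the function names are invented, hex escapes are written in lower
case as in the \x3f example, and the constant name 'Framed' stands in for
whatever the dictionary defines.

```python
import ipaddress


def escape_string(raw, double_backslash=False):
    """Quote a byte string as described for strings sent by the server."""
    backslash = "\\\\" if double_backslash else "\\"
    out = []
    for byte in raw:
        # Non-printable octets, both quote characters and the backslash
        # are written as hex escapes; everything else is copied as-is.
        if byte < 32 or byte > 126 or byte in b"'\"\\":
            out.append("%sx%02x" % (backslash, byte))
        else:
            out.append(chr(byte))
    return '"' + "".join(out) + '"'


def format_integer(value, constants=None):
    """constants maps values to names; pass None when Named-Const is off."""
    if constants and value in constants:
        return constants[value]
    return str(value)


def format_ipaddr(value):
    """Dotted decimal notation for a 32-bit address given as an int."""
    return str(ipaddress.IPv4Address(value))


def format_date(seconds_since_epoch):
    """Seconds since January 1, 1970, in decimal."""
    return str(int(seconds_since_epoch))


print(escape_string(b'say "hi"'))
print(format_integer(2, {2: "Framed"}), format_ipaddr(0x01020304))
```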