Conventions
This section describes the coding conventions adopted by the Code Craftsmen project. These standards should be respected when making contributions to our Code Crafting tools.
Python Code
Style
A formal style guide for Code Craftsmen Python code has not been defined yet, but the Google Python Style Guide is worth a look as a starting point. Until a style guide is defined, please look at existing code and try to match the style as closely as possible.
Embedded Documentation
The API documentation for our Python packages is extracted from docstrings embedded in the code using Sphinx. The docstrings are formatted according to the Google docstring style convention with PEP 484 type annotations. The autodoc, napoleon, and sphinx_autodoc_typehints Sphinx extensions are used to extract and format the API documentation. See this Google docstring example to get a feel for this docstring format.
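As a rough illustration of this docstring convention, the sketch below documents a small function in the Google style and leaves the types to PEP 484 annotations. The function name `scale_values` and its parameters are hypothetical and are not taken from any Code Craftsmen package.

```python
def scale_values(values: list[float], factor: float = 1.0) -> list[float]:
    """Multiply every value in a list by a constant factor.

    Types are given by the annotations in the signature, so they are
    not repeated in the Args and Returns sections below.

    Args:
        values: The numbers to scale.
        factor: The multiplier applied to each number.

    Returns:
        A new list containing each input value multiplied by ``factor``.

    Raises:
        ValueError: If ``values`` is empty.
    """
    if not values:
        raise ValueError("values must not be empty")
    return [value * factor for value in values]
```

Keeping the types only in the signature avoids having to update them in two places when a function changes.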
C++ Code
Style
A formal style guide for Code Craftsmen C++ code has not been defined yet, but the Google C++ Style Guide is worth a look as a starting point. Until a style guide is defined, please look at existing code and try to match the style as closely as possible.
Embedded Documentation
The API documentation for our C++ packages is extracted from special comment blocks embedded in the source code using Doxygen and Sphinx.
C Code
Style
In general, C code should follow the Linux kernel coding style.
Embedded Documentation
The API documentation for our C packages is extracted from special comment blocks embedded in the source code using the kernel-doc Sphinx extension.
Text Files
Configuration, input, and source information should be specified in a human-readable text format, if it is practical to do so. A Wumps-based syntax should be employed for these purposes unless there is a very good reason to use something else. This commonality reduces the number of formats that software developers need to deal with and allows parsing code to be shared. More rationale for these recommendations can be found in the file format considerations section.
Binary Data Files
In some cases, such as those where data processing or storage efficiency is a significant concern, it may be more appropriate to store data in binary files instead of text files. In these cases, it is also desirable to use some sort of standardized format, if possible. The best format to use may depend on the particular application.
As discussed in the data recording considerations section, there is often a need to record or play back the high-density run-time message stream traffic generated by an application. Since messaging is such an integral part of our code crafting paradigm, we have developed our own standardized file format for storing binary message stream data.
Binary Message Stream Recording Format
A binary message stream file is simply a sequence of file entries.
Entry 1 | Entry 2 | […] | Entry N |
Each entry in the file consists of three sequential fields: an entry code identifying the type of entry, the entry size (in bytes), and the entry content.
Entry Code | Entry Size | Entry Content |
The sizes of these three fields are not fixed, but may vary with each entry. This minimizes the storage space “overhead” of the file format without introducing additional restrictions.
Standardizing the format of file entries allows data processing applications to skip over entries that are not of interest without having to understand the details of those specific entries.
Entry Code
The first field of a file entry is the entry code, which is used to indicate what type of data the following entry content field contains. The minimum size of the entry code field is one byte, and its size is determined by examining the content of this field as it is read in. If the most significant bit (bit 7) of the first byte of the entry code is 0, then the entry code field is only one byte long.
MSb | 0XXXXXXX | LSb |
If the most significant bit of the first byte is 1, then the entry code field is more than one byte long, and the next byte must be examined. The same process is applied to each consecutive byte until a 0 is found in the most significant bit. In theory, an entry code could be up to 4 bytes long, but more than one or two bytes should not be required.
Entry Code Length | Bytes in the Order Read (each shown MSb to LSb) |
---|---|
2 Bytes | 1XXXXXXX 0XXXXXXX |
3 Bytes | 1XXXXXXX 1XXXXXXX 0XXXXXXX |
4 Bytes | 1XXXXXXX 1XXXXXXX 1XXXXXXX 0XXXXXXX |
Once the number of bytes in the entry code has been determined, the individual bytes are combined in little-endian byte order to construct the final entry code field value.
Byte 3 (MSB) | Byte 2 | Byte 1 | Byte 0 (LSB) | 32-Bit Result (Hex) |
---|---|---|---|---|
N/A | N/A | N/A | 0XXXXXXX | 000000XX |
N/A | N/A | 0XXXXXXX | 1XXXXXXX | 0000XXXX |
N/A | 0XXXXXXX | 1XXXXXXX | 1XXXXXXX | 00XXXXXX |
0XXXXXXX | 1XXXXXXX | 1XXXXXXX | 1XXXXXXX | XXXXXXXX |
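The decoding procedure described above can be sketched in Python as follows. This is a minimal illustration rather than a reference implementation: the function name `read_entry_code` and the choice of exceptions are assumptions made for this example. As in the table, the continuation bits are kept as part of the resulting value.

```python
import io
from typing import BinaryIO

MAX_ENTRY_CODE_BYTES = 4  # upper bound given in the text


def read_entry_code(stream: BinaryIO) -> int:
    """Read a variable-length entry code and return it as an integer."""
    value = 0
    for index in range(MAX_ENTRY_CODE_BYTES):
        chunk = stream.read(1)
        if not chunk:
            raise EOFError("stream ended inside an entry code")
        byte = chunk[0]
        # Little-endian: the first byte read is the least significant byte.
        value |= byte << (8 * index)
        # A 0 in bit 7 marks the last byte of the entry code.
        if (byte & 0x80) == 0:
            return value
    raise ValueError("entry code is longer than 4 bytes")


# One-byte code: bit 7 of the first byte is 0.
single = read_entry_code(io.BytesIO(b"\x10"))
# Two-byte code: 0x81 has bit 7 set, so 0x02 is read as the next byte.
double = read_entry_code(io.BytesIO(b"\x81\x02"))
```

Here `single` is `0x10` and `double` is `0x0281`, since the second byte read becomes the more significant byte of the result.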
Entry Size
The second field of a file entry is the entry size, which indicates the number of bytes present in the following entry content field. The entry size field is one, two, four, or eight bytes long, depending on the values of bits 5 and 6 in the first byte (byte 0, the LSB) of the preceding entry code field.
Entry Code Byte 0 (LSB), Shown MSb to LSb | Length of Entry Size Field |
---|---|
X00XXXXX | 1 Byte |
X01XXXXX | 2 Bytes |
X10XXXXX | 4 Bytes |
X11XXXXX | 8 Bytes |
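The mapping from bits 5 and 6 to the length of the entry size field can be expressed compactly, as in the sketch below. The function names are hypothetical. Note that this section does not state the byte order of the entry size field itself; the sketch assumes unsigned little-endian, by analogy with the entry code field, and that assumption should be checked against real data files.

```python
import io
from typing import BinaryIO


def entry_size_field_length(entry_code: int) -> int:
    """Return the length in bytes (1, 2, 4, or 8) of the entry size field."""
    selector = (entry_code >> 5) & 0b11  # bits 6 and 5 of entry code byte 0
    return 1 << selector  # 0b00 -> 1, 0b01 -> 2, 0b10 -> 4, 0b11 -> 8


def read_entry_size(stream: BinaryIO, entry_code: int) -> int:
    """Read the entry size field that follows the given entry code."""
    length = entry_size_field_length(entry_code)
    raw = stream.read(length)
    if len(raw) != length:
        raise EOFError("stream ended inside an entry size field")
    # ASSUMPTION: the size is stored as an unsigned little-endian integer.
    return int.from_bytes(raw, byteorder="little", signed=False)


# Entry code 0x30 is 00110000 in binary: bits 6 and 5 are 01, so the
# entry size field is two bytes long.
size = read_entry_size(io.BytesIO(b"\x34\x12"), 0x30)
```

Under the little-endian assumption, `size` is `0x1234` in this example.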
Entry Content
The third (and final) field of a file entry is the entry content, which is the actual data associated with the file entry. The number of bytes present in the entry content field is equal to the value of the preceding entry size field, and the type of data contained in the content field is indicated by the value of the preceding entry code field. The following table lists the types of file entries that are currently defined.
Entry Code (Shown MSb to LSb) | Entry Content |
---|---|
0XX00000 | File Format ID |
0XX00001 | Source Application ID |
0XX10000 | Message Content |
0XX10001 | Message Content with 1-Byte Stream ID |
0XX10010 | Message Content with 2-Byte Stream ID |
0XX10011 | Message Content with 4-Byte Stream ID |
All other entry codes are reserved for future use.
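Because bits 5 and 6 only select the length of the entry size field, a reader can mask them off before looking up the entry type. The following sketch shows one way to do this; the names `ENTRY_TYPES` and `entry_type_name` are illustrative only.

```python
from typing import Optional

SIZE_SELECTOR_MASK = 0b0110_0000  # bits 5 and 6, the "XX" in the table

# Entry codes from the table, with the size selector bits cleared.
ENTRY_TYPES = {
    0b0000_0000: "File Format ID",
    0b0000_0001: "Source Application ID",
    0b0001_0000: "Message Content",
    0b0001_0001: "Message Content with 1-Byte Stream ID",
    0b0001_0010: "Message Content with 2-Byte Stream ID",
    0b0001_0011: "Message Content with 4-Byte Stream ID",
}


def entry_type_name(entry_code: int) -> Optional[str]:
    """Return the entry type name, or None for a reserved entry code."""
    return ENTRY_TYPES.get(entry_code & ~SIZE_SELECTOR_MASK)
```

A reader that receives `None` can still skip the entry by using its entry size field, which is how entries of unknown or reserved types are meant to be passed over.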
File Format ID
The content field for a File Format ID entry contains a null-terminated string that specifies the file format used in the data file being read. Files that conform to the Binary Message Stream Recording Format described in this document shall specify a File Format ID string that begins with the characters CCBMSRF. Other characters may follow these leading characters. In the future, a suffix (e.g. r1.2) may be added to indicate a specific revision of the file format.
The first entry in a CCBMSRF data file must always be a File Format ID entry. This allows data processing applications to quickly determine if a specified data file is encoded in a supported format.
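As a sketch of how such an entry might be produced and checked, the example below builds a File Format ID entry with a one-byte entry size field and validates the content of one. The function names are hypothetical, and the sketch assumes that the string is stored as single-byte ASCII characters, which this section does not state explicitly.

```python
FORMAT_ID_PREFIX = b"CCBMSRF"  # ASSUMPTION: characters are stored as ASCII
FILE_FORMAT_ID_CODE = 0x00     # 0XX00000 with XX = 00 (1-byte entry size)


def make_file_format_id_entry(suffix: bytes = b"") -> bytes:
    """Build a complete File Format ID entry (code, size, content)."""
    content = FORMAT_ID_PREFIX + suffix + b"\x00"  # null-terminated string
    if len(content) > 0xFF:
        raise ValueError("content does not fit a 1-byte entry size field")
    return bytes([FILE_FORMAT_ID_CODE, len(content)]) + content


def is_supported_format(content: bytes) -> bool:
    """Check the content field of a File Format ID entry."""
    text, terminator, _ = content.partition(b"\x00")
    return terminator == b"\x00" and text.startswith(FORMAT_ID_PREFIX)


entry = make_file_format_id_entry()
```

With no suffix, `entry` is the ten bytes `b"\x00\x08CCBMSRF\x00"`: the entry code, an entry size of 8, and the eight-byte null-terminated string.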
Source Application ID
The content field for a Source Application ID entry contains a null-terminated string that specifies the application that was used to generate the data file being read. This string is typically used by data processing applications to look up the message stream identifiers and message formats associated with the data in the file.
A source application string is typically of the form /company/division/group/application[revision] in order to reduce name collisions.
Message Content
The content field for a Message Content entry contains an “opaque” block of raw data. This is usually the content of a single message, but could, in theory, be any type of data. Since there is no stream identifier specified for the message, this type of file entry is typically used in data files where only one message stream is recorded.
Message Content with Stream ID
The content field for a Message Content with Stream ID entry contains two subfields: a stream identifier field followed by a message content field.
Stream ID | Message Content |
The size of the stream identifier subfield is determined by bits 0 and 1 of the entry code field:
Entry Code (Shown MSb to LSb) | Size of Stream ID Subfield |
---|---|
0XX10001 | 1 Byte |
0XX10010 | 2 Bytes |
0XX10011 | 4 Bytes |
The message content subfield is an “opaque” raw data block. It is usually the content of a single message, but could, in theory, be any type of data. The preceding stream identifier subfield indicates which data stream the message is associated with. A data processing application typically uses the stream identifier in conjunction with the Source Application ID string to uniquely identify the source and format for the associated message data. The size of the message content subfield is the value of the entry size field minus the size of the stream identifier subfield.
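Splitting the entry content into its two subfields might look like the sketch below. The function name `split_stream_content` is hypothetical. The byte order of the stream identifier is not specified in this section, so the sketch assumes unsigned little-endian.

```python
# Bits 1 and 0 of the entry code select the stream identifier size.
STREAM_ID_SIZES = {0b01: 1, 0b10: 2, 0b11: 4}

# Clears bits 6 and 5 (entry size selector) and bits 1 and 0 (stream ID size).
TYPE_MASK = ~0b0110_0011
MESSAGE_CONTENT_BASE = 0b0001_0000


def split_stream_content(entry_code: int, content: bytes) -> tuple[int, bytes]:
    """Split entry content into (stream identifier, message content)."""
    id_size = STREAM_ID_SIZES.get(entry_code & 0b11)
    if id_size is None or (entry_code & TYPE_MASK) != MESSAGE_CONTENT_BASE:
        raise ValueError("not a Message Content with Stream ID entry")
    if len(content) < id_size:
        raise ValueError("entry content is shorter than the stream identifier")
    # ASSUMPTION: the stream identifier is an unsigned little-endian integer.
    stream_id = int.from_bytes(content[:id_size], byteorder="little")
    # The message content is whatever remains after the stream identifier.
    return stream_id, content[id_size:]


# Entry code 0x32 is 00110010 in binary: a 2-byte stream identifier.
stream_id, message = split_stream_content(0x32, b"\x34\x12abc")
```

In this example `stream_id` is `0x1234` under the little-endian assumption, and `message` is the remaining three bytes, `b"abc"`, which matches the entry size (5) minus the stream identifier size (2).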