Conventions

This section describes the coding conventions adopted by the Code Craftsmen project. These standards should be respected when making contributions to our Code Crafting tools.

Python Code

Style

A formal style guide for Code Craftsmen Python code has not been defined yet, but the Google Python Style Guide is worth a look as a starting point. Until a style guide is defined, please look at existing code and try to match the style as closely as possible.

Embedded Documentation

The API documentation for our Python packages is extracted from docstrings embedded in the code using Sphinx. The docstrings are formatted according to the Google docstring style convention with PEP 484 type annotations. The autodoc, napoleon, and sphinx_autodoc_typehints Sphinx extensions are used to extract and format the API documentation. See this Google docstring example to get a feel for this docstring format.
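As an illustration of the format described above, the following sketch shows a Google-style docstring on a function with PEP 484 annotations. The function `scale_samples` and its parameters are hypothetical and are not part of any Code Craftsmen package. Because the types are carried by the annotations, the docstring does not repeat them.

```python
from typing import List


def scale_samples(samples: List[float], gain: float = 1.0) -> List[float]:
    """Scale each sample by a constant gain.

    Args:
        samples: Input sample values.
        gain: Multiplier applied to every sample.

    Returns:
        A new list containing the scaled samples.

    Raises:
        ValueError: If ``gain`` is negative.
    """
    if gain < 0:
        raise ValueError("gain must be non-negative")
    return [s * gain for s in samples]
```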

C++ Code

Style

A formal style guide for Code Craftsmen C++ code has not been defined yet, but the Google C++ Style Guide is worth a look as a starting point. Until a style guide is defined, please look at existing code and try to match the style as closely as possible.

Embedded Documentation

The API documentation for our C++ packages is extracted from special comment blocks embedded in the source code using Doxygen and Sphinx.

C Code

Style

In general, C code should follow the Linux kernel coding style.

Embedded Documentation

The API documentation for our C packages is extracted from special comment blocks embedded in the source code using the kernel-doc Sphinx extension.

Text Files

Configuration, input, and source information should be specified in a human-readable text format, if it is practical to do so. A Wumps-based syntax should be employed for these purposes unless there is a very good reason to use something else. This commonality reduces the number of formats that software developers need to deal with and allows parsing code to be shared. More rationale for these recommendations can be found in the file format considerations section.

Binary Data Files

In some cases, such as those where data processing or storage efficiency is a significant concern, it may be more appropriate to store data in binary files instead of text files. In these cases, it is also desirable to use some sort of standardized format, if possible. The best format to use may depend on the particular application.

As discussed in the data recording considerations section, there is often a need to record or play back the high-density run-time message stream traffic generated by an application. Since messaging is such an integral part of our code crafting paradigm, we have developed our own standardized file format for storing binary message stream data.

Binary Message Stream Recording Format

A binary message stream file is simply a sequence of file entries.

File Content
------------
Entry 1
Entry 2
[…]
Entry N

Each entry in the file consists of three sequential fields: an entry code identifying the type of entry, the entry size (in bytes), and the entry content.

Entry
-----
Entry Code    Entry Size    Entry Content

The sizes of these three fields are not fixed, but may vary with each entry. This minimizes the storage space “overhead” of the file format without introducing additional restrictions.

Standardizing the format of file entries allows data processing applications to skip over entries that are not of interest without having to understand the details of those specific entries.
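A minimal sketch of how a reader might skip entries it does not care about, using the field encodings described in the sections that follow. The function name `iter_entries` is hypothetical, and the little-endian interpretation of the entry size value is an assumption, since this document does not state the byte order of that field. Handling of truncated files is kept deliberately minimal.

```python
import io
from typing import BinaryIO, Iterator, Set, Tuple


def iter_entries(stream: BinaryIO, wanted: Set[int]) -> Iterator[Tuple[int, bytes]]:
    """Yield (entry_code, content) for wanted entry types; seek past the rest."""
    while True:
        first = stream.read(1)
        if not first:
            return  # clean end of file between entries
        raw = bytearray(first)
        # Keep reading code bytes while the most significant bit is set.
        while raw[-1] & 0x80:
            if len(raw) == 4:
                raise ValueError("entry code longer than 4 bytes")
            nxt = stream.read(1)
            if not nxt:
                raise EOFError("file ended inside an entry code")
            raw += nxt
        code = int.from_bytes(raw, "little")
        # Bits 5 and 6 of code byte 0 select a 1, 2, 4, or 8 byte size field.
        size_len = 1 << ((raw[0] >> 5) & 0x3)
        # ASSUMPTION: the size value itself is stored little-endian.
        size = int.from_bytes(stream.read(size_len), "little")
        # Compare with bits 5 and 6 cleared so the size selector is ignored.
        if (code & ~0x60) in wanted:
            yield code, stream.read(size)
        else:
            stream.seek(size, io.SEEK_CUR)  # skip without parsing the content


# Usage: keep only Message Content entries (type bits 0x10).
data = b"\x00\x08CCBMSRF\x00" + b"\x10\x03abc" + b"\x30\x02\x00hi"
messages = list(iter_entries(io.BytesIO(data), wanted={0x10}))
```

The skipped File Format ID entry is never decoded; only its code and size fields are read.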

Entry Code

The first field of a file entry is the entry code, which is used to indicate what type of data the following entry content field contains. The minimum size of the entry code field is one byte, and its size is determined by examining the content of this field as it is read in. If the most significant bit (bit 7) of the first byte of the entry code is 0, then the entry code field is only one byte long.

Single-Byte Entry Code

MSb 0XXXXXXX LSb

If the most significant bit of the first byte is 1, then the entry code field is more than one byte long, and the next byte must be examined. The same process is applied to each consecutive byte until a byte with a 0 in its most significant bit is found; that byte is the last byte of the entry code. In theory, an entry code could be up to 4 bytes long, but in practice one or two bytes should be sufficient.

Two-Byte Entry Code

MSb 1XXXXXXX LSb    MSb 0XXXXXXX LSb

Three-Byte Entry Code

MSb 1XXXXXXX LSb    MSb 1XXXXXXX LSb    MSb 0XXXXXXX LSb

Four-Byte Entry Code

MSb 1XXXXXXX LSb    MSb 1XXXXXXX LSb    MSb 1XXXXXXX LSb    MSb 0XXXXXXX LSb

Once the number of bytes in the entry code has been determined, the individual bytes are combined in little-endian byte order to construct the final entry code field value.

Resulting Entry Code Field Value

Byte 3 (MSB)  Byte 2    Byte 1    Byte 0 (LSB)  32-Bit Result (Hex)
------------  --------  --------  ------------  -------------------
N/A           N/A       N/A       0XXXXXXX      000000XX
N/A           N/A       0XXXXXXX  1XXXXXXX      0000XXXX
N/A           0XXXXXXX  1XXXXXXX  1XXXXXXX      00XXXXXX
0XXXXXXX      1XXXXXXX  1XXXXXXX  1XXXXXXX      XXXXXXXX
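The decoding procedure described above can be sketched as follows. The function name `read_entry_code` is hypothetical. The sketch keeps the continuation bits in the resulting value, which is how the result table appears to define it, and it rejects codes longer than the 4-byte maximum mentioned earlier.

```python
import io
from typing import BinaryIO

MAX_CODE_BYTES = 4  # upper bound stated in the format description


def read_entry_code(stream: BinaryIO) -> int:
    """Read a variable-length entry code and return its combined value."""
    raw = bytearray()
    while True:
        chunk = stream.read(1)
        if not chunk:
            raise EOFError("stream ended inside an entry code")
        raw.append(chunk[0])
        if (chunk[0] & 0x80) == 0:
            break  # most significant bit clear: this was the last byte
        if len(raw) == MAX_CODE_BYTES:
            raise ValueError("entry code longer than 4 bytes")
    # The first byte read is byte 0 (LSB), so combine in little-endian order.
    return int.from_bytes(raw, "little")


one_byte = read_entry_code(io.BytesIO(b"\x10"))
two_byte = read_entry_code(io.BytesIO(b"\x90\x01"))
```

Here `one_byte` is `0x10` and `two_byte` is `0x0190`, since the first byte read becomes the least significant byte of the result.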

Entry Size

The second field of a file entry is the entry size, which indicates the number of bytes present in the following entry content field. The entry size field is either one, two, four, or eight bytes in length, depending on the values of bits 5 and 6 of byte 0 (the least significant byte) of the preceding entry code field.

Entry Code Byte 0 (LSB)  Length of Entry Size Field
-----------------------  --------------------------
MSb X00XXXXX LSb         1 Byte
MSb X01XXXXX LSb         2 Bytes
MSb X10XXXXX LSb         4 Bytes
MSb X11XXXXX LSb         8 Bytes
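The table above maps directly onto a shift and a mask, as the following sketch shows. Both function names are hypothetical. `encode_entry` picks the smallest size field that can hold the content length, which is a design choice of this sketch rather than a requirement of the format, and it writes the size value in little-endian order, which is an assumption because the byte order of the entry size field is not stated in this document.

```python
def size_field_length(entry_code: int) -> int:
    """Return the length in bytes of the entry size field (1, 2, 4, or 8)."""
    return 1 << ((entry_code >> 5) & 0x3)


def encode_entry(kind: int, content: bytes) -> bytes:
    """Build one entry for a single-byte entry type with bits 5-7 clear."""
    if kind & ~0x1F:
        raise ValueError("kind must fit in bits 0 to 4 of the entry code")
    # Try the 1, 2, 4, and 8 byte size fields in turn; keep the first that fits.
    for selector in range(4):
        size_len = 1 << selector
        if len(content) < (1 << (8 * size_len)):
            break
    else:
        raise ValueError("content too large for an 8-byte size field")
    code = kind | (selector << 5)  # store the selector in bits 5 and 6
    # ASSUMPTION: the size value is written little-endian.
    return bytes([code]) + len(content).to_bytes(size_len, "little") + content


short_entry = encode_entry(0x10, b"abc")      # 1-byte size field
long_entry = encode_entry(0x10, bytes(300))   # needs a 2-byte size field
```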

Entry Content

The third (and final) field of a file entry is the entry content, which is the actual data associated with the file entry. The number of bytes present in the entry content field is equal to the value of the preceding entry size field, and the type of data contained in the content field is indicated by the value of the preceding entry code field. The following table lists the types of file entries that are currently defined.

Entry Code         Entry Content
----------------   -------------------------------------
MSb 0XX00000 LSb   File Format ID
MSb 0XX00001 LSb   Source Application ID
MSb 0XX10000 LSb   Message Content
MSb 0XX10001 LSb   Message Content with 1-Byte Stream ID
MSb 0XX10010 LSb   Message Content with 2-Byte Stream ID
MSb 0XX10011 LSb   Message Content with 4-Byte Stream ID

All other entry codes are reserved for future use.
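Because bits 5 and 6 only select the size field length, an entry type can be identified by clearing those two bits and looking up what remains. The sketch below illustrates this; the names `ENTRY_KINDS` and `entry_kind` are hypothetical.

```python
# Entry codes from the table above, with bits 5 and 6 (the XX bits) cleared.
ENTRY_KINDS = {
    0x00: "File Format ID",
    0x01: "Source Application ID",
    0x10: "Message Content",
    0x11: "Message Content with 1-Byte Stream ID",
    0x12: "Message Content with 2-Byte Stream ID",
    0x13: "Message Content with 4-Byte Stream ID",
}


def entry_kind(entry_code: int) -> str:
    """Name the entry type, ignoring the size selector in bits 5 and 6."""
    # Multi-byte codes keep bit 7 set in byte 0, so they never match a key
    # and fall through to "Reserved", as do all other undefined codes.
    return ENTRY_KINDS.get(entry_code & ~0x60, "Reserved")
```

For example, `entry_kind(0x00)` and `entry_kind(0x60)` both name the File Format ID entry, differing only in the length of their size fields.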

File Format ID

The content field for a File Format ID entry contains a null-terminated string that specifies the file format used in the data file being read. Files that conform to the Binary Message Stream Recording Format described in this document shall specify a File Format ID string that begins with the characters CCBMSRF. Other characters may follow these leading characters. In the future, a suffix (e.g., r1.2) may be added to indicate a specific revision of the file format.

The first entry in a CCBMSRF data file must always be a File Format ID entry. This allows data processing applications to quickly determine if a specified data file is encoded in a supported format.
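A data processing application might validate the content of that first entry along the following lines. This is only a sketch: the function name `parse_file_format_id` is hypothetical, and decoding the string as ASCII is an assumption, since the character encoding is not specified here.

```python
FORMAT_PREFIX = "CCBMSRF"


def parse_file_format_id(content: bytes) -> str:
    """Return the File Format ID string, or raise if it is not CCBMSRF."""
    text, terminator, _ = content.partition(b"\x00")
    if not terminator:
        raise ValueError("File Format ID is not null-terminated")
    # ASSUMPTION: the identifier is plain ASCII text.
    ident = text.decode("ascii")
    if not ident.startswith(FORMAT_PREFIX):
        raise ValueError("unsupported file format: " + repr(ident))
    return ident


format_id = parse_file_format_id(b"CCBMSRF\x00")
```

Checking only the prefix leaves room for any characters that follow the leading CCBMSRF.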

Source Application ID

The content field for a Source Application ID entry contains a null-terminated string that specifies the application that was used to generate the data file being read. This string is typically used by data processing applications to look up the message stream identifiers and message formats associated with the data in the file.

A source application string is typically of the form /company/division/group/application[revision] in order to reduce name collisions.

Message Content

The content field for a Message Content entry contains an “opaque” block of raw data. This is usually the content of a single message, but could, in theory, be any type of data. Since there is no stream identifier specified for the message, this type of file entry is typically used in data files where only one message stream is recorded.

Message Content with Stream ID

The content field for a Message Content with Stream ID entry contains two subfields: a stream identifier field followed by a message content field.

Message Content w/ Stream ID Subfield
-------------------------------------
Stream ID    Message Content

The size of the stream identifier subfield is determined by bits 0 and 1 of the entry code field:

Entry Code         Size of Stream ID Subfield
----------------   --------------------------
MSb 0XX10001 LSb   1 Byte
MSb 0XX10010 LSb   2 Bytes
MSb 0XX10011 LSb   4 Bytes

The message content subfield is an “opaque” raw data block. It is usually the content of a single message, but could, in theory, be any type of data. The preceding stream identifier subfield indicates which data stream the message is associated with. A data processing application typically uses the stream identifier in conjunction with the Source Application ID string to uniquely identify the source and format for the associated message data. The size of the message content subfield is the value of the entry size field minus the size of the stream identifier subfield.
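The split between the two subfields can be sketched as follows. The function name `split_stream_content` is hypothetical, and reading the stream identifier as a little-endian unsigned integer is an assumption, because the byte order of that subfield is not stated in this document.

```python
from typing import Tuple


def split_stream_content(entry_code: int, content: bytes) -> Tuple[int, bytes]:
    """Split entry content into (stream_id, message_content)."""
    # Ignore the size selector in bits 5 and 6 when checking the entry type.
    if (entry_code & ~0x60) not in (0x11, 0x12, 0x13):
        raise ValueError("not a Message Content with Stream ID entry")
    # Bits 0 and 1 hold 01, 10, or 11 for a 1, 2, or 4 byte stream ID.
    id_len = 1 << ((entry_code & 0x3) - 1)
    if len(content) < id_len:
        raise ValueError("entry content shorter than the stream ID subfield")
    # ASSUMPTION: the stream ID is stored little-endian.
    stream_id = int.from_bytes(content[:id_len], "little")
    # Everything after the stream ID is the opaque message content.
    return stream_id, content[id_len:]


stream_id, message = split_stream_content(0x12, b"\x34\x12hi")
```

The length of the returned message content equals the entry size minus the stream identifier length, matching the rule stated above.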