===========
Conventions
===========

This section describes the coding conventions adopted by the Code Craftsmen
project. These standards should be respected when making contributions to our
Code Crafting tools.

Python Code
===========

Style
-----

A formal style guide for Code Craftsmen Python code has not been defined yet,
but the `Google Python Style Guide`_ is worth a look as a starting point. Until
a style guide is defined, please look at existing code and try to match the
style as closely as possible.

.. _documenting-python-code:

Embedded Documentation
----------------------

The API documentation for our Python packages is extracted from `docstrings`_
embedded in the code using Sphinx. The docstrings are formatted according to
the `Google docstring`_ style convention with `PEP 484`_ type annotations. The
`autodoc`_, `napoleon`_, and `sphinx_autodoc_typehints`_ Sphinx extensions are
used to extract and format the `API documentation`_. See this
`Google docstring example`_ to get a feel for this docstring format.

C++ Code
========

Style
-----

A formal style guide for Code Craftsmen C++ code has not been defined yet, but
the `Google C++ Style Guide`_ is worth a look as a starting point. Until a
style guide is defined, please look at existing code and try to match the style
as closely as possible.

Embedded Documentation
----------------------

The API documentation for our C++ packages is extracted from
`special comment blocks`_ embedded in the source code using Doxygen and Sphinx.

C Code
======

Style
-----

In general, C code should follow the `Linux kernel coding style`_.

Embedded Documentation
----------------------

The API documentation for our C packages is extracted from special comment
blocks embedded in the source code using the `kernel-doc`_ Sphinx extension.

Text Files
==========

Configuration, input, and source information should be specified in a
human-readable text format, if it is practical to do so.
A Wumps-based syntax should be employed for these purposes unless there is a
very good reason to use something else. This commonality reduces the number of
formats that software developers need to deal with and allows parsing code to
be shared. More rationale for these recommendations can be found in the file
format considerations section.

Binary Data Files
=================

In some cases, such as those where data processing or storage efficiency is a
significant concern, it may be more appropriate to store data in binary files
instead of text files. In these cases, it is also desirable to use some sort of
standardized format, if possible. The best format to use may depend on the
particular application.

As discussed in the data recording considerations section, there is often a
need to record or play back the high-density run-time message stream traffic
generated by an application. Since messaging is such an integral part of our
code crafting paradigm, we have developed our own standardized file format for
storing binary message stream data.

Binary Message Stream Recording Format
--------------------------------------

A binary message stream file is simply a sequence of file **entries**.

.. table:: File Content

   +---------+
   | Entry 1 |
   +---------+
   | Entry 2 |
   +---------+
   | [...]   |
   +---------+
   | Entry N |
   +---------+

Each entry in the file consists of three sequential fields: an **entry code**
identifying the type of entry, the **entry size** (in bytes), and the
**entry content**.

.. table:: Entry

   +---------------+
   | Entry Code    |
   +---------------+
   | Entry Size    |
   +---------------+
   | Entry Content |
   +---------------+

The sizes of these three fields are not fixed, but may vary with each entry.
This minimizes the storage space "overhead" of the file format without
introducing additional restrictions.
Standardizing the format of file entries allows data processing applications to
skip over entries that are not of interest without having to understand the
details of those specific entries.

Entry Code
~~~~~~~~~~

The first field of a file entry is the entry code, which is used to indicate
what type of data the following entry content field contains. The minimum size
of the entry code field is one byte, and its size is determined by examining
the content of this field as it is read in.

If the most significant bit (bit 7) of the first byte of the entry code is
``0``, then the entry code field is only one byte long.

.. table:: Single-Byte Entry Code

   +-----+----------+-----+
   | MSb | 0XXXXXXX | LSb |
   +-----+----------+-----+

If the most significant bit of the first byte is ``1``, then the entry code
field is more than one byte long, and the next byte must be examined. The same
process is applied to each consecutive byte until a ``0`` is found in the most
significant bit. In theory, an entry code could be up to 4 bytes long, but more
than one or two bytes should not be required.

.. table:: Two-Byte Entry Code

   +-----+----------+-----+
   | MSb | 1XXXXXXX | LSb |
   +-----+----------+-----+
   | MSb | 0XXXXXXX | LSb |
   +-----+----------+-----+

.. table:: Three-Byte Entry Code

   +-----+----------+-----+
   | MSb | 1XXXXXXX | LSb |
   +-----+----------+-----+
   | MSb | 1XXXXXXX | LSb |
   +-----+----------+-----+
   | MSb | 0XXXXXXX | LSb |
   +-----+----------+-----+

.. table:: Four-Byte Entry Code

   +-----+----------+-----+
   | MSb | 1XXXXXXX | LSb |
   +-----+----------+-----+
   | MSb | 1XXXXXXX | LSb |
   +-----+----------+-----+
   | MSb | 1XXXXXXX | LSb |
   +-----+----------+-----+
   | MSb | 0XXXXXXX | LSb |
   +-----+----------+-----+

Once the number of bytes in the entry code has been determined, the individual
bytes are combined in little-endian byte order to construct the final entry
code field value.

.. table:: Resulting Entry Code Field Value

   +----------+----------+----------+----------+--------------+
   | Byte 3   | Byte 2   | Byte 1   | Byte 0   | 32-Bit       |
   | (MSB)    |          |          | (LSB)    | Result (Hex) |
   +==========+==========+==========+==========+==============+
   | N/A      | N/A      | N/A      | 0XXXXXXX | 000000XX     |
   +----------+----------+----------+----------+--------------+
   | N/A      | N/A      | 0XXXXXXX | 1XXXXXXX | 0000XXXX     |
   +----------+----------+----------+----------+--------------+
   | N/A      | 0XXXXXXX | 1XXXXXXX | 1XXXXXXX | 00XXXXXX     |
   +----------+----------+----------+----------+--------------+
   | 0XXXXXXX | 1XXXXXXX | 1XXXXXXX | 1XXXXXXX | XXXXXXXX     |
   +----------+----------+----------+----------+--------------+

Entry Size
~~~~~~~~~~

The second field of a file entry is the entry size, which indicates the number
of bytes present in the following entry content field. The entry size field is
either one, two, four, or eight bytes in length, depending on the values of
bits 5 and 6 in the preceding entry code field.

+-----+-------------------------+-----+----------------------------+
|     | Entry Code Byte 0 (LSB) |     | Length of Entry Size Field |
+=====+=========================+=====+============================+
| MSb | X00XXXXX                | LSb | 1 Byte                     |
+-----+-------------------------+-----+----------------------------+
| MSb | X01XXXXX                | LSb | 2 Bytes                    |
+-----+-------------------------+-----+----------------------------+
| MSb | X10XXXXX                | LSb | 4 Bytes                    |
+-----+-------------------------+-----+----------------------------+
| MSb | X11XXXXX                | LSb | 8 Bytes                    |
+-----+-------------------------+-----+----------------------------+

Entry Content
~~~~~~~~~~~~~

The third (and final) field of a file entry is the entry content, which is the
actual data associated with the file entry. The number of bytes present in the
entry content field is equal to the value of the preceding entry size field,
and the type of data contained in the content field is indicated by the value
of the preceding entry code field.
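
As a rough illustration of the three fields described above, the following
Python sketch reads entries from a binary stream. The function names
(``read_entry_code``, ``size_field_length``, ``read_entries``) are hypothetical
and are not part of any Code Craftsmen package. The byte order of the entry
size field is not stated in this document; the sketch assumes little-endian,
matching the entry code.

```python
import io
from typing import BinaryIO, Iterator, Optional, Tuple


def read_entry_code(stream: BinaryIO) -> Optional[int]:
    """Read a 1- to 4-byte entry code; return None at a clean end of file."""
    raw = bytearray()
    while True:
        byte = stream.read(1)
        if not byte:
            if raw:
                raise EOFError("stream ended inside an entry code")
            return None
        raw += byte
        if (byte[0] & 0x80) == 0:
            # MSb clear: this was the last byte of the entry code.
            # Bytes are combined little-endian; the MSb flags stay in the
            # value, as in the "Resulting Entry Code Field Value" table.
            return int.from_bytes(raw, "little")
        if len(raw) == 4:
            raise ValueError("entry code is longer than 4 bytes")


def size_field_length(entry_code: int) -> int:
    """Map bits 5 and 6 of entry code byte 0 to 1, 2, 4, or 8 bytes."""
    return 1 << ((entry_code >> 5) & 0b11)


def read_entries(stream: BinaryIO) -> Iterator[Tuple[int, bytes]]:
    """Yield (entry_code, entry_content) pairs until the stream ends."""
    while True:
        code = read_entry_code(stream)
        if code is None:
            return
        size_len = size_field_length(code)
        size_raw = stream.read(size_len)
        if len(size_raw) != size_len:
            raise EOFError("stream ended inside an entry size field")
        # ASSUMPTION: the entry size field is little-endian.
        size = int.from_bytes(size_raw, "little")
        content = stream.read(size)
        if len(content) != size:
            raise EOFError("stream ended inside an entry content field")
        yield code, content


# Two hand-built entries: a 1-byte size field, then a 2-byte size field.
sample = bytes([0x10, 0x02]) + b"hi" + bytes([0x30, 0x03, 0x00]) + b"abc"
for code, content in read_entries(io.BytesIO(sample)):
    print(f"code=0x{code:02X} content={content!r}")
```

Because every entry carries its own size, a reader that is not interested in a
particular entry could call ``stream.seek(size, io.SEEK_CUR)`` on a seekable
stream instead of reading the content.
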
The following table lists the types of file entries that are currently defined.

+-----+------------+-----+-----------------------+
|     | Entry Code |     | Entry Content         |
+=====+============+=====+=======================+
| MSb | 0XX00000   | LSb | File Format ID        |
+-----+------------+-----+-----------------------+
| MSb | 0XX00001   | LSb | Source Application ID |
+-----+------------+-----+-----------------------+
| MSb | 0XX10000   | LSb | Message Content       |
+-----+------------+-----+-----------------------+
| MSb | 0XX10001   | LSb | Message Content with  |
|     |            |     | 1-Byte Stream ID      |
+-----+------------+-----+-----------------------+
| MSb | 0XX10010   | LSb | Message Content with  |
|     |            |     | 2-Byte Stream ID      |
+-----+------------+-----+-----------------------+
| MSb | 0XX10011   | LSb | Message Content with  |
|     |            |     | 4-Byte Stream ID      |
+-----+------------+-----+-----------------------+

All other entry codes are reserved for future use.

File Format ID
~~~~~~~~~~~~~~

The content field for a ``File Format ID`` entry contains a null-terminated
string that specifies the file format used in the data file being read. Files
that conform to the *Binary Message Stream Recording Format* described in this
document shall specify a ``File Format ID`` string that begins with the
characters ``CCBMSRF``. Other characters may follow these leading characters.
In the future, a suffix (e.g. ``r1.2``) may be added to indicate a specific
revision of the file format.

**The first entry in a** ``CCBMSRF`` **data file must always be a**
``File Format ID`` **entry.** This allows data processing applications to
quickly determine if a specified data file is encoded in a supported format.

Source Application ID
~~~~~~~~~~~~~~~~~~~~~

The content field for a ``Source Application ID`` entry contains a
null-terminated string that specifies the application that was used to generate
the data file being read.
This string is typically used by data processing applications to look up the
message stream identifiers and message formats associated with the data in the
file. A source application string is typically of the form
``/company/division/group/application[revision]`` in order to reduce name
collisions.

Message Content
~~~~~~~~~~~~~~~

The content field for a ``Message Content`` entry contains an "opaque" block of
raw data. This is usually the content of a single message, but could, in
theory, be any type of data. Since there is no stream identifier specified for
the message, this type of file entry is typically used in data files where only
one message stream is recorded.

Message Content with Stream ID
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

The content field for a ``Message Content with Stream ID`` entry contains two
subfields: a stream identifier field followed by a message content field.

.. table:: Message Content with Stream ID Subfields

   +-----------------+
   | Stream ID       |
   +-----------------+
   | Message Content |
   +-----------------+

The size of the stream identifier subfield is determined by bits 0 and 1 of the
entry code field:

+-----+------------+-----+----------------------------+
|     | Entry Code |     | Size of Stream ID Subfield |
+=====+============+=====+============================+
| MSb | 0XX10001   | LSb | 1 Byte                     |
+-----+------------+-----+----------------------------+
| MSb | 0XX10010   | LSb | 2 Bytes                    |
+-----+------------+-----+----------------------------+
| MSb | 0XX10011   | LSb | 4 Bytes                    |
+-----+------------+-----+----------------------------+

The message content subfield is an "opaque" raw data block. It is usually the
content of a single message, but could, in theory, be any type of data. The
preceding stream identifier subfield indicates which data stream the message is
associated with.
A data processing application typically uses the stream identifier in
conjunction with the ``Source Application ID`` string to uniquely identify the
source and format for the associated message data. The size of the message
content subfield is the value of the entry size field minus the size of the
stream identifier subfield.

.. _Google Python Style Guide: https://google.github.io/styleguide/pyguide.html
.. _docstrings: https://www.python.org/dev/peps/pep-0287/
.. _Google docstring: https://google.github.io/styleguide/pyguide.html#38-comments-and-docstrings
.. _PEP 484: https://www.python.org/dev/peps/pep-0484/
.. _autodoc: https://www.sphinx-doc.org/en/master/usage/extensions/autodoc.html
.. _napoleon: https://www.sphinx-doc.org/en/master/usage/extensions/napoleon.html
.. _sphinx_autodoc_typehints: https://pypi.org/project/sphinx-autodoc-typehints/
.. _API documentation: https://www.sphinx-doc.org/en/master/usage/quickstart.html#autodoc
.. _Google docstring example: https://www.sphinx-doc.org/en/master/usage/extensions/example_google.html#example-google
.. _Google C++ Style Guide: https://google.github.io/styleguide/cppguide.html
.. _special comment blocks: https://www.doxygen.org/manual/docblocks.html
.. _Linux kernel coding style: https://www.kernel.org/doc/html/latest/process/coding-style.html
.. _kernel-doc: https://www.kernel.org/doc/html/latest/doc-guide/kernel-doc.html
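
The content layouts described for ``File Format ID`` and
``Message Content with Stream ID`` entries can be sketched in Python as
follows. The helper names (``is_ccbmsrf``, ``split_message_entry``) are
hypothetical, chosen only for this illustration. The byte order of the stream
identifier is not stated in this document, so the sketch assumes little-endian,
and it compares the format string as raw ASCII bytes.

```python
from typing import Optional, Tuple


def is_ccbmsrf(content: bytes) -> bool:
    """Check a File Format ID content field for the ``CCBMSRF`` prefix."""
    # The string is null-terminated; ignore anything after the first NUL.
    # ASSUMPTION: the identifier is stored as plain ASCII bytes.
    text = content.split(b"\x00", 1)[0]
    return text.startswith(b"CCBMSRF")


def split_message_entry(entry_code: int,
                        content: bytes) -> Tuple[Optional[int], bytes]:
    """Split a message entry into (stream_id, message_content).

    stream_id is None for a plain ``Message Content`` entry.
    """
    # Ignore bits 5 and 6 (entry size length) and bits 0 and 1 (stream ID
    # size); what remains must be exactly bit 4 for codes 0XX100nn.
    if (entry_code & ~0b01100011) != 0b00010000:
        raise ValueError("not a message content entry")
    id_len = (0, 1, 2, 4)[entry_code & 0b11]
    if len(content) < id_len:
        raise ValueError("content is shorter than the stream ID subfield")
    if id_len == 0:
        return None, content
    # ASSUMPTION: the stream identifier is little-endian.
    stream_id = int.from_bytes(content[:id_len], "little")
    # Message size = entry size minus the size of the stream ID subfield.
    return stream_id, content[id_len:]


print(is_ccbmsrf(b"CCBMSRF\x00"))
print(split_message_entry(0x11, b"\x07payload"))
```

A data processing application would then pair the returned stream identifier
with the ``Source Application ID`` string to decide how to decode the message.
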