At Code Construct, we mainly work on software and hardware development, which often warrants accompanying documentation. The articles below give technical details on some of our open source development areas.
These documents are available under a Creative Commons BY-SA 4.0 license - you're free to share them with others and modify them as you see fit, as long as you provide attribution and distribute under the same terms. See the CC BY-SA license text for full details.
Maintaining a long-lived series of changes against a fast-moving open source project can be a headache. Ideally we wouldn't put ourselves in this situation - instead we'd work with upstream to merge the changes and find better things to do with life. However, that's not always feasible.
Some of the bigger headaches are caused by merge conflicts. Sometimes conflicts are invisible as they are solved by clever merge strategies. Other times there's been significant rework of the code on both sides and you are forced to demonstrate you know what you're doing.
We have recently been adding support for out-of-band communication with NVMe storage devices to the libnvme and MCTP components of Linux. One of the neat applications of this is managing device firmware - allowing a BMC to update firmware over an out-of-band (OOB) channel, without any intervention from the host system.
The protocol for the firmware updates is specified by the NVM Express standard, so device support is generally pretty good. As long as the device supports NVMe-MI, out-of-band firmware update should work.
We recently built a fully open source Baseboard Management Controller (BMC) implementation, right down to the hardware, and used it to boot and manage an IBM POWER9 AC922 ("witherspoon") server. The AC922 platform is the compute component of the Summit and Sierra supercomputers, which were the two fastest worldwide until mid-2020.
This work was conducted in conjunction with the OpenPOWER foundation, as part of their libreBMC project, and builds on a lot of prior work from IBM and Antmicro.
The Non-Volatile Memory Express (NVMe) standard describes an interface between storage controllers and host hardware. Alongside this, there's also a "management interface" (MI) standard for interacting with NVMe devices over a lighter-weight channel. This typically allows a management controller to discover and manage storage devices over an out-of-band link, usually I²C/SMBus.
Following on from the MCTP introduction, this document describes one of the newer features of the MCTP stack: extended addressing. This allows utilities to directly address specific physical endpoints; for example, when no endpoint IDs have been assigned.
At Code Construct, we have been working on support for the Management Component Transport Protocol (MCTP) on Linux systems, to the point where it's becoming generally useful for production server environments. To help with that, we have put together a few details in this introductory document.