This is now the fifth CI system I have worked with / on, and I am growing tired of not being able to re-use components from the previous systems because of how deeply integrated their components are, and how implementation details leak from one component to another. Additionally, such designs limit the ability of the system to grow: updating one component impacts many others, making it difficult or even impossible to do without rewriting the system or taking it down for multiple hours.
With this new system, I am putting emphasis on designing good interfaces between components in order to create an open source toolbox that CI systems can re-use freely and tailor to their needs, while not painting themselves into a corner.
I aim to blog about all the different components/interfaces we will be making for this test system, but in this article, I would like to start with the basics: proposing design goals, and setting up a machine to be controllable remotely by a test system.
Overall design principles
When designing a test system, it is important to keep in mind that test results need to be:
- Stable: Re-executing the same test should yield the same result;
- Reproducible: The test should be runnable on other machines with the same hardware, and yield the same result.
What this means is that we should use the default configuration as much as possible (no weird setup in CI). Additionally, we need to reduce the amount of state in the system to the absolute minimum. This can be achieved in the following way:
- Power cycle the machine between each test cycle: this helps reset the hardware;
- Go diskless if at all possible, or treat the disk as a cache that can be flushed when testing fails;
- Pre-compute as much as possible outside of the test machine, to reduce the impact of the environment of the machine running the test.
Finally, the machine should not restrict which kernel / Operating System can be loaded for testing. An easy way to achieve this is to use netboot (PXE), which is a common BIOS feature allowing diskless machines to boot from the network.
Converting a machine for testing
Now that we have a pretty good idea about the design principles behind preparing a machine for CI, let’s try to apply them to an actual machine.
Step 1: Powering up the machine remotely
In order to power up, a machine needs both power and a signal to start. The latter is usually provided by a power button, but additional ways exist (non-exhaustive):
- Wake on LAN: An Ethernet frame sent to the network adapter triggers the boot;
- Power on by Mouse/Keyboard: Any activity on the mouse or the keyboard will boot the computer;
- Power on AC: Providing power to the machine will automatically turn it on;
- Timer: Boot at a specified time.
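To make the Wake on LAN trigger more concrete, here is a minimal Python sketch of the "magic packet" it relies on: 6 bytes of 0xFF followed by the target MAC address repeated 16 times. Sending it as a UDP broadcast on port 9 is a common convention rather than a requirement, and the function names, broadcast address, and port are assumptions chosen for illustration.

```python
import socket


def build_magic_packet(mac: str) -> bytes:
    """Return the 102-byte Wake-on-LAN payload for a MAC like 'aa:bb:cc:dd:ee:ff'."""
    hex_digits = mac.replace(":", "").replace("-", "")
    if len(hex_digits) != 12:
        raise ValueError(f"invalid MAC address: {mac!r}")
    mac_bytes = bytes.fromhex(hex_digits)  # raises ValueError on non-hex input
    # 6 bytes of 0xFF, then the MAC address repeated 16 times.
    return b"\xff" * 6 + mac_bytes * 16


def send_wake_on_lan(mac: str, broadcast: str = "255.255.255.255", port: int = 9) -> None:
    """Broadcast the magic packet over UDP (address and port are assumptions)."""
    packet = build_magic_packet(mac)
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.setsockopt(socket.SOL_SOCKET, socket.SO_BROADCAST, 1)
        sock.sendto(packet, (broadcast, port))
```

Calling `send_wake_on_lan("00:11:22:33:44:55")` from a machine on the same network segment as the test machine should be enough to try it out, provided the feature is enabled in the BIOS.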
Unfortunately, none of these triggers can be used to also turn off the machine. The only way to guarantee that a machine will power down and reset its internal state completely is to cut its power supply for a significant amount of time. A safe way to provide/cut power is to use a remotely-switchable Power Distribution Unit (example), or simply a smart plug such as Ikea’s TRÅDFRI. In any case, make sure that you rely on as few services as possible (no cloud!), that you won’t exceed the ratings (voltage, power, and cycles), and that you can read back the state to confirm the command was received. If you opt for industrial PDUs, make sure to check out PDU Gateway, our REST service to control the machines.
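The "cut power, wait, restore power, and read back the state each time" routine could be sketched as follows in Python. The `PduPort` interface, the `FakePort` stand-in, the retry count, and the 30-second off time are all assumptions made for illustration; they do not describe the API of any particular PDU or of PDU Gateway.

```python
import time
from typing import Callable, List, Protocol


class PduPort(Protocol):
    """Hypothetical interface: anything that can switch a port and read it back."""
    def set_state(self, on: bool) -> None: ...
    def get_state(self) -> bool: ...


def set_and_verify(port: PduPort, on: bool, retries: int = 3,
                   settle_s: float = 0.5,
                   sleep: Callable[[float], None] = time.sleep) -> None:
    """Send the command, then read the state back to confirm it was applied."""
    for _ in range(retries):
        port.set_state(on)
        sleep(settle_s)
        if port.get_state() == on:
            return
    raise RuntimeError(f"port did not reach state on={on} after {retries} tries")


def power_cycle(port: PduPort, off_time_s: float = 30.0,
                sleep: Callable[[float], None] = time.sleep) -> None:
    """Cut power long enough for the machine to lose its state, then restore it."""
    set_and_verify(port, False, sleep=sleep)
    sleep(off_time_s)  # assumed duration; tune it for your hardware
    set_and_verify(port, True, sleep=sleep)


class FakePort:
    """In-memory stand-in for a real PDU port, useful for trying the logic."""
    def __init__(self, on: bool = True) -> None:
        self._on = on
        self.history: List[bool] = []

    def set_state(self, on: bool) -> None:
        self._on = on
        self.history.append(on)

    def get_state(self) -> bool:
        return self._on
```

Keeping the read-back inside `set_and_verify` means a lost command shows up as an error in the test system, rather than as a machine that mysteriously kept its state.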
Now that we can reliably cut/provide power, we still need to control the boot signal. The difficulty here is that the signal needs to be received after the machine has received power and initialized enough to handle this event. The easiest solution is to configure the BIOS to boot as soon as power is brought to the computer. This is usually called “Boot on AC”. If your computer does not support this feature, you may want to try the other triggers, or use a microcontroller to press the power button for you when powering up (see the HELP! My machine can’t … Boot on AC section at the end of this article).
Step 2: Net booting
Net booting is quite commonly supported on x86 and ARM bootloaders. On x86 platforms, you can generally find this option in the boot option priorities under the name PXE boot or network boot. You may also need to enable the LAN option ROM, LAN controller, or the UEFI network stack. Reboot, and check that your machine is trying to get an IP!
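One way to check that the machine is asking for an IP is to look at the DHCP requests it broadcasts. The Python sketch below only covers recognising such a request from its raw UDP payload; capturing the packets (for instance by listening on UDP port 67 on the gateway) is left out. It assumes the standard BOOTP/DHCP layout (236-byte fixed header, magic cookie, then options) and that PXE firmware announces itself with a vendor class starting with `PXEClient`. All function names are hypothetical.

```python
from typing import Dict

BOOTP_HEADER_LEN = 236                 # fixed BOOTP header, before the options
DHCP_MAGIC_COOKIE = b"\x63\x82\x53\x63"
OPT_PAD, OPT_MESSAGE_TYPE, OPT_VENDOR_CLASS, OPT_END = 0, 53, 60, 255
DHCPDISCOVER = 1


def parse_dhcp_options(payload: bytes) -> Dict[int, bytes]:
    """Return the DHCP options of a raw UDP payload as {code: value}."""
    start = BOOTP_HEADER_LEN + len(DHCP_MAGIC_COOKIE)
    if len(payload) < start or payload[BOOTP_HEADER_LEN:start] != DHCP_MAGIC_COOKIE:
        raise ValueError("not a DHCP packet")
    options: Dict[int, bytes] = {}
    i = start
    while i < len(payload):
        code = payload[i]
        if code == OPT_END:
            break
        if code == OPT_PAD:            # single-byte padding, no length field
            i += 1
            continue
        if i + 1 >= len(payload):
            raise ValueError("truncated option")
        length = payload[i + 1]
        options[code] = payload[i + 2:i + 2 + length]
        i += 2 + length
    return options


def is_pxe_discover(payload: bytes) -> bool:
    """True if the payload is a DHCPDISCOVER sent by a PXE client."""
    if not payload or payload[0] != 1:  # op field: 1 = BOOTREQUEST
        return False
    try:
        options = parse_dhcp_options(payload)
    except ValueError:
        return False
    return (options.get(OPT_MESSAGE_TYPE) == bytes([DHCPDISCOVER])
            and options.get(OPT_VENDOR_CLASS, b"").startswith(b"PXEClient"))


def client_mac(payload: bytes) -> str:
    """Extract the client hardware address (chaddr starts at byte offset 28)."""
    hlen = payload[2]
    return ":".join(f"{b:02x}" for b in payload[28:28 + hlen])
```

Logging `client_mac()` for every packet where `is_pxe_discover()` is true gives a quick list of the machines that are correctly configured for net booting.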
The next step will be to set up a machine, called Testing Gateway, that will provide a PXE service. This machine should have two network interfaces, one connected to a public network, and one connected to the test machines (through a switch). Setting up this machine will be the subject of an upcoming blog post, but if you are impatient, you may use our valve-infra container.
Step 3: Emulating your screen and keyboard using a serial console
Thanks to the previous steps, we can now boot in any Operating System we want, but we cannot interact with it…
One solution could be to run an SSH server on the Operating System, but until we could connect to it, there would be no way to know what is going on. Instead, we could use an ancient technology, a serial port, to drive a console. This solution is often called “Serial console” and is supported by most Operating Systems. Serial ports come in two types:
- UART: voltage changing between 0 and VCC (TTL signalling), more common in the System-on-Chip (SoC) and microcontrollers world;
- RS-232: voltage changing between a positive and negative voltage, more common in the desktop and datacenter world.
In any case, I suggest you find a serial-to-USB adapter suited to the computer you are trying to connect.
On Linux, using a serial console is relatively simple: just add the following to the kernel command line to get a console on your screen AND over the serial port running at 9600 baud:
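As a rough sketch of what such parameters typically look like, the Python helper below builds them from the standard Linux `console=` kernel parameter. The helper name and the `ttyS0` default (the usual name of the first on-board serial port) are assumptions, so adjust the device name to whatever your machine exposes.

```python
def serial_console_args(device: str = "ttyS0", baud: int = 9600,
                        screen: bool = True) -> str:
    """Build kernel command line arguments for a serial console.

    The device name is an assumption: use the name your serial port
    actually gets on the test machine.
    """
    args = []
    if screen:
        args.append("console=tty0")            # keep a console on the screen
    args.append(f"console={device},{baud}")    # and one on the serial port
    return " ".join(args)


if __name__ == "__main__":
    print(serial_console_args())
```

The kernel uses the last `console=` entry for `/dev/console`, which is why the serial port is listed after the screen here.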
If your machine does not have a serial port but has USB ports, which is more the norm than the exception in the desktop/laptop world, you may want to connect two RS-232-to-USB adapters together, using a Null modem cable:
Test Machine <-> USB <-> RS-232 <-> NULL modem cable <-> RS-232 <-> USB Hub <-> Gateway
And the kernel command line should use ttyUSB0 instead of
Putting it all together
Start by removing the internal battery if it has one (laptops), and any built-in wireless antenna. Then set the BIOS to boot on AC, and use netboot.
Steps for an AMD motherboard:
Steps for an Intel motherboard:
Finally, connect the test machine to the wider infrastructure in this way:
      Internet /   ------------------------------+
    Public network                               |
                                       +---------+--------+                USB
                                       |                  +-----------------------------------+
                                       |      Testing     | Private network                   |
Main power (240 V) ---------+          |      Gateway     +-----------------+                 |
                            |          +---------+--------+                 |                 |
                            |                    | Serial /                 |                 |
                            |                    | Ethernet                 |                 |
                            |                    |                          |                 |
                +-----------+--------------------+--------------+   +-------+--------+   +----+----+
                | Switchable PDU                                |   |   RJ45 switch  |   | USB Hub |
                | Port 0    Port 1        ...            Port N |   |                |   |         |
                +----+------------------------------------------+   +---+------------+   +-+-------+
                     |                                                  |                  |
                Main |                                                  |                  |
                Power|                                                  |                  |
            +--------|--------+                Ethernet                 |                  |
            |                 +-----------------------------------------+   +----+----+    |
            |  Test Machine 1 |            Serial (RS-232 / TTL)            |  Serial |    |
            |                 +---------------------------------------------+  2 USB  +----+ USB
            +-----------------+                                             +---------+
If you managed to do all this, then congratulations, you are set! If you had trouble finding the BIOS parameters, brace yourself, and check out the following section!
HELP! My machine can’t …
If your machine cannot net boot, it’s annoying, but it is super simple to work around. What you need is to install a bootloader that supports PXE on a drive or USB stick. I would recommend you look into SYSLINUX, and Arch Linux’s wiki page about it.
Boot on AC
Well, that’s a bummer, but that’s not the end of the line either if you have some experience dealing with microcontrollers, such as Arduino. Provided you can find the following 4 wires, you should be fine:
- Ground: The easiest to find;
- Power rail: 3.3 or 5V depending on what your controller expects;
- Power LED: A signal that will change when the computer turns on/off;
- Power Switch: A signal to pull-up/down to start the computer.
On desktop PCs, all these wires can be easily found in the motherboard’s manual. For laptops, you’ll need to scour the motherboard for these signals using a multimeter. Pay extra attention when looking for the power rail, as it needs to be able to source enough current for your microcontroller. If you are struggling to find one, look for the VCC pins of some of the chips and you’ll be set.
Next, you’ll just need to figure out what voltage the power LED is at when the machine is ON or OFF. Make sure to check that this voltage is compatible with your microcontroller’s input rating and plug it directly into a GPIO of your microcontroller.
Let’s then do the same work for the power switch, except this time we also need to check how much current will flow through it when it is activated. To do that, just use a multimeter to check how much current is flowing when you connect the two wires of the power switch. Check that this amount of current can be sourced/sunk by the microcontroller, and then connect it to a GPIO.
Finally, we need to find power for the microcontroller that will be present as soon as we plug the machine into the mains. For desktop PCs, you would find this on Pin 9 of the ATX connector. For laptops, you will need to probe the motherboard until you find a pin with a voltage suitable for your microcontroller (5 or 3.3V). However, make sure it is able to source enough current without the voltage dropping below the minimum acceptable VCC of your microcontroller. The best way to make sure of that is to connect this rail to the ground through a ~100 Ohm resistor and check the voltage across the leads of the resistor, and keep on trying until you find a suitable place (took me 3 attempts). Connect your microcontroller’s VCC and ground to these pads.
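The resistor test is plain Ohm's law: a 100 Ohm load draws about 50 mA from a 5V rail and about 33 mA from a 3.3V rail. The small Python sketch below turns a measurement into a yes/no answer. The function names are hypothetical, and the minimum VCC and required current must come from your microcontroller's datasheet.

```python
def load_current_ma(loaded_voltage: float, resistor_ohm: float = 100.0) -> float:
    """Current drawn by the test resistor, in mA (I = V / R)."""
    return loaded_voltage / resistor_ohm * 1000.0


def rail_is_suitable(loaded_voltage: float, min_vcc: float, needed_ma: float,
                     resistor_ohm: float = 100.0) -> bool:
    """Check a candidate rail from the voltage measured across the test resistor.

    The rail passes if, while loaded, it stays at or above the minimum VCC and
    the resistor draws at least as much current as the microcontroller needs.
    """
    draws_enough = load_current_ma(loaded_voltage, resistor_ohm) >= needed_ma
    return loaded_voltage >= min_vcc and draws_enough


if __name__ == "__main__":
    # Hypothetical measurement: a 3.3 V rail that sags to 3.25 V under load,
    # for a microcontroller needing at least 3.0 V and 30 mA (assumed values).
    print(rail_is_suitable(loaded_voltage=3.25, min_vcc=3.0, needed_ma=30.0))
```

If the loaded current is lower than what the microcontroller needs, repeat the measurement with a smaller resistor before trusting the rail.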
The last step will be to edit this Arduino code for your needs, flash it to your microcontroller, and iterate until it works!
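To make the logic easier to reason about before touching hardware, here is a Python model of what such a controller needs to do: if the power LED says the machine is off, press the power switch briefly, then wait for the LED to come on. This is not the Arduino code referred to above. The callbacks, timings, and the `FakeMachine` stand-in are all assumptions for illustration.

```python
import time
from typing import Callable


def boot_machine(read_power_led: Callable[[], bool],
                 set_power_switch: Callable[[bool], None],
                 press_s: float = 0.5, boot_timeout_s: float = 5.0,
                 poll_s: float = 0.1, max_attempts: int = 3,
                 sleep: Callable[[float], None] = time.sleep) -> bool:
    """Press the power switch until the power LED reports the machine is on."""
    for _ in range(max_attempts):
        if read_power_led():
            return True                # already running, do not press again
        set_power_switch(True)         # "press" the button...
        sleep(press_s)
        set_power_switch(False)        # ...and release it
        waited = 0.0
        while waited < boot_timeout_s: # give the machine time to react
            if read_power_led():
                return True
            sleep(poll_s)
            waited += poll_s
    return read_power_led()


class FakeMachine:
    """In-memory stand-in for the GPIOs: turns on when the button is released."""
    def __init__(self) -> None:
        self.on = False
        self.pressed = False
        self.presses = 0

    def read_power_led(self) -> bool:
        return self.on

    def set_power_switch(self, active: bool) -> None:
        if self.pressed and not active:
            self.presses += 1
            self.on = True
        self.pressed = active
```

Checking the LED before pressing matters: on most machines, pressing the power button while the machine is running would request a shutdown instead.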
Here is a photo summary of all the above steps:
Thanks to Arkadiusz Hiler for giving me a couple of these BluePills, as I did not have any microcontroller that would be small-enough to fit in place of a laptop speaker. If you are a novice, I would suggest you pick an Arduino nano instead.
Oh, and if you want to create a board that would be generic-enough for most motherboards, check out the schematics from my 8 year-old blog post about doing just that!
Boot without a battery
So far, I have never heard of any laptop that would completely refuse to boot when disconnecting the battery. The worst I have heard of was that the laptop would take 30s before starting to boot.
Let’s be real though, your time is valuable, and I would suggest you buy/get another laptop. However, if this is the only model you can get, and you really want to test it, then it will be …. juuuuuust fine!
There are multiple options here, depending on how far down the stack you want to go:
- Rework the Embedded Controller (EC) to drop this delay: Applicable when you have access to the EC’s source code, like for Chromebooks;
- Impersonate the battery: Replace the battery with a microcontroller that will respond to the EC’s commands just like the real battery;
- Reuse the battery controller, but replace the battery cells with … capacitors: The fastest way forward, but can be really tricky without some knowledge about dealing with Li-ion cells.
I will not explain what needs to be done in option 1, as this is highly-dependent on your platform of choice, but it is by far the safest and the least hacky.
Option 2 is the next best option if the EC is not open or easy to flash. What you will want is to figure out which lines of the battery’s connector carry I2C, and then attach a protocol analyser to them. Boot the machine, then inspect the logs and try to figure out the pattern. Re-implement as much of it as needed in a microcontroller, until the system boots reliably. Should be a good weekend project!
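A simple starting point for the "figure out the pattern" step is to turn the captured log into a lookup table of command to response, and to list the commands you cannot answer yet. The Python sketch below assumes the trace has already been decoded into (command byte, response bytes) pairs. The data format and function names are hypothetical, and a real battery protocol may need more than a static table.

```python
from typing import Dict, Iterable, List, Optional, Tuple

Transaction = Tuple[int, bytes]  # (command byte sent by the EC, bytes read back)


def build_reply_table(trace: Iterable[Transaction]) -> Dict[int, bytes]:
    """Map each command seen in the trace to the most recent response observed."""
    table: Dict[int, bytes] = {}
    for command, response in trace:
        table[command] = response
    return table


def reply_for(table: Dict[int, bytes], command: int) -> Optional[bytes]:
    """Response the fake battery should send, or None if the command is unknown."""
    return table.get(command)


def unanswered_commands(table: Dict[int, bytes],
                        seen_commands: Iterable[int]) -> List[int]:
    """Commands observed on the bus for which no response has been captured."""
    return sorted(set(seen_commands) - set(table))
```

Once the table is stable across several boots, it can be ported to the microcontroller as a constant array, and `unanswered_commands()` tells you what is still missing when the system refuses to boot.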
Option 3 is by far the hackiest and requires the most skill, even if it is the fastest to implement IF you have an oscilloscope and some super capacitors with a low discharge rate lying around (who doesn’t?). What you’ll need to do is open the battery, rip out the battery cells, and replace them with the super capacitors. They will simulate the battery cells well enough for most controllers, but beware that the controller might not like starting with the capacitors discharged, so you may need to force-charge them to [2, 3.6]V (the range between a fully discharged and a fully-charged battery); a 3.3V power rail is a convenient source for that. Beware that the battery cells might be wired in series, so you should not connect their negative pole to the ground, as it would short one or more cells! In my case, the controller was happy with seeing an empty battery, and it was fun to see the battery go from 50% to 100% in a second when booting :D
That’s all, folks!