28 December 2019

BTRFS filesystem full, now what?

Your mileage may vary, but I try to keep this post updated.

Is your filesystem really full? Mis-balanced metadata and/or data chunks

Below, you'll see how to rebalance data blocks and metadata. If you are unlucky enough to get a filesystem-full error before you can balance, try running this first:
# btrfs balance start -musage=0 /path
# btrfs balance start -dusage=0 /path
A null rebalance will help in some cases; if not, read on.
Also, if you are really unlucky, you might run into a no-more-space error that requires adding a temporary block device to your filesystem so that balance can run. See below for details.
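
For that worst case, the usual trick is to lend the pool a temporary device, balance, and then remove the device again. A minimal sketch (the helper name, scratch file path, and loop device are my own choices, not a fixed recipe; adjust to your setup):

```shell
# Temporarily add a scratch device so balance has room to work, then
# remove it again; btrfs device delete migrates the data back first.
free_stuck_pool() {
    pool=$1                                   # e.g. /mnt/btrfs_pool2
    scratch=/var/tmp/btrfs_scratch.img
    truncate -s 4G "$scratch"
    losetup /dev/loop7 "$scratch"             # assumes loop7 is free
    btrfs device add /dev/loop7 "$pool"
    btrfs balance start -dusage=5 "$pool"
    btrfs device delete /dev/loop7 "$pool"    # must succeed before cleanup!
    losetup -d /dev/loop7
    rm -f "$scratch"
}
# usage (as root): free_stuck_pool /mnt/btrfs_pool2
```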

Pre-emptively rebalancing your filesystem

In an ideal world, btrfs would do this for you, but it does not. I personally recommend you do a rebalance weekly or nightly as part of a btrfs scrub cron job. See the btrfs-scrub script.

Is your filesystem really full? Misbalanced metadata

Unfortunately btrfs has another failure case where the metadata space can fill up. When this happens, even though you have data space left, no new files can be written.
In the example below, you can see Metadata DUP at 9.50GiB out of 10.00GiB. Btrfs keeps 0.5GiB for itself, so metadata is effectively full and prevents new writes.
One suggested fix is to force a full rebalance, and in the example below you can see metadata drop back down to 7.39GiB after it's done. Yes, there again, it would be nice if btrfs did this on its own; it will one day (some of it is in 3.18 already).
Sometimes just using -dusage=0 is enough to rebalance metadata (this is now done automatically in 3.18 and above), but if it's not enough, you'll have to increase the number.
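
When a single pass isn't enough, I just walk the usage cutoff upward until balance manages to free something. A sketch (the helper name and the step values are arbitrary choices of mine):

```shell
# Rebalance with a progressively higher usage cutoff: chunks that are
# at most $pct% full get rewritten and packed together.
rebalance() {
    pool=$1
    for pct in 0 5 10 20 40 60; do
        btrfs balance start -dusage=$pct -musage=$pct "$pool" && return 0
    done
    return 1
}
# usage (as root): rebalance /mnt/btrfs_pool2
```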
# btrfs fi df .
Data, single: total=800.42GiB, used=636.91GiB
System, DUP: total=8.00MiB, used=92.00KiB
System, single: total=4.00MiB, used=0.00
Metadata, DUP: total=10.00GiB, used=9.50GiB
Metadata, single: total=8.00MiB, used=0.00

legolas:/mnt/btrfs_pool2# btrfs balance start -v -dusage=0 /mnt/btrfs_pool2
Dumping filters: flags 0x1, state 0x0, force is off
  DATA (flags 0x2): balancing, usage=0
  Done, had to relocate 91 out of 823 chunks

legolas:/mnt/btrfs_pool2# btrfs fi df .
Data, single: total=709.01GiB, used=603.85GiB
System, DUP: total=8.00MiB, used=88.00KiB
System, single: total=4.00MiB, used=0.00
Metadata, DUP: total=10.00GiB, used=7.39GiB
Metadata, single: total=8.00MiB, used=0.00

Are you using space_cache?

Perhaps you've made a massive copy, like a TB-scale one, with features like space_cache enabled. While space_cache is nice and speeds things up, you'll probably need to empty this cache afterwards. It's easier than you think:

# mount -o remount,clear_cache /path
# sync
# reboot 

26 October 2019

Make the MacBook touchpad work like macOS (a modern and definitive guide)

There are a lot of tutorials about this on the internet; I'll try to be as distribution-agnostic as possible. If your distribution doesn't have the packages I'm talking about, you should search elsewhere or compile them yourself.


  • A working Xorg (not sure if it works with Wayland) and a working DE/WM
  • The mtrack driver


Uninstall whatever touchpad driver you're using (synaptics, for example), install mtrack, and use the following configuration (in /etc/X11/xorg.conf or /etc/X11/xorg.conf.d/10-mtrack.conf):

Section "InputClass"
        MatchIsTouchpad "on"
        Identifier      "Touchpads"
        MatchDevicePath "/dev/input/event*"
        Driver          "mtrack"
        # The faster you move, the more distance pointer will travel, using "polynomial" profile
        Option          "AccelerationProfile" "2"
        # Tweak cursor movement speed with this
        Option          "Sensitivity" "0.08"
        # Pressure at which a finger is detected as a touch
        Option          "FingerHigh" "5"
        # Pressure at which a finger is detected as a release
        Option          "FingerLow" "5"
        # I often use my thumb to press down the physical button, so let's not ignore it
        Option          "IgnoreThumb" "true"
        Option          "ThumbRatio" "70"
        Option          "ThumbSize" "25"
        # Ignore the palm when it takes up more than 30% of the touchpad
        Option          "IgnorePalm" "true"
        Option          "PalmSize" "30"
        # Trigger mouse button when tap: 1 finger - left click, 2 finger - right click, 3 - middle click
        Option          "TapButton1" "1"
        Option          "TapButton2" "3"
        Option          "TapButton3" "2"
        Option          "TapButton4" "0"
        Option          "ClickTime" "25"
        # Disable tap-to-drag, we're using three finger drag instead
        Option          "TapDragEnable" "false"
        # Mouse buttons triggered when physically clicking the touchpad with N fingers
        Option          "ClickFinger1" "1"
        Option          "ClickFinger2" "3"
        Option          "ClickFinger3" "2"
        Option          "ButtonMoveEmulate" "false"
        Option          "ButtonIntegrated" "true"
        # The momentum after scroll fingers released
        Option          "ScrollCoastDuration" "300"
        Option          "ScrollCoastEnableSpeed" ".1"
        # Natural scrolling with two fingers
        Option          "ScrollSmooth" "true"
        Option          "ScrollUpButton" "5"
        Option          "ScrollDownButton" "4"
        Option          "ScrollLeftButton" "6"
        Option          "ScrollRightButton" "7"
        # Tweak scroll sensitivity with ScrollDistance, don't touch ScrollSensitivity
        Option          "ScrollDistance" "250"
        Option          "ScrollClickTime" "10"
        # Three finger drag
        Option          "SwipeDistance" "1"
        Option          "SwipeLeftButton" "1"
        Option          "SwipeRightButton" "1"
        Option          "SwipeUpButton" "1"
        Option          "SwipeDownButton" "1"
        Option          "SwipeClickTime" "0"
        Option          "SwipeSensitivity" "1500"
        # Four finger swipe, 8 & 9 are for browsers navigating back and forth respectively
        Option          "Swipe4LeftButton" "9"
        Option          "Swipe4RightButton" "8"
        # Mouse button >= 10 are not used by Xorg, so we'll map them with xbindkeys and xdotool later
        Option          "Swipe4UpButton" "11"
        Option          "Swipe4DownButton" "10"
        # Mouse buttons triggered by 2-finger pinching gesture
        Option          "ScaleDistance" "300"
        Option          "ScaleUpButton" "12"
        Option          "ScaleDownButton" "13"
        # Mouse buttons triggered by the 2-finger rotating gesture, disabled to enhance the pinch gesture
        Option          "RotateLeftButton" "0"
        Option          "RotateRightButton" "0"
EndSection

By now, your trackpad will be working a lot better. Pay attention to the comments if you need to customize. Now we'll create the mappings for what we still need. You can adjust the modifiers to fit your needs, or just remap them in your DE/WM. Create a $HOME/.xbindkeysrc with this:

# Next Workspace (Swipe4Left, button 9 from the mtrack config)
"xdotool key --clearmodifiers Control_L+Alt_L+Left"
  b:9
# Previous Workspace (Swipe4Right, button 8)
"xdotool key --clearmodifiers Control_L+Alt_L+Right"
  b:8
# Desktop Grid (Swipe4Down, button 10)(F4 key)
"xdotool key --clearmodifiers XF86LaunchA"
  b:10
# Toggle Present Windows (Current Desktop) (Swipe4Up, button 11)(F5 key)
"xdotool key --clearmodifiers XF86LaunchB"
  b:11
# Zoom in (ScaleUp, button 12)
"xdotool key ctrl+21"
  b:12
# Zoom out (ScaleDown, button 13)
"xdotool key ctrl+20"
  b:13

Dispad has a better ignore-touchpad-while-typing behavior than mtrack, so we're using it. Create this configuration at $HOME/.dispad (I know, it could be .dispadrc, but the default is .dispad, so I'll keep it):

# name of the property used to enable/disable the trackpad
property = "Trackpad Disable Input"
# the value used to enable the trackpad
enable = 0
# the value used to disable the trackpad
disable = 1
# whether or not modifier keys disable the trackpad
modifiers = false
# how long (in ms) to sleep between keyboard polls
poll = 48
# how long (in ms) to disable the trackpad after a keystroke
delay = 500

With the files in place, we now need to create a $HOME/.xprofile:

xbindkeys -f $HOME/.xbindkeysrc
dispad -c $HOME/.dispad




26 September 2019

The fragmentation

Since a lot of stupid people love to say that all the time, let me go through your fallacious arguments.

Too many package managers

No, there are not "too many package managers"; this is called choice. You can CHOOSE what to use. Like portage? Use Gentoo. Like apt? Use a Debian-based one. And the list goes on. Don't like rpm? Then use something that doesn't use it and be happy. No one is obligated to use something just because you like it. Having choices is good for everyone, maybe not for you.

Too many desktop managers/window managers

Again, you can choose the one that suits you best. You can even use one whose configuration is done inside the code. It's good to have choices, and there are plenty out there.

Too many init systems

Any decent distribution lets you choose what init you want, and you use whatever is easier/better for you. Even so, for most distributions that use, let's say, systemd, there's an alternative without it.

Too many tools

For what job? Some tools have alternatives, some with a lot of features, but not everyone wants lots of features. For example, I like syslog-ng better as a syslog, but there are simpler ones.

Final thoughts

I understand: people born limited are unable to understand what "having choices" is. Maybe you don't want to choose, maybe you want the world to turn around you, maybe you want everyone to make the exact same choices as you. Maybe you should stick to Windows and stop talking this bullshit everywhere. Or even better, move somewhere where everyone is forced to do the same.

11 September 2019

Recovering a FreeBSD system after whatever problem

If you had any trouble with your install (ndis module, anyone?), there's an easy way to fix it.
  1. Boot your pendrive/cdrom/dvd/whatever with the FreeBSD installer
  2. Enter the shell or start the LiveCD option (the LiveCD is root without any password)
  3. Create a directory where you can import your zfs pool: mkdir /tmp/zroot
  4. Import your zpool there: zpool import -fR /tmp/zroot zroot 
  5. Need access to /? No problem: mkdir /tmp/root && mount -t zfs zroot/ROOT/default /tmp/root
  6. When you're done, unmount everything: zpool export zroot 
  7. Reboot
Messed up your bootloader? No problem (this is for EFI boot):
  1. Check your partition table to see what number is your EFI partition: gpart show
  2. Re-setup your EFI partition: gpart bootcode -p /boot/boot1.efifat -i 1 ada0
  3. Check if ada0p1 is FAT just to be sure: file -s /dev/ada0p1
  4. Set up the bootcode (-i 3 if your ZFS partition is ada0p3): gpart bootcode -b /boot/pmbr -p /boot/gptzfsboot -i 2 ada0
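
Steps 3 to 5 above fit in a tiny rescue helper; a sketch (the helper name is mine, and it assumes the installer's default layout: a pool called zroot with the root dataset at zroot/ROOT/default):

```shell
# Import the pool under an altroot and mount the root dataset so /etc,
# /boot and friends are reachable from the rescue shell.
rescue_mount() {
    pool=${1:-zroot}
    mkdir -p /tmp/zroot /tmp/root
    zpool import -fR /tmp/zroot "$pool" || return 1
    mount -t zfs "$pool/ROOT/default" /tmp/root
}
# when done: umount /tmp/root && zpool export zroot
```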

05 September 2019

Organizing your clusterfuck collection of wallpapers

I have a directory filled with wallpapers that syncs across my devices, so I can use a random wallpaper (usually changing at boot or every 24h, whichever comes first). For the sake of organization, let's get this straight:

First, convert everything to png, because why not?

find . -name "*.jpg" -exec mogrify -format png {} \;

Double-check your files and then delete the remaining JPGs (or whatever format you converted from):

rm *.jpg

Now, let's organize by number:

num=0; for i in *; do mv "$i" "$(printf '%04d' $num).${i#*.}"; ((num++)); done
If you need to add more wallpapers to this directory, remember to change num= to the number of the last wallpaper in the directory plus one.
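
If remembering that number sounds like work, the next index can be computed from the files already present. A sketch (the renumber helper is my own; it assumes existing files already follow the %04d naming):

```shell
#!/bin/bash
# Renumber files in a directory to the NNNN.ext scheme, continuing
# after the highest number already in use instead of starting at 0.
renumber() (
    cd "$1" || return 1
    num=0
    for f in [0-9][0-9][0-9][0-9].*; do
        [ -e "$f" ] || continue            # glob matched nothing
        n=$((10#${f%%.*}))                 # 10# avoids octal surprises
        [ "$n" -ge "$num" ] && num=$((n + 1))
    done
    for f in *; do
        [ -f "$f" ] || continue            # skip directories
        [[ $f == [0-9][0-9][0-9][0-9].* ]] && continue
        mv -n "$f" "$(printf '%04d' "$num").${f##*.}"
        num=$((num + 1))
    done
)
# usage: renumber ~/wallpapers
```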

21 August 2019

Can't unlock KDE Session

If for some reason you're unable to unlock your desktop, the problem is probably the permissions of kcheckpass. You have a few options:

1) Reinstall kde-plasma/kscreenlocker
2) Check the permissions of /usr/lib64/libexec/kcheckpass, it should be 4755 and owned by root:root
3) A more radical solution:

# Screen locker broken in KDE with ConsoleKit
# See https://forums.gentoo.org/viewtopic-t-1046566.html
# and https://forums.gentoo.org/viewtopic-t-1054134.html

# Find which session is locked
session=Session$(ck-list-sessions | grep -B10 "x11-display = ':0" | grep -o -P '(?<=Session).*(?=:)')

# Create Bash script to unlock session
echo "#!/bin/bash" > $HOME/unlock.sh
echo "su -c 'dbus-send --system --print-reply --dest=\"org.freedesktop.ConsoleKit\" /org/freedesktop/ConsoleKit/$session org.freedesktop.ConsoleKit.Session.Unlock'" >> $HOME/unlock.sh
chmod +x $HOME/unlock.sh

# Run Bash script in another TTY
openvt -s -w $HOME/unlock.sh

30 July 2019

Filesystem benchmarks

Ok, let's get this straight.
When I chose to use JFS, it was because some years ago I saw with my own eyes how reliable JFS is in different scenarios, and it still is. Yes, EXT4 is reliable too, but its performance isn't on par with JFS; still, both offer a reliable and secure solution. XFS, on the other hand, isn't as secure and reliable (it is, up to a point), but offers quite good performance. When kernel 5.0 came out, they talked a lot about how good BTRFS is now and, unlike most people who rely on "everyone uses it, so I'll use it too", I wanted to test it myself, because I'm not the guy who relies on this "everyone" guy's opinion.
The host used for this test is my main desktop, running Gentoo Linux with the latest available kernel (5.2.4-gentoo) and the latest available tools to date (i5-3470 on a Gigabyte motherboard, 16GB of RAM, SATA3 1TB HDD). The disk is entirely formatted with the filesystem being tested. All filesystems are mounted with noatime and use the bfq scheduler (bfq offers better performance for rotational disks than mq-deadline). The IOZone tests were executed with a reboot before creating each new filesystem, to exclude any possible bias.

Test 1: Creating a 1GB file with dd if=/dev/urandom of=test bs=1024 count=1M (in seconds):
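
One caveat with this kind of test: without a sync, dd mostly measures the page cache, not the disk. A sketch of how I'd time it (the helper name and parameters are my own; sizes are arguments rather than the exact invocation above):

```shell
# Time a write of SIZE MiB including the flush to disk; plain dd
# returns as soon as the data sits in the page cache.
timed_write() {
    out=$1 mib=$2
    sync                                   # start from a clean cache
    time sh -c "dd if=/dev/urandom of='$out' bs=1M count=$mib 2>/dev/null && sync"
}
# e.g. timed_write /mnt/test/blob 1024    # the 1 GiB run from Test 1
```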

Test 2: Cold reboot during the creation of the 1GB file, from one filesystem to another (5 times)

Auto Fixed?                                     5/5   5/5   3/5   4/5   4/5
Mounted rw without fixing                       0     1     2     4     1
Wasn't able to fix                              0     0     1     2     1
Kept working with problems, without reporting   0     0     1     2     1

Test 3: Copy a 1GB file from one filesystem to another

Test 4: Cold reboot during a mariadb heavy workload (5 times)

Auto Fixed?                                     5/5   5/5   4/5   4/5   4/5
Corrupted databases?                            N     Y     Y     Y     N
Mounted rw with problems                        N     N     Y     Y     N
Kept working with problems, without reporting   N     N     Y     Y     N

Test 5: Boot from EFI to lightdm, sata3 hdd, mounting / and /home (5 times)

After a cold reboot                             0:42  1:22  1:40  2:11  2:10

Test 6: Shutdown from lightdm to total shutdown (openrc) 

Test 7: IOZone

My overall opinion:
  • JFS and EXT4 are good: solid performance-wise and secure enough for daily usage.
  • XFS has been polished through the years and the most awkward problems were ironed out (data corruption on power loss, among others); it's a good option but has some performance issues depending on the size and mass of data.
  • ZFS is good, but you need RAM for good overall performance; I would suggest starting with 16GB for a desktop (this isn't a problem for a server, of course). Also, keep in mind that the mainline kernel doesn't support ZFS, and it takes some time for ZFS to catch up with the latest kernels (right now, ZoL supports up to 5.1.x). Also, although the performance of ZFS and BTRFS looks similar, ZFS is far more stable and trustworthy.
  • By CPU/memory footprint during a copy, JFS is the lightest and ZFS the heaviest. Of course, it also depends on how much data you're copying, from where to where, and the kind of storage, but overall let's put it this way: copying lots of tiny files, JFS > EXT4 > BTRFS > ZFS > XFS; copying lots of big files, JFS/EXT4 > XFS > ZFS > BTRFS. Of course, the mileage will vary depending on the use case (it can be irrelevant on a full non-virtualized server with lots of RAM and a good storage system).
  • Only use BTRFS if you like restoring backups often. To date, I can't trust BTRFS, not after seeing a filesystem get corrupted by simple tests. -> http://lkml.iu.edu/hypermail/linux/kernel/1907.1/05873.html

Notes by 20190810:
  • XFS resolved the issues caused by power loss and metadata corruption.
  • For XFS, performance can benefit more from mq-deadline than bfq, AFAIK. Otherwise, use the recommended udev rule, and don't forget noatime if you don't need access times.
  • BTRFS is still not ready; it can offer some benefits and performance, but its stability is still far from acceptable, the auto-healing doesn't work as expected (unlike ZFS, for instance), and scrub doesn't fix things as expected in some scenarios. They're getting it straight, though; a lot of this stuff seems to be fixed in 5.2 and beyond.
  • The fact that a filesystem receives more kernel or userland updates doesn't mean it's more stable, or better, or whatever.
  • You SHOULD do your own tests instead of talking bullshit like "everyone uses it" or "that's an uncommon use". If the kernel supports it, it's supported. Period.
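
For reference, the udev scheduler recommendation mentioned above boils down to a small rules file; a sketch, assuming a kernel with both schedulers available (the file name is arbitrary):

```
# /etc/udev/rules.d/60-iosched.rules
# rotational disks get bfq, solid state gets mq-deadline
ACTION=="add|change", KERNEL=="sd[a-z]", ATTR{queue/rotational}=="1", ATTR{queue/scheduler}="bfq"
ACTION=="add|change", KERNEL=="sd[a-z]", ATTR{queue/rotational}=="0", ATTR{queue/scheduler}="mq-deadline"
```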

29 July 2019

How to choose an OS

Here's how to choose an OS that fills your needs.

1) Try on a VM first
Try it in a VM. You can test everything on this list inside a virtual machine. VirtualBox is free, but it's your choice.

2) Device drivers
The first thing you need to analyze and pay attention to is how well the system supports your hardware and how you can fix it when possible. If you don't have enough knowledge to manually install a required driver, you should rely on an in-house driver installer (like Ubuntu, Linux Mint, Manjaro, etc. have). Even so, pay attention to whether the drivers were installed correctly and there are no issues, especially during an update.

3) Community support
If you want to rely on community support (I don't suggest it, since some communities cause more trouble than help), you have to pay close attention and not use it as your only source. Search the internet and compare the results; usually the distribution has its own documentation about it, and most of the time you'll see that some IRC/Matrix/whatever support channel USUALLY doesn't rely on its own wiki or its own standards, especially in elitist communities. Let me give an example:

1) You've tested a couple of filesystems yourself and decided to choose, let's say, F2FS or JFS.
2) You had a problem with systemd, asked for help, and pasted your dmesg.
3) If they say the problem is F2FS or JFS without the dmesg having anything explicit about it, you're dealing with stupid people.

4) Custom configurations
This one is really easy. If the distribution has drivers loaded at boot for a specific filesystem (well supported by the kernel developers) and you can't use it for root (even though the kernel developers say you can), you're dealing with naive packagers, and problems will occur in other places often. For example, some Linux distributions don't support LVM properly, as in... they will only boot correctly if you have /boot and / outside of LVM (?????). This is the result of naive packagers; stay away from that.

5) Elitist community
Over-complicating things to claim they're right, based on their own defaults, without proof; citing the age of a processor even when the system has nice performance compared to a more recent processor (saying that an Intel i7 Haswell is slow because it's old); using "because everyone uses X" as an answer; saying "no one uses this kind of hardware anymore" about something like 3-year-old hardware. And the list goes on; this kind of crap should be redirected to a black hole.

6) OS implementation
Lack of multilib without any reason; answering questions about poor implementation with "because it's professional" instead of the real reason; trying to hide problems with "because it's better this way". It's a matter of naive developers.

7) Systemd

8) Documentation
Usually, any wiki should work with any distribution to some extent; it'll depend on the implementation. So it's good to choose something as KISS as possible, so you'll have more documentation available if a problem arises. You don't need to use the specific wiki of the OS of your choice if the OS follows the standards set by the application's developer, so you can even rely on the application docs to fix something. Unless you're using a trollercoaster OS.

9) Learning curve
You don't need to understand what's happening in the background in a matter of days, but it's probably best if you learn how it works at your own pace. The less you rely on other people, the smaller the chance of someone giving you bad advice, and trust me, this will occur more often than you expect.

10) FHS and POSIX
It seems no one cares about them, but the FHS and POSIX standards are important. No, it isn't a matter of "getting used to it"; it's a matter of having standards. I accept changes when they bring benefits; changes that come from a stupid nature should be discarded ASAP (for example, linking /bin to /usr/bin). I suggest avoiding OSes made by brainless people who take decisions without a sane reason.

28 June 2019

The stupid chit-chat around and how to fix it (for people who prefer reality over some herp-derp) - Volume 2

Yeah, it never stops; here's the first part.

1. IBM will fuck up Red Hat because IBM destroys everything it touches; IBM is not good for open source

Instead of farting with your mouth (or fingers), try researching this topic a little more. IBM is one of the major contributors to the open source world, not only to the kernel; there are a lot, I mean, A LOT of things IBM has contributed to open source. Research the Linux kernel just for a start. IBM also sponsors some open source projects; iSCSI is just one example.

2. Why do you use "feature X" if no one uses it?

Did you try it yourself, or do you just "use it because everyone uses it"? Instead of being stupid and answering a question with another question, did you try to understand the problem first? Is it a well-reported problem? Is it something like "I want to disable the Spectre mitigations to gain performance"? Oh... no? Then why are you doing that? Because you only use what everyone uses, without testing yourself to see if it fits your scenario?
Let's take an example: a decent Linux distribution will not care what filesystem you use, as long as it's well supported by the Linux kernel, and will let you boot from it. If reiserfs is supported (to date, I'm not sure about that), a DECENT Linux distribution should boot from it without any problem. If I want to use a mix of LVM+JFS+REISERFS+WHATEVERFS, it should boot if the kernel supports it, unless you use a fucking messed-up distro whose packagers don't know what they're doing and/or tell you "you should use what everyone uses" (I still don't know who this "everyone" dude is). Oh, the distro doesn't support installing on this filesystem? Cool, what a piece of shit distro are you using? Soooo the developers of your big-ass distro have more knowledge than the kernel developers themselves about what can be stable and what cannot?

3. Rolling releases are unstable, point releases are way better.

Let's get one point straight here: yes, there's a chance of having a problem. But who the fuck knows whether a piece of software is stable or not? The developer of that software, right? And with what authority or knowledge does the packager of your distro know exactly when a package is stable enough to backport patches to an ancient version? Oh, he just decided, and tested the hell out of it? Are we talking about some sort of LTS package? Why? Is there a major bump in everything, like... Qt4 to Qt5? No? Then your fucking point release isn't better; it's just a decision the packager made for you. And to be honest, 90% of the problems I've had with data loss and stuff crashing beyond repair were, guess what, with the "so stable point release distributions".
Now, in fact, there IS a reason for some distributions like Red Hat/CentOS and SLES being that way: some 3rd-party software wants specific versions of software to be well supported, and those vendors are the ones to blame. They could make their binaries static, or use any of a thousand other ways to make everything work better, but since that doesn't happen, Red Hat/CentOS/SLES have to stick to it.

4. This feature isn't implemented in the distribution because it's unstable.

It isn't implemented because the packagers didn't get it into the distribution. Stop being a jackass and tell the truth.

5. Systemd

I know, it's bad, it does bad things, it creates its own issues; it's like a virus that does all kinds of shit, has a shitty QA process, a god-level shitty issue tracker, binary logs that get corrupted in the blink of an eye, and an infinite list of awful bugs that will never be fixed because some developers have a crown up their ass. But if you want to argue about why you don't like systemd, at least get your shit together. Don't go to reddit and say "I don't like binary logs, because they're binary", or people will laugh at you, and it's your fault.

08 May 2019

Why open source firmware is important for security

I gave a talk recently at GoTo Chicago on Why open source firmware is important and I thought it would be nice to also write a blog post with my findings. This post will focus on why open source firmware is important for security.

Privilege Levels

In your typical “stack” today you have the various levels of privileges.

  • Ring 3 - Userspace: has the least amount of privileges, short of there being a sandbox in userspace that is restricted further.
  • Rings 1 & 2 - Device Drivers: drivers for devices, the name pretty much describes itself.
  • Ring 0 - Kernel: The operating system kernel, for open source operating systems you get visibility into the code behind this.
  • Ring -1 - Hypervisor: The virtual machine monitor (VMM) that creates and runs virtual machines. For open source hypervisors like Xen, KVM, bhyve, etc you have visibility into the code behind this.
  • Ring -2 - System Management Mode (SMM), UEFI kernel: Proprietary code, more on this below.
  • Ring -3 - Management Engine: Proprietary code, more on this below.
From the above, it’s pretty clear that for Rings -1 to 3, we have the option to use open source software and have a large amount of visibility and control over the software we run. For the privilege levels under Ring -1, we have less control but it is getting better with the open source firmware community and projects.
It’s counter-intuitive that the code that we have the least visibility into has the most privileges. This is what open source firmware is aiming to fix.

Ring -2: SMM, UEFI kernel

This ring controls all CPU resources.
System management mode (SMM) is invisible to the rest of the stack on top of it. It has half a kernel. It was originally used for power management and system hardware control. It holds a lot of the proprietary designed code and is a place for vendors to add new proprietary features. It handles system events like memory or chipset errors as well as a bunch of other logic.
The UEFI kernel is extremely complex. It has millions of lines of code. UEFI applications are active after boot. It was built with security through obscurity. The specification is absolutely insane if you want to dig in.

Ring -3: Management Engine

This is the most privileged ring. In the case of Intel (x86) this is the Intel Management Engine. It can turn on nodes and re-image disks invisibly. It has a kernel that runs Minix 3 as well as a web server and entire networking stack. It turns out Minix is the most widely used operating system because of this. There is a lot of functionality in the Management Engine, it would probably take me all day to list it off but there are many resources for digging into more detail, should you want to.
Between Ring -2 and Ring -3 we have at least 2 and a half other kernels in our stack as well as a bunch of proprietary and unnecessary complexity. Each of these kernels have their own networking stacks and web servers. The code can also modify itself and persist across power cycles and re-installs. We have very little visibility into what the code in these rings is actually doing, which is horrifying considering these rings have the most privileges.

They all have exploits

It should be of no surprise to anyone that Rings -2 and -3 have their fair share of vulnerabilities. They are horrifying when they happen though. Just to use one as an example although I will let you find others on your own, there was a bug in the web server of the Intel Management Engine that was there for seven years without them realizing.

How can we make it better?

NERF: Non-Extensible Reduced Firmware

NERF is what the open source firmware community is working towards. The goals are to make firmware less capable of doing harm and to make its actions more visible. They aim to remove all runtime components; currently, with the Intel Management Engine, they cannot remove everything, but they can take away the web server and IP stack. They also remove the UEFI IP stack and other drivers, as well as the Intel Management Engine/UEFI self-reflash capability.

me_cleaner

This is the project used to clean the Intel Management Engine down to the smallest necessary capabilities. You can check it out on GitHub: github.com/corna/me_cleaner.

u-boot and coreboot

u-boot and coreboot are open source firmware. They handle silicon and DRAM initialization. Chromebooks use both: coreboot on x86, and u-boot for the rest. This is one part of how they implement verified boot.
Coreboot’s design philosophy is to “do the bare minimum necessary to ensure that hardware is usable and then pass control to a different program called the payload.” The payload in this case is linuxboot.

linuxboot

Linuxboot handles the device drivers and network stack, and gives the user a multi-user, multi-tasking environment. It is built with Linux so that a single kernel can work for several boards. Linux is already quite vetted and has a lot of eyes on it since it is used quite extensively. Better to use an open kernel with a lot of eyes on it than the 2½ other kernels that were all different and closed off. This means we are lessening the attack surface by using fewer variations of code, and we are making an effort to rely on code that is open source. Linux improves boot reliability by replacing lightly-tested firmware drivers with hardened Linux drivers.
By using a kernel we already have tooling around, firmware devs can build with tools they already know. When they need to write logic for signature verification, disk decryption, etc., it's in a language that is modern, easily auditable, maintainable, and readable.

u-root

u-root is a set of Golang userspace tools and a bootloader. It is used as the initramfs for the Linux kernel from linuxboot.
Through using the NERF stack they saw boot times were 20x faster. But this blog post is on security so let’s get back to that….
The NERF stack helps improve the visibility into a lot of the components that were previously very proprietary. There is still a lot of other firmware on devices.

What about all the other firmware?

We need open source firmware for the network interface controller (NIC), solid state drives (SSD), and base management controller (BMC).
For the NIC, there is some work being done in the open compute project on NIC 3.0. It should be interesting to see where that goes.
For the BMC, there is both OpenBMC and u-bmc. I had written a little about them in a previous blog post.
We need to have all open source firmware to have all the visibility into the stack but also to actually verify the state of software on a machine.

Roots of Trust

The goal of a root of trust should be to verify that the software installed in every component of the hardware is the software that was intended. This way you can know without a doubt, and verify, whether the hardware has been hacked. Since we have little to no visibility into the code running in a lot of places in our hardware, it is hard to do this. How do we really know that the firmware in a component is not vulnerable, or that it doesn't have any backdoors? Well, we can't. Not unless it's all open source.
Every cloud and vendor seems to have their own way of doing a root of trust. Microsoft has Cerberus, Google has Titan, and Amazon has Nitro. These seem to assume an explicit amount of trust in the proprietary code (the code we cannot see). This leaves me with not a great feeling. Wouldn’t it be better to be able to use all open source code? Then we could verify without a doubt that the code you can read and build yourself is the same code running on hardware for all the various places we have firmware. We could then verify that a machine was in a correct state without a doubt of it being vulnerable or with a backdoor.
It makes me wonder what the smaller cloud providers like DigitalOcean or Packet have for a root of trust. Often times we only hear of these projects from the big three or five. I asked this on twitter and didn’t get any good answers…

There is a great talk by Paul McMillan and Matt King on Securing Hardware at Scale. It covers in great detail how to secure bare metal while also giving customers access to the bare metal. When they get back the hardware from customers they need to ensure with consistency and reliability that there is nothing from the customer hiding in any component of the hardware.
All clouds need to ensure that the hardware they are running has not been compromised after a customer has run compute on it.

Platform Firmware Resiliency

As far as chip vendors go, they seem to have a different offering. Intel has Platform Firmware Resilience and Lattice has Platform Firmware Resiliency. These seem to be more focused on the NIST guidelines for Platform Firmware Resiliency.
I tried to ask the internet who was using this and heard very little back, so if you are using Platform Firmware Resiliency can you let me know!

From the OCP talk on Intel's firmware innovations, it seems Intel's Platform Firmware Resilience (PFR) and Cerberus go hand in hand. Intel is using PFR to deliver Cerberus' attestation principles. Thanks @msw for the clarification.
It would be nice if there were not so many tools to do this job. I also wish the code was open source so we could verify for ourselves.

How to help

I hope this gave you some insight into what’s being built with open source firmware and how making firmware open source is important! If you would like to help with this effort, please help spread the word. Please try and use platforms that value open source firmware components. Chromebooks are a great example of this, as well as Purism computers. You can ask your providers what they are doing for open source firmware or ensuring hardware security with roots of trust. Happy nerding! :)
Huge thanks to the open source firmware community for helping me along this journey! Shout out to Ron Minnich, Trammel Hudson, Chris Koch, Rick Altherr, and Zaolin. And shout out to Bridget Kromhout for always finding time to review my posts!

Source: Jess Frazelle