Don't forget to add the niche
When I'm writing Rust code, I use enums, especially Option and Result, all the time, so I've gotten somewhat keen to the mechanics of the optimization. In my opinion, it could stand to be easier to trigger; if you don't know the rules of niche optimization, you're almost certainly leaving cheap or free performance on the table. In some cases, the unused space is already there in your enum, going unused, but the compiler will not perform niche optimization unless you explicitly add a niche to your struct.
Playground link with all examples from this post.
Niche optimization primer
Here's a brief explanation of the basic concept for the uninitiated. In Rust, when an enum variant contains fields that have unused bits or illegal bit patterns, the compiler will try to exploit this "free" space to optimize the layout of your enum. In the general case, an enum in Rust is represented as a tagged union. For example, this enum:enum MyEnum {
First(u16),
Second(u32),
}
... has (roughly) the same layout as this C struct:
struct MyEnum {
uint8_t tag;
union {
uint16_t first;
uint16_t second;
};
};
When one variant contains a field whose type has illegal values, there may be a better representation. For example, this enum:
enum MyEnum2 {
First(u16),
Second(u16, NonZeroU16),
}
... actually has the same layout as the tuple (u16, u16). Instead of adding a "tag" to the layout, the compiler uses a special check to detect which variant the tuple represents: if tuple.0 != 0, the tuple cannot be an instance of the second variant, so it must be an instance of MyEnum::First. The pseudocode for this check looks like this:
let (x, y): MyEnum2 = ...;
if y == 0 {
// Cast to First(x)
} else {
// Cast to Second(x, y)
}
All cases of niche optimization are various generalizations of this simple trick.
Triggering niche optimization
A bit pattern that is invalid for a particular type, such as zero forNonZeroU16 is called a niche. The following (non-exhaustive) list gives the primitive types that have niches:
boolchar&T&mut TNonNull<T>NonZero<T>- function pointetplrs
- structs whose fields have niches
NonZeroU16, but bool can only take the values zero and one, leaving seven bits worth of niche representations in which the compiler can store a tag. So this enum only takes up two bytes of memory:
enum MyEnum3 {
First(u8, bool),
Second(u8),
Third(u8),
Fourth(u8),
Fifth(u8),
Sixth(u8),
}
The niche found by the compiler also cannot overlap with other variants. Compare these two enums:
// Triggers niche optimization
enum MyEnum4 {
First(u16),
Second(u16, bool),
}
enum MyEnum5 {
First(u16, bool),
Second(u16, bool),
}
In the case of Option, the None variant contains no fields, so Option<T> always triggers niche optimization as long as T contains a niche.
Caveat: padding
It seems obvious that a struct with padding bytes (due to alignment) would trigger niche optimization. But, surprisingly, this is not the case. Rust treats padding bytes as fair game for unsafe code to overwrite, so padding does not count as a niche. In order to trigger niche optimization for a type with padding bytes, you have to explicitly fill those padding bytes with a dummy field that has a niche, such asbool. Here's an example:
// Does not trigger niche optimization
struct Padded {
x: u16,
y: u8,
}
// Triggers niche optimization
struct PaddedWithNiche {
x: u16,
y: u8,
_pad: bool,
}
This is a pretty silly piece of code to have to write to trigger an important optimization. I am not the first to point out this flaw.
This is only one case where niche optimization could be made easier to trigger. Some improvements that would be useful in future Rust versions:
- padding is a niche by default (requires a new Rust edition)
NonMax<T>(integers less than INT_MAX)NonNegative<T>(a.k.a.u7,u15, etc.)- fully-featured bitfields
Comments
Post a Comment