Choosing a more optimal `String` type
This week, I have been profiling and measuring the overhead of the Sentry Rust SDK, as another team has reported
a large overhead in their testing. So much so that the team shied away from using it more extensively in combination
with `#[tracing::instrument]`.
After some profiling, I identified a potential culprit: the SDK was using very high quality randomness in the form of the
`getrandom` crate, which, depending on the operating system, makes syscalls to get true randomness from the operating
system. This was clearly visible in profiles as contributing to SDK overhead. We definitely don’t need high quality
randomness to identify tracing spans, so I switched that to a faster randomness source which is still documented to be
cryptographically secure, though I might decide to further downgrade the quality of the randomness in favor of speed.
But I digress, I really wanted to talk about Strings here.
When profiling, one thing that often sticks out and is a good opportunity for optimization is avoiding allocations. And there were a couple of allocation-related things visible in the profile. Primarily allocating, copying and freeing Strings. Optimizing or avoiding these copies should give us some wins in terms of performance and SDK overhead.
Let's take a look at what our use-case is first.
- Our Strings are immutable. You set them when initializing the SDK, configuring the Scope, or instrumenting a Span. They never change.
- Our Strings are copied often. Whenever an event or trace is captured, we copy over some Scope data, like the release identifier configured during SDK init, or all the tags set on the Scope.
- Strings are presumably small. I don’t have concrete evidence for this, but I would suspect most strings to be short.
- The Strings are serialized often. The strings that are being copied into events are then obviously serialized and
  sent to Sentry. Except when they are being discarded inside the SDK because of a configured sampling rate, rate limits
  or for other reasons. I’m unsure if we have any other frequent accesses like `PartialEq` or `Hash` usage however.
- Most of the Strings are `Option`al. Most of the properties of Events are `Option`s.
- Protocol types are in need of Optimization. Not strictly related to our usage of Strings, but all other protocol types have way too detailed typing, and are not extensible on the other hand. In a ton of situations we might be better served with just having the option to manually add arbitrary JSON properties.
To summarize this again in more technical terms:
- We want `Clone` to be cheap, without allocating and copying the actual string contents, aka `O(1)`.
- The type should optimize for `Option` usage, in particular `size_of::<T>() == size_of::<Option<T>>()`.
- The type should be at most as large as `String`, in particular `size_of::<T>() <= size_of::<String>()`.
- Having Small String Optimization (SSO) is preferable, which means storing up to `N` bytes inline without a heap allocation.
- Ideally, creating a string should not do a roundtrip allocation.
The last point in particular is a pain point with `Arc<str>` for example, as creating it out of a `String` will almost
always incur a re-allocation. However, that allocation will amortize itself the first time you do a `clone()`, so it
might not matter that much in practice.
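To make the trade-off concrete, here is a small sketch using only std. The function name `share` and the release string are made up for illustration. The conversion copies the bytes once into a new allocation that also holds the reference counts; every clone after that only bumps a counter.

```rust
use std::sync::Arc;

/// Hypothetical helper: moves a `String` into shared, immutable storage.
/// The bytes are copied into a new allocation next to the reference counts,
/// and the original `String` buffer is freed afterwards.
fn share(owned: String) -> Arc<str> {
    Arc::from(owned)
}

fn main() {
    let release = share(String::from("my-app@1.0.0"));

    // Cloning does not touch the string contents at all.
    let copy = Arc::clone(&release);

    // Both handles point at the very same allocation.
    assert!(Arc::ptr_eq(&release, &copy));
    assert_eq!(Arc::strong_count(&release), 2);
    println!("{}", copy);
}
```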
There are a ton of options to choose from, and in this comparison I am focusing on these contenders:
- `std::string::String`, obviously
- `std::sync::Arc<str>`
- `arcstr`
- `kstring`
- `smol_str`, used in `rust-analyzer`
- `compact_str`
- `flexstr`
- `smartstring`
Here is a quick comparison table looking at the various `size_of` values (in bytes), and looking at other properties
according to the docs:
| name | `size_of::<T>` | `size_of::<Option<T>>` | `Clone` | SSO | mutable |
|---|---|---|---|---|---|
| `String` | 24 | 24 | O(n) | - | yes |
| `Arc<str>` | 16 | 16 | O(1) | - | no |
| `arcstr` | 8 | 8 | O(1) | - | no |
| `smol_str` | 24 | 24 | O(1) | 23 | no |
| `kstring` (`arc`) | 24 | 32 | O(1) | 15 / 22 | no |
| `flexstr` | 24 | 32 | O(1) | 22 | no |
| `compact_str` | 24 | 24 | O(n) | 24 | yes |
| `smartstring` | 24 | 32 | O(n) | 23 | yes |
I have not looked at any runtime performance of these crates, and haven’t checked if conversion from `String` really
incurs a re-allocation. I assume it does however.
As we can see from that quick table, there doesn’t seem to be any free lunch here. Some of the listed crates do have
small string optimization, but are not optimized for usage with `Option`.
Depending on which characteristics are most important to us, this leaves us with only `smol_str`, which has SSO, cheap
clones and supports `Option`. However, it is still the same size as `String` and not smaller. That it is part
of `rust-analyzer` also gives us confidence that it is of high quality and well maintained.
If we want to aim for small size, `arcstr` is the way to go, which advertises itself as a better `Arc<str>`. It does
not have SSO, but to be honest, I doubt SSO would do much at size `8`, though I’m not sure what the sweet spot for our
particular use-case would be.
And one should definitely not dismiss `Arc<str>`, which is small, has cheap clones, and most of all is part of `std`
and thus the obvious choice if the goal is to minimize external dependencies.
# Building Strings
So far, we have looked at various String types that are good for storing and cloning. But what about creating Strings?
We have already established that `Arc<str>` and most of the other contenders need to re-allocate when creating a new String,
either out of a `&str`, or from a `String` itself. Not surprisingly, all the contenders that have `O(n)` clones allow mutation.
So they are a good option for parsing, and when formatting small strings.
On that note, `format!` itself is using `String`, and so is `to_string`. If you want to take advantage of any other string
type that can avoid allocations, you would have to use `write!(&mut s, "oh hi: {}", display_type)?`, which is a bit unergonomic.
Alternatives might include having an `impl From<fmt::Arguments> for MyStringType`, which allows using
`format_args!("oh hi: {}", display_type).into()`. Or having something like `impl<D: Display> From<D> for MyStringType`,
although I haven’t tried if that actually compiles, or if the impl bounds might be too broad.
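A rough sketch of the `From<fmt::Arguments>` idea could look like this. `MyStringType` is a stand-in backed by `Arc<str>`, chosen only to keep the example std-only; it is not the API of any of the crates above. This version still formats into a temporary `String` for anything but a plain literal, so it demonstrates the ergonomics rather than the allocation savings.

```rust
use std::fmt::{self, Write};
use std::sync::Arc;

/// Hypothetical string type for illustration; not part of any crate.
#[derive(Clone, Debug, PartialEq)]
struct MyStringType(Arc<str>);

impl From<fmt::Arguments<'_>> for MyStringType {
    fn from(args: fmt::Arguments<'_>) -> Self {
        // If the arguments boil down to a static string, use it directly.
        if let Some(s) = args.as_str() {
            return MyStringType(Arc::from(s));
        }
        // Otherwise format into a temporary buffer. A type with SSO could
        // write into its inline storage here instead of a `String`.
        let mut buf = String::new();
        buf.write_fmt(args)
            .expect("a formatting trait implementation returned an error");
        MyStringType(Arc::from(buf))
    }
}

fn main() {
    let display_type = 42;
    let s: MyStringType = format_args!("oh hi: {}", display_type).into();
    println!("{}", &*s.0);
}
```

The blanket `impl<D: Display> From<D>` variant is not shown. It would likely collide with the standard library's reflexive `impl<T> From<T> for T` as soon as the type itself implements `Display`, though that is untested here.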
Ideally, I would love to have a more flexible type that allows mutable String building, maybe something with a const
generic parameter giving the most flexibility on construction. And then for long term storage, one can do a single copy
/ allocation using `arcstr` for example. Or any of the other types that have SSO.
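As a sketch of what such a builder might look like: `StackString` and `build_tag` are hypothetical names, the capacity of 64 bytes is arbitrary, and `Arc<str>` stands in for whatever long-term storage type ends up being chosen. The builder formats into a fixed-size inline buffer and fails instead of spilling to the heap, so the only allocation is the final copy.

```rust
use std::fmt::{self, Write};
use std::sync::Arc;

/// Hypothetical fixed-capacity builder; `N` is the inline capacity in bytes.
struct StackString<const N: usize> {
    buf: [u8; N],
    len: usize,
}

impl<const N: usize> StackString<N> {
    fn new() -> Self {
        Self { buf: [0; N], len: 0 }
    }

    fn as_str(&self) -> &str {
        // Only complete `&str` chunks are ever copied in, so this cannot fail.
        std::str::from_utf8(&self.buf[..self.len]).expect("buffer holds valid UTF-8")
    }
}

impl<const N: usize> Write for StackString<N> {
    fn write_str(&mut self, s: &str) -> fmt::Result {
        let end = self.len.checked_add(s.len()).ok_or(fmt::Error)?;
        // Out of capacity: report an error rather than allocating.
        let dst = self.buf.get_mut(self.len..end).ok_or(fmt::Error)?;
        dst.copy_from_slice(s.as_bytes());
        self.len = end;
        Ok(())
    }
}

/// Formats on the stack, then does a single copy into shared storage.
fn build_tag(key: &str, value: u32) -> Result<Arc<str>, fmt::Error> {
    let mut s = StackString::<64>::new();
    write!(&mut s, "{}:{}", key, value)?;
    Ok(Arc::from(s.as_str()))
}

fn main() {
    let tag = build_tag("retries", 3).expect("fits into 64 bytes");
    println!("{}", tag);
}
```

A real implementation would probably want a heap fallback once the inline capacity is exceeded, which is roughly what the mutable SSO crates in the table provide.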
# Conclusion
It is really hard to make a concrete choice here. I really want to have cheap clones, and I absolutely want the type
to be optimized for usage with `Option`, and ideally be smaller than `Option<String>` in the first place.
On the other hand though, the Sentry Rust SDK already has way too many external dependencies as it is, so adding even
more might not be the best thing.
In the end, I believe it's a choice between `smol_str`, which seems to be the best choice considering SSO, or `arcstr`,
which seems to be the best choice when optimizing for pure `size_of`. Or good old `Arc<str>` if we do not want to take
on any new external dependencies.
Either way, to retain maximum flexibility, I might start by defining an opaque newtype which derefs to `&str` and can
thus impl all the standard traits, especially `Display` and `Serialize`, and is constructible out of a `&str`, `String`,
and possibly `impl Display` if I can make that work. With that in place, we can change the internal implementation at
any time without breaking the API.
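A minimal sketch of such a newtype, assuming `Arc<str>` as the initial backing store. The name `Str` is a placeholder, and `Serialize` is left out to keep the example free of external dependencies; with serde it would presumably just forward to `serializer.serialize_str(self)`.

```rust
use std::fmt;
use std::ops::Deref;
use std::sync::Arc;

/// Hypothetical opaque string type. The `Arc<str>` inside is private, so it
/// could later be swapped for another representation without breaking callers.
#[derive(Clone, Debug, PartialEq, Eq, Hash)]
pub struct Str(Arc<str>);

impl Deref for Str {
    type Target = str;

    fn deref(&self) -> &str {
        &self.0
    }
}

impl fmt::Display for Str {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        // Delegate to `str` so that width and padding flags keep working.
        fmt::Display::fmt(&*self.0, f)
    }
}

impl From<&str> for Str {
    fn from(s: &str) -> Self {
        Str(Arc::from(s))
    }
}

impl From<String> for Str {
    fn from(s: String) -> Self {
        Str(Arc::from(s))
    }
}

fn main() {
    // Made-up release identifier, as it might be configured during SDK init.
    let release: Option<Str> = Some("my-app@1.0.0".into());
    // Copying it into an event is a reference count bump, not a string copy.
    let for_event = release.clone();
    if let Some(r) = &for_event {
        println!("{} ({} bytes)", r, r.len());
    }
}
```

Because callers only see `Deref<Target = str>`, `Display` and the `From` impls, switching the field to one of the SSO types later would be an internal change.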
A big question in the end that still remains is how this can be combined with `serde_json::Value`, as we use that type
already in a couple of places, and I would like to use it even more, replacing way too detailed type definitions by
having all the types being extendable with a generic `Map<String, Value>`. Especially the keys would probably benefit a
lot from small string optimization. This remains to be seen.