Lifetimes
In Safety Features of Rust, we looked at one borrow checker rule that Rust enforces, namely aliasing XOR mutability. Rust enforces another borrow checker rule as well: a referent must outlive its references. We saw this rule at play with `move` before.
    fn main() {
        let a = String::from("a");
        print_str(a); // `a` moves into the function `print_str()`.
        println!("a is {}", a); // This throws an error,
                                // because `a` can no longer access the string.
    }

    fn print_str(x: String) {
        println!("String {}", x);
    }
The above example is the same one we saw in Safety Features of Rust. Here, `a` plays the role of a reference, and the `String` that contains "a" is its referent. This string gets deallocated when `print_str()` finishes executing. If the move did not invalidate `a`, the original reference `a` would outlive the referent, which would cause a use-after-free problem.
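As a small illustration of the rule, the sketch below borrows the string instead of moving it, so the referent stays alive for longer than the reference. The function `print_str_ref()` is a hypothetical variant of `print_str()` and is not part of the original example.

```rust
// Hypothetical variant of `print_str()` that borrows the string
// instead of taking ownership of it.
fn print_str_ref(x: &str) -> usize {
    println!("String {}", x);
    x.len() // Return the length so that the caller has something to check.
}

fn main() {
    let a = String::from("a"); // `a` owns the string (the referent).
    let n = print_str_ref(&a); // The reference only lives during the call.
    println!("a is {} ({} byte)", a, n); // `a` is still valid here.
}
```

Because the reference passed to `print_str_ref()` is gone before `a` is dropped at the end of `main()`, the referent outlives the reference and the borrow checker accepts the code.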
In general, to make sure that a referent outlives its references, Rust needs to know how long the referent lives (i.e., is not deallocated) and how long each reference lives (i.e., points to a valid object in memory). This is sometimes difficult to infer, so Rust asks programmers to tell the compiler using a concept called lifetimes. Below, we will first look at functions and why lifetimes are important for them. Later, we will look at other uses of lifetimes.
The Problem with References in Functions
Lifetimes are frequently used in functions, and most of the time Rust is able to take care of lifetimes automatically, so programmers do not need to worry about them. However, that is not always the case. Let's look at the following example to understand this further.
The example below uses `&str`, which is called a string slice. You can read more about it in the Rust book.
    fn ref_return(x: &str) -> &str {
        // Assume that this function does some complicated things
        // and returns a string slice.
        x // Placeholder return value so that the example compiles.
    }

    fn main() {
        let s = String::from("string");
        let res = ref_return(&s);
        // Assume that the rest of the code does things with `res`.
    }
In the above code, `res` is a reference to the string slice returned from `ref_return()` (the referent). Thus, Rust needs to check whether the referent outlives the reference. Determining how long the reference (`res`) lives is easy: it lives until the end of the `main()` function. However, it is not easy to determine how long the referent that `res` points to lives. It comes from `ref_return()`, and generally speaking, unless you execute the function, you won't know what it is going to return and which memory location `res` will point to. Thus, without further information, Rust is unable to check whether the referent outlives the reference.
Lifetimes in Functions
Due to the above problem, if a function returns a reference, Rust asks programmers to give the compiler information about how long the reference lives, i.e., the reference's lifetime. However, there is an important thing to keep in mind: Rust does not ask programmers to specify the concrete lifetime of a reference. Instead, Rust only asks programmers to express how the lifetimes of the input parameters and the returned reference are related. It would be very difficult, if not impossible, for a programmer to determine exactly how long a reference is valid, and Rust does not ask for it.
Let's look at a more concrete example to understand what this means. The example below is from the Rust book.
    fn longest(x: &str, y: &str) -> &str {
        if x.len() > y.len() {
            x
        } else {
            y
        }
    }
If you compile the code, the compiler will complain that the return type is missing a lifetime parameter. The error message will tell you what lifetime parameters look like and what to do. A lifetime parameter looks like `'a`: an apostrophe (`'`) followed by the name of the parameter. It represents the duration for which a reference is valid, and it comes right after `&`.
Using lifetime parameters, programmers tell the compiler how long a returned reference is valid in relation to the input parameters. As mentioned earlier, programmers are not expected to work out the concrete duration manually. Let's take a look at the following example.
    fn longest<'a>(x: &'a str, y: &'a str) -> &'a str {
        if x.len() > y.len() {
            x
        } else {
            y
        }
    }
There are a few things to note in the above code. First, `<'a>` uses a syntax similar to the one we saw for generics in More on Struct and Traits. It means that `'a` is a generic lifetime parameter, i.e., it represents any lifetime, not a particular lifetime. We then use `'a` for the input parameters and the return type to indicate that they all share the same lifetime. In practice, this means that the returned reference is valid only as long as both `x` and `y` are valid. This shows the relationship between the input parameters' lifetimes and the return value's lifetime. As mentioned earlier, Rust doesn't ask programmers to specify a concrete lifetime; it only asks them to show what the relationships are. For a function that returns a reference, the relationship to show is the one between the input parameters' lifetimes and the return value's lifetime. Since the return value is either `x` or `y`, telling the Rust compiler that the input parameters and the return value have the same lifetime is exactly what we want to represent.
If we called this function from another function, Rust would see that the return value is valid as long as both input parameters are valid. Using this information, Rust can check whether a referent outlives its references.
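To make the caller's side concrete, here is a minimal sketch that calls `longest()` with two strings that live in different scopes. The variable names `outer`, `inner`, and `result` are only illustrative.

```rust
fn longest<'a>(x: &'a str, y: &'a str) -> &'a str {
    if x.len() > y.len() {
        x
    } else {
        y
    }
}

fn main() {
    let outer = String::from("a long string");
    {
        let inner = String::from("short");
        // `result` may point into either `outer` or `inner`, so the
        // compiler only lets us use it while both are still alive.
        let result = longest(outer.as_str(), inner.as_str());
        println!("The longest string is {}", result);
    } // `inner` is dropped here, and `result` is no longer used.
}
```

If `result` were declared before the inner block and printed after it, the compiler would reject the program, because `inner` would not live long enough for that use.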
'static
There is a special lifetime called the static lifetime, `'static`, meaning that the reference can live "during the entire duration of the program". The best example is a string literal that is embedded in a program's binary. Since it is always accessible, the lifetime of a string literal is always `'static`.
    fn string_literal() -> &'static str {
        let literal = "a string literal"; // This string is embedded in the program binary.
        literal
    }
Oftentimes, the Rust compiler's error messages suggest using `'static`, and it is an easy way to satisfy the compiler and get your code to compile. However, a lot of the time `'static` is not what you should use. Thus, it is important to think hard about whether `'static` is appropriate before using it.
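As a rough sketch of a case where `'static` does fit, the hypothetical function `status_label()` below only ever returns string literals, so its result is not tied to any scope in the caller.

```rust
// Hypothetical helper: both return values are string literals
// embedded in the binary, so `'static` describes them accurately.
fn status_label(ok: bool) -> &'static str {
    if ok {
        "ok"
    } else {
        "failed"
    }
}

fn main() {
    let label;
    {
        // The literal is not owned by this inner scope,
        // so the reference may safely escape it.
        label = status_label(true);
    }
    println!("status: {}", label);
}
```

By contrast, a reference that borrows from a local `String` cannot honestly be declared `'static`; in that situation a generic lifetime parameter or an owned return value is usually the better fit.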
The Implication of Lifetimes in Functions
As the above example shows, when you return a reference, you have to tell the Rust compiler how the lifetime of the reference is related to the lifetimes of the input parameters. This has an interesting implication: apart from `'static` data, you can only return a reference that is derived from the input arguments. For example, suppose you have a custom `struct` and you create an instance of it within a function. You cannot return a reference to the newly created instance of the `struct`.
There are actually two reasons why you cannot return a reference to a newly created instance of a `struct`. One is that there is no input lifetime with which to express the lifetime of the reference, and the other is that the instance gets dropped at the end of the function.
When you feel the need to create a new object and return a reference to it, instead of returning a reference directly, you need to return a type that transfers ownership to the caller, e.g., `Box`, `Vec`, or `String`.
    fn box_creation_and_return() -> Box<i32> {
        let a: i32 = 1;
        Box::new(a)
    }
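The contrast between the two cases can be sketched as follows. Both functions are hypothetical: `first_word()` returns a reference derived from its input, while `shout()` creates new data and therefore returns an owned `String`.

```rust
// The returned slice points into `s`, so its lifetime can be tied to `s`.
fn first_word(s: &str) -> &str {
    s.split_whitespace().next().unwrap_or("")
}

// The uppercase text is new data created inside the function,
// so ownership is handed to the caller instead of a reference.
fn shout(s: &str) -> String {
    s.to_uppercase()
}

fn main() {
    let text = String::from("hello world");
    println!("{}", first_word(&text));
    println!("{}", shout(&text));
}
```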
When Do We Need Lifetime Parameters?
If you read the Rust book or the above description of lifetimes, you may get the impression that lifetimes are optional. This is actually not true: lifetimes are mandatory for all references. The reason why you don't see lifetimes all the time is that the Rust compiler is often smart enough to do the work for you. When it is not, you have to write them yourself. For example, if you use a reference in a `struct` or `enum`, you have to annotate it with a lifetime.
    struct ProblematicStruct {
        str_slice: &str,
    }
The above does not compile because the struct has a reference as a field and there is no lifetime. The following fixes it.
    struct ProblematicStruct<'a> {
        str_slice: &'a str,
    }
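Here is a minimal sketch of how such a struct might be used. The helper `first_sentence()` and the sample text are assumptions made for illustration only.

```rust
struct ProblematicStruct<'a> {
    str_slice: &'a str,
}

// Hypothetical helper: the struct it returns borrows from `text`,
// so the struct must not outlive `text`.
fn first_sentence<'a>(text: &'a str) -> ProblematicStruct<'a> {
    // Cut at the first period, or keep the whole text if there is none.
    let end = text.find('.').unwrap_or(text.len());
    ProblematicStruct {
        str_slice: &text[..end],
    }
}

fn main() {
    let text = String::from("Lifetimes are relationships. They are not durations.");
    let s = first_sentence(&text);
    println!("{}", s.str_slice);
}
```

The lifetime parameter on the struct lets the compiler check that an instance is never used after the string it borrows from has been dropped.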
As mentioned earlier, `<'a>` means that you're using a generic lifetime parameter (`'a`) in the definition of the `struct`. You can then use it for the reference field. Within a function signature, however, Rust is quite often able to do the work for you. This is called lifetime elision.
There is a simple algorithm that the Rust compiler uses to determine what lifetimes should be, and if the algorithm cannot determine lifetimes, the compiler throws an error. Then you need to manually annotate lifetimes. The algorithm works as follows.
- First, the algorithm assigns a lifetime parameter to each input parameter that is a reference. For example, if `fn ex_func(x: &str, y: &str) -> &str {...}` is the function, then the algorithm assigns `'a` to `x` and `'b` to `y` like this: `fn ex_func(x: &'a str, y: &'b str) -> &str {...}`.
- Second, if there is exactly one input lifetime (i.e., one reference parameter), then that lifetime is assigned to all output references. For example, if the function is `fn ex_func(x: &str) -> &str {...}`, then the algorithm assigns `'a` to both the input parameter and the return reference like this: `fn ex_func(x: &'a str) -> &'a str {...}`.
- Third, for methods with `&self` or `&mut self`, all output references get the same lifetime as `self`.
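The rules above can be sketched in code as follows. All names here (`trimmed()`, `trimmed_explicit()`, `Holder`, `name_logged()`) are hypothetical and only serve to show where elision applies.

```rust
// Second rule: there is exactly one input lifetime,
// so the output reference receives it automatically.
fn trimmed(x: &str) -> &str {
    x.trim()
}

// The same function with the elided lifetimes written out.
fn trimmed_explicit<'a>(x: &'a str) -> &'a str {
    x.trim()
}

// Hypothetical type used to illustrate the third rule.
struct Holder {
    name: String,
}

impl Holder {
    // Third rule: the output reference gets the lifetime of `&self`,
    // even though `label` is also a reference.
    fn name_logged(&self, label: &str) -> &str {
        println!("[{}] reading name", label);
        &self.name
    }
}

fn main() {
    println!("{}", trimmed("  padded  "));
    println!("{}", trimmed_explicit("  padded  "));

    let holder = Holder { name: String::from("rust") };
    let name;
    {
        let label = String::from("lookup");
        name = holder.name_logged(&label);
    } // `label` is dropped here, but `name` borrows from `holder` only.
    println!("{}", name);
}
```

The last part of `main()` compiles only because the third rule ties the returned reference to `holder` rather than to `label`.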
In the above `longest()` example, since there are two input parameters, the algorithm assigns `'a` to the first parameter and `'b` to the second parameter. The problem is that there is nothing else the algorithm can do. The second rule doesn't apply because there is more than one input lifetime. The third rule doesn't apply either because there is no `self` reference. Thus, the algorithm fails to determine the lifetime of the return value, and the compiler asks for manual annotation.
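Manual annotation does not have to give every parameter the same lifetime. As a sketch under the assumption that the return value only ever comes from the first parameter, the hypothetical function `without_prefix()` below uses two lifetime parameters and ties the result to `text` alone.

```rust
// Hypothetical example: the result is always a part of `text`,
// so only `text` shares the lifetime `'a` with the return value.
fn without_prefix<'a, 'b>(text: &'a str, prefix: &'b str) -> &'a str {
    text.strip_prefix(prefix).unwrap_or(text)
}

fn main() {
    let text = String::from("lifetime_demo");
    let rest;
    {
        let prefix = String::from("lifetime_");
        rest = without_prefix(&text, &prefix);
    } // `prefix` is dropped here, but `rest` does not depend on it.
    println!("{}", rest);
}
```

If both parameters were annotated with the same `'a`, as in `longest()`, the compiler would reject this `main()` because `rest` is used after `prefix` is dropped.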