Rust's LLVM InstrProf-based source code coverage implementation instruments Rust code via the MIR pass InstrumentCoverage. Most criteria for identifying coverage regions and counter locations are very general, based on Control Flow Graph (CFG) analysis of the MIR, and a fairly straightforward mapping of MIR Statements and Terminators to their source code regions (Spans).
TerminatorKind::Gotos are an exception, requiring special handling.
This issue highlights some of the unique requirements and problems addressed in the current coverage implementation, in case someone has ideas for reducing the reliance on Goto-specific logic, either by improving InstrumentCoverage (if something was overlooked) or by improving the Goto representation (such as refining its Span, or providing additional context that InstrumentCoverage might leverage).
Current State
One of the first steps in the InstrumentCoverage process is to extract relevant code Spans from the MIR Statements and Terminators. (These Spans are later combined into sets of sequential statements with contiguous source code regions that can be counted via a single counter; i.e., if any statement in the set was executed, all statements in the same set were also executed.)
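As a rough illustration of that combining step, the sketch below merges spans that belong to the same block and touch or overlap in the source. This is not the actual InstrumentCoverage logic: `BlockSpan` and `merge_contiguous` are hypothetical names, spans are modeled as plain byte offsets, and the input is assumed to be sorted by start position.

```rust
/// Hypothetical, simplified span: a half-open byte range `[lo, hi)` tagged
/// with the index of the block it came from. Not a rustc type.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
struct BlockSpan {
    block: usize,
    lo: u32,
    hi: u32,
}

/// Merge neighboring spans from the same block whose byte ranges touch or
/// overlap, so each merged span could be counted by a single counter.
/// Assumes `spans` is sorted by `lo`.
fn merge_contiguous(spans: &[BlockSpan]) -> Vec<BlockSpan> {
    let mut merged: Vec<BlockSpan> = Vec::new();
    for &next in spans {
        if let Some(prev) = merged.last_mut() {
            if prev.block == next.block && next.lo <= prev.hi {
                // Same block and contiguous: extend the previous span.
                prev.hi = prev.hi.max(next.hi);
                continue;
            }
        }
        // Different block, or a gap in the source: start a new span.
        merged.push(next);
    }
    merged
}
```

Spans from different blocks are kept apart even when they are adjacent in the source, since they may have different execution counts.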
bcb_to_initial_coverage_spans() iterates through the BasicBlocks of the CoverageGraph (a subset of the MIR, essentially skipping panic/unwind paths), and their Statements and Terminators. Some Statements and Terminators are relevant to Coverage, and others are not. The Statement and Terminator filtering is handled by filtered_statement_span() and filtered_terminator_span(), respectively.
In almost all cases, if not filtered out, the initial coverage Span contributed by a Statement or a Terminator is its source_info.span (within the function body), because, in most cases, the source code span carried forward from the parsed source to its MIR representation is a fairly accurate mapping from intent to execution.
For example, filtered_terminator_span() uses the entire source_info.span for the following TerminatorKinds:
fn filtered_terminator_span(terminator: &'a Terminator<'tcx>, body_span: Span) -> Option<Span> {
    match terminator.kind {
        ...
        // Retain spans from all other terminators
        TerminatorKind::Resume
        | TerminatorKind::Abort
        | TerminatorKind::Return
        | TerminatorKind::Call { .. }
        | TerminatorKind::Yield { .. }
        | TerminatorKind::GeneratorDrop
        | TerminatorKind::FalseUnwind { .. }
        | TerminatorKind::InlineAsm { .. } => {
            Some(function_source_span(terminator.source_info.span, body_span))
        }
All other TerminatorKinds are filtered out, except for Goto.
Goto terminators play an important role in the control flow, so they cannot be filtered out, but their source_info.span typically includes the Spans of the statements that precede them, which makes the Span redundant in most cases.
One example of that role: `Goto`s are often the targets of `SwitchInt` branches, and certain important optimizations that replace some `Counter`s with `Expression`s require a separate `BasicCoverageBlock` for each branch, to support the `Counter` when one is needed.
Since a Goto-based CoverageSpan still needs a span to indicate whether a region of actual source code was executed, the span returned from filtered_terminator_span() for Gotos is an empty span, positioned at the Goto span's last byte position:
TerminatorKind::Goto { .. } => {
    Some(function_source_span(terminator.source_info.span.shrink_to_hi(), body_span))
}
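To make the effect of shrink_to_hi() concrete, here is a minimal sketch using a simplified stand-in for rustc's Span. `ByteSpan` is a hypothetical type that models a span as a half-open byte range; the real Span carries more information (such as syntax context), which is ignored here.

```rust
/// Hypothetical, simplified stand-in for rustc's `Span`:
/// a half-open byte range `[lo, hi)`.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
struct ByteSpan {
    lo: u32,
    hi: u32,
}

impl ByteSpan {
    /// Models the idea behind `Span::shrink_to_hi()`: an empty span
    /// positioned at the end of the original span.
    fn shrink_to_hi(self) -> ByteSpan {
        ByteSpan { lo: self.hi, hi: self.hi }
    }

    /// An empty span covers no source text.
    fn is_empty(self) -> bool {
        self.lo == self.hi
    }
}
```

Under this model, a Goto whose span covers bytes 10..42 contributes only the empty position 42..42, so it no longer overlaps the spans of the statements that precede it.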
This byte position can--most often--be leveraged to contribute to a CoverageSpan for certain execution branches.
For example, an if block without an else shows the block was executed if the condition was true, but there would be no way to indicate coverage (or lack thereof) of the false branch without using the associated Goto's hi() byte position (which is expanded by one character to the left, for a non-empty CoverageSpan).
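The one-character expansion can be sketched as follows. This is an illustration only: `visible_span` is a hypothetical helper, spans are modeled as `Range<u32>` byte offsets, and the sketch widens by one byte rather than one character, which differs for multi-byte text.

```rust
use std::ops::Range;

/// Hypothetical helper: if `span` is empty, extend it one byte to the left
/// so a coverage report has something visible to mark. Non-empty spans are
/// returned unchanged.
fn visible_span(span: Range<u32>) -> Range<u32> {
    if span.start == span.end {
        // `saturating_sub` avoids underflow for an empty span at offset 0.
        span.start.saturating_sub(1)..span.end
    } else {
        span
    }
}
```

For an if without an else, the widened span typically lands on the last character covered by the Goto's original span, which gives the report a place to show the count for the false branch.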
However, in other cases, a visible CoverageSpan is not wanted, but the Goto block must still be counted (for example, to contribute its count to an Expression that reports the execution count for some other block). In these cases, the code region is set to None.
This decision (whether to include a one-character coverage span for a Goto or to count a Goto block without a code region) is handled in inject_coverage_span_counters(), beginning with the call to is_code_region_redundant(), which encapsulates the decision on how to handle these special cases.
At the time of this writing, the only decision criterion looks for Goto terminators with spans that end at the last byte position in the file, because these Goto spans, if present, are redundant with the spans from every function's final Return terminator. When they are present, they can cause the function's last line to appear to have been executed twice, when it was executed only once.
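The criterion described above can be sketched roughly as below. This is not the signature or body of is_code_region_redundant(); `TermKind`, `is_redundant_goto`, and `code_region_for` are hypothetical names, and spans are modeled as `(lo, hi)` byte offsets.

```rust
/// Hypothetical, simplified terminator kinds; only those discussed here.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
enum TermKind {
    Goto,
    Return,
}

/// Sketch of the criterion: a `Goto` span that ends at the file's last byte
/// position is treated as redundant with the final `Return` span.
fn is_redundant_goto(kind: TermKind, span_hi: u32, file_end: u32) -> bool {
    kind == TermKind::Goto && span_hi == file_end
}

/// Returns the code region to report for a block. `None` means the block is
/// still counted, but no region is shown for it in the coverage report.
fn code_region_for(kind: TermKind, span: (u32, u32), file_end: u32) -> Option<(u32, u32)> {
    if is_redundant_goto(kind, span.1, file_end) {
        None
    } else {
        Some(span)
    }
}
```

Returning `None` rather than dropping the block keeps its counter available to any Expression that depends on it, while avoiding a second visible region on the function's last line.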