Fix perf regression in Read::read_to_end on short reads due to not checking if the cursor has initialized bytes #158083

Open
asder8215 wants to merge 1 commit into rust-lang:main from asder8215:default_read_to_end_mark_init

Conversation

@asder8215

@asder8215 asder8215 commented Jun 18, 2026

Contributor

This PR fixes #158008.

In particular, #150129 refactored some code in library/std/src/io/mod.rs to use BorrowedBuf::is_init instead of manually checking read_buf.init_len() == buf_len to see whether the read buffer was fully initialized. However, the BorrowedBuf is never marked as initialized within this function, and I think this portion of the code:

// SAFETY: These bytes were initialized but not filled in the previous loop
unsafe {
    read_buf.set_init(initialized);
}

was removed by mistake. This PR reverts the changes made by #150129 so that read_buf can be marked as initialized via BorrowedBuf::set_init when a previous iteration left initialized bytes in the cursor. That way, max_read_size is not set to usize::MAX when the read buffer already contains initialized bytes.
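
To make the effect on max_read_size concrete, here is a minimal sketch of the decision as I understand it. It is a simplified model, not the code in library/std/src/io/mod.rs: the function name `next_max_read_size`, its parameters, and the "more than one consecutive short read" threshold are assumptions made for illustration.

```rust
/// Hypothetical, simplified model of the read-size heuristic discussed above.
/// None of these names are taken from the standard library.
///
/// `current` is the current cap on a single read, `was_fully_initialized`
/// says whether the read buffer was known to be fully initialized before the
/// read, and `consecutive_short_reads` counts reads that returned fewer bytes
/// than the buffer could hold (the threshold of 1 is an assumption).
fn next_max_read_size(
    current: usize,
    was_fully_initialized: bool,
    consecutive_short_reads: u32,
) -> usize {
    if !was_fully_initialized && consecutive_short_reads > 1 {
        // The buffer does not look initialized, so the model stops
        // restricting the read size.
        usize::MAX
    } else {
        // The buffer is known to be initialized (or short reads are rare),
        // so the cap is kept.
        current
    }
}
```

In this model, if the buffer is never marked as initialized, `was_fully_initialized` stays false and the cap is dropped after the second short read, even though the bytes were in fact initialized in an earlier iteration.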

@rustbot rustbot added S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. T-libs Relevant to the library team, which will review and decide on the PR/issue. labels Jun 18, 2026
@rustbot

rustbot commented Jun 18, 2026

Collaborator

r? @Darksonn

rustbot has assigned @Darksonn.
They will have a look at your PR within the next two weeks and either review your PR or reassign to another reviewer.

Use r? to explicitly pick a reviewer

Why was this reviewer chosen?

The reviewer was selected based on:

  • Owners of files modified in this PR: @ChrisDenton, libs
  • @ChrisDenton, libs expanded to 12 candidates
  • Random selection from Darksonn, Mark-Simulacrum, clarfonthey, jhpratt

@asder8215 asder8215 changed the title Fix perf regression in Read::read_to_end on short reads due to not checking if the cursor has initialized bytes Fix perf regression in Read::read_to_end on short reads due to not checking if the cursor has initialized bytes Jun 18, 2026
Comment thread library/std/src/io/mod.rs
Comment on lines 484 to 485
// Note that we don't track already initialized bytes here, but this is fine
// because we explicitly limit the read size

@Darksonn Darksonn Jun 18, 2026

Member

This comment was added in #150129. Should it be removed?

Contributor Author

Unsure. Are these comments still relevant @a1phyr?

Contributor

Well, probably not if you start tracking initialized bytes :)

Comment thread library/std/src/io/mod.rs Outdated
Comment thread library/std/src/io/mod.rs
Comment thread library/std/src/io/mod.rs
@Darksonn

Member

@rustbot author

@rustbot rustbot added S-waiting-on-author Status: This is awaiting some action (such as code changes or more information) from the author. and removed S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. labels Jun 18, 2026
@rustbot

rustbot commented Jun 18, 2026

Collaborator

Reminder, once the PR becomes ready for a review, use @rustbot ready.

@Darksonn Darksonn added the A-io Area: `std::io`, `std::fs`, `std::net` and `std::path` label Jun 18, 2026
@asder8215 asder8215 force-pushed the default_read_to_end_mark_init branch from 104baaa to aebb8e1 Compare June 18, 2026 16:03
@asder8215 asder8215 requested a review from Darksonn June 18, 2026 16:05
@rustbot rustbot added S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. and removed S-waiting-on-author Status: This is awaiting some action (such as code changes or more information) from the author. labels Jun 18, 2026
@asder8215 asder8215 force-pushed the default_read_to_end_mark_init branch from aebb8e1 to c6406c5 Compare June 18, 2026 16:06

@a1phyr a1phyr left a comment

Contributor

I didn't track uninitialized bytes in my previous MR because I thought it would be too complicated to do properly.

For example, while this MR improves some existing cases, it will not really solve the pathological case you sent in your issue for larger sizes (e.g. around a million): when you have initialized N bytes, on the next round you will have N-1 spare initialized bytes left, so you won't be able to use set_init() (or you could initialize the rest manually).

All in all, it was a trade-off: keep the code simple and handle common cases properly, at the cost of suboptimal (but still acceptable) behavior in weird cases.
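
The N versus N-1 arithmetic can be illustrated with a toy cost model. This is only a sketch under stated assumptions: the function `bytes_zeroed` is hypothetical, every round is assumed to offer a read window of exactly `n` bytes, and every read is assumed to fill exactly `bytes_per_read` of them. It counts how many bytes would have to be zero-initialized in total, with and without carrying the leftover initialized bytes into the next round and topping up the remainder manually.

```rust
/// Toy model (hypothetical helper, not std code): total number of bytes that
/// must be zero-initialized over `rounds` reads.
///
/// Assumptions: each round uses a window of exactly `n` spare bytes, and each
/// read fills exactly `bytes_per_read` bytes, so the window slides forward by
/// that amount between rounds.
fn bytes_zeroed(n: usize, bytes_per_read: usize, rounds: usize, carry_over: bool) -> usize {
    let mut zeroed = 0;
    // Bytes at the start of the current window that are already initialized.
    let mut init_in_window = 0;
    for _ in 0..rounds {
        // Initialize only the part of the window that is not initialized yet.
        zeroed += n - init_in_window;
        // After the read, `bytes_per_read` initialized bytes became filled
        // bytes; the rest can be reused only if it is carried over.
        init_in_window = if carry_over {
            n.saturating_sub(bytes_per_read)
        } else {
            0
        };
    }
    zeroed
}
```

With `n = 1_000_000`, one byte per read and three rounds, the model gives 3,000,000 zeroed bytes without carry-over and 1,000,002 with it: the first round pays for the whole window, and each later round only tops up the one byte that slid into view.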

Comment thread library/std/src/io/mod.rs
}
};

initialized_len = cursor.capacity();

Contributor

This is true only if read_buf.is_init() (use the boolean below to avoid lifetime issues)

Comment thread library/std/src/io/mod.rs
let mut read_buf: BorrowedBuf<'_, u8> = spare.into();

let buf_unfilled_len = read_buf.capacity() - read_buf.len();
if initialized_len == buf_unfilled_len {

Contributor

This condition is wrong: you compare the old buffer capacity and the new buffer spare capacity, but the start of the buffer has changed since then. It would be less error prone to track initialized bytes counting from the beginning of the Vec.

Member

Yeah, please store the full capacity of the vector, including anything already written.
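
One way to read the two suggestions above is to keep a single high-water mark measured from the start of the Vec and derive everything else from it. The sketch below is an assumption about what that could look like, not the code in this PR: `initialized_spare`, `window_is_initialized`, and the parameter names are hypothetical, and it deliberately ignores what should happen to the mark when the Vec reallocates.

```rust
/// Hypothetical helper: how many bytes of spare capacity are already
/// initialized, given `init_high_water`, the offset from the start of the
/// Vec's allocation up to which bytes are known to be initialized.
fn initialized_spare(init_high_water: usize, vec_len: usize, vec_capacity: usize) -> usize {
    // Clamp to the capacity, then subtract the filled prefix. Bytes below
    // `vec_len` are filled, not spare, so they do not count.
    init_high_water.min(vec_capacity).saturating_sub(vec_len)
}

/// Hypothetical helper: true when a read window of `window` bytes starting at
/// `vec_len` lies entirely within the initialized region.
fn window_is_initialized(
    init_high_water: usize,
    vec_len: usize,
    vec_capacity: usize,
    window: usize,
) -> bool {
    // The window can never be larger than the spare capacity.
    let spare = vec_capacity.saturating_sub(vec_len);
    let window = window.min(spare);
    initialized_spare(init_high_water, vec_len, vec_capacity) >= window
}
```

Because the mark is an absolute offset, it does not move when the filled length grows, so the comparison no longer mixes a length measured from the old start of the spare region with one measured from the new start. For example, with a mark of 32 and a filled length of 1, a 31-byte window counts as initialized and a 32-byte window does not.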

@Darksonn

Member

@rustbot author

@rustbot rustbot added S-waiting-on-author Status: This is awaiting some action (such as code changes or more information) from the author. and removed S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. labels Jun 20, 2026

Labels

A-io Area: `std::io`, `std::fs`, `std::net` and `std::path` S-waiting-on-author Status: This is awaiting some action (such as code changes or more information) from the author. T-libs Relevant to the library team, which will review and decide on the PR/issue.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Read::read_to_end performance regression

4 participants