Fix perf regression in Read::read_to_end on short reads due to not checking if the cursor has initialized bytes#158083
Conversation
|
r? @Darksonn rustbot has assigned @Darksonn. Use Why was this reviewer chosen?The reviewer was selected based on:
|
Read::read_to_end on short reads due to not checking if the cursor has initialized bytes
| // Note that we don't track already initialized bytes here, but this is fine | ||
| // because we explicitly limit the read size |
There was a problem hiding this comment.
This comment was added in #150129. Should it be removed?
There was a problem hiding this comment.
Unsure. Are these comments still relevant @a1phyr?
There was a problem hiding this comment.
Well, probably not if you start tracking initialized bytes :)
|
@rustbot author |
|
Reminder, once the PR becomes ready for a review, use |
104baaa to
aebb8e1
Compare
…ecking if the cursor has initialized bytes
aebb8e1 to
c6406c5
Compare
There was a problem hiding this comment.
I didn't track uninitialized bytes in my previous MR because I thought it would be to complicated to do properly.
For example, if this MR improve some existing cases, it will not really solve the pathological case you sent in your issue for larger sizes (eg around a million): when you initialized N bytes, on the next round you will have N-1 spare initialized bytes left, so you won't be able to use set_init() (or you could initialize the rest manually).
All in all, it was a trade-off between code complexity, properly handling common cases but having suboptimal (but still acceptable) behavior in weird cases.
| } | ||
| }; | ||
|
|
||
| initialized_len = cursor.capacity(); |
There was a problem hiding this comment.
This is true only if read_buf.is_init() (use the boolean below to avoid lifetime issues)
| let mut read_buf: BorrowedBuf<'_, u8> = spare.into(); | ||
|
|
||
| let buf_unfilled_len = read_buf.capacity() - read_buf.len(); | ||
| if initialized_len == buf_unfilled_len { |
There was a problem hiding this comment.
This condition is wrong: you compare the old buffer capacity and the new buffer spare capacity, but the start of the buffer has changed since then. It would be less error prone to track initialized bytes counting from the beginning of the Vec.
There was a problem hiding this comment.
Yeah, please store the full capacity of the vector, including anything already written.
|
@rustbot author |
This PR fixes #158008.
In particular, in #150129, it refactored some code within
library/std/io/mod.rsto utilizeBorrowedBuf::is_initinstead of manually checkingread_buf.init_len() == buf_lento see if the read buffer had initialized bytes. However, theBorrowedBufis never marked or set as init within this function, and I think this portion of the code:was removed by mistake. This PR reverts the changes made by #150129, so that we can mark the
BorrowedBuf/read_bufas initialized usingBorrowedBuf::set_initif in a previous iteration the cursor has initialized bytes. This would allowmax_read_sizeto not be marked asusize::maxif the read buffer contains initialized bytes.