It turns out multi-step backpropaganda is better.
This paper has a beautiful way of improving backpropagation: one iteration cleanly recovers backprop, and multiple iterations give a preconditioned update.
Replying to @LinYorker @ryu0000000001 and @weijie444
arxiv.org/abs/2106.06199
Same update here


















