Currently only "greedy" sampling is implemented (the token with the highest probability is selected). Implement other sampling methods; some options are:

* top-p
* top-k
* temperature (here is an example of how it could be done: https://github.com/jaymody/picoGPT/pull/19)
* categorical sampling
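
To make the request more concrete, here is a rough sketch of how temperature, top-k, top-p, and categorical sampling could fit together. It is not taken from the linked PR or from the existing code: the function names (`softmax`, `top_k_filter`, `top_p_filter`, `sample_token`) and their parameters are hypothetical, treating `temperature=0` as greedy is an assumed convention, and it uses only the Python standard library so it runs on its own. A real implementation would probably operate on arrays instead of lists.

```python
import math
import random


def softmax(logits, temperature=1.0):
    """Turn logits into probabilities. temperature < 1 sharpens, > 1 flattens."""
    scaled = [x / temperature for x in logits]
    m = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def _renormalize(probs, keep):
    """Zero out every index not in `keep`, then rescale so the rest sums to 1."""
    kept = [p if i in keep else 0.0 for i, p in enumerate(probs)]
    total = sum(kept)
    return [p / total for p in kept]


def top_k_filter(probs, k):
    """Keep only the k most probable tokens."""
    order = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    return _renormalize(probs, set(order[:k]))


def top_p_filter(probs, p):
    """Keep the smallest set of top tokens whose cumulative probability reaches p."""
    order = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    keep, cumulative = set(), 0.0
    for i in order:
        keep.add(i)
        cumulative += probs[i]
        if cumulative >= p:
            break
    return _renormalize(probs, keep)


def sample_token(logits, temperature=1.0, top_k=None, top_p=None, rng=random):
    """Pick a token id from logits (hypothetical helper, not an existing API)."""
    if temperature == 0:
        # Assumed convention: temperature 0 falls back to greedy selection.
        return max(range(len(logits)), key=lambda i: logits[i])
    probs = softmax(logits, temperature)
    if top_k is not None:
        probs = top_k_filter(probs, top_k)
    if top_p is not None:
        probs = top_p_filter(probs, top_p)
    # Categorical sampling: draw one index in proportion to its probability.
    return rng.choices(range(len(probs)), weights=probs, k=1)[0]
```

With no filters and `temperature=1.0`, `sample_token` is plain categorical sampling; passing `rng=random.Random(seed)` makes the draws reproducible.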