Qwik News
Batched reward model inference and Best-of-N sampling
33 points by rawsh 4 days ago | 0 comments