https://myswamp.substack.com/p/improving-accessibility-using...
Maybe I'll redo it and add in 1.5 Flash-8B; it's so cheap it doesn't hurt to add it lol.
https://substack.com/profile/107132439-michael-barajas/note/...
> To make this model as useful as we can, we are doubling the 1.5 Flash-8B rate limits, meaning developers can send up to 4,000 requests per minute (RPM).
You can even compare the rate limits here: https://ai.google.dev/pricing
Most editors can easily support LLMs via a fill-in-the-middle (FIM) operation mode; see the sketch below.
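
A minimal sketch of what that looks like: the editor splits the buffer at the cursor and wraps the two halves in sentinel tokens before sending them to a code model. The <PRE>/<SUF>/<MID> sentinels here follow the Code Llama convention and are only an assumption for illustration; other FIM-capable models use different markers, so check the model's documentation.

    # Build a fill-in-the-middle prompt from the text around the cursor.
    # Sentinel tokens are the Code Llama style; adjust per model.
    def build_fim_prompt(before_cursor: str, after_cursor: str) -> str:
        """Assemble prefix and suffix into a single FIM prompt string."""
        return f"<PRE> {before_cursor} <SUF>{after_cursor} <MID>"

    prompt = build_fim_prompt(
        before_cursor="def add(a, b):\n    return ",
        after_cursor="\n\nprint(add(1, 2))\n",
    )
    print(prompt)  # the completion endpoint is asked to fill in the middle

The editor then inserts the model's completion at the cursor position, which is why FIM works so naturally for inline suggestions.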
But I do wonder: how well does Gemini 1.5 Pro / Flash recall from the context window? For example, back when both ChatGPT and Claude offered an 8k context window, Claude was still far ahead at recalling what you'd said, while ChatGPT tended to forget tokens after a while, so you had to remind it.
As for the recall performance, I can't really speak from my own experience; you should try it yourself :) A quick needle-in-a-haystack check like the one below is an easy way to do that.
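
A rough sketch of such a check, assuming the google-generativeai Python package and an API key in an environment variable (both are setup assumptions, not something from this thread): bury one fact deep inside filler text and ask the model to retrieve it.

    # Rough needle-in-a-haystack recall check for Gemini 1.5 Flash.
    # Assumes: pip install google-generativeai, GEMINI_API_KEY set.
    import os
    import google.generativeai as genai

    genai.configure(api_key=os.environ["GEMINI_API_KEY"])
    model = genai.GenerativeModel("gemini-1.5-flash")

    # Hide a "needle" fact inside a long wall of filler text.
    filler = "The quick brown fox jumps over the lazy dog. " * 2000
    needle = "The secret passphrase is 'violet-asteroid-42'."
    prompt = filler + needle + filler + "\n\nWhat is the secret passphrase?"

    response = model.generate_content(prompt)
    print(response.text)  # should mention 'violet-asteroid-42' if recall holds

Varying the filler length and where the needle sits (start, middle, end) gives a feel for how recall degrades across the context window.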