👋 Jan
1,591 posts
Jan is one agent for (almost) everything. Community: discord.gg/Exe46xPMbK
- Introducing Jan-v1: a 4B model for web search, an open-source alternative to Perplexity Pro. In our evals, Jan-v1 delivers 91% SimpleQA accuracy, slightly outperforming Perplexity Pro while running fully locally.
  Use cases:
  - Web search
  - Deep Research
  Built on the new version
- Google has quietly open-sourced a full-stack research agent, powered by Gemini and LangGraph. It's capable of multi-step web search, reflection, and synthesis. While not confirmed to match Gemini's production backend, it's strikingly close.
- NVIDIA just released Llama-Nemotron-Nano-VL-8B-V1, an 8B vision model that reads dense documents, charts, and video frames. It's #1 on OCRBench V2 (English), with layout and OCR fused end-to-end.
- Microsoft releases a new dataset that improves Qwen2.5-7B from 17.4% to 57.3% on LiveCodeBench. It's called rStar-Coder: 418K tasks designed to push competitive code reasoning. A 7B model trained on it outperforms QwQ-32B on the USA Computing Olympiad.
- Open-source voice cloning at 16x real-time? Chatterbox TTS (0.5B Llama) now runs on vLLM, 5–10x faster than the original implementation. On a 3090:
  - 40 min speech in ~2.5 min
  - Same quality, way faster
- Someone got DeepSeek-R1-0528-Qwen3-8B running on an iPhone 16 Pro using MLX. It runs but takes ages to respond, and the phone gets hot fast. 8B models on phones aren't sci-fi anymore. via u/adrgrondin on r/LocalLLaMA
- Get your free Perplexity-style search agent in 2 mins. Use Jan-v1 and match the settings in this video.
- This is interesting. One dev is training an AI from scratch on books from 1800s London. It's called TimeCapsuleLLM, not a fine-tuned modern model, but one trained entirely on historical data. No modern language or context. Built on nanoGPT by @karpathy.
- DeepSeek R1.1 just matched Claude Opus on Aider's polyglot benchmark - 70.7% Pass@2. Old R1 scored 56.9%, so this is a +13.8pt jump. Same test, same setup, posted by a user on r/LocalLLaMA. Cost to run: ~$3 off-peak.
- Qwen3-30B-A3B local settings guide.
  - With thinking: Temp 0.6, TopP 0.95, TopK 20, 32,768 tokens max
  - Without thinking: Temp 0.7, TopP 0.8, TopK 20
  Switch modes with /think or /no_think in prompts, or enable_thinking=False in code. Source: @Alibaba_Qwen
- GRPO-tuned Qwen 32B matches Claude 3.7 Sonnet on deductive reasoning tasks! Outperforms DeepSeek R1, o1, and o3-mini on "Temporal Clue" puzzles at 100x lower inference cost. Click "Use this model" on @huggingface and select 👋 Jan to run it locally: huggingface.co/bartowski/Open…
- How to run AI models locally?
  -> Go to 🤗 @huggingface
  -> Grab the GGUF model link
  -> Drop it into 👋 Jan Hub
  That's all there is to it.
- DeepSeek just dropped official recommendations on how to run their models effectively! Here's a quick breakdown of what you need to know: 🧵
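
The Qwen3-30B-A3B settings from the guide above can be written down as two sampling presets. This is only a sketch: the numbers come from the post, but `SamplingPreset`, `build_request`, and the request keys (`temperature`, `top_p`, `top_k`, `max_tokens`) are hypothetical names chosen for illustration, not the API of Jan or any particular runtime. The mode switch is done by appending /think or /no_think to the prompt, as the post describes.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class SamplingPreset:
    """Hypothetical container for the sampling values quoted in the post."""
    temperature: float
    top_p: float
    top_k: int
    max_tokens: Optional[int] = None  # None: the post gives no limit for this mode


# Values copied from the post; the constant names are illustrative.
THINKING = SamplingPreset(temperature=0.6, top_p=0.95, top_k=20, max_tokens=32768)
NON_THINKING = SamplingPreset(temperature=0.7, top_p=0.8, top_k=20)


def build_request(prompt: str, thinking: bool) -> dict:
    """Pick the preset for the mode and append the matching soft switch."""
    preset = THINKING if thinking else NON_THINKING
    switch = "/think" if thinking else "/no_think"
    request = {
        "prompt": f"{prompt} {switch}",
        "temperature": preset.temperature,
        "top_p": preset.top_p,
        "top_k": preset.top_k,
    }
    if preset.max_tokens is not None:
        request["max_tokens"] = preset.max_tokens
    return request


if __name__ == "__main__":
    print(build_request("Explain what a GGUF file is.", thinking=False))
```

Map the dictionary keys onto whatever your local runtime actually expects; key names differ between servers, so treat these as placeholders.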

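
Two figures quoted in the posts above can be checked with simple arithmetic: the "16x real-time" claim for Chatterbox TTS (40 min of speech in ~2.5 min) and the "+13.8pt" jump on Aider's polyglot benchmark (56.9% to 70.7%). The helper names below are made up for this check; the inputs are the numbers from the posts.

```python
def realtime_factor(audio_minutes: float, wall_minutes: float) -> float:
    """Minutes of audio generated per minute of wall-clock time."""
    return audio_minutes / wall_minutes


def point_gain(new_pct: float, old_pct: float) -> float:
    """Difference between two percentage scores, in points (1 decimal)."""
    return round(new_pct - old_pct, 1)


if __name__ == "__main__":
    print(realtime_factor(40, 2.5))   # 16.0
    print(point_gain(70.7, 56.9))     # 13.8
```

Since the post gives the generation time as approximate (~2.5 min), the 16x figure is approximate as well.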




