pcuenca/LlamaLanguageModels

LlamaLanguageModels

Foundation Models API for llama.cpp.

Leverages the LanguageModel and LanguageModelExecutor protocols introduced at WWDC 2026 to offer the same Foundation Models API for llama.cpp models downloaded from Hugging Face. Built on the experimental LlamaKit.

Usage

import FoundationModels
import LlamaLanguageModels

let model = LlamaLanguageModel(
    modelIdentifier: "Qwen/Qwen2.5-0.5B-Instruct-GGUF:Q4_K_M"
)
let session = LanguageModelSession(model: model)
let response = try await session.respond(to: "Who are you?")
print(response.content)
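
The identifiers in this README look like a Hugging Face repository followed by an optional quantization tag after a colon (for example `Q4_K_M`). The sketch below shows one way to split such a string before passing it on. It is only an illustration: `ModelReference` and `parseModelIdentifier` are hypothetical names that are not part of LlamaLanguageModels, and the `owner/repo:quantization` layout is an assumption inferred from the examples here, not a documented format.

```swift
/// Hypothetical value type for illustration; not part of LlamaLanguageModels.
struct ModelReference: Equatable {
    let repository: String     // e.g. "Qwen/Qwen2.5-0.5B-Instruct-GGUF"
    let quantization: String?  // e.g. "Q4_K_M"; nil when no tag is given
}

/// Splits an identifier assumed to look like "owner/repo" or "owner/repo:quantization".
/// Returns nil when the repository part is not exactly "owner/repo".
func parseModelIdentifier(_ identifier: String) -> ModelReference? {
    // Split on the first ":" only; everything after it is treated as the tag.
    let parts = identifier.split(separator: ":", maxSplits: 1, omittingEmptySubsequences: false)
    guard let repo = parts.first, repo.split(separator: "/").count == 2 else {
        return nil
    }
    // An empty tag (a trailing ":") is treated the same as no tag.
    let quantization = parts.count == 2 && !parts[1].isEmpty ? String(parts[1]) : nil
    return ModelReference(repository: String(repo), quantization: quantization)
}

if let reference = parseModelIdentifier("Qwen/Qwen2.5-0.5B-Instruct-GGUF:Q4_K_M") {
    print(reference.repository, reference.quantization ?? "(no tag)")
}
```

A check like this can reject a malformed identifier early, before any download is attempted.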

Example CLI: fm_llama

Sources/fm_llama/ contains a minimal REPL that streams responses.

swift run fm_llama ggml-org/gemma-4-26B-A4B-it-GGUF:Q4_K_M "Respond in verse"

Requirements

  • Swift 6.4+
  • macOS 27+ / iOS 27+ / Xcode 27.0 beta

To Do

  • Default model generation parameters
  • Faithful token counts
  • Tool calling
  • Reasoning
  • Constrained generation
