octagono
Tag: inference (5 posts)

  • Microsoft BitNet 1.58: The Era of 1-Bit Large Language Models
    April 25, 2026
  • TurboQuant: Online Vector Quantization with Near-optimal Distortion Rate
    April 18, 2026
  • SGLang: Structured Generation Language for Efficient LLM Serving
    April 11, 2026
  • vLLM: High-Throughput LLM Inference at Scale
    April 9, 2026
  • Ollama: Run Local LLMs on Your Own Hardware
    April 8, 2026
© 2026 octagono