Horizon Alert
Summary of the vulnerability and why it matters
A vulnerability in SGLang could allow an attacker to execute arbitrary code on the server. It is triggered when SGLang loads a malicious model file whose tokenizer chat template is rendered in an unsandboxed Jinja2 environment. Successful exploitation can lead to complete compromise of the affected server.
- Critical severity, exploitable over the network.
- Could impact AI services and applications.
- Requires loading a malicious model file.
Attack Path
How an attacker could exploit the issue
An attacker can achieve remote code execution by tricking the SGLang service into loading a malicious model file. This file contains a specially crafted tokenizer chat template that, when processed by the unsandboxed Jinja2 environment, allows the attacker to run arbitrary code on the server.
- No authentication required.
- Targets the `/v1/rerank` endpoint.
- Requires loading a malicious model.
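The difference between an unsandboxed and a sandboxed Jinja2 environment can be illustrated with a harmless probe template. This is a minimal sketch of the general class of issue, not SGLang's code: the `PROBE` string and both helper functions are hypothetical, and it assumes the `jinja2` package is installed. Whether the linked patch takes exactly this approach is not stated here.

```python
from jinja2 import Environment
from jinja2.exceptions import SecurityError
from jinja2.sandbox import ImmutableSandboxedEnvironment

# Harmless probe (hypothetical): walks from a string literal into Python's
# class hierarchy. Template-injection payloads typically start the same way.
PROBE = "{{ ''.__class__.__mro__ }}"


def render_unsandboxed(source: str) -> str:
    """Render with a plain Environment, which allows dunder attribute access."""
    return Environment().from_string(source).render()


def render_sandboxed(source: str) -> str:
    """Render with a sandboxed environment, which rejects unsafe attributes."""
    try:
        return ImmutableSandboxedEnvironment().from_string(source).render()
    except SecurityError as exc:
        return f"blocked: {exc}"


print(render_unsandboxed(PROBE))  # exposes the class hierarchy of str
print(render_sandboxed(PROBE))    # message starts with "blocked:"
```

The plain `Environment` lets template code reach Python internals, which is the foothold a crafted chat template needs. The sandboxed variant raises `SecurityError` on the same input.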
Live Threat
Current exploitation, exposure, and threat context
The flaw allows remote code execution (RCE) when SGLang loads a model with a malicious tokenizer chat template, because the template is rendered in an unsandboxed Jinja2 environment. Although the impact is severe, attacker interest may be limited by the need to get a crafted model file loaded and by the likelihood of detection in production LLM inference environments.
- RCE vulnerability in an LLM inference framework.
- Requires loading a crafted model file.
- No public exploit code observed yet.
Priority actions
Operational Fix
Recommended remediation, mitigation, and detection steps
A patch is available; apply it to all affected systems immediately. Until patching is complete, block or isolate network-exposed services that use SGLang's `/v1/rerank` endpoint, especially where model files are loaded from untrusted sources.
- Apply patch from [https://github.com/sgl-project/sglang/pull/23660](https://github.com/sgl-project/sglang/pull/23660).
- If patching is delayed, restrict network access to the rerank endpoint.
- Monitor for suspicious requests to the `/v1/rerank` endpoint.
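Where models come from untrusted sources, a pre-load check on the chat template text can serve as one extra detection layer. The sketch below is a heuristic only and assumes the template has already been extracted from the model file as a string. The pattern list and the function name `find_suspicious_constructs` are hypothetical. A determined attacker can evade pattern matching, so this does not replace the patch.

```python
import re

# Hypothetical deny-list: constructs that are rare in legitimate chat
# templates but common in Jinja2 template-injection payloads.
SUSPICIOUS_PATTERNS = {
    "dunder attribute": re.compile(r"__\w+__"),
    "attr() filter": re.compile(r"\|\s*attr\s*\("),
    "process or eval call": re.compile(r"\b(?:popen|subprocess|eval|exec)\b|os\.system"),
}


def find_suspicious_constructs(template: str) -> list[str]:
    """Return the names of the deny-list patterns found in a chat template."""
    return [
        name
        for name, pattern in SUSPICIOUS_PATTERNS.items()
        if pattern.search(template)
    ]


benign = "{% for m in messages %}{{ m['role'] }}: {{ m['content'] }}\n{% endfor %}"
crafted = "{{ ''.__class__.__mro__ }}"

print(find_suspicious_constructs(benign))   # []
print(find_suspicious_constructs(crafted))  # ['dunder attribute']
```

A non-empty result is a reason to quarantine the model file for manual review, not proof of malice. An empty result is not proof of safety.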