HIGH: LMDeploy CVE-2026-33626 SSRF Exploited Within Hours of Disclosure
A server-side request forgery flaw in the LMDeploy vision-language pipeline was under active exploitation twelve hours and thirty-one minutes after public disclosure, and no upstream patch has shipped yet. Attackers are already probing exposed instances for AWS metadata credentials, Redis on localhost, and loopback services.
If you needed proof that attackers are now watching AI infrastructure advisories the same way they watch Microsoft Patch Tuesday, this week delivered it. A newly disclosed server-side request forgery vulnerability in LMDeploy, the open-source toolkit for compressing, deploying, and serving large language models, was under active exploitation within twelve hours and thirty-one minutes of public disclosure. By the time most security teams had even read the advisory, someone was already probing exposed instances for cloud credentials.
The vulnerability, tracked as CVE-2026-33626, carries a CVSS score of 7.5 and affects all LMDeploy releases up to and including version 0.12.0 that ship with vision-language support. The flaw lives in a single function, load_image() in lmdeploy/vl/utils.py, which fetches arbitrary URLs to pull images into multimodal workflows without validating whether those URLs resolve to internal or private IP addresses. That missing check turns a routine image loader into a generic HTTP client operating from inside your production environment, which is exactly the kind of primitive attackers dream about.
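The vulnerable shape is easy to picture even without the upstream source. The sketch below is a hypothetical reconstruction of the pattern the advisory describes, not the actual lmdeploy code: an attacker-supplied URL arrives in the request body and is fetched as-is.

```python
from urllib.request import urlopen

def load_image_naive(image_url: str, opener=urlopen) -> bytes:
    """Hypothetical reconstruction of the vulnerable pattern, NOT the
    actual lmdeploy source. There is no scheme allowlist and no check
    on where the URL resolves, so http://169.254.169.254/...,
    http://127.0.0.1:6379/, or even a file:// URL is fetched without
    complaint."""
    with opener(image_url) as resp:  # attacker fully controls image_url
        return resp.read()
```

Because urlopen happily handles file:// as well as http://, the same primitive that loads a JPEG can read local files or poke loopback services, which is why the scheme and destination restrictions discussed later in the article matter so much.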
Igor Stepansky of Orca Security gets credit for finding and reporting the issue. InternLM, the research group behind LMDeploy, acknowledged the vulnerability through GitHub Security Advisory GHSA-6w67-hwm5-92mq. As of this writing the advisory lists no patched version. That is not a typo. Almost two days after disclosure there is still no fixed release, which means every team running a vision-language model through LMDeploy is currently exposed and dependent on workaround mitigations rather than a version bump.
The real story here is not just the vulnerability but the speed of the response from the other side. Sysdig's threat research team captured exploitation attempts on April 22, 2026 at 03:35 UTC, roughly twelve and a half hours after the advisory went live. The source was a single IP, 103.116.72.119, and the entire active phase lasted about eight minutes and consisted of ten distinct requests. What the attacker did inside those eight minutes reads like a textbook enumeration sweep:
- They probed AWS Instance Metadata Service endpoints, the classic first stop for anyone who lands SSRF on cloud-hosted workloads and wants to grab temporary IAM credentials.
- They looked for Redis instances exposed on localhost.
- They tested external egress by pointing the vulnerable server at a DNS canary hosted on requestrepo.com, a tell that the attacker wanted to confirm outbound connectivity before committing to a heavier payload.
- They port scanned the loopback interface.
- And to keep from painting too obvious a signature, they rotated between model identifiers, swapping among internlm-xcomposer2 and OpenGVLab/InternVL2-8B so their requests looked like legitimate multimodal workloads rather than identical scripted calls.
None of that tradecraft is sophisticated. It is, however, exactly what you would expect from a threat actor who has pre-built tooling ready for SSRF opportunities in any newly announced framework, and who is running monitored feeds of vulnerability disclosures. The twelve-and-a-half-hour window between advisory and attack is not a coincidence. It is the new normal for anything with an exposed HTTP surface.
Understanding why SSRF is such a reliable path to cloud account compromise matters here because this is not an LMDeploy-specific weakness in principle, even though the exact bug is. On AWS EC2 instances the legacy IMDSv1 metadata service responds to any HTTP request originating from inside the instance at 169.254.169.254 and returns temporary session credentials tied to the instance profile role. Any process running on that instance can ask, and any SSRF bug that lets an external attacker coerce the server into making an HTTP request can effectively ask as well. Google Cloud and Azure expose similar metadata endpoints with similar risks when token protections are not enabled. Once the attacker has an access key and a session token, the rest of the attack is just API calls, and at that point your detection strategy needs to pivot into cloud audit logs rather than host-based telemetry.
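The difference between the two IMDS generations is easy to see from a shell on the instance itself. The commands below are illustrative (the instance ID is a placeholder, and the metadata endpoint is only reachable from inside AWS):

```shell
# IMDSv1: a single GET is enough -- exactly the request shape an SSRF
# primitive can forge on the attacker's behalf.
curl http://169.254.169.254/latest/meta-data/iam/security-credentials/

# IMDSv2: a token must first be obtained via PUT with a custom header,
# which a bare GET-based SSRF cannot issue.
TOKEN=$(curl -s -X PUT "http://169.254.169.254/latest/api/token" \
  -H "X-aws-ec2-metadata-token-ttl-seconds: 21600")
curl -s -H "X-aws-ec2-metadata-token: $TOKEN" \
  http://169.254.169.254/latest/meta-data/iam/security-credentials/

# Enforce IMDSv2 (disable v1 entirely) on an existing instance;
# i-0123456789abcdef0 is a placeholder ID.
aws ec2 modify-instance-metadata-options \
  --instance-id i-0123456789abcdef0 \
  --http-tokens required --http-endpoint enabled
```

The PUT-plus-header handshake is the whole defense: most SSRF bugs, this one included, let the attacker control a URL but not the HTTP method or request headers.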
Who is actually exposed to this? Anyone running LMDeploy in production with its vision-language features enabled, which is a larger population than you might expect. LMDeploy is one of the more popular toolkits in the Chinese and international open-source LLM community and shows up in self-hosted inference stacks, on-prem deployments built by companies that did not want OpenAI or Anthropic in their data flow, research lab clusters, and a growing number of model-as-a-service startups that needed a quick way to stand up InternLM, InternVL, or other multimodal models behind a REST API. Many of those deployments sit on EC2, GKE, or AKS nodes with the default instance metadata service still reachable, and a non-trivial number are exposed to the public internet either directly or through thin authentication layers.
Because no upstream patch exists yet, the mitigation burden sits entirely on the operator. The advisory's guidance is the right starting point:
- Block outbound fetches to the RFC 1918 ranges of 10.0.0.0/8, 172.16.0.0/12, and 192.168.0.0/16, along with 127.0.0.0/8 loopback, 169.254.0.0/16 link-local (which covers the AWS and GCP metadata services), and any other internal subnets you care about.
- Restrict the allowed URL schemes to HTTP and HTTPS so an attacker cannot pivot into file, gopher, or dict protocol tricks.
- Resolve hostnames before fetching and recheck the resolved IP against your blocklist, because otherwise an attacker will register a domain that resolves to 169.254.169.254 and waltz past naive string matching.
- If you cannot modify the code path quickly, put a forward proxy between the LMDeploy workers and the internet, and make that proxy enforce the allowlist.
- On AWS specifically, enforce IMDSv2 on every instance running inference workloads. IMDSv2 requires a session token obtained through a PUT request that an SSRF payload cannot easily forge. Enforcing IMDSv2 alone blunts most of the damage this specific attacker was trying to do.
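A minimal version of that resolve-then-check logic can be sketched with the Python standard library. The names here are mine, not from lmdeploy, and this is a starting point rather than a complete guard: a production implementation also needs to pin the actual connection to the validated IP, or an attacker can win the race with DNS rebinding between check and fetch.

```python
import ipaddress
import socket
from urllib.parse import urlparse

# Destinations an image fetcher should never reach. Illustrative list;
# extend with any internal subnets specific to your environment.
BLOCKED_NETS = [ipaddress.ip_network(n) for n in (
    "10.0.0.0/8", "172.16.0.0/12", "192.168.0.0/16",   # RFC 1918
    "127.0.0.0/8",                                      # loopback
    "169.254.0.0/16",      # link-local, incl. cloud metadata services
    "::1/128", "fc00::/7", "fe80::/10",                 # IPv6 equivalents
)]

def url_is_safe(url: str) -> bool:
    """Allow only http(s) URLs whose *resolved* addresses are public."""
    parsed = urlparse(url)
    if parsed.scheme not in ("http", "https") or not parsed.hostname:
        return False  # kills file://, gopher://, dict://, bare paths
    try:
        infos = socket.getaddrinfo(parsed.hostname, parsed.port or 80,
                                   proto=socket.IPPROTO_TCP)
    except socket.gaierror:
        return False  # unresolvable: refuse rather than guess
    for info in infos:
        # Strip any IPv6 zone id (e.g. "fe80::1%eth0") before parsing.
        addr = ipaddress.ip_address(info[4][0].split("%")[0])
        if any(addr in net for net in BLOCKED_NETS):
            return False  # every resolved address must be public
    return True
```

Note that the check rejects a URL if any resolved address is internal, which is the safe default for multi-record DNS answers.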
Detection is worth thinking about too. If your inference hosts have egress logging, look for outbound connections from the LMDeploy process to 169.254.169.254, RFC 1918 ranges, or 127.0.0.1 on ports you do not normally serve on. The Sysdig-reported attacker used DNS lookups to requestrepo.com as a canary, so any DNS resolution of that domain from a model-serving host is effectively a confirmed probe. Rapid rotation of model identifiers inside a short burst of requests, especially where the inputs are suspicious image URLs pointing at internal addresses, is another behavioral indicator worth alerting on. If you run cloud audit logging, correlate any unexpected use of instance profile credentials from IPs other than the instance's own with recent activity from your inference fleet, which is how you will catch the late-stage exploitation even when you missed the initial SSRF.
For MSPs, this incident is a reminder that the AI infrastructure your customers are standing up, often without telling you, is now firmly inside your threat model. A lot of mid-market shops spun up self-hosted LLM stacks in the last eighteen months on the theory that it was safer than sending data to a third party, and nobody ever folded those stacks into the vulnerability management program. This is a very sellable conversation. Offer an AI infrastructure assessment service that inventories self-hosted inference platforms, maps them to the cloud identities they can reach, verifies IMDSv2 enforcement, and produces an SSRF-hardening plan. Attach continuous vulnerability scanning and egress monitoring to it and you have a recurring revenue line that did not exist two years ago. Customers who balked at paying for traditional pentesting will open the checkbook for AI-specific security work, and CVE-2026-33626 is the talking point that gets you in the door.
In the meantime, if you run LMDeploy with vision support, treat this as drop-everything work. The patch has not shipped, the exploit is live, and the people writing the exploit already know exactly what they want when they land it.
References
- The Hacker News coverage of CVE-2026-33626
https://thehackernews.com/2026/04/lmdeploy-cve-2026-33626-flaw-exploited.html
- GitHub Security Advisory GHSA-6w67-hwm5-92mq
https://github.com/InternLM/lmdeploy/security/advisories/GHSA-6w67-hwm5-92mq
- NVD CVE-2026-33626
https://nvd.nist.gov/vuln/detail/CVE-2026-33626