2018 IEEE Third International Conference on Data Science in Cyberspace

A Novel Approach for Detecting Browser-based Silent Miner

Jingqiang Liu
Institute of Information Engineering, Chinese Academy of Sciences
Beijing, China
[email protected]

Zihao Zhao
1Institute of Information Engineering, Chinese Academy of Sciences; 2School of Software, Nanchang University
{1Beijing, 2Nanchang}, China
[email protected]

Xiang Cui*
Cyberspace Institute of Advanced Technology, Guangzhou University
Guangzhou, China
[email protected]

Zhi Wang
1Institute of Information Engineering, Chinese Academy of Sciences; 2School of Cyber Security, University of Chinese Academy of Sciences
Beijing, China
[email protected]

Qixu Liu
1Institute of Information Engineering, Chinese Academy of Sciences; 2School of Cyber Security, University of Chinese Academy of Sciences
Beijing, China
[email protected]

Abstract— Cryptocurrency trading has been extremely active in recent years, and cryptocurrency prices have climbed steadily. Hackers have turned their attention to cryptocurrencies and have used various means to acquire them illegally, causing huge losses to victims. Some browsers block malicious mining activity at the network protocol level, but they cannot detect the mining samples themselves, and it is difficult to detect homologous (same-origin) mining samples effectively at the network layer. To solve these problems, we analyze the features of browser-based silent mining based on its attack pattern and propose a method to detect silent mining behavior in the browser. The method drives known malicious mining samples, extracts heap snapshots and stack code features from a dynamically running browser, and performs automated detection based on a recurrent neural network. By modifying the kernel code of Chrome, we designed and implemented BMDetector, a prototype system for detecting browser-based silent miners. With 1159 samples detected and analyzed, experimental results show a recognition rate of 98% for original mining samples and 92% for encrypted and obfuscated samples, which indicates that the method is effective and feasible.

Keywords—Cryptocurrency, Miner, Browser, Dynamic detection, RNN

I. INTRODUCTION

Thanks to the decentralized mechanism of blockchain[1], cryptocurrency is favored by many industries. Many blockchain-based cryptocurrencies have emerged, such as Bitcoin[2], Ethereum[3], Monero[4], and Litecoin[5]. The price of Bitcoin has increased millions of times since its birth[6]. These cryptocurrencies are not issued by a specific currency issuer but are derived from a large number of computations based on specific algorithms. Miners perform a hash collision to generate a block by carrying out large-scale computation on the previous block of the main blockchain. Valid blocks are verified as legal after being broadcast to the entire network and are then appended to the main blockchain.

TABLE I. REPRESENTATIVE MINER MALWARE EVENTS IN RECENT YEARS
Time | Malware | Description
2013.06 | BT Seed Trojan | The first bitcoin mining Trojan. It was disguised as the popular movie "China Partner" for mining.
2014.03 | CoinKrypt | Uses a botnet of Android smart devices to mine cryptocurrencies.
2014.09 | ShellShock | A mining worm using the Linux ShellShock vulnerability.
2017.09 | Coinhive | Coinhive-embedded JavaScript code that uses the CPU of a website visitor to mine Monero.
2017.09 | mateMiner | Uses only one PowerShell script as a bot to complete all functions.
2018.01 | PyCryptoMiner | Based on Python; it spreads via SSH and uses the notes site Pastebin to publish alternate C&C servers.
2018.02 | NrsMiner | Uses "Eternal Blue" to build a mining botnet and uses the CPU and GPU of each bot to mine Monero.

The huge profits of cryptocurrencies have attracted more hackers, who use various means to implant mining programs on victims' machines for profit. Mining Trojans first appeared in 2013. In 2014, a mining virus that spread through system vulnerabilities emerged. Representative miner malware events in recent years are shown in TABLE I. Some mining Trojans disguise themselves as other software in order to spread, and some exploit CVEs and other well-known vulnerabilities to form a botnet and spread[7]. New mining methods based on PowerShell and on the browser have also emerged. Mining malware has infected multiple operating system platforms, including Windows, Linux, MacOS and Android, to
* Corresponding Author
This work is supported by The National Key R&D Program of China (No.
2016QY08D1602), Foundation of Key Laboratory of Network Assessment
Technology, the Chinese Academy of Sciences (CXJJ-17S049), National
Natural Science Foundation of China (NO.61602470), Key Laboratory of
Network Assessment Technology at Chinese Academy of Sciences and Beijing
Key Laboratory of Network security and Protection Technology.

978-1-5386-4210-8/18/$31.00 ©2018 IEEE
DOI 10.1109/DSC.2018.00079
illegally acquire encrypted digital currencies such as Bitcoin and Monero, which has caused great losses to users [8][9][10].

Browser-based mining is a newcomer compared with traditional botnet-based mining, but its late appearance has not stopped the rise of such mining Trojans. The browser is one of the most frequently used applications. When a user visits a website embedded with a mining Trojan, the browser parses and executes the mining script, which occupies a large amount of computing resources for mining. According to AdGuard, about 500 million computers are affected by hidden mining scripts[11]. Browser mining incidents disclosed in 2017 showed explosive growth[12]. The security company Sucuri found that nearly 5500 WordPress websites had been implanted with mining scripts. The webmasters of Pirate Bay hid mining scripts in HTML pages; the scripts run in the background when a user visits and can occupy as much as 100% of some CPUs.

To increase the mining rate, miners that use browsers as mining tools gather sites together to form a "mining pool", which yields a huge profit at a small cost. Driven by these profits, browser-based mining has become commercialized. The commercial miner projects discovered so far include Coinhive[13], JSEcoin[14], reasedoper[15], LMODR.BIZ[16], MineCrunch[17], Crypto-Loot[18], ProjectPoi[19], etc. Most of these projects are open source, which makes them convenient for webmasters, attackers, operators, etc. to use in web pages. ProjectPoi[19] provides a wealth of methods for Monero mining, including simple UI mining components, verification code mining components, and webpage-embedded JS mining components, and is suitable for integration as plug-ins for WordPress, Discuz!, etc. CoinHive[13] provides a JS mining engine designed specifically for Monero. The engine can limit CPU occupancy during mining, which makes it difficult for users to notice. At the same time, browsers such as Microsoft Edge and IE cannot detect browser mining behavior. Some popular browsers block websites using blacklists and whitelists but cannot detect the features of homologous malicious miners. Interception support in popular browsers is shown in TABLE II.

TABLE II. MINING BEHAVIOR INTERCEPTION SUPPORT FOR POPULAR BROWSERS
Browser | Vendor | Latest Version(a) | Interception | Method
Microsoft Edge | Microsoft | 42.17120.1.0 | No | -
Internet Explorer | Microsoft | 11.1.17120.0 | No | -
Chrome | Google | 65.0.3325.162 | Yes | Blacklist
Mozilla Firefox | Mozilla | 59.0.1 | Yes | Blacklist
Safari | Apple | 11.0.3 | No | -
360 Security Browser | 360 | 9.1.0.410 | Yes | Static, Blacklist
a. Latest version on March 22, 2018

To deal with these problems, we propose a browser-based detection method for malicious mining behaviors. The main contributions are as follows:

• We are the first to propose a method that detects malicious mining behavior in the browser starting from the heap snapshots and stack features of a dynamically running browser. It can be used to address homologous mining in browsers, which has received little attention from industry and academia. It detects malicious behavior based on a recurrent neural network (RNN): it drives known malicious browser mining samples, collects the data generated, and dynamically extracts key features of malicious mining behavior from the browser's memory snapshot and stack after the browser executes the scripts, for training and prediction.

• We design and implement a browser-based prototype system for detecting malicious mining behavior. It automates the detection and analysis of homologous mining activities and is better able to identify encrypted and obfuscated mining samples. The performance of the system meets the needs of normal browsing and provides a good user experience.

This paper is organized as follows: Section II analyzes the attack pattern of malicious browser mining and summarizes the existing defense methods and their problems; Section III proposes a browser-based detection method for silent mining and designs a browser-based mining detection framework; Section IV implements BMDetector, a browser-based silent mining detection prototype system, and illustrates the function of each module; Section V analyzes the data features of the detection model, performs functional and performance tests of the system, and explains the test results; Section VI concludes the paper.

II. RELATED WORK

A. Data preprocessing and code feature extraction

Before sample feature extraction, data preprocessing is required to extract the code features of scripts in a form suitable for computation. Data preprocessing includes data cleansing, data integration, data transformation, data reduction, etc. Preprocessing before data analysis can improve the quality of the analysis and reduce the time it requires. Yamamoto et al. [20] gave a definition of software similarity. They hold that a similarity measure defined on program code can be used as a basis for judging software similarity. Alkhalid et al. [21] proposed an adaptive K-nearest neighbor clustering method. By calculating the similarity between entities (program statements), the method puts functionally similar statements into one cluster for refactoring, so that the most relevant statements are gathered together. The calculation is simple and the complexity of the algorithm is low, which makes it suitable for analyzing large-scale program code. Reference [22] proposed the concept of the program dependence graph (PDG), an intermediate graph representation of a program in the form of a labeled directed multigraph. In a PDG, program statements are represented as nodes, and the dependencies between statements are represented by the edges of the graph. Statistics of Kapser et al. [23]
indicated that code clones usually make up 5% to 10% of a software system. Using code cloning to extract the key features of identical code can help analyze the code. CCFinder [24] is a popular code clone detection tool that converts source code into token streams and then analyzes these token streams instead of the source code. CCFinder uses lexical analysis to preprocess the source code, removing spaces and comments and preserving only keywords and grammatical identifiers. It then formats the token stream according to a given language specification (replacing identifiers with uniform symbols, removing prefixes, template parameters, access control keywords, etc.), detects the characteristics of the token stream, and outputs the detection results.

B. Detection of malicious mining behavior

Research on browser security mainly focuses on detection for web security. Some researchers detect malware from network traffic. P. Prasse et al. [25] detected malware on client computers based on HTTPS traffic analysis, using a neural language model and a long short-term memory (LSTM) network, and obtained good detection results for unknown malware. Shar et al. [26] used supervised and semi-supervised learning, combined with static and dynamic analysis of the code, to identify and predict sites with SQL injection, cross-site scripting, remote code execution, and file inclusion vulnerabilities. Soska et al. [27] used unsupervised learning and data mining techniques to automatically detect vulnerable websites before they turn malicious. Bartos et al. [28] improved on the previous method of learning from HTTP traffic and proposed an optimized classification system that can detect both known and previously unseen security threats. S. Eskandari et al. [29] introduced the development of browser mining in recent years and the evolution of browser mining scripts, described common mining scenarios, analyzed threat models, measured key metrics and performance, and investigated the impact of browser mining on the client.

At present, the detection methods for mining behavior include monitoring the host's operating status, analyzing network traffic, and blocking lists of public mining pools in the browser. The file system, CPU utilization, CPU temperature, fan speed, etc. are monitored to ensure that the monitored machine is operating in a reasonable state. For example, if a server's CPU utilization reaches 100% at midnight and stays there, this can be considered a suspicious state. A mature file system monitoring tool records the new programs installed and running on the host. Mining Trojans generally use a mining pool, for example over the Stratum protocol, which is currently the most commonly used communication protocol over TCP between miners and mining pools; its commonly used ports are 3333, 3357, etc., as shown in Fig. 1. At present, the main method of blocking cryptocurrency miners at the network layer is to cut off the mining client's access to mining pools. Deep Packet Inspection (DPI) in the firewall can be used to detect and block Stratum over TCP, block the addresses and domains of open mining pools, and disrupt blockchain communication. When a user visits a website that hosts malicious JavaScript, the script mines for as long as the user stays on the site.

stratum+tcp://cn.stratum.slushpool.com:3333
stratum+tcp://coins.arstechnica.com:3333
stratum+tcp://equihash.eu.nicehash.com:3357
stratum+tcp://equihash.L0CATI0N.icehash.com:3357
stratum+tcp://equihash.usa.nicehash.com:3357

Fig. 1. Mining Communication Based on Stratum

C. Block of malicious script execution

The script used for browser mining is JavaScript. JavaScript plays an important role in user interaction. In stateful interaction, JS can be used to set HTTP cookies and HTML Local Storage and to interact with Flash cookies. In stateless interaction, JS can be used to obtain client information and to submit and retrieve server data. JavaScript can also steal users' private information and implant third-party advertisements. Some tools, such as NoScript[30], prevent the execution of scripts. Preventing script execution effectively stops third-party tracking and advertisement implantation, but it prevents pages from loading and working normally and hurts the user experience. Using blacklists against third-party tracking is currently the most effective defense against such privacy threats. It works as follows: when the user browses a first-party website, each HTTP request is intercepted and checked. If the URL in the request is on the blacklist, the request packet is discarded, so that the user's private information is not sent to a blacklisted third-party server and third-party trackers are prevented from stateful and stateless tracking. This defense is usually implemented by browser plug-ins such as DoNotTrackMe[31], Ghostery[32], Adblock Plus[33], etc.

Some secure browser vendors collect and maintain lists of the IP addresses and domains of public mining pools. When the browser detects communication with an IP address or domain on the list, it actively blocks it. However, the coverage of such blacklists is limited. Some unknown pools have not yet been discovered, and public pools can change the domain name and IP address they use for mining. When a website uses its own server as a proxy and implants a homologous mining script, blocking the public mining pools cannot stop malicious mining in the browser. Because mining scripts cause high CPU usage and slow down the computer, some of them keep CPU utilization below a certain level to avoid increased CPU temperature, faster fan speed, etc., so as not to attract the user's attention. Methods based on deep inspection of traffic packets can protect against host-level mining to a certain extent, but they are ineffective against browser mining that communicates over HTTP on port 80.

This paper analyzes the features of browser-based silent miners, drives known malicious miner samples, dynamically extracts browser heap snapshots and the stack information of JavaScript execution from the browser kernel as sample features, detects miners automatically based on RNN (Recurrent Neural Network) algorithms, and intercepts the dynamic execution of
the scripts of discovered malicious miners to prohibit mining activity.

III. OVERALL FRAMEWORK

To address the problem of browser-based silent miners, we propose a detection method for silent mining in the browser. The method hooks JavaScript in the kernel source of Chrome Webkit, drives known malicious browser mining samples, analyzes the data structure features in the browser heap snapshot and the stack data after script execution by hooking key functions at the browser's parsing layer, extracts the dynamic behavior features of malicious miners, and performs detection automatically based on an RNN. The memory snapshot and stack information captured after a script is parsed and executed are used to dynamically restore obfuscated and encrypted code and to obtain the code features after the browser runs it. No matter how heavily a mining script is obfuscated, the browser eventually restores it at the parsing layer, so dynamic detection and analysis of the restored code gets at the essence of the browser's silent mining.

A. Framework of BMDetector

Based on the general ideas above, we designed BMDetector (Browser Mining Detector), a prototype system for detecting browser-based silent miners, as shown in Fig. 2. While a website loads, its JS scripts and context are loaded asynchronously and run in the browser sandbox for preprocessing. By hooking the script parsing engine, key code that was encrypted and obfuscated is restored at the parsing layer. The browser stack code after the JavaScript is restored and the heap data of the running scripts can then be acquired, the key mining behavior information can be characterized, and an RNN is used for detection analysis, training the detection model automatically. When the browser sandbox detection engine detects a malicious mining script, it responds promptly and reports to the cloud analysis module for confirmation. By detecting and identifying malicious samples with this dual-engine silent miner detector (local-passive and cloud-active), execution of the malicious mining script is blocked and reported at the key function.
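
The claim that obfuscated code is eventually restored before it runs can be illustrated with a small userland sketch. This is not the paper's kernel-level hook: it only wraps one dynamic-evaluation entry point, and the names `hookEvaluator` and `runObfuscated` as well as the character-code "packer" are assumptions made for illustration.

```javascript
// Hypothetical userland approximation of a parse-layer hook.
// The paper hooks inside the browser kernel; this sketch only wraps a
// single evaluation entry point to show the principle.

// Wrap an evaluator so every source string it receives is recorded
// before being executed.
function hookEvaluator(evaluate) {
  const recovered = [];
  function hooked(source) {
    recovered.push(String(source)); // clear-text code as seen at evaluation time
    return evaluate(source);
  }
  return { hooked, recovered };
}

// A toy "obfuscated" loader: the real statement is stored as character
// codes and decoded only immediately before evaluation.
function runObfuscated(charCodes, evaluate) {
  const source = String.fromCharCode(...charCodes);
  return evaluate(source);
}

// Encode a harmless stand-in statement the way a simple packer might.
const hidden = Array.from("1 + 2", (ch) => ch.charCodeAt(0));

// Indirect eval evaluates the code in the global scope.
const { hooked, recovered } = hookEvaluator((src) => (0, eval)(src));
const result = runObfuscated(hidden, hooked);

console.log(recovered); // decoded source captured by the hook
console.log(result);    // value produced by the decoded statement
```

However the payload is encoded, the hook sees the decoded text, which is the property the detection method relies on. A real deployment would need to cover every entry point (inline scripts, `Function`, workers, WebAssembly), which is one reason the authors instrument the browser itself.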

[Figure: block diagram. A web server on the Internet delivers site resources (text, image, traffic, script, ...) to the browser as input traffic. Inside the browser, the Sandbox Detection Engine performs asynchronous web resource extraction, hooks the script engine, extracts stack data and memory snapshots, detects mining behavior against a mining behavior feature library, blocks scripts, and emits a report with the output traffic. The Cloud Analysis Module (sites list, site resources loading, automated inspection, malicious mining behavior confirmation, mining behavior feature library) exchanges data with the browser through feature library synchronization. A man-in-the-middle component is also shown.]

Fig. 2. Framework of BMDetector

B. Feature extraction of heap snapshot

Code obfuscation is a common way of producing JavaScript variants, and it can evade the static code analysis and trace detection that are common in software security. Among the various mining Trojans, Coinhive's ease of use makes it the first choice for attackers. Webmasters, or hackers who have compromised a site, do not need to write the mining code into web pages. Instead, they only need to load the coinhive.min.js file offered by Coinhive and specify a unique identifier. Fig. 3 shows an example of Coinhive.

<script src="https://coinhive.com/lib/coinhive.min.js"></script>
<script>
var miner = new CoinHive.User('SITE_KEY', 'john-doe');
miner.start();
</script>

Fig. 3. Example of a Coinhive script

To better restore the behavior of the script, we extracted some of CoinHive's mining features from the heap snapshot of Chrome and built an abstract syntax tree, as shown in Fig. 4. In it, the Window object is created by the controller; it can iterate over all the DOM nodes it contains and reference the corresponding objects. We screened the objects contained in the CoinHive node and identified key function features such as "Call_Site" for creating mining connections, "GetHashes" for hash calculation, and "_.isRunning" for indicating that the user's miner is running.
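
The screening of objects under the CoinHive node can be sketched as a walk over an object graph. The sketch below assumes the heap snapshot has already been parsed into plain nested objects; the `snapshot` literal, its nesting, and the function name `findKeyFeatures` are hypothetical, and the feature names are taken from the text and Fig. 4 (where the hash-rate function appears as `GetHashesPerSecond`).

```javascript
// Property names treated as key mining features (from the text and Fig. 4).
const KEY_FEATURES = new Set(["Call_Site", "GetHashesPerSecond", "isRunning"]);

// Depth-first walk over a plain object graph. Returns the dotted paths of
// all properties whose name is a key feature. A Set of visited objects
// guards against cycles, which real heaps contain.
function findKeyFeatures(root, rootName) {
  const hits = [];
  const seen = new Set();
  const stack = [{ node: root, path: rootName }];
  while (stack.length > 0) {
    const { node, path } = stack.pop();
    if (node === null || typeof node !== "object" || seen.has(node)) continue;
    seen.add(node);
    for (const key of Object.keys(node)) {
      const childPath = path + "." + key;
      if (KEY_FEATURES.has(key)) hits.push(childPath);
      stack.push({ node: node[key], path: childPath });
    }
  }
  return hits.sort();
}

// Hypothetical stand-in for a parsed heap snapshot; the nesting is
// illustrative only and does not reproduce the edges of Fig. 4.
const snapshot = {
  CoinHive: {
    User: { isRunning: true, isAuthed: false },
    JobThread: { onReceiveMsg: {}, SetJob: { Call_Site: {} } },
    Anonymous: { Miner: { GetHashesPerSecond: {}, _adjustThreads: {} } },
  },
};
snapshot.CoinHive.User.owner = snapshot.CoinHive; // introduce a cycle

const found = findKeyFeatures(snapshot, "Window");
console.log(found);
```

An iterative walk with an explicit stack is used instead of recursion because heap graphs can be deep enough to exhaust the call stack.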
[Figure: object tree from the heap snapshot. Root: Window (@3099), with child CoinHive (@32529). Below CoinHive: User, JobThread, Anonymous (object ids @40945, @36397, @52371). Next level: isRunning (@36327), isAuthed (@36333), onReceiveMsg (@36429), SetJob (@36431), Miner (@52363). Leaf level: Miner.isRunning (@36045), Miner.isAuthed (@36063), regexp_prototype_ (@28433), Call_Site (@30339), GetHashesPerSecond (@36303), _adjustThreads (@40883).]

Fig. 4. Heap snapshot feature of CoinHive script extracted from Chrome

C. Feature extraction of stack information

In addition to code obfuscation, the collected mining samples use encryption, encoding and other methods to avoid detection. To detect the browser's silent mining behavior effectively and improve detection accuracy, we modified the Chrome Webkit kernel code and added JavaScript API interception to extract the mining features and record the stack information of CoinHive functions while the JavaScript is executed. The key stack information is shown in Fig. 5. Combined with the function features extracted from the heap snapshot, vectorization and malicious sample identification are performed on the suspicious functions and call sequences obtained by the hook on the dynamic script parsing engine.

Fig. 5. Stack data of script parsing by Hooked browser

The local RNN sandbox detection engine is located in the browser client. When a user visits a website, the website's resources, including text, pictures, scripts, etc., are loaded into the browser. So as not to degrade the browsing experience, the local sandbox detection engine loads resources asynchronously. When the JS engine in Webkit parses a mining script, the script loading function fetches the script to the local machine asynchronously according to the script URL and dynamically acquires the preprocessed data from the browser memory snapshot and the stack after the script is parsed and executed. Since mining scripts are often generated with a similar structure and do not hide JavaScript built-in classes, functions and attributes after obfuscation, we build a bag-of-words model over the key code blocks (n in total) found in all the mining scripts (W in total), collect statistics on the term frequency Tf, the number M of scripts referring to a code block, and the code blocks Ci (0 < i ≤ n), and let code blocks that reach a certain frequency be the textual features.

[Equation (1): symbols lost in extraction.]

[Equation (2): a piecewise rule that classifies a code block as a textual feature when the compared quantity is ≥ a threshold and as a non-text feature when it is < the threshold; the symbols on both sides are lost in extraction.]

By identifying the JavaScript APIs that correspond to the stack information in the mining samples used to train the RNN detection model, the mining script hooked by the local sandbox detection engine is mapped into the feature library, forming a dynamic behavior feature library of scripts after they are parsed and run.

D. RNN Detection

The dynamic analysis function is based on an RNN and includes preprocessing, sample generation and detection. The processing flow is shown in Fig. 6.

[Figure: processing flow. JS input; Preprocessing module (lexical analyzer, string processing, filter and replace); Sample generation module (down-sampling 1, down-sampling 2, vector dictionary); Detection module (two LSTM branches, splice layer, fully connected layer); result.]

Fig. 6. RNN Detection Based on Dynamic Analysis Feature Extraction
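
The frequency-based feature selection of Section III.C can be sketched as follows. Because the exact formulas are not recoverable from this text, the sketch assumes that the frequency of a code block is the fraction of scripts that contain it (M divided by W) and that blocks at or above a threshold become textual features; the threshold value 0.5, the toy token lists, and all function names are hypothetical.

```javascript
// Fraction of scripts in which each code block (token) occurs at least once.
// ASSUMPTION: frequency = (scripts containing the block) / (number of scripts).
function blockFrequencies(scripts) {
  const counts = new Map();
  for (const tokens of scripts) {
    for (const token of new Set(tokens)) {
      counts.set(token, (counts.get(token) || 0) + 1);
    }
  }
  const freq = new Map();
  for (const [token, m] of counts) freq.set(token, m / scripts.length);
  return freq;
}

// Keep the blocks whose frequency reaches the threshold ("textual features").
function selectFeatures(freq, threshold) {
  return [...freq.keys()].filter((t) => freq.get(t) >= threshold).sort();
}

// Binary vector: position i is 1 if feature i occurs in the script, else 0.
function vectorize(tokens, features) {
  const present = new Set(tokens);
  return features.map((f) => (present.has(f) ? 1 : 0));
}

// Toy token lists standing in for restored mining scripts (not paper data).
const miningScripts = [
  ["connect", "onReceiveMsg", "setJob", "random"],
  ["connect", "onReceiveMsg", "verify"],
  ["connect", "setJob", "onReceiveMsg", "encodeURI"],
];

const freq = blockFrequencies(miningScripts);
const features = selectFeatures(freq, 0.5); // hypothetical threshold
const vector = vectorize(["connect", "getElementById", "setJob"], features);

console.log(features);
console.log(vector);
```

Starting every script from a zero vector and setting matched positions to 1 keeps the representation fixed-length regardless of script size, which is what a downstream sequence model needs.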
During training, the data input to the model has the format N×100, i.e. 100 columns per row. The input first enters the embedding layer, which embeds the data into a 128-dimensional vector space. The stream then passes through an LSTM layer with 128 units. The fully connected layer then maps the result to two outputs. Finally, the softmax layer calculates the output probability for each category. The total loss of the output layer is calculated with the cross-entropy function, and the values of all parameters are updated with the goal of minimizing the loss.

1) Preprocessing module
The preprocessing module identifies and extracts function features, suspicious function data, and the function calling sequence. After feature extraction, the features of function data and calling sequence are replaced by symbols according to a unified rule, and useless unit objects are filtered out.

2) Sample compression and vectorization
We use the function data and function calling sequence produced by the preprocessing module as features. Each JavaScript file fetched to the local machine is represented by a binary vector over these features, initialized to the zero vector. By asynchronously comparing the locally generated scripts with the browser memory snapshots, the vector position corresponding to each key function feature captured by the hook is set to 1.

3) Recurrent neural network detection
The detection part of the RNN model includes two sub-modules: the long short-term memory (LSTM) layer and the connection layer. LSTM solves the problem of rapid loss of information in the recurrent unit. Memory along the time dimension is implemented by opening and closing gates, which makes better use of the calling sequence among the function features of mining scripts and improves detection accuracy. The connection layer maps the "feature representation" learned by the network to the sample label space and classifies the vectorized sample features through the fully connected layer.

Precision P, Recall R, and their harmonic mean F1 [34][35] are effective evaluation indicators for machine learning models. We calculated the Precision (P) and Recall (R) of the model and derived the harmonic mean F1 from P and R using formula (3) to evaluate the effectiveness of the model.

$$F_1 = \frac{2PR}{P + R} \qquad (3)$$

E. Miner blockage

The key functions of a mining script can be intercepted by a JS hook, and a JS API hook can be used to detect the native actions of JS more accurately. A hook registers a function, or another sequence of actions, to be executed at a unified entry point. The program executes the registered functions by calling the hook. The key implementation for the mining script is shown in Fig. 7. The custom function hooks() adds the prototype attributes addAction and doAction and is registered to an entry function when the JS is loaded. Therefore, no matter how many functions are intercepted by hooks.addAction("loaded", function), they are all executed uniformly at hooks.doAction("loaded"). If the characteristics of a malicious mining script are detected, the key function is blocked by prohibiting its execution.

function hooks() { this.queue = new Array(); }
hooks.prototype.addAction = function(hook, func) {…}
hooks.prototype.doAction = function(hook) {…}
hooks.prototype.call_user_func_array = function(cb, parameters) {…}
function func1() {....}
function func2() {....}
hooks.addAction("loaded", func1); // Add function
hooks.addAction("loaded", "func2");
window.onload = function() { hooks.doAction("loaded"); }

Fig. 7. Implementation of mining script blocking

IV. EXPERIMENTS AND RESULTS

To verify the actual effect and performance of the BMDetector prototype system, we selected data from different types of mining scripts and from non-homologous mining scripts for functional and performance testing. The functional test mainly evaluates how well the system recognizes different types of mining scripts and non-homologous mining scripts, and the performance test mainly evaluates the impact on the user's browsing experience after the system is deployed.

A. Experimental environment

The client environment used in the experiment is Windows 7 with 4 GB of memory, Intel GMA HD 4000 graphics, and a custom-built Chrome browser. The cloud server runs Ubuntu 16.04 with 8 GB of memory and NVIDIA GeForce GTX 970 graphics. It hosts the active detection system and is responsible for synchronizing the mining feature database.

B. Experimental process analysis

We crawled 115 homepages, including infected mining sites and normal websites, and obtained a total of 923 JavaScript files. The statistical results are shown in Fig. 8.

[Figure: bar chart. x-axis: Features; y-axis: Samples (0 to 80); series: Mining Scripts, Normal Scripts.]

Fig. 8. Most common features in the miner and normal scripts with χ2 distribution
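
Fig. 8 ranks features with a χ2 statistic, which measures how far the observed counts deviate from the counts expected if a feature were independent of the script class. The paper does not spell out the formula, so the sketch below assumes the standard 2×2 contingency-table form; the function name `chiSquare` and the counts in the usage lines are invented for illustration and are not the paper's data.

```javascript
// Chi-square statistic for one feature and a two-class problem.
// ASSUMPTION: standard 2x2 contingency-table form
//   chi2 = N * (a*d - b*c)^2 / ((a+b) * (c+d) * (a+c) * (b+d))
// a: mining scripts containing the feature
// b: normal scripts containing the feature
// c: mining scripts without the feature
// d: normal scripts without the feature
function chiSquare(a, b, c, d) {
  const n = a + b + c + d;
  const denom = (a + b) * (c + d) * (a + c) * (b + d);
  if (denom === 0) return 0; // an empty margin carries no information
  const diff = a * d - b * c;
  return (n * diff * diff) / denom;
}

// A feature concentrated in mining scripts scores high ...
const discriminative = chiSquare(40, 10, 10, 40);
// ... while a feature spread evenly over both classes scores zero.
const uninformative = chiSquare(25, 25, 25, 25);

console.log(discriminative, uninformative);
```

Features with a large statistic deviate most from independence and are therefore the most useful for separating mining scripts from normal scripts.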
With the original 59 malicious mining samples encrypted, obfuscated, and both encrypted and obfuscated, 236 mining samples are obtained, for a total of 1159 samples. We ran all the samples, extracted the browser heap snapshots and stack information, calculated the actual observations and theoretical inferences, and evaluated the features using the χ² hypothesis test, whose value reflects the degree of deviation between the actual observations and the theoretical inferences.

By hooking the script parsing engine, the key code is restored in the parsing layer and stream-processed, and is then detected and analyzed with an RNN algorithm, which is well suited to dynamic detection. This yielded a higher accuracy on the training set. According to the feature frequencies of the scripts, the 10 most frequent features of mining scripts and normal scripts are listed in TABLE III.

TABLE III. TOP 10 API FEATURES OF NORMAL SCRIPTS AND MINING SCRIPTS

ID   Normal Scripts Features   Frequency (%)   Mining Scripts Features   Frequency (%)
1    data                      65.37           onReceiveMsg              58.51
2    value                     62.48           protocal                  56.23
3    test                      61.74           connect                   52.39
4    text                      58.83           anonymous                 49.85
5    prototype                 55.31           verify                    47.16
6    display                   54.68           setJob                    44.72
7    show                      53.52           callSite                  43.67
8    getElementById            52.95           encodeURI                 41.34
9    name                      51.79           random                    39.46
10   referrer                  50.87           addEventListener          37.42

Since the purpose of a mining script is to use computing power and establish mining connections, it tends to perform connection and encoding operations, whereas the purpose of a normal script is to improve the browsing experience, so it is more inclined to operate on HTML elements.

C. Functional test

The functional test mainly focuses on the ability of BMDetector to identify different types of mining scripts. Three types of detection objects are included: original scripts, variant scripts, and homologous scripts.

Original scripts are unaltered JavaScript obtained directly from commercial websites such as CoinHive. Variant scripts are new JavaScript files derived from the original scripts by encryption, obfuscation, or both, in order to conceal the functional features of the code. The browser parses the source code of a variant malicious mining script at the script parsing layer and executes it. A homologous mining script uses the website's own protocol, domain name, IP, and port to communicate, and is therefore compatible with the browser's same-origin policy. Existing blacklist and whitelist defense strategies cannot detect it.

To train the classifier, 200 instances are randomly selected from the initial malicious mining samples, and 200 instances are extracted from website homepage scripts and manually confirmed as normal samples. These 400 instances are used to create an initial classifier with an accuracy rate of 93.1%. In the incremental learning process, the initial classifier is used to classify the remaining samples, and the top 30 instances of each class in the classification result are extracted. These 60 instances are added to the training pool, and the current 460 instances are used to continue training the classifier. This process is repeated until no instance in the original malicious sample set is judged as a mining script by the current classifier. The final recognition accuracy of the BMDetector prototype system is 93.04%. The evaluation results, computed from the formulas for precision (P), recall (R), and F1 [34][35], are shown in TABLE IV.

TABLE IV. RECOGNITION EFFECT OF BROWSER SILENT MINING SCRIPT

Category                   Precision (%)   Recall (%)   F1 (%)
original                   97.92           98.72        98.32
encrypted                  94.56           95.87        95.21
obfuscated                 90.47           88.35        89.40
encrypted and obfuscated   87.73           89.85        88.78
homologous                 94.54           93.21        93.87

The experiments in the table show that a defensive framework based on dynamically acquiring web scripts from the JS parsing layer is feasible. The sandbox detection module in the system recognizes original, variant, and homologous mining scripts well.

D. Performance test

Because the BMDetector prototype system needs to load the JS and its context into the browser sandbox while loading the website, the page-loading time and the CPU and memory utilization are tested. The test results are shown in Fig. 9.

Fig. 9. Performance comparison before and after the deployment of BMDetector (x-axis: Time/s, 0 to 25; y-axis: Usage/%, 0 to 80; series: CPU with BMDetector, CPU without BMDetector, Memory with BMDetector, Memory without BMDetector)

Deploying the BMDetector system consumes CPU, memory, and other resources during the site loading process. Compared with accessing websites that do not deploy the framework, performance is reduced but remains within an acceptable range, so the overall user experience is good.

V. CONCLUSION AND FUTURE WORK

Browser-based silent miners consume victims' computing resources without their knowledge, resulting in losses to their interests. Some currently popular browsers, such as IE (Edge), Chrome, Firefox, and Safari, detect mining behavior based on malicious URL blocking, but cannot cope with silent mining behavior that communicates with the same origin. This paper
proposes a novel browser-based silent miner detection method. It asynchronously extracts heap snapshots and stack data, dynamically restores encrypted and obfuscated scripts while the browser is running, and checks for malicious mining behavior based on deep learning. The experiments show a high recognition rate for malicious mining scripts. When mining behavior is discovered, it can be blocked without affecting the original functions of the website, which preserves a good user experience.

This work intercepts and handles malicious mining activities at the source level of Chrome; support for more browsers will be added in the future. Browser-based mining is a new attempt compared with website advertisement placement. If its form is reasonable, it may replace some existing forms of advertising. Illegal web mining will also become a new web security issue for webmasters, following the hijacking of web pages and traffic.

REFERENCES

[1] Nakamoto, Satoshi. "Bitcoin: A peer-to-peer electronic cash system." Consulted (2008).
[2] Bitcoin - Open source P2P money. https://bitcoin.org/.
[3] Ethereum Project. https://www.ethereum.org/.
[4] Monero. https://getmonero.org/.
[5] Litecoin - Open source P2P digital currency. https://litecoin.org/.
[6] CoinDesk. Bitcoin (USD) Price. https://www.coindesk.com/price/.
[7] Trend Micro. Struts and DotNetNuke Server Exploits Used For Cryptocurrency Mining. https://blog.trendmicro.com/trendlabs-security-intelligence/struts-dotnetnuke-server-exploits-used-cryptocurrency-mining/, 2018.
[8] BleepingComputer. "Zealot" Campaign Uses NSA Exploits to Mine Monero on Windows and Linux Servers. https://www.bleepingcomputer.com/news/security/-zealot-campaign-uses-nsa-exploits-to-mine-monero-on-windows-and-linux-servers/, 2017.
[9] BleepingComputer. The Second Most Popular Mac Malware Is a Cryptocurrency Miner. https://www.bleepingcomputer.com/news/security/the-second-most-popular-mac-malware-is-a-cryptocurrency-miner/, 2017.
[10] BleepingComputer. Android Malware Will Destroy Your Phone. No Ifs and Buts About It. https://www.bleepingcomputer.com/news/security/android-malware-will-destroy-your-phone-no-ifs-and-buts-about-it/, 2017.
[11] AdGuard. Cryptocurrency mining affects over 500 million people. And they have no idea it is happening. https://blog.adguard.com/en/crypto-mining-fever/, 2017.
[12] 360. 360 Security Report – Discussion on Blockchain Security. http://zt.360.cn/1101061855.php?dtid=1101062370&did=210560729, 2018.
[13] Coinhive. Coinhive – Monero JavaScript Mining. https://coinhive.com/.
[14] JSECoin. JSECoin: Digital Currency - Designed for the web. http://www.jsecoin.com/.
[15] Greatis. How To Delete REASEDOPER.PW Virus In 3 Simple Steps? REASEDOPER.PW Virus Removal Guide (UPDATE). https://info.greatis.com/howto/remove-reasedoper-pw.htm, 2017.
[16] Greatis. HOW to REMOVE "LMODR.BIZ" virus COMPLETELY from Chrome, Firefox, Edge, IE browser: Simple "LMODR.BIZ" Removal Guide. http://greatis.com/blog/howto/remove-lmodr-biz-virus.htm, 2017.
[17] Minecrunch Network. https://www.minecrunch.net/web/.
[18] Crypto-Loot. https://crypto-loot.com/.
[19] ProjectPoi. https://www.ppoi.org/.
[20] Yamamoto, Tetsuo, et al. Measuring Similarity of Large Software Systems Based on Source Code Correspondence. Product Focused Software Process Improvement. Springer Berlin Heidelberg, 2002:530-544.
[21] Alkhalid, Abdulaziz, M. Alshayeb, and S. Mahmoud. "Software refactoring at the function level using new Adaptive K-Nearest Neighbor algorithm." Advances in Engineering Software 41.10(2010):1160-1178.
[22] Ferrante, Jeanne, K. J. Ottenstein, and J. D. Warren. "The program dependence graph and its use in optimization." ACM Transactions on Programming Languages & Systems 9.3(1987):125-132.
[23] Kapser, Cory, and M. W. Godfrey. "Toward a Taxonomy of Clones in Source Code: A Case Study." Elisa Workshop (2003).
[24] CCFinder. The archive of CCFinder Official Site. http://www.ccfinder.net/.
[25] Prasse, Paul, et al. Malware Detection by Analysing Encrypted Network Traffic with Neural Networks. Machine Learning and Knowledge Discovery in Databases (2017).
[26] Shar, Lwin Khin, L. C. Briand, and H. B. K. Tan. "Web Application Vulnerability Prediction Using Hybrid Program Analysis and Machine Learning." IEEE Transactions on Dependable & Secure Computing 12.6(2015):688-707.
[27] Soska, Kyle, and N. Christin. "Automatically detecting vulnerable websites before they turn malicious." Usenix Conference on Security Symposium USENIX Association, 2014:625-640.
[28] Bartos, Karel, M. Sofka, and V. Franc. "Optimized Invariant Representation of Network Traffic for Detecting Unseen Malware Variants." Usenix Security Symposium (2016).
[29] S. Eskandari, A. Leoutsarakos, T. Mursch, J. Clark. "A first look at browser-based Cryptojacking", IEEE Security & Privacy on the Blockchain (IEEE S&B) 2018, University College London (UCL), London, UK.
[30] NoScript. https://noscript.net/.
[31] DoNotTrack.Us. Do Not Track - Universal Web Tracking Opt Out. http://donottrack.us/.
[32] Ghostery. Ghostery Makes the Web Cleaner, Faster and Safer!. https://www.ghostery.com/.
[33] Adblock Plus. Adblock Plus - Surf the web without annoying ads!. https://adblockplus.org/.
[34] Kent, Allen, et al. "Machine literature searching VIII. Operational criteria for designing information retrieval systems." Journal of the American Society for Information Science & Technology 6.2(1955):93-101.
[35] van Rijsbergen, C. J. Information retrieval, 2nd edition. Butterworths. 1979.