Evaluation of Web Vulnerability Scanners Based on OWASP Benchmark
Balume Mburano
Western Sydney University
Abstract— The widespread adoption of web vulnerability scanners and their differences in effectiveness make it necessary to benchmark these scanners. Moreover, the literature lacks a comparison of scanner effectiveness results obtained from different benchmarks. In this paper, we first compare the performance of several carefully chosen open source web vulnerability scanners by running them against the OWASP Benchmark, which is developed by the Open Web Application Security Project (OWASP), a well-known non-profit web security organization. Furthermore, we compare our results from the OWASP Benchmark with the existing results from the Web Application Vulnerability Scanner Evaluation Project (WAVSEP) benchmark, another popular benchmark used to evaluate scanner effectiveness. We are the first in the literature to compare these two benchmarks. Our evaluation results allow us to make some valuable recommendations for the practice of benchmarking web scanners.

Keywords— security measures, penetration testing, web vulnerability scanner, benchmarking.

… WAVSEP benchmark, showing the different capabilities of these two benchmarks.

The remainder of this paper is structured as follows. In Section II, we review the benchmarking and vulnerability scanner literature. In Section III, we describe the experimental environment and the selection of the benchmark and scanners. In Section IV, we detail our experimental results. In Section V, we draw conclusions from the experiments and make recommendations.

II. RELATED WORK

To form the basis of this study, we reviewed the benchmark and vulnerability scanner literature in order to make an informed selection of an appropriate benchmark and web vulnerability scanners for evaluation. We focused on the scanners' vulnerability detection performance and on the benchmarks' capability to reflect it.
Table 1: Scanner detection score results for CMDI, LDAPI, SQLI, XSS and Path Traversal

2. Quantitative Measures. Based on the OWASP Benchmark test cases, six metrics were calculated to evaluate the effectiveness of the two scanners in detecting Command Injection, LDAP Injection, XSS, SQL Injection and Path Traversal attacks. Table 1 shows a detailed summary of the scanners' results on the calculated metrics.

The obtained experimental results (in Table 1) have allowed us to form an overview of the performance of Arachni and ZAP on Command Injection, LDAP Injection, SQL Injection, Cross-Site Scripting and Path Traversal. We then made a close comparison of the two scanners' performance in the five chosen categories (see Figure 2).

The categories SQLI, XSS, CMDI and Path Traversal were considered for the comparison of benchmark results. Although the LDAP category was examined in our experiments, it was not included in the comparison because it was not examined in Chen's study.
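The six metrics themselves are not listed in this excerpt, so the following Python sketch is only illustrative: it derives common confusion-matrix metrics (detection rate, false positive rate, precision, F1, accuracy, and a Youden-style TPR − FPR score of the kind the OWASP Benchmark scorecard uses) from hypothetical test-case counts, not from the paper's measured data.

```python
# Sketch: deriving scanner-effectiveness metrics from benchmark outcomes.
# The counts below are hypothetical placeholders, not the paper's data.

def detection_metrics(tp: int, fp: int, tn: int, fn: int) -> dict:
    """Compute common effectiveness metrics from a confusion matrix."""
    tpr = tp / (tp + fn) if tp + fn else 0.0        # detection rate / recall
    fpr = fp / (fp + tn) if fp + tn else 0.0        # false positive rate
    precision = tp / (tp + fp) if tp + fp else 0.0
    f1 = (2 * precision * tpr / (precision + tpr)) if precision + tpr else 0.0
    accuracy = (tp + tn) / (tp + fp + tn + fn)
    return {"tpr": tpr, "fpr": fpr, "precision": precision,
            "f1": f1, "accuracy": accuracy,
            "score": tpr - fpr}                     # Youden-style benchmark score

# Example: a scanner flags 90 of 120 vulnerable test cases (tp=90, fn=30)
# and wrongly flags 10 of 120 safe test cases (fp=10, tn=110).
m = detection_metrics(tp=90, fp=10, tn=110, fn=30)
print(m)
```

Scoring TPR minus FPR, rather than detection rate alone, rewards scanners that stay quiet on the benchmark's deliberately safe test cases.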
Arachni outperformed ZAP by 4% in the WAVSEP benchmark results, whereas the OWASP Benchmark results indicated that ZAP outperformed Arachni by 8% in this category. Although it was clear that Arachni outperformed ZAP in the existing WAVSEP benchmark results, with scores of 100% and 96% respectively, we considered the OWASP Benchmark results. This was because the OWASP Benchmark examined the latest version of ZAP …

The strictness of the benchmarks' test cases was apparent in the comparison results for this category. As can be seen, although both scanners scored a 100% detection rate in the WAVSEP benchmark results, the opposite occurred in the OWASP Benchmark results, where both scanners scored a 0% detection rate in the same category, as shown in Figure 6 above.
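One way to see why a benchmark that includes safe (true-negative) test cases is stricter is the extreme case of a scanner that flags everything. The sketch below uses hypothetical numbers, not the paper's results: judged by detection rate alone, such a scanner looks perfect, but a TPR − FPR score drops to zero.

```python
# Sketch (hypothetical numbers): a "flag everything" scanner achieves a
# 100% detection rate, yet a Youden-style score (TPR - FPR) penalizes the
# false positives it raises on every safe test case.

def benchmark_score(tpr: float, fpr: float) -> float:
    """OWASP-Benchmark-style score: detection rate minus false positive rate."""
    return tpr - fpr

flag_all_tpr = 1.0  # flags every vulnerable test case
flag_all_fpr = 1.0  # ...but also flags every safe test case
print(benchmark_score(flag_all_tpr, flag_all_fpr))  # 0.0
```

A suite consisting only of exploitable test cases cannot expose this failure mode, which is one reason the same scanner can score very differently on two benchmarks.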
V. CONCLUSIONS AND RECOMMENDATIONS

The results of our comparative evaluation of the scanners confirmed again that scanners perform differently in different categories. Therefore, no scanner can be considered an all-rounder in scanning web vulnerabilities. However, combining the performances of the two scanners in both benchmarks, we concluded that ZAP performed better than Arachni in the SQLI, XSS and CMDI categories. Arachni, on the other hand, performed much better in the LDAP category.

There were considerable variations in the performance of the two scanners between the OWASP Benchmark and the WAVSEP benchmark. Specifically, our benchmark comparison revealed that, for both scanners and all four vulnerability categories compared, the scores under the WAVSEP benchmark were much higher than those under the OWASP Benchmark. This indicates that the OWASP Benchmark is more challenging than the WAVSEP benchmark in these four vulnerability categories. Therefore, we recommend that, if a scanner is to be evaluated on these four vulnerability categories, the OWASP Benchmark should be chosen as the main target, while the WAVSEP benchmark can be used as a secondary target to complement the evaluation results.

REFERENCES

[1] CoreSecurity. What is Penetration Testing? Available: https://www.coresecurity.com/content/penetration-testing, 2018.
[2] K. Reintjes, "A benchmark approach to analyse the security of web frameworks," Master's thesis, Computer Science, Radboud University Nijmegen, Nijmegen, Netherlands, 2014.
[3] E. Tatlı and B. Urgun, "WIVET—Benchmarking Coverage Qualities of Web Crawlers," vol. 60, 2016.
[4] IBM. The Most Comprehensive Web Application Security Scanner Comparison Available Marks AppScan Standard as the Leader (Again). Available: http://blog.watchfire.com/wfblog/2012/08/the-most-comprehensive-web-application-security-scanner-comparison-available-marks-appscan-standard-as-the-leader.html, 2012.
[5] Darknet. wavsep-web-application-vulnerability-scanner-evaluation-project. Available: https://www.darknet.org.uk/2011/09/wavsep-web-application-vulnerability-scanner-evaluation-project/, 2017.
[6] OWASP. OWASP Benchmark. Available: https://www.owasp.org/index.php/Benchmark, 2017.
[7] M. El, E. McMahon, S. Samtani, M. Patton, and H. Chen, "Benchmarking vulnerability scanners: An experiment on SCADA devices and scientific instruments," in 2017 IEEE International Conference on Intelligence and Security Informatics (ISI), 2017, pp. 83-88.
[8] S. Chen. Evaluation of Web Application Vulnerability Scanners in Modern Pentest/SSDLC Usage Scenarios. Available: http://sectooladdict.blogspot.com/2017/11/wavsep-2017-evaluating-dast-against.html, 2017.
[9] S. E. Idrissi, N. Berbiche, F. Guerouate, and M. Sbihi, "Performance Evaluation of Web Application Security Scanners for Prevention and Protection against Vulnerabilities," International Journal of Applied Engineering Research, vol. 12, pp. 11068-11076, 2017.
[10] Y. Smeets, "Improving the adoption of dynamic web security vulnerability scanners," Master's thesis, Computer Science, Radboud University Nijmegen, Nijmegen, Netherlands, 2015.
[11] M. Alsaleh, N. Alomar, M. Alshreef, A. Alarifi, and A. Al-Salman, "Performance-Based Comparative Assessment of Open Source Web Vulnerability Scanners," Security and Communication Networks, vol. 2017, pp. 1-14, 2017.
[12] Y. Makino and V. Klyuev, "Evaluation of web vulnerability scanners," in 2015 IEEE 8th International Conference on Intelligent Data Acquisition and Advanced Computing Systems: Technology and Applications (IDAACS), 2015, pp. 399-402.
[13] Sarosys LLC. Arachni Web Application Security Scanner Framework. Available: http://www.arachni-scanner.com/, 2017.
[14] OWASP. OWASP Zed Attack Proxy Project. Available: https://www.owasp.org/index.php/OWASP_Zed_Attack_Proxy_Project, 2018.
[15] P. J. Fleming and J. J. Wallace, "How not to lie with statistics: the correct way to summarize benchmark results," Communications of the ACM, vol. 29, pp. 218-221, 1986.
[16] A. Baratloo, M. Hosseini, A. Negida, and G. El Ashal, "Part 1: Simple Definition and Calculation of Accuracy, Sensitivity and Specificity," Emergency, vol. 3, pp. 48-49, 2015.
[17] N. Antunes and M. Vieira, "On the Metrics for Benchmarking Vulnerability Detection Tools," in 2015 45th Annual IEEE/IFIP International Conference on Dependable Systems and Networks, 2015, pp. 505-516.
[18] J. S. Akosa, "Predictive Accuracy: A Misleading Performance Measure for Highly Imbalanced Data," 2017.
[19] Exsilio Solutions. Accuracy, Precision, Recall & F1 Score: Interpretation of Performance Measures. Available: https://blog.exsilio.com/all/accuracy-precision-recall-f1-score-interpretation-of-performance-measures/, 2018.
[20] S. Chen. Price and Feature Comparison of Web Application Scanners. Available: http://sectoolmarket.com/price-and-feature-comparison-of-web-application-scanners-unified-list.html, 2017.
[21] ToolsWatch. 2016 Top Security Tools as Voted by ToolsWatch.org Readers. Available: http://www.toolswatch.org/2018/01/black-hat-arsenal-top-10-security-tools/.
[22] OWASP. Cross Site Scripting. Available: https://www.owasp.org/index.php/Cross-site_Scripting_(XSS), 2016.
[23] Microsoft. Establishing an LDAP Session. Available: https://msdn.microsoft.com/en-us/library/aa366102(v=vs.85).aspx, 2018.
[24] OWASP. SQL Injection. Available: https://www.owasp.org/index.php/SQL_Injection, 2016.
[25] PortSwigger Ltd. SQL injection. Available: https://portswigger.net/kb/issues/00100200_sql-injection, 2018.
[26] GitHub Inc. OWASP Benchmark. Available: https://github.com/OWASP/Benchmark/compare/1.2beta...master, 2018.