Deep Learning-Based Prediction of Test Input Validity For RESTful APIs
2021 IEEE/ACM Third International Workshop on Deep Learning for Testing and Testing for Deep Learning (DeepTest) | 978-1-6654-4565-8/21/$31.00 ©2021 IEEE | DOI: 10.1109/DEEPTEST52559.2021.00008
Abstract—Automated test case generation for RESTful web APIs is a thriving research topic due to their key role in software integration. Most approaches in this domain follow a black-box approach, where test cases are randomly derived from the API specification. These techniques show promising results, but they neglect constraints among input parameters (so-called inter-parameter dependencies), as these cannot be formally described in current API specification languages. As a result, when testing real-world services, most random test cases tend to be invalid, since they violate some of the inter-parameter dependencies of the service, making human intervention indispensable. In this paper, we propose a deep learning-based approach for automatically predicting the validity of an API request (i.e., a test input) before calling the actual API. The model is trained with the API requests and responses collected during the generation and execution of previous test cases. Preliminary results with five real-world RESTful APIs and 16K automatically generated test cases show that the validity of test inputs can be predicted with an accuracy ranging from 86% to 100% in APIs like Yelp, GitHub, and YouTube. These are encouraging results that show the potential of artificial intelligence to improve current test case generation techniques.

Index Terms—RESTful web API, web services testing, artificial neural network

I. INTRODUCTION

RESTful web APIs [1] are heavily used nowadays for integrating software systems over the network. A common phenomenon in RESTful APIs is that they contain inter-parameter dependencies (or simply dependencies for short), i.e., constraints that restrict the way in which two or more input parameters can be combined to form valid calls to the service. For example, in the Google Maps API, when searching for places, if the location parameter is set, then the radius parameter must be set too, otherwise a 400 status code ("bad request") is returned. Likewise, when querying the GitHub API [2] to retrieve the authenticated user's repositories, the optional parameters type and visibility must not be used together in the same API request, otherwise an error will be returned. A recent study [3] revealed that these dependencies are extremely common and pervasive: they appear in 4 out of every 5 APIs across all application domains and types of operations. Unfortunately, current API specification languages like the OpenAPI Specification (OAS) [4] provide no support for the formal description of this type of dependency, despite it being a highly demanded feature among practitioners1. Instead, users are encouraged to describe dependencies among input parameters informally, using natural language, which leads to ambiguities and makes it hardly possible to interact with services without human intervention2.

Automated testing of RESTful web APIs is an active research topic [5]–[10]. Most techniques in the domain follow a black-box approach, where the specification of the API under test (e.g., an OAS document) is used to drive the generation of test cases [6]–[8], [10]. Essentially, these approaches exercise the API using (pseudo-)random test data. To test a RESTful API thoroughly, it is crucial to generate valid test inputs (i.e., successful API calls) that go beyond the input validation code and exercise the actual functionality of the API. Valid test inputs are those satisfying all the input constraints of the API under test, including inter-parameter dependencies.

Problem: Current black-box testing approaches for RESTful web APIs do not support inter-parameter dependencies since, as previously mentioned, these are not formally described in the API specification used as input. As a result, existing approaches simply ignore dependencies and resort to brute force to generate valid test cases, i.e., those satisfying all input constraints. This is hardly feasible for most real-world services, where inter-parameter dependencies are complex and pervasive. For example, the search operation in the YouTube API has 31 input parameters, out of which 25 are involved in at least one dependency: trying to generate valid test cases randomly is like hitting a wall. This was confirmed in a previous study [11], which found that 98 out of every 100 random test cases for the YouTube search operation violated one or more inter-parameter dependencies.

1 https://github.com/OAI/OpenAPI-Specification/issues/256
2 https://swagger.io/docs/specification/describing-parameters/
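To see why random generation hits this wall, consider a toy operation with a handful of optional parameters tied by "mutually exclusive" and "requires" dependencies. The parameter names and the dependency set below are made up for illustration; they are not the real constraints of any API mentioned in this paper. Randomly including or excluding each parameter rarely yields a valid combination:

```python
import random

# Hypothetical optional parameters with "mutually exclusive" and
# "requires" dependencies (made up for illustration; NOT the real
# constraints of any API discussed in the paper).
PARAMS = ["a", "b", "c", "d", "e", "f", "g", "h"]
MUTEX = [("a", "b"), ("a", "c"), ("b", "c"),
         ("d", "e"), ("d", "f"), ("e", "f")]
REQUIRES = [("g", "h")]  # if g is set, h must be set too (cf. location/radius)

def is_valid(selected):
    """True if the selected parameters violate no dependency."""
    no_mutex = not any(p in selected and q in selected for p, q in MUTEX)
    reqs_ok = all(q in selected for p, q in REQUIRES if p in selected)
    return no_mutex and reqs_ok

random.seed(0)
trials = 10_000
valid = sum(
    is_valid({p for p in PARAMS if random.random() < 0.5})
    for _ in range(trials)
)
print(f"valid random combinations: {valid / trials:.1%}")
```

Even with only seven dependencies, roughly four out of five random combinations are invalid here; with the 25 interacting parameters of the YouTube search operation, the valid fraction collapses to the 2% reported above.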
To address this problem, Martin-Lopez et al. [11] devised a constraint-based testing technique, where valid combinations of parameters are automatically generated by analyzing the dependencies of the API expressed in the Inter-parameter Dependency Language (IDL), a domain-specific language proposed by the same authors. Although effective, this approach requires specifying the dependencies of the API in IDL, which is a manual and error-prone task. Furthermore, sometimes dependencies are simply not mentioned in the API specification, not even in natural language, and thus they can only be discovered when debugging unexpected API failures.
Approach: In this paper, we propose a deep learning-based approach for automatically inferring whether a RESTful API request is valid, i.e., whether the request satisfies all the inter-parameter dependencies. In contrast to state-of-the-art methods, no input specification is needed. The model is trained with the API requests and their corresponding API responses observed in previous calls to the API. In effect, this makes it possible to rule out potentially invalid test cases (unable to test the actual functionality of the API under test) without calling the actual API. This makes the testing process significantly more efficient and cost-effective, especially in resource-constrained scenarios where API calls may be limited. Preliminary evaluation results show that the validity of test inputs can be predicted with an accuracy ranging from 86% to 100% in APIs like Yelp, GitHub, and YouTube. These results are promising and show the potential of artificial intelligence techniques in the context of system-level test case generation.
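The training labels are derived from the HTTP responses observed in previous calls: requests answered with a 2XX status code are treated as valid, and those answered with a 4XX ("client error") status code as invalid, as detailed in Sections II and III-A. A minimal sketch of this labeling convention (the function name is ours):

```python
def label_request(status_code: int):
    """Derive a training label from an HTTP status code.

    2XX -> valid request; 4XX -> invalid request (a client error,
    e.g., a violated inter-parameter dependency). Other codes,
    such as 5XX server errors, are left unlabeled here.
    """
    if 200 <= status_code < 300:
        return "valid"
    if 400 <= status_code < 500:
        return "invalid"
    return None  # not used for training
```

For instance, a 200 response yields a "valid" sample and a 400 ("bad request") an "invalid" one.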
The rest of the paper is organized as follows: Section II introduces the basics of automated testing of RESTful web APIs. Section III presents our approach for predicting the validity of test inputs for RESTful APIs. Section IV explains the evaluation process performed and the results obtained. Possible threats to validity are discussed in Section V. Related literature is discussed in Section VI. Lastly, Section VII outlines future lines of research and concludes the paper.

II. RESTFUL WEB APIS

Web APIs allow systems to interact over the network by means of simple HTTP interactions. A web API exposes one or more endpoints through which it is possible to retrieve data (e.g., an HTML page) or to invoke operations (e.g., create a Spotify playlist [12]). Most modern web APIs adhere to the REpresentational State Transfer (REST) architectural style [1], being referred to as RESTful web APIs. Such APIs are usually decomposed into multiple RESTful web services [13], each one allowing to manage one or more resources. A resource can be any piece of data exposed to the Web, such as a YouTube video [14] or a GitHub repository [2]. These resources can be accessed and manipulated via create, read, update, and delete (CRUD) operations, using specific HTTP methods such as GET and POST.

Fig. 1. Excerpt of the OAS specification of the GitHub API.

RESTful APIs are commonly described with languages such as the OpenAPI Specification (OAS) [4], arguably considered the industry standard. An OAS document describes an API in terms of the allowed inputs and the expected outputs. Figure 1 depicts an excerpt of the OAS specification of the GitHub API. As illustrated, the operation GET /user/repos accepts five input parameters (lines 67, 77, 85, 94 and 103), three of which are involved in two inter-parameter dependencies, according to the documentation of the GitHub API [2]: 1) type and visibility cannot be used together; and 2) type and affiliation cannot be used together either. These dependencies must be satisfied in order to generate valid API inputs (i.e., HTTP requests). Upon valid API inputs, successful HTTP responses with a 200 status code will be returned (line 113); otherwise, an API error will be obtained, identifiable by a 4XX status code (lines 114 and 116).
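The two dependencies above can be checked mechanically before a request is ever sent. A minimal sketch (the helper name is ours; the list of conflicting pairs comes from the GitHub documentation cited above):

```python
# Inter-parameter dependencies of GET /user/repos, per the GitHub
# API documentation [2]: the listed parameter pairs must not co-occur.
MUTUALLY_EXCLUSIVE = [("type", "visibility"), ("type", "affiliation")]

def satisfies_dependencies(params: dict) -> bool:
    """Return True if the query parameters violate no dependency."""
    return not any(p in params and q in params
                   for p, q in MUTUALLY_EXCLUSIVE)

# Valid: only one of the conflicting parameters is used.
satisfies_dependencies({"type": "all", "sort": "pushed"})         # True
# Invalid: type and visibility appear in the same request.
satisfies_dependencies({"type": "private", "visibility": "all"})  # False
```

The approach proposed in this paper aims to learn exactly this kind of predicate from observed requests and responses, without it ever being written down.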
Automated test case generation for RESTful APIs is an active research topic [5]–[10]. Most approaches in this domain exploit the OAS specification of the service to generate test cases, typically in the form of one or more HTTP requests. An HTTP request is identified by a method (e.g., GET), a path (e.g., /user/repos) and a set of parameters (e.g., type and sort). In some cases, parameter values can also be extracted from the API specification, as in the example shown in Figure 1: all parameters are defined as enum strings, i.e., with a finite set of values. Crafting an HTTP request would just be a matter of assigning values to random parameters within their domain, resulting in a test input such as GET /user/repos?type=all&sort=pushed. In order to generate a valid request, all inter-parameter dependencies of the API operation should be satisfied. Valid requests are essential for thoroughly testing APIs, since they exercise their inner functionality, going beyond their input validation logic. According to REST best practices [13], valid requests should return successful responses (2XX status codes), while invalid requests should be gracefully handled and return "client error" responses (4XX status codes).

III. APPROACH

In this paper, we propose an artificial neural network for the automated inference of test input validity in the context of RESTful APIs (or simply APIs henceforth). The goal is to automatically infer whether an API call is valid or not without calling the actual API. This would be extremely helpful for the automated generation of cost-effective test suites, especially in resource-constrained scenarios where stressing the API with thousands or even millions of random API calls is not an option. The approach is divided into three steps, described below.

A. Data collection

The first step consists in the collection of a dataset of API calls properly labeled as valid or invalid. We consider an API call valid if it returns a successful response (2XX status codes), and invalid if it returns a "client error" response (4XX status codes). As a precondition, the API calls must meet all the individual parameter constraints indicated in the API specification regarding data types, domains, regular expressions, and so on. For example, if a parameter is defined as an integer between 0 and 20, the value 42 would be invalid. Therefore, it can be assumed that all the invalid API calls in the dataset are invalid due to the violation of one or more inter-parameter dependencies, and not due to errors in individual parameters. A dataset of these characteristics (a set of valid and invalid API calls) can be automatically generated using state-of-the-art testing tools like RESTest [11] or RESTler [15], or collected directly from user activity.

TABLE I
DATASET EXAMPLE.

sort       direction  visibility  type     affiliation         faulty
full_name  -          public      public   -                   True
-          -          all         private  -                   True
-          -          private     -        collaborator,owner  False
-          desc       -           all      -                   False

Table I depicts a sample dataset for the operation to browse the user's repositories from the GitHub API (described in Figure 1). The dataset, in tabular format, contains one row for each API call. It contains n + 1 columns, n being the number of input parameters of the API operation. Each cell [i, j] represents the value of the parameter in column j for the API call in row i. The last cell of each row (column "faulty") states whether the API call is valid or not, which is the value to be predicted in our work. For example, row 2 in the dataset represents a request to the GitHub API (operation GET /user/repos) with the following key-value pairs: <visibility=all, type=private>. As stated in the faulty column, such a call is invalid because, as explained in the API documentation [2], the parameters type and visibility cannot be used together. By contrast, row 4 is valid, because <direction=desc, type=all> does not violate any dependency.

B. Data preprocessing

The raw dataset is not ready to be fed into the network, as it contains many empty values, which are not supported by standard deep learning frameworks. In order to fill those empty values, different techniques could be used. For instance, empty values of numeric variables could be filled with the mean or the median value of the column, or with zero values. Unfortunately, none of these strategies can be applied in this domain, because the sole presence of an input parameter in an API call can be crucial for the violation or fulfillment of a dependency. Additionally, artificial neural networks do not support string inputs, only numeric values. To address these problems, we propose enriching and processing the dataset as described below.

1) Data enrichment: we propose extending the original data as follows.
– For each number and string column, an auxiliary boolean column will be created indicating whether the value is empty or not. Columns representing free string parameters will be deleted, since dependencies typically constrain their presence or absence exclusively [3].
– For each enum column, a string fake_value will be assigned to empty values (e.g., "None0"), making sure that such values do not already exist in the dataset.

2) Data processing: next, we propose processing the dataset as follows.
– Numerical variables will be normalized in order to even out the influence of variables of different orders of magnitude. For example, the operation for creating a coupon in the Stripe API includes, among others, the numerical parameters redeem_by and duration_in_months. While the latter is ~10^2, the former is typically ~10^9, which makes its influence on the network output much stronger. The normalization step guarantees that the values of both parameters are reduced to the same order of magnitude, so that there is equity among inputs.
– enum values will be processed using one-hot encoding [16], which creates an alphabet of the k possible values found in the dataset, and then transforms the string parameter into a binary vector of k elements. For example, the possible values for the parameter direction of the GitHub API are "asc", "desc", or fake_value (if it is missing). These three possibilities are encoded as the vectors (1, 0, 0), (0, 1, 0) and (0, 0, 1), respectively, where each position of the vector corresponds to a specific value and only one element is equal to 1. The one-hot encoding technique provides better performance than simple integer encoding when there is no logical order among the enum values [16].

C. Network design

Deep learning is a common choice when facing a classification problem. Specifically, the proposed model is a multilayer perceptron [17], [18]: it can easily handle numerical data and learn knowledge representations; moreover, its implementation is relatively straightforward, and it offers many possible configurations and architectures to experiment with.

The system is a classifier of R^n into 2 classes, valid or faulty; it accepts the parameters of an API call as input, and returns the predicted validity as output. While the number of classes is already known, the number of inputs n is variable, as it depends on the number of API parameters. For this reason, the network automatically adapts to the number of inputs of the corresponding API operation. More specifically, it has the following structure:
– Input layer: an input layer is automatically built to fit the number of columns of the dataframe, so that the accepted input shape is flexible.
– Inner layers: we have experimented with several depths and different numbers of neural units per layer.
– Output layer: one neuron, emitting the probability of the instance belonging to the valid class. The class is predicted as faulty or valid depending on whether the output is greater than 0.5 or not.

In order to achieve better results, we experimented with different combinations of values for the key hyperparameters driving the learning process. The optimizer is the computation method used to update the network weights [19]. The batch size refers to the number of instances the system is fed with before each weight update happens. Finally, the learning rate determines how much the gradient of the error influences the weight update. The best configuration found is optimizer="Adam", batch size=8, and learning rate=0.002.

Fig. 2. Network architecture.

Figure 2 depicts a visual representation of the proposed architecture, consisting of an n-unit input layer, five inner layers with 32, 16, 8, 4 and 2 units, respectively, and a final one-unit output layer. As usual in these feed-forward networks, all layers are densely connected to the preceding and following ones. Besides, the first, second, and third layers are followed in this case by three dropout layers, with the dropout coefficient equal to 0.3.

D. Training and automated inference

For each target API, the network must be trained with a dataset of previous calls to the API before it can be used as an effective predictor. Training the network excessively is counterproductive, as it may result in the network memorizing instances rather than learning, a phenomenon called overfitting. To avoid overfitting, we propose the following strategies:
– L1 and L2 regularization factors in each layer [20], which prevent the entropy among weights from increasing excessively. A lower entropy guarantees the uniform distribution of the information, in this case the knowledge, among the weights.
– Several dropout layers along the architecture [21], which randomly silence neurons and prevent the network from activating them, and therefore from improving their weights.
– Early stopping of the training process [20]: several metrics are monitored along the training process, and when they stop improving the training is forced to stop.

Once the system has been properly trained, the validity of new API calls can be predicted by simply applying the network to new instances of the same API (i.e., introducing the instance as an input and applying the forward step only, thus obtaining the output of the network). In the GitHub example, for instance, when running the network on the API call GET /user/repos?sort=created, not included in the original dataset depicted in Table I, the system should ideally predict it as valid, since it satisfies all the input constraints, including the inter-parameter dependencies described in Section II.

IV. EVALUATION

For the evaluation of the approach, we implemented a neural network in Python and assessed its prediction capability on an automatically generated dataset. Next, we describe the research questions, the experimental setup and the results obtained.

A. Research questions

We address the following research questions (RQs):
– RQ1: How effective is the approach in predicting the validity of test inputs for RESTful APIs? This is the main goal of our work. To answer it, we will study the accuracy of the network predictions.
– RQ2: How does the size of the training dataset affect the prediction accuracy? To answer it, we will study the evolution of the accuracy as the dataset grows.
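Putting Sections III-B and III-C together, the following is a minimal NumPy sketch that encodes one Table I-style API call (one-hot enums with a fake value for absences; all parameters of this operation are enums, so no presence columns or normalization are needed) and pushes it through an untrained perceptron with the layer sizes of Figure 2. The enum domains are simplified from the GitHub documentation, the weights are random, and the helper names are ours; the output is meaningless until the network is trained.

```python
import numpy as np

# Simplified enum domains of GET /user/repos (cf. Figure 1); "None0"
# is the fake value assigned to absent parameters (Section III-B).
DOMAINS = {
    "sort": ["created", "updated", "pushed", "full_name", "None0"],
    "direction": ["asc", "desc", "None0"],
    "visibility": ["all", "public", "private", "None0"],
    "type": ["all", "owner", "public", "private", "member", "None0"],
    "affiliation": ["owner", "collaborator", "organization_member", "None0"],
}

def encode(call: dict) -> np.ndarray:
    """One-hot encode an API call, using the fake value for absences."""
    vec = []
    for param, domain in DOMAINS.items():
        value = call.get(param, "None0")
        vec.extend(1.0 if value == v else 0.0 for v in domain)
    return np.array(vec)

def forward(x: np.ndarray, rng: np.random.Generator) -> float:
    """Forward pass through the 32-16-8-4-2-1 architecture (Figure 2),
    with random, untrained weights; sigmoid on the output neuron."""
    sizes = [x.size, 32, 16, 8, 4, 2, 1]
    for n_in, n_out in zip(sizes, sizes[1:]):
        w = rng.standard_normal((n_out, n_in)) * 0.1
        x = np.maximum(w @ x, 0.0) if n_out > 1 else w @ x  # ReLU inside
    return float(1.0 / (1.0 + np.exp(-x[0])))  # probability of "valid"

x = encode({"direction": "desc", "type": "all"})  # row 4 of Table I
p_valid = forward(x, np.random.default_rng(0))
print(f"{x.size} input features, predicted P(valid) = {p_valid:.2f}")
```

In the actual approach this forward pass is performed by a trained model, and the prediction (valid if the output exceeds 0.5) decides whether the call is worth sending to the real API.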
TABLE II
RESTFUL SERVICES USED IN THE EVALUATION.

TABLE III
PREDICTION ACCURACY FOR EACH SERVICE.
There is a strong similarity between CV and test scores for each API operation.

It would be sensible to think that a correlation should exist between the goodness of the results and the complexity of the API operation, i.e., the validity of services with fewer parameters and dependencies should be more easily predictable. This intuition is confirmed by GitHub and Stripe-CC, for instance, where accuracy is 100%. However, Yelp scores are counter-intuitively lower, which could be due to its dependencies being arithmetic; similarly, LanguageTool validity is by far the hardest to predict despite having only 4 dependencies. By contrast, the validity of the search operation in YouTube is predicted very accurately (99.3%), despite its parameters and dependencies being the most numerous. This may be due to the fact that most of its parameters are enums, with a fixed set of possible values, and therefore there are fewer combinations of parameters and values for which to predict input validity. All these facts suggest that the complexity of validity prediction does not lie only in the number of dependencies, but also in their type and in the shape of the parameters involved.

In view of these results, RQ1 can be answered as follows:

The accuracy of the network ranges from 86% to 100%, with a mean accuracy across the eight API operations under study of 97.3%.

Fig. 3. Accuracy evolution by dataset size.

Figure 3 shows the evolution of the mean accuracy across all the API operations under study with respect to the size of the dataset. As illustrated, accuracy increases drastically as the size approaches 400 instances, properly balanced. Then, the prediction accuracy increases slowly until 800-1000 instances, after which the growth is asymptotic. As a result, RQ2 can be answered as follows:

The approach achieves an accuracy of 90% with balanced datasets of about 400 instances, reaching its top performance with 800 or more instances.

V. THREATS TO VALIDITY

Next, we discuss the possible internal and external validity threats that may have influenced our work, and how they were mitigated.

A. Internal validity

Threats to internal validity relate to those factors that might introduce bias and affect the results of our investigation. A possible threat in this regard is the existence of bugs in the implementation of the approach or the tools used. For the generation of the API requests, we relied on RESTest [11], a state-of-the-art testing framework for RESTful APIs. It could happen that, due to bugs in RESTest or errors in the API specifications, some valid requests were actually invalid, or vice versa. To neutralize this threat, we executed all the generated requests against each of the APIs, to make sure that all of them were correctly labeled as valid or invalid. For the implementation of our approach, we used cross-validation to select the combination of hyperparameters that would yield the best possible results.

Regarding the dataset used, it could be argued that it is not diverse enough, especially for the invalid requests: since these were generated with RESTest (which we used as a black box), it could happen that all of them were invalid due to violating the same dependency over and over again. To mitigate this threat, we manually confirmed that every single dependency was violated at least once in the dataset.

B. External validity

External validity concerns the extent to which we can generalize from the results obtained in the experiments. Our approach has been validated on five subject APIs, and therefore the results might not fully generalize. To minimize this threat, we resorted to industrial APIs with millions of users worldwide. Specifically, we selected API operations with different characteristics in terms of number of parameters (from 5 to 31), parameter types (e.g., strings, numbers, enums), number of inter-parameter dependencies (from 2 to 16), and type of dependencies. In this regard, it is worth mentioning that the selected API operations include instances of all eight dependency patterns identified in web APIs [3].

VI. RELATED WORK

The automated generation of valid test inputs for RESTful APIs is a somewhat overlooked topic in the literature. Most testing approaches focus on black-box fuzzing or related techniques [7], [8], [10], [15], where test inputs are derived from the API specification (when available) or randomly generated, with the hope of causing service crashes (i.e., 5XX status codes). Such inputs are unlikely to be valid, especially in the presence of inter-parameter dependencies like the ones discussed in this paper.

Applications of artificial intelligence and deep learning techniques for enhancing RESTful API testing are still in their infancy. Arcuri [5] advocates for a white-box approach where genetic algorithms are used for generating test inputs
that cover more code and find faults in the system under test. Atlidakis et al. [22] proposed a learning-based mutation approach, where two recurrent neural networks are employed for evolving inputs that are likely to find bugs. In both cases, the source code of the system is required, which is seldom available, especially for commercial APIs such as the ones studied in our evaluation (e.g., YouTube and Yelp).

The handling of inter-parameter dependencies is key for ensuring the validity of RESTful API inputs, as shown in a recent study of 40 industrial APIs [3]. Two previous papers have addressed this issue to a certain degree: Wu et al. [23] proposed a method for inferring these dependencies by leveraging several resources of the API, and Oostvogels et al. [24] proposed a DSL for formally specifying these dependencies, but no approach was provided for automatically analyzing them. The most related work is probably that of Martin-Lopez et al. [11], [25], where an approach for specifying and analyzing inter-parameter dependencies was proposed, as well as a testing framework supporting these dependencies. However, the API dependencies must be manually written in the IDL language, which is time-consuming and error-prone. The work presented in this paper is a step forward in automating this process, since we provide a way of predicting the validity of API inputs without the need to formally specify the dependencies among their parameters, just by analyzing the inputs and outputs of the API.

VII. CONCLUSION AND FUTURE WORK

In this paper, we proposed a deep learning-based approach for predicting the validity of test inputs for RESTful APIs. Starting from a dataset of previous calls labeled as valid or faulty, the proposed network is able to predict the validity of new API calls with an accuracy of ~97%. In contrast to existing methods, which rely on manual means or brute force, our approach is fully automated, leveraging the power of current deep learning frameworks. This is a small but promising step, showing the potential of AI to achieve an unprecedented degree of automation in software testing.

Many plans remain for our future work: first and foremost, we plan to perform a more thorough evaluation including different datasets and more APIs. In addition, we are currently exploring the use of different machine learning techniques for the automated inference of inter-parameter dependencies in RESTful APIs.

VERIFIABILITY

For the sake of reproducibility, the code and the dataset of our work are publicly accessible in the following anonymous repository:
https://anonymous.4open.science/r/8954c607-8d6c-4348-a23c-d57c920cdc22/

REFERENCES

[1] R. T. Fielding, "Architectural Styles and the Design of Network-based Software Architectures," Ph.D. dissertation, 2000.
[2] "GitHub API," accessed January 2020. [Online]. Available: https://developer.github.com/v3/
[3] A. Martin-Lopez, S. Segura, and A. Ruiz-Cortés, "A Catalogue of Inter-Parameter Dependencies in RESTful Web APIs," in International Conference on Service-Oriented Computing, 2019, pp. 399–414.
[4] "OpenAPI Specification," accessed April 2020. [Online]. Available: https://www.openapis.org
[5] A. Arcuri, "RESTful API Automated Test Case Generation with EvoMaster," ACM Transactions on Software Engineering and Methodology, vol. 28, no. 1, pp. 1–37, 2019.
[6] V. Atlidakis, P. Godefroid, and M. Polishchuk, "Checking Security Properties of Cloud Services REST APIs," in IEEE International Conference on Software Testing, Validation and Verification, 2020, pp. 387–397.
[7] H. Ed-douibi, J. L. C. Izquierdo, and J. Cabot, "Automatic Generation of Test Cases for REST APIs: A Specification-Based Approach," in IEEE International Enterprise Distributed Object Computing Conference, 2018, pp. 181–190.
[8] S. Karlsson, A. Causevic, and D. Sundmark, "QuickREST: Property-based Test Generation of OpenAPI Described RESTful APIs," in IEEE International Conference on Software Testing, Validation and Verification, 2020, pp. 131–141.
[9] S. Segura, J. A. Parejo, J. Troya, and A. Ruiz-Cortés, "Metamorphic Testing of RESTful Web APIs," IEEE Transactions on Software Engineering, vol. 44, no. 11, pp. 1083–1099, 2018.
[10] E. Viglianisi, M. Dallago, and M. Ceccato, "RestTestGen: Automated Black-Box Testing of RESTful APIs," in IEEE International Conference on Software Testing, Validation and Verification, 2020, pp. 142–152.
[11] A. Martin-Lopez, S. Segura, and A. Ruiz-Cortés, "RESTest: Black-Box Constraint-Based Testing of RESTful Web APIs," in International Conference on Service-Oriented Computing, 2020, pp. 459–475.
[12] "Spotify Web API," accessed November 2016. [Online]. Available: https://developer.spotify.com/web-api/
[13] L. Richardson, M. Amundsen, and S. Ruby, RESTful Web APIs. O'Reilly Media, Inc., 2013.
[14] "YouTube Data API," accessed April 2020. [Online]. Available: https://developers.google.com/youtube/v3/
[15] V. Atlidakis, P. Godefroid, and M. Polishchuk, "RESTler: Stateful REST API Fuzzing," in International Conference on Software Engineering, 2019, pp. 748–758.
[16] F. N. Kerlinger and E. J. Pedhazur, Multiple Regression in Behavioral Research. Holt, Rinehart and Winston, New York, 1973.
[17] F. Rosenblatt, Principles of Neurodynamics: Perceptrons and the Theory of Brain Mechanisms. Washington, DC: Spartan, 1962.
[18] D. E. Rumelhart, G. E. Hinton, and R. J. Williams, Learning Internal Representations by Error Propagation. Cambridge, MA: MIT Press, 1986, pp. 318–362.
[19] Keras, "Optimizers." [Online]. Available: https://keras.io/api/optimizers/
[20] I. Goodfellow, Y. Bengio, and A. Courville, Deep Learning. MIT Press, 2016.
[21] N. Srivastava, G. Hinton, A. Krizhevsky, I. Sutskever, and R. Salakhutdinov, "Dropout: A Simple Way to Prevent Neural Networks from Overfitting," Journal of Machine Learning Research, vol. 15, no. 56, pp. 1929–1958, 2014. [Online]. Available: http://jmlr.org/papers/v15/srivastava14a.html
[22] V. Atlidakis, R. Geambasu, P. Godefroid, M. Polishchuk, and B. Ray, "Pythia: Grammar-Based Fuzzing of REST APIs with Coverage-guided Feedback and Learning-based Mutations," Tech. Rep., 2020.
[23] Q. Wu, L. Wu, G. Liang, Q. Wang, T. Xie, and H. Mei, "Inferring Dependency Constraints on Parameters for Web Services," in World Wide Web, 2013, pp. 1421–1432.
[24] N. Oostvogels, J. De Koster, and W. De Meuter, "Inter-parameter Constraints in Contemporary Web APIs," in International Conference on Web Engineering, 2017, pp. 323–335.
[25] A. Martin-Lopez, S. Segura, C. Müller, and A. Ruiz-Cortés, "Specification and Automated Analysis of Inter-Parameter Dependencies in Web APIs," IEEE Transactions on Services Computing, 2020, in press.

APPENDIX A
IDL DEPENDENCIES FROM THE APIS UNDER STUDY

Table IV shows the inter-parameter dependencies present in the eight API operations considered in our study, expressed in the IDL language.
TABLE IV
INTER-PARAMETER DEPENDENCIES INCLUDED IN THE API OPERATIONS UNDER STUDY, DESCRIBED IN IDL.