UJIIndoorLoc: WLAN Indoor Localization Database
All content following this page was uploaded by Joaquín Torres-Sospedra on 16 March 2020.
UJIIndoorLoc: A New Multi-building and
Multi-floor Database for WLAN Fingerprint-based
Indoor Localization Problems
I. INTRODUCTION

Many real world applications need to know the location of a user in the world to provide their services. Therefore, automatic user localization has been a hot research topic in recent years. Automatic user localization consists of estimating the position of the user (latitude, longitude and altitude) by using an electronic device, usually a mobile phone. The outdoor localization problem can be solved very accurately thanks to the inclusion of GPS sensors into the mobile devices. However, indoor localization is still an open problem mainly due to the loss of GPS signal in indoor environments.

A spectacular growth of indoor localization studies has been witnessed during the last decade (see Figure 1), and WLAN-based techniques are the basis for many indoor localization approaches. This is mainly due to the proliferation of both wireless local area networks (WLANs) and mobile devices. Nowadays WLANs can be found anywhere, and mobile phones have increasingly become an indispensable part of our daily lives and, therefore, we can safely expect that the user is at the same location as the mobile device. The last generation of these devices (also known as smartphones) not only provides programmable abilities but also carries embedded sensors [1] like GPS, accelerometer, gyroscope, microphone, camera, bluetooth, etc., which have even been used to study social interactions [2] or predict human behavior [3] among many other studies.

Fig. 1. Timeline and number of research works records on indoor localization from 2004 to 2013 (records collected in September 2013 [4]).

WLAN fingerprint-based positioning systems are based on the Received Signal Strength Indicator (RSSI) value. Commonly, two phases are needed: calibration and operation [5]. In the calibration phase, a radio map of the area where the users should be detected is constructed. Later, during the operational phase, a user obtains the signal strength of all visible access points of the WLAN that can be detected from his/her position and creates a test sample. This sample is sent to the server to be compared with the training samples of the radio map. Basically, the user’s location corresponds to the position associated with the most similar sample in the radio map.

One of the major advantages of the WLAN fingerprint-based methods is that they do not require the installation of any additional hardware since they use the existing WLAN infrastructure. Therefore, the location of the user can be obtained without additional infrastructure and costs. However, WLANs were not natively designed to support a positioning function. Taking into account the existing obstacles introduced by the indoor environment (including reflections and multipath interference), the spread of radio signal in indoor environments is very hard to predict [5]. In addition, in WLAN-based positioning systems, the user typically carries the mobile device with him/her, and his/her motion, or how the device is carried, is an important factor that affects the measured RSSI values [6].

Although there are many papers in the literature trying to solve the indoor localization problem using a WLAN fingerprint-based method, there still exists one important drawback in this field, which is the lack of a common database for comparison purposes. Each approach presents its estimated results using its own database and describes how the experiment was carried out. Under these conditions, it is not possible to compare different methods since the particularities of each experiment are hardly reproducible. In the Pattern Recognition and Machine Learning research fields, the common practice is
to test the results of each proposal either using a well-known dataset or providing the dataset used. In this way, researchers are able to fairly compare different methodologies in the literature. For instance, the UCI Machine Learning Repository¹ is a well-known example [7] in this sense. However, no such database exists in the WLAN fingerprint-based indoor localization field.

In this paper, the UJIIndoorLoc database is presented to overcome this gap. We expect that the proposed database will become the reference database to compare different indoor localization methodologies. As far as we know, the proposed database is the first publicly accessible database in this field and researchers can access the database following this url².

The main contribution of this work is the creation and the presentation of the UJIIndoorLoc database, which is the biggest database in the literature. It would also be the first publicly available database that could be used to make comparisons among different methods in this field. The main characteristics of the database³ are:

• It covers a surface of 108703 m² including 3 buildings with 4 or 5 floors depending on the building.

• The number of different places (reference points) appearing in the database is 933.

• 21049 sampled points have been captured: 19938 for training/learning and 1111 for validation/testing.

• Dataset independence has been assured by taking Validation (or testing) samples 4 months after Training ones.

• The number of different wireless access points (WAPs) appearing in the database is 520.

• Data were collected by more than 20 users using 25 different models of mobile devices (some users used more than one model).

Two Android applications have been used to create the database: CaptureLoc and ValidationLoc. Both applications use as a reference the map services that are published in an ArcGIS server. These services contain the geographic information of the building interiors as well as the location of the training reference points. Using these services, the applications show the maps to improve the user localization for training and validation. Data were collected at three multi-floor buildings of the Jaume I University⁴ (UJI).

The rest of the paper has been organized as follows: Section II presents the related work. Section III explains in detail the proposed database. How this database has been created is commented in Section IV. Section VI describes the most important challenges we contemplate using the proposed database. Finally, in Section VII some conclusions are given.

¹ [Link]
² [Link]
³ The first version of the database only covers 3 buildings; our idea is to cover all the university facilities (30 buildings), so the final version of the database will be approximately 10 times bigger.
⁴ [Link]

II. RELATED WORK

Indoor positioning and localization literature is vast. In [8], authors categorised approaches according to the technique used for localization into several paradigms, including calibration-free localization [9], WLAN-based techniques [10], dead-reckoning [11], simultaneous localization and mapping (SLAM) [12] and multi-modal sensing [13]. Another classification can be found in [14], where fingerprint-based indoor localization has been particularly classified into two categories: infrastructure-based and infrastructure-less approaches. Infrastructure-based approaches rely on the deployment of customized Radio-Frequency beacons (RFID, infrared, ultrasound, bluetooth, led lights, etc.) that can be carefully optimized for a particular purpose. The main drawback of these approaches is that they need their own customized hardware. In contrast, infrastructure-less approaches use the already available wireless signals to profile a location, taking advantage of the powerful sensors of mobile phones. Our work is an infrastructure-less approach since we use the already available WLAN access points (WAPs) to construct the database by using mobile phones.

There are also many works dealing with the indoor localization problem by using WLAN-based techniques. Table I shows the main characteristics of the data used in some of the most important papers in this field. In [5] a new fingerprint-based method was proposed which uses a previously stored map of the signal strength at several positions and determines the position using similarity functions and majority rules. According to the authors, their proposed method is able to obtain high rates determining the building, the floor and the place, with an average error around 3 meters. The database used in [5] has (see Table I) 9358 sample points taken from 2 buildings of 3 floors each, with 101 different WAPs in the database. However, the authors do not provide information about the number of users or the number of different devices used to capture the samples. Information about the covered surface was also not provided. The database used in [5] was the biggest one in the previous literature but it is not public and it has 55% fewer samples than the one we propose here, 80% fewer WAPs and 60% fewer places, among other data.

TABLE I. MAIN CHARACTERISTICS OF THE DATA USED IN SOME OF THE MOST IMPORTANT PAPERS IN THE INDOOR LOCALIZATION FIELD: Number of buildings (NB), Surface (SURF.), Number of floors (NF), Number of places (NP), Number of Samples (NS), Number of WAPs (NW), AND Number of Devices (ND). N/A STANDS FOR INFORMATION NOT AVAILABLE OR NOT PROVIDED BY AUTHORS.

Work    NB   Surf.        NF    NP    NS      NW    ND
[5]     2    N/A          3     392   9358    101   N/A
[15]    1    N/A          1     96    2880    206   2
[16]    1    N/A          N/A   N/A   N/A     9     N/A
[17]    1    980 m²       1     50    N/A     N/A   N/A
[14]a   3    2730 m²      3     120   N/A     434   1
[14]b   3    69000 m²     1     13    N/A     379   1
Our     3    108703 m²    4/5   933   21049   520   25

In [15] a different approach that uses only the rankings of the RSSI values is used. Authors argue that their method is better to avoid the well known problem of having hardware and software differences between user devices, that produces
that the RSSI reported by the current mobile device may differ from the RSSI in the database, and therefore this can degrade the positioning accuracy. According to the information provided by the authors, the database used is also significantly smaller than the one proposed in [5]. For instance, the number of WAPs, which is the only criterion in which that work is superior to [5], is still 60% less than ours. In [16] two improvements are proposed to the common way used to solve the WLAN-based fingerprint problem. The first addresses differing antenna attenuation among different devices. The second deals with environments where not every beacon is visible everywhere. Although the methods proposed in [15] and in [16] are both promising, since they try to solve some of the common problems in this field, the databases used (see Table I) are very small and are different.

Other WLAN-based works such as [18], [19], [20], [21], [22], [23], [24], [25], [26], [27], [28], [29], [30], [31], [32], [33] have not been included in Table I since they do not provide information about the characteristics of the database or the information provided shows that their databases are very small with respect to the one proposed in this paper or even the one used in [5]. The last three rows of Table I before ours show information about the databases used in two papers [17], [14] which are not WLAN-based. [17] uses an RF-based method, and not all information about the database is available. [14] proposed an FM signal-based method. They performed experiments in two scenarios named [14]a and [14]b in Table I. In both cases, the authors tested their proposed FM-based method together with a WLAN-based methodology. The database of the second scenario has a similar size to the one proposed in this paper, but it is still smaller than ours. For instance, it covers 37% less surface, and the number of WAPs is 28% smaller. In addition, just one device was used in their experiments.

In general, each indoor localization work used its own data to publish the results, and therefore it is quite difficult to make comparisons among the proposed methods. A common database is needed to perform this task. As commented before, in the Pattern Recognition and Machine Learning research fields, it is a common practice to share databases to allow other researchers to provide comparable results when using their proposals. To date, WLAN-based indoor localization methods have used relatively small databases which, in some cases, come from just one building and are often captured using a small number of devices or by a small number of users. The database presented in this paper overcomes all these shortcomings.

III. UJIIndoorLoc DATABASE DESCRIPTION

In this section, the proposed database is completely described. First, Section III-A provides details about the information published for each capture (i.e. for each sampled point or record). Second, Section III-B shows how the whole database has been split into sets for training and validation purposes.

A. Description of elements stored

As previously introduced, the whole proposed database contains 21049 records. Each record is directly related to a single capture and it contains the following 529 numeric elements:

001-520   RSSI levels
521-523   Real world coordinates of the sample points
524       BuildingID
525       SpaceID
526       Relative position with respect to SpaceID
527       UserID
528       PhoneID
529       Timestamp

As an example, Table II shows an extract of one record. Due to the length of this record, only two RSSI levels are shown and the values of the spatial coordinates are truncated to one decimal. The example corresponds to the 7754th record from the training set.

1) RSSI Levels: The most important information for WLAN fingerprinting comparison purposes is the set of WAPs detected and their RSSI level values. In the proposed database, this information represents 98% of the data given in each record (520 vector positions out of 529) as a 520-element vector of integer values. These values represent the RSSI levels whereas the WAP identifiers (MAC addresses) are linked to the vector positions. Table III expands the example record given in Table II showing its RSSI levels. This representation has been adopted due to the number of different WAPs detected on the three buildings (that is, 520) and the fact that Android provides integer RSSI levels.

The method getScanResults() included in the WifiManager⁵ Android class has been used to obtain the list of detected WAPs from each location in each capture. This list, shown in Table IV, contains the MAC address and the corresponding intensity level for any detected WAP. The MAC addresses are coded as strings, and the RSSI levels correspond to negative integer values⁶ measured in dBm, where -100dBm is equivalent to a very weak signal, whereas 0dBm means that the detected WAP has an extremely good signal. Although the list shown in the example is sorted according to the intensity value, this ordering depends mostly on the device (model and Android version).

The database includes 520 WAPs identified by the MAC address. These addresses have been alphabetically sorted and sequentially renamed to WAPnnnn. We use these new identifiers instead of the MAC addresses due to privacy reasons. The 520-element vector from each record contains the raw intensity levels of the detected WAPs from a single WiFi scan. Obviously, not all the WAPs are detected in each scan. For instance, only 14 WAP identifiers were detected in the scan example shown in Table IV. The RSSI levels for these WAPs remain unaltered, and the artificial value +100dBm is used by default for those WAPs that have not been detected by the device.

It is important to mention how the RSSI values are distributed in the proposed database. Figure 2 introduces the

⁵ [Link]
⁶ [Link]
TABLE II. EXAMPLE OF ONE DATABASE ENTRY (7754-TH RECORD). IT WAS CAPTURED ON JUNE, 4TH 2013 ([Link] PM GMT+02) BY USER 11 WITH A HTC Wildfire S A510e (ANDROID VERSION 2.3.5). THE DEVICE DETECTED 14 WAP (NEGATIVE RSSI VALUES) ON THE REFERENCE POINT LOCATED OUTSIDE OFFICE 111 ON THE THIRD FLOOR OF THE TI BUILDING.

[1]      ...   [520]    [521]        [522]          [523]   [524]        [525]     [526]           [527]    [528]     [529]
WAP001   ...   WAP520   Longitude    Latitude       Floor   BuildingID   SpaceID   Rel. position   UserID   PhoneID   Time
-97      ...   +100     -7594.7...   4864983.9...   3       0            111       2               11       13        1370340142

TABLE III. EXTRACT OF THE VECTOR THAT REPRESENTS THE RSSI VALUES. MAC ADDRESSES HAVE BEEN ANONYMIZED DUE TO PRIVACY REASONS.

WAP001   ...   WAP031   WAP032   WAP033   WAP034   WAP035   WAP036   ...   WAP520
-97      ...   +100     -97      +100     +100     -65      -65      ...   +100
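
The record layout described above can be illustrated with a minimal Python sketch. It assumes that a scan is available as a mapping from anonymized WAP names (the three-digit form used in Tables II and III) to RSSI values, and that a record is a flat list of 529 numbers in the order given in Section III-A. The names Record, scan_to_vector and split_record are hypothetical helpers introduced only for this illustration; they are not part of the database or of the capture applications. The example scan contains only the four detected WAPs visible in Table III, not the full list of 14.

```python
from typing import Dict, List, NamedTuple

NUM_WAPS = 520      # WAP columns per record, as stated in the text
NOT_DETECTED = 100  # artificial value stored for WAPs missing from a scan


class Record(NamedTuple):
    """Hypothetical container for the 529 elements of one capture."""
    rssi: List[int]
    longitude: float
    latitude: float
    floor: int
    building_id: int
    space_id: int
    relative_position: int
    user_id: int
    phone_id: int
    timestamp: int


def scan_to_vector(scan: Dict[str, int]) -> List[int]:
    """Place each detected RSSI at the position given by its WAP number."""
    vector = [NOT_DETECTED] * NUM_WAPS
    for name, rssi in scan.items():
        index = int(name[3:]) - 1  # "WAP001" -> position 0
        if not 0 <= index < NUM_WAPS:
            raise ValueError(f"unknown WAP identifier: {name}")
        vector[index] = rssi
    return vector


def split_record(values: List[float]) -> Record:
    """Split a flat 529-element record into RSSI vector and metadata."""
    if len(values) != NUM_WAPS + 9:
        raise ValueError("a record must contain 529 elements")
    longitude, latitude, *rest = values[NUM_WAPS:]
    return Record(
        [int(v) for v in values[:NUM_WAPS]],
        float(longitude),
        float(latitude),
        *(int(v) for v in rest),
    )


# Partial example built from Tables II and III (coordinates truncated).
scan = {"WAP001": -97, "WAP032": -97, "WAP035": -65, "WAP036": -65}
vector = scan_to_vector(scan)
record = split_record(vector + [-7594.7, 4864983.9, 3, 0, 111, 2, 11, 13, 1370340142])
print(record.floor, record.space_id, record.phone_id)
```

Keeping the +100 placeholder in the vector mirrors the stored format; an algorithm that prefers a different encoding for non-detected WAPs would have to remap it explicitly.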
···
11th   WAP036   -65dBm
12th   WAP035   -65dBm
13th   WAP142   -48dBm

recorded in our database (374234 intensity values). Although

[Plot: frequency of each RSSI level; x-axis: RSSI level; linear scale on the right axis, logarithmic scale on the left axis]

Fig. 2. Frequency distribution of the number of times that a RSSI value appears in the proposed database. Red bars stand for the values in linear scale (right scale) and blue bars stand for the values in logarithmic scale (left scale).

Figure 3 shows the number of WAPs detected in a single capture. This number ranges from 0 (where there is not any WiFi coverage) to 51. So, locations with no coverage have not been removed from the database. Finally, the average number of WAPs scanned in each capture is 17.92; therefore, approximately 500 elements of the previously described vector contain out of range values (represented as +100dBm). It is worth saying that, according to our experiments, the main factors that affect the number of WAPs reported by a WiFi

[Plot: number of database records (y-axis) against number of WAPs detected (x-axis)]

TABLE V. DETAILED COORDINATES AND FLOOR OF THE PLACE WHERE THE STORED CAPTURE ON THE 7754TH RECORD WAS TAKEN.

Longitude             Latitude            Floor
-7594.736999999732    4864983.902400002   3

in which the capture was taken. Figure 4 shows: the UJI University campus (left); the three buildings of the School of Technology and Experimental Sciences (center image), hereafter ESTCE; and a zoom inside the third floor of the TI building (right). This figure has been introduced to show where the example point (Table V) is exactly located. Table VI shows the BuildingID for each building of the ESTCE.

TABLE VI. RELATION BETWEEN BuildingID AND THE REAL BUILDING.

Building ID   Real Building
0             ESTCE - TI
1             ESTCE - TD
2             ESTCE - TC

[Figure panels: UJI Campus; Tx Buildings; Interior TI Building, Third Floor (office 111)]

Fig. 4. Map of the UJI Riu Sec Campus and zoom on the Tx Buildings. Pink refers to the ESTCE - Tx building on the UJI Campus map (left). On the Tx building zoom (right): red refers to TI building, green corresponds to TD building and blue stands for TC building. On the interior of TI building, the blue point is the reference point.
B. Database division

The whole database is split into two different sets: the Training set and the Validation set. On the one hand, the training set provides fully-detailed measures whose location corresponds to predefined reference points. On the other hand, the validation set provides the same information on arbitrary points. Table X shows the information about the two sets.

TABLE X. BASIC FEATURES OF BOTH DATABASE SUBSETS

              Training            Validation
Captures      19674               1111
WAPs          465                 367
RSSI Range    [-104 ... 0]dBm     [-102 ... -34]dBm
Ref. points   933                 None*
Users         18                  Unknown**
Devices       16                  11

* There was not any established reference point for validation.
** The validation stage does not store the user id in order to be more realistic.

Although both the training subset and the validation subset contain the same fields, the latter includes the value 0 in some of them. These fields are: SpaceID, Relative Position with respect to SpaceID and UserID. As commented before, this information was not recorded because the validation captures were taken at arbitrary points and the users were not tracked in this phase. This is intended to simulate a real localization system.

Fig. 5. Screenshot of CaptureLoc. On the left, example where the capture is done (red circle). Button Send Fingerprint starts the collect-and-send procedure. On the right, the result of a capturing process that reports four errors.

To generate the training set, all the closed spaces of the three buildings (offices, laboratories, classrooms, WCs, among other spaces) have been initially considered as important places where the captures should be done. Then, one reference point inside each space and, at least, another reference point outside each space (i.e. at corridors) have been selected as reference points for all the considered closed spaces. The point inside the space is located at the centroid of the closed space, whereas the outside point is located in front of the door. If the space has multiple accesses, we have selected one reference point per entrance (door). Figure 6 shows a graphical example of how and where the reference points are located.

Then, 18 users performed the captures to generate the training set. The reference points were uniformly distributed among the users with the restriction that any reference point should be covered by, at least, two users. No further suggestion, advice or direct order was given to the users, and they were free to capture the assigned reference points in their own way. Figure 5 (right) shows a capture process in which there were some errors (captures 5, 8, 9 and 10 were not recorded), so the user decided to repeat the process in the same reference point. The user was in charge of deciding whether the capture procedure had to be repeated or not. Errors on capturing are often related to low internet coverage (either 3G or WiFi) and they have only been reported in a few places.
Fig. 6. Example of reference points located at the first floor of TI building (left) and third floor of the TC building (right). Red points correspond to the reference points inside closed spaces, whereas blue points stand for the reference points taken in front of the door/doors (outside the spaces).

Finally, the 18 users have covered 924 reference points. In general, each reference point has been registered by, at least, two users, so more than 18400 training samples were expected to be recorded (19937 records were finally obtained). There have been a few cases in which the user has repeated the capture procedure due to connection errors on the first trial (Figure 5) and, moreover, there are some areas that have been covered by 3 users. Although all the suggested reference points located outside the spaces were captured, the users were not always allowed to capture inside some restricted spaces (chemical laboratories with biohazard labels, private offices, among other facilities).

V. UJIIndoorLoc BASELINE

In this section, the proposed database has been used with a basic indoor positioning system to provide a baseline for further comparisons. Note that the objective of this work is not to provide an accurate indoor positioning system; the objective is to provide an objective database which can be used for comparing positioning systems and other algorithms based on WLAN-fingerprinting.

We considered that the distance-based technique k-Nearest Neighbor (kNN) [35] can be used as a baseline for comparison purposes. In particular, we have developed the 1NN technique (k = 1) in conjunction with the Euclidean Distance as a basic indoor localization system. The necessary steps to localize a current fingerprint are:

• The Euclidean Distance of the current fingerprint with respect to all the fingerprints included in the training set is calculated.

• The current fingerprint location corresponds to the location of the closest training fingerprint in Euclidean distance if there is only one candidate with the shortest distance.

• When several candidates provide the shortest distance, we apply a voting procedure to extract the “winning” building and floor. Then, the position corresponds to the average of the locations provided by the closest training fingerprints that are on the winning building and floor. In case of tie, a localization error is raised.
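
The three steps above can be sketched in Python as follows. This is only an illustrative reading of the described baseline, not the implementation used to obtain any reported result. It assumes that distances are computed on the raw RSSI vectors (including the +100 placeholder for non-detected WAPs), that candidates tie only when their distances are exactly equal, and that a tie in the vote is signalled with a LookupError. The Fingerprint tuple layout and the function names are hypothetical.

```python
import math
from collections import Counter
from typing import List, Sequence, Tuple

# Hypothetical layout: (rssi_vector, (longitude, latitude), floor, building_id)
Fingerprint = Tuple[Sequence[int], Tuple[float, float], int, int]


def euclidean(a: Sequence[int], b: Sequence[int]) -> float:
    """Euclidean distance between two RSSI vectors of equal length."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def locate_1nn(query: Sequence[int], training: List[Fingerprint]):
    """Return (building_id, floor, (longitude, latitude)) for the query."""
    # Step 1: distance from the query to every training fingerprint.
    distances = [euclidean(query, fp[0]) for fp in training]
    shortest = min(distances)
    candidates = [fp for fp, d in zip(training, distances) if d == shortest]

    # Steps 2 and 3: vote on (building, floor) among the closest candidates.
    # With a single candidate the vote is trivially won by that candidate.
    votes = Counter((fp[3], fp[2]) for fp in candidates).most_common()
    if len(votes) > 1 and votes[0][1] == votes[1][1]:
        raise LookupError("tie between buildings/floors")
    building, floor = votes[0][0]

    # Average the coordinates of the candidates on the winning building/floor.
    winners = [fp for fp in candidates if (fp[3], fp[2]) == (building, floor)]
    longitude = sum(fp[1][0] for fp in winners) / len(winners)
    latitude = sum(fp[1][1] for fp in winners) / len(winners)
    return building, floor, (longitude, latitude)


# Synthetic three-WAP example (values are illustrative, not from the database).
training = [
    ([-60, -70, 100], (0.0, 0.0), 1, 0),
    ([-60, -70, 100], (2.0, 4.0), 1, 0),
    ([-90, 100, -50], (10.0, 10.0), 2, 1),
]
estimate = locate_1nn([-61, -70, 100], training)
print(estimate)
```

In this synthetic example the first two training fingerprints are equally close to the query and share building and floor, so the estimate is the average of their coordinates.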
Fig. 8. Screenshots of the ValidateLoc application. The first image shows the estimated location and asks the user if the position is correct. In this case, the user presses the “no” button. The second image shows the screen in which the real position is introduced. The last image informs the user that the validation fingerprint has successfully arrived at the server. The blue point stands for the predicted position and the red one for the position assigned to the fingerprint.
VI. DISCUSSION AND CHALLENGES

This database has been initially generated for indoor localization in our University Campus. Therefore, testing localization algorithms is the first use that external researchers can make of the published dataset. New indoor localization algorithms are often proposed using private datasets that are not publicly available, as previously mentioned in Sections I and II. Two different algorithms can hardly be compared using only the information and the results provided by the authors. This public dataset can be used to test the accuracy of any localization algorithm based on RSSI levels or for performing a comparison of localization algorithms under the same experimental framework.

However, the database can also be used for alternative problems that are discussed here. For instance:

• It could be of interest to analyze how the internal structure of the building is related to the WLAN access points and, therefore, how the number and position of these points can be optimised without leaving any area out of WiFi range. This could obviously result in important savings in terms of hardware acquisition and installation efforts. In summary, a very desirable preprocessing step in any WiFi structure is to reduce the redundant access points while keeping complete coverage.

• The location of the WAPs can be inferred from the database provided. Although some WAPs are visible, the vast majority of them are hidden from the human eye. Commonly, the WLAN antennas are located inside restricted areas or in the ceiling. Knowing the location of the WAPs can strongly support the localization algorithm.

• Detection of low-coverage places such as the ones recorded in our dataset. On the one hand, adding new antennas can improve the localization algorithm because those places cannot be localized by RSSI-based algorithms. On the other hand, it would improve the internet connection for the users who are in those places.

• Detection of WLAN collision places where some WAPs are emitting in the same channel, and WLAN connectivity may be degraded.

• The automatic detection of removed and new WAPs may be interesting since re-mapping could be avoided. The procedure to fully map a building requires planning, elaboration of a mapping strategy and working hours. The automatic detection may reduce the maintenance costs of fingerprint databases. In the introduced database, the validation fingerprints were taken 3 months later than the training ones, and some WAPs disappeared and new ones were introduced. Of the 520 WAPs detected in the UJIIndoorLoc database, 312 were detected in both the training and validation phases, 153 WAPs were only detected in the training phase, and 55 new WAPs appeared in the validation phase.
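
The comparison of the WAPs seen in each phase reduces to set operations on the RSSI vectors. The following Python sketch assumes that each subset is available as a list of RSSI vectors in which +100 marks a non-detected WAP; the function names and the five-WAP example data are hypothetical and serve only to illustrate the idea. Applied to the real subsets, such a comparison would be expected to reproduce the split reported above (312 common, 153 training-only and 55 validation-only WAPs).

```python
from typing import Dict, Iterable, Sequence, Set

NOT_DETECTED = 100  # artificial value stored for WAPs missing from a scan


def detected_waps(records: Iterable[Sequence[int]]) -> Set[int]:
    """Vector positions of the WAPs detected in at least one record."""
    seen: Set[int] = set()
    for rssi in records:
        seen.update(i for i, value in enumerate(rssi) if value != NOT_DETECTED)
    return seen


def compare_wap_sets(training, validation) -> Dict[str, Set[int]]:
    """Partition WAPs into common, training-only and validation-only."""
    train = detected_waps(training)
    valid = detected_waps(validation)
    return {
        "both": train & valid,
        "only_training": train - valid,    # candidates for removed WAPs
        "only_validation": valid - train,  # candidates for new WAPs
    }


# Synthetic example with five WAP positions (not taken from the database).
training_vectors = [[-70, 100, -80, 100, 100], [100, -60, -75, 100, 100]]
validation_vectors = [[-72, 100, 100, -90, 100]]
partition = compare_wap_sets(training_vectors, validation_vectors)
print({name: sorted(waps) for name, waps in partition.items()})
```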
Other practical applications concerning how devices detect the WAPs are:

• How two devices differ in obtaining the individual RSSI values at the same place. Table XII shows a summary of the RSSI values of the same WAP provided by two different devices. Although the range of possible values is similar for both devices, one of them tends to provide lower values according to the mean and median values.

• The study of anomalies in data such as the one detailed in Table XIII, where the RSSI values of the same WAP are shown for 4 different records located in the same place. One of the devices detected the highest possible value 0, but the same device detected a very low signal strength in the same place for the same WAP. This seems to be an anomaly in the data.

• According to our records, the number of RSSI values scanned by a device depends on the environment and on the device itself. Table XIV shows an example of that, where the number of WAPs detected by two devices is shown. It is well known that not all devices detect the same number of WAPs. However, this database not only confirms this fact but also allows a detailed analysis in this sense to be performed.

TABLE XII. SUMMARY OF THE RSSI VALUES OF WAP0034 PROVIDED BY TWO DIFFERENT DEVICES. MEASURES WERE TAKEN IN FRONT OF OFFICE 120 LOCATED ON THE FIRST FLOOR OF THE TI BUILDING.

PhoneID   min       max       mean       median
13        -52dBm    -41dBm    -48.5dBm   -50.5dBm
14        -49dBm    -37dBm    -41.5dBm   -39dBm

TABLE XIII. EXTRACT OF WAP0517 RSSI VALUES. MEASURES TAKEN IN FRONT OF OFFICE 215 (3RD FLOOR OF TC BUILDING).

Record   PhoneID   UserID   WAP0517 RSSI level
628      23        2        -87dBm
846      23        2        not detected
2643     19        6        -81dBm
2677     19        6        0dBm

TABLE XIV. NUMBER OF RSSI VALUES SCANNED BY TWO DIFFERENT DEVICES IN TWO SCENARIOS: 1) IN FRONT OF OFFICE 121 (1ST FLOOR OF TI BUILDING) AND 2) CONSIDERING THE WHOLE DATABASE.

Outside office 121
PhoneID   min   max   mean   median
13        12    17    14.9   15
14        9     22    19.5   21

VII. CONCLUSIONS

This paper introduces a new database for indoor localization, UJIIndoorLoc, on the basis of a WLAN fingerprinting (RSSI levels) environment. First, the database description has been fully detailed, including the features used in the database, their meaning, and the value ranges. Second, the procedure and the applications used to generate the database have also been described. To address the problem of sample diversity and realism, more than 20 users participated in generating the database. Each training reference point was initially assigned to, at least, two different users. No suggestion or advice about capturing was given to the users. In addition, the device was always held by a human user in contrast to other datasets in which the device was left in a place to take several samples. All the samples were collected by a human user because the human body partially blocks the radio wave communication [36]. Therefore, the samples taken for UJIIndoorLoc can be considered very realistic. Due to privacy issues, some information has been anonymized.

While the WLAN-based localization databases used in the literature tend to cover small areas [17] or one-floor buildings [14], UJIIndoorLoc covers three buildings with 4 or more floors and almost 110,000 m². Moreover, the shape and internal structure are quite different among the three buildings where the samples were collected. In addition, more than 20 people and 25 different devices have been used to generate the database, in contrast to other databases that were generated using a single device or few devices [25]. Our proposed database can also be very useful for validation and comparison purposes, since validation samples have also been provided. All these features of the proposed database make UJIIndoorLoc suitable for testing and benchmarking localization algorithms.

Alternatively, UJIIndoorLoc can be used for other purposes such as analysis of device accuracy, improvement of WiFi coverage, WAPs optimization (localization and distribution), among others. The proposed UJIIndoorLoc database is not only the biggest database in the literature reviewed but also the first publicly available database that could be used to make comparisons among different methods in the field. In addition, a basic positioning system has been developed using the k-Nearest Neighbor rule in order to provide a baseline for comparison purposes.

Summarizing, the scarcity of publicly available localization databases, none as far as we know, reflects the need for a common public database for research purposes such as the one presented here.

ACKNOWLEDGMENT