
Software and applications of spatial data mining

https://doi.org/10.1002/WIDM.1180

Abstract

Most big data are spatially referenced, and spatial data mining (SDM) is the key to the value of big data. This paper gives an overview of SDM from the perspectives of software and applications. First, spatial data are summarized in terms of their rapid growth, distinct characteristics, and implicit value. Second, the principles of SDM are outlined, covering its descriptive definition, fundamental attributes, discovery mechanism, and usable methods. Third, SDM software is presented in terms of software components, development methodology, typical software for geographical information system (GIS) data and remote sensing (RS) images, and software trends. Fourth, SDM applications are outlined for GIS data, RS images, and spatio-temporal video data. Finally, concluding remarks and perspectives are given.

Overview

Deren Li,1 Shuliang Wang,1,2* Hanning Yuan2* and Deyi Li3

*Correspondence to: [email protected], [email protected]
1 State Key Laboratory of Information Engineering in Surveying, Mapping and Remote Sensing, Wuhan University, Wuhan, China
2 School of Software, Beijing Institute of Technology, Beijing, China
3 Department of Computer Science and Technology, Tsinghua University, Beijing, China
Conflict of interest: The authors have declared no conflicts of interest for this article.

© 2016 John Wiley & Sons, Ltd
How to cite this article: WIREs Data Mining Knowl Discov 2016, 6:84–114. doi: 10.1002/widm.1180

INTRODUCTION

Spatial data play an important role in the sustainable development of natural resources and human society. By using data, human civilization has progressed from preliminary sensing of the world through memorizing instances, recording writings, inheriting history, exchanging information, and communicating with each other, to uncovering rules. Spatial data are closely related to daily human life and permeate all walks of business. Most data are spatially referenced,1 which helps people understand the real world through the information world on the basis of locations. The rapid development of instruments, techniques, and infrastructures to capture spatial data, e.g., global positioning systems (GPS), remote sensing (RS), geographical information system (GIS), smart planet, and the network of things, has greatly increased the volume, variety, velocity, and veracity of spatial data.2,3

Internet-based geospatial communities of volunteered geographical information4 and location-based sensing services, e.g., call logs, mobile-banking transactions, online user-generated content such as blog posts and Tweets, online searches, and satellite images, make spatial relationships increasingly complex. Research from International Data Corporation5 has shown that as of 2003, humans had created a total of 5 EB of data, while in 2011 the amount of data copied and produced exceeded 1.8 ZB. It is expected that by 2020, global data usage will reach 35.2 ZB, which would need 37.6 billion hard drives of 1 TB capacity to store. The 'WorldView' satellite2 has a spatial resolution of 0.5 m, and its direct georeferencing accuracy can reach 2–3 m. The Earth Observing-1 (EO-1) satellite has 220 spectral bands covering a spectral range of 400–2500 nm, with a spectral resolution of 10 nm.

These spatial data are voluminous and grow rapidly, far exceeding the human ability to interpret and use them in a timely manner with common techniques.1,2,4,6–30 For example, a newspaper published the same article by the same author on two different pages, 'legal community' and 'youth topics.' Another newspaper published three articles in the editions of 'home appliances,' 'lifestyle,' and 'science and technology,' all to compare video compact disc
(VCD), China video disc (CVD), and digital versatile

WIREs Data Mining and Knowledge Discovery (wires.wiley.com/dmkd), Volume 6, May/June 2016. © 2016 John Wiley & Sons, Ltd

FIGURE 14 | Chinese HFT map of 2008 with 1 km × 1 km resolution.

FIGURE 15 | The nighttime light monthly composites: (a) March 2011; and (b) February 2014.

1. Analysis of abnormal motion trajectory. Extract the linear trajectory and its characteristics, then identify and classify the characteristics. This addresses problems such as trajectory crossing or separation, and the analysis of multiobjective overlapping anomalies.
2. Target analysis based on the hybrid model, including color model analysis, shape model analysis, feature blocks, or feature point analysis (Figure 16).
3. Classification combined with pattern recognition. For example, rapid classification techniques such as SVM, ANN, and boosting are used to classify and identify moving targets.
4. Spatiotemporal analysis of video from multiple cameras (Figure 17). For the same target, because the camera locations are fixed, the times at which the target appears in different cameras are interconnected; with this association information, image target-matching technology can be combined to realize temporal and spatial correlation analysis.
5. Camera video analysis across different periods. Extract image characteristics that do not depend on time-related conditions such as ambient light and contrast ratio, and automatically retrieve and recognize pedestrians or vehicles that appear in different periods.

In this way, we can also delete large amounts of video recording people's normal activities and private activities that need to be protected, and keep only the data about suspicious cars and people as well as data about people who need care (elderly people with dementia and children with intellectual disabilities). Because of the technical difficulty, there are no popular techniques and products for spatiotemporal video data mining.

Disaster Forecast and Management

In recent years, people have successfully used advanced information technologies to forecast disaster events in a timely manner and with high accuracy by assessing environmental conditions and predicting events (such as forest fires, population growth, and other natural or human development issues). In particular, satellite navigation and positioning technology can pinpoint the location of a disaster. The discovered knowledge, applied in a specific industry, specific scenes, and a specific solution, can better support decision making and action1,3. For example, after the Hurricane Katrina disaster in Louisiana and Mississippi, when local electric power, telecommunications, roads, and other local infrastructure were destroyed, US Coast Guard helicopters used GPS to locate and rescue the victims. In the 2010 earthquake in Yushu, unmanned aerial vehicle (UAV) images with a resolution of 0.2 m were used for distribution planning, traffic jam monitoring, and disaster assessment. Comparing the high-resolution images of Galle city railway station captured before and after the tsunami may help interpret the damage to buildings.

Spatial mitigation satellites, RS satellites, and communication and navigation satellites have been widely used in the management of disasters such as earthquakes, tsunamis, typhoons (hurricanes), floods, droughts, geological disasters, and fire. Currently, the Disaster Monitoring Constellation is a system that consists of satellites from various countries through international cooperation. This system has a high time resolution, large monitoring range, and fast response. For disaster mitigation and relief, IBM Sahana Rescuers-Centered Disaster Reduction system, with the collaboration of personnel,

FIGURE 16 | Pedestrian classification based on HOG operator.

FIGURE 17 | Space-time video sequence-based event detection.
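
As a rough illustration of the multicamera spatiotemporal analysis described in item 4 of the video-analysis list (Figure 17), the following Python sketch links detections from different cameras when their time gap is consistent with travel between the fixed camera positions. It is only a sketch under stated assumptions, not the method used by the authors: the `Detection` record, the `CAMERAS` coordinates, the `label` field (standing in for the output of an image target-matching step), and the speed bounds are all hypothetical.

```python
from dataclasses import dataclass
from math import hypot


@dataclass(frozen=True)
class Detection:
    camera: str   # identifier of the camera that saw the target
    t: float      # detection time in seconds
    label: str    # hypothetical identity label from an image target matcher


# Hypothetical fixed camera positions, in metres.
CAMERAS = {"A": (0.0, 0.0), "B": (300.0, 400.0)}


def link_detections(detections, cameras, min_speed=0.5, max_speed=3.0):
    """Pair detections of the same label seen by different cameras.

    A pair is kept only if the implied travel speed between the two
    fixed camera positions lies within [min_speed, max_speed] m/s
    (assumed bounds for a walking pedestrian).
    """
    links = []
    ordered = sorted(detections, key=lambda d: d.t)
    for i, first in enumerate(ordered):
        for second in ordered[i + 1:]:
            # Same camera or different appearance label: no cross-camera link.
            if first.camera == second.camera or first.label != second.label:
                continue
            dt = second.t - first.t
            if dt <= 0:
                continue  # simultaneous detections cannot be the same target
            (x1, y1), (x2, y2) = cameras[first.camera], cameras[second.camera]
            speed = hypot(x2 - x1, y2 - y1) / dt
            if min_speed <= speed <= max_speed:
                links.append((first, second, speed))
    return links


if __name__ == "__main__":
    observed = [
        Detection("A", 0.0, "p1"),
        Detection("B", 10.0, "p1"),   # too soon to have walked 500 m
        Detection("B", 250.0, "p1"),  # plausible: 500 m in 250 s
        Detection("B", 250.0, "p2"),  # different target
    ]
    for first, second, speed in link_detections(observed, CAMERAS):
        print(f"{first.camera}@{first.t}s -> {second.camera}@{second.t}s "
              f"({speed:.1f} m/s)")
```

The time-gap test encodes the observation that, with fixed cameras, appearance times of one target are interconnected; a real system would replace the `label` comparison with an appearance-matching score.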