6TH SEM. SEMINAR
C-14
Topic: Spatial Data Models and Data Editing
Instructed by: Dr. Dipankar Buragohain
Prepared by:
1. Anushmrita Dutta - 252
2. Tumonjyoti Gogoi - 176
3. Drishti Charingia - 73
4. Gayatree Sarmah - 145
5. Puja Das - 237
6. Chayan Gogoi - 177
7. Nipu Bhuyan - 261
CONTENTS
1. What is GIS?
2. Spatial data definition
3. Spatial data model
4. Comparison of raster and vector data models
5. Modelling the third dimension and fourth dimension
6. Data editing
7. Detection and correction of errors
8. Reprojection, transformation and generalization
9. Edge matching and rubber sheeting
10. Conclusion
11. Bibliography
1. What is GIS?
A Geographic Information System (GIS) is a computer system for capturing,
storing, querying, analyzing and displaying geographical data.
Geographically referenced data distinguish GIS from other
information systems.
Geographically referenced data describe both the location and the
characteristics of spatial features on the earth's surface.
GIS therefore involves two geographic data components:
1. A model of spatial form: this represents the structure and
distribution of features in geographical space.
2. A model of spatial process: this involves
A. Identifying the spatial features from the real world that are
of interest in the context of an application and choosing how to
represent them in the conceptual model.
B. Representing the conceptual model by an appropriate spatial model.
C. Selecting an appropriate spatial data model within the computer.
2. Spatial Data Definition
Spatial data describe the location and shape of geographic features, and
their spatial relationships to other features.
The information contained in a spatial database is held in the form of
digital coordinates, which describe the spatial features.
Location is mainly given by the latitude and longitude of the feature.
Spatial Entity
Spatial data describe the spatial situation of objects in terms of their form
and their relative position in space.
Usually, individual points, lines or areas are related to one another by
placing them in a coordinate system, which ties them to the real world and
provides the metrics.
Spatial data can be represented using:
1. The Point 2. The Line 3. The Area 4. The Network 5. The Surface
3. Spatial Data Model
There are two types of spatial data model. They are:
1. Raster Data Model 2. Vector Data Model
The Raster Data Model: The raster data model is one of the important
spatial data models and is described as a tessellation.
In the raster data model the basic building block is the individual grid cell, and
the shape and character of an entity are created by the grouping of cells.
The size of the grid cell is very important, as it affects the storage,
processing and display of spatial data.
Each area is divided into rows and columns, which form a regular grid
structure.
Each cell must be rectangular in shape, but not necessarily square.
Each cell within this matrix contains location coordinates as well as an
attribute value. The origin of the rows and columns is at the upper left corner
of the grid.
Rows function as the "y" coordinate and columns as the "x" coordinate in a
two-dimensional system. A cell is defined by its location in terms of rows and
columns.
Each row contains a group of cells with values representing a
geographic phenomenon.
Cell values are numbers, which may represent nominal data such as
land-use classes, measures of light intensity, or relative measures.
The cells in each line of the image are mirrored by an equivalent row
of numbers in the file structure.
In a simple raster data structure, different spatial features
must be stored as separate data layers.
However, if the entities do not occupy the same geographical location,
it is possible to store them all in a single layer, with an entity
code given to each cell.
The figure shows how different land uses can be coded in a raster
layer. The values 1, 2 and 3 have been used to classify the raster
cells according to the land use present at a given location: 1
represents residential area, 2 forest, and 3 farm land.
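The coding scheme above can be sketched in a few lines of Python. The grid values and the helper names (cell_value, count_cells) are made up for illustration; only the codes 1, 2 and 3 come from the slide.

```python
# Hypothetical 4 x 5 land-use raster. Codes follow the slide:
# 1 = residential area, 2 = forest, 3 = farm land.
LAND_USE = {1: "residential area", 2: "forest", 3: "farm land"}

grid = [
    [1, 1, 2, 2, 2],
    [1, 1, 2, 2, 3],
    [1, 3, 3, 3, 3],
    [3, 3, 3, 3, 3],
]


def cell_value(grid, row, col):
    """Return the attribute value of a cell; (0, 0) is the upper left corner."""
    return grid[row][col]


def count_cells(grid):
    """Count how many cells carry each land-use code."""
    counts = {}
    for row in grid:
        for value in row:
            counts[value] = counts.get(value, 0) + 1
    return counts


if __name__ == "__main__":
    print(LAND_USE[cell_value(grid, 1, 4)])
    print(count_cells(grid))
```

Because each cell holds exactly one code, this single-layer layout only works when the entities do not overlap, as noted above.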
Vector Data Model
A vector spatial data model uses two-dimensional Cartesian [x, y]
coordinates to store the shape of a spatial entity.
In the vector model, spatial data are represented using points. The point
is the basic building block from which all spatial entities are constructed.
The simplest spatial entity, the point, is represented by a single
coordinate pair.
Line and area entities are constructed by connecting a series of
points into chains and polygons.
The more complex the shape of a line or area feature, the greater the
number of points required to represent it.
In the vector data structure all points must be numbered
sequentially, and each polygon must carry an explicit reference that
records which points are associated with it.
This is known as a point dictionary (Burrough, 1986).
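A minimal sketch of a point dictionary, assuming two adjacent rectangular polygons with invented coordinates. The names points, polygons, polygon_coordinates and shared_points are illustrative only.

```python
# Point dictionary: every point is stored once under a sequential number.
points = {
    1: (0.0, 0.0),
    2: (4.0, 0.0),
    3: (4.0, 3.0),
    4: (0.0, 3.0),
    5: (8.0, 0.0),
    6: (8.0, 3.0),
}

# Each polygon records which points belong to it, in boundary order.
polygons = {
    "A": [1, 2, 3, 4],
    "B": [2, 5, 6, 3],
}


def polygon_coordinates(polygon_id):
    """Look up the coordinates of a polygon through the point dictionary."""
    return [points[number] for number in polygons[polygon_id]]


def shared_points(id_a, id_b):
    """Point numbers referenced by both polygons (their common boundary)."""
    return sorted(set(polygons[id_a]) & set(polygons[id_b]))


if __name__ == "__main__":
    print(polygon_coordinates("B"))
    print(shared_points("A", "B"))
```

Points 2 and 3 are stored once but referenced by both polygons, so the shared boundary is not duplicated.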
The vector data structure mainly ensures the following:
• No node or line segment is duplicated.
• Line segments and nodes can be referenced by more than one polygon.
• All polygons have unique identifiers.
• Polygons can be adequately represented.
The following diagram shows the vector data structure.
In a vector structure, topology is concerned with the connectivity between
entities and not with their physical shape.
Boundaries are identified through a network of arcs, by checking polygons for
closure and linking arcs into polygons.
The area of each polygon can be calculated, and a unique identification number
is attached.
This identifier allows non-spatial information to be linked to a specific
polygon.
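As an illustration of the last two points, the sketch below computes polygon areas with the shoelace formula and links an attribute record through the polygon identifier. The identifiers 101 and 102, the coordinates and the attribute values are all made up.

```python
def polygon_area(vertices):
    """Shoelace formula: area of a simple polygon from its ordered vertices."""
    total = 0.0
    n = len(vertices)
    for i in range(n):
        x1, y1 = vertices[i]
        x2, y2 = vertices[(i + 1) % n]  # wrap around to close the polygon
        total += x1 * y2 - x2 * y1
    return abs(total) / 2.0


# Hypothetical polygons keyed by their unique identifiers.
polygons = {
    101: [(0, 0), (4, 0), (4, 3), (0, 3)],
    102: [(4, 0), (8, 0), (6, 3)],
}

# Non-spatial (attribute) data linked through the same identifiers.
attributes = {
    101: {"land_use": "forest"},
    102: {"land_use": "farm land"},
}

if __name__ == "__main__":
    for polygon_id, vertices in polygons.items():
        record = attributes[polygon_id]
        print(polygon_id, record["land_use"], polygon_area(vertices))
```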
4. Comparison of raster and vector data models
The vector model represents location as x,y coordinates in a Cartesian
coordinate system. The raster model represents location as cells, also in a
Cartesian coordinate system. Raster data store rows and columns of cell
values.
The vector model represents features with well-defined boundaries; the
raster model represents a more generalized view.
The primary focus of the vector data model is the geographic feature; the
primary focus of the raster data model is location.
The vector model represents feature shape accurately; the raster model
represents rectangular areas and thus is more generalized and less
accurate.
5. MODELLING THE THIRD DIMENSION AND
FOURTH DIMENSION IN GIS
MODELLING THE THIRD DIMENSION
In the real world, all features exist in three dimensions.
In terms of its display capabilities, the computer screen is a
two-dimensional display device, even though with clever
graphics it is possible to simulate the appearance of the third
dimension.
In GIS the only medium for displaying 3D is the computer
screen.
To produce a system capable of representing the complexities of
the real world, we need to portray the third dimension in more
than a visual way.
Representing the third dimension of an entity can
also help us model the form of the entity and the associated spatial
processes.
3D is an integral part of the GIS toolbox.
THREE DIMENSIONAL MAP WITH Z COORDINATES
MODELLING THE FOURTH DIMENSION
Building a GIS is often a long-term process, so it is more than likely that
the collage of data will include entities from different periods in time.
Modelling time is made more complex since there are several different
sorts of time that GIS developers might need to consider, such as work
practice time, database time and future time.
Work practice time is the temporal state of a GIS database used by
many people.
The fourth dimension is temporal in character.
The problems in developing a data structure that is flexible enough to
handle the time element of entities include what to include and how
frequently to update the information.
Most users represent the different temporal states of a spatial entity
either by creating a separate data layer for each time period or by
recording the state of the entity at a given time.
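The "separate data layer for each time period" approach can be sketched as follows. The years, the 2 x 2 grids and the function name changed_cells are assumptions made for illustration.

```python
# One raster layer per time period (hypothetical land-use codes).
snapshots = {
    2000: [[2, 2],
           [2, 3]],
    2010: [[2, 1],
           [3, 3]],
}


def changed_cells(older, newer):
    """Return (row, col) positions whose value differs between two layers."""
    return [
        (r, c)
        for r, row in enumerate(older)
        for c, value in enumerate(row)
        if newer[r][c] != value
    ]


if __name__ == "__main__":
    print(changed_cells(snapshots[2000], snapshots[2010]))
```

Every unchanged cell is stored again in each layer, which is one reason the question of how frequently to update matters.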
6. DATA EDITING
A Geographic Information System represents real-world conditions
with the aid of a computer. It is a tool for analyzing problems.
For that we need data, which may be spatial or non-spatial.
These data may include errors. Errors may come from the
original source or be introduced during encoding.
Before the data are processed it is essential to identify and eliminate
the errors; otherwise they will contaminate the GIS database.
The pre-processing of GIS data, i.e. data editing, can be grouped into
the following:
• Detecting and correcting errors.
• Reprojection, transformation and generalization.
• Edge matching and rubber sheeting.
7. DETECTION AND CORRECTION OF ERRORS
Errors in input data may derive from three main sources.
They are:
• Errors in the sources of data: these may be errors in the maps used by
surveyors, or printing errors.
• Errors while encoding: these may be scanning errors, digitizing errors,
typing errors, etc.
• Errors at the time of transfer and conversion: transferring and converting
data between different formats can introduce errors and data loss.
ATTRIBUTE DATA EDITING
Attribute data may also contain errors. These can be identified by
checking manually and comparing with the original data.
There are many methods for checking and correcting attribute data. Some of
them are:
• Impossible values: when the valid range of the data is known, values
outside it can be flagged as errors.
• Extreme values: unusually high or low values can point to errors in the data.
• Internal consistency: totals and averages can be checked by tallying.
• Scatter diagram: where two attributes are correlated, errors can be
identified as points that depart from the pattern on a scatter diagram.
• Trend surface: it highlights points whose values lie far
from the other figures.
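The first three checks can be sketched with the Python standard library. The function names, the sample values and the threshold of two standard deviations are assumptions for illustration, not a prescribed method.

```python
from statistics import mean, pstdev


def impossible_values(values, low, high):
    """Indices of values outside the known valid range [low, high]."""
    return [i for i, v in enumerate(values) if not (low <= v <= high)]


def extreme_values(values, k=2.0):
    """Indices of values more than k standard deviations from the mean."""
    centre = mean(values)
    spread = pstdev(values)
    if spread == 0:
        return []
    return [i for i, v in enumerate(values) if abs(v - centre) > k * spread]


def totals_consistent(parts, recorded_total, tolerance=1e-9):
    """Internal consistency: do the parts add up to the recorded total?"""
    return abs(sum(parts) - recorded_total) <= tolerance


if __name__ == "__main__":
    forest_cover_percent = [12, 35, 140, 60, -5]      # valid range 0 to 100
    print(impossible_values(forest_cover_percent, 0, 100))
    print(extreme_values([10, 11, 9, 10, 12, 10, 95]))
    print(totals_consistent([120, 80, 50], 250))
```

A flagged value is only a candidate error; it still has to be compared with the original data before it is corrected.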
SPATIAL DATA EDITING
• Spatial data errors create more problems and are more difficult to identify.
1. RASTER DATA EDITING
• Raster data editing refers to correcting specific contents of raster
images rather than their general characteristics.
• Commonly used raster data editing functions are:
Filling holes and gaps: used to fill holes and gaps that appear in raster
images.
Edge smoothing and boundary simplification: removes or fills single-pixel
irregularities along line edges.
Deskewing: used to rotate the image.
Speckle removal or filtering: removes speckles, i.e. random high- or
low-valued pixels in the image.
Erase and delete: removes unwanted pixels.
Thinning: reduces the representation of linear features to single-cell
width, while preserving sharp corners and rounded corners.
Clipping: cuts out and removes specific portions of a raster image.
Drawing and rasterisation: adds vector graphics or text in raster
form to a new image.
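Speckle removal can be illustrated with a very simple neighbourhood rule: a cell whose value matches none of its neighbours is treated as a speckle and replaced by the most common neighbouring value. This rule and the name remove_speckles are assumptions for the sketch; GIS packages offer a range of filters.

```python
from collections import Counter


def remove_speckles(grid):
    """Replace isolated cells with the most common value among their neighbours."""
    rows, cols = len(grid), len(grid[0])
    result = [row[:] for row in grid]  # work on a copy, read from the original
    for r in range(rows):
        for c in range(cols):
            neighbours = [
                grid[rr][cc]
                for rr in range(max(r - 1, 0), min(r + 2, rows))
                for cc in range(max(c - 1, 0), min(c + 2, cols))
                if (rr, cc) != (r, c)
            ]
            # A cell with no matching neighbour is treated as a speckle.
            if neighbours and grid[r][c] not in neighbours:
                result[r][c] = Counter(neighbours).most_common(1)[0][0]
    return result


noisy = [
    [2, 2, 2, 2],
    [2, 9, 2, 2],
    [2, 2, 3, 3],
    [2, 2, 3, 3],
]

if __name__ == "__main__":
    for row in remove_speckles(noisy):
        print(row)
```

The block of 3s survives because each of its cells has at least one neighbour with the same value; only the lone 9 is replaced.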
2. VECTOR DATA EDITING
Errors may also occur in vector data. These errors arise mainly from
the digitizing process.
Most GIS packages provide editing tools for identifying and
removing errors in vector data.
Some of the errors and the measures for correcting them are:
PSEUDO NODES
These are false nodes occurring where a line connects with itself, or where
two lines intersect along a parallel path rather than crossing.
These incorrect nodes can be corrected by selecting and deleting them
where necessary, or by adding nodes where needed to complete a
polygon.
DANGLING NODES
A dangling node is a single node connected to a single line entity.
It can result from three possible mistakes:
Unclosed polygon: failure to close the polygon.
Undershoot: failure to connect the node to the object to which it was
supposed to be connected.
Overshoot: the node goes beyond the entity to which it was supposed
to be connected.
For an undershoot, the node is moved or snapped to the object to which
it should be connected.
Overshoot errors can be corrected by identifying the intended line
intersection points and clipping the line so that it connects where it is
supposed to.
An open polygon can be closed simply by moving one of the nodes to connect
with the other.
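A rough sketch of how dangling nodes might be detected and an undershoot snapped. It assumes each line is a list of (x, y) vertices and, for simplicity, snaps only to existing nodes rather than to the nearest position along a line. The names dangling_nodes, snap_node and digitized are hypothetical.

```python
from collections import Counter
from math import hypot


def dangling_nodes(lines):
    """End points that are used by only one line end."""
    ends = Counter()
    for line in lines:
        ends[line[0]] += 1
        ends[line[-1]] += 1
    return sorted(point for point, uses in ends.items() if uses == 1)


def snap_node(node, candidates, tolerance):
    """Move a node to the nearest candidate node if it lies within tolerance."""
    nearest = min(candidates, key=lambda p: hypot(p[0] - node[0], p[1] - node[1]))
    if hypot(nearest[0] - node[0], nearest[1] - node[1]) <= tolerance:
        return nearest
    return node  # too far away: leave the node untouched


# Hypothetical digitized boundary that fails to close at (0, 0).
digitized = [
    [(0, 0), (4, 0)],
    [(4, 0), (4, 3)],
    [(4, 3), (0.0, 0.3)],
]

if __name__ == "__main__":
    print(dangling_nodes(digitized))
    print(snap_node((0.0, 0.3), [(0, 0)], tolerance=0.5))
```

Choosing the tolerance is a judgment call: a value that is too large can snap nodes that were meant to stay apart.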
LABEL ERRORS
Two types of label error can occur with polygons.
One is missing labels and the other is too many labels. We can rectify them by
adding or deleting labels wherever necessary.
SLIVER POLYGONS
Vector data creates each polygon as a separate entity.
A sliver polygon is a small polygon which is an artifact of error rather than
a representation of a real-world feature.
If the adjacent lines between polygons have to be digitized more than once,
failure to digitize them in exactly the same place creates slivers. Slivers
are also often the result of overlay operations.
The easiest way to avoid them is to use a system that does not require the
same line to be digitized twice, which is becoming very common.
WEIRD POLYGONS: These are inadmissible loops that occur while digitizing.
8. REPROJECTION, TRANSFORMATION AND
GENERALIZATION
Once spatial and attribute data have been encoded and edited, it is
necessary to process the data geometrically in order to provide a
common reference.
Data derived from various sources should be converted into a
common projection before they are combined and analyzed.
If they are not reprojected, data derived from a source map using one
projection will not plot in the same location as data derived from another
source using a different projection.
Data derived from different sources may also have different
coordinate systems.
They may have different origins, units of measurement and
orientations, so it is necessary to transform them into a common grid
system.
This involves some mathematical calculations.
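The calculation for differing origins, units and orientations can be sketched as a plane transformation: rotate, scale, then shift. This is only an illustration with invented parameter values; it is not a map reprojection, which needs the projection equations themselves.

```python
from math import cos, radians, sin


def transform(point, origin=(0.0, 0.0), scale=1.0, rotation_deg=0.0):
    """Rotate about the local origin, rescale the units, then shift the origin."""
    x, y = point
    angle = radians(rotation_deg)
    x_rot = x * cos(angle) - y * sin(angle)
    y_rot = x * sin(angle) + y * cos(angle)
    return (origin[0] + scale * x_rot, origin[1] + scale * y_rot)


if __name__ == "__main__":
    # Hypothetical case: local coordinates in feet, common grid in metres,
    # with the local origin located at (5000, 3000) in the common grid.
    print(transform((100, 200), origin=(5000, 3000), scale=0.3048))
```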
Data may be derived from different maps with different
scales.
Generalization should be carried out when comparing data of
large and small scales.
It also helps to save time and reduce storage
space.
The simplest method of generalization is to delete the points
between two points within a specified interval. But this will
not preserve the shape of the object.
When we generalize a map, data loss is the main problem. But
generalization is necessary when comparing maps of different scales.
Alternatively, a compaction technique can be used, which
helps to reduce storage space without any data loss.
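The simplest method described above, dropping the points that fall between every n-th vertex, might look like this. The function name thin_line and the sample line are made up for illustration.

```python
def thin_line(points, interval):
    """Keep every interval-th vertex plus the last one; drop the rest."""
    if interval < 1:
        raise ValueError("interval must be at least 1")
    if len(points) <= 2:
        return list(points)
    kept = list(points[:-1:interval])  # vertices 0, interval, 2 * interval, ...
    kept.append(points[-1])            # always keep the end point
    return kept


# Hypothetical digitized line with a sharp peak at x = 3.
line = [(0, 0), (1, 0.2), (2, -0.1), (3, 2), (4, 0.1), (5, 0), (6, 0)]

if __name__ == "__main__":
    print(thin_line(line, 3))
    print(thin_line(line, 2))
```

With an interval of 2 the peak at (3, 2) disappears, which shows why this method does not preserve the shape of the object.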
9. EDGE MATCHING AND RUBBER SHEETING
When the study area extends across two or more map sheets, small
differences and mismatches may occur.
Normally each map sheet is digitized separately, and the adjacent sheets
are then joined after editing, reprojection, transformation and
generalization.
This joining process is known as edge matching.
Mismatches at sheet boundaries must be resolved.
When the maps are joined, the adjacent lines and polygons may not meet.
These should be corrected to complete the features and to ensure that the
data are topologically correct.
For use as a vector data layer, topology must be rebuilt, as new lines and
polygons have been created from the segments that lie across the sheets.
Edge matching can be automated, but problems may arise from the tolerance.
If the tolerance is too large, some small lines and polygons may be lost. If
the tolerance is too small, some of the lines may remain unjoined.
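The effect of the tolerance can be seen in a small sketch that pairs line end points from two adjacent sheets. The function name match_edges and the coordinates are assumptions; a real edge-matching tool would also move the vertices and rebuild topology.

```python
from math import hypot


def match_edges(ends_a, ends_b, tolerance):
    """Pair end points on sheet A with the nearest unused end point on sheet B."""
    unused = list(ends_b)
    joined, unjoined = [], []
    for a in ends_a:
        nearest = min(
            unused,
            key=lambda b: hypot(a[0] - b[0], a[1] - b[1]),
            default=None,
        )
        if nearest is not None and hypot(a[0] - nearest[0], a[1] - nearest[1]) <= tolerance:
            unused.remove(nearest)
            joined.append((a, nearest))
        else:
            unjoined.append(a)
    return joined, unjoined


# Hypothetical line ends along the shared boundary near x = 100.
sheet_a = [(100.0, 10.0), (100.0, 50.0)]
sheet_b = [(100.2, 10.1), (100.1, 53.0)]

if __name__ == "__main__":
    print(match_edges(sheet_a, sheet_b, tolerance=0.5))
    print(match_edges(sheet_a, sheet_b, tolerance=5.0))
```

With a tolerance of 0.5 the second line stays unjoined; with 5.0 both join, but such a large value could equally join lines that do not belong together.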
The redundant map sheet boundary lines should be deleted
or dissolved.
Some data sources may have internal distortions within
individual map sheets.
E.g. some aerial photographs may have internal
inaccuracies even after reprojection, due to movement of the
aircraft or distortion caused by the camera.
This can be rectified by "rubber sheeting".
The process involves stretching the map in various
directions as if it were drawn on a rubber sheet.
Some control points are fixed and the map is stretched around them.
A lack of control points and the processing time required can cause
problems.
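One simple way to mimic the stretching is to move every point by an inverse-distance-weighted average of the control-point displacements. This is a swapped-in simplification chosen for brevity, not necessarily what a GIS package does; the name rubber_sheet and the control points are hypothetical.

```python
def rubber_sheet(point, control_points, power=2):
    """Shift a point by the distance-weighted average control-point displacement.

    control_points is a non-empty list of ((x_map, y_map), (x_true, y_true)).
    """
    x, y = point
    weight_sum = shift_x = shift_y = 0.0
    for (sx, sy), (tx, ty) in control_points:
        dist_sq = (x - sx) ** 2 + (y - sy) ** 2
        if dist_sq == 0:
            return (tx, ty)  # a control point moves exactly to its true position
        weight = 1.0 / dist_sq ** (power / 2)
        weight_sum += weight
        shift_x += weight * (tx - sx)
        shift_y += weight * (ty - sy)
    return (x + shift_x / weight_sum, y + shift_y / weight_sum)


# Hypothetical controls: the left edge is correct, the right edge is 1 unit short.
controls = [((0, 0), (0, 0)), ((10, 0), (11, 0))]

if __name__ == "__main__":
    print(rubber_sheet((10, 0), controls))
    print(rubber_sheet((5, 0), controls))
```

Points near a control point follow it closely, while points far from every control point are only loosely constrained, which is why a lack of control points is a problem.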
10. Conclusion
In conclusion, spatial data modeling provides a structured approach to
organizing and representing geographic information, essential for
geographic information systems (GIS).
There are two types of spatial data model, vector and raster,
each suited to different types of geographic data representation.
Data editing within a GIS involves modifying, updating, and refining
geographic information to ensure accuracy and relevance.
This process includes tasks such as digitizing new features, correcting
errors, updating attribute information, and enhancing existing data.
Effective data editing is crucial for maintaining the quality and usability of
spatial data, supporting informed decision-making and analysis in diverse
fields ranging from urban planning to environmental management.
11. Bibliography
Wikipedia
Research.library.gsu.edu
Researchgate
Quora
Gisgeography
THANK YOU