Research Methodology (Data Analysis)

Data Processing

Data processing is concerned with editing, coding, classifying, tabulating, charting and diagramming
research data. The essence of data processing in research is data reduction. Data reduction involves winnowing
out the irrelevant from the relevant data, establishing order out of chaos and giving shape to a mass of
data. Data processing in research consists of the following important steps:

1.Editing of Data

Editing is the first step in data processing. Editing is the process of examining the data collected in
questionnaires/schedules to detect errors and omissions and to see that they are corrected and the schedules are
ready for tabulation. There are different types of editing. They are:

1. Editing for quality asks the following questions: Are the data forms complete? Are the data free of bias?
Are the recordings free of errors? Are inconsistencies in responses within limits? Is there evidence of
dishonesty by enumerators or interviewers? Is there any wanton manipulation of data? (A simple sketch of
such checks follows this list.)
2. Editing for tabulation makes certain accepted modifications to the data, or even rejects certain pieces of
data, in order to facilitate tabulation. For instance, an extremely high or low value may be ignored or
bracketed within a suitable class interval.
3. Field Editing is done by the enumerator. The schedule filled in by the enumerator or the respondent
might contain abbreviated or illegible writing and the like. These are rectified by the enumerator, and this
should be done soon after the enumeration or interview, before memory fades. Field editing should not
extend to guessing data to fill in omissions.
4. Central Editing is done by the researcher after all schedules, questionnaires or forms are received from
the enumerators or respondents. Obvious errors can be corrected. For missing data or information, the
editor may substitute data by reviewing the information provided by other similarly placed respondents.
A clearly inappropriate answer is removed and “no answer” is entered when reasonable attempts to obtain
the appropriate answer fail.
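As a rough illustration of editing for quality, the sketch below scans hypothetical questionnaire records for omissions and out-of-range answers. The field names and the valid range are assumptions made for this example, not part of any standard.

```python
# A minimal editing-for-quality sketch: flag incomplete or
# inconsistent questionnaire records (field names are hypothetical).
records = [
    {"id": 1, "age": 34, "satisfaction": 4},
    {"id": 2, "age": None, "satisfaction": 5},   # omission
    {"id": 3, "age": 29, "satisfaction": 9},     # out of the 1-5 range
]

for rec in records:
    if rec["age"] is None:
        print(f"Record {rec['id']}: missing age (omission)")
    if rec["satisfaction"] is not None and not 1 <= rec["satisfaction"] <= 5:
        print(f"Record {rec['id']}: satisfaction out of range (inconsistency)")
```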

2.Coding of Data

Coding is necessary for efficient analysis; through it, the many replies can be reduced to a small number of
classes that contain the critical information required for analysis. Coding decisions should usually be taken at
the design stage of the questionnaire.

Coding is the process/operation by which data/responses are organized into classes/categories and numerals or
other symbols are assigned to each item according to the class in which it falls. In other words, coding involves
two important operations (illustrated after the list):

(a) Deciding the categories to be used and

(b) Allocating individual answers to them.
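For instance, the two operations might look like the sketch below, where the category scheme and the numeric codes are invented for illustration.

```python
# Coding sketch: (a) decide the categories, (b) allocate answers to them.
# The category scheme and codes below are hypothetical.
codebook = {"strongly disagree": 1, "disagree": 2, "neutral": 3,
            "agree": 4, "strongly agree": 5}

responses = ["agree", "neutral", "strongly agree", "disagree"]
coded = [codebook[r] for r in responses]
print(coded)  # [4, 3, 5, 2]
```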

3.Classification of Data

Classification or categorization is the process of grouping the data into understandable, homogeneous
groups for the purpose of convenient interpretation. Uniformity of attributes is the basic criterion for
classification, and the grouping of data is made according to similarity. Classification becomes necessary when
there is diversity in the collected data, which would otherwise be meaningless for presentation and analysis. A
good classification should have the characteristics of clarity, homogeneity, equality of scale, purposefulness and
accuracy.
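One common way to classify numeric data into homogeneous groups with equality of scale is interval binning. The sketch below uses pandas with invented ages and class limits.

```python
import pandas as pd

# Classification sketch: group respondents' ages (hypothetical data)
# into homogeneous class intervals of equal width.
ages = pd.Series([19, 23, 31, 37, 42, 48, 55, 61])
classes = pd.cut(ages, bins=[18, 30, 42, 54, 66],
                 labels=["18-30", "31-42", "43-54", "55-66"])
print(classes.value_counts().sort_index())
```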

4.Tabulation of Data

Tabulation is the process of summarizing raw data and displaying it in compact form for further analysis.
Therefore, preparing tables is a very important step. Tabulation may be done by hand, mechanically, or
electronically. The choice is made largely on the basis of the size and type of study, alternative costs, time
pressures, and the availability of computers and computer programmes. If the number of questionnaires is small
and their length short, hand tabulation is quite satisfactory. (A tabulation sketch follows the list of table parts
below.)

Generally a research table has the following parts:

• table number
• title of the table
• caption
• stub (row heading)
• body
• head note
• foot note
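For electronic tabulation, a cross-tabulation of two variables can be produced in a single call. The sketch below is a minimal example using pandas with invented survey data; the column names are assumptions.

```python
import pandas as pd

# Tabulation sketch: summarize raw (hypothetical) survey data
# into a compact two-way table for further analysis.
df = pd.DataFrame({
    "gender": ["F", "M", "F", "M", "F", "M"],
    "response": ["yes", "no", "yes", "yes", "no", "no"],
})
table = pd.crosstab(df["gender"], df["response"], margins=True)
print(table)
```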

Stages of Data Processing

1. Data collection

Collecting data is the first step in data processing. Data is pulled from available sources, including data lakes and
data warehouses. It is important that the data sources available are trustworthy and well-built so the data collected
(and later used as information) is of the highest possible quality.

2. Data preparation

Once the data is collected, it enters the data preparation stage. Data preparation, often referred to as “pre-
processing”, is the stage at which raw data is cleaned up and organized for the following stage of data processing.
During preparation, raw data is diligently checked for errors. The purpose of this step is to eliminate bad data
(redundant, incomplete, or incorrect data) and begin to create high-quality data for the best business intelligence.
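A minimal preparation sketch, assuming the raw data has already been loaded into a pandas DataFrame: it removes incomplete and redundant rows and trims stray whitespace. The column names and values are invented.

```python
import pandas as pd

# Pre-processing sketch: eliminate redundant, incomplete, or
# badly formatted rows (column names are hypothetical).
raw = pd.DataFrame({
    "name": [" Ann ", "Bob", "Bob", None],
    "score": [88, 92, 92, 75],
})
clean = (raw
         .dropna(subset=["name"])                        # drop incomplete rows
         .drop_duplicates()                              # drop redundant rows
         .assign(name=lambda d: d["name"].str.strip()))  # fix whitespace
print(clean)
```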

3. Data input

The clean data is then entered into its destination (perhaps a CRM like Salesforce or a data warehouse
like Redshift) and translated into a form that the destination system can understand. Data input is the first stage
at which raw data begins to take the form of usable information.
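As a rough sketch of data input, the example below writes cleaned records into a local SQLite table; SQLite merely stands in for a real destination such as a warehouse, and the table and column names are assumptions.

```python
import sqlite3
import pandas as pd

# Data-input sketch: write cleaned data into a destination store.
# A local SQLite file stands in here for a real warehouse.
clean = pd.DataFrame({"customer": ["Ann", "Bob"], "score": [88, 92]})
with sqlite3.connect("warehouse.db") as conn:
    clean.to_sql("customers", conn, if_exists="replace", index=False)
    print(pd.read_sql("SELECT * FROM customers", conn))
```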

4. Processing

During this stage, the data input to the computer in the previous stage is actually processed for interpretation.
Processing is often done using machine learning algorithms, though the process itself may vary slightly depending
on the source of the data being processed (data lakes, social networks, connected devices, etc.) and its intended
use (examining advertising patterns, medical diagnosis from connected devices, determining customer needs, etc.).

5. Data output/interpretation

The output/interpretation stage is the stage at which data finally becomes usable to non-data scientists. It is
translated and readable, often in the form of graphs, videos, images, or plain text. Members of the company or
institution can now begin to self-serve the data for their own data analytics projects.

6. Data storage

The final stage of data processing is storage. After all of the data is processed, it is then stored for future use.
While some information may be put to use immediately, much of it will serve a purpose later on. When data is
properly stored, it can be quickly and easily accessed by members of the organization when needed.

Challenges in Data Processing

1.Collection of Data

The very first challenge in data processing comes in the collection or acquisition of the correct data for the
input. The challenge here is to collect the exact data needed to get the proper result, since the result directly
depends on the input data. Hence, it is vital to collect correct and exact data to get the desired result.

2.Duplication of Data

Because data is collected from different sources, it often happens that there is duplication in the data. The
same entries and entities may appear a number of times during the data encoding stage. This duplicate data
is redundant and may produce an incorrect result.
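Once the data is in tabular form, removing such duplicates is usually mechanical. The sketch below keeps the first occurrence per key using pandas; the records and key column are invented.

```python
import pandas as pd

# Duplication sketch: the same entity collected from two sources.
# Keeping the first occurrence per key removes the redundancy.
df = pd.DataFrame({
    "customer_id": [101, 102, 101],
    "source": ["crm", "web", "web"],
    "email": ["a@x.com", "b@x.com", "a@x.com"],
})
deduped = df.drop_duplicates(subset=["customer_id"], keep="first")
print(deduped)
```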

3.Inconsistency of Data

When we collect a huge amount of data, there is no guarantee that it will be complete or that all the
fields we need are filled in correctly, so the data may be ambiguous. Because the input/raw data is heterogeneous
in nature and collected from autonomous data sources, the data may conflict at three different levels (a
normalization sketch follows the list):

• Schema Level: Different data sources have different data models and different schemas within the
same data model.
• Data representation level: Data in different sources are represented in different structures, languages,
and measurements.
• Data value level: Sometimes, the same data objects have factual discrepancies among various data
sources. This occurs when two data objects obtained from different sources are identified as versions
of each other, but the values of their attributes differ.
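As a concrete illustration of a representation-level conflict, the sketch below normalizes a quantity reported in different measurement units by two hypothetical sources.

```python
# Representation-level conflict sketch: two sources report the same
# quantity in different units, so values are normalized to one unit.
source_a = [{"city": "Pune", "temp_c": 31.0}]   # Celsius
source_b = [{"city": "Pune", "temp_f": 87.8}]   # Fahrenheit

unified = [{"city": r["city"], "temp_c": r["temp_c"]} for r in source_a]
unified += [{"city": r["city"], "temp_c": round((r["temp_f"] - 32) * 5 / 9, 1)}
            for r in source_b]
print(unified)  # both records are now in Celsius
```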
4.Variety of Data

The input data, as it is collected from different sources, can come in different forms. The data is not limited
to the rows and columns of a relational database; it varies from application to application and source to
source. Much of this data is unstructured and cannot fit into a spreadsheet or a relational database.

The collected data may be in text or tabular format. On the other hand, it may be a collection of
photographs and videos, or sometimes just audio. To get the desired result, there is often a need to process
different forms of data together.

5.Data Integration

Data integration means combining the data from various sources and presenting it in a unified view. With the
increased variety and differing formats of data, the challenge of integrating the data grows.

Data integration involves several challenges, as follows (a merge sketch follows the list):

• Isolation: The majority of applications are developed and deployed in isolation, which makes it difficult to
integrate data across applications.
• Technological Advancements: As technology advances, the ways of storing and retrieving data change.
The problem here is integrating newer data with legacy data.
• Data Problems: The challenge in data integration arises when the data is incorrect, incomplete or in the
wrong format.
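A minimal integration sketch, assuming two sources share a common identifier: pandas merges them into a unified view. The sources and column names are invented.

```python
import pandas as pd

# Integration sketch: combine two hypothetical sources into a
# unified view keyed on a shared identifier.
orders = pd.DataFrame({"customer_id": [1, 2], "amount": [250, 99]})
profiles = pd.DataFrame({"customer_id": [1, 2], "name": ["Ann", "Bob"]})
unified = profiles.merge(orders, on="customer_id", how="left")
print(unified)
```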

Data Analysis

Data Analysis is a process of collecting, transforming, cleaning, and modeling data with the goal of discovering
the required information. The purpose of Data Analysis is to extract useful information from data and to make
decisions based upon that analysis.

Whenever we take a decision in our day-to-day life, we do so by thinking about what happened last time or what
will happen if we choose that particular option. This is nothing but analyzing our past or future and making
decisions based on it. For that, we gather memories of our past or dreams of our future; that, too, is data analysis.
When an analyst does the same thing for business purposes, it is called Data Analysis.

Types of Data Analysis

1.Text Analysis

Text Analysis is also referred to as Data Mining. It is a method of discovering patterns in large data sets using
databases or data mining tools, and it is used to transform raw data into business information. Business Intelligence
tools on the market are used to take strategic business decisions. Overall, it offers a way to extract
and examine data, derive patterns, and finally interpret the data.
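At its simplest, discovering patterns in text can start from term frequencies. The sketch below counts words in a few invented customer comments; it is only a first step toward real text mining.

```python
from collections import Counter
import re

# Text-analysis sketch: find the most frequent terms in raw text
# (the comments are invented for illustration).
comments = ["delivery was late", "great product", "late delivery again"]
words = re.findall(r"[a-z]+", " ".join(comments).lower())
print(Counter(words).most_common(3))  # e.g. [('delivery', 2), ('late', 2), ...]
```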

2.Statistical Analysis

Statistical Analysis shows "What happened?" by using past data, often in the form of dashboards. Statistical
Analysis includes the collection, analysis, interpretation, presentation, and modeling of data. It analyses a set of
data or a sample of data. There are two categories of this type of Analysis, Descriptive Analysis and Inferential
Analysis (a short descriptive sketch follows the list):

• Descriptive Analysis: analyses complete data or a sample of summarized numerical data. It shows the mean and
deviation for continuous data, and percentages and frequencies for categorical data.

• Inferential Analysis: analyses a sample drawn from the complete data. In this type of Analysis, you can reach
different conclusions from the same data by selecting different samples.
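As a minimal sketch of Descriptive Analysis, assuming the data sits in a pandas DataFrame with invented column names, the following computes the mean and deviation of a continuous variable and the frequencies of a categorical one.

```python
import pandas as pd

# Descriptive-analysis sketch on invented data: mean and standard
# deviation for a continuous column, frequencies for a categorical one.
df = pd.DataFrame({
    "income": [42000, 55000, 61000, 48000],
    "segment": ["retail", "retail", "corporate", "retail"],
})
print(df["income"].mean(), df["income"].std())
print(df["segment"].value_counts(normalize=True))  # percentages
```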

3.Diagnostic Analysis

Diagnostic Analysis shows "Why did it happen?" by finding the cause from the insights found in Statistical
Analysis. This Analysis is useful for identifying behavior patterns in data. If a new problem arises in your business
process, you can look into this Analysis to find similar patterns of that problem, and you may be able to apply
similar prescriptions to the new problem.

4.Predictive Analysis

Predictive Analysis shows "what is likely to happen" by using previous data. The simplest example: if last
year I bought two dresses based on my savings, and this year my salary doubles, then I can buy four
dresses. But of course it is not as easy as this, because you have to think about other circumstances, such as the
chance that clothing prices increase this year, or that instead of dresses you want to buy a new bike, or that you
need to buy a house!

So here, this Analysis makes predictions about future outcomes based on current or past data. Forecasting is just
an estimate; its accuracy depends on how much detailed information you have and how deeply you dig into it.
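As an illustration only, a naive version of such a prediction can be a straight-line trend fitted to past figures; the yearly sales numbers below are invented.

```python
import numpy as np

# Predictive-analysis sketch: fit a straight-line trend to past
# (invented) yearly sales and extrapolate one year ahead.
years = np.array([2019, 2020, 2021, 2022])
sales = np.array([100, 112, 125, 138])
slope, intercept = np.polyfit(years, sales, deg=1)
print(round(slope * 2023 + intercept))  # naive forecast for 2023
```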

5.Prescriptive Analysis

Prescriptive Analysis combines the insights from all the previous Analyses to determine which action to take on a
current problem or decision. Most data-driven companies utilize Prescriptive Analysis because predictive
and descriptive Analysis alone are not enough to improve performance. Based on current situations and problems,
they analyze the data and make decisions.
Data Analysis Process

[Figure: the Data Analysis Process shown as a cycle: Data Requirement Gathering → Data Collection → Data Cleaning → Data Analysing → Data Interpretation → Data Visualisation]

Data Requirement Gathering

First of all, we have to think about why we want to do this data analysis. We need to find out the purpose
or aim of doing the Analysis and decide which type of data analysis we want to do. In this phase, we
have to decide what to analyze and how to measure it; we have to understand why we are investigating and what
measures we should use to do this Analysis.

Data Collection

After requirement gathering, we will have a clear idea of what we have to measure and what our findings should
be. Now it is time to collect data based on the requirements. Once we collect data, remember that it must be
processed or organized for Analysis. As we collect data from various sources, we must keep a log with the
collection date and source of the data.

Data Cleaning

Whatever data is collected may not be useful, or may be irrelevant to the aim of the Analysis, hence it should be
cleaned. The collected data may contain duplicate records, white spaces or errors. The data should be cleaned and
made error free. This phase must be done before Analysis because, based on the data cleaning, the output of the
Analysis will be closer to the expected outcome.

Data Analysis

Once the data is collected, cleaned, and processed, it is ready for Analysis. As we manipulate the data, we may
find that we have exactly the information we need, or that we need to collect more data. During this phase, we
can use data analysis tools and software which help us understand, interpret, and derive conclusions based on the
requirements.
Data Interpretation

After analyzing the data, it is finally time to interpret the results. We can choose how to express or communicate
the data analysis: simply in words, or perhaps with a table or chart. Then we use the results of the data analysis
process to decide the best course of action.

Data Visualization

Data visualization is very common in our day-to-day life; visualizations often appear in the form of charts and
graphs. In other words, data is shown graphically so that it is easier for the human brain to understand and
process. Data visualization is often used to discover unknown facts and trends. By observing relationships and
comparing datasets, we can find meaningful information.
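A minimal sketch of such a visualization, using matplotlib with invented monthly figures:

```python
import matplotlib.pyplot as plt

# Visualization sketch: a simple bar chart of (invented) monthly
# sales, so trends are easier for the eye to pick out.
months = ["Jan", "Feb", "Mar", "Apr"]
sales = [120, 135, 128, 150]
plt.bar(months, sales)
plt.title("Monthly Sales")
plt.ylabel("Units sold")
plt.show()
```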
