UNIVERSITY OF CAPE COAST
COLLEGE OF AGRICULTURE AND NATURAL SCIENCES
SCHOOL OF PHYSICAL SCIENCES
DEPARTMENT OF COMPUTER SCIENCE AND INFORMATION TECHNOLOGY
MANAGEMENT SUPPORT SYSTEM
REGISTRATION NUMBER: PS/ITC/21/0087
COURSE CODE: INF310
ASSIGNMENT 2: DATA MINING
Question 1
What is data mining?
Data mining is the process of discovering patterns, correlations, anomalies, and other significant
structures in large datasets, often in order to predict outcomes. By applying techniques from
statistics, machine learning, and database systems, data mining transforms raw data into
valuable information.
Question 2
Explain the steps involved in the data mining process.
1. Data Collection: Gathering relevant data from sources such as databases, surveys, and
web pages.
2. Data Preprocessing: Cleaning and transforming raw data to prepare it for analysis. This
includes handling missing values, removing noise, and formatting the data.
3. Exploratory Data Analysis (EDA): Analyzing the data with statistical tools and
visualization techniques to understand its characteristics, identify patterns, and
formulate hypotheses.
4. Feature Selection: Selecting the most relevant variables or attributes for analysis,
which reduces dimensionality and can improve model performance.
5. Model Building: Applying appropriate data mining algorithms to the prepared data to
build predictive or descriptive models.
6. Model Evaluation: Assessing the performance and accuracy of the models using
appropriate metrics and validation techniques such as cross-validation.
7. Model Deployment: Integrating the developed models into real-world applications or
systems, where they generate insights and support decision-making.
8. Model Monitoring and Maintenance: Regularly monitoring the performance of the
deployed models and updating or refining them as needed to maintain their effectiveness
over time.
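The steps above can be sketched end to end on a toy problem. The example below is only an
illustration under stated assumptions: the dataset (advertising spend against sales) is made up,
the names fit_line, predict, and mean_absolute_error are hypothetical helpers, and a simple
least-squares line with a hold-out split stands in for whatever algorithm and validation scheme a
real project would choose.

```python
# Step 1, data collection: hypothetical (ad spend, sales) records; None marks a missing value.
raw_records = [
    (1.0, 3.1), (2.0, 4.9), (3.0, None), (4.0, 9.2), (5.0, 10.8), (6.0, 13.1),
    (None, 7.0), (7.0, 15.0), (8.0, 16.9), (9.0, 19.2), (10.0, 21.0),
]

# Step 2, preprocessing: drop records with a missing field.
clean = [(x, y) for x, y in raw_records if x is not None and y is not None]

# Step 3, exploratory analysis: basic summary statistics.
xs = [x for x, _ in clean]
ys = [y for _, y in clean]
print(f"{len(clean)} usable records, mean spend {sum(xs) / len(xs):.2f}, "
      f"mean sales {sum(ys) / len(ys):.2f}")

# Step 4, feature selection: there is only one candidate feature here, so nothing to select.

# Step 5, model building: hold out every third record and fit a line on the rest.
test = clean[2::3]
train = [record for i, record in enumerate(clean) if i % 3 != 2]


def fit_line(points):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    mean_x = sum(x for x, _ in points) / len(points)
    mean_y = sum(y for _, y in points) / len(points)
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


slope, intercept = fit_line(train)


def predict(x):
    """Apply the fitted line to a new input."""
    return slope * x + intercept


# Step 6, model evaluation: mean absolute error on the held-out records.
def mean_absolute_error(points):
    return sum(abs(predict(x) - y) for x, y in points) / len(points)


mae = mean_absolute_error(test)
print(f"slope={slope:.3f} intercept={intercept:.3f} hold-out MAE={mae:.3f}")

# Step 7, deployment: the fitted model is used on unseen input.
print(f"predicted sales for spend 11.0: {predict(11.0):.2f}")

# Step 8, monitoring: in practice mean_absolute_error would be recomputed on fresh
# data at intervals, and the model refitted when the error drifts upward.
```

A single hold-out split keeps the sketch short; cross-validation, as named in step 6, would
repeat the fit and evaluation over several different splits and average the error.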
Question 3
List any three fields where the application of data mining is very
relevant.
1. Business and Marketing
2. Healthcare
3. Finance and Banking
Question 4
Explain any four different techniques of data mining.
Classification: Assigning items to predefined categories or classes. Example: Spam email
detection.
Clustering: Grouping a set of objects in such a way that objects in the same group are more
similar to each other than to those in other groups. Example: Market segmentation.
Association Rule Learning: Discovering interesting relations between variables in large
databases. Example: Market basket analysis.
Regression: Predicting a continuous-valued attribute based on other variables. Example:
Predicting house prices.
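Of the four techniques above, association rule learning is the easiest to work through by hand.
The sketch below computes the two standard measures for a rule A -> B: support (the fraction of
transactions containing both A and B) and confidence (the fraction of transactions containing A
that also contain B). The baskets, the thresholds, and the helper names are assumptions made
for illustration, and the brute-force search over item pairs is not how large-scale algorithms
such as Apriori explore candidates.

```python
from itertools import combinations

# Hypothetical shopping baskets for a market basket analysis.
transactions = [
    {"bread", "milk"},
    {"bread", "butter", "milk"},
    {"bread", "butter"},
    {"milk", "eggs"},
    {"bread", "butter", "eggs"},
]


def count(itemset):
    """Number of transactions that contain every item in itemset."""
    itemset = set(itemset)
    return sum(1 for basket in transactions if itemset <= basket)


def support(itemset):
    """Fraction of all transactions that contain the itemset."""
    return count(itemset) / len(transactions)


def confidence(antecedent, consequent):
    """Of the transactions containing the antecedent, the fraction that also
    contain the consequent."""
    return count(set(antecedent) | set(consequent)) / count(antecedent)


MIN_SUPPORT = 0.4      # assumed threshold
MIN_CONFIDENCE = 0.7   # assumed threshold

# Try every single-item rule lhs -> rhs and keep those that pass both thresholds.
items = sorted(set().union(*transactions))
rules = []
for a, b in combinations(items, 2):
    for lhs, rhs in ((a, b), (b, a)):
        s = support({lhs, rhs})
        c = confidence({lhs}, {rhs})
        if s >= MIN_SUPPORT and c >= MIN_CONFIDENCE:
            rules.append((lhs, rhs, s, c))

for lhs, rhs, s, c in rules:
    print(f"{lhs} -> {rhs}: support={s:.2f}, confidence={c:.2f}")
```

With these baskets, only the bread and butter pairing passes both thresholds, in both
directions. Bread and milk appear together often enough to meet the support threshold, but
neither item predicts the other reliably enough to meet the confidence threshold.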
Question 5
Explain 4 benefits of data mining.
Improved Decision Making: Provides data-driven insights that support strategic decisions.
Enhanced Customer Service: Data mining helps organizations better understand customer
needs and preferences, leading to improved customer satisfaction and engagement.
Fraud Detection: Identifies unusual patterns that could indicate fraudulent activity.
Cost Reduction: By optimizing operations and identifying inefficiencies through data mining,
organizations can achieve cost savings and better resource utilization.
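The fraud detection benefit rests on spotting values that do not fit the usual pattern. A
minimal sketch of that idea follows, assuming a made-up list of card transaction amounts and a
hypothetical helper named flag_outliers. It uses the modified z-score, which measures distance
from the median in units of the median absolute deviation; the cutoff of 3.5 is a commonly
quoted rule of thumb, not a fixed standard. Real fraud systems combine many more signals than
the amount alone.

```python
import statistics

# Hypothetical card transaction amounts; one of them is far larger than the rest.
amounts = [42.0, 38.5, 51.0, 47.2, 40.1, 44.8, 39.9, 950.0, 43.3, 46.0]


def flag_outliers(values, cutoff=3.5):
    """Return the values whose modified z-score exceeds the cutoff."""
    median = statistics.median(values)
    # Median absolute deviation: a spread measure that a single extreme
    # value cannot inflate the way it inflates the standard deviation.
    mad = statistics.median(abs(v - median) for v in values)
    if mad == 0:
        return []  # no spread at all, so nothing can be scored
    return [v for v in values if 0.6745 * abs(v - median) / mad > cutoff]


suspicious = flag_outliers(amounts)
print("amounts flagged for review:", suspicious)
```

The median-based score is used here instead of the ordinary mean and standard deviation
because, in a small sample, one extreme amount pulls both of those toward itself and can hide
the very value it should expose.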
Question 6
Explain 2 limitations of data mining.
Privacy Concerns: Mining sensitive data can lead to privacy issues and potential misuse of
information.
Quality of Data: The effectiveness of data mining depends heavily on the quality of the data.
Inaccurate or incomplete data can lead to misleading results.
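The data quality limitation can be seen in a very small example. The sketch below assumes a
made-up list of recorded ages in which -1 was used as a placeholder for a missing value and 250
is a data entry error. The valid range of 0 to 120 is likewise an assumption chosen for
illustration.

```python
import statistics

# Hypothetical recorded ages: -1 stands for "missing", 250 is a typing error.
recorded_ages = [34, 29, -1, 41, 250, 38, -1, 33]

# Mining the raw values treats the placeholder and the error as real ages.
raw_mean = statistics.mean(recorded_ages)

# Keeping only plausible ages (assumed range 0 to 120) before analysis.
valid_ages = [age for age in recorded_ages if 0 <= age <= 120]
clean_mean = statistics.mean(valid_ages)

print(f"mean age from raw data:     {raw_mean}")
print(f"mean age after cleaning:    {clean_mean}")
print(f"records dropped as invalid: {len(recorded_ages) - len(valid_ages)}")
```

Three bad records out of eight are enough to move the average age from 35 to almost 53, which
would mislead any model or decision built on it.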