Information privacy and data mining
7/2/2019 Compiled by: Kamal Acharya 1
Introduction
• What is information privacy?
A number of definitions:
– Definition 1: Information privacy is the individual’s ability to control the
circulation of information relating to him/her.
– Definition 2: Information privacy is the claim of individuals, groups or
institutions to determine for themselves when, how and to what extent
information about them is communicated to others.
Basic principles to protect information privacy
• The OECD principles are given below:
– Collection limitation
– Data quality
– Purpose specification
– Use limitation
– Security safeguards
– Openness
– Individual participation
– Accountability
– International application
Contd..
• Collection limitation:
– Principle: There should be limits to the collection of personal data and any
such data should be obtained by lawful and fair means and, where
appropriate, with the knowledge or consent of the data subject.
– Description: This principle deals with two issues. Firstly, it limits the
collection of personal data, putting an end to indiscriminate collection.
Secondly, it deals with the practices used in data collection. The
knowledge or consent of the data subject is essential, although the principle
accepts that obtaining consent may not always be possible.
Contd..
• Data Quality:
– Principle: Personal data should be relevant to the purposes for which they
are to be used, and, to the extent necessary for those purposes, should be
accurate, complete and kept up-to-date.
– Description: This principle in essence states that data should be related to
the purpose for which it is to be used. For example, data including
opinions may not always be useful to collect since it may not be relevant
for the purposes for which the data is to be used.
Contd..
• Purpose specification:
– Principle: The purposes for which personal data are collected should be
specified not later than at the time of data collection and the subsequent
use limited to the fulfillment of those purposes or such others as are not
incompatible with those purposes and as are specified on each occasion of
change of purpose.
– Description: The purpose of data collection should be identified before
data collection and any changes afterwards must be specified. Also, when
data no longer serves a purpose, it may be necessary to destroy the data if
practicable.
Contd..
• Use limitation:
– Principle: Personal data should not be disclosed, made
available or otherwise used for purposes other than those
specified in accordance with principle 3 except with the
consent of the data subject or by the authority of law.
– Description: The earlier principles dealt mainly with
data collection. This principle deals with data use and data
disclosure.
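A minimal sketch of how the use limitation principle could be checked in software. The class PersonalRecord, its fields and the function may_use are hypothetical names chosen for illustration; they do not come from the OECD text.

```python
from dataclasses import dataclass, field


@dataclass
class PersonalRecord:
    """Hypothetical record that remembers why the data was collected."""
    subject: str
    data: dict
    specified_purposes: set = field(default_factory=set)  # fixed at collection time
    consented_purposes: set = field(default_factory=set)  # added later by the data subject


def may_use(record: PersonalRecord, purpose: str, authorized_by_law: bool = False) -> bool:
    """Allow a use only if it was specified, consented to, or authorized by law."""
    return (
        purpose in record.specified_purposes
        or purpose in record.consented_purposes
        or authorized_by_law
    )


record = PersonalRecord("alice", {"email": "alice@example.org"}, {"billing"})
billing_allowed = may_use(record, "billing")      # purpose specified at collection
marketing_allowed = may_use(record, "marketing")  # no specification, no consent
```

In this sketch, a new use such as marketing becomes permissible only after the data subject adds it to consented_purposes or a legal authority applies.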
Contd..
• Security safeguards:
– Principle: Personal data should be protected by reasonable
security safeguards against such risks as loss or unauthorized
access, destruction, use, modification or disclosure of data.
– Description: To enforce the principle of use limitations, it is
required that there be appropriate safeguards. The safeguards
should include physical safeguards as well as safeguards for
the computer systems.
Contd..
• Openness:
– Principle: There should be a general policy of openness about
developments, practices and policies with respect to personal
data. Means should be readily available of establishing the
existence and nature of personal data, and the main purposes
of their use, as well as the identity and usual residence of the
data controller.
Contd..
• Individual participation: An individual should have the right:
– To obtain from a data controller, or otherwise, confirmation
of whether or not the data controller has data relating to
him;
– To have communicated to him, data relating to him
• Within a reasonable time;
• At a charge, if any, that is not excessive;
• In a reasonable manner; and
• In a form that is readily intelligible to him;
– To be given reasons if a request made under the first two rights is
denied, and to be able to challenge such denial; and
– To challenge data relating to him and, if the challenge is
successful, to have the data erased, rectified, completed or
amended.
This is perhaps the most important principle. It gives individuals
the right to obtain information about the collection,
storage and use of their personal data. The right of access should be
simple to exercise.
Contd..
• Accountability:
– The data controller should be accountable for complying with
measures which give effect to the principles stated above.
– This principle requires that the data collector comply with the
privacy protection principles. Accountability should be
supported by legal sanctions and perhaps a code of conduct.
Uses and misuses of data mining
• Data mining involves the extraction of implicit, previously
unknown and potentially useful knowledge from large databases.
• Data mining is a very challenging task since it involves building
and using software that will manage, explore, summarize, model,
analyze and interpret large datasets in order to identify patterns
and abnormalities.
Contd..
• Uses:
– Data mining techniques are being used increasingly in a wide
variety of applications.
– The applications include:
• Fraud prevention
• Catching tax avoidance
• Catching drug smugglers
• Reducing customer churn and learning more about customers' behavior.
Contd..
• Misuses of data mining:
Primary aims of data mining
• Some aims of data mining are:
– The primary aim of data mining applications is to better
understand the customer and improve customer services.
– Some applications aim to discover anomalous patterns in
order to help identify, for example, fraud, abuse, waste,
terrorist suspects or drug smugglers.
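As an illustration of discovering anomalous patterns, the sketch below flags values that lie far from the mean of a list. The transaction amounts, the threshold of 2.0 and the function name flag_anomalies are all assumptions made for this example; real fraud detection systems use far richer models.

```python
from statistics import mean, pstdev


def flag_anomalies(values, threshold=2.0):
    """Return the values whose z-score magnitude exceeds the threshold."""
    mu = mean(values)
    sigma = pstdev(values)  # population standard deviation
    if sigma == 0:
        return []  # all values identical, nothing stands out
    return [v for v in values if abs(v - mu) / sigma > threshold]


# Hypothetical transaction amounts; one is far larger than the rest.
amounts = [20, 25, 22, 19, 24, 21, 500]
suspicious = flag_anomalies(amounts)
```

A flagged value is only a candidate for review. Treating it as proof of wrongdoing is exactly the kind of step that can lead to privacy violations.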
Contd..
• In many applications in private enterprises, the primary aim is
to improve the profitability of an enterprise.
• In many applications the primary purpose is to improve
human judgment, for example, in marketing diagnoses, in
resolving crime, in sorting out manufacturing problems, in
predicting share prices or currency movements or commodity
prices.
Contd..
• In some government applications, one of the aims of data mining is to
identify criminal activities.
• In some situations, data mining is used to find patterns that would
simply be impossible to find without it, given the huge amounts of
data that must be processed.
Many data mining systems designed to achieve one or more of these aims
do not lead to personal privacy violations, but some do. In particular,
applications that aim to discover anomalous behavior can lead to
significant violations of individual privacy.
Pitfalls of data mining
• Dempsey and Rosenzweig identify five pitfalls of data mining
when personally identifiable information is used in data mining
applications like identifying terrorist suspects.
• These five categories are:
– Unintentional Mistake: Mistaken Identity
– Unintentional Mistake: Faulty Inference
– Intentional abuse
– Security Breach
– Mission Creep
Contd..
• Unintentional Mistake: Mistaken Identity:
– This kind of problem arises when an innocent person shares
some identifier information with one or more persons that
have been identified as having a profile of interest in a data
mining application.
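The sketch below shows how mistaken identity can arise when matching relies on identifiers that several people share. The records, the field names and the function flag_matches are invented for illustration only.

```python
def flag_matches(people, profiles, keys):
    """Return the ids of people whose values for `keys` equal those of any profile."""
    profile_keys = {tuple(profile[k] for k in keys) for profile in profiles}
    return [
        person["id"]
        for person in people
        if tuple(person[k] for k in keys) in profile_keys
    ]


# Hypothetical profile of interest and traveler records.
profiles_of_interest = [{"name": "J. Smith", "birth_year": 1980}]
travelers = [
    {"id": "T1", "name": "J. Smith", "birth_year": 1980, "city": "Leeds"},
    {"id": "T2", "name": "J. Smith", "birth_year": 1980, "city": "Perth"},
    {"id": "T3", "name": "A. Jones", "birth_year": 1975, "city": "Leeds"},
]

# Matching on name and birth year alone flags both T1 and T2.
flagged = flag_matches(travelers, profiles_of_interest, ("name", "birth_year"))
```

Two distinct travelers are flagged for a single profile, so at least one of them is an innocent person who merely shares the identifiers.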
Contd..
• Unintentional Mistake: Faulty Inference:
– This kind of problem arises when profiles are matched and
the data mining user misinterprets the result of the match.
– Faulty inferences may then be drawn, resulting, for example,
in an innocent person becoming a suspect and being subjected to
further investigation.
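One common way a match can be misread is to ignore how rare the target group is. The sketch below applies Bayes' rule to purely hypothetical figures (a base rate of 1 in 100,000, a 99% detection rate and a 1% false alarm rate); none of these numbers come from the source.

```python
def posterior_match_probability(base_rate, true_positive_rate, false_positive_rate):
    """Probability that a flagged person really is of interest (Bayes' rule)."""
    flagged_and_of_interest = base_rate * true_positive_rate
    flagged_and_innocent = (1 - base_rate) * false_positive_rate
    return flagged_and_of_interest / (flagged_and_of_interest + flagged_and_innocent)


# Hypothetical figures chosen only to illustrate the effect.
p = posterior_match_probability(
    base_rate=1 / 100_000,
    true_positive_rate=0.99,
    false_positive_rate=0.01,
)
```

Under these assumed figures, p is roughly 0.001, so about 999 of every 1,000 flagged people would be innocent even though the matcher looks accurate.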
Contd..
• Intentional abuse:
– People employed in running a data mining system and
security organizations have access to sensitive personal
information and not all of them are trustworthy.
– This sensitive information may be used for unauthorized
purposes for personal financial gain.
Contd..
• Security Breach:
– Data mining systems that use personal information hold
sensitive data, which may be stolen or carelessly
disclosed unless strict security measures are in place.
Contd..
• Mission Creep:
– Once a data mining system, for example for a national
security application, has been established and personal
information from a variety of sources has been collected, there is
likely to be a temptation to use the information for other
applications. Such mission creep has been reported in data
mining applications.
Homework
• What is information privacy?
• What are the basic OECD information privacy protection
principles?
• List two important uses of data mining.
• List two important misuses of data mining.
• What are the primary aims of data mining?
• List the five pitfalls of data mining.
• What is privacy preserving data mining?
Thank You !

Editor's Notes

  • #4 OECD => Organisation for Economic Co-operation and Development