Internship Report: Automating Mechanisms for
GDPR Compliance
Organization: VisionTech Systems PVT LTD
Mentor Name: Vijay Shukla
Akanksha Arpan Gevariya
202111004 202111030
[email protected] [email protected]

Abstract—This report presents the research and implementation of a data regulation policy with a specific focus on compliance with the General Data Protection Regulation (GDPR). The project began with an extensive examination of the GDPR framework and then moved to the design, implementation, and scheduling of the mechanisms needed to enforce GDPR compliance.

I. INTRODUCTION

This report provides an overview of our internship experience at VisionTech Systems PVT LTD. We conducted extensive research to understand the General Data Protection Regulation (GDPR), its applications, and its implications in real-life scenarios. This foundational work was crucial because our goal was to implement GDPR compliance for the company's data warehousing solution. The opportunity allowed us to work in a dynamic environment focused on data privacy and regulatory adherence, presenting a unique blend of technical challenges and professional growth.

Aligning the data warehousing solution with GDPR requirements was crucial for the organization to ensure compliance with international data protection laws and to safeguard personal data. Throughout our internship, we engaged in several critical tasks, including developing, implementing, and scheduling scripts to manage data retention, identifying and removing tables containing personal data elements, and ensuring adherence to GDPR guidelines.

Working on these projects not only sharpened our technical skills in SQL, Python, and AWS but also provided us with a deeper understanding of GDPR and its implications for data management. This experience was professionally enriching and offered valuable insights into best practices for data protection and compliance. This report details our experiences, the methodologies employed, and the significant impacts of the projects undertaken during our internship.

II. TASKS ASSIGNED

During our internship, we undertook several critical tasks aimed at ensuring GDPR compliance within our data warehousing solution.

• One of the primary tasks involved the development, implementation, and scheduling of a robust script designed to automatically drop tables that exceeded the 28-day retention period within our clusters. This task required meticulous attention to scheduling and automation to ensure that outdated data was efficiently managed and removed at regular intervals, thereby optimizing storage and maintaining data hygiene.
• In addition to managing data retention, we also focused on identifying and dropping tables that contained personal data elements to ensure alignment with GDPR requirements. This involved a comprehensive review of the data schemas and the implementation of systematic procedures to protect and handle personal data appropriately, thereby mitigating potential legal risks and ensuring compliance with stringent data privacy regulations.
• To further enhance GDPR compliance, we developed, implemented, and scheduled a script to verify whether the tables containing personal data elements included a time-series column, such as a snapshot day or the latest updated date. This verification was crucial for determining the row creation date, which is essential for implementing a 2.5-year data retention policy. The script's automation ensured consistent checks and updates, facilitating long-term data management aligned with regulatory requirements.
• Moreover, we established a robust mechanism to identify and drop tables that were created without adhering to the established GDPR rules within our data warehouse. This task involved implementing checks for compliance with table naming conventions, the presence of required columns, and other GDPR-related criteria. By ensuring that all tables met these compliance standards, the integrity and regulatory alignment of the data warehouse were maintained, thus supporting the organization's overall data governance framework.

III. DEVELOPMENT APPROACH

The overall approach for the first task involved a systematic process to manage data retention efficiently. Initially, we pulled
data from the designated cluster based on table creation dates
and stored this information in an S3 bucket, utilizing AWS’s
cloud storage capabilities. We developed an SQL query to ac-
curately retrieve key details such as schema name, table name,
and creation date from the database. Subsequently, we created
an S3 bucket to temporarily store this data. To automate the
data management process, we developed a job to unload data
from the database tables and transfer it to the S3 bucket,
configuring the job to run at specified intervals for regular
data updates. Additionally, we created a Python function to
determine the age of each table by comparing its creation date
with the current date. If a table’s age exceeded 28 days, the
function automatically dropped the table from the database,
ensuring compliance with our data retention policies.
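The age check described above can be sketched as a small Python function. The report does not include the actual script, so the function name, table names, and date format below are illustrative; the sketch assumes the creation dates have already been unloaded as text (for example, from the S3 extract):

```python
from datetime import date, datetime

RETENTION_DAYS = 28  # retention period described above

def tables_to_drop(rows, today=None):
    """Given (schema, table, creation_date) rows, return DROP statements
    for every table whose age exceeds the retention period."""
    today = today or date.today()
    statements = []
    for schema, table, created in rows:
        created_day = datetime.strptime(created, "%Y-%m-%d").date()
        age_days = (today - created_day).days  # table age in days
        if age_days > RETENTION_DAYS:
            statements.append(f'DROP TABLE "{schema}"."{table}";')
    return statements

# Example: a 40-day-old table is flagged, a 5-day-old one is kept.
rows = [("stage", "tmp_orders", "2024-01-01"),
        ("stage", "daily_load", "2024-02-05")]
print(tables_to_drop(rows, today=date(2024, 2, 10)))
```

In the scheduled job, the returned statements would be executed against the cluster rather than printed; passing `today` explicitly also makes the function easy to test.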
For the second task, we began by extracting data from the
USER and EMP schemas and then compared the columns
with a file containing details on impacted columns. This
comparison helped us identify tables containing personal data
elements that needed attention. To streamline this process,
we developed a configuration table with key parameters,
including schema name, table name, table owner, creation date,
personal identifiable information (PII) columns, and various
status indicators such as the presence of customer ID, OD3
status, communication sent, table age, and whether the table
had been dropped. This configuration table served as a critical
tool for tracking and managing GDPR compliance, ensuring
accurate and efficient handling of data impacted by privacy
regulations.

Fig. 1. Flowchart
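The column comparison behind the configuration table might look like the following sketch. The impacted-column names and the config fields are hypothetical placeholders standing in for the company's actual impacted-columns file and schema:

```python
# Hypothetical impacted-column names taken from the reference file
IMPACTED_COLUMNS = {"email", "phone_number", "customer_name"}

def build_config_rows(table_columns, owner="etl_user"):
    """table_columns maps (schema, table) -> list of column names.
    Emits one configuration-table row per table that contains PII columns."""
    rows = []
    for (schema, table), cols in table_columns.items():
        pii = sorted(set(cols) & IMPACTED_COLUMNS)  # columns matching the file
        if pii:
            rows.append({
                "schema_name": schema,
                "table_name": table,
                "table_owner": owner,
                "pii_columns": ",".join(pii),
                "has_customer_id": "customer_id" in cols,
                "dropped": False,  # status indicator, updated later
            })
    return rows

tables = {("user", "accounts"): ["id", "email", "customer_id"],
          ("emp", "payroll"): ["id", "salary"]}
print(build_config_rows(tables))
```

Only `user.accounts` produces a row here, since `emp.payroll` contains none of the impacted columns; in practice the rows would be inserted into the configuration table for tracking.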
For the third task, we adopted a structured approach to enhance GDPR compliance by focusing on time-series columns in impacted tables. We began by adding a time-series indicator column for these tables, which was set to "Yes" if a time-series column was present and "No" if it was absent. To ensure accurate tracking, we developed a script to verify each table in the GDPR-impacted list against svv_columns, setting the time-series indicator accordingly. Additionally, we introduced another column to record the names of suspected time-series columns for tables marked "Yes," while this column remained null for tables marked "No." Finally, to streamline the process, we automated the task by creating jobs across our databases that routinely check for the presence of time-series columns, update the status, and list suspected columns where applicable.
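A minimal sketch of this verification step, assuming the candidate column names below (the report does not list the exact naming conventions used to spot time-series columns):

```python
# Assumed time-series column names; the real list would come from the
# team's conventions (e.g. snapshot day, latest updated date).
TIME_SERIES_CANDIDATES = {"snapshot_day", "last_updated_date", "updated_at"}

def flag_time_series(table_columns):
    """table_columns maps (schema, table) -> column names, as reported by a
    catalog view such as svv_columns. Returns the two fields described
    above: a Yes/No indicator and the suspected column names (or None)."""
    result = {}
    for key, cols in table_columns.items():
        suspects = sorted(set(cols) & TIME_SERIES_CANDIDATES)
        result[key] = {
            "time_series": "Yes" if suspects else "No",
            "suspected_columns": ",".join(suspects) or None,
        }
    return result

tables = {("user", "accounts"): ["id", "snapshot_day"],
          ("emp", "payroll"): ["id", "salary"]}
print(flag_time_series(tables))
```

The scheduled jobs would run this check per database and write the two fields back to the configuration table.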
For the fourth task, we took a comprehensive approach to ensure GDPR compliance by addressing table naming conventions and the presence of creation-date columns. We began by developing an SQL query to identify tables across five schemas that either did not adhere to the prescribed naming guidelines or lacked a creation-date column. This required a join between svv_columns, which contains schema names, table names, and column names, and admin_schema.d_aim_tables, which includes the creation date but not column names. To automate the process, we created a Python script that identifies and lists tables failing to meet GDPR standards. The script was designed to correct tables with naming-guideline issues and to drop those missing a creation date, ensuring that all tables in the data warehouse complied with the established rules.
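The audit step could be sketched as follows. The report does not state the actual naming guideline, so the regex below is purely a placeholder, as are the field names:

```python
import re

# Placeholder naming rule: a one-letter prefix plus snake_case.
# The real guideline would replace this pattern.
NAME_PATTERN = re.compile(r"^[fd]_[a-z0-9_]+$")

def audit_tables(tables):
    """tables: list of dicts with 'table_name' and 'creation_date'.
    Returns (naming violations, tables missing a creation date)."""
    bad_names, missing_creation = [], []
    for t in tables:
        if not NAME_PATTERN.match(t["table_name"]):
            bad_names.append(t["table_name"])       # candidate for renaming
        if t.get("creation_date") is None:
            missing_creation.append(t["table_name"])  # candidate for dropping
    return bad_names, missing_creation

tables = [{"table_name": "f_orders", "creation_date": "2024-01-01"},
          {"table_name": "TempStuff", "creation_date": None}]
print(audit_tables(tables))
```

In the actual script, the input would come from the join of svv_columns and the admin tables described above, and the two lists would drive the rename and drop actions.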
IV. INSIGHTS AND ACQUIRED SKILLS

During the implementation process, we gained valuable knowledge and skills, including:

• Proficiency with SQL Client Tools: During our internship, we gained significant experience with SQL client tools, particularly MySQL and DBeaver. We learned to effectively connect MySQL and DBeaver to various databases, facilitating efficient data exploration and query execution. This experience enhanced our ability to navigate and manage database systems, execute complex queries, and visualize data interactively.
• AWS S3 Exploration: We successfully created and managed an Amazon S3 bucket, leveraging AWS's object storage service. We acquired a solid understanding of bucket configurations, storage management, and data retrieval within the AWS ecosystem. This hands-on experience with S3 contributed to our ability to handle large volumes of data and implement scalable storage solutions.
• Enhanced SQL Skills: We refined our SQL skills, focusing on constructing and executing complex queries. We improved our ability to retrieve specific data from large datasets, work with system catalog tables, and optimize query performance. This deepened understanding of SQL was crucial for managing and manipulating data effectively.
• Advanced Python Programming: Our proficiency in Python was significantly strengthened through scripting and automation tasks. We developed efficient scripts for data processing and task automation, gaining practical experience in writing and debugging Python code. This enhanced our ability to handle data manipulation and workflow automation tasks effectively.
• Familiarity with Pandas DataFrame: We explored and utilized the Pandas library, particularly its DataFrame functionality, for data processing tasks. This experience allowed us to perform complex data transformations, analyses, and visualizations with ease. Our familiarity with Pandas enhanced our ability to manage and analyze large datasets efficiently.
• Understanding GDPR Compliance: We gained a comprehensive understanding of the General Data Protection Regulation (GDPR), including its requirements and implications for data management and protection. This knowledge was crucial for ensuring that our data handling practices adhered to regulatory standards and protected personal data.
V. SIGNIFICANT OUTCOMES
We achieved the following impacts:
Data Privacy Enhancement: By systematically identifying
and dropping tables that contained personal data elements,
we significantly enhanced the organization’s data privacy
practices. This process ensured that sensitive information was
managed according to GDPR regulations, which mandate the
protection of personal data. The proactive removal of these
tables mitigated the risk of unauthorized access and poten-
tial data breaches, thereby safeguarding user privacy. This
approach demonstrated a strong commitment to compliance
and responsible data management, reassuring stakeholders and
users about the security of their personal information.
Operational Efficiency: The development and implementa-
tion of scripts for automating data management tasks notably
improved operational efficiency. By creating automated pro-
cesses to drop tables that exceeded the predefined retention
period, we optimized database storage utilization. This not
only helped in maintaining a cleaner and more manageable
database but also enhanced overall database performance. The
reduction in redundant and outdated data led to faster query
responses and reduced costs associated with data storage and
management. The streamlined approach contributed to more
efficient operations and better resource allocation, aligning
with the company’s goals of cost-effectiveness and perfor-
mance optimization.
Risk Mitigation: The approach of dropping tables based on their age played a crucial role in mitigating legal and financial risks associated with data retention and GDPR compliance. By ensuring that tables were removed when they were no longer needed or when they exceeded the retention period, the company minimized the risk of non-compliance with data protection regulations. This proactive measure helped avoid potential legal penalties and financial repercussions related to data management failures. Additionally, it reinforced the company's commitment to data privacy and security, enhancing its reputation and trustworthiness among users and regulatory bodies.

VI. ACKNOWLEDGMENT

We extend our heartfelt gratitude to the Indian Institute of Information Technology Vadodara-International Campus Diu for their invaluable support throughout our internship journey. We are immensely thankful to VisionTech Systems Pvt Ltd for providing us with the exceptional opportunity to intern with the company. We are deeply indebted to our mentor Vijay Shukla for their warm hospitality and mentorship during our internship. Their support and insights have significantly contributed to our learning experience. This internship has been an enriching learning experience, and we are truly grateful for all the support and guidance we received.