CS3215 - Software Engineering Project

This document provides an agenda for a presentation on a static program analyzer project. It discusses the project's parser, program knowledge base (PKB) design, PKB application programming interface, PKB optimizations, query processor, query evaluation strategy, and query optimizations. Key points include using hash maps and caches to optimize the PKB, handling exceptions in the parser, and taking a "narrow passage" approach to the PKB API to minimize its public methods and make the query processor independent of PKB subcomponents.
CS3215 – Software Engineering Project
Team 8 – Static Program Analyzer
Presentation – Purpose
 To highlight USPs of Team8’s SPA
 “To show how unique we are and why we chose to
be so”

 “The Missing Piece” to our Final Report


 A few ideas could have been overlooked.
 Also provides a clearer explanation of topics covered briefly
 An Addendum to the Report
 Clearer understanding of our ideas, project
Presentation – Overall Achievements
 All Project Requirements have been met.
 Modifies, Uses and Calls – Data Structures in the
PKB
 Parent, Follows, Next, Affects – On Demand
 Clear and Efficient QES.
 Optimizations both from PKB and PQL.
 Flexible and Extensible SPA
Presentation - Agenda
 Team PKB (First Half)
 Parser
 PKB Design
 PKB API - The ‘Narrow Passage Approach’ 
 PKB Optimization

 Team PQL (Second Half)


 Query Processor
 Query Evaluation Strategy
 Query Optimization Principles
The Parser – Basic
 Approach
 Recursive subroutines
 Token–based

 Robust
 Capable of handling parse exceptions
 Detailed Reporting using the global Exception
class
The Parser – Exceptions
 Exceptions reported
 Token Mismatch
 Reports found and expected tokens and the line number
 Invalid Variable Name
 Empty Program or Empty Statement List
 Procedure calling itself (Direct Recursion)
 Called procedure is not found
 Multiple procedures having the same identifier (name)

 Exceptions API is used consistently


The Parser – Exception Class
 Generic, global class
 Inherited classes used both in the Parser and the
Query Pre-processor
 Exception: One of the minor reusable components
of our SPA
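The token-mismatch report described above can be sketched as follows. This is a minimal illustration only: the names SpaException, TokenMismatch and TokenStream are hypothetical stand-ins, not the team's actual classes.

```python
class SpaException(Exception):
    """Hypothetical stand-in for the generic, global exception class."""


class TokenMismatch(SpaException):
    """Reports the expected token, the token found, and the line number."""

    def __init__(self, expected, found, line):
        super().__init__(f"line {line}: expected '{expected}', found '{found}'")
        self.expected, self.found, self.line = expected, found, line


class TokenStream:
    def __init__(self, tokens):
        self.tokens = tokens  # list of (text, line) pairs
        self.pos = 0

    def expect(self, expected):
        """Consume the next token, or raise TokenMismatch if it differs."""
        text, line = self.tokens[self.pos]
        if text != expected:
            raise TokenMismatch(expected, text, line)
        self.pos += 1
        return text
```

A query pre-processor could derive its own exception classes from the same base, which is one way the class becomes reusable across components.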
PKB – Overview (1/3)
[Diagram: PKB components and their APIs]
‘Crucial-Entity’ Tables: VarTable (VarTable API), ProcTable (ProcTable API), StmtTable (StmtTable API)
Relationship Tables: ModTable (Modifies API), UsesTable (Uses API), CallsTable (Calls API)
CacheTable
PKB – Overview (2/3)
[Diagram: PKB components and their APIs, continued]
Abstract Syntax Tree (AST API)
Control Flow Graph (CFG API)
Follows API, Parent API, Next API, Affects API
PKB – Overview (3/3)
 Crucial Entities Ec are a subset of all entities Eall (from the Entity Table)
 Conditions:
 Must be one of the arguments of a relationship in the Relationships Table
 Cannot be a ‘derivative’ of another Ec, e.g. StmtList is a derivative of Statement

 Design Focus
 Speed - Data structures must enable quick data retrieval
thereby aiding in the speed of the query in a small way
 Extensibility

 Design Choice – Hash Maps 


PKB – Variable Table (1/3)
 ‘Two dimensional Hash’
 Index points to variable and variable points to a structure that includes index
 A KeyMapper keeps track of the keys
 Speed is guaranteed 

 Data is handled by the Modifies, Uses (friends) and Variable Table APIs

 Extensible – Can be easily extended to handle new relationships


 Take a look at the diagram (next slide)

 Procedure Table is quite similar except for Calls.


PKB – Variable Table (2/3)
Variable Name   Variable Vector
                Index   Stmts_used      Stmt_modfd         Proc_used   Proc_modfd
x               0       4, 5, 7, 8, 9   4, 5, 14, 15, 18   P, Q        Example, P
y               1       17, 18, 19      -                  R           -
z               2       -               4, 5, 7, 8, 9      -           P
i               3       24, 25          4, 5, 7, 8, 9      P           R, Q
PKB – Variable Table (3/3)
 Variable Table = Hash_Map(Variable Name, Variable Vector)
 Extensibility
 Assume new relationship Coexist(variable, variable)
 So, there is a relationship R such that entity ‘variable’ is one of
the arguments.
 Two new columns need to be added to the Variable Table.
 variable_coexisting, variable_coexisted
 Generic View: <entity_relationship>
 “Structure accommodates change” – Extensible design
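A minimal sketch of the ‘two dimensional hash’ idea follows, assuming a dictionary for the name-to-vector direction and a list for the index-to-name direction. The class and field names are illustrative; only the column names come from the slide.

```python
from dataclasses import dataclass, field


@dataclass
class VariableVector:
    # One field per column of the Variable Vector shown on the slide.
    index: int
    stmts_used: set = field(default_factory=set)
    stmts_modfd: set = field(default_factory=set)
    procs_used: set = field(default_factory=set)
    procs_modfd: set = field(default_factory=set)


class VarTable:
    def __init__(self):
        self.by_name = {}   # variable name -> VariableVector
        self.by_index = []  # index -> variable name (the reverse direction)

    def insert(self, name):
        """Return the vector for `name`, creating it on first use."""
        if name not in self.by_name:
            self.by_name[name] = VariableVector(index=len(self.by_index))
            self.by_index.append(name)
        return self.by_name[name]

    def name_of(self, index):
        return self.by_index[index]
```

Under this layout, supporting a new relationship such as Coexist(variable, variable) would amount to adding two more set fields to VariableVector, which mirrors the ‘two new columns’ argument above.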
PKB – ModifiesTable (1/3)
 ‘Hash Map with a Vector Key’
 Key is a vector of two values: the variable being modified and the identifier of the
entity modifying the variable (from the Entity Table)
 Speed is guaranteed 

 Key is mapped to a Boolean Vector.


 Conserves Space. Each element = 1 bit only.

 Extensible – Can be easily extended to handle new entities


 Take a look at the diagram (next slide)

 UsesTable, Calls Table are quite similar


PKB – Modifies Table (2/3)
Modifies Key              Modifies Vector
VarName   Modifier_ID     Boolean Vector
x         1               110
x         3               111101010110101010101000000
y         1               001
y         3               0000000000000000000000100
VarName is taken from the Variable Table.
Modifier_ID is taken from the Entity Table: 1 = Procedure, 3 = Stmt.
The ith bit corresponds to whether the ith statement modifies the variable (here ‘x’) or not.
PKB – ModifiesTable (3/3)
 Modifies Table = Hash_Map(<vector> Modifies Key, Modifies Vector)
 Extensibility
 Assume new entity Function such that Modifies is extended to
include Modifies(Function, variable)
 The Modifies Table need not be changed, nor does a new Modifies Table
need to be created
 “Structure accommodates change” – Extensible design
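The ‘hash map with a vector key’ can be sketched as below. This is an assumption-laden illustration: a Python integer stands in for the Boolean vector, the method names are invented, and bit positions are counted from zero.

```python
PROCEDURE, STMT = 1, 3  # entity indices as given on the Modifies Table slide


class ModifiesTable:
    def __init__(self):
        # (variable name, modifier entity index) -> int used as a bit vector
        self.table = {}

    def set(self, var, entity, i):
        """Record that the i-th instance of `entity` modifies `var`."""
        key = (var, entity)
        self.table[key] = self.table.get(key, 0) | (1 << i)

    def modifies(self, var, entity, i):
        return bool((self.table.get((var, entity), 0) >> i) & 1)

    def all_modifiers(self, var, entity):
        bits = self.table.get((var, entity), 0)
        return [i for i in range(bits.bit_length()) if (bits >> i) & 1]
```

Because the entity index is part of the key, a new modifier entity (for example a hypothetical Function entity) only introduces new keys; the table structure itself stays the same.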
PKB API – The ‘Narrow Passage’ Approach (1/3)
 PKB API methods are covered by a ‘wrapper interface’
 Query Processor Access
 Restricted to the Entity Table, Relationship Table and the ‘wrapper’
interface – ‘Narrow Passage’
 Interface
 relationshipHandler(relationship, <vector> arguments)
 withHandler(<vector> arguments)
 patternHandler(<vector> arguments, pattern)

 Design - “Can be detached and added to the Evaluator”


PKB API – The ‘Narrow Passage’ Approach (2/3)
 Query Processor independent of the PKB subcomponents.
 Useful Scenario: Data structures are added and the API widens.
 Without this approach, the Query Processor would need to change.
 With NPA, the PKB adjusts or accommodates itself
 The Query Processor just looks at the change in the Relationship Table

 Minimizing PKB’s public API methods


 ‘Taking burden off the Evaluator’

 Easier to cache relationship calls in the PKB.


 Covered under PKB’s Cache Table.
PKB API – The ‘Narrow Passage’ Approach (3/3)
 relationshipHandler(relationship, <vector> arguments)
 relationship: the relationship index from the Relationship Table
 arguments: each argument has a value and an entity from the Entity Table
 Ex. relationshipHandler(2, { {1, 12}, {v, 11} })
 It’s a call to the Modifies relationship (index 2) with two arguments: a constant (entity 12) with the value 1 and a non-constant variable v (entity 11)
 Modifies(1, v)
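The wrapper call above might look like the following sketch. The PKB class, its private store and the handler method are hypothetical; only the indices 2 (Modifies), 11 (variable) and 12 (constant) come from the slides, and only the Modifies(constant, variable) case is handled.

```python
CONSTANT, VARIABLE = 12, 11  # Entity Table indices from the slides
MODIFIES = 2                 # Relationship Table index from the slides


class PKB:
    def __init__(self):
        # Hypothetical private store: statement number -> variables it modifies.
        self._modifies = {1: {"x"}, 2: {"y"}}
        self._handlers = {MODIFIES: self._handle_modifies}

    def relationship_handler(self, relationship, arguments):
        """Single public entry point; dispatches on the relationship index."""
        return self._handlers[relationship](arguments)

    def _handle_modifies(self, arguments):
        (left, left_entity), (right, right_entity) = arguments
        if left_entity == CONSTANT and right_entity == VARIABLE:
            # Modifies(constant, v): return every variable the statement modifies.
            return sorted(self._modifies.get(left, set()))
        raise NotImplementedError("only Modifies(constant, variable) is sketched")
```

The caller never touches the internal tables, so adding a data structure inside the PKB changes only the private handlers.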
PKB Optimization – Cache Table (1/4)
 Traditional view of PKB – “Static Knowledge Base”
 Cache – Missing ‘Dynamic Knowledge’ Component
 Dynamic Knowledge
 Derived Knowledge or ‘Learn from Experience’ principle
 Alternative to storing the Follows, Parent, Affects and
Next relationship calls in the PKB in separate data
structures. (Why?)
 Caching is done in the relationshipHandler() interface
method.
 Controlled by Global Parameters (constants)
PKB Optimization – Cache Table (2/4)
 Eliminating ‘computation on demand’
 By pre-computing and storing all design abstractions in the PKB
 Not elegant
 Space – Doesn’t work out in the real world!
 Time – Quite high for Affects*
 Extensible – Difficult to extend when RelTable increases in size
 Is it worth it?
 No 

 Fails when the number of queries Q is small for a large program size N


PKB Optimization – Cache Table (3/4)
 Pre-computation time and PKB space depend on the program size
 S, T are proportional to N

 Fails because pre-computation time and PKB space should depend on both the program size
and the number of queries, Q
 S, T should be proportional to Q, N

 Our Cache Table introduces the Q factor for design abstractions computed on
demand – “Don’t pre-compute, store when calculated”
PKB Optimization – Cache Table (4/4)
 Global Parameters
 CACHE, 0 or 1
 CACHE_MAX_QUERIES, 0 to max
 MIN_CALLS_CACHE_OPEN, 0 to max
 Extremely useful when the User knows the number of queries that will be entered in a
particular ‘Querier’ session. (Which is the case mostly)

 Cache miss doesn’t cost much as the Cache table is also a hash map.
Cache Table = Hash_map(QueryObj, QueryResult);
We call it ‘Query’, but it refers to the relationship call object and the relationship call result.
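A rough sketch of ‘don’t pre-compute, store when calculated’ follows. The slides do not define the exact meaning of the global parameters, so the semantics here are assumptions: CACHE switches caching on or off, CACHE_MAX_QUERIES caps the number of stored entries, and MIN_CALLS_CACHE_OPEN is the number of calls that must pass before results start being stored.

```python
CACHE = 1                 # assumed: 0 disables caching, 1 enables it
CACHE_MAX_QUERIES = 2     # assumed: maximum number of cached relationship calls
MIN_CALLS_CACHE_OPEN = 0  # assumed: calls to wait before the cache opens


class CachedPKB:
    def __init__(self, compute):
        self.compute = compute  # function doing the on-demand computation
        self.cache = {}         # (relationship, arguments) -> result
        self.calls = 0
        self.computed = 0

    def relationship_handler(self, relationship, arguments):
        self.calls += 1
        key = (relationship, tuple(arguments))
        if CACHE and key in self.cache:
            return self.cache[key]  # cache hit: no recomputation
        result = self.compute(relationship, arguments)
        self.computed += 1
        cache_open = self.calls > MIN_CALLS_CACHE_OPEN
        if CACHE and cache_open and len(self.cache) < CACHE_MAX_QUERIES:
            self.cache[key] = result
        return result
```

A miss costs one dictionary lookup on top of the on-demand computation, which is the point made above about the cache being a hash map.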
PKB Optimization – Other
 Restriction performed on Sets using the Entity Table
 The Entity Table included the complete set of values for each entity
 This set was pre-computed during parsing.
 Used to perform restriction to save on computation time.
 Ex. For Affects/Affects*, only assignment statements were to be used.
So, an intersection between all values of h
Entity Table
(Values: the program lines in which all instances of the entity are found.)
Index      Name                 Attribute   Type      Values
0          program              progName    name      -
1          procedure            procName    name      P, Q, Example
2          stmtList             -           -         -
3          stmt                 stmt#       integer   1, 2, 3 … 25
4          assign               stmt#       integer   2, 4, 5, …
5          call                 stmt#       integer   4, 3
6          while                stmt#       integer   6, 9
7          if                   stmt#       integer   10
8, 9, 10   plus, minus, times   -           -         -
11         variable             varName     name      x, y, z, i
12         constant             value       value     -
13         program line         stmt#       integer   1, 2, 3 … 25
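The restriction step can be sketched as a set intersection against the pre-computed entity values. The dictionary below copies only the values visible in the Entity Table; the assign row is elided on the slide, so the set here is deliberately partial, and the function name is hypothetical.

```python
# Pre-computed during parsing; values copied from the Entity Table above.
ENTITY_VALUES = {
    "stmt": set(range(1, 26)),
    "assign": {2, 4, 5},  # partial: the slide elides the remaining statements
    "while": {6, 9},
    "if": {10},
}


def restrict(candidates, entity):
    """Keep only the candidates that are instances of the given entity."""
    return candidates & ENTITY_VALUES[entity]
```

For Affects or Affects*, calling restrict with "assign" before evaluating the relationship avoids computing anything for statements that cannot take part in it.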
Relationship Table
(The number of fields under Type and Entity varies with the number of arguments.)
Index   Name       Arguments   Type1     Type2     Entity1                Entity2
0       Calls      2           name      name      1                      1
1       Calls*     2           name      name      1                      1
2       Modifies   2           both      name      1, 3, 4, 5, 6, 7, 13   11
3       Uses       2           both      name      1, 3, 4, 5, 6, 7, 13   11
4       Parent     2           integer   integer   3, 6, 7, 13            3, 4, 5, 6, 7, 13
5       Parent*    2           integer   integer   3, 6, 7, 13            3, 4, 5, 6, 7, 13
6       Follows    2           integer   integer   3, 4, 5, 6, 7, 13      3, 4, 5, 6, 7, 13
7       Follows*   2           integer   integer   3, 4, 5, 6, 7, 13      3, 4, 5, 6, 7, 13
8       Next       2           integer   integer   3, 4, 5, 6, 7, 13      3, 4, 5, 6, 7, 13


Query Evaluation Strategy - Interface Methods
 getValues(Entity) from Entity Table
 relationshipHandler(Name, Arg1, Arg2)
 patternHandler(Arg1, Arg2)
 withHandler(Arg1, Arg2)
Query Evaluation Strategy
Assume ‘Select’ variables in the form ‘Select
<s1, s2, …, sn>’
 Step 1: getValues(), all values of s1, s2, …, sn.
 Step 2: patternHandler() and withHandler(),
filter ‘Select’ values.
 Step 3: Form combinations of output results.
[e.g. s1={4,5}, s2={6,7}. <s1, s2> will form
{<4, 6>, <4, 7>, <5, 6>, <5, 7>}.]
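Step 3 is a Cartesian product of the filtered ‘Select’ value sets. A minimal sketch using the standard library, with a hypothetical function name:

```python
from itertools import product


def combine(select_values):
    """Form every combination of the per-synonym value lists, in order."""
    return list(product(*select_values))
```

For the example above, combine([[4, 5], [6, 7]]) yields the four tuples (4, 6), (4, 7), (5, 6) and (5, 7).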
Query Evaluation Strategy
 Store non-’Select’ values in memory to aid in
finding ‘Select’ values as query answer.
 E.g. “program line n; assign a; Select a such
that Next*(13, n) and Affects*(a, n)”
 relationshipHandler() to find and store values
of n into memory, then use
relationshipHandler() to find values of a, given
n.
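The two-step evaluation of the example query can be sketched as below. The relationship data is passed in as plain dictionaries for illustration; in the design described here those values would come from relationshipHandler() calls instead.

```python
def select_a(next_star, affects_star, start):
    """Evaluate: Select a such that Next*(start, n) and Affects*(a, n).

    next_star maps a program line to the lines reachable from it;
    affects_star maps an assignment to the lines it affects.
    """
    # Step 1: find and store the values of the non-'Select' synonym n.
    n_values = next_star.get(start, set())
    # Step 2: keep each a that affects at least one stored n.
    return sorted(a for a, targets in affects_star.items() if targets & n_values)
```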
Query Evaluation Strategy
 Constants: 2, 3, 5, “Example”, “x”
 Non-constants: declaration variables and ‘_’

 Step 4: For each ‘Relationship’ clause,


 Case 1: [Relationship](constant, constant)
 Case 2: [Relationship](constant, non-constant)
 Case 3: [Relationship](non-constant, constant)
 Case 4: [Relationship](non-constant, non-constant)
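The four cases can be told apart by checking each argument against the declared synonyms. This sketch assumes arguments arrive as strings and treats anything that is neither a declared synonym nor ‘_’ as a constant; the function names are hypothetical.

```python
def is_constant(arg, declared):
    """Constants are anything other than a declared synonym or the '_' placeholder."""
    return arg != "_" and arg not in declared


def classify(left, right, declared):
    """Return the case number (1 to 4) for a relationship clause."""
    left_const = is_constant(left, declared)
    right_const = is_constant(right, declared)
    if left_const and right_const:
        return 1
    if left_const:
        return 2
    if right_const:
        return 3
    return 4
```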
Query Evaluation Strategy
 Case 1

 Case 2 and 3,
a) Non-constant is placeholder.
b) Non-constant is in ‘Select’ values.
c) Non-constant is not in ‘Select’ values.
Query Evaluation Strategy
 Case 4,
a) Both non-constant are the same
i. Placeholder
ii. ‘Select’ values
iii. Non-’Select’ values
b) Both non-constant are different
i. Placeholder in either argument
ii. Both arguments in ‘Select’ values
iii. ‘Select’ value in either argument
iv. Both arguments not in ‘Select’ values
Query Evaluation Strategy
 Variables not in ‘Select’ and ‘Relationship’
clauses, but are in ‘With’ and ‘Pattern’
clauses.
 E.g. “stmt s, s1; Select s such that
Follows(2,s) with s1.stmt#=5”
Query Pre-Processor
 Validates queries
 Transforms query string into query tree for
efficiency of query optimization and evaluation
Example Query
 assign a, a1; stmt s, s1; Select <a, s> such
that Follows(a, s) and Next*(a1, s1) with
s.stmt# = 1 pattern a(_, _”x+z”_)
Query Validation
 Rules for PQL:
 Grammar table using static regular expressions
 Rules for relationships and entities stored in static
“tables”, e.g. RelTable, EntTable
 Rules are not “hardcoded”
Query Validation
 Check full syntax of query
 All variable synonyms are checked against
declared variable map
 Relationships are checked for correct number
of arguments, correct type of arguments, and
non-ambiguity
 Attribute references must correspond to those
of variable synonyms
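Table-driven validation of a relationship clause might look like the sketch below. The two REL_TABLE rows and the entity indices are copied from the Relationship and Entity Tables; the dictionary layout and the validate function are assumptions, and only synonym arguments are handled (constants and ‘_’ are left out).

```python
REL_TABLE = {
    "Modifies": {"arguments": 2, "entity1": {1, 3, 4, 5, 6, 7, 13}, "entity2": {11}},
    "Calls": {"arguments": 2, "entity1": {1}, "entity2": {1}},
}
ENTITY_INDEX = {"procedure": 1, "stmt": 3, "assign": 4, "variable": 11}


def validate(relationship, args, declarations):
    """Check argument count, declared synonyms, and allowed entity types."""
    rule = REL_TABLE.get(relationship)
    if rule is None or len(args) != rule["arguments"]:
        return False
    for arg, allowed in zip(args, (rule["entity1"], rule["entity2"])):
        if arg not in declarations:
            return False  # synonym was never declared
        if ENTITY_INDEX[declarations[arg]] not in allowed:
            return False  # wrong entity type for this argument position
    return True
```

Since the rules live in data, supporting a new relationship means adding a row rather than changing the validation code.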
Query Parsing
 Regular expressions in the grammar table are used
to extract parts of the query
 Parsing is recursive, similar to parsing of
SIMPLE source code
 As parsing is done, query tree is built
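Extracting parts of a query with regular expressions can be sketched as follows. The two patterns are illustrative simplifications, not the team's grammar table: they cover single-word design entities and two-argument clauses, and the result is a flat structure rather than a full query tree.

```python
import re

# Declarations such as "assign a, a1;"
DECL = re.compile(r"\s*([A-Za-z_]+)\s+([A-Za-z]\w*(?:\s*,\s*[A-Za-z]\w*)*)\s*;")
# Two-argument clauses such as "Follows(a, s)" or "Next*(a1, s1)"
CLAUSE = re.compile(r"(\w+\*?)\s*\(\s*([^,()]+?)\s*,\s*([^,()]+?)\s*\)")


def parse(query):
    """Split a query into a synonym -> entity map and a list of clauses."""
    head, _, body = query.partition("Select")
    declarations = {}
    for entity, names in DECL.findall(head):
        for name in re.split(r"\s*,\s*", names):
            declarations[name] = entity
    clauses = CLAUSE.findall(body)
    return declarations, clauses
```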
PQL Optimization
 Basic strategy
 Reordering relationships for optimal linear
evaluation of relationships from left to right
 Left: Most restrictive
Right: Least restrictive
 Priority given to joins between relationships
compared to crosses between relationships
PQL Optimization
 Order by relationship types
 Follows vs. Affects
 Order by number of occurrences for
relationship argument types
 Follows*(a1, a2) vs. Follows*(a1, s1)
 Order by combinations of variables and
constants as relationship arguments
 Modifies(“First”, “x”) vs. Modifies(p, v)
 Order by number of output variables with
relationship arguments
 Select <a, s> …; Follows(a1, s1) vs. Follows(a, s)
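The reordering idea can be sketched as a sort on a restrictiveness key. The cost ranking of relationship types, the order in which the criteria are compared, and the direction of each criterion are all assumptions made for illustration; the slides list the criteria without fixing these details.

```python
# Assumed ranking: lower means cheaper, so it is evaluated earlier.
RELATION_COST = {"Follows": 1, "Modifies": 1, "Next*": 3, "Affects": 4}


def restrictiveness(clause, declared, selected):
    name, args = clause
    constants = sum(1 for a in args if a not in declared and a != "_")
    outputs = sum(1 for a in args if a in selected)
    # Smaller tuples sort first: cheap relationship type, then more
    # constant arguments, then more 'Select' synonyms.
    return (RELATION_COST.get(name, 2), -constants, -outputs)


def reorder(clauses, declared, selected):
    """Place the most restrictive clauses on the left."""
    return sorted(clauses, key=lambda c: restrictiveness(c, declared, selected))
```

Under these assumptions, Modifies(“First”, “x”) sorts before Modifies(p, v), and Follows(a, s) sorts before Follows(a1, s1) when a and s are the ‘Select’ synonyms.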