Be Social. Use CrowdRE.
An IDA Plugin for Collaborative Reversing



Tillmann Werner, Jason Geffner




RECON, Montreal, Canada
Friday, June 15, 2012
CrowdStrike

■ Stealth mode startup

■ Handpicked ‘A’ team of technical talent

■ $26 million Series A funding

■ “You don’t have a malware problem, you have an adversary
  problem”™

■ We are hiring!
Special Thanks




■ Georg Wicherski, Sr. Research Scientist
■ Aaron Putnam, Sr. Research Engineer
■ TJ Little and Harley, Sr. UI Engineers
■ Jeff Stambolsky, Resident Nerd
Why CrowdRE?

■ Developers work in teams to build the software we are reversing
  ■ Stuxnet, Flame, Duqu
  ■ RATs like PoisonIvy
  ■ Bots like Zeus
  ■ calc.exe

■ Code reuse is prevalent in malware variants

■ Working together, we can reverse more quickly and efficiently

■ Take a page from the developer world and model RE after source
  control methodologies
Collaborative Reversing

■ Approach 1: Just-in-time propagation of results
  ■ All changes are synchronized to all users instantly
  ■ Well-suited for teaching reverse-engineering, demonstrations, etc.

■ Approach 2: Working on different parts, sharing results on demand
  ■ Distributed tasks
  ■ Multiple people can work on different parts simultaneously
  ■ Analysis results can be combined at any time
Related Work – Tools of the Trade

■ IDA Sync, 2005
 ■ Real-time synchronization of names, stack variables, comments
 ■ Hooks into IDA hot keys

■ CollabREate, 2008
 ■ Successor of IDA Sync: IDA Pro “remote-control”
 ■ Snapshot report: replay all updates up until a certain point

■ BinCrowd, 2010
 ■ Commit-based model
 ■ Supports matching similar functions
The CrowdRE Platform

■ Community platform to support professional, distributed RE
  ■ Design similar to version control systems
  ■ Commits: annotations per function

■ Free Cloud service for the reverse engineering community
  ■ People can share their results
  ■ Reverse engineering projects can benefit from community input

■ IDA Pro plugin
  ■ Utilizes the power of the Hex-Rays Decompiler plugin
  ■ Integrates smoothly into IDA’s Qt GUI
Rewoltke CrowdRE

[Figure: two images joined as "… + … = rewoltke"]
BinNavi Integration

■ Google is adding integration for CrowdRE to BinNavi

■ Analysts will be able to use BinNavi to share their analysis results
  with the CrowdRE community

■ Our best wishes go to Thomas Dullien for a speedy recovery
Annotations

■ Function prototype
  ■ Name
  ■ Calling convention
  ■ Return type
  ■ Parameter types and names

■ Stack variables
■ Register variables (Hex-Rays)
■ Structs, enums
■ Comments – IDA and Hex-Rays
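
As a rough illustration of what a per-function annotation record could hold, the sketch below models the items listed on this slide as Python dataclasses. All class names, field names, and the sample function are hypothetical; the slides do not describe CrowdRE's actual data format.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple


@dataclass
class FunctionPrototype:
    name: str
    calling_convention: str
    return_type: str
    # (type, name) pairs in declaration order
    parameters: List[Tuple[str, str]] = field(default_factory=list)

    def render(self) -> str:
        """Format the prototype as a C-style declaration."""
        params = ", ".join(f"{t} {n}" for t, n in self.parameters) or "void"
        return f"{self.return_type} {self.calling_convention} {self.name}({params})"


@dataclass
class FunctionAnnotation:
    prototype: FunctionPrototype
    stack_variables: Dict[int, str] = field(default_factory=dict)     # stack offset -> name
    register_variables: Dict[str, str] = field(default_factory=dict)  # register -> name
    type_names: List[str] = field(default_factory=list)               # structs/enums used
    comments: Dict[int, str] = field(default_factory=dict)            # address -> comment


# Hypothetical example data
proto = FunctionPrototype("decrypt_config", "__stdcall", "int",
                          [("char *", "buf"), ("unsigned int", "len")])
annotation = FunctionAnnotation(proto, stack_variables={-8: "key"})
print(proto.render())
```

Keeping the prototype separate from the variable and comment maps mirrors the grouping on the slide, where the prototype is one item with four sub-items.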
Type Information

■ Types
  ■ Structs
  ■ Enums
  ■ User-defined types

■ Function annotations depend on types
  ■ Dependencies are recursively included
  ■ Checkouts contain dependencies, too
  ■ Name duplicates require conflict resolution
    ■ User is prompted for solution (update, retain, keep)


■ Future plan: resolving cyclic dependencies
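
The recursive inclusion of type dependencies can be sketched as a depth-first walk over a table of types. The type table and function name below are made up for illustration. The `seen` set only keeps the walk from looping forever on a cycle; it does not resolve cyclic dependencies, which the slide lists as a future plan.

```python
from typing import Dict, List, Set

# Hypothetical type table: each type name maps to the type names it refers to.
TYPE_DEPS: Dict[str, List[str]] = {
    "CONFIG": ["HEADER", "CRYPTO_MODE"],
    "HEADER": ["VERSION"],
    "CRYPTO_MODE": [],
    "VERSION": [],
    "NODE": ["NODE"],  # self-referential, e.g. a linked-list node
}


def collect_dependencies(root: str, deps: Dict[str, List[str]]) -> List[str]:
    """Return root plus everything it depends on, dependencies first."""
    ordered: List[str] = []
    seen: Set[str] = set()

    def visit(name: str) -> None:
        if name in seen:  # already handled, or part of a cycle: stop here
            return
        seen.add(name)
        for dep in deps.get(name, []):
            visit(dep)
        ordered.append(name)

    visit(root)
    return ordered


print(collect_dependencies("CONFIG", TYPE_DEPS))
```

Emitting dependencies before the types that use them means a client could define the types in list order without forward references, at least when no cycle is involved.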
Importing Annotations

■ Batch import
  ■ The first thing to do when starting
    to work on a new binary
  ■ Always the most recent commit

■ Individual imports
  ■ More control over what to import
  ■ User can choose between different versions
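
The difference between the two import modes can be sketched as follows. The `Commit` record, its fields, and the sample data are assumptions made for this example, not CrowdRE's real schema: a batch import keeps only the newest commit per function, while an individual import lists every version for the user to pick from.

```python
from typing import Dict, List, NamedTuple


class Commit(NamedTuple):
    function_key: str  # hypothetical identifier of the annotated function
    timestamp: int     # commit time, larger means newer
    name: str          # the function name proposed by this commit


def batch_import(commits: List[Commit]) -> Dict[str, Commit]:
    """Pick the most recent commit for every function."""
    latest: Dict[str, Commit] = {}
    for commit in commits:
        current = latest.get(commit.function_key)
        if current is None or commit.timestamp > current.timestamp:
            latest[commit.function_key] = commit
    return latest


def versions_for(commits: List[Commit], function_key: str) -> List[Commit]:
    """List all commits for one function, newest first, for manual selection."""
    matching = [c for c in commits if c.function_key == function_key]
    return sorted(matching, key=lambda c: c.timestamp, reverse=True)


history = [
    Commit("f1", 100, "sub_401000"),
    Commit("f1", 200, "decrypt_config"),
    Commit("f2", 150, "send_beacon"),
]
print(batch_import(history)["f1"].name)
print([c.name for c in versions_for(history, "f1")])
```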
Finding Functions

■ Exact matching
  ■ Binary’s hash + function offset

■ Fuzzy matching
  ■ SHA1 hash over sequence of mnemonics

■ Position-independent representation
  ■ Want to cover immediates, too
  ■ Jump and call operands are zeroed out
  ■ Same for immediates that generate
    cross-references
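
A minimal sketch of the two lookup keys, under several assumptions: instructions are modeled as a hypothetical `Insn` tuple rather than read from IDA, the binary's hash is taken to be SHA-1 (the slide does not say which hash is used), and the branch mnemonic set is abbreviated. The fuzzy hash here is computed over the position-independent text of each instruction, so mnemonics and ordinary immediates count, while branch targets and cross-referencing immediates are zeroed.

```python
import hashlib
from typing import Iterable, NamedTuple, Tuple

# Abbreviated, illustrative set of jump/call mnemonics
BRANCH_MNEMONICS = {"call", "jmp", "jz", "jnz", "je", "jne"}


class Insn(NamedTuple):
    mnemonic: str
    operands: Tuple[str, ...] = ()
    xref_operands: Tuple[int, ...] = ()  # indices of immediates that generate xrefs


def exact_key(binary: bytes, function_offset: int) -> str:
    """Exact match key: hash of the whole binary plus the function's offset."""
    return f"{hashlib.sha1(binary).hexdigest()}+{function_offset:#x}"


def normalize(insn: Insn) -> str:
    """Zero out operands that change when the code is relocated."""
    ops = list(insn.operands)
    for i in range(len(ops)):
        if insn.mnemonic in BRANCH_MNEMONICS or i in insn.xref_operands:
            ops[i] = "0"
    return (insn.mnemonic + " " + ",".join(ops)).strip()


def fuzzy_hash(insns: Iterable[Insn]) -> str:
    """SHA-1 over the sequence of normalized instructions."""
    digest = hashlib.sha1()
    for insn in insns:
        digest.update(normalize(insn).encode() + b"\n")
    return digest.hexdigest()


# The same function body at two different load addresses
f1 = [Insn("push", ("ebp",)), Insn("call", ("0x401000",)),
      Insn("mov", ("eax", "0x403000"), (1,)), Insn("add", ("eax", "4")), Insn("ret")]
f2 = [Insn("push", ("ebp",)), Insn("call", ("0x501000",)),
      Insn("mov", ("eax", "0x503000"), (1,)), Insn("add", ("eax", "4")), Insn("ret")]
print(fuzzy_hash(f1) == fuzzy_hash(f2))
```

Because `add eax, 4` keeps its immediate, a variant that adds 8 instead would hash differently, which is the strictness the later slides describe.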
Dealing with Multiple Matches

■ Multiple matches – which is the best?
  ■ Quality of the annotation
  ■ Code similarity
    ■ Compute similarity value for pairs of inputs
    ■ Rank by this value, let the user choose

■ Similarity hashing
  ■ Assign consecutive basic blocks to chunks
    ■ Fixed number of chunks ensures constant sized output
  ■ For each chunk: compute FNV hash
  ■ Combine FNV hashes to final hash
  ■ s(a, b) = 100 – normalized_levenshtein(simhash(a), simhash(b))

FNV Hashes

  • Fast to compute
  • Good Avalanche behavior
  • For different word sizes

    hash := FNV_BASIS
    for byte in input:
        hash ^= byte
        hash *= FNV_PRIME
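
The similarity computation on this slide can be sketched in Python as below. The FNV pseudocode (XOR, then multiply) corresponds to the FNV-1a variant; the 32-bit constants are the standard ones. Everything else is an assumption for illustration: basic blocks are passed in as raw byte strings, the chunk count defaults to 8, the chunk hashes are combined by simply concatenating them into a tuple, and the Levenshtein distance is normalized by the number of chunks.

```python
from typing import Sequence, Tuple

FNV_BASIS_32 = 0x811C9DC5  # standard 32-bit FNV offset basis
FNV_PRIME_32 = 0x01000193  # standard 32-bit FNV prime


def fnv1a_32(data: bytes) -> int:
    """FNV-1a: XOR each byte into the hash, then multiply by the prime."""
    h = FNV_BASIS_32
    for byte in data:
        h ^= byte
        h = (h * FNV_PRIME_32) & 0xFFFFFFFF  # keep the word size at 32 bits
    return h


def simhash(blocks: Sequence[bytes], chunks: int = 8) -> Tuple[int, ...]:
    """Split consecutive basic blocks into a fixed number of chunks, hash each."""
    n = len(blocks)
    out = []
    for i in range(chunks):
        start, end = i * n // chunks, (i + 1) * n // chunks
        out.append(fnv1a_32(b"".join(blocks[start:end])))
    return tuple(out)


def levenshtein(a: Sequence[int], b: Sequence[int]) -> int:
    """Classic dynamic-programming edit distance over two sequences."""
    prev = list(range(len(b) + 1))
    for i, x in enumerate(a, 1):
        cur = [i]
        for j, y in enumerate(b, 1):
            cur.append(min(prev[j] + 1, cur[j - 1] + 1, prev[j - 1] + (x != y)))
        prev = cur
    return prev[-1]


def similarity(a: Sequence[bytes], b: Sequence[bytes], chunks: int = 8) -> float:
    """s(a, b) = 100 - normalized Levenshtein distance of the two simhashes."""
    ha, hb = simhash(a, chunks), simhash(b, chunks)
    return 100.0 - 100.0 * levenshtein(ha, hb) / max(len(ha), len(hb))


func_a = [bytes([i]) * 4 for i in range(16)]         # 16 made-up basic blocks
func_b = func_a[:-1] + [func_a[-1][:-1] + b"\xff"]   # last block differs by one byte
print(similarity(func_a, func_a), similarity(func_a, func_b))
```

With 16 blocks and 8 chunks, the one-byte change affects exactly one chunk hash, so the two functions differ in one of eight positions.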
Similarity Hashing – Details

■ Basic block reordering poses challenges
  ■ Define an order on the set of basic blocks
  ■ Come up with a reordering resilient scheme

■ Fuzzy hash serves as pre-filter
  ■ Matches are usually 100% equal
  ■ Make fuzzy hash more fuzzy
    ■ Position-independent representation quite strict
    ■ Need to take instruction reordering into account

■ Improved algorithms in future versions

[Figure: basic blocks BB1, BB2, BB3, BB4 and chunk hashes fnv1, fnv2]
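
One possible way to "define an order on the set of basic blocks" is to sort the blocks by their own hash before chunking, so that the layout order in the binary no longer matters. This is an illustrative assumption, not necessarily the scheme CrowdRE uses; the function names and the chunk count of 2 are made up for the example.

```python
from typing import List, Sequence, Tuple

FNV_BASIS_32 = 0x811C9DC5
FNV_PRIME_32 = 0x01000193


def fnv1a_32(data: bytes) -> int:
    h = FNV_BASIS_32
    for byte in data:
        h ^= byte
        h = (h * FNV_PRIME_32) & 0xFFFFFFFF
    return h


def canonical_order(blocks: Sequence[bytes]) -> List[bytes]:
    """Order blocks by content hash (content itself breaks ties), not by address."""
    return sorted(blocks, key=lambda b: (fnv1a_32(b), b))


def order_independent_simhash(blocks: Sequence[bytes], chunks: int = 2) -> Tuple[int, ...]:
    """Chunk the canonically ordered blocks and hash each chunk."""
    ordered = canonical_order(blocks)
    n = len(ordered)
    return tuple(
        fnv1a_32(b"".join(ordered[i * n // chunks:(i + 1) * n // chunks]))
        for i in range(chunks)
    )


original = [b"BB1", b"BB2", b"BB3", b"BB4"]
reordered = [original[2], original[0], original[3], original[1]]
print(order_independent_simhash(original) == order_independent_simhash(reordered))
```

A known weakness of this particular ordering is that a small change inside one block changes that block's hash, which can move it to a different chunk and alter more than one chunk hash.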
Demo Time!
Future Plans

■ Integration with other RE tools?

■ Cloud service
  ■ Social ratings of commits
  ■ Access control lists

■ Client
  ■ Real-time notifications on updated annotations
  ■ New and improved matching algorithms
  ■ Ability to deal with cyclic type dependencies
  ■ Tracking of function/file mappings
  ■ Mass importing of common library code
Where to get it: http://crowd.re