Reversim Summit 2014
Concurrency and Multi-Threading Demystified

Haim Yadid - Performize-IT
About Me: Haim Yadid
•21 Years of SW development experience
•Performance Expert
•Consulting R&D Groups
•Training: Java Performance Optimization
Moore’s Law
•Number of transistors keeps doubling (roughly every two years)
•CPU frequency -> stalled.
•Performance boost through parallelism
Yes…. But
•Performance is not a problem anymore
•We prefer commodity hardware
•We have Hadoop and Big Data!!!
•Hardware is cheap.
•Several processes will do
•My programming language will protect me
Concurrency
•Decomposition of your program into independently executing processes
•About structure
•About design
•It is not something you code, it is something you architect
Parallelism
•Simultaneous execution of (possibly related) computations
•On different cores
•About execution and scheduling
•Not about architecture
Worker
•Someone who is capable of doing an efficient job
•Hard working
•Can execute
•Will do it with pleasure

Sid
State
•What is in Sid’s mind?
•What is in the environment
•At a certain point in time

Pile of sand
Task
•A unit of work scheduled to Sid
•Possibly code
•Data
•Means to communicate with Sid
•E.g. Java: Callable
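
A minimal Java sketch of the task idea above: a `Callable` bundles code (`call()`) with data (a field), and the `Future` returned on submission is the means of communicating the result back. The class name `WordCountTask`, the helper `runOnWorker` and the word-counting job are hypothetical, chosen only for illustration.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical task: the code is call(), the data is the text field.
class WordCountTask implements Callable<Integer> {
    private final String text;

    WordCountTask(String text) {
        this.text = text;
    }

    @Override
    public Integer call() {
        String trimmed = text.trim();
        return trimmed.isEmpty() ? 0 : trimmed.split("\\s+").length;
    }

    // Hands the task to a single worker ("Sid") and waits for the answer.
    static int runOnWorker(String text) {
        ExecutorService sid = Executors.newSingleThreadExecutor();
        try {
            Future<Integer> answer = sid.submit(new WordCountTask(text));
            return answer.get(); // blocks until Sid is done
        } catch (Exception e) {
            throw new IllegalStateException(e);
        } finally {
            sid.shutdown();
        }
    }
}
```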

Task
Our World
Liveness Problems
•Deadlock
•Starvation
•Waiting for a train that will never come
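package main

import (
	"fmt"
	"time"
)

// count prints 0..999, pausing 10 ms between prints.
func count() {
	for i := 0; i < 1000; i++ {
		fmt.Println(i)
		time.Sleep(10 * time.Millisecond)
	}
}

func main() {
	go count()
	time.Sleep(3000 * time.Millisecond)
	for {
	} // busy loop: never yields to the scheduler
}

// GOMAXPROCS=2

Go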
Data Races
•Inopportune interleaving
•Stale values
•Losing updates
•Infinite loops

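import java.util.HashSet;

class Foo {
    private HashSet h = new HashSet();

    // Check-then-act on an unsynchronized HashSet: two threads can both
    // see "not contained" and both add the same value.
    boolean introduceNewVal(Object v) {
        if (!h.contains(v)) { h.add(v); return true; }
        return false;
    }
}

Java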
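
One possible way to close the check-then-act gap, sketched under the assumption that Java 8 is available: a concurrent set whose `add` performs the check and the insert as a single atomic step. `SafeFoo` and the `winners` helper are hypothetical names used only for this illustration.

```java
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;

class SafeFoo {
    // Concurrent set backed by a ConcurrentHashMap (Java 8+).
    private final Set<Object> h = ConcurrentHashMap.newKeySet();

    // add() checks and inserts atomically: exactly one caller per value gets true.
    boolean introduceNewVal(Object v) {
        return h.add(v);
    }

    // Races several threads on the same value and counts how many got "true".
    static int winners(int threads) {
        SafeFoo foo = new SafeFoo();
        AtomicInteger wins = new AtomicInteger();
        Thread[] ts = new Thread[threads];
        for (int i = 0; i < threads; i++) {
            ts[i] = new Thread(() -> {
                if (foo.introduceNewVal("x")) {
                    wins.incrementAndGet();
                }
            });
            ts[i].start();
        }
        for (Thread t : ts) {
            try {
                t.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }
        return wins.get();
    }
}
```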
Performance
•Communication overhead
•Contention
•Imbalanced task distribution
•False sharing
val v = Vector(…)
v.par.map(_ + 1)

Scala
What's wrong here?
•IncrementX accessed from ThreadX
•IncrementY accessed from Thread Y
•An aggregator thread will read both
class T {
    volatile int x = 0;
    volatile int y = 0;

    // x and y are adjacent fields, so they are likely to share a cache line
    long incrementX() { return ++x; }
    long incrementY() { return ++y; }
}
False sharing: cache coherency -> hitting same cache line

Java
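
A hedged sketch of one common mitigation, manual padding, assuming 64-byte cache lines. The JVM is free to lay out fields as it likes, so this is a heuristic rather than a guarantee; `PaddedCounter`, `PaddedT` and `run` are hypothetical names. Each counter lives in its own object that is larger than a cache line, so the two `value` fields cannot fall on the same line.

```java
// Holder whose instance is larger than an assumed 64-byte cache line.
class PaddedCounter {
    volatile long value;
    // Padding only, never read (assumption: 64-byte cache lines).
    long p1, p2, p3, p4, p5, p6, p7;
}

class PaddedT {
    private final PaddedCounter x = new PaddedCounter();
    private final PaddedCounter y = new PaddedCounter();

    long incrementX() { return ++x.value; } // called from ThreadX only
    long incrementY() { return ++y.value; } // called from ThreadY only
    long sum() { return x.value + y.value; } // called from the aggregator

    // Runs one writer thread per counter and returns the aggregated total.
    static long run(int incrementsPerThread) {
        PaddedT t = new PaddedT();
        Thread tx = new Thread(() -> {
            for (int i = 0; i < incrementsPerThread; i++) t.incrementX();
        });
        Thread ty = new Thread(() -> {
            for (int i = 0; i < incrementsPerThread; i++) t.incrementY();
        });
        tx.start();
        ty.start();
        try {
            tx.join();
            ty.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return t.sum();
    }
}
```

The increments stay non-atomic, as in the slide; that is safe here only because each counter has a single writer thread.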
Sid Life Cycle
•Creation
•Destruction
Heavyweight Sid
•Threads
•Thread Pools/Executors: Single Threaded/Fixed size/Bounded min..max/Unbounded
•Fork Join pools (Work stealing)
•Storm Bolts
Lightweight Sid
•Sid is not a thread, but rather is scheduled onto a thread
•Green Threads
•Actors (Scala/Akka)
•Agents (Clojure)
•Goroutines
Communication
•BlockingQueues
•Futures and promises (CompletableFuture)
•Deques
•<- Go Channels
•! (Scala actors)
Serial Execution
•Three queries to DB executed
•In Serial
•Long response time
doGet(req,resp) {
    rs1 = runQuery1()
    rs2 = runQuery2()
    rs3 = runQuery3()
    resp.write(mergeResults(rs1,rs2,rs3))
}
Can be parallelized

Pseudo(Java)
Create Threads
•Run three queries in parallel…..
•but:
doGet(req,resp) {
    q1 = new Query();
    new Thread(q1).start();
    q2 = new Query();
    new Thread(q2).start();
    q3 = new Query();
    new Thread(q3).start();

    // results are read without waiting for the threads to finish
    rs1 = q1.getRs();
    rs2 = q2.getRs();
    rs3 = q3.getRs();
    resp.write(mergeResults(rs1,rs2,rs3));
}
Thread leak + Thread creation overhead + data races

Pseudo(Java)
Thread Pool
doGet(req,resp) {
    // a new pool is created on every request and never shut down
    ExecutorService e = Executors.newFixedThreadPool(3);
    ArrayList tasks = …;
    tasks.add(new QueryTask(Query1)) …. 3
    List<Future<Integer>> fs;
    fs = e.invokeAll(tasks); // invoke all in parallel

    ArrayList results = …
    for (Future<Integer> f : fs) { // collect results
        if (f.isDone()) {
            results.add(f.get());
        }
    }
    resp.write(mergeResults(results));
}
Thread pool leak + Thread pool creation overhead

Pseudo(Java)
Thread Pool 2
// one pool shared by all requests
static ExecutorService e = Executors.newFixedThreadPool(3);

doGet(req,resp) {
    ArrayList tasks = …;
    tasks.add(new QueryTask(Query1)) …. 3
    List<Future<Integer>> fs;
    fs = e.invokeAll(tasks); // invoke all in parallel

    ArrayList results = …
    for (Future<Integer> f : fs) { // collect results
        if (f.isDone()) {
            results.add(f.get());
        }
    }
    resp.write(mergeResults(results));
}
Size? / share thread pool? / name thread pool threads

Pseudo(Java)
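
The open questions on this slide (size, sharing, thread names) can be illustrated with a small runnable sketch. Everything named here is an assumption made for the example: the class `QueryPool`, the `query-pool-` prefix, the pool size of 3, and the three stand-in tasks that replace the real queries.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.ThreadFactory;
import java.util.concurrent.atomic.AtomicInteger;

class QueryPool {
    // Fixed-size pool whose threads carry a recognizable name in thread dumps.
    static ExecutorService newNamedPool(int size, String prefix) {
        AtomicInteger seq = new AtomicInteger();
        ThreadFactory factory = r -> {
            Thread t = new Thread(r, prefix + seq.incrementAndGet());
            t.setDaemon(true); // do not keep the JVM alive for idle workers
            return t;
        };
        return Executors.newFixedThreadPool(size, factory);
    }

    // One pool shared by all requests (size and prefix are illustrative).
    static final ExecutorService POOL = newNamedPool(3, "query-pool-");

    // Runs all tasks in parallel and collects the results in task order.
    static List<Integer> runAll(List<Callable<Integer>> tasks) {
        try {
            List<Integer> results = new ArrayList<>();
            for (Future<Integer> f : POOL.invokeAll(tasks)) {
                results.add(f.get()); // invokeAll returns only when all tasks are done
            }
            return results;
        } catch (InterruptedException | ExecutionException e) {
            throw new IllegalStateException(e);
        }
    }

    // Stand-ins for the three queries.
    static List<Integer> demo() {
        List<Callable<Integer>> tasks = new ArrayList<>();
        tasks.add(() -> 1);
        tasks.add(() -> 2);
        tasks.add(() -> 3);
        return runAll(tasks);
    }
}
```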
Same Example With Go
func execQuery(query string, c chan *Row) {
    c <- db.Query(query)
}

func doGet(req,resp) {
    c := make(chan *Row)
    go execQuery(query1,c)
    go execQuery(query2,c)
    go execQuery(query3,c)
    for i := 0; i < 3; i++ {
        combineRs(<-c) // receive one result per query, in completion order
    }
}

Pseudo(Go)
State Management
•Eventually we need to have state
•It is easy to deal with state when we have one Sid
•But what happens when there are several?
•We have three approaches
•Most are familiar with only one
Handling State
•Shared Mutability
•Shared Immutability
•Isolated Mutability
Shared Mutability
•Multiple Sids access the same data
•Mutate the state
•Easily exposed to concurrency hazards
Visibility
•Change made by sid1 is visible to sid2?
•Solutions
•volatile keyword
•Memory Model (Happens before)
•compiler reordering
•Caches
•Not so simple
(Diagram: memory hierarchy: Memory, L3 cache, L2 cache, L1 cache, Registers, CPU)
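
A small Java sketch of the `volatile` solution, using a stop flag shared between two threads. The class `StopFlag` and its methods are hypothetical names for this illustration. Without `volatile`, the Java Memory Model would not guarantee that the spinning thread ever sees the write.

```java
class StopFlag {
    // volatile: a write by one thread is guaranteed visible to later reads by other threads.
    private volatile boolean stopped = false;
    private long iterations = 0; // touched by the spinning thread only

    void stop() {
        stopped = true;
    }

    // Spins until another thread calls stop().
    long spinUntilStopped() {
        while (!stopped) {
            iterations++;
        }
        return iterations;
    }

    // Starts a spinning worker, stops it, and reports whether it terminated.
    static boolean demo() {
        StopFlag flag = new StopFlag();
        Thread worker = new Thread(flag::spinUntilStopped);
        worker.start();
        flag.stop();
        try {
            worker.join(5000);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return !worker.isAlive();
    }
}
```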
Atomicity
•What can be done in a single step ?
•CAS constructs (Compare and swap)
•AtomicInteger
•AtomicLong
•ConcurrentHashMap putIfAbsent
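
The constructs listed above can be sketched in a few lines of Java. `CasExamples`, `updateMax` and `register` are hypothetical helper names; the calls they wrap (`compareAndSet`, `putIfAbsent`) are the standard `java.util.concurrent` ones.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;

class CasExamples {
    // Classic CAS retry loop: atomically raise max to candidate if candidate is larger.
    static void updateMax(AtomicInteger max, int candidate) {
        while (true) {
            int current = max.get();
            if (candidate <= current) {
                return; // nothing to do
            }
            if (max.compareAndSet(current, candidate)) {
                return; // our swap won
            }
            // another thread changed the value in between: retry
        }
    }

    // putIfAbsent returns null only for the caller that actually inserted the key.
    static boolean register(ConcurrentHashMap<String, Integer> map, String key, int value) {
        return map.putIfAbsent(key, value) == null;
    }
}
```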
Atomicity is not Viral
•An (almost) real example
•A non transactional database
•Balance per user
•Use atomicity to solve the problem
class User {
    private AtomicLong balance = …..

    long updateBalance(int diff) {
        long temp = balance.addAndGet(diff); // atomic on its own
        setToDB(temp); // not atomic together with the line above: writes may reach the DB out of order
        return temp;
    }
}

Java
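
One way to make the two steps behave as a unit is to put both under the same lock. This is a sketch, not the talk's own solution: `LockedUser` is a hypothetical class, and the `storedInDb` field merely stands in for the non-transactional database.

```java
class LockedUser {
    private long balance;    // guarded by this
    private long storedInDb; // stand-in for the database row (hypothetical)

    // Update and write happen under one lock, so DB writes follow update order.
    synchronized long updateBalance(int diff) {
        balance += diff;
        setToDB(balance);
        return balance;
    }

    private void setToDB(long value) {
        storedInDb = value;
    }

    synchronized long storedBalance() {
        return storedInDb;
    }

    // Several threads each add 1 repeatedly; returns what ended up "in the DB".
    static long demo(int threads, int updatesPerThread) {
        LockedUser user = new LockedUser();
        Thread[] ts = new Thread[threads];
        for (int i = 0; i < threads; i++) {
            ts[i] = new Thread(() -> {
                for (int j = 0; j < updatesPerThread; j++) {
                    user.updateBalance(1);
                }
            });
            ts[i].start();
        }
        for (Thread t : ts) {
            try {
                t.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }
        return user.storedBalance();
    }
}
```

The trade-off is that the lock is held for the duration of the database write, which can become a point of contention.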
Locking (Pessimistic)
•Synchronized
•Sync
•Mutex
•Reentrant lock
•Semaphores
•Synchronized Collections
Beware Sync Collections
•Two synchronized operations in sequence are not synchronized as a whole

if (syncedHashMap.get("b") != null) {
    syncedHashMap.put("b", 2);
}

Java
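
Two possible ways to make the check and the update a single step, sketched with the hypothetical helper class `CompoundOps`: hold the synchronized wrapper's own lock across both calls, or use the atomic `replace` of `ConcurrentHashMap`.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

class CompoundOps {
    // Option 1: for a Collections.synchronizedMap wrapper, lock the map itself
    // so that no other thread can get in between get() and put().
    static void replaceIfPresentLocked(Map<String, Integer> syncedHashMap, String key, int value) {
        synchronized (syncedHashMap) {
            if (syncedHashMap.get(key) != null) {
                syncedHashMap.put(key, value);
            }
        }
    }

    // Option 2: ConcurrentHashMap.replace only writes if the key is currently mapped.
    static void replaceIfPresent(ConcurrentHashMap<String, Integer> map, String key, int value) {
        map.replace(key, value);
    }
}
```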
Hazards
•Fine grained
•-> Deadlocks

•Coarse grained
•-> contention
STM (Optimistic)
•Software Transactional Memory
•Transactional semantics ACI: (not Durable)
•Atomic,
•Consistent and
•Isolated
•No deadlocks: on collision, retry!
•Clojure refs, Akka refs
STM performance problem when there are too many mutations.
STM
•Clojure refs and dosync

•Scala Refs and atomic
•Multiverse Java
Multiverse STM Example
import java.util.Date;
import org.multiverse.api.references.*;
import static org.multiverse.api.StmUtils.*;

public class Account{
    private final TxnRef<Date> lastUpdate;
    private final TxnInteger balance;

    public Account(int balance){
        this.lastUpdate = newTxnRef(new Date());
        this.balance = newTxnInteger(balance);
    }

Java
Multiverse STM Example
    public void incBalance(final int amount, final Date date){
        atomic(() ->{
            balance.inc(amount);
            lastUpdate.set(date);

            if(balance.get()<0){
                throw new IllegalStateException("Not enough money");
            }
        });
    }
}

Java8
Multiverse STM Example
public static void transfer(final Account from, final Account to, final int amount){
    atomic(()->{
        Date date = new Date();
        from.incBalance(-amount, date);
        to.incBalance(amount, date);
    });
}

Java8
Retry ahead: beware of side effects
Pure Immutability
•We have shared state
•But shared state is read only (after construction)
•No concurrency issues
•No deadlocks
•No race conditions
•No stale values
•Optimal for cache
Support From Languages
•Functional Languages favour immutability
•vals in Scala, Clojure
•final keyword in java
•freeze method in ruby
Immutable Object Example
Object cannot be changed after construction
all fields are final
public final class MySet {
    private final Set<String> vals = new HashSet<String>();

    public MySet(String names[]) {
        for (String name : names) vals.add(name);
    }

    public boolean containsVal(String name);…..
}

Java
CopyOnWrite Collections
•Any change to the collection creates a new copy
•Safe
•Fast read, read without synchronisation
•Iteration is fast
•Iterators do not support remove() set() add()

Bad performance when mutation rate is high
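
A short sketch of the snapshot behaviour described above, using the JDK's `CopyOnWriteArrayList`. The class `CowDemo` and its two methods are hypothetical names for this illustration.

```java
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;

class CowDemo {
    // Iterates over a snapshot while the list is being modified.
    static int iterateWhileAdding() {
        List<String> list = new CopyOnWriteArrayList<>(Arrays.asList("a", "b"));
        int seen = 0;
        for (String s : list) {
            list.add(s + "!"); // writes go to a new copy; no ConcurrentModificationException
            seen++;
        }
        return seen; // the iterator only saw the original two elements
    }

    // Element-changing operations on the iterator itself are not supported.
    static boolean iteratorRemoveUnsupported() {
        List<String> list = new CopyOnWriteArrayList<>(Arrays.asList("a"));
        Iterator<String> it = list.iterator();
        it.next();
        try {
            it.remove();
            return false;
        } catch (UnsupportedOperationException expected) {
            return true;
        }
    }
}
```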
  
Persistent collections
•Immutable collections
•Which are efficient
•Preserve the same O(_) characteristic of a mutable collection.
•Shared structure
•Recursive definition
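
The simplest persistent structure with these properties is an immutable linked list, sketched below. It is an illustration of shared structure and recursive definition only, not the trie the talk goes on to show; `PList` is a hypothetical class.

```java
// Immutable singly linked list: a list is either empty or a head plus another list.
final class PList<T> {
    private final T head;
    private final PList<T> tail; // null only for the empty list
    private final int size;

    private PList(T head, PList<T> tail, int size) {
        this.head = head;
        this.tail = tail;
        this.size = size;
    }

    static <T> PList<T> empty() {
        return new PList<>(null, null, 0);
    }

    // O(1): the new list reuses the whole old list as its tail (shared structure).
    PList<T> prepend(T value) {
        return new PList<>(value, this, size + 1);
    }

    T head() { return head; }
    PList<T> tail() { return tail; }
    int size() { return size; }
}
```

Prepending never modifies the existing list, so older versions stay valid and can be read from any thread without synchronization.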
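Persistent Trie
(Diagram: a trie under a single root, with nodes A, C, D, E, F)
Persistent Trie
(Diagram: the same trie with a second root and a second copy of node E alongside the original nodes)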
Example Customization Cache
•A Web application server
•Serving complicated and customisable UI
•Each user has its own customization (potentially)
•Classic case for an immutable collection
•Low mutation rate
•High read rate
Example Customization Cache
•Customization Data is immutable
•Customization Data HashMap is a Persistent Map
•Cache is represented by a single STM reference
•Update will fail if two are performing it at once
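
A rough Java approximation of this design, with two stand-ins that are named plainly: an `AtomicReference` takes the place of the STM reference, and a copied unmodifiable `HashMap` takes the place of a persistent map (so each update costs O(n) here, where a persistent map would share structure). `CustomizationCache` and `tryPut` are hypothetical names.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.atomic.AtomicReference;

class CustomizationCache {
    // The whole cache is one reference to an immutable map.
    private final AtomicReference<Map<String, String>> ref =
            new AtomicReference<>(Collections.<String, String>emptyMap());

    // Readers never lock: they read whichever version is current.
    String get(String user) {
        return ref.get().get(user);
    }

    // Single attempt: returns false if another update replaced the map in the meantime.
    boolean tryPut(String user, String customization) {
        Map<String, String> current = ref.get();
        Map<String, String> next = new HashMap<>(current);
        next.put(user, customization);
        return ref.compareAndSet(current, Collections.unmodifiableMap(next));
    }
}
```

A caller whose `tryPut` returns false can simply retry against the new version, which mirrors the retry-on-collision behaviour of an STM transaction.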
Immutability and GC
•Immutability is great
•But: generates a lot of objects
•When done for short-lived objects, GC can cope with it
•Long-lived immutable objects/collections which change frequently may cause GC to have low throughput and high pause times
Isolated Mutability
•No shared state
•Each Sid has its own pile of sand
•Message passing between Sids
•Prefer passing immutable objects
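
The idea can be sketched in plain Java with a thread and a mailbox queue, as a stand-in for a real actor library. `CounterActor`, its methods and the `STOP` marker value are all hypothetical; the mutable `total` is a local variable of the worker thread, so no other thread can touch it.

```java
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.LinkedBlockingQueue;

class CounterActor {
    private static final int STOP = Integer.MIN_VALUE; // hypothetical "poison pill" message

    private final BlockingQueue<Integer> mailbox = new LinkedBlockingQueue<>();
    private final CompletableFuture<Integer> result = new CompletableFuture<>();

    CounterActor() {
        Thread sid = new Thread(() -> {
            int total = 0; // isolated mutable state: only this thread sees it
            try {
                while (true) {
                    int msg = mailbox.take();
                    if (msg == STOP) {
                        break;
                    }
                    total += msg;
                }
                result.complete(total);
            } catch (InterruptedException e) {
                result.completeExceptionally(e);
            }
        });
        sid.setDaemon(true);
        sid.start();
    }

    // Messages are immutable Integers; senders never share state with the worker.
    void send(int amount) {
        mailbox.add(amount);
    }

    // Asks the worker to finish and returns the total it accumulated.
    int stopAndGetTotal() {
        mailbox.add(STOP);
        return result.join();
    }
}
```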
Isolated Mutability
•Javascript Workers
•Ruby/NodeJS multi process
•Actors (Scala, Erlang)
•Agents (Clojure)
•Goroutines/channels (Go)
Actor
(Diagram: several actors, each with its own isolated mutable state)
Building a Monitoring System
•Monitoring System (e.g. Nagios)
•~100k monitors
•running periodically
•Each one has a state.
•Consumers are able to query state.
•Some monitors may affect other monitors' state
Components
•MonitorActor (two actors)
•HostActor
•MonitorCache
•SchedulerActor
•QueryActor
•UpdateActor
Monitor Actors
•MonitorStateActor (MSA)
•Always ready to be queried
•State updated by message from MRA
•Stateless

•MonitorRecalculateActor (MRA)
•May be recalculating and not responsive
•Stateful
•Supervises MSA
MonitorsCache
•An immutable cache - holds all actor refs
•Single view of the world.
•Used by SchedulerActor and QueryActor
•May have several objects managed by STM
(Diagram with the labels: Actor, MonitorsCache, Status/state, Query Actor, Scheduler, MSA, MRA)
Further Reading
•Java Concurrency in Practice / Brian Goetz
•Effective Akka / Jamie Allen
•Clojure High Performance Programming / Shantanu Kumar
•Programming Concurrency on the JVM: Mastering Synchronization, STM, and Actors / Venkat Subramaniam
Thanks + Q&A + Contact Me
lifey@performize-it.com
blog.performize-it.com
www.performize-it.com
https://github.com/lifey
http://il.linkedin.com/in/haimyadid
@lifeyx

© Copyright Performize-IT LTD.

Concurrency and Multithreading Demystified - Reversim Summit 2014