Lesson2 7GraphPartitioning PDF
The Graph Partitioning Problem
Look at the problem from a different angle:
Let's multiply a sparse matrix A by a vector x.
Recall the duality between matrices and graphs:
Rows and columns are vertices
The nonzeros are edges
One way to do a BFS is to do a computation that looks like a linear algebra problem:
$y \leftarrow A x$ (A multiplied by x)
To distribute the work by rows:
Assign rows to processes. This is equivalent to partitioning the graph.
When you partition the matrix, this implies that you are also partitioning the
vectors x and y. This occurs because there is a one-to-one mapping of vector
entries to graph vertices.
Work = O(nonzeros), i.e., the work is proportional to the number of nonzeros.
(If n is the number of nonzeros in the row, then the depth of the computation is the
depth of the sum, which is O(log n), and the work is the sum of the work across the
elements, which is O(n).)
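As a minimal sketch of the row-wise distribution described above, the snippet below stores A in compressed sparse row (CSR) form and lets each hypothetical "process" compute only the rows of y it owns. The function name `spmv_rows`, the CSR arrays, and the 4x4 example matrix are illustrative assumptions, not part of the lesson. The work counter makes the O(nonzeros) cost visible.

```python
def spmv_rows(indptr, indices, data, x, rows):
    """Compute y[i] = sum_j A[i][j] * x[j] for the given rows of a CSR matrix.

    Returns (partial_y, work), where work counts the nonzeros touched.
    """
    partial_y, work = {}, 0
    for i in rows:
        acc = 0.0
        for k in range(indptr[i], indptr[i + 1]):
            acc += data[k] * x[indices[k]]
            work += 1
        partial_y[i] = acc
    return partial_y, work


# Illustrative 4x4 symmetric matrix (assumed example), all nonzeros equal to 1:
#   row 0: cols 0, 1      row 1: cols 0, 1, 2, 3
#   row 2: cols 1, 2      row 3: cols 1, 3
indptr = [0, 2, 6, 8, 10]
indices = [0, 1, 0, 1, 2, 3, 1, 2, 1, 3]
data = [1.0] * 10
x = [1.0, 2.0, 3.0, 4.0]

# Two hypothetical processes, each owning two rows (and the matching entries of y).
y0, work0 = spmv_rows(indptr, indices, data, x, rows=[0, 1])
y1, work1 = spmv_rows(indptr, indices, data, x, rows=[2, 3])
```

In this example both processes own two rows, yet the first touches 6 nonzeros and the second only 4, so equal row counts do not guarantee equal work.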
How to choose partitions:
Goal 1: Divide up the rows to balance the number of nonzeros per partition.
Goal 2: Minimize the communication volume by reducing the edge cuts.
The classic graph partitioning problem:
Given: Graph G = (V, E) and number of partitions P
Output: Compute a (vertex) partition $V = V_1 \cup V_2 \cup \dots \cup V_P$ such that:
1. The partitions should cover all the vertices, but be disjoint.
$\{V_i\}$ are disjoint: $V_i \cap V_j = \emptyset$ for $i \neq j$
2. The partitions should all be about equal in size.
$\{V_i\}$ are roughly balanced: $|V_i| \approx |V_j|$
3. The number of cut edges should be minimized.
Let $E_{cut} = \{(u, v) \mid u \in V_i,\ v \in V_j,\ i \neq j\}$
Minimize $|E_{cut}|$
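The balance and edge-cut criteria above can be checked mechanically for any candidate partition. The sketch below assumes an undirected graph given as an edge list and a partition given as a vertex-to-label dictionary; `partition_quality` and the six-vertex example are hypothetical names and data chosen for illustration.

```python
from collections import Counter


def partition_quality(edges, part):
    """Return (sizes, cut) for a vertex partition.

    edges: iterable of undirected edges (u, v)
    part:  dict mapping vertex -> partition id
    """
    sizes = Counter(part.values())                        # |V_i| per partition
    cut = sum(1 for u, v in edges if part[u] != part[v])  # |E_cut|
    return dict(sizes), cut


# Assumed example: two triangles joined by the single edge (2, 3).
edges = [(0, 1), (0, 2), (1, 2), (3, 4), (3, 5), (4, 5), (2, 3)]
part = {0: 0, 1: 0, 2: 0, 3: 1, 4: 1, 5: 1}
sizes, cut = partition_quality(edges, part)
```

For this partition both sides hold three vertices and only the bridging edge is cut; a different assignment of the same six vertices can be equally balanced but cut far more edges.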
Do You Really Want a Graph Partition?
For a sparse matrix multiply:
Recall the two computation goals: balance work, minimize communication
Are the goals the same for graph partitioning and sparse matrix multiply?
No, they are not the same.
Consider a three-way partitioning of a graph:
Each partition has 2 vertices
There are nine cut edges
If we translate this graph partitioning to the matrix partition we see:
The matrix partitions are not equal with regard to the nonzeros. One partition
has 10 nonzeros and the other two have 7 each.
This means the vertex counts are the same, but the WORK is NOT.
Find the partition that minimizes the number of edge cuts and balances the number of
nonzeros. (24 nonzeros total)
Graph Bisection and Planar Separators
Graph partitioning is NP-complete, so:
Need heuristics
Need to exploit structure
(NP-complete explanation:
https://www.mathsisfun.com/sets/npcomplete.html
)
(A divide-and-conquer algorithm works by recursively breaking down a problem into two or more
subproblems of the same (or related) type, until these become simple enough to be solved
directly. The solutions to the subproblems are then combined to give a solution to the original
problem.)
A Simple Heuristic: Bisection (based on divide and conquer)
Given a graph G, divide it into P partitions.
Step 1: Divide the graph into two partitions
Step 2: Divide each half into two
Step 3: Continue to divide each partition in half until P partitions are reached
This works, but how do we get two-way partitions? TBD
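The three steps above amount to a short recursion. The following sketch assumes P is a power of two and takes the two-way splitter as a parameter, since the notes leave that choice open at this point. `recursive_bisection` and `split_in_half` are hypothetical names, and `split_in_half` is only a placeholder that ignores the edges entirely.

```python
def recursive_bisection(vertices, p, bisect):
    """Split `vertices` into p parts by repeated two-way splits.

    Assumes p is a power of two; `bisect` is any function that takes a list
    of vertices and returns two halves.
    """
    vertices = list(vertices)
    if p == 1:
        return [vertices]
    left, right = bisect(vertices)
    return (recursive_bisection(left, p // 2, bisect)
            + recursive_bisection(right, p // 2, bisect))


def split_in_half(vertices):
    """Placeholder bisector: ignores edges and cuts the list in the middle."""
    mid = len(vertices) // 2
    return vertices[:mid], vertices[mid:]


parts = recursive_bisection(range(8), 4, split_in_half)
```

Any real two-way heuristic can be passed in place of `split_in_half` without changing the recursion.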
Planar Graphs
Planar graphs are ones that can be drawn in the plane with no edge crossings.
Planar Graph Theorem:
Lipton and Tarjan Theorem: A planar graph G = (V, E) with |V| = n vertices has a disjoint
partition $V = A \cup S \cup B$ such that:
1. S separates A and B (this means there are no edges that directly connect A and B)
2. $|A|, |B| \le \frac{2}{3} n$. This means that neither part is (roughly) more than twice the size of the other, $|A|/|B| \le 2$, so the partitions are balanced.
3. $|S| = O(\sqrt{n})$.
In a planar graph such as a grid, S might be a row or a column.
The existence of S DOES NOT mean you can minimize the edge cuts efficiently. But any
algorithm that can find the separator should be able to find a good partition.
(Breadth-First Search:
1. Start at some vertex and give it two values (distance, predecessor). Distance is its distance
from the starting point; predecessor is the previous vertex on the shortest path from the starting point.
2. Then visit the next level, those with distance 1, and set the (distance, predecessor) for
those vertices.
3. Continue until all vertices have been visited.
https://en.wikipedia.org/wiki/Breadthfirst_search)
Quiz: Partitioning via Breadth-First Search
An algorithm that uses BFS to bisect a graph:
1. Pick any vertex as a starting point
2. Run a level-synchronous BFS from this vertex
3. You will notice that every level serves as a separator.
4. Because of this, you can stop when you have visited about half of the vertices.
5. Assign all visited vertices to one partition and all unvisited vertices to the other partition.
This is not the only option for a stopping criterion. Can you come up with others?
BFS schemes work well on planar graphs and they are cheap, but we are using BFS to solve a
BFS problem.
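The five steps above can be sketched as follows. This is only an illustration under stated assumptions: the graph is connected and given as an adjacency dictionary, whole levels are always taken (so the visited side can overshoot half), and the name `bfs_bisect` and the path-graph example are hypothetical.

```python
def bfs_bisect(adj, start):
    """Bisect a connected graph with a level-synchronous BFS from `start`.

    adj: dict mapping vertex -> iterable of neighbors.
    Returns (visited, unvisited) as two sets of vertices.
    """
    target = len(adj) // 2
    visited = {start}
    frontier = [start]
    # Expand one whole level at a time; stop once about half the vertices are visited.
    while frontier and len(visited) < target:
        next_frontier = []
        for u in frontier:
            for v in adj[u]:
                if v not in visited:
                    visited.add(v)
                    next_frontier.append(v)
        frontier = next_frontier
    return visited, set(adj) - visited


# Assumed example: a path 0 - 1 - 2 - 3 - 4 - 5.
adj = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2, 4], 4: [3, 5], 5: [4]}
left, right = bfs_bisect(adj, start=0)
```

Starting from an end of the path cuts one edge, while starting from vertex 2 yields the side {1, 2, 3} and cuts two edges, so the choice of start vertex affects the quality of the result.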
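Kernighan-Lin
The Kernighan-Lin algorithm is the most famous heuristic for graph partitioning.
Given a graph, divide the vertices into two parts of equal or nearly equal size. Any split will work.
$V = V_1 \cup V_2$, $|V_1| = |V_2|$
Define the cost to be the number of edges that go between $V_1$ and $V_2$.
Now take a subset of $V_1$ and a subset of $V_2$; call them $X_1$ and $X_2$:
let $X_1 \subseteq V_1$ and $X_2 \subseteq V_2$, with $|X_1| = |X_2|$.
Now change the partitions so that $X_2 \subseteq V_1$ and $X_1 \subseteq V_2$ (that is, swap $X_1$ and $X_2$).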
You expect the cut size to change, but by how much?
To answer this question:
Pick a vertex $a \in V_1$ (partition 1) and a vertex $b \in V_2$ (partition 2).
The external costs are:
(The edges that cross the partitions)
$E_1(a \in V_1) \equiv$ # of edges $(a, b \in V_2)$
$E_2(b \in V_2) \equiv$ # of edges $(b, a \in V_1)$
The internal costs are:
(The edges that DO NOT cross partitions)
$I_1(a \in V_1) \equiv$ # of edges $(a, i \in V_1)$
$I_2(b \in V_2) \equiv$ # of edges $(b, j \in V_2)$
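The cost of the partition is the cost of the partition ignoring a and b, plus the external costs of a
and b, minus a constant.
$Cost(V_1, V_2) = Cost(V_1 - \{a\}, V_2 - \{b\}) + E_1(a) + E_2(b) - c_{a,b}$
The constant is necessary to account for an edge between a and b.
$c_{a,b} = 1$ if there is an edge between a and b
$c_{a,b} = 0$ if there is no edge between a and b
BUT why subtract this constant? If the edge (a, b) exists, it is counted once in $E_1(a)$ and once in $E_2(b)$, so subtracting $c_{a,b}$ removes the double count.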
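Now swap a and b. What is the cost after the swap?
1. Any edge of a or b that was external is now internal (except the edge (a, b) itself, which stays external)
2. Any edge of a or b that was internal is now external
$Cost(\hat{V}_1, \hat{V}_2) = Cost(V_1 - \{a\}, V_2 - \{b\}) + I_1(a) + I_2(b) + c_{a,b}$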
What is the change in cost?
$gain(a \in V_1, b \in V_2) \equiv Cost(V_1, V_2) - Cost(\hat{V}_1, \hat{V}_2) = E_1(a) + E_2(b) - (I_1(a) + I_2(b)) - 2 c_{a,b}$
The larger the gain the better, because a larger gain means a larger decrease in the cost.
Kernighan-Lin Algorithm Quiz
Assume:
Every vertex has a partition label; the label can be accessed in constant time, O(1)
The maximum degree of any vertex is given: max degree = d
Question: What is the sequential running time to compute gain(a, b) in terms of
$d$, $n_1 = |V_1|$, $n_2 = |V_2|$?
Answer: O(d)
To get the answer:
Sweep over the neighbors of a and of b; each has at most d neighbors. To determine if a neighbor is
internal or external, check its partition label.
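The neighbor sweep described above can be written down directly. In this sketch the adjacency structure is assumed to be a dictionary of neighbor sets (so the edge test for $c_{a,b}$ is a constant-time lookup) and `label` is the O(1) partition-label table from the quiz; the function name `gain` and the example graph are illustrative.

```python
def gain(adj, label, a, b):
    """Gain of swapping a and b, which must lie in different parts.

    adj:   dict mapping vertex -> set of neighbors
    label: dict mapping vertex -> partition label
    """
    def external_internal(v):
        # One sweep over at most d neighbors, checking each label in O(1).
        ext = sum(1 for u in adj[v] if label[u] != label[v])
        return ext, len(adj[v]) - ext

    e_a, i_a = external_internal(a)
    e_b, i_b = external_internal(b)
    c_ab = 1 if b in adj[a] else 0
    return e_a + e_b - (i_a + i_b) - 2 * c_ab


# Assumed example: two triangles {0, 1, 2} and {3, 4, 5} joined by edge (2, 3),
# starting from a poor split with vertices 2 and 3 on the wrong sides.
adj = {0: {1, 2}, 1: {0, 2}, 2: {0, 1, 3}, 3: {2, 4, 5}, 4: {3, 5}, 5: {3, 4}}
label = {0: 1, 1: 1, 3: 1, 2: 2, 4: 2, 5: 2}
best = gain(adj, label, 3, 2)
```

Here swapping vertices 3 and 2 reduces the cut from 5 edges to 1, which matches the computed gain of 4.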
Kernighan-Lin Algorithm
G = (V, E)
A graph is partitioned. To improve the partition, try swapping the subsets $X_1$, $X_2$.
How are $X_1$, $X_2$ chosen?
The KL Procedure
1. Compute the internal and external cost for every vertex.
2. Mark all the nodes as unvisited.
3. Then carry out an iterative procedure:
Go through every pair of unmarked vertices. Choose the pair with the largest gain
and mark that pair as visited.
Go through and update every internal and external cost as if a and b had been
swapped. You are not swapping a and b, just updating the costs.
Repeat until all the vertices are visited.
The algorithm:
At the completion of the algorithm, a sequence of gains has been computed:
$gain(a_1, b_1), gain(a_2, b_2), \dots$
This is the end of the first part of the algorithm.
Now: Sum the gains; let $Gain(j) \equiv \sum_{k=1}^{j} gain(a_k, b_k)$
Kernighan-Lin concept: keep the first j swaps, where j is chosen to maximize Gain(j). If that Gain is greater than
zero, then this candidate will improve the partition. Swap the two subsets and update the
overall costs.
Repeat the above until there is no more Gain.
The main concern with this algorithm is the cost. The sequential running time is $O(|V|^2 d)$, where
d is the maximum degree of any vertex.
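One pass of the procedure above can be sketched as follows. This is a deliberately naive illustration, not a tuned implementation: it recomputes the external-minus-internal value of every unmarked vertex at each step instead of updating costs incrementally, so it does not attempt to meet the running-time bound quoted above. The name `kl_pass`, the 0/1 labels, and the example graph are assumptions made for the example.

```python
def kl_pass(adj, part):
    """One Kernighan-Lin pass over a two-way partition.

    adj:  dict mapping vertex -> set of neighbors
    part: dict mapping vertex -> 0 or 1
    Returns (new_part, total_gain).
    """
    label = dict(part)       # tentative labels, updated "as if" pairs were swapped
    unmarked = set(adj)
    swaps, gains = [], []

    def d_value(v):          # external cost minus internal cost of v
        ext = sum(1 for u in adj[v] if label[u] != label[v])
        return ext - (len(adj[v]) - ext)

    while True:
        left = [v for v in unmarked if label[v] == 0]
        right = [v for v in unmarked if label[v] == 1]
        if not left or not right:
            break
        d = {v: d_value(v) for v in left + right}
        # Best unmarked pair: gain = D(a) + D(b) - 2 * c_ab.
        g, a, b = max((d[a] + d[b] - 2 * (b in adj[a]), a, b)
                      for a in left for b in right)
        label[a], label[b] = 1, 0
        unmarked -= {a, b}
        swaps.append((a, b))
        gains.append(g)

    # Keep the prefix of swaps with the largest cumulative gain, if it is positive.
    best_j, best_gain, running = 0, 0, 0
    for j, g in enumerate(gains, start=1):
        running += g
        if running > best_gain:
            best_j, best_gain = j, running
    new_part = dict(part)
    for a, b in swaps[:best_j]:
        new_part[a], new_part[b] = 1, 0
    return new_part, best_gain


# Assumed example: two triangles joined by edge (2, 3), with 2 and 3 misplaced.
adj = {0: {1, 2}, 1: {0, 2}, 2: {0, 1, 3}, 3: {2, 4, 5}, 4: {3, 5}, 5: {3, 4}}
part = {0: 0, 1: 0, 3: 0, 2: 1, 4: 1, 5: 1}
new_part, total_gain = kl_pass(adj, part)
```

On this input the pass records gains 4, -4, 0, keeps only the first swap, and returns the partition that separates the two triangles. Calling `kl_pass` again until the returned gain is zero corresponds to the "repeat until there is no more Gain" step.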
Graph Coarsening
Graph coarsening is a different approach to graph partitioning; it is a form of divide and conquer.
The goal of graph coarsening: take a graph and coarsen it so it looks similar to the original graph
but with fewer nodes. Do this until you reach a graph that is small enough to partition easily.
How to Coarsen a Graph:
1. Identify a subset of the vertices to collapse or merge
2. Replace the subset with a single supervertex.
3. Assign a weight to the supervertex that is equal to the number of vertices it replaced.
4. Assign a weight to the edges.
Example:
Initial graph to final result:
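A single coarsening step can be sketched as below, assuming the subsets to merge are pairs of vertices. The edge-weight rule used here (a coarse edge's weight is the sum of the weights of the fine edges it replaces) is an assumption, since the notes only say that edges receive a weight; `coarsen` and the four-cycle example are likewise hypothetical.

```python
def coarsen(edges, vweight, pairs):
    """Contract each pair of vertices into one supervertex.

    edges:   dict mapping (u, v) with u < v -> edge weight
    vweight: dict mapping vertex -> vertex weight
    pairs:   list of disjoint (u, v) pairs to merge; u names the supervertex
    Returns (coarse_edges, coarse_vweight).
    """
    rep = {v: v for v in vweight}
    for u, v in pairs:
        rep[v] = u

    # Supervertex weight = total weight of the vertices it replaced.
    coarse_vweight = {}
    for v, w in vweight.items():
        coarse_vweight[rep[v]] = coarse_vweight.get(rep[v], 0) + w

    # Assumed rule: parallel edges between supervertices add up their weights.
    coarse_edges = {}
    for (u, v), w in edges.items():
        ru, rv = rep[u], rep[v]
        if ru == rv:
            continue  # the edge disappeared inside a supervertex
        key = (min(ru, rv), max(ru, rv))
        coarse_edges[key] = coarse_edges.get(key, 0) + w
    return coarse_edges, coarse_vweight


# Assumed example: the 4-cycle 0 - 1 - 2 - 3 - 0 with unit weights.
edges = {(0, 1): 1, (1, 2): 1, (2, 3): 1, (0, 3): 1}
vweight = {0: 1, 1: 1, 2: 1, 3: 1}
coarse_edges, coarse_vweight = coarsen(edges, vweight, [(0, 1), (2, 3)])
```

The four-cycle collapses to two supervertices of weight 2 joined by one edge of weight 2, so the coarse graph still records how many fine vertices and edges each element stands for.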
Maximal and Maximum Matchings
To coarsen a graph effectively, a scheme is necessary to determine which vertices to combine.
One idea: compute a matching
Matching: a matching of a graph G = (V, E) is a subset $\hat{E} \subseteq E$ of edges with no common endpoints.
In the example below, the three edges are a matching because they don't share any endpoints.
It is also a maximal matching because no more edges can be added to it.
A maximum matching is one that has the largest possible number of edges. This graph has a matching
that has more than three edges in it. The green matching is a maximum matching.
A Fact about Maximal Matchings
Given a graph with n vertices, you coarsen it k times so that it has s vertices.
How large must k be in terms of n and s?
Answer: $k \ge \log_2(n/s)$
Why:
1. Imagine that there is a maximal matching that matches every vertex (meaning every vertex is
part of a matched edge). This will result in a coarsened graph that has half the number of
vertices.
2. The k-th version of the graph must have at least half as many vertices as the previous level.
3. Every level has to follow this pattern.
4. The final graph must have at least $n/2^k$ vertices.
5. This means $s \ge n/2^k$, so $k \ge \log_2(n/s)$.
Can you think of a worst-case graph to coarsen?
Computing a Maximal Matching
At each stage of this scheme, choose any unmatched vertex at random.
1. Pick any vertex
2. Match it to any of its unmatched neighbors.
3. The neighbor you want to choose is the one with the highest edge weight. The reason to
do this is that it will decrease the overall edge weight in the next level of the graph.
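The scheme above can be sketched as a greedy loop. Visiting the vertices in a shuffled order and skipping those already matched is used here as a stand-in for "choose any unmatched vertex at random"; the name `heavy_edge_matching`, the `seed` parameter, and the weighted path example are assumptions for illustration.

```python
import random


def heavy_edge_matching(adj, seed=0):
    """Greedy maximal matching that prefers the heaviest available edge.

    adj: dict mapping vertex -> dict of neighbor -> edge weight
    Returns a list of matched pairs (u, v).
    """
    rng = random.Random(seed)
    order = list(adj)
    rng.shuffle(order)            # random visiting order
    matched = set()
    matching = []
    for u in order:
        if u in matched:
            continue
        candidates = [(w, v) for v, w in adj[u].items() if v not in matched]
        if not candidates:
            continue              # every neighbor is already matched
        _, v = max(candidates)    # unmatched neighbor with the highest edge weight
        matched.update((u, v))
        matching.append((u, v))
    return matching


# Assumed example: path 0 - 1 - 2 - 3 with edge weights 1, 5, 1.
adj = {0: {1: 1}, 1: {0: 1, 2: 5}, 2: {1: 5, 3: 1}, 3: {2: 1}}
matching = heavy_edge_matching(adj)
```

The result is always maximal, because a vertex is left unmatched only when all of its neighbors are already matched, but it is not necessarily a maximum matching: depending on the visiting order, the path above yields either one heavy edge or two light ones.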
Fine to Coarse and Back Again Quiz
Projected Separator: the vertices and edges that will be combined in the next level of the graph.
There can be a situation where the next-level graph maps ambiguously to the previous level.
Partition Refinement
True or false: a minimum balanced edge cut in a coarsened graph minimizes the balanced edge cut in the
next finer graph. This is false. You need to remember that coarsening is based on a heuristic.
What if the coarsened graph had been based on a maximum rather than a maximal matching?
Spectral Partitioning Part 1: The Graph Laplacian
Consider an unweighted directed graph, G.
When represented by an incidence matrix C, each row is an edge and each column is a vertex.
Put a +1 at the source, and a -1 at the destination.
Graph Laplacian: $L(G) \equiv C^T C$
L(G) will look like this:
The diagonals are tallying the number of incident edges on each vertex. They count the degree
of each vertex. (D)
The off-diagonals will indicate the presence of an edge. (W)
$C^T C = D - W$
The Graph Laplacian Example:
L(G) should be symmetric about the diagonal and every row sums to 0.
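The construction above can be checked on a tiny graph. The sketch below builds the incidence matrix C as a list of rows and forms $C^T C$ with plain Python lists; the function names and the three-vertex example are illustrative assumptions.

```python
def incidence_matrix(n, edges):
    """One row per edge, one column per vertex: +1 at the source, -1 at the destination."""
    rows = []
    for src, dst in edges:
        row = [0] * n
        row[src] = 1
        row[dst] = -1
        rows.append(row)
    return rows


def laplacian(n, edges):
    """L = C^T C, computed entry by entry."""
    C = incidence_matrix(n, edges)
    return [[sum(r[i] * r[j] for r in C) for j in range(n)] for i in range(n)]


# Assumed example: edges 0 -> 1 and 1 -> 2.
L = laplacian(3, [(0, 1), (1, 2)])
```

For this example L is [[1, -1, 0], [-1, 2, -1], [0, -1, 1]]: the diagonal holds the degrees, each edge contributes a -1 off the diagonal, and reversing an edge's direction leaves L unchanged.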
Spectral Partitioning, Part 3: Algebraic Connectivity
Handy Facts:
1. L(G) is symmetric
2. L(G) has real-valued, nonnegative eigenvalues and real-valued, orthogonal
eigenvectors.
Multiplying L(G) by its eigenvector will give a scaled eigenvector. The scaling factor is
the eigenvalue.
Orthogonal: the dot product of any pair of (normalized) eigenvectors will be 0 if they are different
and 1 if they are the same.
3. G has k connected components if and only if the k smallest eigenvalues are identical to 0.
4. The number of cut edges in a partition is $\frac{1}{4} x^T L(G) x$, where $x_i$ is +1 or -1 according to the side of the partition that vertex i is on. So if you want to minimize edge
cuts, minimize the product.
Counting Edge Cuts
Summing the degrees of all the vertices is the same as counting the number of edges, twice.
The first sum is the number of edges wholly contained in V+; it counts them twice and is
negative. (2 × # of edges in V+)
The total is basically 4 times the number of cut edges.
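The factor of 4 can be verified numerically. For a vector x with entries +1 or -1, $x^T L(G) x = \sum_{(i,j) \in E} (x_i - x_j)^2$, and each cut edge contributes $(\pm 2)^2 = 4$ while each uncut edge contributes 0. The sketch below assumes a simple undirected graph given as an edge list; the function names and example are illustrative.

```python
def laplacian(n, edges):
    """L = D - W for a simple undirected graph on vertices 0..n-1."""
    L = [[0] * n for _ in range(n)]
    for u, v in edges:
        L[u][u] += 1
        L[v][v] += 1
        L[u][v] -= 1
        L[v][u] -= 1
    return L


def quadratic_form(L, x):
    """Compute x^T L x."""
    n = len(L)
    return sum(x[i] * L[i][j] * x[j] for i in range(n) for j in range(n))


# Assumed example: two triangles joined by edge (2, 3), split along that edge.
edges = [(0, 1), (0, 2), (1, 2), (3, 4), (3, 5), (4, 5), (2, 3)]
x = [1, 1, 1, -1, -1, -1]
cut = sum(1 for u, v in edges if x[u] != x[v])
value = quadratic_form(laplacian(6, edges), x)
```

Here one edge is cut and the quadratic form evaluates to 4.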
Spectral Partitioning, Part 4: Putting It All Together
Start with a graph G
Construct its Laplacian L(G)
Now suppose there is a partition of G, $V = V_+ \cup V_-$
The vertices are separated into the two sections
Each vertex is assigned to one partition or the other
The cut edges can be found, and the number of them can even be minimized
The partition should obey the following rules:
Every vertex must be in one partition or the other
Assign +1 or -1 to each vertex
The partitions must have the same number of vertices
The problem is NP-complete.
Therefore, as a workaround,
Remove the requirement that each vertex must be assigned +1 or -1
Now we can say:
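The minimum value of $x^T L(G) x$ is proportional to the second-smallest eigenvalue of L(G), and it is attained when x is proportional to the corresponding eigenvector.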
The Algorithm for Spectral Partitioning
1. Create L(G)
2. Compute the second-smallest eigenpair of L(G)
3. Determine the partition using the signs of the entries of the eigenvector
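The three steps above can be sketched in plain Python. The eigenvector computation here is a swapped-in technique chosen to keep the example dependency-free: shifted power iteration on $cI - L$ with the constant vector projected out, where $c$ is twice the maximum degree (an upper bound on the largest eigenvalue). The notes do not prescribe an eigensolver, and a library routine would normally be used instead. All function names, the iteration count, and the example graph are assumptions.

```python
def laplacian(n, edges):
    """Step 1: L = D - W for a simple undirected graph on vertices 0..n-1."""
    L = [[0] * n for _ in range(n)]
    for u, v in edges:
        L[u][u] += 1
        L[v][v] += 1
        L[u][v] -= 1
        L[v][u] -= 1
    return L


def second_eigenvector(L, iters=500):
    """Step 2 (sketch): eigenvector of the second-smallest eigenvalue of L.

    Assumes a connected graph. Power iteration on (shift*I - L), restricted to
    vectors orthogonal to the all-ones eigenvector (eigenvalue 0).
    """
    n = len(L)
    shift = 2 * max(L[i][i] for i in range(n))
    x = [float(i) for i in range(n)]      # arbitrary non-constant starting vector
    for _ in range(iters):
        mean = sum(x) / n
        x = [xi - mean for xi in x]       # project out the constant vector
        y = [shift * x[i] - sum(L[i][j] * x[j] for j in range(n))
             for i in range(n)]
        norm = sum(v * v for v in y) ** 0.5
        x = [v / norm for v in y]
    return x


def spectral_bisect(n, edges):
    """Step 3: split the vertices by the sign of their eigenvector entry."""
    x = second_eigenvector(laplacian(n, edges))
    plus = {i for i in range(n) if x[i] >= 0}
    return plus, set(range(n)) - plus


# Assumed example: two triangles joined by the single edge (2, 3).
edges = [(0, 1), (0, 2), (1, 2), (3, 4), (3, 5), (4, 5), (2, 3)]
side_a, side_b = spectral_bisect(6, edges)
```

On this graph the sign pattern separates the two triangles, cutting only the bridging edge. Which triangle ends up on the "plus" side depends on the sign of the computed eigenvector, which is arbitrary.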