Structural and Multidisciplinary Optimization (2020) 62:2211–2228

https://doi.org/10.1007/s00158-020-02629-w

EDUCATIONAL PAPER

A new generation 99 line Matlab code for compliance topology optimization and its extension to 3D

Federico Ferrari¹ · Ole Sigmund¹

Received: 18 February 2020 / Revised: 2 May 2020 / Accepted: 10 May 2020 / Published online: 24 August 2020
© Springer-Verlag GmbH Germany, part of Springer Nature 2020

Abstract
Compact and efficient Matlab implementations of compliance topology optimization (TO) for 2D and 3D continua are given,
consisting of 99 and 125 lines respectively. On discretizations ranging from $3 \cdot 10^4$ to $4.8 \cdot 10^5$ elements, the 2D version,
named top99neo, shows speedups from 2.55 to 5.5 times compared to the well-known top88 code of Andreassen et al.
(Struct Multidiscip Optim 43(1):1–16, 2011). The 3D version, named top3D125, is the most compact and efficient Matlab
implementation for 3D TO to date, showing a speedup of 1.9 times compared to the code of Amir et al. (Struct Multidiscip
Optim 49(5):815–829, 2014), on a discretization with $2.2 \cdot 10^5$ elements. For both codes, improvements are due to much
more efficient procedures for the assembly and implementation of filters and shortcuts in the design update step. The use of
an acceleration strategy, yielding major cuts in the overall computational time, is also discussed, stressing its easy integration
within the basic codes.

Keywords Topology optimization · Matlab · Computational efficiency · Acceleration methods

Responsible Editor: Palaniappan Ramu

Federico Ferrari · Ole Sigmund
¹ Department of Mechanical Engineering, Technical University of Denmark, Nils Koppels Allé 404, 2800 Kongens Lyngby, Denmark

1 Introduction

The celebrated top99 Matlab code developed by Sigmund (2001) has certainly promoted the spreading of topology optimization among engineers and researchers, and the speedups carried by its heir, top88 (Andreassen et al. 2011), substantially increased the scale of examples that can be solved on a laptop.

On these footprints, several other codes have followed, involving extension to 3D problems (Liu and Tovar 2014; Amir et al. 2014), material design (Andreassen and Andreasen 2014; Xia and Breitkopf 2015), level-set parametrizations (Wang 2007; Challis 2010), use of advanced discretization techniques (Talischi et al. 2012; Suresh 2010; Sanders et al. 2018), or integration of TO within some finite element frameworks.

With the evolution of TO and its application to more and more challenging problems, implementations in top88 may have become outdated. Also, Matlab has improved in the last decade. Hence, we believe it is time to present a new "exemplary" code collecting shortcuts and speedups, allowing to tackle medium-/large-scale TO problems efficiently on a laptop. Preconditioned iterative solvers, applied for example in Amir and Sigmund (2011), Amir et al. (2014), Ferrari et al. (2018) and Ferrari and Sigmund (2020), allow the solution of the state equation with nearly optimal efficiency (Saad 1992). Thus, the computational bottleneck has been shifted on other operations, such as the matrix assembly or the repeated application of filters. Efficiency improvements for these operations were touched upon by Andreassen et al. (2011), however, without giving a quantitative analysis about time and memory savings.

Here, we provide compact Matlab codes for minimum compliance topology optimization of 2D and 3D continua which show a substantial speedup compared to the top88 code. We include several extensions by default, such as specification of passive domains, a volume-preserving density projection (Guest et al. 2004; Wang et al. 2011) and continuation strategies for the penalization and projection parameters in a very compact, yet sharp, implementation.
Coincidentally, the new 2D TO implementation consists of 99 lines of code and is thus named top99neo. We also show how to include an acceleration technique recently investigated for TO by Li et al. (2020), with a few extra lines of code and potentially carrying major speedups. Changes needed for the extension to 3D problems are remarkably small, making the corresponding code (top3D125) the most compact and efficient Matlab implementation for 3D compliance TO to date.

Our primary goal is not to present innovative new research. Rather, we aim at sharing some shortcuts and speedups that we have noticed through time, to the benefit of the research community. Improvements introduced by the present codes will also be very useful on more advanced problems, such as buckling optimization, which will be dealt with in an upcoming work.

The paper is organized as follows. In Section 2, we recall the setting of TO for minimum compliance. Section 3 is devoted to describing the overall structure of the 2D code, focusing on differences with respect to top88. Sections 3.1–3.5 give insights about the main speedups and show performance improvements with respect to top88. The very few changes needed for the 3D code are listed in Section 4, where an example is presented and the efficiency is compared to the previous code from Amir et al. (2014). Some final remarks are given in Section 5. Appendix A gives some details about the redesign step that are useful for better understanding a method proposed in Section 3.2, and the Matlab codes are listed in Appendices B and C.

2 Problem formulation and solution scheme

We consider a 2D/3D discretization $\Omega_h$ consisting of $m$ equi-sized quadrilateral elements $\Omega_e$. Hereafter we denote by $n$ the global number of Degrees of Freedom (DOFs) in the discretization and by $d$ the number of (local) DOFs of each element.

Let $\mathbf{x} = \{x_e\}_{e=1:m} \in [0,1]^m$ be partitioned between $\mathbf{x}_A$ and $\mathbf{x}_P$, the sets of active (design) variables and passive elements, respectively. The latter may be further split in the sets of passive solid $P_1$ ($x_e = 1$) and void $P_0$ ($x_e = 0$) elements, of cardinalities $m_{P_1}$ and $m_{P_0}$, respectively (see Fig. 1a).

The set of physical variables $\hat{\mathbf{x}}_A = H(\tilde{\mathbf{x}})$ is defined by the relaxed Heaviside projection (Wang et al. 2011)

$$H(\tilde{x}_e, \eta, \beta) = \frac{\tanh(\beta\eta) + \tanh(\beta(\tilde{x}_e - \eta))}{\tanh(\beta\eta) + \tanh(\beta(1 - \eta))} \qquad (1)$$

with threshold $\eta$ and sharpness factor $\beta$, where $\tilde{\mathbf{x}} = \mathcal{H}\mathbf{x}$ is the filtered field, obtained by the linear operator

$$\mathcal{H}(x_e, r_{\min}) := \frac{\sum_{i \in N_e} h_{e,i}\, x_i}{\sum_{i \in N_e} h_{e,i}} \qquad (2)$$

where $N_e = \{i \mid \mathrm{dist}(\Omega_i, \Omega_e) \le r_{\min}\}$ and $h_{e,i} = \max(0, r_{\min} - \mathrm{dist}(\Omega_i, \Omega_e))$.

Given a load vector $\mathbf{f} \in \mathbb{R}^n$ and the volume fraction $f \in (0, 1)$, we consider the optimization problem

$$\begin{cases} \min\limits_{\mathbf{x}_A \in [0,1]^{m_A}} c(\hat{\mathbf{x}}) \\ \text{s.t.}\quad V(\hat{\mathbf{x}}) \le f\,|\Omega_h| \end{cases} \qquad (3)$$

for the minimization of compliance $c(\hat{\mathbf{x}}) = \mathbf{u}^T \mathbf{f}$ with an upper bound on the overall volume

$$V(\hat{\mathbf{x}}) = \sum_{e=1}^{m} |\Omega_e|\, \hat{x}_e = \frac{1}{m}\Big[ m_{P_1} + \sum_{e \in A} \hat{x}_e \Big] \le f \qquad (4)$$

Problem (3) is solved with a nested iterative loop. At each iteration, the displacement $\mathbf{u}$ is computed by solving the equilibrium problem

$$K\mathbf{u} = \mathbf{f} \qquad (5)$$

Fig. 1 Definition of the active $A$, passive solid $P_1$, and void $P_0$ domains (a) and illustration of the connectivity matrix $C$ for a simple discretization (b). The set of indices $I$, here shown for the element $e = 1$, is used by the assembly operation. The symmetric repetitions in $I$ are highlighted, and their elimination gives the reduced set $I_r$ (see Section 3.1)
where the stiffness matrix $K = K(\hat{\mathbf{x}})$ depends on the physical variables through a SIMP interpolation (Bendsøe and Sigmund 1999) of the Young modulus

$$E(\hat{x}_e) = E_{\min} + \hat{x}_e^{\,p}\,(E_0 - E_{\min}) \qquad (6)$$

with $E_0$ and $E_{\min}$ the moduli of solid and void ($E_{\min} \ll E_0$). The gradients of compliance and structural volume with respect to $\hat{\mathbf{x}}$ read ($\chi_e = 1$ if $e \in A$ and 0 otherwise, and $\mathbf{1}_m$ is the identity vector of dimension $m$)

$$\nabla_{\hat{\mathbf{x}}} c(\hat{\mathbf{x}}) = -\mathbf{u}^T \nabla_{\hat{\mathbf{x}}} K\, \mathbf{u}\, \chi_A, \qquad \nabla_{\hat{\mathbf{x}}} V(\hat{\mathbf{x}}) = \frac{1}{m}\, \mathbf{1}_m\, \chi_A \qquad (7)$$

and the sensitivities with respect to the design variables are recovered as

$$\begin{aligned} \nabla_{\mathbf{x}} c(\mathbf{x}) &= \nabla_{\tilde{\mathbf{x}}} H \odot \big(\mathcal{H}^T \nabla_{\hat{\mathbf{x}}} c(\hat{\mathbf{x}})\big) \\ \nabla_{\mathbf{x}} V(\mathbf{x}) &= \nabla_{\tilde{\mathbf{x}}} H \odot \big(\mathcal{H}^T \nabla_{\hat{\mathbf{x}}} V(\hat{\mathbf{x}})\big) \end{aligned} \qquad (8)$$

where $\odot$ represents the element-wise multiplication and

$$\nabla_{\tilde{\mathbf{x}}} H = \beta\, \frac{1 - \tanh(\beta(\tilde{\mathbf{x}} - \eta))^2}{\tanh(\beta\eta) + \tanh(\beta(1 - \eta))} \qquad (9)$$

The active design variables $e \in A$ are then updated by the optimality criterion rule (Sigmund 2001)

$$x_{k+1,e} = U(x_{k,e}) = \begin{cases} \delta_- & \text{if } F_{k,e} < \delta_- \\ \delta_+ & \text{if } F_{k,e} > \delta_+ \\ F_{k,e} & \text{otherwise} \end{cases} \qquad (10)$$

where $\delta_- = \max(0, x_{k,e} - \mu)$, $\delta_+ = \min(1, x_{k,e} + \mu)$, for the fixed move limit $\mu \in (0, 1)$ and

$$F_{k,e} = x_{k,e} \left( -\frac{\partial_e c_k}{\tilde{\lambda}_k\, \partial_e V_k} \right)^{1/2} \qquad (11)$$

depends on the element sensitivities.

In (11), $\tilde{\lambda}_k$ is the approximation to the current Lagrange multiplier $\lambda_k^*$ associated with the volume constraint. This is obtained by imposing $V(\hat{\mathbf{x}}_{k+1}(\tilde{\lambda})) - f|\Omega_h| \approx 0$, e.g., by bisection on an interval $\Lambda_k^{(0)} \supset \lambda_k^*$.

3 Matlab implementation and speedups

The Matlab routine for 2D problems (see Appendix B) is called with the following arguments, where nelx and nely define the physical dimensions and the mesh resolution, volfrac is the allowed volume fraction on the overall domain (i.e., $A \cup P$), penal the penalization used in (6), and rmin the filter radius for (2). The parameter ft is used to select the filtering scheme: density filtering alone if ft=1, whereas ft=2 or ft=3 also allows the projection (1), with eta and beta as parameters. ftBC specifies the filter boundary conditions ('N' for zero-Neumann or 'D' for zero-Dirichlet), move is the move limit used in the OC update and maxit sets the maximum number of redesign steps.

The routine is organized in a set of operations which are performed only once and the loop for the TO iterative redesign. The initializing operations are grouped as follows

PRE.1) MATERIAL AND CONTINUATION PARAMETERS
PRE.2) DISCRETIZATION FEATURES
PRE.3) LOADS, SUPPORTS AND PASSIVE DOMAINS
PRE.4) DEFINE IMPLICIT FUNCTIONS
PRE.5) PREPARE FILTER
PRE.6) ALLOCATE AND INITIALIZE OTHER PARAMETERS

and below we give details only about parameters and instructions not found in the top88 code.

To apply continuation on the generic parameter "par," a data structure is defined

parCont = {istart, maxPar, isteps, deltaPar};

such that the continuation starts when loop=istart and the parameter is increased by deltaPar each isteps, up to the value maxPar. This is implemented in Lines 6 and 7 for the penalization parameter $p$ and the projection factor $\beta$, respectively. The update is then performed by a single instruction (see Line 92), making use of compact logical operations. Continuation can be switched off, e.g., by setting maxPar<=par, or istart>=maxit.

The blocks defining the discretization (PRE.2)) contain some changes compared to top88. The number of elements (nEl), DOFs (nDof), and the set of node numbers (nodeNrs) are defined explicitly, to ease and shorten some following instructions. The setup of indices iK and jK, used for the sparse assembly, is performed in Lines 15–21 and follows the concept detailed in Section 3.1. The coefficients of the lower diagonal part of the elemental stiffness matrix are defined in vectorized form, such that Ke $= \mathcal{V}(K_e^{(s)})$ (see Lines 22–26). Ke is used for the assembly strategy described in Section 3.1. However, in Lines 27–29, we also recover the complete elemental matrix (Ke0), used to perform the double product $\mathbf{u}_e^T K_e \mathbf{u}_e$ when computing the compliance sensitivity (7). Although this could also be written in terms of the matrix $K_e^{(s)}$ only, this option would increase the number of matrix/vector multiplications.

In PRE.3), the user can specify the set of restrained (fixed) and loaded (lcDof) DOFs and passive regions ($P_1 \leftrightarrow$ pasS and $P_0 \leftrightarrow$ pasV) for the given configuration. Supports and loads are defined as in the top88 code,
whereas passive domains may be specified targeting a set of columns and rows from the array elNrs. Independently of the particular example, Lines 34–36 define the vector of applied loads, the set of free DOFs, and the sets of active $A \leftrightarrow$ act design variables.

In order to make the code more compact and readable, operations which are repeatedly performed within the TO optimization loop are defined through inline functions in PRE.4) (Lines 38–43). The filter operator is built in PRE.5) making use of the built-in Matlab function imfilter, which represents a much more efficient alternative to the explicit construction of the neighboring array. A similar approach was already outlined by Andreassen et al. (2011), pointing to the Matlab function conv2, which is however not completely equivalent to the original operator, as it only allows zero-Dirichlet boundary conditions for the convolution operator. Here, we choose imfilter, which is essentially as efficient as conv2, but gives the flexibility to specify zero-Dirichlet (default option), or zero-Neumann boundary conditions.

Some final initializations and allocations are performed in PRE.6). The design variables are initialized with the modified volume fraction, accounting for the passive domains (Lines 52–53) and the constant volume sensitivity (7) is computed in Line 51.

Within the redesign loop, the following five blocks of operations are repeatedly performed

RL.1) COMPUTE PHYSICAL DENSITY FIELD
RL.2) SETUP AND SOLVE EQUILIBRIUM EQUATIONS
RL.3) COMPUTE SENSITIVITIES
RL.4) UPDATE DESIGN VARIABLES AND APPLY CONTINUATION
RL.5) PRINT CURRENT RESULTS AND PLOT DESIGN

In block RL.1), the physical field is obtained, applying the density filter and, if selected, also the projection. If ft=3, the special value of the threshold eta giving a volume-preserving projection is computed, as discussed in Section 3.2.

The stiffness interpolation and its derivative (sK, dsK) are defined, and the stiffness matrix is assembled (see Lines 73–76). Ideally, one could also get rid of Lines 73–74 and directly define sK in Line 75 and dsK within Line 79. However, we decide to keep these operations apart, enhancing the readability of the code and easing the specification of different interpolation schemes. Equation (5) is solved on Line 77 using the Matlab function decomposition, which can work with only half of the stiffness matrix (see Section 3.1). The sensitivity of compliance is computed, and the backfiltering operations (8) are performed in RL.3).

The update (10), with the nested application of the bisection process for finding $\tilde{\lambda}_k$, is implemented in RL.4) (Lines 86–91), and we remark that lm represents $\sqrt{\lambda}$.

Some information about the process is printed and the current design is plotted in RL.5) (Lines 94–97). On small discretizations, repeated plotting operations absorb a significant fraction of the CPU time (e.g., 15% for $m = 4800$). Therefore, one might just plot the final design, moving Lines 96–97 outside the redesign loop.

The tests in the following have been run on a laptop equipped with an Intel(R) Core(TM) CPU, 15 GB of RAM, and Matlab 2018b running in serial mode under Ubuntu 18.04 (but a similar performance is expected in Windows setups). We will often refer to the half MBB beam example (see Fig. 2) for numerical testing. Unless stated otherwise, we choose $\Omega_h = 300 \times 100$, $f = 0.5$, and $r_{\min} = 8.75$ (Sigmund 2007). The load, having total magnitude $|q| = 1$, is applied to the first node. No passive domains are introduced for this example; therefore, pasS=[];, pasV=[]; and we set $E_1 = 1$, $E_0 = 10^{-9}$, and $\nu = 0.3$ in all the tests.

Fig. 2 Geometrical setting for the MBB example

3.1 Speedup of the assembly operation

In top88, the assembly of the global stiffness matrix is performed by the built-in Matlab function sparse, i.e., K=sparse(iK,jK,sK), where sK $\in \mathbb{R}^{m \cdot d^2 \times 1}$ collects the coefficients of all the elemental matrices in a column-wise vectorized form (i.e., $\mathcal{V}(K_e)$) and iK and jK are the sets of indices mapping each sK(i) to the global location K(iK(i),jK(i)). These two sets are set up through the operations

$$\mathrm{iK} = \mathcal{V}\big[(C \otimes \mathbf{1}_d)^T\big], \qquad \mathrm{jK} = \mathcal{V}\big[(C \otimes \mathbf{1}_d^T)^T\big] \qquad (12)$$

where $C_{[m \times d]}$ is the connectivity matrix and "$\otimes$" is the Kronecker product (Horn and Johnson 2012). The size of the array $I = [\mathrm{iK}, \mathrm{jK}] \in \mathbb{N}^{m \cdot d^2 \times 2}$ grows very quickly with the number of elements $m$, especially for 3D discretizations
(see Table 1), and even though its elements are integers, the sparse function requires them to be specified as double precision numbers. The corresponding memory burden slows down the assembly process and restricts the size of problems workable on a laptop.

Table 1 Number of entries in the array $I$ and corresponding memory requirement for the 2D and 3D test discretizations. White background refers to the F strategy with coefficients specified as double, cyan background to the H strategy, and light green to the H strategy and elements specified as int32. The H strategy cuts $|I|$ and memory by ≈ 44% in 2D and ≈ 48% in 3D. Then, specifying the indexes as int32 cuts memory by a further 50%

The efficiency of the assembly can be substantially improved by
1. Acknowledging the symmetry of both $K_e$ and $K$
2. Using an assembly routine working with iK and jK specified as integers

To understand how to take advantage of the symmetry of matrices, we refer to Fig. 1b and to the connectivity matrix $C$. Each coefficient $C_{ej} \in \mathbb{N}$ addresses the global DOF targeted by the $j$th local DOF of element $e$. Therefore, (12) explicitly reads

$$\begin{aligned} \mathrm{iK}_e &= \{\underbrace{\mathbf{c}_e, \mathbf{c}_e, \ldots, \mathbf{c}_e}_{d\ \text{times}}\} \\ \mathrm{jK}_e &= \{\underbrace{c_{e1}, \ldots, c_{e1}}_{d\ \text{times}}, \underbrace{c_{e2}, \ldots, c_{e2}}_{d\ \text{times}}, \ldots, \underbrace{c_{ed}, \ldots, c_{ed}}_{d\ \text{times}}\} \end{aligned} \qquad (13)$$

where $\mathbf{c}_e = \{c_{e1}, c_{e2}, \ldots, c_{ed}\}$ is the row corresponding to element $e$.

If we only consider the coefficients of the (lower) symmetric part of the elemental matrix $K_e^{(s)}$ and their locations into the global one $K^{(s)}$, the set of indices can be reduced to

$$\begin{aligned} \mathrm{iK}_e &= \{c_{e1}, \ldots, c_{ed},\ c_{e2}, \ldots, c_{ed},\ c_{e3}, \ldots, c_{ed},\ \ldots,\ c_{ed}\} \\ \mathrm{jK}_e &= \{\underbrace{c_{e1}, \ldots, c_{e1}}_{d\ \text{times}}, \underbrace{c_{e2}, \ldots, c_{e2}}_{(d-1)\ \text{times}}, \underbrace{c_{e3}, \ldots, c_{e3}}_{(d-2)\ \text{times}}, \ldots, c_{ed}\} \end{aligned} \qquad (14)$$

and the overall indexing array becomes $I_r = [\mathrm{iK}, \mathrm{jK}] \in \mathbb{N}^{\tilde{d} \cdot m \times 2}$, where $\tilde{d} = \sum_{j=1}^{d} \sum_{i \le j} 1 = d(d+1)/2$. The entries of the indexing array and the memory usage are reduced by approx. 45% (see Table 1).

The set of indices (14) is constructed by the instructions in Lines 15–21, which can be adapted to any isoparametric 2D/3D element just by changing accordingly the number $d$ of elemental DOFs. In the attached scripts, based on 4-noded bilinear Q4 and 8-noded trilinear H8 elements, we set d=8 and d=24, respectively. The last instruction sorts the indices as iKr(i) > jKr(i), such that $K^{(s)}$ contains only sub-diagonal terms.

The syntax K=sparse(iK,jK,sK) now returns the lower triangular matrix $K^{(s)}$ and we remark that the full operator can be recovered by

$$K = K^{(s)} + (K^{(s)})^T - \mathrm{diag}[K^{(s)}] \qquad (15)$$

which costs as much as the averaging operation $\frac{1}{2}(K + K^T)$, performed in top88 to get rid of roundoff errors. However, the Matlab built-in Cholesky solver and the corresponding decomposition routine can use just $K^{(s)}$, if called with the option 'lower'.
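The reduction from (13) to (14) and the recovery (15) can be checked on a small mesh with the sketch below. It is an illustration under stated assumptions, not the listing from Appendix B: the 3-by-2 mesh, the variable names (`cMat`, `sI`, `sII`, `Ir`, `Kf`, `Ks`) and the placeholder element matrix built from `magic` are hypothetical, and the node numbering follows the column-wise convention of top88.

```matlab
% Small Q4 mesh with 2 DOFs per node (d = 8), nodes numbered column-wise
nelx = 3; nely = 2; d = 8;
nEl  = nelx*nely;  nDof = 2*(nelx+1)*(nely+1);
nodeNrs = reshape(1:(nelx+1)*(nely+1), nely+1, nelx+1);
cVec = reshape(2*nodeNrs(1:end-1, 1:end-1) + 1, nEl, 1);
cMat = repmat(cVec, 1, d) + ...
       repmat([0 1 2*nely+[2 3 0 1] -2 -1], nEl, 1);   % connectivity matrix C

Ke0 = magic(d) + magic(d)';     % placeholder symmetric matrix, not a real stiffness

% Full index sets, Eqs. (12)-(13): d^2 entries per element
iKf = reshape(kron(cMat, ones(d, 1))', [], 1);
jKf = reshape(kron(cMat, ones(1, d))', [], 1);
Kf  = sparse(iKf, jKf, repmat(Ke0(:), nEl, 1), nDof, nDof);

% Reduced index sets, Eq. (14): d(d+1)/2 entries per element
[sI, sII] = deal([]);
for j = 1:d
    sI  = [sI,  j:d];                       % local rows j..d
    sII = [sII, repmat(j, 1, d-j+1)];       % local column j, repeated
end
iK = cMat(:, sI)';  jK = cMat(:, sII)';
Ir = sort([iK(:), jK(:)], 2, 'descend');    % row index >= column index
Ke = Ke0(tril(true(d)));                    % lower triangle, column-wise
Ks = sparse(Ir(:, 1), Ir(:, 2), repmat(Ke, nEl, 1), nDof, nDof);

% Recover the full operator, Eq. (15)
K = Ks + Ks' - diag(diag(Ks));
```

The built-in `sparse` is used here, so the indices stay in double precision; passing them as int32 requires one of the alternative assembly routines discussed in the text.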
Point 2 gives the most dramatic improvement, and can be accomplished by using routines developed by independent researchers. The sparse2 function, from Suite Sparse (Davis 2019), was already pointed out by Andreassen et al. (2011) as a better alternative to the built-in Matlab sparse; however, no quantitative comparisons were performed. According to the CHOLMOD reference manual (Davis 2009), sparse2 works exactly as sparse, but allows the indices iK and jK to be specified as integers (accomplished by defining this type for the connectivity matrix, see Lines 11 and 13).

Here we suggest the "fsparse" routine, developed by Engblom and Lukarski (2016). Besides working with integers iK and jK, the function enhances the efficiency of the sparse assembly by a better sorting of the operations. From our experience on a single core process, fsparse gives a speedup of 170–250% compared to sparse2, and is also highly parallelizable (Engblom and Lukarski 2016). Defining the sets iK and jK as int32 type, we can drastically cut the memory requirements, still representing $n \approx 2.1 \cdot 10^9$ numbers, far beyond the size of problems one can tackle in Matlab.

In order to use fsparse, one needs to download the "stenglib" library¹ and follow the installation instructions in the README.md file. The packages of the library can be installed by running the "makeall.m" file. As fsparse is contained within the folder "Fast," one may only select this folder when running makeall.m.

¹ https://github.com/stefanengblom/stenglib

We test the efficiency of the assembly approaches on 2D and 3D uniform discretizations with $m^2$ and $m^3$ elements, respectively. Figure 3 shows time scalings for the different strategies: "F" corresponds to the assembly in top88, "H" takes advantage of the matrix symmetry only and "H,fsparse" corresponds to the use of the fsparse routine (Engblom and Lukarski 2016) also. All the approaches exhibit a linear scaling of CPU time w.r.t. the number of DOFs. However, half the CPU time can be cut just by assembling $K^{(s)}$ (strategy H,sparse). Therefore, we definitely recommend this to users who aim to solve medium-size ($10^5$ to $10^6$ DOFs) structural TO problems on a laptop. However, the most substantial savings follow from using fsparse (Engblom and Lukarski 2016), and by coupling these two strategies (H,fsparse) speedups of 10 for the 2D and 15 for the 3D setting can be achieved. It is worth highlighting that a 3D stiffness matrix of the size of $\approx 9 \cdot 10^5$ can be assembled in less than a second and even one of size $6.2 \cdot 10^6$ can be assembled on a laptop in less than 10 s. For this last case, the sole storage of the arrays iK, jK, and sK would cause a memory overflow, ruling out the "F" approach.

Fig. 3 Scaling of assembly time performed with the 3 strategies discussed in Section 3.1. Compared to the standard (F) assembly, the H strategy alone cuts nearly 50% of time and memory, and with the use of fsparse gives an overall efficiency improvement of 10–15 times

3.2 Speedup of the OC update

The cost of the redesign step $\mathbf{x}_{k+1} = U(\mathbf{x}_k)$ is proportional to the number of bisections ($n_{bs}$) required for computing the approximation $\tilde{\lambda}_k \approx \lambda_k^*$. The following estimate (Quarteroni et al. 2000)

$$n_{bs} \ge \frac{\log(|\Lambda^{(0)}|) - \log(\tau)}{\log(2)} - 1 \qquad (16)$$

is a lower bound to this number for a given accuracy $\tau > |\lambda_k^* - \tilde{\lambda}_k|$ and it is clear that $n_{bs}$ would decrease if $\Lambda^{(0)}$, the initial guess for the interval bracketing $\lambda_k^*$, could be shrunk. Moreover, the volume constraint should be imposed on the physical field ($\tilde{\mathbf{x}}$ or $\hat{\mathbf{x}}$) and, in the original top88 implementation, this requires a filter application at each bisection step, which may become expensive.
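To give a feeling for the estimate (16), the short sketch below evaluates it for two bracket widths. The handle name `nbs`, the accuracy `tau` and the narrower bracket are hypothetical values chosen for illustration only.

```matlab
% Lower bound (16) on the number of bisections for a bracket of given width
nbs = @(width, tau) (log(width) - log(tau)) / log(2) - 1;

tau     = 1e-8;                 % assumed accuracy on the multiplier
nWide   = nbs(1e9, tau);        % wide bracket [0, 1e9]
nNarrow = nbs(1e1, tau);        % hypothetical tighter bracket [0, 10]
fprintf('wide: %.1f, narrow: %.1f bisections\n', nWide, nNarrow);
```

Under these assumptions the bound drops from roughly 55 to roughly 29 bisections, since each factor of ten removed from the bracket width saves $\log_2 10 \approx 3.3$ steps.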
The efficiency of the redesign step can be improved by a two-step strategy
1. Using volume-preserving filtering schemes
2. Estimating the interval $\Lambda_k^{(0)}$ bracketing the current Lagrange multiplier $\lambda_k^*$

Concerning point 1, the density filter is naturally volume-preserving (i.e., $V(\mathbf{x}_k) = V(\tilde{\mathbf{x}}_k)$) (Bourdin 2001; Bruns and Tortorelli 2001). Therefore, the volume constraint can be enforced on $V(\mathbf{x}_k)$ as long as the density filter alone is considered (ft=1). The relaxed Heaviside projection (1), on the other hand, is not volume-preserving for any $\eta$; thus, it would require one filter-and-projection application at each bisection step. However, (1) can also be made volume-preserving by computing, for each $\tilde{\mathbf{x}}_k$, the threshold $\eta_k^*$ such that (Xu et al. 2010; Li and Khandelwal 2015)

$$\eta_k^* \longrightarrow \min_{\eta \in [0,1]} \big| V(\hat{\mathbf{x}}_k(\eta)) - V(\tilde{\mathbf{x}}_k) \big| \qquad (17)$$

This can be done, e.g., by the Newton method, starting from the last computed $\eta_{k-1}^*$ and provided the derivative of (1) with respect to $\eta$

$$\frac{\partial V(\hat{\mathbf{x}}(\eta))}{\partial \eta} = -2\beta \sum_{i \in A} \frac{\big(e^{\beta(1-\tilde{x}_i)} - e^{\beta(\tilde{x}_i-1)}\big)\big(e^{\beta\tilde{x}_i} - e^{-\beta\tilde{x}_i}\big)}{\big(e^{\beta} - e^{-\beta}\big)\big[e^{\beta(\tilde{x}_i-\eta)} + e^{\beta(\eta-\tilde{x}_i)}\big]^2} \qquad (18)$$

Existence of $\eta^* \in [0, 1]$ for all $\tilde{\mathbf{x}} \in [0, 1]^m$ follows from the fact that $g(\eta) = V(\hat{\mathbf{x}}(\eta)) - V(\tilde{\mathbf{x}})$ is continuous on $[0, 1]$ and $g(0)g(1) \le 0$; uniqueness follows from the fact that $\partial g / \partial \eta < 0$ for all $\eta \in (0, 1)$.

Numerical tests on the MBB beam show that generally $\eta_k^* \in [0.4, 0.52]$, the larger variability occurring for low volume fractions (see Fig. 4a). We also observe that $\eta_k^*$ takes values slightly above 0.5 when $r_{\min}$ is increased or $\beta$ is raised. Convergence to $\eta_k^*$ is generally attained in 1–2 Newton iterations (see Fig. 4a).

The procedure for computing $\eta_k^*$ from (17), with tolerance $10^{-6}$ and initial guess $\eta_0 =$ eta, provided by the user, is implemented in Lines 63–67, which are executed if the routine top99neo is called with the parameter ft=3. Otherwise, if ft=2, the input threshold eta is kept fixed. In the latter case, the volume constraint should be consistently applied on $V(\hat{\mathbf{x}})$; otherwise, some violation or over-shooting of the constraint will happen. In particular, if the volume constraint is imposed on $\mathbf{x}$ and $\eta$ is kept fixed, one has $V(\hat{\mathbf{x}}) > f|\Omega_h|$ if $\eta < 0.5$, and $V(\hat{\mathbf{x}}) < f|\Omega_h|$ if $\eta > 0.5$.

Even though we usually observed small differences, these may result in local optima or bad designs, especially for low volume fractions or high $\beta$ values. Therefore, to account for this more general situation, Lines 87–91 should be replaced by a variant in which the volume constraint is evaluated on $V(\hat{\mathbf{x}})$, i.e., with one filter-and-projection application at each bisection step.

Fig. 4 Evolution of the parameter $\eta^*$ realizing the equivalence $V(\tilde{\mathbf{x}}) = V(\hat{\mathbf{x}})$, for different volume fractions $f$ and filter radii $r_{\min}$ (a) and evolution of the Lagrange multiplier estimate $\lambda^{\#}$ given by (19) compared to $\lambda^*$ (b). For both plots, the cumulative number of Newton iterations $n_{Newton}$ (viz. number of bisection steps $n_{bs}$) is shown against the right axis

However, there could be other situations when one cannot rely on volume-preserving filters (e.g., when imposing length scale through robust design). Therefore, a more general strategy to reduce the cost of the OC update is to cut the number of bisection steps.

To this end, the selection of the initial bracketing interval $\Lambda_k^{(0)}$ may build upon the upper bound estimate for $\lambda_k^*$ (Hestenes 1969; Arora et al. 1991)

$$\lambda_k^{\#} = \left[ \frac{1}{m f} \sum_{e=1}^{m} x_{k,e} \left( -\frac{\partial_e c_k}{\partial_e V_k} \right)^{1/2} \right]^2 \qquad (19)$$
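The update (10)–(11), with the bisection bracketed by the estimate (19), can be sketched as follows for the case of the density filter alone, where the volume constraint may be imposed directly on the design variables. The data are made up: the sensitivities `dc`, the element count `m` and the tolerance are hypothetical, and were chosen so that only the upper move limit becomes active. As in the text, the bisection variable stands for $\sqrt{\lambda}$.

```matlab
% Hypothetical data: m active elements, uniform start, made-up sensitivities
m = 200;  volfrac = 0.4;  move = 0.2;
t  = (1:m)' / m;
x  = volfrac * ones(m, 1);                  % current design x_k
dc = -(1 + 3*t.^4).^2;                      % compliance sensitivities (negative)
dV = ones(m, 1) / m;                        % volume sensitivities, Eq. (7)

ocP = x .* sqrt(-dc ./ dV);                 % F_{k,e} of Eq. (11) times sqrt(lambda)
l   = [0, mean(ocP) / volfrac];             % bracket for sqrt(lambda), from Eq. (19)
nbs = 0;                                    % bisection counter
while (l(2) - l(1)) / (l(2) + l(1)) > 1e-4
    lmid = 0.5 * (l(1) + l(2));             % trial value of sqrt(lambda)
    xNew = max(max(min(min(ocP / lmid, x + move), 1), x - move), 0);  % Eq. (10)
    if mean(xNew) > volfrac, l(1) = lmid; else, l(2) = lmid; end
    nbs = nbs + 1;
end
```

On exit, `xNew` respects the move limits and the box constraints, and its mean matches `volfrac` up to the bisection tolerance.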
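The Newton process for the volume-preserving threshold of (17) can be illustrated with the sketch below. It is not the code of Lines 63–67: the filtered field `xTilde`, the parameter values and the iteration cap are hypothetical, and the handle `deta` is the derivative of (1) with respect to $\eta$ written with hyperbolic functions, which is the per-element term summed in (18) up to the constant element volume.

```matlab
% Projection (1) and its derivative with respect to the threshold eta
prj  = @(v, eta, beta) (tanh(beta*eta) + tanh(beta*(v - eta))) ./ ...
                       (tanh(beta*eta) + tanh(beta*(1 - eta)));
deta = @(v, eta, beta) -beta * sinh(beta*v) .* sinh(beta*(1 - v)) ./ ...
                       (sinh(beta) * cosh(beta*(v - eta)).^2);

xTilde = linspace(0, 1, 101)'.^2;     % hypothetical filtered field
beta = 2;  eta = 0.5;                 % sharpness factor and initial guess eta_0
g  = mean(prj(xTilde, eta, beta)) - mean(xTilde);    % volume mismatch g(eta)
it = 0;
while abs(g) > 1e-6 && it < 50        % iteration cap added as a safeguard
    eta = eta - g / mean(deta(xTilde, eta, beta));   % Newton step
    g   = mean(prj(xTilde, eta, beta)) - mean(xTilde);
    it  = it + 1;
end
xPhys = prj(xTilde, eta, beta);       % projected field with the mean of xTilde
```

Since $g$ is monotonically decreasing in $\eta$, the iteration started from 0.5 is expected to settle within a few steps for moderate $\beta$; larger sharpness factors may call for a damped step, which is not covered here.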

More details on the derivation of (19) are given in Appendix A. The behavior of the estimate (19) is shown in Fig. 4b for the MBB example. The overall number of bisections ($n_{bs}$) in order to compute $\lambda_k^*$ meeting the tolerance $\tau = 10^{-8}$ when considering $\Lambda_k^{(0)} = [0, \lambda_k^{\#}]$ is cut by about 50%, compared with the one required by starting from $\Lambda^{(0)} = [0, 10^9]$ as in top88. Moreover, if no projection is applied, (19) could be used together with (10) to perform an explicit Primal-Dual iteration to compute $(\mathbf{x}_{k+1}, \lambda_k^*)$ and this would reduce the number of steps even more (see green curve in Fig. 4b).

However, in the basic versions of the codes, given in Appendices B and C, we consider the bisection process and (19) is used to bracket the search interval, as this procedure is more general.

3.3 Acceleration of the OC iteration

The update rule (10) resembles a fixed-point (FP) iteration $\mathbf{x}_{k+1} = U(\mathbf{x}_k)$, generating a sequence $\{\mathbf{x}_k\}$ converging to a point such that $\mathbf{r} = U(\mathbf{x}^*) - \mathbf{x}^* = 0$.

Several methods are available to speed up the convergence of such a sequence (Brezinski and Chehab 1998; Ramiere and Helfer 2015), somehow belonging to the family of quasi-Newton methods (Eyert 1996). The acceleration proposed by Anderson (1965), for instance, is nowadays experiencing a renewed interest (Fang and Saad 2009; Pratapa et al. 2016; Peng et al. 2018) and has recently been applied to TO by Li et al. (2020).

Anderson acceleration takes into account the residuals $\mathbf{r}_i$, their differences $\Delta\mathbf{r}_i = \mathbf{r}_{i+1} - \mathbf{r}_i$ and the differences of the updates $\Delta\mathbf{x}_i = \mathbf{x}_{i+1} - \mathbf{x}_i$ for the last $m_r$ iterations (i.e., $i = k - m_r, \ldots, k - 1$), and obtains the new element of the vector sequence as

$$\mathbf{x}_{k+1} = \mathbf{x}_k^{\#} + \zeta\, \mathbf{r}_k^{\#} \qquad (20)$$

where $\zeta \in [0, 1]$ is a damping coefficient and

$$\begin{aligned} \mathbf{x}_k^{\#} &= \mathbf{x}_k - \sum_{i=k-m_r}^{k-1} \gamma_i^{(k)} \Delta\mathbf{x}_i = \mathbf{x}_k - X_k \boldsymbol{\gamma}_k \\ \mathbf{r}_k^{\#} &= \mathbf{r}_k - \sum_{i=k-m_r}^{k-1} \gamma_i^{(k)} \Delta\mathbf{r}_i = \mathbf{r}_k - R_k \boldsymbol{\gamma}_k \end{aligned} \qquad (21)$$

The coefficients $\gamma_i^{(k)}$ minimize the following

$$\{\gamma_i^{(k)}\}_{i=1}^{m_r} \longrightarrow \min_{\boldsymbol{\gamma}} \big\| \mathbf{r}_k^{\#}(\boldsymbol{\gamma}) \big\|_2^2 \qquad (22)$$

The rationale behind the method is to compute a rank-$m_r$ update of the inverse Jacobian matrix $J_k^{-1}$ of the nonlinear system $\mathbf{r}_k = 0$. This has been shown to be equivalent to a multi-secant Broyden method (Eyert 1996; Fang and Saad 2009) starting from $J_0^{-1} = -\zeta I$.

The update rule (20) is usually applied only once each $q$ steps. Thus, we can write more generally $\mathbf{x}_{k+1} = \mathbf{x}_k + \mathbf{z}_k$, where (Pratapa et al. 2016)

$$\mathbf{z}_k = \begin{cases} \alpha\, \mathbf{r}_k & \text{if } \frac{k+1}{q} \notin \mathbb{N} \\ \zeta\, \mathbf{r}_k - (X_k + \zeta R_k)\, \boldsymbol{\gamma}_k & \text{if } \frac{k+1}{q} \in \mathbb{N} \end{cases} \qquad (23)$$

($\alpha \in (0, 1)$) obtaining the so-called periodic Anderson extrapolation (PAE) (Pratapa et al. 2016; Li et al. 2020). The implementation can be obtained, e.g., by adding a few lines after the OC step (Line 91), where the part solving (22) and the update is put in a separate routine for better efficiency.

There, we use the "\" operator for solving the least squares problem (22); however, strategies based on a QR (or SVD) decomposition may be preferred in terms of numerical stability. We refer to Fang and Saad (2009) for a deeper discussion on this point.

In order to assess the effect of different filtering schemes and the introduction of parameter continuation, Anderson acceleration is tested on the MBB example considering the following options

T1 Density filter alone, $p = 3$
T2 Density-and-projection filter, with $\eta^*$ computed from (17) and $\beta = 2$
T3 As T2, but with continuation on both $\beta$ and $p$, defined by the parameters betaCnt={250,16,25,2} and penalCnt={50,3,25,0.25}
T4 As T2, but for the discretization $\Omega_h = 600 \times 200$

For all the cases, the TO loop stops when $\|\mathbf{r}_k\|_2 / \sqrt{m} < 10^{-6}$, where the residual is defined with respect to the physical variables (i.e., $\mathbf{r}_k = \tilde{\mathbf{x}}_k - \tilde{\mathbf{x}}_{k-1}$ for T1 and $\mathbf{r}_k = \hat{\mathbf{x}}_k - \hat{\mathbf{x}}_{k-1}$ for T2–T4). The acceleration is applied each $q = 4$ steps, considering the last $m_r = 4$ residuals, starting from iteration $q_0 = 20$ for T1–T2 and from $q_0 = 500$ for T3–T4, when both continuations have finished. We set $\alpha = 0.9$ for the non-accelerated steps. The choice $m_r = 4$ is based on the observation that convergence improvements increase very slowly for $m_r > 3$ (Anderson 1965; Eyert
1996). However, a deeper discussion about the influence of all parameters on the convergence is outside the scope of the present work and we refer to Li et al. (2020) or, in a more general context, to Walker and Ni (2011) for this.

Results are collected in Table 2 and Figs. 5 and 6, showing the evolution of the norm of the residual, the flatness of the normalized compliance $\Delta c_k / c_0 = (c_k - c_{k-1})/c_0$ and the non-discreteness measure $m_{ND} = 100 \cdot 4 x^T (1 - x)/m$. We observe how Anderson acceleration substantially reduces the number of iterations needed to fulfill the stopping criterion, at the price of just a moderate increase in compliance (0.2–3%). Moreover, starting the acceleration just a few iterations later (e.g., it = 50 or it = 100 for T1) gives much lower compliance values (c = 254.3 and c = 252.9, respectively) and for T3 and T4, when the acceleration is started as the design has stabilized, compliance differences are negligible.

Table 2 Comparison of convergence-related parameters for the standard (T) and accelerated (T-PAE) TO tests, for the MBB example

          it.     c       Δc              ‖r‖_2/√m        m_ND
T1        2500    252.7   4.2 · 10^−8     1.03 · 10^−5    0.025
T1-PAE    828     258.9   4.2 · 10^−10    9.95 · 10^−7    0.021
T2        2500    246.1   5.1 · 10^−8     3.21 · 10^−5    0.023
T2-PAE    352     253.9   6.2 · 10^−9     9.97 · 10^−7    0.014
T3        2500    199.6   1.1 · 10^−4     1.91 · 10^−3    0.014
T3-PAE    752     197.5   3.7 · 10^−8     8.72 · 10^−7    0.007
T4        2500    191.8   2.0 · 10^−7     3.21 · 10^−5    0.006
T4-PAE    818     192.1   2.5 · 10^−7     9.97 · 10^−7    0.001

Fig. 5 Optimized designs obtained without (left column) and with Anderson acceleration (right column) of the TO loop

From Fig. 5 it is easy to notice that PAE tends to produce designs with some more bars. This may even give slightly stiffer structures, such as for case T3, where the non-accelerated approach removes some bars after it = 2000, whereas stopping at the design of T3–PAE gives a stiffer structure.

A comment is in order about the convergence criterion used, which differs from the one in top88 (maximum absolute change of the design variables, $\| x_{k+1} - x_k \|_\infty$). Here, we consider it more appropriate to check the residual with respect to the physical design field, and the 2-norm seems to give a more global measure, less affected by local oscillations.

3.4 Performance comparison to top88

We compare the performance of top99neo to the previous top88 code. In the following, we will refer to “top88” as the original code provided by Andreassen et al. (2011) and to “top88U” as its updated version making use of the sparse2 function (Davis 2009) for the assembly, with iK and jK specified as integers, and the filter implemented by using conv2.

The codes are tested by running 100 iterations for the MBB beam example (see Fig. 2), for the discretizations 300 × 100, 600 × 200, and 1200 × 400, a volume fraction f = 0.5 and considering mesh independent filters of radii $r_{min}$ = 4, 8, and 16, respectively. For top88 and top88U, we only consider density filtering, whereas for the new top99neo, we also consider the Heaviside projection, with the $\eta^*$ computed as described in Section 3.2. It will be apparent that the cost of this last operation is negligible.

Timings are collected in Table 3 where $t_{it}$ is the average cost per iteration, $t_A$ and $t_S$ are the overall time spent by the assembly and solver, respectively, and $t_U$ is the overall time spent for updating the design variables. For top88 and top88U, the latter consists of the OC updating and the filtering operations performed when applying the bisection on the volume constraint. For top99neo, this term accounts for the cost of the OC updating, that for estimating the Lagrange multiplier $\lambda^*$ as discussed in Section 3.2 and the filter and projection (Lines 59–70). $t_P$ collects all the preliminary operations, such as the set up of the discretization and filter, repeated only once, before the TO loop starts.

From $t_{it}$, we clearly see that top99neo enhances the performance of the original top88 by 2.66, 3.85, and 5.5 times on the three discretizations, respectively. Furthermore, timings of top88 on the largest discretization (1200 × 400) relate to a smaller filter size ($r_{min}$ = 12), because of memory issues; thus, the speedup is even underestimated in this case. Compared to the top88U version, the improvements are less pronounced (i.e., 1.55, 1.57, and 1.78 times) but still
substantial. The computational cost of the new assembly strategy is very low, even comparing to the top88U version, and its weight on the overall computational cost is basically constant. Also, from Table 3, it is clear that the design variables update weighs a lot on the overall CPU time, for both top88 and top88U. On the contrary, this becomes very cheap in the new top99neo thanks to the strategies discussed in Section 3.2; $t_U$ takes about 4–5% of the overall CPU time.

Computational savings would become even higher when adopting the larger filter size $r_{min}$ = 8.75 for the mesh 300 × 100, and scaling to $r_{min}$ = 17.5 and $r_{min}$ = 35 on the two finer discretizations. For these cases, speedups with respect to top88 amount to 4.45 and 10.35 on the first two meshes, whereas for the larger one, the setup of the filter in top88 causes a memory overflow. Speedups with respect to top88U amount to 1.55, 2.55 and 3.6 times respectively.

Fig. 6 Evolution of some parameters related to convergence for the standard and Anderson accelerated TO process. The first row shows the normalized norm of the residual defined on physical variables, the second row shows a measure of the flatness of the objective function and the last row shows the non-discreteness measure
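Returning to the Anderson acceleration step of (20)–(22), whose convergence behaviour is reported in Fig. 6, the update can be sketched in a few Matlab lines. This is only an illustrative sketch, not the authors' implementation: the fixed-point map g, the matrix M, the vector b and the tolerance are hypothetical stand-ins for the TO update, and the acceleration is applied at every step rather than every q steps.

```matlab
% Minimal sketch of Anderson acceleration for x = g(x), following (20)-(22).
% M, b and g are hypothetical: a small linear contraction replaces the TO update.
M    = [0.5 0.2 0; 0.1 0.4 0.1; 0 0.3 0.6];
b    = [1; 2; 3];
g    = @(x) M*x + b;
x    = zeros(3,1);                 % initial guess
mr   = 3;                          % number of stored differences (m_r)
zeta = 1.0;                        % damping coefficient in (20)
dX = []; dR = [];                  % histories of Delta x_i and Delta r_i
r  = g(x) - x;                     % residual r_k
for k = 1:50
    if isempty(dR)
        xNew = x + zeta*r;                        % plain damped fixed-point step
    else
        gam  = dR \ r;                            % least-squares problem (22)
        xNew = (x - dX*gam) + zeta*(r - dR*gam);  % (21) inserted into (20)
    end
    rNew = g(xNew) - xNew;
    dX = [dX, xNew - x];                          % append newest differences
    dR = [dR, rNew - r];
    if size(dX,2) > mr                            % keep only the last mr columns
        dX(:,1) = []; dR(:,1) = [];
    end
    x = xNew; r = rNew;
    if norm(r)/sqrt(numel(x)) < 1e-10, break; end % residual test as in the text
end
```

The stopping test mirrors the criterion $\| r_k \|_2/\sqrt{m}$ used above, with a tighter threshold chosen only for this toy problem.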

Table 3 Comparison of numerical performance between the old top88/top88U and new top99neo Matlab code. $t_{it}$ is the cost per iteration, $t_A$, $t_S$, $t_U$ are the overall times for assembly, equilibrium equation solve, and design update, respectively. $t_P$ is the time spent for all the preliminary operations. Values within brackets represent the % weight of the corresponding operation on the overall CPU. On the larger mesh, top88 is run with $r_{min}$ = 12, because of memory issues

Ω_h = 300 × 100, r_min = 4
        top88            top88U           top99neo
t_it    0.615            0.358            0.231
t_A     19.4 (31.5)      5.4 (15.0)       1.4 (6.1)
t_S     23.1 (37.4)      22.9 (59.3)      19.7 (85.3)
t_U     13.3 (21.6)      4.8 (13.5)       1.2 (4.8)
t_P     0.8 (1.3)        0.06 (0.2)       0.1 (0.3)

Ω_h = 600 × 200, r_min = 8
        top88            top88U           top99neo
t_it    4.57             1.87             1.19
t_A     83.1 (18.2)      31.3 (16.7)      5.6 (4.7)
t_S     122.4 (26.8)     109.3 (58.4)     106.9 (89.7)
t_U     223.8 (48.8)     38.0 (20.3)      5.2 (4.4)
t_P     12.9 (2.8)       0.1 (< 0.1)      0.2 (< 0.1)

Ω_h = 1200 × 400, r_min = 16
        top88            top88U           top99neo
t_it    31.3             10.1             5.69
t_A     361.1 (11.6)     151.5 (15.2)     30.7 (5.4)
t_S     592.5 (19.0)     513.2 (50.9)     510.5 (89.6)
t_U     1164.2 (37.4)    310.4 (31.4)     29.2 (5.1)
t_P     92.3 (3.1)       0.5 (< 0.1)      0.6 (< 0.1)
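As a quick check of how the speedup factors follow from Table 3, the ratios of the per-iteration costs can be computed directly. The snippet below is a small sketch using only the $t_{it}$ values tabulated above; the variable names are chosen here for illustration.

```matlab
% Per-iteration costs t_it from Table 3, one entry per discretization
% (300x100, 600x200, 1200x400). Variable names are illustrative only.
tit88  = [0.615 4.57 31.3];    % top88
tit88U = [0.358 1.87 10.1];    % top88U
titNeo = [0.231 1.19 5.69];    % top99neo
s88  = tit88  ./ titNeo;       % speedup of top99neo over top88
s88U = tit88U ./ titNeo;       % speedup of top99neo over top88U
fprintf('speedup vs top88 : %.2f %.2f %.2f\n', s88);
fprintf('speedup vs top88U: %.2f %.2f %.2f\n', s88U);
```

The ratios agree with the factors quoted in Section 3.4 up to the rounding of the tabulated timings.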
Fig. 7 Designs obtained for the frame reinforcement problem sketched in Fig. 1a. In a, the horizontal, triangular load distribution is pointing leftwards, whereas in b, it is pointing rightwards

3.5 Frame reinforcement problem

Let us go back to the example of Fig. 1a, adding the specification of passive domains and a different loading condition. We may think of a practical application like a reinforcement problem for the solid frame, with thickness $t = L/50$ ($P_1$), subjected to two simultaneous loads: a vertical, uniformly distributed load with density $q = -2$ and a horizontal height-proportional load, with density $b = \pm y/L$. Some structural material has to be optimally placed within the active design domain $A$ in order to minimize the compliance, while keeping the void space ($P_0$), which may represent a service opening.

To describe this configuration, we only need to replace Lines 31–33 with the following

where lDofv and lDofh target the DOFs subjected to vertical and horizontal forces, respectively. Then, the load (Line 34) is replaced with

Figure 7 shows the two optimized designs corresponding to the two orientations of the horizontal load b, after 100 redesign steps. The routine top99neo has been called with the following arguments: nely=nelx=900, volfrac=0.2, penal=3, rmin=8, ft=3, eta=0.5, beta=2, and no continuation is applied. The cost per iteration is about 10.8 s which, considering the fairly large discretization of $1.62 \cdot 10^6$ DOFs, is very reasonable.
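The problem size quoted above can be checked with a few lines. The sketch below computes the number of DOFs for nelx=nely=900 and the frame thickness in elements for $t = L/50$. The passive-solid mask pasS is purely hypothetical: Fig. 1a is not reproduced here, so a frame running along the whole boundary is assumed only for illustration and is not taken from the authors' listing.

```matlab
% Size check for the frame reinforcement example (nelx = nely = 900).
nelx = 900; nely = 900;
nDof = 2*(nelx+1)*(nely+1);        % two displacement DOFs per node
tEl  = nely/50;                    % frame thickness t = L/50, in elements
% Hypothetical passive-solid mask: a frame of tEl elements along the boundary
% (assumed geometry; the void region P0 is not modelled in this sketch).
pasS = false(nely,nelx);
pasS([1:tEl, end-tEl+1:end], :) = true;
pasS(:, [1:tEl, end-tEl+1:end]) = true;
fprintf('nDof = %d, frame thickness = %d elements, solid elements = %d\n', ...
        nDof, tEl, nnz(pasS));
```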

4 Extension to 3D

The implementation described in Section 3 is remarkably easy to extend to 3D problems (see Section Appendix).

Fig. 8 Geometrical sketch of the 3D cantilever example (a) and optimized topology for $\Omega_h = 48 \times 24 \times 24$ and considering the two filter boundary conditions (b, c). The design in d corresponds to the finer mesh $\Omega_h = 96 \times 48 \times 48$ and has been obtained by replacing the direct solver with the multigrid–preconditioned CG (see Amir et al. 2014 for details)
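To illustrate what replacing a direct solver by a preconditioned CG means in practice, the sketch below solves the same small symmetric positive definite system both ways. It is an assumption-laden toy example: the matrix K is a hypothetical 1D stand-in for the stiffness matrix, and an incomplete Cholesky preconditioner (ichol) is swapped in for the multigrid preconditioner of Amir et al. (2014), which is not reproduced here.

```matlab
% Direct (Cholesky) versus preconditioned CG solve on a small SPD system.
% K and F are hypothetical stand-ins for the stiffness matrix and load vector.
n  = 50;
e  = ones(n,1);
K  = spdiags([-e 2*e -e], -1:1, n, n);     % sparse SPD tridiagonal matrix
F  = e;
Lc = chol(K, 'lower');                     % direct solve: K = Lc*Lc'
U1 = Lc' \ (Lc \ F);
Lp = ichol(K);                             % incomplete Cholesky preconditioner
[U2, flag] = pcg(K, F, 1e-10, 200, Lp, Lp');   % flag = 0 signals convergence
relDiff = norm(U1 - U2)/norm(U1);          % difference between the two solutions
```

For large 3D meshes the iterative route avoids storing the factorization, which is the motivation for the solver swap mentioned in the caption.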
Notable modifications are the definition of $K_e^{(s)}$ for the 8-node hexahedron (Lines 24–47) and the solution of the equilibrium (5), now performed by

which in this context has been observed to be faster than the decomposition routine. Then, apart from the plotting instructions, all the operations are the same as in the 2D code and only 12 lines need minor modifications, basically to account for the extra space dimension (see tags “#3D#” in Section Appendix).

We test the 3D implementation on the cantilever example shown in Fig. 8a, for the same data considered in Amir et al. (2014). The discretization is set to $\Omega_h = 48 \times 24 \times 24$, the volume fraction is f = 0.12, and the filter radius $r_{min} = \sqrt{3}$. We also consider the volume-preserving Heaviside projection (ft=3). Figure 8b and c show the designs obtained after 100 redesign steps, for the two different filter boundary conditions. The design in (b), identical to the one in Amir et al. (2014), corresponds to zero-Neumann boundary conditions (i.e., the option “symmetric” was used in imfilter). The design in (c), on the other hand, corresponds to zero-Dirichlet boundary conditions for the filter operator and is clearly a worse local minimum.

The overall CPU time spent over 100 iterations is 1741 s and about 96% of this is due to the solution of the state equation. Only 1.2% of the CPU time is taken by matrix assemblies and 0.4% by filtering and the design update processes.

Upon replacing the direct solver in top3D125 with the same multigrid preconditioned CG solver of Amir et al. (2014), we can compare the efficiency of the two codes. We refer to Table 4 for the CPU timings, considering the discretizations $\Omega_h = 48 \times 24 \times 24$ (l = 3 multigrid levels) and $\Omega_h = 96 \times 48 \times 48$ (l = 4 multigrid levels). top3D125 shows speedups of about 1.8 and 1.9, respectively, and most of the time is cut on the matrix assembly. In the code of Amir et al. (2014), this operation takes about 50% of the overall time (and notably has the same weight as the state equation solve) whereas in top3D125 this weight is cut to 7–10%. Also, the time spent for the OC update is reduced, even though the code of Amir et al. (2014) already implemented a strategy for avoiding filtering at each bisection step.

Table 4 Performance comparison between the new top3D125 code and the one from Amir et al. (2014). $t_{it}$, $t_A$, $t_S$, $t_U$, and $t_P$ have the same meaning as in Table 3 and numbers between brackets denote the % weight of the operations on the overall CPU time

        Ω_h = 48 × 24 × 24, r_min = √3        Ω_h = 96 × 48 × 48, r_min = 2√3
        top3dmgcg        top3D125             top3dmgcg        top3D125
t_it    3.19             1.79                 27.33            14.20
t_A     160.6 (50.3)     13.1 (7.4)           1369 (50.1)      137.2 (9.7)
t_S     148.1 (46.4)     151.7 (84.7)         1250 (45.7)      1272 (89.5)
t_U     1.97 (0.6)       0.7 (0.4)            21.2 (0.8)       15.12 (1.1)
t_P     0.74 (0.4)       0.24 (0.1)           39.2 (1.4)       0.29 (<0.1)

5 Concluding remarks

We have presented new Matlab implementations of compliance topology optimization for 2D and 3D domains. Compared to the previous top88 code (Andreassen et al. 2011) and available 3D codes (e.g., by Liu and Tovar 2014 or Amir et al. 2014), the new codes show remarkable speedups. Improvements are mainly due to the following:

1. The matrix assembly is made much more efficient by defining mesh-related quantities as integers (Matlab int32) and assembling just one half of the matrix.
2. The number of OC iterations is drastically cut by looking at the explicit expression of the Lagrange multiplier for the problem at hand.
3. Filter implementation and volume-preserving density projection allow to speed up the redesign step.

The new codes are computationally well balanced and as the problem size increases the majority of the time (85 to 90% for 2D and even 96% for 3D discretizations) is spent on the solution of the equilibrium system. This is precisely what we aimed at, as this step can be dealt with efficiently by preconditioned iterative solvers (Amir et al. 2014; Ferrari et al. 2018; Ferrari and Sigmund 2020). We also discussed Anderson acceleration, which has recently been applied to TO also by Li et al. (2020), to accelerate the convergence of the overall optimization loop.

We point out that even if we specifically addressed volume constrained compliance minimization and density-based TO, the methods above can be applied also to level-set and other TO approaches. Point 1 can be extended to all problems governed by symmetric matrices. Points 2 and 3
can also be extended to other problems, to some extent, and Anderson acceleration is also usable in a more general setting (e.g., within MMA).

Therefore, we believe that this contribution should be helpful to all researchers and practitioners who aim at tackling TO problems on laptops, and set a solid framework for the efficient implementation of more advanced procedures.

Acknowledgments The project is supported by the Villum Fonden through the Villum Investigator Project “InnoTop.” The authors are grateful to members of the TopOpt group for their useful testing of the code.

Compliance with ethical standards

Conflict of interests The authors declare that they have no conflict of interest.

Replication of results Matlab codes are listed in the Appendix and available at www.topopt.dtu.dk. The stenglib package, containing the fsparse function, is available for download at https://github.com/stefanengblom/stenglib.

Appendix A: Elaboration on the OC update

Let us consider (3) at a given design point $x_k$ assuming the reciprocal and linear approximation for the compliance and volume functions, respectively (Christensen and Klarbring 2008)

$$\begin{aligned} \min_{x \in [\delta_-, \delta_+]^m} \; & c(x) \simeq c_k + \sum_{e=1}^{m} \left( -x_{k,e}^2 \, \partial_e c(x_k) \right) x_e^{-1} \\ \text{s.t.} \; & \sum_{e=1}^{m} \partial_e V(x_k) \, x_e - f |\Omega_h| \leq 0 \end{aligned} \tag{24}$$

We set up the Lagrangian associated with (24)

$$L(x, \lambda) = c(x) + \lambda \left( \sum_{e=1}^{m} \partial_e V(x_k) \, x_e - f |\Omega_h| \right)$$

and seek the pair $(x_{k+1}, \lambda^*_k) \in \mathbb{R}^m \times \mathbb{R}_+$ solving the subproblem

$$\max_{\lambda > 0} \left\{ \psi(\lambda) := \min_{x \in \mathcal{C}} L(x, \lambda) \right\} \tag{25}$$

where $\mathcal{C} = \{ x \in \mathbb{R}^m \mid \delta_- \leq x_e \leq \delta_+, \; e = 1, \ldots, m \}$ and $\psi(\lambda)$ is the dual function. Equation (25) is solved by primal-dual (PD) iterations, as x and λ are interlaced. Replacing $\xi = x_k$ and using subscripts (j) to denote inner PD iterations, we have

1. Fixed $\lambda = \lambda_{(j)}$, the inner minimization in (25) gives

$$\xi_e^2 \, \partial_e c(\xi) \, x_e^{-2} + \lambda \, \partial_e V(\xi) = 0 \;\Longrightarrow\; x_e = \xi_e \left( -\frac{\partial_e c(\xi)}{\lambda \, \partial_e V(\xi)} \right)^{1/2}$$

due to separability of the approximation. Let us denote the rightmost expression $x_e = F_{(j),e}(\lambda)$, and taking into account the box constraints in $\mathcal{C}$, we have

$$U(x_e) = \begin{cases} x_{(j+1),e} = \delta_- & \text{if } e \in \mathcal{L} = \{ e \mid x_{(j+1),e} \leq \delta_- \} \\ x_{(j+1),e} = \delta_+ & \text{if } e \in \mathcal{U} = \{ e \mid x_{(j+1),e} \geq \delta_+ \} \\ x_{(j+1),e} = F_{(j),e} & \text{if } e \in \mathcal{M} = \{ e \mid \delta_- < x_{(j+1),e} < \delta_+ \} \end{cases} \tag{26}$$

where $\mathcal{C} = \mathcal{L} + \mathcal{U} + \mathcal{M}$. The above is equivalent to (10).

2. We then evaluate the dual function for $x_{(j+1)}$ given by (26), and the stationarity ($\partial_\lambda \psi = 0$) gives

$$\sum_{e=1}^{m} \partial_e V(\xi) \left( \chi_{\mathcal{U}} \delta_+ + \chi_{\mathcal{L}} \delta_- + F_{(j),e}(\lambda) \chi_{\mathcal{M}} \right) - f |\Omega_h| = 0$$

where $\chi_{[\cdot]}$ is the characteristic function of a set.

In this simple case, the above can be solved for $\lambda_{(j+1)}$, the Lagrange multiplier enforcing the volume constraint for the updated density $x_{(j+1)}$, and after some simplifications, we obtain

$$\lambda_{(j+1)} = \left[ \frac{\sum_{e \in \mathcal{M}} x_{(j+1),e} \left( -\partial_e c(\xi) / \partial_e V(\xi) \right)^{1/2}}{f |\Omega_h| / \partial_e V(\xi) - |\mathcal{L}| \delta_- - |\mathcal{U}| \delta_+} \right]^2 \tag{27}$$

where $| \cdot |$ denotes the number of elements in a set.

Equations (26) and (27) can be iteratively used to compute the new solution $(x_{k+1}, \lambda^*_k)$, as implemented in the code here below (again, note that lm here represents $\sqrt{\lambda}$)

and, for the MBB beam example, this performs as shown by the green curves in Fig. 4b.

However, a closed form expression such as (27) cannot be obtained for more involved constraint expressions and therefore a root finding strategy must be employed to approximate the Lagrange multiplier. The application of (27) to the current, feasible design point ($x_{(j+1)} = x_k$) reduces to

$$\lambda^{\#} = \left[ \frac{1}{m f} \sum_{e=1}^{m} x_{k,e} \left( -\frac{\partial_e c(\xi)}{\partial_e V(\xi)} \right)^{1/2} \right]^2 \tag{28}$$

since $|\mathcal{M}| = |\Omega_h| = m$, $|\mathcal{L}| = |\mathcal{U}| = 0$ and we made use of (7). We immediately verify that (28) is identical to (19).
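The primal-dual iteration described in this appendix can be sketched as follows. This is a hedged illustration rather than the authors' listing: the sensitivities dc, the bounds dl and du, and the problem size are hypothetical, unit volume sensitivities are assumed (so that $|\Omega_h| = m$), and the numerator of the multiplier update is written with $F_{(j),e}(\lambda)$ from step 1 inserted into the stationarity condition, i.e. with $w_e = \xi_e (-\partial_e c / \partial_e V)^{1/2}$. As in the text, lm stores $\sqrt{\lambda}$.

```matlab
% Sketch of the primal-dual OC iteration built from (26) and (27).
% All data below are hypothetical and chosen only to exercise the iteration.
m  = 8;  f = 0.5;
xk = f*ones(m,1);                  % current design, xi = x_k
dc = -(1:m)'.^3;                   % negative compliance sensitivities (assumed)
dV = ones(m,1);                    % unit volume sensitivities, |Omega_h| = m
dl = 0.3;  du = 0.7;               % box bounds delta_- and delta_+
w  = xk .* sqrt(-dc./dV);          % F_e(lambda) = w_e / sqrt(lambda)
lm = sum(w)/(m*f);                 % initial guess for sqrt(lambda), cf. (28)
for j = 1:50
    x = max(min(w/lm, du), dl);    % primal update with box constraints, (26)
    L = x <= dl;  U = x >= du;  M = ~(L | U);
    if ~any(M), break; end         % no free elements left to adjust
    lmNew = sum(w(M)) / (f*m - nnz(L)*dl - nnz(U)*du);  % square root of (27)
    done  = abs(lmNew - lm) <= 1e-12*lm;
    lm    = lmNew;
    if done, break; end
end
x = max(min(w/lm, du), dl);        % final design for the converged multiplier
```

On convergence, the clamped design satisfies the volume constraint with equality, which is what the multiplier update enforces at each pass.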
Appendix B: The 2D code for compliance minimization

Appendix C: 3D code for compliance minimization

References

Amir O, Sigmund O (2011) On reducing computational effort in topology optimization: how far can we go? Struct Multidiscip Optim 44(1):25–29. https://doi.org/10.1007/s00158-010-0586-7
Amir O, Aage N, Lazarov BS (2014) On multigrid–CG for efficient topology optimization. Struct Multidiscip Optim 49(5):815–829. https://doi.org/10.1007/s00158-013-1015-5
Anderson DG (1965) Iterative procedures for nonlinear integral equations. J Assoc Comput Mach 12(4):547–560
Andreassen E, Andreasen CS (2014) How to determine composite material properties using numerical homogenization. Comput Mater Sci 83:488–495
Andreassen E, Clausen A, Schevenels M, Lazarov BS, Sigmund O (2011) Efficient topology optimization in matlab using 88 lines of code. Struct Multidiscip Optim 43(1):1–16. https://doi.org/10.1007/s00158-010-0594-7
Arora JS, Chahande AI, Paeng JK (1991) Multiplier methods for engineering optimization. Int J Numer Methods Eng 32(7):1485–1525
Bendsøe MP, Sigmund O (1999) Material interpolation schemes in topology optimization. Arch Appl Mech 69(9):635–654. https://doi.org/10.1007/s004190050248
Bourdin B (2001) Filters in topology optimization. Int J Numer Methods Eng 50(9):2143–2158. https://doi.org/10.1002/nme.116
Brezinski C, Chehab JP (1998) Nonlinear hybrid procedures and fixed point iterations. Numer Funct Anal Optim 19(5–6):465–487. https://doi.org/10.1080/01630569808816839
Bruns TE, Tortorelli DA (2001) Topology optimization of non-linear elastic structures and compliant mechanisms. Comput Methods Appl Mech Eng 190(26):3443–3459. https://doi.org/10.1016/S0045-7825(00)00278-4. http://www.sciencedirect.com/science/article/pii/S0045782500002784
Challis VJ (2010) A discrete level-set topology optimization code written in matlab. Struct Multidiscip Optim 41(3):453–464. https://doi.org/10.1007/s00158-009-0430-0
Christensen P, Klarbring A (2008) An introduction to structural optimization. Solid mechanics and its applications. Springer, Netherlands
Davis TA (2009) User guide for CHOLMOD: a sparse Cholesky factorization and modification package
Davis T (2019) Suitesparse: a suite of sparse matrix software. http://faculty.cse.tamu.edu/davis/suitesparse.html
Engblom S, Lukarski D (2016) Fast matlab compatible sparse assembly on multicore computers. Parallel Comput 56:1–17
Eyert V (1996) A comparative study on methods for convergence acceleration of iterative vector sequences. J Comput Phys 124(2):271–285. https://doi.org/10.1006/jcph.1996.0059
Fang HR, Saad Y (2009) Two classes of multisecant methods for nonlinear acceleration. Numer Linear Algebra Appl 16(3):197–221. https://doi.org/10.1002/nla.617
Ferrari F, Sigmund O (2020) Towards solving large-scale topology optimization problems with buckling constraints at the cost of linear analyses. Comput Methods Appl Mech Eng 363:112,911. https://doi.org/10.1016/j.cma.2020.112911
Ferrari F, Lazarov BS, Sigmund O (2018) Eigenvalue topology optimization via efficient multilevel solution of the frequency response. Int J Numer Methods Eng 115(7):872–892
Guest JK, Prévost JH, Belytschko T (2004) Achieving minimum length scale in topology optimization using nodal design variables and projection functions. Int J Numer Methods Eng 61(2):238–254. https://doi.org/10.1002/nme.1064
Hestenes MR (1969) Multiplier and gradient methods. J Optim Theory Appl 4(5):303–320. https://doi.org/10.1007/BF00927673
Horn RA, Johnson CR (2012) Matrix analysis, 2nd edn. Cambridge University Press, New York
Li L, Khandelwal K (2015) Volume preserving projection filters and continuation methods in topology optimization. Engineering Stru 85:144–161
Li W, Suryanarayana P, Paulino G (2020) Accelerated fixed–point formulation of topology optimization: application to compliance minimization problems. Mech Rese Commun 103:103,469
Liu K, Tovar A (2014) An efficient 3d topology optimization code written in matlab. Struct Multidiscip Optim 50(6):1175–1196. https://doi.org/10.1007/s00158-014-1107-x
Peng Y, Deng B, Zhang J, Geng F, Qui W, Liu L (2018) Anderson acceleration for geometry optimization and physics simulation. ACM Trans Graph 37(4):42:1–42:14
Pratapa PP, Suryanarayana P, Pask JE (2016) Anderson acceleration of the jacobi iterative method: An efficient alternative to krylov methods for large, sparse linear systems. J Comput Phys 306:43–54. https://doi.org/10.1016/j.jcp.2015.11.018
Quarteroni A, Sacco R, Saleri F (2000) Numerical mathematics. Texts in applied mathematics. Springer
Ramiere I, Helfer T (2015) Iterative residual–based vector methods to accelerate fixed point iterations. Comput Math Appl 70:2210–2226
Saad Y (1992) Numerical methods for large eigenvalue problems. Manchester University Press
Sanders ED, Pereira A, Aguiló MA, Paulino GH (2018) Polymat: an efficient Matlab code for multi–material topology optimization. Struct Multidiscip Optim 58:2727–2759
Sigmund O (2001) A 99 line topology optimization code written in Matlab. Struct Multidiscip Optim 21(2):120–127. https://doi.org/10.1007/s001580050176
Sigmund O (2007) Morphology–based black and white filters for topology optimization. Struct Multidiscip Optim 33(4):401–424
Suresh K (2010) A 199–line Matlab code for Pareto–optimal tracing in topology optimization. Struct Multidiscip Optim 42(5):665–679
Talischi C, Paulino GH, Pereira A, Menezes IF (2012) Polytop: a matlab implementation of a general topology optimization framework using unstructured polygonal finite element meshes. Struct Multidiscip Optim 45(3):329–357. https://doi.org/10.1007/s00158-011-0696-x
Walker HF, Ni P (2011) Anderson acceleration for fixed point iterations. SIAM J Numer Anal 49(4):1715–1735
Wang MY (2007) Structural topology optimization using level set method. In: Computational methods in engineering & science. Springer, Berlin, pp 310–310
Wang F, Lazarov B, Sigmund O (2011) On projection methods, convergence and robust formulations in topology optimization. Struct Multidiscip Optim 43(6):767–784
Xia L, Breitkopf P (2015) Design of materials using topology optimization and energy-based homogenization approach in matlab. Struct Multidiscip Optim 52(6):1229–1241. https://doi.org/10.1007/s00158-015-1294-0
Xu S, Cai Y, Cheng G (2010) Volume preserving nonlinear density filter based on Heaviside functions. Struct Multidiscip Optim 41:495–505

Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
