Delft University of Technology

On-the-fly construction of surrogate constitutive models for concurrent multiscale mechanical analysis through probabilistic machine learning

Rocha, I. B. C. M.; Kerfriden, P.; van der Meer, F. P.

DOI: 10.1016/j.jcpx.2020.100083
Publication date: 2021
Document version: Final published version
Published in: Journal of Computational Physics: X

Citation (APA): Rocha, I. B. C. M., Kerfriden, P., & van der Meer, F. P. (2021). On-the-fly construction of surrogate constitutive models for concurrent multiscale mechanical analysis through probabilistic machine learning. Journal of Computational Physics: X, 9, [100083]. https://doi.org/10.1016/j.jcpx.2020.100083
On-the-fly construction of surrogate constitutive models for concurrent multiscale mechanical analysis through probabilistic machine learning

I.B.C.M. Rocha a,∗, P. Kerfriden b,c, F.P. van der Meer a

a Delft University of Technology, Faculty of Civil Engineering and Geosciences, P.O. Box 5048, 2600 GA Delft, the Netherlands
b Mines ParisTech (PSL University), Centre des matériaux, 63-65 Rue Henri-Auguste Desbruères BP 87, F-91003 Évry, France
c Cardiff University, School of Engineering, Queen's Buildings, The Parade, Cardiff, CF24 3AA, United Kingdom
Article history: Received 10 June 2020; Received in revised form 5 October 2020; Accepted 21 December 2020; Available online 28 December 2020

Keywords: Concurrent multiscale; Surrogate modeling; Probabilistic learning; Gaussian Processes (GP); Active learning

Abstract
Concurrent multiscale finite element analysis (FE2) is a powerful approach for high-fidelity modeling of materials for which a suitable macroscopic constitutive model is not available. However, the extreme computational effort associated with computing a nested micromodel at every macroscopic integration point makes FE2 prohibitive for most practical applications. Constructing surrogate models able to efficiently compute the microscopic constitutive response is therefore a promising approach in enabling concurrent multiscale modeling. This work presents a reduction framework for adaptively constructing surrogate models for FE2 based on statistical learning. The nested micromodels are replaced by a machine learning surrogate model based on Gaussian Processes (GP). The need for offline data collection is bypassed by training the GP models online, based on data coming from a small set of fully-solved anchor micromodels that undergo the same strain history as their associated macroscopic integration points. The Bayesian formalism inherent to GP models provides a natural tool for online uncertainty estimation, through which new observations or the inclusion of new anchor micromodels are triggered. The surrogate constitutive manifold is constructed with as few micromechanical evaluations as possible by enhancing the GP models with gradient information, and the solution scheme is made robust through a greedy data selection approach embedded within the conventional finite element solution loop for nonlinear analysis. The sensitivity to model parameters is studied with a tapered bar example with plasticity, and the framework is further demonstrated with the elastoplastic analysis of a plate with multiple cutouts and with a crack growth example for mixed-mode bending. Although not able to handle non-monotonic strain paths in its current form, the framework is found to be a promising approach in reducing the computational cost of FE2, with significant efficiency gains being obtained without resorting to offline training.

© 2020 The Author(s). Published by Elsevier Inc. This is an open access article under the CC BY license (http://creativecommons.org/licenses/by/4.0/).
∗ Corresponding author. E-mail address: i.rocha@tudelft.nl (I.B.C.M. Rocha).
1. Introduction
There is a growing demand for high-fidelity numerical techniques capable of describing material behavior across spatial scales. With recent advances in additive manufacturing allowing for the development of novel materials with highly-tailored microstructures [1], multiscale modeling techniques will become increasingly relevant in the design of novel materials and structures. One popular approach for concurrent multiscale modeling is the so-called FE2 approach [2,3], in which macroscopic material response is directly upscaled from embedded micromodels without introducing additional constitutive assumptions. FE2 is a powerful and versatile technique used in a number of solid mechanics applications, from continuous [4] and discontinuous [5] mechanical equilibrium to multiphysics problems involving heat and mass transfer [6] and material degradation due to aging [7,8]. However, FE2 has the major drawback of being associated with extreme computational costs, hindering its application in actual design scenarios. Enabling the use of FE2 in many practical applications that would benefit from its accuracy and versatility is therefore highly contingent on being able to reduce its computational cost to tractable levels.
A promising approach in accelerating FE2 models consists in constructing surrogate models that take the place of the original high-fidelity micromodels at each macroscopic integration point. When building surrogates, the goal is to maintain as much of the generality offered by the original micromodel while eliminating as much computational complexity as possible. One option is to employ unsupervised learning on a number of full-order solution snapshots in order to define lower-dimensional solution manifolds for both displacements [9,10] and internal forces [11,12] at the microscale [13–15]. Alternatively, a supervised learning approach can be taken by using snapshots of the homogenized micromodel response to directly define a data-driven regression model for the macroscopic constitutive behavior [16–21]. Although resulting in models of distinct natures, both approaches rely on the existence of an observation database on the behavior of the original micromodel that is usually obtained offline (before deployment in a multiscale setting) and should cover every possible scenario the surrogate is expected to approximate online.
However, building such a database of model snapshots can be a challenging task (see [20,22] for interesting approaches based on Design of Experiments and [14] for a data-driven training framework based on Bayesian Optimization). For micromodels employing path-dependent materials, this offline training process entails sampling a highly-complex constitutive manifold that depends on an arbitrarily long strain history and can therefore be excessively sensitive to small changes in boundary conditions (e.g. strain localization and crack propagation problems). Furthermore, training a surrogate with such a complex dataset often requires additional partitioning techniques in order to avoid computationally inefficient reduced models [23,24]. An alternative to the conventional offline-online approach that has been gaining popularity is the use of adaptive reduction frameworks that either preclude the need for offline training altogether [25,26] or combine different reduction techniques into a single framework with only limited offline effort, while employing online error indicators to continuously assess the quality of the approximation and trigger a refinement of the surrogates when necessary. Nevertheless, obtaining consistent adaptivity criteria for either hyper-reduced models [25] or machine learning models based on least-squares solutions [19,27] is not straightforward. At the other end of the spectrum, works dealing with constitutive models based on Bayesian regression techniques, which provide a natural way to estimate error [28–30], do not often take advantage of their potential for creating adaptive frameworks (see [31,32] for an interesting framework specifically tailored to crystal plasticity). There is a need, therefore, for the development of fully-online approaches for general FE2 models with reliable adaptivity strategies based on sequential learning techniques and on error estimation methods with robust probabilistic foundations.
This work presents an adaptive probabilistic framework for constructing surrogate constitutive models for nonlinear concurrent multiscale analysis. The approach is based on substituting the original models associated to macroscopic integration points with a machine learning surrogate. In order to bypass the need for an offline training phase, a small number of fully-solved models associated to representative macroscopic integration points are used to generate constitutive data online. Because these models are not directly used to make predictions at every macroscopic iteration but only provide an indirect coupling between the scales, we denote them as anchor models (Fig. 1). Stress and stiffness data resulting from subjecting the anchor models to the same strain histories seen by their respective anchoring points are used to build a surrogate model based on the Gaussian Process (GP) regression technique. The framework preserves the generality of FE2 by not assuming the existence of any internal variables at the macroscale and builds surrogates that take only strains as input. However, this implies a one-to-one mapping between stresses and strains, making the framework in its current form unsuitable for treating non-monotonic load paths. This can be addressed in future extensions of the framework by employing autoregressive GPs [33,34].

The accuracy of this reduced-order solution is controlled with uncertainty information that arises naturally from the Bayesian formalism of the GP models. The resultant material model is embedded within a conventional finite element solution in a way that ensures the macroscopic solution is numerically robust and limits the sampling of new data as much as possible in order to maximize efficiency. The framework is demonstrated with a set of numerical tests including both bulk strain localization and crack propagation in order to assess its accuracy, efficiency and versatility.
Fig. 1. Schematic representation of the online adaptive reduction framework presented in this work. A small number of anchor models (in this case a matrix reinforced with circular inclusions) are used to train a machine learning surrogate that evolves as the macroscopic structure is loaded. Adaptive sampling guarantees that anchor models operating at novel regions in parameter space (illustrated here by plastic strain bands shown in orange and red) will populate the dataset D more often.

2. Concurrent multiscale analysis
2.1. Macroscopic problem
We begin by briefly introducing the concurrent multiscale equilibrium problem that we seek to accelerate. Let Ω define the macroscopic domain being modeled. We wish to find the displacement field u resulting from a set of Dirichlet and Neumann boundary conditions applied to the surface Γ that bounds Ω. Under the assumption of small strains, the equilibrium solution is found by satisfying:

\mathrm{div}\,\sigma = 0, \qquad \varepsilon = \tfrac{1}{2}\left(\nabla u + \nabla u^{T}\right) \qquad (1)

where div(·) is the divergence operator, σ and ε are respectively the stress and strain tensors, and body forces have been neglected. In order to solve for u, a constitutive model M that relates σ and ε must be introduced:

\sigma = \mathcal{M}\left(\varepsilon, \varepsilon^{h}\right) \qquad (2)

where ε^h is a history term that accounts for strain path dependency. In the context of multiscale analysis, the model M can be seen as a homogenization operator that lumps all physical processes happening at scales lower than Ω into a homogeneous medium with equivalent behavior. Depending on how complex the microscopic behavior is, M can range from having a relatively simple form (e.g. linear elasticity) to being next to impossible to formulate explicitly.

2.2. Microscopic problem
In a concurrent multiscale approach, we do not formulate M directly and instead opt for upscaling microscopic behavior from micromodels embedded at each macroscopic material point. Let ω be a Representative Volume Element (RVE) of the microscopic material features whose behavior we would like to upscale. Assuming the principle of separation of scales is valid (ω ≪ Ω), we can link the two scales by enforcing:

u_{\omega} = \varepsilon\, x_{\omega} + \tilde{u} \qquad (3)

where the response is decomposed into a linear displacement field imposed by the presence of macroscopic strains (x_ω are the microscopic spatial coordinates) and a fluctuation field ũ that accounts for microscopic inhomogeneities.

In order to obtain σ, we first find an equilibrium solution for u_ω by satisfying:

\mathrm{div}\,\sigma_{\omega} = 0, \qquad \varepsilon_{\omega} = \tfrac{1}{2}\left(\nabla u_{\omega} + \nabla u_{\omega}^{T}\right) \qquad (4)

where we see that once more the need arises for the definition of a constitutive model, this time relating ε_ω with σ_ω. Underlying the choice for a multiscale approach is the assumption that constitutive behavior can be represented by models of decreasing complexity as one descends to lower scales. It is therefore common to employ regular constitutive models (e.g. (visco)elasticity, (visco)plasticity, damage) for material components at the microscale [4,6–8]. However, the framework is flexible in allowing for models on an even lower third scale to be embedded at material points in ω (at the cost of even higher computational effort).

2.3. Bulk homogenization
With a solution for u_ω under the constraint that the fluctuation field ũ must be periodic, we can use the Hill-Mandel principle [35] to obtain homogenization expressions for the macroscopic stresses σ and consistent tangent stiffness D:

\sigma = \frac{1}{|\omega|}\int_{\omega}\sigma_{\omega}\,\mathrm{d}\omega, \qquad D = \mathcal{P}\left(K_{\omega}\right) \qquad (5)

from which we obtain the intuitive result that the macroscopic stresses are simply the volume average of the microscopic ones. Finally, the macroscopic constitutive tangent stiffness is computed through a probing operator P applied on the global microscopic tangent stiffness matrix K_ω [5].

2.4. Cohesive homogenization
Although the preceding formulation allows for general microscopic constitutive behavior to be upscaled, the response loses objectivity with respect to the RVE size after the onset of global microscopic softening [36], i.e. when the determinant of the acoustic tensor is zero along a given direction n:

\det\left(n^{T} D\, n\right) = 0 \qquad (6)

This non-objectivity arises because the volume ω_d of the strain localization band that causes the softening does not scale with the RVE size, an observation that motivates the use of a modified version of the Hill-Mandel principle [5,36]:

\frac{1}{w}\,\tau\cdot\delta v = \frac{1}{w h}\int_{\omega_{d}}\sigma_{\omega}\,\delta\varepsilon_{\omega}\,\mathrm{d}\omega \qquad (7)

where w and h are geometric RVE parameters that depend on the localization band orientation, τ is a macroscopic traction and v is a shifted displacement jump that allows for an initially-rigid cohesive response. Note that the homogenization is now performed towards a cohesive traction acting on a macroscopic surface that defines a discontinuity in u.
The development of a consistent strategy for continuous-discontinuous scale linking in FE2 is an open issue that is left out of the scope of the present discussion, with a number of different approaches being found, for instance, in [5,36–39]. For the purpose of building surrogate models for the RVE response, it suffices to acknowledge that two distinct models should be trained for bulk and cohesive responses, as the underlying constitutive manifolds have dimensions with different physical interpretations (strain/stress versus jump/traction).
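The localization check of Eq. (6) reduces to a scan over candidate band orientations. A minimal sketch for a 2D tangent in Voigt notation (the helper names and the angular resolution are illustrative, not part of the paper's implementation):

```python
import numpy as np

def acoustic_tensor(D, n):
    """n^T D n for a 2D tangent D in Voigt notation (xx, yy, xy)."""
    N = np.array([[n[0], 0.0],
                  [0.0, n[1]],
                  [n[1], n[0]]])   # maps the band normal to Voigt form
    return N.T @ D @ N

def localization_detected(D, n_angles=180):
    """Eq. (6): softening onset when det(n^T D n) <= 0 for some direction n."""
    for theta in np.linspace(0.0, np.pi, n_angles, endpoint=False):
        n = np.array([np.cos(theta), np.sin(theta)])
        if np.linalg.det(acoustic_tensor(D, n)) <= 0.0:
            return True
    return False
```

For an isotropic elastic tangent the determinant stays positive for every direction, while a tangent with a softening shear component fails the check along the corresponding band normal.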
2.5. Acceleration strategy
Assuming FEM is used to solve the mechanical problems at both scales, we can approximate the computational cost of a single macroscopic iteration as:

\mathrm{Cost} \approx N_{\mathrm{dof}}^{x} + N_{\mathrm{ip}}\, N_{\mathrm{iter}}^{\omega}\left(N_{\mathrm{dof}}^{\omega}\right)^{x} \qquad (8)

where N_dof represents the number of degrees of freedom of the model, N_ip is the number of macroscopic integration points and N_iter^ω is the number of iterations necessary for convergence of the microscopic BVP. We assume for simplicity that the bulk of the effort comes from solving the linearized systems of equations involving K and K_ω for nodal displacements, where the complexity exponent x depends on the solver used. It can be seen that the second term, associated with solving the microscopic equilibrium problems, quickly outweighs the first and becomes a performance bottleneck as N_dof^ω increases, especially since the number of macroscopic integration points N_ip increases together with N_dof. Constructing a surrogate that replaces the original constitutive model M of Eq. (2) is therefore an effective approach to accelerating FE2. However, constructing such a surrogate offline is a challenging undertaking; were it not, there would have been no need for a multiscale approach in the first place. From Eq. (2) we see that the constitutive manifold to be reproduced can have an arbitrarily high dimensionality due to the dependency on ε^h. This is equivalent to stating that the shape of the ε-σ manifold can change after each load step. Sampling this high-dimensional input space offline in order to have a surrogate that is accurate for arbitrary strain histories seems to be an intractable problem that, to the best of our knowledge, has still not been tackled in a satisfactory way.
This issue can be avoided by exploiting the fact that ε (and therefore ε^h) are often highly constrained by the geometry and boundary conditions of the macroscopic structure being modeled [24]. We therefore opt for constructing a highly-tailored surrogate model S online based on a dataset D of observations coming from a small number of fully-solved micromodels:

\sigma = \mathcal{S}\left(\varepsilon, \mathcal{D}\right) \qquad (9)

which can then be used to compute the constitutive response for a fraction of the cost. When trying to keep D as small as possible for the case at hand, it is crucial to have a means for quantifying the uncertainty in probing S for any given ε. In this work, a Bayesian approach is adopted to assess online whether D is large enough to provide the desired level of confidence in S at a given ε.

3. Bayesian surrogate modeling
In this section we introduce the Bayesian regression approach used to construct surrogate constitutive models, beginning from parametric versions of the surrogate model S (i.e. by encapsulating the constitutive information in D into a set of parameters w) and eventually moving to a non-parametric model based on Gaussian Processes (GP) that uses the data in D directly in order to make predictions. Our goal here is to appeal to the reader who might be unfamiliar with probabilistic regression models by starting from classical least-squares regression and gradually moving towards a Bayesian approach. Nevertheless, the discussion is kept as brief and focused as possible. The interested reader can find richer discussions on the subject in [40,41].

3.1. Least-squares regression
We start by building a parametric model y(w, x) that approximates a scalar target response t by fitting w with a dataset D of N observations t_o at X_o = [x_o1 ··· x_oN]. We therefore assume that the target t can be written as:

t = y(w, x) + \epsilon \quad\text{with}\quad y(w, x) = \sum_{j}^{M} w_{j}\,\phi_{j}(x) \quad\text{and}\quad p(\epsilon) = \mathcal{N}\left(\epsilon\,|\,0, \sigma_{n}^{2}\right) \qquad (10)

where the noise ϵ is given by a zero-mean Gaussian distribution with variance σ_n² and y is linear with respect to its M weights, gathered in the vector w. Note that the assumption of a linear model does not limit its fitting capabilities, since the basis functions φ define a feature space that can be nonlinear in x. In the context of this work, t can be a single stress or traction component, but the discussion is equally applicable to the problem of modeling individual model components (e.g. yield parameters).

In order to find values for w, we compute the likelihood function of the model, i.e. how likely the model is to produce the values t_o in D given w:

p\left(\mathbf{t}_{o}\,|\,w, \sigma_{n}^{2}\right) = \prod_{i}^{N}\mathcal{N}\left(t_{oi}\,|\,w^{T}\phi\left(x_{oi}\right), \sigma_{n}^{2}\right) \qquad (11)

which is a product of the probabilities of each point in isolation because we assume that each sample t_oi is sampled from the conditional distribution p(t | y) independently. We can find an optimum data fit for w by maximizing the likelihood, assuming that the shapes of φ are fixed a priori:

\nabla_{w}\, p\left(\mathbf{t}_{o}\,|\,w, \sigma_{n}^{2}\right) = 0 \;\Rightarrow\; w_{\mathrm{ML}} = \left(\Phi^{T}\Phi\right)^{-1}\Phi^{T}\mathbf{t}_{o} \quad\text{and}\quad \sigma_{n}^{2} = \frac{1}{N}\sum_{i}^{N}\left(t_{oi} - w_{\mathrm{ML}}^{T}\phi\left(x_{oi}\right)\right)^{2} \qquad (12)

where Φ ∈ R^{N×M} is a matrix with basis function values Φ_ij = φ_j(x_oi) evaluated at each point in D. Note that this is equivalent to minimizing the sum of squared differences between y and t, so w_ML is the same parameter vector obtained by the classical least-squares approach and σ_n² quantifies the spread of t around y.
Here we do not opt for a least-squares approach for three reasons. Firstly, it can suffer from severe overfitting when the dataset D is small, which is the case in the present work since we have no offline training and only a small number of fully-solved micromodels to sample from. Secondly, the uncertainty associated with σ_n² is constant throughout the input space x and does not provide an indication that the model is being used at a location far from data points (which could then be used to trigger a refinement of D). Finally, performing model selection in the least-squares framework, using cross-validation for instance, is rather tedious compared to performing model selection in the context of Bayesian regression methods [40].

3.2. Bayesian parametric regression
In the Bayesian approach to regression, we not only assume an uncertainty over the target t but also over the weights w. We initially assume a prior probability over w that represents our initial model assumptions before any data is encountered:

p(w) = \mathcal{N}\left(w\,|\,0, \sigma_{w}^{2} I\right) \qquad (13)

where σ_w² is the variance parameter associated with the uncertainty over values of w. Information from D is incorporated by using Bayes' theorem to obtain a posterior probability distribution for w:

p\left(w\,|\,\mathbf{t}_{o}\right) = \frac{p\left(\mathbf{t}_{o}\,|\,w\right) p(w)}{p\left(\mathbf{t}_{o}\right)} \quad\text{with}\quad p\left(\mathbf{t}_{o}\right) = \int p\left(\mathbf{t}_{o}\,|\,w\right) p(w)\,\mathrm{d}w \qquad (14)

where p(t_o | w) is the likelihood function of Eq. (11) and p(t_o) is the marginal likelihood of the model (i.e. the probability of producing the dataset D). Because both w and t|w are Gaussian variables, an analytical solution exists for p(w | t_o) [40]:

p\left(w\,|\,\mathbf{t}_{o}\right) = \mathcal{N}\left(w\,|\,w_{N}, S_{N}\right) \quad\text{with}\quad w_{N} = \frac{1}{\sigma_{n}^{2}} S_{N}\Phi^{T}\mathbf{t}_{o} \quad\text{and}\quad S_{N} = \left(\frac{1}{\sigma_{w}^{2}} I + \frac{1}{\sigma_{n}^{2}}\Phi^{T}\Phi\right)^{-1} \qquad (15)

Note that the expected value of w now depends both on the data points in D and on our initial beliefs about w represented by the prior distribution. Given this posterior, the best guess for w is the one with the highest p(w | t_o). This is the so-called Maximum A Posteriori (MAP) value and, for a Gaussian posterior, w_MAP = w_N. This estimation for w is equivalent to a least-squares prediction with a quadratic regularization term proportional to σ_w², which helps reducing the negative effects of overfitting.
However, in a fully Bayesian treatment of linear regression, we do not choose one specific value for w but rather make a prediction for a new target value t_* by averaging over all possible values of w:

p\left(t_{*}\,|\,\mathbf{t}_{o}, \sigma_{n}^{2}, \sigma_{w}^{2}\right) = \int p\left(t_{*}\,|\,w, \sigma_{n}^{2}\right) p\left(w\,|\,\mathbf{t}_{o}, \sigma_{w}^{2}, \sigma_{n}^{2}\right)\mathrm{d}w \qquad (16)

In order to set the stage for Gaussian Processes, it is also interesting to represent the expectation of t_* as:

\mathbb{E}\left[t_{*}\,|\,x_{*}\right] = \sum_{i}^{N} k\left(x_{oi}, x_{*}\right) t_{oi} \quad\text{with}\quad k\left(x_{p}, x_{q}\right) = \frac{1}{\sigma_{n}^{2}}\phi^{T}\left(x_{p}\right) S_{N}\,\phi\left(x_{q}\right) \qquad (17)

where w has now vanished and the new prediction is cast as a linear combination of values from D, where k(x_p, x_q) is a kernel that measures the similarity between two points in input space.

The Bayesian approach to the (generalized) linear model circumvents the problem of overfitting and provides confidence intervals. However, the linear model has the drawback of having a fixed number of basis functions φ whose shapes need to be chosen before D is observed. This can be circumvented either by allowing φ to change in shape during training (e.g. in a neural network, where φ are neuron values coming from the last hidden layer) or by explicitly choosing a kernel k(x_p, x_q) and adopting a distribution over functions instead of over weights, as in Gaussian Process regression models.

3.3. Gaussian Process (GP) regression
Since the regression function y is a function of w, adopting a prior distribution over w (Eq. (13)) implicitly leads to a distribution over function values y. This is the basis of Gaussian Process (GP) models. Here we assume these function values are jointly Gaussian:

p(\mathbf{y}) = \mathcal{N}\left(\mathbf{y}\,|\,0, K(X, X)\right) \qquad (18)

where a zero mean is assumed without loss of generality and K(X, X) is the so-called Gram matrix that defines the covariance between the input values X associated with y through a kernel k(x_p, x_q):

K_{pq} = k\left(x_{p}, x_{q}\right) = \sigma_{f}^{2}\exp\left(-\frac{1}{2\ell^{2}}\left\|x_{p} - x_{q}\right\|^{2}\right) \qquad (19)

where instead of first defining a prior over weights and deriving an equivalent kernel as in Eq. (17) we directly assume a kernel, in this case the squared exponential kernel [41] defined by a variance σ_f² and a length scale ℓ, which determines the level of correlation between points in input space. It can be shown that adopting the squared exponential kernel is equivalent to formulating a parametric model with an infinite number of basis functions φ or a Bayesian neural network with one hidden layer with an infinite number of neurons [40,41]. With this definition for the covariance structure between function values, we can obtain the joint distribution of training targets t_o by marginalizing (averaging) over all possible values of y_o:
Fig. 2. GP predictions for the function t = sin(x) constructed with two observations. Inclusion of derivative observations improves predictions around training points and reduces uncertainty.
p\left(\mathbf{t}_{o}\right) = \int p\left(\mathbf{t}_{o}\,|\,\mathbf{y}_{o}\right) p\left(\mathbf{y}_{o}\right)\mathrm{d}\mathbf{y}_{o} = \mathcal{N}\left(\mathbf{t}_{o}\,|\,0, K\left(X_{o}, X_{o}\right) + \sigma_{n}^{2} I\right) \qquad (20)

which allows us to incorporate information from D by defining a joint distribution between training values and new predictions for the function value:

p\begin{pmatrix}\mathbf{t}_{o}\\ \mathbf{y}_{*}\end{pmatrix} = \mathcal{N}\left(\begin{pmatrix}\mathbf{t}_{o}\\ \mathbf{y}_{*}\end{pmatrix}\,\middle|\,0, \begin{pmatrix}K\left(X_{o}, X_{o}\right) + \sigma_{n}^{2} I & K\left(X_{o}, X_{*}\right)\\ K\left(X_{*}, X_{o}\right) & K\left(X_{*}, X_{*}\right)\end{pmatrix}\right) \qquad (21)

and subsequently find the conditional probability p(y_* | t_o) of the new function values given that values coming from D are known:

p\left(\mathbf{y}_{*}\,|\,\mathbf{t}_{o}\right) = \mathcal{N}\left(\mathbf{y}_{*}\,|\,\mathbf{m}_{*}, S_{*}\right) \qquad (22)

which is again a Gaussian distribution with mean and covariance given by [40,41]:

\mathbf{m}_{*} = K\left(X_{*}, X_{o}\right)\left(K\left(X_{o}, X_{o}\right) + \sigma_{n}^{2} I\right)^{-1}\mathbf{t}_{o} \qquad (23)

S_{*} = K\left(X_{*}, X_{*}\right) - K\left(X_{*}, X_{o}\right)\left(K\left(X_{o}, X_{o}\right) + \sigma_{n}^{2} I\right)^{-1} K\left(X_{o}, X_{*}\right) \qquad (24)

Defining k_* = K(X_o, x_*) ∈ R^{N×1} and K_o = K(X_o, X_o) ∈ R^{N×N}, we can arrive at shorter expressions for the expectation and variance of the function value at a single new point x_*:

\mathbb{E}\left[y_{*}\,|\,x_{*}\right] = \mathbf{k}_{*}^{T}\left(K_{o} + \sigma_{n}^{2} I\right)^{-1}\mathbf{t}_{o} \qquad (25)

\mathbb{V}\left[y_{*}\,|\,x_{*}\right] = k\left(x_{*}, x_{*}\right) - \mathbf{k}_{*}^{T}\left(K_{o} + \sigma_{n}^{2} I\right)^{-1}\mathbf{k}_{*} \qquad (26)

Note that here we opt for directly using the function value y_* instead of its noisy version t_* to define our surrogate models. This effectively makes the adaptive components of the acceleration framework, which are driven by increases in the variance given by Eq. (26), less sensitive to changes in σ_n².
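The prediction equations (25)-(26) amount to a kernel evaluation and a few linear solves. A minimal one-dimensional sketch (hyperparameter values are arbitrary placeholders):

```python
import numpy as np

def sqexp(xp, xq, sigma_f2=1.0, ell=1.0):
    """Squared exponential kernel of Eq. (19) for 1D inputs."""
    return sigma_f2 * np.exp(-0.5 * (xp[:, None] - xq[None, :])**2 / ell**2)

def gp_predict(Xo, to, Xs, sigma_f2=1.0, ell=1.0, sigma_n2=1e-8):
    """Posterior mean and variance of Eqs. (25)-(26) for a zero-mean GP."""
    Ko = sqexp(Xo, Xo, sigma_f2, ell) + sigma_n2 * np.eye(len(Xo))
    Ks = sqexp(Xo, Xs, sigma_f2, ell)   # columns are k_* for each new point
    mean = Ks.T @ np.linalg.solve(Ko, to)
    # k(x_*, x_*) = sigma_f2 for the squared exponential kernel
    var = sigma_f2 - np.sum(Ks * np.linalg.solve(Ko, Ks), axis=0)
    return mean, var
```

The sketch reproduces the behavior described next for Fig. 2a: the mean interpolates the observations with near-zero variance at the training points, and reverts to the zero-mean prior with variance σ_f² far away from them.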
In Fig. 2a we demonstrate a GP model by plotting predictions based on two observations of the noiseless target function t = sin(x). The conditioning of Eq. (22) guarantees that y_* coincides with the observations at the training points X_o. Away from the training space D the GP returns to its zero-mean prior with high variance. This behavior is desirable for our application, since this uncertainty can be used as a trigger for either making new observations at the existing full-order integration points or adding new points to the full-order set when necessary.

3.4. Predicting derivatives and including derivative observations
In building constitutive model surrogates, in addition to predicting new function values (Eq. (25)), their derivatives with respect to the input x_* are also needed. These can be computed by differentiating Eq. (25):

\frac{\partial}{\partial x_{*}}\mathbb{E}\left[y_{*}\,|\,x_{*}\right] = \sum_{i}^{N} a_{i}\,\frac{\partial}{\partial x_{*}} k\left(x_{*}, x_{i}\right) \qquad (27)

where a_i is the i-th component of the vector a = (K_o + σ_n² I)^{-1} t_o and the derivative of the kernel is given by:

k'\left(x_{p}, x_{q}\right) \equiv \frac{\partial}{\partial x_{p}} k\left(x_{p}, x_{q}\right) = -\frac{1}{\ell^{2}}\left(x_{p} - x_{q}\right) k\left(x_{p}, x_{q}\right) \qquad (28)

However, since we can also observe these derivatives (in the form of D from Eq. (5)) at the anchor models, it is also interesting to include this information in the GP in order to improve its predictions. This is achieved by recognizing that the derivative of a GP remains Gaussian and defining the covariance between function values and derivatives as [42]:

\mathrm{cov}\left(y'_{p}, y_{q}\right) = \frac{\partial}{\partial x_{p}} k\left(x_{p}, x_{q}\right), \qquad \mathrm{cov}\left(y'_{p}, y'_{q}\right) = \frac{\partial^{2}}{\partial x_{q}\partial x_{p}} k\left(x_{p}, x_{q}\right) + \sigma_{d}^{2} \qquad (29)

where σ_d² is a noise parameter that represents the uncertainty associated to observing the derivatives, the first-order derivative of the kernel is given in Eq. (28) and its second derivative is a matrix given by:

k''\left(x_{p}, x_{q}\right) \equiv \frac{\partial^{2}}{\partial x_{q}\partial x_{p}} k\left(x_{p}, x_{q}\right) = \frac{1}{\ell^{2}}\left(I - \frac{1}{\ell^{2}}\left(x_{p} - x_{q}\right)\left(x_{p} - x_{q}\right)^{T}\right) k\left(x_{p}, x_{q}\right) \qquad (30)

With these definitions, the joint prior distribution of training targets becomes [43]:

p\begin{pmatrix}\mathbf{t}_{o}\\ \mathbf{t}'_{o}\end{pmatrix} = \mathcal{N}\left(\begin{pmatrix}\mathbf{t}_{o}\\ \mathbf{t}'_{o}\end{pmatrix}\,\middle|\,0, \begin{pmatrix}K_{o} + \sigma_{n}^{2} I & K_{td}\\ K_{td}^{T} & K_{dd}\end{pmatrix}\right) \equiv \mathcal{N}\left(\begin{pmatrix}\mathbf{t}_{o}\\ \mathbf{t}'_{o}\end{pmatrix}\,\middle|\,0, \overline{K}_{o}\right) \qquad (31)

where K_o is the same kernel matrix appearing in Eq. (25), and K_td ∈ R^{N×ND} and K_dd ∈ R^{ND×ND} are composed of blocks of k'(x_p, x_q) and k''(x_p, x_q), respectively, with K̄_o being the resultant N(D+1) × N(D+1) covariance matrix (D is the dimensionality of x). Making point predictions is done in a similar way as in Eq. (25):

\overline{\mathbf{k}}_{*} = \begin{bmatrix}\mathbf{k}_{*} & k'\left(x_{1}, x_{*}\right) & \cdots & k'\left(x_{N}, x_{*}\right)\end{bmatrix}^{T} \qquad (32)

\mathbb{E}\left[y_{*}\,|\,x_{*}\right] = \overline{\mathbf{k}}_{*}^{T}\,\overline{K}_{o}^{-1}\,\overline{\mathbf{t}}_{o}, \qquad \mathbb{V}\left[y_{*}\,|\,x_{*}\right] = k\left(x_{*}, x_{*}\right) - \overline{\mathbf{k}}_{*}^{T}\,\overline{K}_{o}^{-1}\,\overline{\mathbf{k}}_{*} + \sigma_{n}^{2} \qquad (33)

where the target vector now includes the observed derivatives:

\overline{\mathbf{t}}_{o} = \begin{bmatrix}t_{o1} & \cdots & t_{oN} & \mathbf{t}'_{o1} & \cdots & \mathbf{t}'_{oN}\end{bmatrix}^{T} \qquad (34)

and predictions for the tangent are done as in Eq. (27) but now by differentiating k̄_* instead of k_*. Fig. 2 shows GP predictions for t = sin(x) with two training points, with and without including derivative observations. Now the conditioning not only constrains the predictions to agree with the target values in D but also with their derivatives. This makes effective use of the limited amount of information coming from a small number of online observations, as making a single gradient observation is equivalent to adding D extra function observations around the target value.

3.5. Hyperparameter optimization
The process variance σ_f² and length scale ℓ that compose the kernel and the target noise σ_n² are hyperparameters that should be learned from the dataset D. Since a full Bayesian treatment for these parameters (introducing a prior, deriving a posterior and marginalizing) usually demands the use of expensive numerical techniques such as Markov Chain Monte Carlo (MCMC), here we opt for a maximum likelihood solution in order to minimize the computational overhead associated with the online calibration of the GP models. Furthermore, due to the limited amount of data available for estimation, we refrain from optimizing for the derivative noise σ_d² and instead assume derivative observations are noiseless.

The aim here is to maximize the marginal likelihood, obtained by averaging the probability that the model reproduces the training targets over all possible values of y_o associated with t_o:

p\left(\overline{\mathbf{t}}_{o}\,|\,\sigma_{f}^{2}, \sigma_{n}^{2}, \ell\right) = \int p\left(\overline{\mathbf{t}}_{o}\,|\,\mathbf{y}_{o}, \sigma_{n}^{2}\right) p\left(\mathbf{y}_{o}\,|\,\sigma_{f}^{2}, \ell\right)\mathrm{d}\mathbf{y}_{o} \qquad (35)

which is a function that depends only on the hyperparameters. We optimize this marginal likelihood with a BFGS algorithm [44], which requires the computation of the gradient of p(t̄_o). Taking the natural logarithm of both sides of Eq. (35) and differentiating with respect to K̄_o yields an expression for the gradient:

\nabla\ln p\left(\overline{\mathbf{t}}_{o}\right) = -\frac{1}{2}\mathrm{tr}\left(\overline{K}_{o}^{-1}\nabla\overline{K}_{o}\right) + \frac{1}{2}\overline{\mathbf{t}}_{o}^{T}\,\overline{K}_{o}^{-1}\,\nabla\overline{K}_{o}\,\overline{K}_{o}^{-1}\,\overline{\mathbf{t}}_{o} \qquad (36)

and ∇K̄_o can be obtained in a straightforward manner by differentiating Eqs. (19), (28) and (30) with respect to each hyperparameter and reassembling the matrix as in Eq. (31).

4. Surrogate modeling framework
Inthissection,weusethetechniquesofSection3.2tobuildanadaptivesurrogatemodelingframeworkforFE2. Follow-ingFig.1,weintroduceasurrogatemodel
S
trainedonasetofconstitutiveobservationsD
obtainedfromasmallnumber of fully-solved anchor models (gathered in the setA
) subjected to the strain histories from their respective anchoring integrationpoints.Forbulkhomogenization,wedefineS
as:σ
ε
,
D
=
D eε
+ E
σ
ε
,
D
(37) whereDe isaninitialstiffnessobtainedfromthefirstcomputedintegrationpointand
σ
isastresscorrectiongivenbythe meanGPresponse,withthevariancebeingusedexclusivelyforadaptivitypurposes.Theconsistenttangentstiffness isthe combinationofDe andthederivativesoftheexpectedstresscorrections(Eq. (27)).ThedatasetD
ispopulatedwithstress andstiffnessvaluescomingfrompointswiththehighestvaluesfortheuncertaintyoveranycomponentofσ
.Weoptformodelingeachcomponentof
σ
(three components for the 2D examples treated here, with which we implicitly enforce symmetry of the stress tensor) independently, each with its own zero-mean GP model. This strategy implicitly states that the components of the stress tensor are a priori uncorrelated, which greatly simplifies the structure of the covariance matrix of the GP models and reduces the number of hyperparameters to be estimated. Furthermore, we assume for now that σ depends only on the current strain value ε, which means that the GP model loads, unloads and reloads along the same path. This is a limitation, to be addressed in future versions of the framework, that currently hinders its ability to treat general non-monotonic load paths.1 It is worth mentioning, however, that path dependency is already partially accounted for, since different strain histories will lead to the construction of different surrogates.

Note that the generality of the S model given by Eq. (37) is not compromised by the elastic-correction additive decomposition, since σ can take any shape, but for initially linear-elastic materials this split improves the robustness of the surrogate in two ways. Firstly, it helps the GP in reproducing the initial elastic behavior at a moment when the surrogate models have very little data to work with. Secondly, it aids in preventing the occurrence of spurious strain localization as the GP moves back to its zero prior: away from the training points, the model of Eq. (37) returns instead to a linear-elastic response.

At this point, it is relevant to distinguish the present approach from other interesting active learning strategies for constitutive modeling proposed in the literature [45]. In [31,32], the authors construct a constitutive database from which observations are adaptively selected and used to build surrogate GP constitutive models both for stresses and for a number of assumed macroscopic internal variables, creating a one-to-one mapping between inputs and outputs. Here we opt for a more general approach and do not assume the existence of macroscopic internal variables. This choice is motivated by the basic FE2 premise that microscopic behavior is too complex to conform to an a priori assumed macroscopic constitutive model, and it leads to a non-unique mapping involving exclusively stresses and strains. This effectively amounts to a reduction in dimensionality of the feature space of the surrogates that allows us to employ the same framework for completely distinct constitutive models (cf. Eqs. (37) and (43)). This dimensionality reduction implicitly creates the need for the existence of anchor models and for subjecting them to the strain paths seen by their respective anchoring points, allowing the microscopic internal variables (which are not observed by the surrogate) to influence the macroscopic stress-strain response and compensate for the loss of information incurred when avoiding assumptions on macroscopic internal variables.2
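To make the structure above concrete, the surrogate of Eq. (37) can be sketched as a set of independent zero-mean GPs trained on stress corrections σ − De ε. The following is a minimal NumPy sketch under stated assumptions (a squared-exponential kernel and hyperparameters shared across components; all names, e.g. SurrogateStress, are illustrative), not the paper's Jem/Jive implementation:

```python
import numpy as np

def rbf(X1, X2, sf2, ell):
    """Squared-exponential kernel between two sets of strain points."""
    d2 = ((X1[:, None, :] - X2[None, :, :]) ** 2).sum(-1)
    return sf2 * np.exp(-0.5 * d2 / ell**2)

class SurrogateStress:
    """Independent zero-mean GPs on stress corrections sigma - De @ eps,
    so predictions revert to linear elasticity away from the data."""
    def __init__(self, De, sf2=1.0, ell=1e-2, sn2=1e-8):
        self.De, self.sf2, self.ell, self.sn2 = De, sf2, ell, sn2
        self.X = np.empty((0, De.shape[0]))   # sampled strains
        self.Y = np.empty((0, De.shape[0]))   # stress corrections

    def add_data(self, eps, sigma):
        self.X = np.vstack([self.X, eps])
        self.Y = np.vstack([self.Y, sigma - self.De @ eps])
        K = rbf(self.X, self.X, self.sf2, self.ell) + self.sn2 * np.eye(len(self.X))
        self.L = np.linalg.cholesky(K)        # factorized covariance, reused below

    def predict(self, eps):
        sigma_e = self.De @ eps               # elastic part of Eq. (37)
        if len(self.X) == 0:
            return sigma_e, np.full(len(eps), self.sf2)
        k = rbf(self.X, eps[None, :], self.sf2, self.ell)
        alpha = np.linalg.solve(self.L.T, np.linalg.solve(self.L, self.Y))
        mean_corr = (k.T @ alpha).ravel()     # E[sigma] per component
        v = np.linalg.solve(self.L, k)
        var = max(self.sf2 - (v**2).sum(0)[0], 0.0)  # shared predictive variance
        return sigma_e + mean_corr, np.full(len(eps), var)
```

Away from the training points the correction reverts to the zero prior, so the prediction falls back to the linear-elastic response, which is exactly the robustness argument made above.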
In order to position the present active learning approach within a general FEM implementation, it is useful to recall the main steps involved in finding equilibrium solutions for nonlinear finite element models. These are shown in Fig. 3. A solution for u is obtained iteratively by minimizing a global force residual r computed from the material response at every integration point (materialUpdate). Upon convergence, and before moving to the next time step, the current solution is checked (checkSolution) and can be rejected if there is a need to adapt the model (e.g. nucleate/propagate cracks, refine the mesh, change constitutive models). Once the solution is converged and accepted, material history is updated (commit) and the model moves to the next time step.

In an FE2 model, Fig. 3 can represent the macroscopic solution loop and the micromodels can be seen as a material-like entity, which leads to another similar solution loop being embedded within the materialUpdate routine. Here we exploit this modularity and discuss the implementation of the learning approach as a surrogate to an arbitrary material model denoted as fullModel. We can therefore implement the approach as a material wrapper that encapsulates fullModel and handles learning and prediction tasks. Algorithms 1, 2, 4 and 5 show how the material routines marked in bold in Fig. 3 are implemented for this wrapper. In the following, we elaborate on each of these components and present the main implementational aspects of the framework in detail.

1 This can be addressed by augmenting the input space, the most straightforward way being to use both ε and ε̇ as inputs to the GP models [33,34].
2 The noise hyperparameter σ_f² learned during the analysis implicitly accounts for the dimensionality reduction error incurred by assuming that two anchor models operating at the same strain level but with different microscopic internal states have the same stress response.
Fig. 3. Schematic analysis flow for finite element problems involving nonlinear material behavior. The subscripts o and n refer to old (converged) and new (current) values, respectively. The proposed acceleration framework focuses on the steps marked in bold.
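The analysis flow of Fig. 3 can be summarized as the following control-flow sketch. The model interface (begin_step, newton_iteration, check_solution, commit, cancel) and the verdict strings are assumptions made for illustration; the actual implementation is the in-house C++ code described in Section 5:

```python
def run_analysis(model, n_steps, max_iter=20):
    """Hypothetical driver mirroring Fig. 3: iterate to equilibrium, then
    accept (commit), continue on the same step, or cancel and retry."""
    step = 0
    while step < n_steps:
        model.begin_step(step)
        for _ in range(max_iter):
            if model.newton_iteration():       # assembles r via materialUpdate
                break
        verdict = model.check_solution()       # 'accept' | 'continue' | 'cancel'
        if verdict == 'accept':
            model.commit()                     # update history, move on
            step += 1
        elif verdict == 'cancel':
            model.cancel()                     # recover state, retry same step
        # 'continue': stay on the same step with the updated surrogate
```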
4.1. Initial sampling
Since there is no offline training, the surrogate model starts with no prior information on the constitutive behavior being approximated (D = ∅) and no anchor models initially present (A = ∅). We therefore introduce an initialization step by solving the first time increment under the assumption that all integration points are deforming elastically, in order to obtain an initial strain distribution which we use to initialize A and D. During this step, we use the fully-solved model only once, in order to obtain the initial stiffness D_e, and we assume all other points have the same stiffness in order to avoid further full-order computations (Algorithm 1). As we will see shortly, this is not a limiting assumption, since this first elastic approximation is rejected and the first time step is revisited after the GP models and initial anchors are initialized.

Algorithm 1: The materialUpdate routine.
Input: strain ε at the integration point p
Output: stress σ and stiffness D at the integration point
1  if initialization step:
2    if D_e is not initialized (i.e. p is the first point ever computed):
3      use original constitutive model: (σ, D) ← fullModel::materialUpdate(ε);
4      initialize elastic stiffness: D_e ← D;
5    else
6      assume p has the same stiffness as the first computed point: σ ← D_e ε; D ← D_e;
7  else
8    compute elastic stresses: σ_e ← D_e ε;
9    use GP to predict a constitutive correction: (E[σ], E[D], V[σ]) ← GP(ε);
10   approximate the response: σ ← σ_e + E[σ], D ← D_e + E[D];
11   compute the uncertainty indicator γ as in Eq. (38);
12   if γ > γ_cancel:
13     cancel time step (Algorithm 5);
14   if softening is detected (D_ii < 0):
15     penalize point uncertainty: γ ← γ − D_ii;
16   if last step has been cancelled:
17     switch to a modified Newton-Raphson strategy: D ← D_e;
18 store values for this point: ε_p^n ← ε; γ_p^n ← γ;
19 return σ, D
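A sketch of the same branching in Python may help clarify Algorithm 1. The state dictionary, the function names and the GP interface are hypothetical, chosen only to mirror the listing above:

```python
import numpy as np

def material_update(state, eps, gp_predict, gamma_cancel):
    """state: dict holding De, flags and per-point values (layout assumed);
    gp_predict(eps) -> (E_sigma, E_D, V_sigma) for the correction terms."""
    if state.get('initializing', False):
        if 'De' not in state:
            sigma, D = state['full_model'](eps)   # first point ever: full model
            state['De'] = D.copy()
        else:
            D = state['De']
            sigma = D @ eps                       # other points: assume elastic
        return sigma, D
    De = state['De']
    E_sigma, E_D, V_sigma = gp_predict(eps)
    sigma = De @ eps + E_sigma                    # Eq. (37)
    D = De + E_D
    gamma = float(np.sqrt(np.max(V_sigma)))       # Eq. (38)
    if gamma > gamma_cancel:
        state['cancel_requested'] = True          # handled by the cancel routine
    if np.any(np.diag(D) < 0.0):                  # softening safeguard
        gamma -= float(np.min(np.diag(D)))        # penalize uncertainty
    if state.get('last_step_cancelled', False):
        D = De                                    # modified Newton-Raphson fallback
    state['gamma'] = gamma
    return sigma, D
```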
After this first approximation for the solution is obtained, the checkSolution routine of Algorithm 2 is called. Since the strains at every integration point are stored, an informed initial choice for A can be made by clustering the integration points using a k-means clustering algorithm [46]. For each cluster, the point closest to the cluster centroid is chosen, added to A and immediately sampled. The GP models (one for each stress component) are then prepared to make predictions by computing and storing factorized versions of their covariance matrices.

Initial values for the hyperparameters σ_f², σ_n² and ℓ can either be provided by the user based on prior knowledge (e.g. from using the same material in other models)3 or be estimated online with an initial training procedure. We do this here by creating a fictitious anchor model for each cluster and loading it monotonically in the strain direction seen by the

3 Hyperparameters reflect intrinsic patterns in the constitutive manifold being inferred and should therefore be fairly insensitive to the macroscopic model being solved.
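The cluster-based selection of initial anchors described above can be sketched as follows, using a plain NumPy k-means for self-containedness. The function and argument names (select_anchors, n_iter, seed) are assumptions, not part of the paper's code:

```python
import numpy as np

def select_anchors(strains, k, n_iter=50, seed=0):
    """Cluster integration-point strains with k-means and return, for each
    cluster, the index of the point closest to the cluster centroid."""
    rng = np.random.default_rng(seed)
    centroids = strains[rng.choice(len(strains), k, replace=False)]
    for _ in range(n_iter):
        # assign each point to its nearest centroid
        labels = np.argmin(((strains[:, None] - centroids[None]) ** 2).sum(-1), axis=1)
        for i in range(k):
            if np.any(labels == i):
                centroids[i] = strains[labels == i].mean(axis=0)
    # representative point per cluster: arg min_j ||eps_j - centroid_i||
    anchors = [int(np.argmin(((strains - c) ** 2).sum(-1))) for c in centroids]
    return sorted(set(anchors))
```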
Algorithm 2: The checkSolution routine.
1  if first time step:
2    initialize dataset and anchor point set: D ← ∅, A ← ∅;
3    divide the integration points into k clusters in strain space: (C, ε̄^n) ← KMC(ε^n, k);
4    for every cluster C_i ∈ C:
5      find representative point p = arg min_{j∈C_i} ||ε_j^n − ε̄_i^n||;
6      anchor a fully-solved model at point p: A ← A ∪ {p};
7      solve the newly-created full model once: (σ, D) ← fullModel::materialUpdate(ε_p^n);
8      compute corrections and add data: σ ← σ − D_e ε_p^n; D ← D − D_e; D ← D ∪ {(ε_p^n, σ, D)};
9    update GP and store marginal likelihood: updateGP(D); L ← ln p(y|x);
10   reject solution (revisit the first time step);
11 else
12   update D based on the tolerance γ_tol using Algorithm 3: D ← D ∪ addData(γ_tol);
13   if T has not changed:
14     accept solution as is (Algorithm 4);
15   else
16     refactor covariance matrix: updateGP(D);
17     if L / ln p_t > L_retrain:
18       recompute GP hyperparameters: retrainGP(D); L ← ln p_t;
19     reject solution (continue on the same time step);
point closest to the cluster centroid. In order to avoid redundancy (e.g. adding linear-elastic data coming from multiple models), we first initialize the GP with the standard hyperparameter values σ_f² = 1.0 MPa², ℓ = 1.0×10⁻², σ_n² = 0.0 MPa² and use a variance threshold as a dosing mechanism to ensure the added data is uniformly spaced. After optimizing for the hyperparameters, we discard the data obtained from the fictitious anchors, since there is no guarantee that any of the real integration points will follow the same strain history. Once D and A are initialized, the first time step is then revisited by rejecting the initial elastic approximation done in Algorithm 1. Note that this happens only once, at the beginning of the analysis.

4.2. Active learning
After the initialization step, the GP surrogates are used to compute σ and the response is approximated as in Eq. (37) (refer to Fig. 3 for the analysis flow of a single load increment). Since an independent GP model is used for each stress component, we adopt a single uncertainty indicator γ given by:

γ = max_{i=1,…,n_σ} √(V[σ_i])   (38)

with n_σ being the number of stress components, and use it to drive the adaptive components of the framework. The indicator γ is therefore an absolute measure of uncertainty (expressed in units of stress). We empirically find this choice of indicator to offer the best performance in the examples treated in this work. Nevertheless, other alternatives might perform better depending on the problem being solved, e.g. normalizing V[σ_i] with the process variance, which for the present examples we find to adversely affect the performance of the framework when σ_f² is periodically re-estimated. The literature on active learning and greedy GP approximations can also provide the interested reader with further alternatives for defining γ (refer for instance to Section 8.3.3 of [41] and references therein). Values of
γ for all integration points are updated at every global Newton-Raphson iteration (Algorithm 1), and their final values γ^n (upon global convergence) are used to drive a greedy refinement of the GP models (Algorithm 2) and ensure the surrogate model remains accurate. This is enforced by the tolerance parameter γ_tol through the addData routine of Algorithm 3. Here we define an additional set T of tracked anchors that contains the models for which data has been added during the present time step. The set T is used both to avoid repeatedly adding data from a single anchor model at any given time step and to make sure the single added data point is updated to reflect the latest equilibrium solution (since, from Fig. 3, multiple materialUpdate → checkSolution → continue cycles can occur before history is committed).

The uncertainty level γ is checked at every integration point and the anchor model associated with the point having the highest value is chosen for sampling. In order to keep the number of models in A to a minimum, addData gives priority to models already in A and only considers other potential anchoring points if every model in A with an uncertainty higher than γ_tol has already been sampled on the current time step. Since the macroscopic problem being solved imposes a degree of similarity between integration points, the idea is that adding data from points in A should also help reduce the uncertainty of nearby integration points. This can therefore be seen as a greedy data selection approach [41] that aims at keeping both A and D as small as possible. If all models in A have already been sampled and a new anchoring point must be chosen, we first recover the material history of the newly-created model by making it revisit the complete cumulative strain history ε_p^h of its anchoring point p. Also note that we avoid adding data from models in A undergoing unloading or reloading. This is a consequence of assuming a unique relationship between strains and stresses: data coming from anchor models unloading through a different path would be erroneously interpreted by the GP models as noise.
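As a small illustration, the indicator of Eq. (38) that drives this refinement reduces to a one-line computation on the per-component predictive variances (the function name is assumed):

```python
import numpy as np

def uncertainty_indicator(variances):
    """Eq. (38): largest predictive standard deviation over the n_sigma stress
    components, an absolute uncertainty measure in units of stress."""
    return float(np.sqrt(np.max(variances)))
```

A point is flagged for sampling when this value exceeds γ_tol, and the whole step is canceled when it exceeds γ_cancel.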
Algorithm 3: The addData routine.
Input: a tolerance γ_tol
Output: a single data point to be included in D
1  update data sampled from T to reflect the new equilibrium strains ε_p^n, p ∈ T;
2  update GP models and uncertainty indicators: updateGP(D); γ^n ← γ(ε^n);
3  try to find an untracked anchor point to sample: p = arg max_{p∈A, p∉T} γ_p^n | γ_p^n > γ_tol;
4  if p cannot be found:
5    try to find a new anchoring point: p = arg max_{p∉A, p∉T} γ_p^n | γ_p^n > γ_tol;
6  update list of tracked points: T ← T ∪ {p};
7  update list of anchor points: A ← A ∪ {p} if p ∉ A;
8  call fullModel::materialUpdate(ε_p^h) to gradually bring the anchor at p to the current time step;
9  compute model response at ε_p^n: (σ, D) ← fullModel::materialUpdate(ε_p^n);
10 compute corrections: σ_p^n ← σ − D_e ε_p^n; D_p^n ← D − D_e;
11 return (ε_p^t, σ_p^t, D_p^t) with t being the latest step on which p is not unloading/reloading;
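Lines 3 to 5 of Algorithm 3 amount to a greedy argmax with priority for existing anchors. A sketch (the helper name is hypothetical; index sets are represented as Python sets):

```python
def pick_point_to_sample(gamma, anchors, tracked, gamma_tol):
    """gamma: uncertainty per integration point; anchors/tracked: index sets.
    Prefer untracked anchors above tolerance; only then consider new points."""
    def best(candidates):
        cands = [p for p in candidates if gamma[p] > gamma_tol and p not in tracked]
        return max(cands, key=lambda p: gamma[p]) if cands else None

    p = best(anchors)                                    # existing anchors first
    if p is None:
        p = best(set(range(len(gamma))) - set(anchors))  # then new anchoring points
    return p
```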
Upon making changes to D, the covariance matrices of all GP models are updated and refactored. The addition of a new observation also leads to a change in marginal likelihood (Eq. (35)). We therefore allow for a re-estimation of the hyperparameters to take place once the likelihood reaches a value L_retrain times lower than the one computed after the latest estimation. If T has not changed during the latest call to addData, the current solution is accepted as it is. Otherwise, we continue with the same time step by rejecting the current solution, which will cause the global Newton-Raphson solver to keep searching for an equilibrium solution (refer to Fig. 3), but now with updated GP models.

Two additional adaptivity safeguards are put in place in the materialUpdate routine of Algorithm 1. Firstly, the time step can be canceled if the uncertainty is higher than a threshold γ_cancel, in order to avoid considering equilibrium solutions that are excessively far from the constitutive behavior being approximated. As we will discuss in Section 4.3, canceling the current load step does not mean giving up on the analysis, with new solution attempts being made after including extra anchor models in A and retraining the GP models. Secondly, the diagonal of the tangent stiffness matrix is checked for possible negative values, providing an indication of softening. This would indicate that a switch from stresses to tractions is in order, but the decision for such a constitutive model switch should always be based on accurate information obtained from models in A. Therefore, we flag the point for sampling by penalizing its uncertainty.4

Apart from the very first update, computed in order to obtain D_e, the expensive full-order model is never computed during materialUpdate (Algorithm 1). An alternative to this approach would be to actually call the models in A every time stresses at the anchoring points are computed. We do not opt for this alternative for two reasons. Firstly, using a single constitutive model (the surrogate S) for the whole mesh avoids potential non-uniqueness issues that would arise, for instance, if points in A are switching between different constitutive regimes (loading/unloading/softening). Secondly, refraining from constantly updating the models in A results in significant gains in terms of acceleration, by making the reduced model insensitive to the number of global Newton-Raphson iterations needed for convergence and allowing full-order models in A to become dormant and essentially be removed from the analysis for as long as the uncertainty at the associated anchoring point does not increase. However, it is worth mentioning in passing that the more expensive alternative has the merit of allowing for an extra novelty detection safeguard to be employed, through which the deviation between predictions for σ
coming from the full-order and surrogate models can be kept in check.

4.3. Solution robustness

Algorithms 2 and 3 ensure that only a single data point is added to D at a time. This is done in order to avoid large perturbations to the current equilibrium solution caused by changes in the constitutive response.5 Once a new observation is added, the current (now unconverged) solution is kept, but the algorithm continues on the same time step until equilibrium is reestablished, at which point checkSolution is once again called (refer to Fig. 3 for the order in which these routines are called within a single analysis step). This process is repeated until γ^n is lower than γ_tol everywhere. The resulting cycle of carefully adding data without significantly drifting away from equilibrium helps keep the global solution scheme robust.

4 In the examples of this paper we do not treat models for which such a switch from bulk to cohesive behavior would be necessary. The safeguard is therefore only triggered on rare occasions when the GP is trying to approximate a perfectly-plastic response.
5 Adding data to the GP affects the response of all points within a hypersphere in strain space with a radius that depends on the length scale ℓ. The resulting change in global response may cause the solver to diverge.
Algorithm 4: The commit routine.
1 clear list of tracked models: T ← ∅;
2 store converged values: γ^o ← γ^n;
3 update persistent strain history: ε^h ← ε^h ∪ ε^n;
4 go to next time step;
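The commit bookkeeping of Algorithm 4, together with the history replay used when a new anchor is created (Algorithm 3, line 8), can be sketched as follows. All names and the layout of the state dictionary are illustrative assumptions:

```python
def commit(state):
    """Algorithm 4 sketch: clear tracked set, store converged uncertainties,
    and append the converged strain of each point to its persistent history."""
    state['tracked'] = set()                    # T <- empty
    state['gamma_old'] = state['gamma_new']     # store converged values
    for p, eps in state['strain_new'].items():  # eps_h <- eps_h U eps_n
        state['history'].setdefault(p, []).append(eps)

def bring_anchor_to_current_step(full_model, history_p):
    """Replay the cumulative strain history of an anchoring point so a freshly
    created full-order model recovers the microscopic internal state."""
    for eps in history_p:
        full_model.material_update(eps)
        full_model.commit()
```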
When the solution converges and γ^n < γ_tol for all integration points, the commit routine is called (Algorithm 4). Before moving to the next step, converged values for γ are updated. If the global solver fails to converge, or if the additional uncertainty threshold γ_cancel of Algorithm 1 is violated, the solution for the current time step is canceled (Algorithm 5). Before making a new solution attempt, converged values are recovered and Algorithm 3 is used to sample the model with the highest uncertainty, even if its value for γ is lower than γ_tol. Furthermore, we switch to a secant solution strategy by fixing the stiffness matrix to D_e (Algorithm 1). We empirically notice that, for certain combinations of hyperparameters, the surrogate model causes the global Newton-Raphson solver to lose robustness due to rapid changes in stiffness. Switching to a secant strategy after a canceled step assures a solution is found under this scenario, albeit with a lower convergence rate.

Algorithm 5: The cancel routine.
1 recover converged values: γ^n ← γ^o;
2 add data from the point with the highest variance (Algorithm 3): D ← D ∪ addData(0);
3 refactor covariance matrix: updateGP(D);
4 if L / ln p_t > L_retrain:
5   recompute GP hyperparameters: retrainGP(D); L ← ln p_t;
6 go back to the beginning of the time step;
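The retraining condition appearing in both Algorithm 2 and Algorithm 5 can be sketched as a simple ratio test on log marginal likelihoods (the function name is assumed; both values are negative in practice, so a degrading likelihood drives the ratio above one):

```python
def should_retrain(log_lik, log_lik_at_training, L_retrain):
    """True once the current log marginal likelihood has degraded by a factor
    L_retrain relative to the value stored after the latest estimation."""
    return log_lik / log_lik_at_training > L_retrain
```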
5. Numerical examples

In this section, we put the proposed framework to the test on a number of numerical examples. The algorithms of Section 4 have been implemented in an in-house finite element code using the open-source Jem/Jive C++ numerical analysis library [47]. We begin by demonstrating the applicability of the framework for FE2 analysis and move on to an in-depth investigation of its performance with fast single-scale homogeneous examples that allow for extensive parametric studies to be performed. It is emphasized that, although it is our vision to use the surrogate model in a multiscale context, it does not matter for the purpose of testing the active learning framework whether the material update of the full model involves solving a micromechanical BVP or just evaluating a nonlinear constitutive relation.
5.1. FE2 demonstration

The first example concerns a fiber-reinforced composite tapered specimen loaded in transverse tension. The geometry, mesh and boundary conditions are shown in Fig. 4. A 4-fiber RVE model is embedded at each macroscopic integration point. The geometry and mesh of the micromodel are also shown in Fig. 4. The fibers are modeled as linear elastic, with properties E = 74000 MPa and ν = 0.2. For the matrix we employ the pressure-dependent elastoplastic model proposed by Melro et al. [48], with E = 3130 MPa, ν = 0.37, ν_p = 0.32 (plastic Poisson's ratio) and yield stresses given by:

σ_t = 64.80 − 33.6 e^(−ε_eq^p / 0.003407) − 10.21 e^(−ε_eq^p / 0.06493)   (39)
σ_c = 81.00 − 42.0 e^(−ε_eq^p / 0.003407) − 12.77 e^(−ε_eq^p / 0.06493)   (40)

where ε_eq^p is the equivalent plastic strain. The model is solved for 100 load steps, at which point the global macroscopic response is almost perfectly plastic and the strain localizes around the center of the specimen.

We run the reduced model with k
= 1, γ_tol = 0.3 MPa, γ_cancel = 20 MPa and with hyperparameters estimated using a single micromodel loaded in the direction of the single k-means clustering centroid (in this case the average strain in the specimen). The analysis starts with A = ∅ and D = ∅ and ends with a total of 14 anchor models in A (out of a total of 134 points) and |D| = 73 data points.

We plot the resultant load-displacement response for both full- and reduced-order models in Fig. 5. The active learning approach is able to capture the correct global model response with negligible loss of accuracy while running 22 times faster than the full-order model. Of the 35 s execution time of the reduced model, a total of 15 s is spent on updating the GP models and using them for new predictions. For larger micromodels with denser meshes, the execution time of the reduced model will be dominated by computing the few micromodels included in