Tianju
Xue
*a,
Alex
Beatson
b,
Maurizio
Chiaramonte
*c,
Geoffrey
Roeder
b,
Jordan T.
Ash
b,
Yigit
Menguc
c,
Sigrid
Adriaenssens
a,
Ryan P.
Adams
b and
Sheng
Mao
*de
aDepartment of Civil and Environmental Engineering, Princeton University, Princeton, NJ 08544, USA. E-mail: txue@princeton.edu
bDepartment of Computer Science, Princeton University, Princeton, NJ 08544, USA
cFacebook Reality Labs, Redmond, WA 98052, USA. E-mail: mchiaram@fb.com
dDepartment of Mechanics and Engineering Science, BIC-ESAT, College of Engineering, Peking University, Beijing 100871, People's Republic of China. E-mail: maosheng@pku.edu.cn
eDepartment of Mechanical and Aerospace Engineering, Princeton University, Princeton, NJ 08544, USA
First published on 14th July 2020
Cellular mechanical metamaterials are a special class of materials whose mechanical properties are primarily determined by their geometry. However, capturing the nonlinear mechanical behavior of these materials, especially those with complex geometries and under large deformation, can be challenging due to inherent computational complexity. In this work, we propose a data-driven multiscale computational scheme as a possible route to resolve this challenge. We use a neural network to approximate the effective strain energy density as a function of cellular geometry and overall deformation. The network is constructed by “learning” from the data generated by finite element calculation of a set of representative volume elements at cellular scales. This effective strain energy density is then used to predict the mechanical responses of cellular materials at larger scales. Compared with direct finite element simulation, the proposed scheme can reduce the computational time up to two orders of magnitude. Potentially, this scheme can facilitate new optimization algorithms for designing cellular materials of highly specific mechanical properties.
The rapid development on the fabrication side demands a fast and predictive computational method to predict the mechanical behaviors of CMMs. For materials with a relatively small number of cells, one can conduct a direct numerical simulation (DNS) on the whole specimen using the finite element method (FEM).17 However, the mechanical behaviors of soft CMMs are often highly nonlinear and may involve mechanical instability, thus requiring a fine spatiotemporal resolution for DNS calculations. DNS can easily become computationally intractable as CMMs usually consist of a large number of cells. Alternatively, one can conduct a fine-mesh FEM calculation on a representative volume element (RVE), often composed of one or several repeating cells,6 by prescribing appropriate boundary conditions. Such a method can efficiently predict the mechanical response of CMMs under homogeneous deformation. When CMMs are subject to complex loads, such predictions will fail.
When the size of the CMM is much larger than that of the repeating cells, multiscale analysis such as homogenization can be a useful tool. Homogenization, often called “coarse-graining” in the physics community, is often used to predict the macroscopic behavior of composites from their microstructures.18,19 The early development of homogenization primarily focuses on the analytical approach. Following the pioneering work of Eshelby,20 various analytical homogenization methods have been developed for linear materials21–26 and later extended to nonlinear materials.27–30 While these analytical approaches provide useful bounds, they are less accurate when microstructures become sophisticated.31 Computational homogenization is more suitable in this scenario. Reviews of the recent progress in this field can be found in ref. 32–34. Among various methods of computational homogenization, two main categories can be distinguished: concurrent and off-line methods. Concurrent methods integrate the macro- and micro-scale problems through proper mathematical formulations and solve both problems concurrently. For example, the FE2 method is a widely adopted concurrent scheme, and uses FEM to solve problems at both scales.35–40 Another example is given by the spectral method based on Fast Fourier Transform (FFT).41–44 Concurrent methods are powerful and accurate, yet can still be computationally expensive, mainly due to the nested numerical solvers needed to balance calculations at both scales. On the other hand, for off-line methods, the macro- and micro-scale problems are solved separately. Effective constitutive laws are first constructed based on a set of numerical calculations at microscales, and then used to solve problems at macroscales.45,46 Thus, offline methods can avoid the huge cost of nested numerical solvers and in some cases be orders of magnitude faster than the concurrent methods.46,47 Essentially, the constitutive relation can be seen as a multi-dimensional mapping from the microstructural features to the macroscopic response of the material, and the core of offline methods is to efficiently and accurately construct such a mapping. To this end, several techniques46–52 have been developed, including recent efforts using neural networks.53,54
Neural networks (NNs) are massively parametric function approximators inspired by biological neural networks.55 Given a large number of input/output examples, neural networks can often successfully learn functions on high-dimensional spaces. This process of “learning” the function corresponds to identifying an optimal set of parameters, i.e., neural network “weights”, to capture the relationship between inputs and outputs reflected in the training data. Neural networks have been demonstrated to be powerful function approximators for solving challenging engineering tasks, such as language modeling,56 image classification,57 and machine translation.58 More recently, there has been a growing interest in applying neural networks to complex problems in natural sciences, e.g., physics,59–61 chemistry,62 astronomy,63 biomedicine,64 and materials science.65–69
In the present work, we extend previous works on NN-based computational homogenization methods53,54 to model the nonlinear mechanical behavior of cellular mechanical metamaterials under large deformation. For that purpose, we first generate training data by conducting a large number of RVE calculations at the cellular scale. We then train a neural network on those data to approximate the effective strain energy density at the macro-scale, which is a function of the macroscopic strain and microscopic structural parameters. Through this NN-based surrogate model for effective strain energy density, the coarse-grained constitutive relations can be easily obtained, which greatly reduces the computational cost of determining the mechanical behavior of cellular materials at the macro-scale.
This paper is organized as follows. In Section 2, the finite deformation theory of hyperelastic cellular mechanical metamaterials and the specific geometries of this study are described. In Section 3, we propose a neural network-based multiscale computational scheme to determine the mechanical responses of CMMs and describe the procedures to carry out the scheme in Section 4. Numerical examples are presented in Section 5, where our NN-based method is compared with DNS in terms of accuracy and efficiency. In Section 6, we discuss the applicability and the limitations of our NN-based method. Finally we conclude in Section 7.
The constitutive model of a hyperelastic material can be defined by a strain energy density function (per volume) W, which depends on F, through the right Cauchy–Green tensor C = FTF. The displacement field u can be determined by solving for the stationary point δΨ = 0 of the following functional (in the absence of body force):
![]() | (1) |
Div P = 0 in ![]() |
u = ub on ∂![]() |
P·N = t on ∂![]() | (2) |
The formulation described above is applicable to any hyperelastic material, but in this work we focus on CMMs made of soft elastomers and therefore adopt the following form of W:
![]() | (3) |
r(θ) = r0(1 + ξ1![]() ![]() | (4) |
![]() | ||
Fig. 1 Left: A map from the parameters to the pore shapes indicated by eqn (4). For each (ξ1,ξ2), we get a corresponding pore shape. Right: The two representative pore shapes studied in this work. |
Among all the possible pore shapes, we focus on two types (A and B in Fig. 1): pore A is circular, with ξA = (0, 0), and pore B breaks the angular symmetry, with ξB = (−0.2, 0.2). We choose these two shapes as our representative examples for two reasons. First, mechanical instability can be triggered in CMMs made of both unit cells under compression, which leads to an asymmetric mechanical response of the CMMs under tension and compression. Second, even though mechanical instability can be triggered in these two CMMs, they still exhibit very distinct mechanical responses under the same mechanical loading.6,7
Our computational homogenization approach starts with a study of the representative volume element of the CMM. The choice of RVE is not unique, but it has to be simultaneously large enough to capture the influence of cellular geometries on the overall mechanical behavior and small enough to be considered as a material point for the CMM. For example, as shown in Fig. 2, for a CMM composed of many unit cells with pore A, we have chosen ABCD, a 2 × 2 array of unit cells, as our RVE. This choice of RVE ensures that we capture the mechanical instability that leads to reorganization of the neighboring unit cells.6,7 We emphasize that the above statement relies on the critical assumption of separation of length scales, i.e. the size of the CMM is much larger than that of the unit cell and RVE, without which the homogenization approach is invalid.
To construct the effective strain energy density of this RVE, we conduct a fine-mesh FEM calculation on the RVE under macroscopic deformations by prescribing the following periodic boundary conditions:
u(X) = (![]() ![]() | (5) |
The effective strain energy density can be obtained via the average W over the RVE:
with V being the total volume of the RVE (for plane strain, dV = dX1dX2). Other macroscopic quantities can be obtained in the same fashion. This effective strain energy density
is the bridge that connects the microscopic features to macroscopic mechanical responses. In essence,
should be a function of
as well as the microstructural features ξ:
=
(
,ξ). Once that relation is established, we then treat the RVE as a material point in the CMM and use FEM to find the approximate solution to the stationary point of the following functional:
![]() | (6) |
Div ![]() ![]() |
ū = ūb on ∂![]() |
![]() ![]() ![]() | (7) |
Neural networks (NNs) are powerful computational structures for constructing massively parametric mappings.55 The classic general form of a neural network is the multi-layer perceptron (MLP), made up of fully connected layers.71 The input vector x is processed through several hidden layers and finally to an output vector y. Suppose that the ith layer has m nodes and the i + 1 layer has n of them, the computation between these two layers is given by yi = fi(xi): = σ(wi·xi + bi), where wi ∈ n×m and bi ∈
m are the weight matrix and bias vector of the ith layer, and σ is an element-wise nonlinear function, such as a logistic function σ(y) = 1/(1 +e−y) or tanh. We can “stack” these transformations as y = fk(xk) = fk○fk−1(xk−1) = … = fk○…○f1(x) for a MLP with k − 1 hidden layers. We hereby denote this mapping as y = fθ(x) to emphasize that this mapping is under the parameter θ = (w1,w2,…,wk;b1,b2,…,bk). A typical MLP with only one hidden layer is shown in Fig. 3.
The weight matrices and bias vectors θ are called the parameters of the neural network and they are optimized through a training procedure that minimizes a loss function reflecting the quality of the fit to data. This training is typically performed via a stochastic optimization procedure such as stochastic gradient descent.72 At each step, one randomly selects a subset of the training examples x each with the corresponding target value y* and computes a loss function (y*,y) (for example, the squared Euclidean distance between targets and predictions ‖y* − y‖22). Gradients of this loss function with respect to the parameters are then computed using reverse-mode automatic differentiation and the parameters are updated accordingly.73 This proceeds until a convergence criterion is achieved. After training and other validation and calibration processes, we can then deploy the neural network to make predictions for new inputs.
In this work, we use an MLP to approximate the effective strain energy density, namely NN(
,ξ) ≈
(
,ξ). The input vector is a 5-dimensional vector: x = (
11,
12,
22,ξ1,ξ2) and the output is a scalar y =
NN. For any given input, the target output is generated by RVE calculations. The goal is to train an MLP which can reproduce the results of these calculations for out-of-sample vectors. After training the MLP, we replace the
in eqn (6) with
NN and solve the macroscopic problem.
We use scikit-learn,74 an open source machine learning framework, to build, train and later test our neural network. All finite element calculations are carried out using an open-source FEM package FEniCS.75
In this work, we focus on CMMs composed of two types of unit cells: those with pore A and pore B, so the choice of ξ(i) is limited: ξ(i) ∈ {ξA,ξB}. However, the choice of the macroscopic field (i) is in principle infinite since it varies continuously. Therefore some care is necessary in constructing an effective sampling method for
(i).
First we must identify a domain over which (i) is to be sampled. Such a region should cover the nonlinear mechanical behaviors of primary interest. For example, for CMMs composed of unit cells with pore A, beyond certain compression the CMM undergoes mechanical instability which leads to auxetic behavior. This behavior can be observed for
22 = −0.125,
12 =
21 = 0 and
11 ∈ [−0.08, 0.08], as shown in Fig. 5a. The effective Poisson's ratio is calculated to be
= −0.26 (defined as
at the star point, where
). As for CMMs made of unit cells with pore B, when subject to both shear and compression, mechanical instability can lead to the bifurcation of microstructures.7 This is observed when
22 = −0.125,
11 =
21 = 0 and
12 ∈ [−0.7, 0.7]. As shown in Fig. 5b, a double well shape of
is observed as a function of
12 with two energy minima. Therefore, when the CMM is subject to an overall compression
22 = −0.125, the CMM will bifurcate into two different microstructures that correspond to these two minima.
The inspections above offer insight into a reasonable range of for sampling data. The range should be sufficiently wide so that important mechanics are reflected in the database, while it should also be limited so that undesired behaviors like self-contact are avoided with the current FEM simulation. The sampling region chosen for pore A is
{−0.2 ≤ ![]() ![]() |
−0.2 ≤ ![]() ![]() | (8) |
{−0.2 ≤ ![]() ![]() |
−0.8 ≤ ![]() ![]() | (9) |
The neural network used here is an MLP with one hidden layer, which as we will show later is able to provide sufficient accuracy for our problem. The logistic function σ(y) = 1/(1 + e−y) is chosen as our activation function, and mean squared error (MSE) is chosen as our loss function: , where n is the number of data points evaluated and ŷ(i) denotes the output of the MLP given input x(i). The neural network is trained using mini-batch stochastic gradient descent with Adam optimizer.77 Hyperparameters such as number of neurons in the hidden layer are first optimized via k-fold cross-validation78 (see ESI†). We obtained the following optimal hyperparameters: learning rate 10−2, batch size 64, and 128 neurons in the hidden layer. We then train our neural networks using these hyperparameters. The training time for the NN is typically within 1 minute on a personal computer. After training of 1000 epochs, our NN model reports a training MSE of 6.11 × 10−5 and a test MSE of 7.67 × 10−5.
We performed polynomial regression on the same data set as a baseline comparison. As shown in Fig. 7, when the maximum degree of polynomial is increased, the MSE tends to decrease. But when the maximum degree exceeds 10, the regression problem starts to become ill-conditioned and thus results in large MSE.79 The best of polynomial regression reports a test error of 1.05 × 10−3, which is more than an order of magnitude larger than the neural network model.
We further validate our NN model in two test cases to serve as our benchmarks by prescribing the following :
(1) Uniaxial strain: 11,
12,
21 = 0,
22 ≠ 0
(2) Simple shear strain: 11,
21,
22 = 0,
12 ≠ 0
We conducted the above two test cases on both unit cells with pore A and pore B and ensured that the 's tested here are different from those in the database. The results are shown in Fig. 6, where the effective strain energy obtained via RVE calculation and that from the NN model are compared. It is observed that the NN model agrees with the RVE calculation reasonably well. The errors between NN model and RVE calculation are also small among both benchmarks: the respective MSEs are 2.42 × 10−5 and 4.28 × 10−5.
![]() | (10) |
By extracting out the optimal parameters of the NN model, the explicit functional form of NN(
,ξ) is readily available, which can be computed iteratively as described in Section 3. However, to find the stationary point of the above functional
, one needs to efficiently evaluate the derivatives of
to compute quantities like the first Piola–Kirchhoff stress
as well as the tangent stiffness tensor
. In this work, we made use of the automatic symbolic differentiation feature provided by the open-source package FEniCS (specifically the UFL component80 of the package) to compute those derivatives. The standard Newton–Raphson method is used to find the root of the nonlinear equations. We use automatic differentiation and other features of FEniCS, so that the code can be written in a concise and near-math fashion, which provides an easier access to broader audiences. Equivalently, the exact same derivatives could be computed analytically53 or via finite differences.54
![]() | ||
Fig. 8 Boundary conditions for uniaxial tests. The upper and lower sides of the plate have fixed displacement conditions while the left and right sides are traction free. |
We perform this uniaxial test on a CMM with spatially uniform pore shapes of either pore A or pore B. Previous studies76 suggested that compression can lead to distinct mechanical behaviors of these two CMMs. This phenomenon is indeed observed in Fig. 9, where the deformed shapes of the CMMs under an engineering strain of −10% are shown. Both DNS (Fig. 9a and b) and the NN-based model (Fig. 9e and f) have shown that even under the same compressive loads, the two CMMs develop very different shapes. We also computed the average displacement field over each RVE element of the CMMs using data from DNS. The results are shown in Fig. 9c and d.
Specifically for CMMs with pore A, as mentioned in Section 4.2, a negative Poisson's ratio is expected, i.e., the compression in the e2 direction will lead to a contraction also in the e1 direction. This can be easily observed for the DNS results shown in Fig. 9a, where the CMM starts to contract in the middle region. The NN-based model also produces that result, as shown in Fig. 9e where displacement component u1 is plotted. We can observe that it is positive on the left and negative on the right, which is a clear sign of contraction. As for CMMs with pore B, compression can lead to a bifurcation of the local microstructure, as mentioned in Section 4.2, which again can be observed from the DNS result in Fig. 9b. For the NN-based model, since the microstructural information has been averaged out, such bifurcation can only be inferred indirectly. Recall that in Fig. 5b, the two minima of correspond to nonzero shear deformation with equal and opposite shear components. In Fig. 9f, we observe that under pure compression, the CMM develops a large local shear in the upper and lower part of the material, with opposite shear directions. The results in Fig. 9, whether obtained via DNS or the NN-based approach, are all in qualitative agreement with previous experimental work.6
Beyond visual qualitative comparisons, we obtained a stress–strain (σ–ε) curve for both CMMs to quantitatively assess the accuracy of our NN-based approach. The curves are obtained by varying the applied engineering strain ε from −10% to 10% and the results are shown in Fig. 10. The curves obtained from DNS and the NN-based model are almost identical for both pore A and pore B when under tension, even at large strain up to 10%. The NN-based model can reproduce the results of DNS for pore B for compression, but it starts to deviate from DNS results for pore A when the compression becomes large. This deviation is observed even with a much finer mesh for our NN-based approach.
![]() | ||
Fig. 10 Stress–strain curve for CMMs under uniaxial test with pore A (red) and pore B (blue) obtained via DNS (solid line) and the NN-based model (dashed line). |
One possible reason for such behavior is the boundary effect. For CMMs with pore B we can observe some variation of microstructures near the boundary, but the variation is quite smooth and the main effect – the bifurcation into two shear states – is captured by the NN-based model. However for CMMs with pore A, on the left and right boundaries there are strong pore–boundary interactions, resulting in a drastic change in the pore shapes (see ESI†). This boundary effect persists even with the increasing number of composing unit cells. Another possible reason is that there is strong microstructure localization in the case of pore A. We can see that the rotation of the holes is drastic, and the distances between holes are small, thus leading to strong local interactions between the RVEs (Fig. 9a and ESI†). This is in contrast to the case of pore B, where pore rotations vary more smoothly yielding no strong localization (Fig. 9b and ESI†). Moreover, the strongly localized deformations in the case of pore A can lead to contacts between the interior boundaries of the pore, at which point our finite element model becomes invalid (without extension to handle contacts). In conclusion, the strong local variations in the microstructural features may impede the predictive power of our computational scheme. Possible ways of resolving this issue include using a separate model to account for the boundary effects,81 or utilizing more advanced homogenization techniques to deal with the aforementioned localization effects.32,33
Aside from accuracy, we also compared the efficiency of the NN-based approach with DNS. Recall that one of our main motivations to construct such a surrogate-based approach is to reduce the computational complexity of DNS. This computational expense is mainly due to the fact that a small mesh size is needed to resolve the detailed cellular geometries, which results in a large number of degrees of freedom (DoF). Our NN-based model can use a much coarser mesh since the detailed microstructural features have already been averaged out and factored into the effective strain energy density. This significantly reduces the degrees of freedom when solving for the macroscopic behavior of the CMM – of course at the cost of losing the detailed local information of relevant fields. Table 1 summarizes the computational performance of DNS and the NN-based approach when they are used to generate the stress–strain curve in Fig. 10. The NN-based approach needs much fewer DoF to accurately capture the overall mechanics of the CMMs and hence can reduce the computational time by up to two orders of magnitude for both CMM with pore A and that with pore B. Note that for our problem, we used the same increment in ε to obtain the stress–strain curve. For a time-dependent problem, it is also important to examine the numerical stability of both approaches when different time steps are used.
DoF | Computation time [s] | |
---|---|---|
DNS – pore A – compression | 158![]() |
5.15 × 102 |
DNS – pore A – tension | 7.42 × 102 | |
NN – pore A – compression | 290 | 6.18 |
NN – pore A – tension | 6.41 | |
DNS – pore B – compression | 139![]() |
5.33 × 102 |
DNS – pore B – tension | 5.19 × 102 | |
NN – pore B – compression | 290 | 8.20 |
NN – pore B – tension | 6.94 |
![]() | ||
Fig. 11 Deformed configuration of a CMM with non-uniform pore shapes under a uniaxial tension of 20% obtained via (a) DNS and (b) the NN-based approach. Color contours indicate the value of u2. |
We suggest two explanations for this phenomenon. First, since is a strongly nonlinear function of ξ, especially when under compression, the ability of our neural network to capture that functional dependence remains uncertain, given that only two different ξ's (ξA and ξB) are included in the training data. Second, since the local mechanical properties of the CMM depend on ξ in a strongly nonlinear fashion, the ξ used in this study can still create a strong local variation. When the characteristic length of that variation is comparable to the RVE size, the assumption of separation of scales can break and in that case our multiscale framework will fail. Particularly in the above example, when the CMM is under uniaxial tension, as shown in Fig. 10, the mechanical properties of CMMs vary smoothly between different ξ's. Therefore, the NN-based approach gives us good agreement with the DNS results in tension. However, when under compression, as the properties vary drastically between different ξ's, our NN-based approach starts to fail.
To enhance the fidelity of our NN-based approach for CMMs with complex pore shapes, it is useful to include a larger number of sampled ξ's in the database. This way, our neural networks can better capture the nonlinear dependence of on the geometric parameters ξ. We can validate our results by conducting similar uniaxial tests on CMMs with a homogeneous ξ that are different from those in the training data and compare with DNS results. Once that dependence is established with good accuracy, we can use the NN-based approach to predict the mechanical behavior of CMMs with an arbitrary uniform ξ or those with a non-uniform ξ, which varies “smoothly enough”.
However, for those ξ's that lead to large local variations and with characteristic lengths comparable to the size of RVE, our NN-based approach will fail since it breaks the basic assumption of our multiscale framework. For those CMMs, we have to rely on other approaches like DNS. We must be cautious about the spatial inhomogeneity of the problem that we are solving: be it either in the geometric features or in the deformation field. For the multiscale approach to work, we must ensure that those variations have a negligible influence on the macroscopic problems of interest. In principle, our NN-based multiscale approach only works when the size of the CMM is much larger than that of the RVE and when there is no strong localization effect.
In Section 5 we have demonstrated that our NN-based scheme can capture the mechanics of moderately sized CMMs with reasonable accuracy, but with significant computational savings over DNS. We anticipate further reduction for even larger CMM size. Nevertheless, it is worth noting that the most time-consuming step of our NN-based scheme is not solving the macroscopic problem using the NN-based model, but the generation of the training data. As reported in Section 4.1, it takes about 20 hours to construct the database, although it is trivially parallelizable. Even though it is time-consuming, this generation step is only required once, and the resulting NN-based model can be used to solve multiple problems. Therefore, significant time can still be saved compared to other multiscale methods like FE2 where micro- and macroscopic problems have to be solved at the same time, as pointed out in ref. 53 and 54.
Owing to its great computational efficiency, the NN-based approach proposed in this work can be particularly useful for solving design problems – the reverse engineering of CMMs to achieve specific mechanical properties. Compared with typical multiscale approaches, our approach provides a straightforward way of evaluating the necessary derivatives needed for the sensitivity analysis,82,83 once the functional dependence of on ξ is well established. Therefore we envision that those powerful tools used for structural topological optimization of composite materials can now be used for the design of CMMs under the NN-based framework. However, since our NN-based multiscale approach only works for CMMs with smoothly varying ξ, proper regularization has to be imposed to ensure that the optimal solution that we obtained does not violate that condition. Other structural parameters can also be included for the design of CMM, for example, the porosity of the CMM, the arrangement of the pores (other than the square arrays used in this work), etc.
Differentiability and fast deployment are two important motivators for choosing neural networks for this work. Other than neural networks and polynomial regression in Section 4.2, there are many machine learning techniques available, each suitable for different types of problems. For example, decision trees84 are a popular machine learning technique which offers great interpretability; however they are not typically differentiable. Gaussian process regression can be an alternative to neural networks for small datasets, and can sometimes outperform neural networks in terms of accuracy. Yet, there can be great computational cost for deployment once the dataset becomes large.85 The choice of machine learning techniques largely depends on the types of problems and applications considered. Sometimes multiple machine learning techniques can be combined together for optimal performance. In this work, our aim is to demonstrate that neural networks are a promising tool for the design of CMMs. There are limitations – the need to generate a large dataset, the need for smoothly varying ξ – but we expect that future work may be able to compose the model we present with other machine learning techniques to relax these restrictions.
Footnote |
† Electronic supplementary information (ESI) available. See DOI: 10.1039/d0sm00488j |
This journal is © The Royal Society of Chemistry 2020 |