1 Introduction

Stochastic stability is a critical concern in engineering, manifesting in vibration problems such as power systems [1], wind-induced vibration [2, 3], and railway vehicle dynamics [4]. There are several definitions of stochastic stability, among which Lyapunov stability with probability one is the most commonly adopted. The stochastic averaging method has been extensively utilized to study Lyapunov stability with probability one due to its efficacy in analyzing multi-degree-of-freedom (MDOF) strongly nonlinear systems. Examples include the stability analysis of quasi-Hamiltonian systems excited by Gaussian white noise [5,6,7], systems driven by combined Gaussian and Poisson white noises [8,9,10], and systems with fractional derivative damping [11, 12]. In these studies, ideal or Markovian noise excitation is assumed, whereas real-world noise often exhibits long-range correlation. Consequently, these works are of limited use for analyzing the stochastic stability of engineering systems.

Fractional Gaussian noise (fGn), which has recently gained popularity, exhibits long-range correlation, making it more suitable for modeling practical noise. However, due to this long-range correlation, developing theoretical methods for predicting system responses under fGn excitation is extremely challenging. Several scholars have conducted research in this area. For instance, Biagini et al. [13] developed the calculus and the stochastic differential equations (SDEs) with respect to fractional Brownian motion (fBm). Kaarakka [14] and Grigoriu [15] investigated one-dimensional linear systems. Xu et al. [16] made significant efforts in developing the stochastic averaging principle for systems subject to fGn. Based on this averaging principle, Lu et al. [17, 18] developed the stochastic averaging method for quasi-Hamiltonian systems subjected to fGn excitation. Due to the non-Markovian nature of the system response, numerical simulations are required to obtain response statistics when using this method. To address this limitation, Lu et al. [19, 20] recently developed an analytical prediction for the response of quasi-integrable Hamiltonian systems driven by fGn under appropriate parameter conditions. Despite numerous studies on the stochastic dynamics of quasi-Hamiltonian systems under fGn excitation, the stochastic stability of these systems remains largely unexplored.

In previous studies, analyzing the stochastic stability of MDOF nonlinear systems involved combining the stochastic averaging method with the maximum Lyapunov exponent method. This approach first simplifies MDOF strongly nonlinear systems into approximate lower-dimensional systems using stochastic averaging. The stochastic stability of the system is then analyzed using the maximum Lyapunov exponent method, as in Ref. [6]. However, for complex MDOF systems, particularly those with strong coupling and strong nonlinearity, it is challenging to apply the stochastic averaging method to derive analytical expressions for the drift and diffusion coefficients of the averaged SDEs.

Deep learning, as a data-driven approach, excels at handling large-scale, high-dimensional data and has achieved significant success in many application fields [21,22,23]. It has also been introduced into the field of stochastic dynamics [24]. Given these advantages, this paper combines deep learning with the stochastic averaging method. Specifically, we use a backpropagation neural network (BPNN) [25] to obtain the drift and diffusion coefficients of the averaged SDEs.

Based on the data-driven stochastic averaging method, this paper proposes a procedure for studying the asymptotic Lyapunov stability with probability one of quasi-integrable and non-resonant Hamiltonian systems subject to fGn. First, fGn is approximately treated as wideband noise. The original system is then transformed into averaged SDEs by applying the stochastic averaging method for quasi-integrable Hamiltonian systems subject to wideband noise, where the drift and diffusion coefficients of the averaged SDEs are obtained using a BPNN. Subsequently, the expression for the Lyapunov exponent of the averaged system is derived using a procedure similar to Khasminskii’s [26] and is taken as the first approximation of the largest Lyapunov exponent of the original system. The asymptotic Lyapunov stability with probability one of the original system is then approximately determined from this largest Lyapunov exponent. Finally, two examples are presented and analyzed in detail. Comparisons between theoretical calculations and Monte Carlo simulations validate the proposed procedure.

This paper is arranged as follows: in Sect. 2, the general system equations are briefly introduced; in Sect. 3, the method of system dimensionality reduction, i.e., the deep learning-based stochastic averaging method, is introduced; in Sect. 4, the method for calculating the largest Lyapunov exponent of the system is introduced; two examples are worked out in Sect. 5; conclusions are given in Sect. 6.

2 Formulation of the problem

Consider an n-DOF quasi-Hamiltonian system driven by fGn and governed by the following motion equations:

$$\begin{gathered} \dot{Q}_{i} = \frac{\partial H}{{\partial P_{i} }}, \hfill \\ \dot{P}_{i} = - \frac{\partial H}{{\partial Q_{i} }} - \varepsilon^{{2{\mathcal{H}}}} c_{ij} ({\mathbf{Q}},{\mathbf{P}})\frac{\partial H}{{\partial P_{j} }} + \varepsilon^{{\mathcal{H}}} f_{ik} ({\mathbf{Q}},{\mathbf{P}})W_{k}^{H} (t), \hfill \\ i,j = 1,2, \ldots ,n;\quad k = 1,2, \ldots ,m, \hfill \\ \end{gathered}$$
(1)

where \(Q_{i}\) and \(P_{i}\) are generalized displacements and momenta, respectively; \(H = H({\mathbf{Q}},{\mathbf{P}})\) is the twice-differentiable Hamiltonian; \(\varepsilon\) is a small parameter; \(\varepsilon^{{2{\mathcal{H}}}} c_{ij} ({\mathbf{Q}},{\mathbf{P}})\) denote the weak damping coefficients; \(\varepsilon^{{\mathcal{H}}} f_{ik} ({\mathbf{Q}},{\mathbf{P}})\) denote the amplitudes of the random excitations; \(W_{k}^{{\mathcal{H}}} (t)\;(k = 1,2, \ldots ,m)\) are independent fGns. Similar to Gaussian white noise, fGn is the formal derivative of fractional Brownian motion (fBm), \(W^{H} (t) = {\text{d}}B^{H} (t)/{\text{d}}t\). The mathematical expression of fBm is [27]

$$B^{H} (t) = C_{H} \left\{ {\int\limits_{ - \infty }^{0} {[(t - s)^{{{\mathcal{H}} - {1 \mathord{\left/ {\vphantom {1 2}} \right. \kern-0pt} 2}}} - ( - s)^{{{\mathcal{H}} - {1 \mathord{\left/ {\vphantom {1 2}} \right. \kern-0pt} 2}}} ]d^{ - } B(s)} + \int\limits_{0}^{t} {(t - s)^{{{\mathcal{H}} - {1 \mathord{\left/ {\vphantom {1 2}} \right. \kern-0pt} 2}}} d^{ - } B(s)} } \right\},$$
(2)

where the parameter \(\mathcal{H}\) is called the Hurst index. The coefficient \(C_{H}\) was rederived by Deng [17] as

$$C_{H} = \left[ {\frac{{4^{{\mathcal{H}}} \Gamma ({\mathcal{H}} + 1)\sin ({\mathcal{H}}\pi )}}{{\sqrt \pi \Gamma ({\mathcal{H}} + {1 \mathord{\left/ {\vphantom {1 2}} \right. \kern-0pt} 2})}}} \right]^{1/2} .$$
(3)

The corresponding power spectral densities \(S_{k} (\omega )\) of the fGns are

$$\begin{aligned} S_{k} (\omega ) & = \frac{{2D_{k} {\mathcal{H}}_{k} \Gamma (2{\mathcal{H}}_{k} )\sin ({\mathcal{H}}_{k} \pi )}}{\pi }\left| \omega \right|^{{1 - 2{\mathcal{H}}_{k} }} , \\ & \quad \frac{1}{2} < {\mathcal{H}}_{k} < 1,\quad k = 1,2, \ldots ,m. \\ \end{aligned}$$
(4)

where \(2D_{k}\) is the intensity of \(W_{k}^{H} (t)\) and \({\mathcal{H}}_{k}\) is its Hurst index. Figures 1 and 2 show samples of fGn and the corresponding power spectral density (PSD) \(S(\omega )\) for different Hurst indices \({\mathcal{H}}\), respectively. It is observed that as \({\mathcal{H}}\) ranges from 1/2 to 1, \(S(\omega )\) transitions from a constant value to a Dirac delta function. Asymptotic analysis of Eq. (4) also indicates that when \({\mathcal{H}} \to 1/2\) or \({\mathcal{H}} \to 1\), \(S(\omega ) \to D/\pi\) or \(S(\omega ) \to 2D\delta (\omega )\), respectively. Therefore, fGn can be understood as a type of noise with properties lying between Gaussian white noise and a Gaussian random variable. In the high-frequency domain, e.g., frequencies greater than 1.0, \(S(\omega )\) remains relatively flat. Thus, if the natural frequencies of system (1) lie in this frequency domain, the stochastic averaging method for quasi-integrable Hamiltonian systems under wideband random excitation [28] can be applied to system (1).

Fig. 1

Samples of fGn for different Hurst indices \({\mathcal{H}}\)

Fig. 2

Power spectral density of fGn with different Hurst indices \({\mathcal{H}}\)
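For readers who wish to reproduce Figs. 1 and 2, the following sketch generates fGn samples by Cholesky factorization of the fGn autocovariance (a standard construction, independent of the cited derivations) and evaluates the theoretical PSD of Eq. (4). The function names and the unit-intensity convention \(2D = 1\) are illustrative assumptions.

```python
import numpy as np
from scipy.special import gamma as Gamma

def fgn_sample(n, H, rng=None):
    """Sample n points of unit fGn on an integer grid via Cholesky
    factorization of the Toeplitz autocovariance
    gamma(k) = 0.5*(|k+1|^{2H} - 2|k|^{2H} + |k-1|^{2H}).
    O(n^3): fine for modest n; use circulant embedding for long records."""
    rng = np.random.default_rng() if rng is None else rng
    k = np.arange(n)
    g = 0.5 * (np.abs(k + 1)**(2*H) - 2*np.abs(k)**(2*H) + np.abs(k - 1)**(2*H))
    cov = g[np.abs(k[:, None] - k[None, :])]
    L = np.linalg.cholesky(cov + 1e-12 * np.eye(n))   # jitter against round-off
    return L @ rng.standard_normal(n)

def fgn_psd(omega, H, D=0.5):
    """Theoretical PSD of fGn, Eq. (4); 2D is the noise intensity."""
    return 2*D*H*Gamma(2*H)*np.sin(H*np.pi)/np.pi * np.abs(omega)**(1 - 2*H)

# the PSD varies slowly above omega = 1 for moderate H, e.g. H = 0.7:
print(fgn_psd(np.linspace(1.0, 10.0, 5), H=0.7))
```

As \({\mathcal{H}} \to 1/2\), fgn_psd returns values approaching the constant \(D/\pi\), consistent with the asymptotics stated above.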

3 Simplification of system (1) using averaging method

3.1 Stochastic averaging

Assume that the Hamiltonian system associated with Eq. (1) is integrable and the Hamiltonian \(H\) is separable, i.e.,

$$\begin{aligned} H({\mathbf{Q}},{\mathbf{P}}) & = \sum\limits_{i = 1}^{n} {H_{i} (Q_{i} ,P_{i} )} \\ H_{i} (Q_{i} ,P_{i} ) & = P_{i}^{2} /2 + U_{i} (Q_{i} ), \\ \end{aligned}$$
(5)

where the potential \(U_{i}\) is given by

$$U_{i} (Q_{i} ) = \int\limits_{0}^{{Q_{i} }} {g_{i} (u){\text{d}}u}$$
(6)

Under certain conditions [28], system (1) has the following randomly periodic solution:

$$\begin{aligned} Q_{i} (t) & = A_{i} (t)\cos \Phi_{i} (t) + B_{i} ,\quad P_{i} (t) = - A_{i} (t)\nu_{i} \sin \Phi_{i} (t), \\ \Phi_{i} (t) & = \Gamma_{i} (t) + \Theta_{i} (t),\quad i = 1,2, \ldots ,n, \\ \end{aligned}$$
(7)

where \(A_{i}\) is the displacement amplitude and \(B_{i}\) is the central position of the \(i\)th DOF. Under the influence of the random excitations \(W_{k}^{H} (t)\), the quantities \(A_{i} ,\Phi_{i} ,\Gamma_{i} ,\Theta_{i}\) become random processes. The \(\nu_{i} = \nu_{i} (A_{i} ,\Phi_{i} )\) in Eq. (7) represents the instantaneous frequency of the \(i\)th DOF and can be expressed as

$$\nu_{i} (A_{i} ,\Phi_{i} ) = \frac{{\sqrt {2[U_{i} (A_{i} ) - U_{i} (A_{i} \cos \Phi_{i} + B_{i} )]} }}{{\left| {A_{i} \sin \Phi_{i} } \right|}}.$$
(8)
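As a concrete illustration of Eq. (8), the sketch below evaluates the instantaneous frequency and its phase average (the leading term \(\omega_{0i}\) in Eq. (10)) for the symmetric Duffing potential \(U(q) = \omega^{2} q^{2} /2 + \alpha q^{4} /4\) used later in Example 1, for which \(B_{i} = 0\); the helper names and default parameter values are assumptions.

```python
import numpy as np

def nu_duffing(A, phi, omega=1.0, alpha=1.0):
    """Eq. (8) for U(q) = omega^2 q^2/2 + alpha q^4/4 with B = 0;
    it reduces to sqrt(omega^2 + alpha*A^2*(1 + cos(phi)^2)/2)."""
    return np.sqrt(omega**2 + 0.5 * alpha * A**2 * (1.0 + np.cos(phi)**2))

def mean_frequency(A, omega=1.0, alpha=1.0, npts=4096):
    """omega_0(A): phase average of nu over one period, i.e. the leading
    (Phi-independent) term of the expansion of nu_i in Eq. (10)."""
    phi = np.linspace(0.0, 2.0*np.pi, npts, endpoint=False)
    return nu_duffing(A, phi, omega, alpha).mean()

# hardening spring: the mean frequency grows with the amplitude A
print([round(mean_frequency(A), 3) for A in (0.5, 1.0, 2.0)])
```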

Treating Eq. (7) as a transformation from \([{\mathbf{Q}}^{{\text{T}}} ,{\mathbf{P}}^{{\text{T}}} ]^{{\text{T}}}\) to \([{\mathbf{A}}^{{\text{T}}} ,{{\varvec{\Phi}}}^{{\text{T}}} ]^{{\text{T}}}\) and substituting it into system (1), one can obtain the motion equations for \([{\mathbf{A}}^{{\text{T}}} ,{{\varvec{\Phi}}}^{{\text{T}}} ]^{{\text{T}}}\). By using the relation \(H_{i} = U_{i} (A_{i} )\), we then obtain the motion equations for \([{\mathbf{H}}^{{\text{T}}} ,{{\varvec{\Phi}}}^{{\text{T}}} ]^{{\text{T}}}\) as follows:

$$\begin{aligned} \dot{H}_{i} & = \varepsilon F_{i}^{H} ({\mathbf{H}},{{\varvec{\Phi}}}) + \varepsilon^{1/2} \mathop \Sigma \limits_{k = 1}^{m} G_{ik}^{H} ({\mathbf{H}},{{\varvec{\Phi}}})W_{k}^{H} (t), \\ \dot{\Phi }_{i} & = \nu_{i} (H_{i} ,\Phi_{i} ) + \varepsilon F_{i}^{\Phi } ({\mathbf{H}},{{\varvec{\Phi}}}) + \varepsilon^{1/2} \mathop \Sigma \limits_{k = 1}^{m} G_{ik}^{\Phi } ({\mathbf{H}},{{\varvec{\Phi}}})W_{k}^{H} (t), \\ i & = 1,2, \ldots ,n, \\ \end{aligned}$$
(9)

where \({\mathbf{H}} = [H_{1} ,H_{2} , \ldots ,H_{n} ]^{{\text{T}}}\), \({\mathbf{\Phi = [}}\Phi_{1} ,\Phi_{2} , \ldots ,\Phi_{n} {\mathbf{]}}^{{\text{T}}}\) and

$$\begin{aligned} F_{i}^{H} & = - U_{i}^{ - 1} (H_{i} )\nu_{i} \sin \Phi_{i} \mathop \Sigma \limits_{j = 1}^{n} [c_{ij} U_{j}^{ - 1} (H_{j} )\nu_{j} \sin \Phi_{j} ], \\ F_{i}^{\Phi } & = \frac{{ - \nu_{i} (\cos \Phi_{i} + r_{i} )}}{{g[U_{i}^{ - 1} (H_{i} ) + B_{i} ](1 + r_{i} )}}\mathop \Sigma \limits_{j = 1}^{n} [c_{ij} U_{j}^{ - 1} (H_{j} )\nu_{j} \sin \Phi_{j} ], \\ G_{ik}^{H} & = - U_{i}^{ - 1} (H_{i} )\nu_{i} \sin \Phi_{i} f_{ik} ,\quad G_{ik}^{\Phi } = \frac{{ - \nu_{i} (\cos \Phi_{i} + r_{i} )}}{{g[U_{i}^{ - 1} (H_{i} ) + B_{i} ](1 + r_{i} )}}f_{ik} , \\ r_{i} & = \frac{{g[ - U_{i}^{ - 1} (H_{i} ) + B_{i} ] + g[U_{i}^{ - 1} (H_{i} ) + B_{i} ]}}{{g[ - U_{i}^{ - 1} (H_{i} ) + B_{i} ] - g[U_{i}^{ - 1} (H_{i} ) + B_{i} ]}}, \\ \nu_{i} (H_{i} ,\Phi_{i} ) & = \omega_{0i} (H_{i} ) + \mathop \Sigma \limits_{s = 2,4,6, \cdots }^{\infty } \omega_{si} (H_{i} )\cos s\Phi_{i} . \\ \end{aligned}$$
(10)

Consider the case in which all mean frequencies \(\omega_{0i} (H_{i} )\;(i = 1,2, \ldots ,n)\) fall within the domain where the power spectral densities \(S_{k} (\omega )\;(k = 1,2, \ldots ,m)\) remain relatively flat (see Fig. 2); the stochastic averaging method for quasi-integrable Hamiltonian systems under wideband noise excitation can then be applied to system (9) [19, 28]. It is observed from Eq. (9) that, in the non-resonant case, \({\mathbf{H}}\) is a slowly varying vector process and \({{\varvec{\Phi}}}\) is a rapidly varying vector process. According to the Stratonovich–Khasminskii limit theorem [29, 30], \({\mathbf{H}}(t)\) in Eq. (9) converges weakly to an \(n\)-dimensional Markov diffusion process as \(\varepsilon \to 0\). The governing averaged Itô equations are of the form

$${\text{d}}H_{i} = \varepsilon a_{i} ({\mathbf{H}}){\text{d}}t + \varepsilon^{1/2} \mathop \Sigma \limits_{k = 1}^{m} \sigma_{ik} ({\mathbf{H}}){\text{d}}B_{k} (t),\quad i = 1,2, \ldots ,n.$$
(11)

where \(B_{1} (t),B_{2} (t), \ldots ,B_{m} (t)\) are independent standard Wiener processes. The drift coefficient functions \(a_{i}\) and diffusion coefficient functions \(b_{ij}\) are given by

$$\begin{aligned} a_{i} ({\mathbf{H}}) & = \left\langle {F_{i}^{H} + \mathop \Sigma \limits_{k,l = 1}^{m} \int_{ - \infty }^{0} {\mathop \Sigma \limits_{j = 1}^{n} \left( {\left. {\frac{{\partial G_{ik}^{H} }}{{\partial H_{j} }}} \right|_{t} \left. {G_{jl}^{H} } \right|_{t + \tau } { + }\left. {\frac{{\partial G_{ik}^{H} }}{{\partial \Phi_{j} }}} \right|_{t} \left. {G_{jl}^{\Phi } } \right|_{t + \tau } } \right)} R_{kl} (\tau ){\text{d}}\tau } \right\rangle_{t} , \\ b_{ij} ({\mathbf{H}}) & = \mathop \Sigma \limits_{k = 1}^{m} \sigma_{ik} \sigma_{jk} = \left\langle {\mathop \Sigma \limits_{k,l = 1}^{m} \int_{ - \infty }^{\infty } {(\left. {G_{ik}^{H} } \right|_{t} \left. {G_{jl}^{H} } \right|_{t + \tau } )} R_{kl} (\tau ){\text{d}}\tau } \right\rangle_{t} , \\ i,j & = 1,2, \ldots ,n, \\ \end{aligned}$$
(12)

where \(R_{kl} (\tau )\) is the cross-correlation function of fGn; \(\langle \cdot \rangle_{t}\) denotes the following averaging operation

$$\langle [ \cdot ]\rangle_{t} = \mathop {\lim }\limits_{T \to \infty } \frac{1}{T}\int_{0}^{T} {[ \cdot ]} {\text{d}}t = \frac{1}{{(2\pi )^{n} }}\int_{0}^{2\pi } \cdots \int_{0}^{2\pi } {[ \cdot ]{\text{d}}\Phi_{1} \cdots {\text{d}}\Phi_{n} } .$$
(13)

Traditionally, obtaining the explicit expressions for \(a_{i}\) and \(b_{ij}\) involves expanding the terms in Eq. (10) into Fourier series with respect to \(\Phi_{i}\), integrating with respect to \(\tau\), and then averaging according to Eq. (13). This mathematical processing is extremely complex. For general systems, especially those with MDOF and strong nonlinearity, it is often impossible to derive accurate expressions for \(a_{i}\) and \(b_{ij}\). This complexity poses significant challenges for subsequent stability analysis. Therefore, this paper proposes using a backpropagation neural network (BPNN) to determine the values of these coefficients. This approach, combined with stochastic averaging, is more efficient in obtaining the averaged SDEs, i.e., a lower-dimensional approximation of the original system.

3.2 BP neural network

Figure 3 depicts the error backpropagation neural network (BPNN) [25], which consists of an input layer (\(n\) nodes), a hidden layer (\(l\) nodes), and an output layer (\(m\) nodes). In this network, the input data \(A_{1} ,A_{2} , \ldots ,A_{n}\) are the \(H_{i}\), i.e., the motion variables of the averaged SDEs in Eq. (11). The output data \(Z_{1} ,Z_{2} , \ldots ,Z_{m}\) correspond to the drift and diffusion coefficients defined in Eq. (12). \(C_{1} ,C_{2} , \ldots ,C_{l}\) are the outputs of the hidden layer. The \(w_{ji} \;(j = 1,2, \ldots ,l;\;i = 1,2, \ldots ,n)\) and \(v_{kj} \;(k = 1,2, \ldots ,m;\;j = 1,2, \ldots ,l)\) are the connection weights between the three layers. The \(\theta_{1} ,\theta_{2} , \ldots ,\theta_{l}\) and \(\gamma_{1} ,\gamma_{2} , \ldots ,\gamma_{m}\) denote the offsets of the hidden layer and output layer, respectively.

Fig. 3

The structure of BPNN (back propagation neural network)

Training data for \(H_{i}\) and for the drift and diffusion coefficients, extracted from the motion states \(({\mathbf{Q}},{\mathbf{P}})\) of the original system (1) according to Eqs. (5) and (12), enter the network through the input layer. The mean-squared error is employed as the loss function, i.e.,

$$L_{f} = \frac{1}{2}\sum\limits_{s = 1}^{S} {\sum\limits_{k = 1}^{m} {(T_{k} - Z_{k} )^{2} } } ,$$
(14)

where \(T_{k}\) denotes the target data at the output layer and \(S\) is the total number of training samples. The network learns iteratively until the loss function \(L_{f}\) reaches its minimum. During learning, the adjustment values \(\Delta v_{kj}\) and \(\Delta w_{ji}\) are determined using the gradient descent method. That is, for the adjustment value \(\Delta v_{kj}\) of the output layer,

$$\begin{aligned} \Delta v_{kj} & = - \alpha \frac{{\partial L_{f} }}{{\partial v_{kj} }} = - \alpha \frac{{\partial L_{f} }}{{\partial Z_{k} }}\frac{{\partial Z_{k} }}{{\partial {\text{net}}_{k} }}\frac{{\partial {\text{net}}_{k} }}{{\partial v_{kj} }} \\ & = \alpha (T_{k} - Z_{k} )f^{\prime}({\text{net}}_{k} )C_{j} , \\ \end{aligned}$$
(15)

where \(\alpha\) is a constant (the learning rate) that controls the adjustment level; \({\text{net}}_{k} = \sum\nolimits_{j = 1}^{l} {v_{kj} C_{j} + \gamma_{k} }\); \(f( \cdot )\) and \(f^{\prime}( \cdot )\) are the activation function and its derivative. For the adjustment value \(\Delta w_{ji}\) of the hidden layer,

$$\begin{aligned} \Delta w_{ji} & = - \beta \frac{{\partial L_{f} }}{{\partial w_{ji} }} = - \beta \frac{{\partial L_{f} }}{{\partial Z_{k} }}\frac{{\partial Z_{k} }}{{\partial {\text{net}}_{k} }}\frac{{\partial {\text{net}}_{k} }}{{\partial C_{j} }}\frac{{\partial C_{j} }}{{\partial {\text{net}}_{j} }}\frac{{\partial {\text{net}}_{j} }}{{\partial w_{ji} }} \\ & = \left[ {\sum\limits_{k = 1}^{m} {(T_{k} - Z_{k} )f^{\prime}({\text{net}}_{k} )v_{kj} } } \right]f^{\prime}({\text{net}}_{j} )A_{i} . \\ \end{aligned}$$
(16)

where the constant \(\beta\) controls the adjustment level and \({\text{net}}_{j} = \sum\nolimits_{i = 1}^{n} {w_{ji} A_{i} + \theta_{j} }\).

Thus, the process of deep learning based on the BPNN can be roughly divided into three steps. Firstly, compute the outputs of the hidden layer and the output layer,

$$\begin{gathered} C_{j} = f\left( {\sum\limits_{i = 1}^{n} {w_{ji} A_{i} + \theta_{j} } } \right),\quad j = 1,2, \ldots ,l, \hfill \\ Z_{k} = f\left( {\sum\limits_{j = 1}^{l} {v_{kj} C_{j} } + \gamma_{k} } \right),\quad k = 1,2, \ldots ,m, \hfill \\ \end{gathered}$$
(17)

where the activation function is the sigmoid function \(f(x) = 1/(1 + \exp ( - x))\). Secondly, noting that \(f^{\prime}(x) = f(x)[1 - f(x)]\), the error \(\delta_{k}\) at the output layer and the error \(\sigma_{j}\) at the hidden layer are

$$\begin{aligned} \delta_{k} & = (T_{k} - Z_{k} )Z_{k} (1 - Z_{k} ), \\ \sigma_{j} & = C_{j} (1 - C_{j} )\sum\limits_{k = 1}^{m} {v_{kj} \delta_{k} } , \\ k & = 1,2, \ldots ,m,\quad j = 1,2, \ldots ,l. \\ \end{aligned}$$
(18)

Finally, adjust the connection weights \(w_{ji} ,v_{kj}\) and offsets \(\gamma_{k} ,\theta_{j}\):

$$\begin{aligned} v_{kj} & = v_{kj} + \alpha \delta_{k} C_{j} ,\quad \gamma_{k} = \gamma_{k} + \alpha \delta_{k} , \\ w_{ji} & = w_{ji} + \beta \sigma_{j} A_{i} ,\quad \theta_{j} = \theta_{j} + \beta \sigma_{j} , \\ i & = 1,2, \ldots ,n,\quad j = 1,2, \ldots ,l,\quad k = 1,2, \ldots ,m. \\ \end{aligned}$$
(19)

Once learning is complete, the network retains the learned results. Consequently, when input variables \(H_{i}\) are introduced, even if they are not part of the training data, the output layer can still provide reasonable values for the drift and diffusion coefficients.
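The update rules of Eqs. (14)-(19) translate directly into code. The following is a minimal NumPy sketch of the network described above; the layer sizes, learning rate, and initialization scale are illustrative choices, and because the output layer is sigmoid, targets such as the drift and diffusion coefficients must be rescaled into (0, 1) before training and mapped back afterwards.

```python
import numpy as np

class BPNN:
    """Single-hidden-layer backpropagation network per Eqs. (17)-(19):
    sigmoid activations and per-sample gradient descent."""

    def __init__(self, n_in, n_hidden, n_out, lr=0.1, rng=None):
        rng = np.random.default_rng() if rng is None else rng
        self.w = rng.normal(0.0, 0.5, (n_hidden, n_in))   # w_ji
        self.v = rng.normal(0.0, 0.5, (n_out, n_hidden))  # v_kj
        self.theta = np.zeros(n_hidden)                   # hidden offsets
        self.gamma = np.zeros(n_out)                      # output offsets
        self.lr = lr                                      # alpha = beta = lr

    @staticmethod
    def _sig(x):
        return 1.0 / (1.0 + np.exp(-x))

    def forward(self, a):
        c = self._sig(self.w @ a + self.theta)            # Eq. (17), hidden
        z = self._sig(self.v @ c + self.gamma)            # Eq. (17), output
        return c, z

    def train_step(self, a, t):
        c, z = self.forward(a)
        delta = (t - z) * z * (1.0 - z)                   # Eq. (18), output error
        sigma = c * (1.0 - c) * (self.v.T @ delta)        # Eq. (18), hidden error
        self.v += self.lr * np.outer(delta, c)            # Eq. (19) updates
        self.gamma += self.lr * delta
        self.w += self.lr * np.outer(sigma, a)
        self.theta += self.lr * sigma
        return 0.5 * np.sum((t - z)**2)                   # one term of Eq. (14)
```

Training loops over the (H, coefficient) samples prepared as in Sect. 5 until \(L_{f}\) stops decreasing.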

4 The largest Lyapunov exponent

By employing the data-driven stochastic averaging method, the original system (1) can be simplified into the averaged SDEs (11). Subsequently, the maximum Lyapunov exponent method can be utilized to analyze the stochastic stability.

Linearizing Eq. (11) at \(H = 0\) yields

$$\begin{aligned} {\text{d}}H_{r} & = F_{r} ({\mathbf{H}}){\text{d}}t + \sum\limits_{k = 1}^{m} {G_{rk} ({\mathbf{H}})} {\text{d}}B_{k} (t), \\ r & = 1,2,3,...,n,\quad k = 1,2,...,m. \\ \end{aligned}$$
(20)

The coefficients of Eq. (20) satisfy the following conditions:

$$\begin{gathered} \mathop {\lim }\limits_{{\left| {\mathbf{H}} \right| \to 0}} F_{r} ({\mathbf{H}}) = 0,\quad \mathop {\lim }\limits_{{\left| {\mathbf{H}} \right| \to 0}} G_{rk} ({\mathbf{H}}) = 0, \hfill \\ kF_{r} ({\mathbf{H}}) = F_{r} (k{\mathbf{H}}),\quad kG_{rk} ({\mathbf{H}}) = G_{rk} (k{\mathbf{H}}), \hfill \\ \left( {G({\mathbf{H}})G({\mathbf{H}})^{T} {{\varvec{\alpha}}},{{\varvec{\alpha}}}} \right) \ge c\left| {\mathbf{H}} \right|^{2} \left| {{\varvec{\alpha}}} \right|^{2} . \hfill \\ \end{gathered}$$
(21)

Traditionally, the coefficients \(F_{r} ({\mathbf{H}})\) and \(G_{rk} ({\mathbf{H}})\) are extremely difficult to obtain because the coefficients in Eq. (12) usually have no exact expressions and contain multiple integrals. After applying the BPNN method of Sect. 3.2 to obtain the averaged SDEs, however, the two coefficients in Eq. (20) are easy to obtain.

Under the conditions of Eq. (21), the largest Lyapunov exponent of the averaged SDEs (20) can be obtained using a procedure similar to that of Khasminskii [26]. Introduce the following new variables:

$$\begin{aligned} \rho & = \frac{1}{2}\ln H \\ \alpha_{r} & = H_{r} /H,\quad r = 1,2,3...,n. \\ \end{aligned}$$
(22)

The Itô equations for \(\rho\) and \(\alpha_{r}\) can be derived from Eq. (20) using the Itô differential rule:

$${\text{d}}\rho = Q(\alpha ){\text{d}}t + \frac{1}{2}\sum\limits_{k = 1}^{m} {\sum\limits_{s = 1}^{n} {G_{sk} (\alpha )} } \,{\text{d}}B_{k} (t),$$
(23)
$${\text{d}}\alpha_{r} = m_{r} (\alpha ){\text{d}}t + \sum\limits_{k = 1}^{m} {\sigma_{rk} (\alpha )} {\text{d}}B_{k} (t),$$
(24)

where

$$\begin{aligned} Q(\alpha ) & = \frac{1}{2}\sum\limits_{s = 1}^{n} {F_{s} (\alpha )} - \frac{1}{4}\sum\limits_{{s,s^{\prime} = 1}}^{n} {\sum\limits_{k = 1}^{m} {G_{sk} (\alpha )G_{{s^{\prime}k}} (\alpha )} } \\ m_{r} (\alpha ) & = - \alpha_{r} \sum\limits_{s = 1}^{n} {F_{s} (\alpha )} + F_{r} (\alpha ) + \alpha_{r} \sum\limits_{{s,s^{\prime} = 1}}^{n} {\sum\limits_{k = 1}^{m} {G_{sk} (\alpha )G_{{s^{\prime}k}} (\alpha )} } \\ & \quad - \frac{1}{2}\sum\limits_{s = 1}^{n} {\sum\limits_{k = 1}^{m} {G_{rk} (\alpha )G_{sk} (\alpha )} } - \frac{1}{2}\sum\limits_{k = 1}^{m} {G_{rk}^{2} (\alpha )} \\ \sigma_{rk}^{{}} (\alpha ) & = G_{rk} (\alpha ) - \alpha_{r} \sum\limits_{s = 1}^{n} {G_{sk} (\alpha )} . \\ \end{aligned}$$
(25)

Note that \(\sum\nolimits_{r = 1}^{n} {\alpha_{r} } = 1\), so only n − 1 of the equations in Eq. (24) are independent. Let \(\alpha^{\prime} = [\alpha_{1} ,\alpha_{2} ,...,\alpha_{n - 1} ]^{{\text{T}}}\) be the (n − 1)-dimensional vector diffusion process, with \(\alpha_{n}\) replaced by \(\alpha_{n} = 1 - \sum\nolimits_{r = 1}^{n - 1} {\alpha_{r} }\).

Define the Lyapunov exponent of the linearized averaged system (20) as the asymptotic rate of exponential growth of \(H^{1/2}\):

$$\lambda = \mathop {\lim }\limits_{t \to \infty } \frac{1}{2t}\ln H.$$
(26)

Integrating Eq. (23) from 0 to t and dividing by t gives

$$\frac{1}{2t}\ln H(t) = \frac{1}{2t}\ln H(0) + \frac{1}{t}\int\limits_{0}^{t} {Q\left[ {\alpha^{\prime}(\tau )} \right]} {\text{d}}\tau + \frac{1}{2t}\int_{0}^{t} {\sum\limits_{s = 1}^{n} {\sum\limits_{k = 1}^{m} {G_{sk} \left[ {\alpha^{\prime}(\tau )} \right]} } } \,{\text{d}}B_{k} (\tau ).$$
(27)

As \(t \to \infty\), the first and third terms on the right-hand side of Eq. (27) approach 0. Thus,

$$\lambda = \mathop {\lim }\limits_{t \to \infty } \frac{1}{t}\int_{0}^{t} {Q\left[ {\alpha^{\prime}(\tau )} \right]} {\text{d}}\tau .$$
(28)

Assume \(\alpha^{\prime}\) is an ergodic diffusion process in the region \(0 < \left\| {\alpha^{\prime}} \right\| < 1\). According to the ergodic theorem [31], the Lyapunov exponent in Eq. (28) converges to the largest Lyapunov exponent, i.e.,

$$\lambda_{\max } = \mathop {\lim }\limits_{t \to \infty } \frac{1}{t}\int_{0}^{t} {Q\left[ {\alpha^{\prime}(\tau )} \right]} {\text{d}}\tau = E\left[ {Q(\alpha^{\prime})} \right] = \int {Q(\alpha^{\prime})p} (\alpha^{\prime}){\text{d}}\alpha^{\prime},$$
(29)

where \(p(\alpha^{\prime})\) is the stationary probability density of \(\alpha^{\prime}\), which can be obtained by solving the following stationary FPK equation associated with the first n − 1 equations of Eq. (24):

$$0 = - \sum\limits_{r = 1}^{n - 1} {\frac{\partial }{{\partial \alpha_{r} }}\left[ {m_{r} (\alpha^{\prime})p(\alpha^{\prime})} \right]} + \frac{1}{2}\sum\limits_{r,i = 1}^{n - 1} {\frac{{\partial^{2} }}{{\partial \alpha_{r} \partial \alpha_{i} }}\left[ {\sum\limits_{k = 1}^{m} {\sigma_{rk} \sigma_{ik} (\alpha^{\prime})} p(\alpha^{\prime})} \right]} .$$
(30)

The boundary conditions of Eq. (30) are

$$\begin{gathered} p = {\text{finite}}, \, \alpha_{r} = 0, \, \hfill \\ p,\partial p/\partial \alpha_{r} \to 0, \, \left| {\alpha^{\prime}} \right| \to \infty \hfill \\ r = 1,2,...,n - 1. \hfill \\ \end{gathered}$$
(31)

\(\lambda_{\max }\) in Eq. (29) is an approximation of the largest Lyapunov exponent of the original system (1) and can be utilized to study its asymptotic stability with probability one. The domain of asymptotic stability with probability one of system (1) in its parameter space is determined by \(\lambda_{\max } < 0\).
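As a consistency check on \(\lambda_{\max }\) from Eq. (29), the Lyapunov exponent of Eq. (20) can also be estimated directly from its definition, Eq. (26), by Euler-Maruyama simulation. A minimal sketch (drift and diffusion are passed as callables; step size, horizon, and the flooring of H are numerical assumptions):

```python
import numpy as np

def lyapunov_mc(F, G, H0, dt=1e-3, nsteps=200_000, rng=None):
    """Estimate lambda = lim (1/2t) ln H(t), Eq. (26), for
    dH_r = F_r(H) dt + sum_k G_rk(H) dB_k, Eq. (20).
    F(H) returns shape (n,); G(H) returns shape (n, m)."""
    rng = np.random.default_rng() if rng is None else rng
    H = np.asarray(H0, dtype=float).copy()
    for _ in range(nsteps):
        g = G(H)
        dB = rng.standard_normal(g.shape[1]) * np.sqrt(dt)
        H = H + F(H) * dt + g @ dB
        H = np.maximum(H, 1e-300)   # energies stay nonnegative; avoid log(0)
    return 0.5 * np.log(H.sum()) / (nsteps * dt)
```

Averaging the estimate over several independent runs reduces its variance; \(\lambda_{\max } < 0\) then indicates asymptotic stability with probability one.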

5 Examples

5.1 Example 1

Consider the asymptotic Lyapunov stability with probability one of two Duffing oscillators coupled by linear damping and subjected to parametric excitations of fGn. The motion equations of the system are

$$\begin{gathered} \ddot{X}_{1} + \beta_{11} \dot{X}_{1} + \beta_{12} \dot{X}_{2} + \omega_{1}^{2} X_{1} + \alpha_{1} X_{1}^{3} = \sqrt {2D_{1} } X_{1} W_{1}^{H} (t), \hfill \\ \ddot{X}_{2} + \beta_{21} \dot{X}_{1} + \beta_{22} \dot{X}_{2} + \omega_{2}^{2} X_{2} + \alpha_{2} X_{2}^{3} = \sqrt {2D_{2} } X_{2} W_{2}^{H} (t). \hfill \\ \end{gathered}$$
(32)

where \(\beta_{ij}\), \(\omega_{i}\), \(\alpha_{i}\), \((i = 1,2)\) are constants; \(W_{1}^{H} (t),W_{2}^{H} (t)\) are independent unit fGns with Hurst index \({\mathcal{H}}\) and with the PSD in Eq. (4); \(2D_{1}\) and \(2D_{2}\) modulate the excitation intensities of the fGns.

Letting \(X_{1} = q_{1}\), \(\dot{X}_{1} = p_{1}\), \(X_{2} = q_{2}\), and \(\dot{X}_{2} = p_{2}\), the original system (32) can be expressed in the form of the quasi-Hamiltonian system (1). The associated Hamiltonian is

$$\begin{aligned} H & = H_{1} + H_{2} ,\quad H_{i} = \frac{1}{2}p_{i}^{2} + U_{i} (q_{i} ), \\ U_{i} (q_{i} ) & = \frac{1}{2}\omega_{i}^{2} q_{i}^{2} + \frac{1}{4}\alpha_{i} q_{i}^{4} . \\ \end{aligned}$$
(33)

Assume that \(\omega_{1}\) and \(\omega_{2}\) are in the frequency domain where the power spectral densities \(S_{1} (\omega )\) and \(S_{2} (\omega )\) of \(W_{1}^{H} (t)\) and \(W_{2}^{H} (t)\) are relatively flat. By applying the stochastic averaging method introduced in Sect. 3, the following averaged SDEs governing the Hamiltonians \(H_{1} (t),H_{2} (t)\) can be obtained:

$$\begin{gathered} {\text{d}}H_{1} = \overline{m}_{1} ({\mathbf{H}}){\text{d}}t + \overline{\sigma }_{11} ({\mathbf{H}}){\text{d}}B_{11} (t) + \overline{\sigma }_{12} ({\mathbf{H}}){\text{d}}B_{12} (t), \hfill \\ {\text{d}}H_{2} = \overline{m}_{2} ({\mathbf{H}}){\text{d}}t + \overline{\sigma }_{21} ({\mathbf{H}}){\text{d}}B_{21} (t) + \overline{\sigma }_{22} ({\mathbf{H}}){\text{d}}B_{22} (t), \hfill \\ \end{gathered}$$
(34)

The drift and diffusion coefficient functions in Eq. (34) can be obtained using the stochastic averaging in Eq. (12). Two ways are employed to carry out the averaging: data-driven stochastic averaging, and the exact expressions with their numerical evaluation. For the data-driven method, take the drift and diffusion coefficients \(\overline{m}_{1} ({\mathbf{H}})\) and \(b_{11} ({\mathbf{H}})\) as examples:

$$\begin{aligned} \overline{m}_{1} ({\mathbf{H}}) & = \left\langle {D_{1} q_{1}^{2} - p_{1} (\beta_{11} p_{1} + \beta_{12} p_{2} )} \right\rangle_{t} , \\ b_{11} ({\mathbf{H}}) & = \sigma_{11}^{2} ({\mathbf{H}}) = \left\langle {2D_{1} p_{1}^{2} q_{1}^{2} } \right\rangle_{t} . \\ \end{aligned}$$
(35)

The motion states \((q_{1} ,q_{2} ,p_{1} ,p_{2} )\) are simulation data from the original system (32). The training data for \(\overline{m}_{1} ({\mathbf{H}})\) and \(b_{11} ({\mathbf{H}})\) are then determined by computing the statistics in Eq. (35).
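A sketch of this statistics step is given below; it bins on \(H_{1}\) alone, as in Fig. 4 (in general the targets depend on both \(H_{1}\) and \(H_{2}\)), and assumes the arrays q1, p1, q2, p2 come from simulating system (32).

```python
import numpy as np

def training_targets(q1, p1, q2, p2, beta11, beta12, D1,
                     omega1, alpha1, nbins=50):
    """Estimate m1_bar(H1) and b11(H1) per Eq. (35): evaluate the bracketed
    quantities along the trajectory and time-average them within narrow
    H1 bins (Eq. (33) gives H1 from the motion states)."""
    H1 = 0.5*p1**2 + 0.5*omega1**2*q1**2 + 0.25*alpha1*q1**4
    drift = D1*q1**2 - p1*(beta11*p1 + beta12*p2)
    diff  = 2.0*D1*p1**2*q1**2
    edges = np.linspace(0.0, np.quantile(H1, 0.99), nbins + 1)
    idx = np.digitize(H1, edges) - 1
    m1  = np.array([drift[idx == b].mean() if np.any(idx == b) else np.nan
                    for b in range(nbins)])
    b11 = np.array([diff[idx == b].mean() if np.any(idx == b) else np.nan
                    for b in range(nbins)])
    centers = 0.5*(edges[:-1] + edges[1:])
    return centers, m1, b11
```

The resulting (centers, m1, b11) triples are the training pairs fed to the BPNN of Sect. 3.2.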

The exact expressions for the drift and diffusion coefficients in Eq. (34) are

$$\begin{aligned} \overline{m}_{i} ({\mathbf{H}}) & = \left. {[(\omega_{i}^{2} A_{i} + \alpha_{i} A_{i}^{3} )m_{i} ({\mathbf{A}}) + \frac{1}{2}(\omega_{i}^{2} + 3\alpha_{i} A_{i}^{2} )b_{ii}^{2} ({\mathbf{A}})]} \right|_{{A_{i} = U_{i}^{ - 1} (H_{i} )}} , \\ \overline{b}_{ii} ({\mathbf{H}}) & = \left. {[(\omega_{i}^{2} A_{i} + \alpha_{i} A_{i}^{3} )^{2} b_{ii}^{2} ({\mathbf{A}})]} \right|_{{A_{i} = U_{i}^{ - 1} (H_{i} )}} , \\ \overline{b}_{12} ({\mathbf{H}}) & = \overline{b}_{21} ({\mathbf{H}}) = 0. \\ \end{aligned}$$
(36)

where

$$\begin{aligned} {m_i}({\bf{A}}) & = - \frac{{A_i^2}}{{8{g_i}}}{\beta _{ii}}(4w_i^2 + 5{\alpha _i}A_i^2/2) + \frac{{\pi f_i^2A_i^2}}{{32{g_i}}} \\ & \quad \bigg\{(2{b_{0i}} - {b_{4i}})\left[ {\frac{{\rm{d}}}{{{\rm{d}}{A_i}}}\left( {\frac{{A_i^2(2{b_{0i}} - {b_{4i}})}}{{{g_i}}}} \right) + \frac{{2A_i^{}}}{{{g_i}}}(2{b_{0i}} + 2{b_{2i}} + {b_{4i}})} \right]{S_i}(2{w_i}) \\ & \quad + (2{b_{2i}} - {b_{6i}})\left[ {\frac{{\rm{d}}}{{{\rm{d}}{A_i}}}\left( {\frac{{A_i^2(2{b_{2i}} - {b_{6i}})}}{{{g_i}}}} \right) + \frac{{4A_i^{}}}{{{g_i}}}({b_{2i}} + 2{b_{4i}} + {b_{6i}})} \right]{S_i}(4{w_i}) \\ & \quad + {b_{4i}}\left[ {\frac{{\rm{d}}}{{{\rm{d}}{A_i}}}\left( {\frac{{A_i^2{b_{4i}}}}{{{g_i}}}} \right) + \frac{{6A_i^{}}}{{{g_i}}}({b_{4i}} + 2{b_{6i}})} \right]{S_i}(6{w_i}) \\ & \quad + {b_{6i}}\left[ {\frac{{\rm{d}}}{{{\rm{d}}{A_i}}}\left( {\frac{{A_i^2{b_{6i}}}}{{{g_i}}}} \right) + \frac{{8A_i^{}}}{{{g_i}}}{b_{6i}}} \right]{S_i}(8{w_i})\bigg\} \\ b_{ii} ({\bf{A}}) & = \frac{{\pi f_i^2A_i^4}}{{16g_i^2}}\left[ {{\rm{ }}{{(2{b_{0i}} - {b_{4i}})}^2}{S_i}(2{w_i}) + {{({b_{2i}} - {b_{6i}})}^2}{S_i}(4{w_i}) + {b_{4i}}^2{S_i}(6{w_i}) + {b_{6i}}^2{S_i}(8{w_i})} \right], \\ b_{12}^{}({\bf{A}}) & = b_{21}^{}({\bf{A}}) = 0, \quad i = 1,2. \end{aligned}$$
(37)

It is seen that calculating the drift and diffusion coefficients in Eq. (36) analytically is difficult; even calculating them numerically is difficult. Therefore, the data-driven method based on the BPNN described in Sect. 3.2 is used to obtain the drift and diffusion coefficients.

To demonstrate the effectiveness of the data-driven method, Fig. 4 takes the coefficients \(\overline{m}_{1} ({\mathbf{H}})\) and \(b_{11} ({\mathbf{H}})\) as examples, showing the results from three different sources: deep learning of Eq. (34), numerical calculation of Eq. (36), and Monte Carlo simulation of the original system (32). It can be seen that the deep learning results agree very well with the other two results when \(H_{1}\) is small. For large \(H_{1}\), however, the deviation becomes large. For the drift coefficient \(\overline{m}_{1} ({\mathbf{H}})\), the relative error with respect to the exact solution is 6%, 35%, and undefined (NaN) at \(H_{1} = 0.5\), \(H_{1} = 1\), and \(H_{1} = 3\), respectively; for the diffusion coefficient \(b_{11} ({\mathbf{H}})\), the relative error is 8%, 10%, and 14% at the same \(H_{1}\) values. Since the stability analysis in this paper only requires the two coefficients linearized near \(H = 0\), the relative error at small \(H\) satisfies the requirements.

Fig. 4

a Coefficient \(\overline{m}_{1} (H_{1} )\). b Coefficient \(b_{11} (H_{1} )\) obtained from three different sources

For the study of stochastic stability, a large number of \(H\) values are concentrated near the equilibrium point (zero), so the amount of training data at small \(H\) is much larger than that at large \(H\). Given this property of the system, training data concentrated at small \(H\) values are more conducive to analyzing system stability.

Then, linearizing Eq. (34) at \(H = 0\) yields

$$\begin{gathered} {\text{d}}H_{1} = F_{1} (H_{1} ){\text{d}}t + G_{11} (H_{1} ){\text{d}}B_{1} (t), \hfill \\ {\text{d}}H_{2} = F_{2} (H_{2} ){\text{d}}t + G_{22} (H_{2} ){\text{d}}B_{2} (t), \hfill \\ \end{gathered}$$
(38)

The coefficients \(F_{1} (H_{1} )\), \(F_{2} (H_{2} )\), \(G_{11} (H_{1} )\), and \(G_{22} (H_{2} )\) in Eq. (38) can easily be obtained from the averaged SDEs once the coefficient values are determined.
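For instance, since Eq. (21) makes the drift homogeneous of degree one and the diffusion \(b_{11}\) of degree two near \(H = 0\), \(F_{1}\) and \(G_{11}\) can be read off from the BPNN outputs by least squares on a small-H window. A sketch (the window hmax is an assumed tuning choice):

```python
import numpy as np

def linearized_coeffs(H1, m1, b11, hmax=0.2):
    """Fit F1 and G11 of Eq. (38) from BPNN-predicted coefficients near
    H = 0: drift ~ F1*H1 and diffusion b11 ~ G11^2*H1^2."""
    mask = (H1 > 0) & (H1 < hmax)
    h, m, b = H1[mask], m1[mask], b11[mask]
    F1 = np.sum(h*m) / np.sum(h*h)                    # slope through origin
    G11 = np.sqrt(np.sum(h**2 * b) / np.sum(h**4))    # least-squares G11^2
    return F1, G11
```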

Using the method described in Sect. 4, the SDEs for \(\rho\) and \(\alpha_{1}\) can be obtained:

$$\begin{gathered} {\text{d}}\rho = Q(\alpha_{1} ){\text{d}}t + \sigma (\alpha_{1} ){\text{d}}B(t) \hfill \\ {\text{d}}\alpha_{1} = m_{1} (\alpha_{1} ){\text{d}}t + \sigma_{1} (\alpha_{1} ){\text{d}}B(t), \hfill \\ \end{gathered}$$
(39)

where

$$\begin{aligned} Q(\alpha_{1} ) & = \frac{1}{2}[F_{1} \alpha_{1} + F_{2} (1 - \alpha_{1} )] - \frac{1}{4}[G_{11}^{2} \alpha_{1}^{2} + G_{22}^{2} (1 - \alpha_{1} )^{2} ], \\ m_{1} (\alpha_{1} ) & = (F_{1} - F_{2} )\alpha_{1} (1 - \alpha_{1} ) - G_{11}^{2} \alpha_{1}^{2} (1 - \alpha_{1} ) + G_{22}^{2} \alpha_{1} (1 - \alpha_{1} )^{2} , \\ \sigma_{1}^{2} (\alpha_{1} ) & = (G_{11}^{2} + G_{22}^{2} )\alpha_{1}^{2} (1 - \alpha_{1} )^{2} . \\ \end{aligned}$$
(40)

Solving the FPK equation corresponding to the Itô equation of \(\alpha_{1} (t)\) yields the stationary probability density \(p(\alpha_{1} )\), which can be written (up to a normalization constant) as

$$p(\alpha_{1} ) = \frac{1}{{\sigma_{1}^{2} (\alpha_{1} )}}\exp \left( {\int\limits_{0}^{{\alpha_{1} }} {\frac{{2m_{1} (u)}}{{\sigma_{1}^{2} (u)}}{\text{d}}u} } \right).$$
(41)

Then, the largest Lyapunov exponent is calculated as

$$\lambda_{\max } = E[Q(\alpha_{1} )] = \int {Q(\alpha_{1} )} p(\alpha_{1} ){\text{d}}\alpha_{1} .$$
(42)
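Given the linearized coefficients, \(\lambda_{\max }\) follows from Eqs. (40)-(42) by one-dimensional quadrature. A minimal sketch (Eq. (41) is used up to its normalization constant, which is fixed numerically; the grid size and endpoint cutoff are assumptions, to be refined if \(p(\alpha_{1} )\) is singular at 0 or 1):

```python
import numpy as np

def lambda_max_2dof(F1, F2, G11, G22, npts=200_001, eps=1e-5):
    """Evaluate Eqs. (40)-(42): build Q, m1, sigma1^2 on a grid over
    (0, 1), form p(alpha1) from Eq. (41), normalize it, and return the
    quadrature of Q*p."""
    a = np.linspace(eps, 1.0 - eps, npts)
    Q  = 0.5*(F1*a + F2*(1 - a)) - 0.25*(G11**2*a**2 + G22**2*(1 - a)**2)
    m1 = (F1 - F2)*a*(1 - a) - G11**2*a**2*(1 - a) + G22**2*a*(1 - a)**2
    s2 = (G11**2 + G22**2)*a**2*(1 - a)**2
    h = np.diff(a)
    f = 2.0*m1/s2                              # exponent integrand in Eq. (41)
    expo = np.concatenate([[0.0], np.cumsum(0.5*(f[1:] + f[:-1])*h)])
    p = np.exp(expo - expo.max()) / s2         # unnormalized density
    trap = lambda y: np.sum(0.5*(y[1:] + y[:-1])*h)
    p /= trap(p)
    return trap(Q*p)
```

A negative return value places the chosen \((\beta_{11} ,\beta_{22} )\) pair inside the stable regions plotted in Figs. 5 and 6.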

Figures 5 and 6 show the stability boundaries of system (32) in the coefficient plane \((\beta_{11} ,\beta_{22} )\) with system parameters \(\beta_{12} = 0.2\), \(\beta_{21} = 0.2\), \(\omega_{1} = 1.414\), \(\omega_{2} = 1\), \(\alpha_{1} = 1\), \(\alpha_{2} = 0.6\). The analytical results show good agreement with the corresponding simulated results from the original system (32). Figure 5 illustrates that the stable region grows as the stochastic excitation intensity D decreases. Figure 6 demonstrates that, at the same excitation intensity, the stability region of the system under fGn shrinks as the Hurst index \({\mathcal{H}}\) approaches 1/2. This implies that fGn excitation is weaker than Gaussian white noise excitation of the same intensity. Additionally, it confirms that as the Hurst index \({\mathcal{H}}\) increases, the randomness of fGn gradually weakens (see Fig. 1). Figure 7 illustrates that the largest Lyapunov exponent calculated from Eq. (42) agrees well with the simulated results from the original system (32) over a wide range of Hurst indices \({\mathcal{H}}\), provided that the natural frequencies \(\omega_{1}\), \(\omega_{2}\) of the system are larger than certain values, e.g., larger than 0.6, which is consistent with findings in the literature [19].

Fig. 5

Region of asymptotic Lyapunov stability with probability one in the plane (\(\beta_{11} ,\beta_{22}\)) with different noise intensities. Solid lines denote the analytical results; symbols o denote the results from Monte Carlo simulation of the original system. The parameter is \({\mathcal{H}} = 0.7\)

Fig. 6

Region of asymptotic Lyapunov stability with probability one in the plane (\(\beta_{11} ,\beta_{22}\)) with different Hurst indices \({\mathcal{H}}\). Solid lines denote the analytical results; symbols o denote the results from Monte Carlo simulation of the original system. The parameters are \(D_{1} = 0.4\), \(D_{2} = 0.4\)

Fig. 7

The largest Lyapunov exponent of the system with different Hurst indices \({\mathcal{H}}\) and varying natural frequencies \(\omega_{1}\), \(\omega_{2}\). Solid lines denote the analytical results; symbols o denote the results from Monte Carlo simulation of the original system. The parameters are \(D_{1} = 0.3\), \(D_{2} = 0.3\)

5.2 Example 2

To demonstrate the applicability of the proposed procedure to MDOF systems, consider a general 4-DOF system with coupled nonlinear damping under parametric excitation of fGn. The system equations can be expressed as

$$\ddot{X}_{i} + \beta_{i} ({\mathbf{X}},{\dot{\mathbf{X}}}) + \omega_{i}^{2} X_{i} = X_{i} W_{i}^{H} (t),\quad i = 1,2,3,4,$$
(43)

where \(\beta_{i} ({\mathbf{X}},{\dot{\mathbf{X}}})\) are the nonlinear coupling damping forces and \(W_{i}^{H} (t)\) are independent unit fGns with Hurst index \({\mathcal{H}}\) and with the PSD in Eq. (4).

Letting \(Q_{i} = X_{i} ,\;P_{i} = \dot{X}_{i}\), the original system (43) can be transformed into a quasi-integrable Hamiltonian system of the form of Eq. (1), and the associated Hamiltonian is

$$H = \sum {H_{i} } ,\quad H_{i} = \frac{1}{2}P_{i}^{2} + \frac{1}{2}\omega_{i}^{2} Q_{i}^{2} ,\quad i = 1,2,3,4.$$
(44)

By applying the stochastic averaging method introduced in Sect. 3.1, the following averaged SDEs governing the Hamiltonians \(H_{i} (t)\) can be obtained:

$${\text{d}}H_{i} = m_{i} ({\mathbf{H}}){\text{d}}t + \sigma_{i} ({\mathbf{H}}){\text{d}}B_{i} (t),\quad i = 1,2,3,4.$$
(45)

The drift and diffusion coefficient functions in Eq. (45) can be obtained using the stochastic averaging in Eq. (12). As in Example 1, both the data-driven stochastic averaging and the exact expressions can be employed. For the data-driven method, take the drift and diffusion coefficients \(\overline{m}_{1} ({\mathbf{H}})\) and \(b_{11} ({\mathbf{H}})\) as examples:

$$\begin{gathered} \overline{m}_{1} ({\mathbf{H}}) = \left\langle {D_{1} q_{1}^{2} - p_{1} \beta_{1} ({\mathbf{q}},{\mathbf{p}})} \right\rangle_{t} , \hfill \\ b_{11} ({\mathbf{H}}) = \sigma_{1}^{2} ({\mathbf{H}}) = \left\langle {2D_{1} p_{1}^{2} q_{1}^{2} } \right\rangle_{t} . \hfill \\ \end{gathered}$$
(46)

The motion states \(({\mathbf{q}},{\mathbf{p}})\) are simulation data from the original system (43). The training data for \(\overline{m}_{1} ({\mathbf{H}})\) and \(b_{11} ({\mathbf{H}})\) are then determined by computing the statistics in Eq. (46).

For the exact expressions, the coefficients \(m_{i} ({\mathbf{H}})\) and \(\sigma_{i} ({\mathbf{H}})\) are

$$\begin{aligned} m_{i} ({\mathbf{H}}) & = \frac{{D_{i} H_{i} }}{{\omega_{i}^{2} }} + \sqrt {2H_{i} } \left\langle {\beta_{i} ({\mathbf{Q}},{\mathbf{P}})\sin \Phi_{i} } \right\rangle_{t} , \\ b_{ii} ({\mathbf{H}}) & = \sigma_{i} \sigma_{i} = \frac{{D_{i} H_{i}^{2} }}{{\omega_{i}^{2} }}, \\ \end{aligned}$$
(47)

where \(D_{i} = \pi S(\omega_{i} )\) and \(\left\langle \cdot \right\rangle_{t} = \frac{1}{{(2\pi )^{n} }}\int_{0}^{2\pi } { \cdots \int_{0}^{2\pi } {( \cdot ){\text{d}}\Phi_{1} \cdots {\text{d}}\Phi_{n} } }\).

Then, linearizing the coefficients in Eq. (47) at \(H = 0\) yields

$${\text{d}}H_{i} = a_{i} H_{i} {\text{d}}t + \frac{{\sqrt {D_{i} } }}{{\omega_{i} }}H_{i} {\text{d}}B_{i} (t),$$
(48)

where the linearized drift coefficients are collected in \({\mathbf{A}} = [a_{1} ,a_{2} , \ldots ,a_{n} ]\) and \({\mathbf{H}} = [H_{1} ,H_{2} , \ldots ,H_{n} ]^{{\text{T}}}\).

The strategy proposed in this paper is to obtain the coefficient values of the averaged SDEs directly from the training data of the original system (43) using the data-driven method based on the BPNN described in Sect. 3.2. The averaged SDEs are then linearized, and the largest Lyapunov exponent is calculated with the method presented in Sect. 4.

For general structural systems, the coupling damping \(\beta_{i} ({\mathbf{X}},{\dot{\mathbf{X}}})\) is exceptionally complex and cannot be expressed explicitly. For convenience in demonstrating the accuracy of the deep learning algorithm of Sect. 3.2, we adopt the following coupling damping form:

$$\begin{aligned} \beta_{1} ({\mathbf{X}},{\dot{\mathbf{X}}}) & = (\eta_{1} + \gamma_{1} (\dot{X}_{1}^{2} + \dot{X}_{2}^{2} ))\dot{X}_{1}^{{}} \\ \beta_{2} ({\mathbf{X}},{\dot{\mathbf{X}}}) & = (\eta_{2} + \gamma_{2} (\dot{X}_{2}^{2} + \dot{X}_{3}^{2} ))\dot{X}_{2}^{{}} \\ \beta_{3} ({\mathbf{X}},{\dot{\mathbf{X}}}) & = (\eta_{3} + \gamma_{3} (\dot{X}_{3}^{2} + \dot{X}_{4}^{2} ))\dot{X}_{3}^{{}} \\ \beta_{4} ({\mathbf{X}},{\dot{\mathbf{X}}}) & = (\eta_{4} + \gamma_{4} (\dot{X}_{1}^{2} + \dot{X}_{4}^{2} ))\dot{X}_{4}^{{}} \\ \end{aligned}$$
(49)

As an illustration of the physical background of the original system (43), Fig. 8 shows the diagram of a 4-DOF mass-spring system with the coupling damping of Eq. (49). Substituting Eq. (49) into Eq. (47) yields the coefficients \(m_{i} ({\mathbf{H}})\) and \(b_{ii} ({\mathbf{H}})\):

$$\begin{aligned} m_{1} & = - \eta_{1} H_{1} - \gamma_{1} H_{1} \left( {\frac{3}{2}H_{1} + H_{2} } \right) + \frac{{D_{1} }}{{\omega_{1}^{2} }}H_{1} , \\ m_{2} & = - \eta_{2} H_{2} - \gamma_{2} H_{2} \left( {\frac{3}{2}H_{2} + H_{3} } \right) + \frac{{D_{2} }}{{\omega_{2}^{2} }}H_{2} , \\ m_{3} & = - \eta_{3} H_{3} - \gamma_{3} H_{3} \left( {\frac{3}{2}H_{3} + H_{4} } \right) + \frac{{D_{3} }}{{\omega_{3}^{2} }}H_{3} , \\ m_{4} & = - \eta_{4} H_{4} - \gamma_{4} H_{4} \left( {\frac{3}{2}H_{4} + H_{1} } \right) + \frac{{D_{4} }}{{\omega_{4}^{2} }}H_{4} , \\ b_{ii} & = \frac{{D_{i} }}{{\omega_{i}^{2} }}H_{i}^{2} . \\ \end{aligned}$$
(50)
Fig. 8

The diagram of a 4-DOF mass-spring system with coupling damping
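The phase averages behind Eq. (50) can be verified numerically: for the linear oscillators of Eq. (44), \(Q_{i} = \sqrt{2H_{i}}/\omega_{i} \cos \Phi_{i}\) and \(P_{i} = -\sqrt{2H_{i}} \sin \Phi_{i}\), so the average in Eq. (47) reduces to a Monte Carlo mean over independent uniform phases. A sketch for \(m_{1}\) (the sample size is an assumption):

```python
import numpy as np

def m1_phase_average(H1, H2, eta1, gamma1, D1, omega1,
                     nmc=200_000, rng=None):
    """Evaluate m1 of Eq. (47) by Monte Carlo over (Phi1, Phi2) for the
    coupling damping beta_1 of Eq. (49); the result should match the
    closed form -eta1*H1 - gamma1*H1*(1.5*H1 + H2) + D1*H1/omega1**2
    of Eq. (50) to sampling accuracy."""
    rng = np.random.default_rng() if rng is None else rng
    phi1 = rng.uniform(0.0, 2.0*np.pi, nmc)
    phi2 = rng.uniform(0.0, 2.0*np.pi, nmc)
    p1 = -np.sqrt(2.0*H1)*np.sin(phi1)
    p2 = -np.sqrt(2.0*H2)*np.sin(phi2)
    beta1 = (eta1 + gamma1*(p1**2 + p2**2))*p1          # Eq. (49)
    return D1*H1/omega1**2 + np.sqrt(2.0*H1)*np.mean(beta1*np.sin(phi1))
```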

Then, the linearized coefficients \({\mathbf{A}} = [a_{1} ,a_{2} , \ldots ,a_{n} ]\) in Eq. (48) can be calculated as follows:

$$\begin{aligned} a_{1} & = \frac{{D_{1} }}{{\omega_{1}^{2} }} - \eta_{1} - \gamma_{1} H_{2} , \\ a_{2} & = \frac{{D_{2} }}{{\omega_{2}^{2} }} - \eta_{2} - \gamma_{2} H_{3} , \\ a_{3} & = \frac{{D_{3} }}{{\omega_{3}^{2} }} - \eta_{3} - \gamma_{3} H_{4} , \\ a_{4} & = \frac{{D_{4} }}{{\omega_{4}^{2} }} - \eta_{4} - \gamma_{4} H_{1} . \\ \end{aligned}$$
(51)
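Because solving the stationary FPK equation (30) in three dimensions is laborious, \(\lambda_{\max }\) of Eq. (48) can alternatively be checked by direct simulation, mirroring the Monte Carlo verification used for Figs. 12-14. A sketch with the cyclic coupling of Eq. (49); the step size, horizon, initial energy, and the choice \(\eta_{1} = \eta_{2} = 0.4\) are assumptions:

```python
import numpy as np

def lyapunov_example2(eta, gam, omega, D, dt=1e-3, nsteps=500_000, rng=None):
    """Euler-Maruyama estimate of lambda_max for Eq. (48):
    dH_i = a_i H_i dt + (sqrt(D_i)/omega_i) H_i dB_i, with a_i from
    Eq. (51); nxt encodes the cyclic coupling H2, H3, H4, H1 of Eq. (49)."""
    rng = np.random.default_rng() if rng is None else rng
    eta, gam, omega, D = (np.asarray(x, dtype=float)
                          for x in (eta, gam, omega, D))
    nxt = np.array([1, 2, 3, 0])
    H = np.full(4, 1e-3)
    for _ in range(nsteps):
        a = D/omega**2 - eta - gam*H[nxt]               # Eq. (51)
        dB = rng.standard_normal(4) * np.sqrt(dt)
        H = np.maximum(H + a*H*dt + np.sqrt(D)/omega*H*dB, 1e-300)
    return 0.5*np.log(H.sum())/(nsteps*dt)

# system parameters of Sect. 5.2 with assumed eta1 = eta2 = 0.4:
print(lyapunov_example2(eta=[0.4]*4, gam=[0.6]*4,
                        omega=[1.0, 1.732, 1.414, 2.236], D=[0.7]*4))
```

A negative estimate places the parameter point inside the stable regions of Figs. 12 and 13.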

To demonstrate the effectiveness of the deep learning algorithm in calculating the drift and diffusion coefficients of the averaged SDEs, Figs. 9 and 10 give the coefficients of Eq. (46) obtained in three different ways: training data from the original system (43), the exact expressions in Eq. (50), and deep learning. As in Example 1, the results obtained from the deep learning algorithm are in good agreement with the exact solution, especially when the \(H_{1} ,H_{2}\) values are small; the error increases as \(H_{1} ,H_{2}\) increase. Figure 11 shows the relative error of the two coefficients between the deep learning results and the exact solutions. The relative error grows gradually because the amount of training data at small \(H_{1} ,H_{2}\) is much larger than that at large \(H_{1} ,H_{2}\): as the system tends toward stability, fewer samples are produced at large \(H_{1} ,H_{2}\). Consistent with the theoretical linearization, the results shown in Figs. 9 and 10 agree near the equilibrium position (zero point). At small \(H\) values, the accuracy of the deep learning algorithm is sufficient for studying the stochastic stability of the system.

Fig. 9

Drift coefficient \(\overline{m}_{1} ({\mathbf{H}})\) in Eq. (46) obtained from different sources. a Training data; b exact expression in Eq. (50); c deep learning result

Fig. 10

Diffusion coefficient \(b_{11} (H_{1} )\) in Eq. (46) obtained from different sources. a Training data; b exact expression in Eq. (50); c deep learning result

Fig. 11

Relative error of \(\overline{m}_{1} ({\mathbf{H}})\) and \(b_{11} (H_{1} )\) between the BPNN results and the exact solutions in Eq. (50), respectively

Meanwhile, the efficiency of the deep learning algorithm is much higher than that of numerical simulation. The data-driven method requires time for collecting training data and for the neural network to learn, but stability evaluation takes almost no time once the network is trained. For Example 2, 5000 learning iterations over 10,000 samples take 38 s in Python on a computer with an ‘AMD Ryzen 7 4800H’ CPU, whereas simulating 10,000 samples in Matlab takes about 26 min on the same computer.

Based on the linearized averaged SDEs (48), the largest Lyapunov exponent can be obtained using the method described in Sect. 4. Then, the asymptotic Lyapunov stability with probability one of system (43) is determined. Figures 12 and 13 show the stability boundaries in the parameter planes \(\left( {\eta_{1} ,\eta_{2} } \right)\) and \(\left( {\eta_{1} ,\gamma_{2} } \right)\) for different Hurst indices \({\mathcal{H}}\) and noise intensities, respectively. Figure 14 shows the largest Lyapunov exponent for different noise intensities. The system parameters are \(\eta_{3} = \eta_{4} = 0.4\), \(\gamma_{3} = \gamma_{4} = 0.6\), \(\omega_{1} = 1\), \(\omega_{2} = 1.732\), \(\omega_{3} = 1.414\), \(\omega_{4} = 2.236\), \(D_{i} = 0.7\;(i = 1,2,3,4)\). The stable regions are consistent with the Monte Carlo simulation results from the original system, thus verifying the effectiveness of the proposed procedure.

Fig. 12

Region of asymptotic Lyapunov stability with probability one in the plane \(\left( {\eta_{1} ,\eta_{2} } \right)\) with different Hurst indices \({\mathcal{H}}\). Solid lines denote the analytical results; symbols o denote the results from Monte Carlo simulation of the original system. The parameters are \(\gamma_{1} = \gamma_{2} = 0.6\)

Fig. 13

Region of asymptotic Lyapunov stability with probability one in the plane \(\left( {\eta_{1} ,\gamma_{2} } \right)\) with different noise intensities \(D_{i} = D\;(i = 1,2,3,4)\). Solid lines denote the analytical results; symbols o denote the results from Monte Carlo simulation of the original system. \(\eta_{2} = 0.1\), \(\gamma_{1} = 0.6\)

Fig. 14

The largest Lyapunov exponent of the system with different parameters \(\gamma_{1}\) and varying noise intensity \(D_{i} = D\;(i = 1,2,3,4)\). Solid lines denote the analytical results; symbols o denote the results from Monte Carlo simulation of the original system

6 Conclusion

The non-Markovian nature of a dynamical system under fGn excitation makes the study of stochastic stability extremely challenging. In the present paper, based on the observation that the PSD of fGn is quite flat at higher frequencies, fGn is regarded approximately as a wideband process. The deep learning-based stochastic averaging method for quasi-integrable Hamiltonian systems under wideband random excitation was then applied to obtain an approximate expression for the largest Lyapunov exponent and to determine the asymptotic Lyapunov stability with probability one of quasi-integrable and non-resonant Hamiltonian systems under parametric excitations of fGn. The results of two examples demonstrate the effectiveness of the proposed procedure under the condition that the natural frequencies of the system lie in the frequency range where fGn can be treated approximately as wideband noise.