1 Introduction

Nonlinear partial differential equations (PDEs) occupy a central role in mathematical physics, especially in descriptions of continuum mechanics and as a principal means of elucidating complex physical systems [1,2,3,4]. Besides the analytical theory of PDEs, a significant endeavor in mathematical physics is the search for their solutions. A universally effective method for deriving general solutions to these equations remains elusive, and the resolution of nonlinear differential equations typically relies on numerical techniques. While substantial strides have been made over time in solving nonlinear differential equations numerically, this approach is not without limitations [5]. With the advancement of computational power and the development of deep learning, the mathematical community has begun to employ deep learning neural network frameworks to obtain precise solutions to PDEs [6,7,8,9,10]. Improved computing power has likewise driven the development of symbolic-computation methods for constructing analytical solutions in nonlinear science, such as the Hirota bilinear method [11,12,13], the Jacobi elliptic function expansion method [14], the inverse \((G'/G)\)-expansion method [15], the new inverse \((G'/G)\)-expansion method [16], the Lie symmetry approach [17], the generalized exponential rational function (GERF) method [18], the extended rational sine-cosine/sinh-cosh schemes [19], the \(\left( m+\frac{1}{G'}\right) \)-expansion method [20], the modified generalized Kudryashov method, the modified F-expansion method [21], and so on.

Recently, Zhang et al. [22] introduced the bilinear neural network method (BNNM), a new method that combines bilinear methods with deep learning neural network models. Using BNNM, the authors of [22] obtained a large number of arbitrary-function solutions of the p-gBKP equation and presented a variety of solution plots that illustrate the diversity of exact solutions of partial differential equations. They further obtained new exact analytical solutions of the p-gBKP equation by changing the activation functions and altering the neural network architecture [23], and they investigated the (2+1)-dimensional Caudrey-Dodd-Gibbon-Kotera-Sawada-like (CDGKS-like) equations, constructing the generalized lump solution, the classical lump solution, and novel analytical solutions [24]. This demonstrates the applicability of BNNM to a wide range of complex nonlinear partial differential equations. The approach encompasses a multitude of classical test function methods for constructing abundant analytical solutions, including lump-type solutions [25], solitary waves [26], rogue wave solutions [27,28,29], soliton solutions [30,31,32], interaction solutions [33, 34], M-lump solutions [35], localized waves [36,37,38], periodic wave solutions [39], and breather solutions [40, 41]. BNNM has significant advantages over both traditional symbolic computation and numerical methods. Within the BNNM framework, deep learning neural networks exhibit robust adaptability and exceptional nonlinear characteristics; owing to these highly nonlinear characteristics, the bilinear neural network method encompasses many classical methods for constructing exact analytical solutions of nonlinear partial differential equations. By the universal approximation theorem, the approach applies in general to PDEs with bilinear mathematical structures [42], whereas traditional symbolic computation usually requires the solution strategy to be reformulated for each new problem. At the same time, BNNM is not constrained by a mesh and is thus better suited than mesh-based numerical methods to problems with complex geometries or boundary conditions. Moreover, the solutions provided by traditional symbolic computation are often of a fixed form, whereas BNNM can compute in parallel a large number of solutions, solutions of arbitrary types, and interaction solutions. During the experiments we found that BNNM produces results very quickly for simple models; for complex models, however, it becomes very time-consuming and the computational cost rises significantly, and, as a new method, BNNM still requires more experiments to establish its applicability and effectiveness.

In this paper, owing to the versatility and powerful approximation capability of BNNM and the access it provides to many different analytical solutions, we extend the single-layer neural network model of BNNM to a double-layer neural network model and apply it to solve the generalized (3+1)-dimensional Kadomtsev–Petviashvili (gKP) equation [43]:

$$\begin{aligned} u_{ty}+u_{tx}+u_{tz}-u_{zz}+3(u_{x}u_{y})_{x}+u_{xxxy}=0. \end{aligned}$$
(1)

It is well known that the KP equation, formulated by Kadomtsev and Petviashvili in 1970 [44], is a nonlinear partial differential equation in two spatial coordinates and one temporal coordinate; its form is shown in Eq. (2):

$$\begin{aligned} (u_t+6uu_x+u_{xxx})_x+au_{yy}=0, \end{aligned}$$
(2)

it describes the evolution of nonlinear long waves of small amplitude with slow dependence on the transverse coordinate, and it aptly models nonlinear phenomena in areas such as fluid physics, plasma physics, Bose-Einstein condensates, and optics.

Moreover, the generalized (3+1)-dimensional KP (gKP) equation considered in this paper is an extension of the original equation and is presented as follows:

$$\begin{aligned} \begin{aligned}&u_{xxxy}+3(u_xu_y)_x+\alpha u_{xxxz}+3\alpha (u_xu_z)_x+\beta _1u_{xt}\\&+\beta _2u_{yt}+\beta _3u_{zt}+\gamma _1 u_{xz}+\gamma _2u_{yz}+\gamma _3u_{zz}=0, \end{aligned} \end{aligned}$$
(3)

the parameters chosen for Eq. (1) are \(\alpha =\gamma _{1}=\gamma _{2}=0\), \(\beta _{1}=\beta _{2}=\beta _{3}=1\), and \(\gamma _{3}=-1\); with these values, Eq. (3) degenerates to Eq. (1). Other parameter choices reduce Eq. (3) to other well-known equations, as illustrated below:

  1. For the parameter values \(\alpha =\beta _{1}=\beta _{3}=\gamma _{2}=\gamma _{3}=0\), \(\beta _{2}=2\), and \(\gamma _1=-3\), Eq. (3) can be reduced to the Jimbo–Miwa equation [45].

  2. For the parameter values \(\alpha =\beta _2=\beta _3=1\) and \(\beta _1=\gamma _1=\gamma _2=\gamma _3=0\), Eq. (3) can be degenerated to the Boiti–Leon–Manna–Pempinelli equation [46].

  3. For the parameter values \(\alpha =\beta _{1}=\beta _{3}=\gamma _{2}=\gamma _{3}=0\) and \(\beta _{2}=\gamma _{1}=-1\), Eq. (3) can be transformed to the shallow water-like equation [47].

This paper primarily investigates the generalized (3+1)-dimensional Kadomtsev–Petviashvili (gKP) equation; the other equations listed above will be studied individually in future work. Using a suggested bilinear Bäcklund transformation, Wen-Xiu Ma et al. [48, 49] calculated two categories of exponential and rational traveling wave solutions with arbitrary wave numbers, together with Wronski-type and Gramm-type Pfaffian solutions. Junjie Li et al. [50] employed a multi-exponential function method to obtain multi-soliton solutions of the gKP equation, yielding some intriguing results. Based on the Hirota bilinear method, Xue Guan et al. [51] derived lump and lump-strip solutions of this equation using symbolic computation. Abdul-Majid Wazwaz et al. [52] used the simplified form of Hirota’s method to formally establish multiple-soliton solutions and multiple singular soliton solutions for the gKP equation. The authors of [53] obtained a large number of group-invariant solutions of the gKP equation using the Lie symmetry method. Wu et al. [54] constructed Wronskian determinant solutions from a combined Wronskian condition. Wang et al. [55] obtained some exact solutions using the homoclinic test approach. The authors of [56] constructed a bilinear Bäcklund transformation based on the Hirota bilinear form, resulting in exponential function solutions; complexiton solutions of the KP equation were additionally obtained using the bilinear form and the extended transformed rational function technique.

In this paper, the bilinear transformation is employed to address Eq. (1). In academic discussions of the bilinear method, Hirota’s bilinear operator method is often the primary focus. Hirota is renowned for introducing the D-operator [57]:

$$\begin{aligned} \begin{aligned}&D_{p,x_{1}}^{n_{1}}\cdots D_{p,x_{M}}^{n_{M}}\,a\cdot b \\&\quad =\prod _{i=1}^{M}\left( \frac{\partial }{\partial x_{i}}+\alpha _p\frac{\partial }{\partial x_{i}^{\prime }}\right) ^{n_{i}} a(x_{1},\ldots ,x_{M})\,b(x_{1}^{\prime },\ldots ,x_{M}^{\prime })\Big |_{x_{1}^{\prime }=x_{1},\ldots ,x_{M}^{\prime }=x_{M}}, \end{aligned} \end{aligned}$$
(4)

\(n_1,\ldots ,n_M\) represent arbitrary non-negative integers, and for a given integer m, the mth power of \(\alpha _p\) is determined in the following way:

$$\begin{aligned} (\alpha _p)^m=(-1)^{r(m)},\quad m\equiv r(m) \bmod p,\quad 0\le r(m)<p, \end{aligned}$$
(5)

via the dependent variable transformation:

$$\begin{aligned} u(x,y,z,t)=2[\ln f(x,y,z,t)]_x, \end{aligned}$$
(6)

we obtain the following Hirota bilinear form:

$$\begin{aligned} B_{gKP}\left( f\right) :=\left( D_{x}^{3}D_{y}+D_{x}D_{t}+D_{y}D_{t}+D_{z}D_{t}-D_{z}^{2}\right) f\cdot f=0, \end{aligned}$$
(7)

Equation (7) is equivalent to

$$\begin{aligned} \begin{aligned}&f f_{xxxy}-f_{y}f_{xxx}-3f_{x}f_{xxy}+3f_{xx}f_{xy}+f f_{xt}-f_{x}f_{t} \\&\quad +f f_{yt}-f_{y}f_{t}+f f_{zt}-f_{z}f_{t}-f f_{zz}+f_{z}^{2}=0. \end{aligned} \end{aligned}$$
(8)

The Hirota bilinear transformation is thus used to convert Eq. (1) into Eq. (8). This transformation is significant because it endows Eq. (8) with a larger number of nonlinear terms, thereby enhancing its nonlinear characteristics and enabling the equation to account for more complex dynamical phenomena. Furthermore, the pronounced nonlinear nature of Eq. (8) makes it an ideal candidate for testing the applicability of the bilinear neural network method employed in this study.
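To make the operator calculus concrete, the following SymPy sketch (our illustration, not part of the original derivation; it assumes the standard case p = 2, for which \(\alpha _p=-1\) and \(D_{2,x}\) reduces to the usual Hirota operator) expands the left-hand side of Eq. (7) for a generic smooth f; dividing the result by two reproduces the terms of Eq. (8):

```python
import sympy as sp

x, y, z, t = sp.symbols('x y z t')
xp, yp, zp, tp = sp.symbols('xp yp zp tp')   # the primed variables x', y', z', t'
prime = {x: xp, y: yp, z: zp, t: tp}
f = sp.Function('f')

def D(orders):
    """Hirota operator with p = 2: apply prod_i (d/dx_i - d/dx_i')^{n_i} to
    f(x,...)*f(x',...) and afterwards identify the primed with the unprimed variables."""
    expr = f(x, y, z, t) * f(xp, yp, zp, tp)
    for var, n in orders.items():
        vp, terms = prime[var], []
        for k in range(n + 1):               # binomial expansion of the operator
            term = sp.binomial(n, k) * (-1) ** (n - k) * expr
            if k:
                term = sp.diff(term, var, k)
            if n - k:
                term = sp.diff(term, vp, n - k)
            terms.append(term)
        expr = sp.Add(*terms)
    return expr.subs({xp: x, yp: y, zp: z, tp: t})

# left-hand side of Eq. (7): (D_x^3 D_y + D_x D_t + D_y D_t + D_z D_t - D_z^2) f.f
B = (D({x: 3, y: 1}) + D({x: 1, t: 1}) + D({y: 1, t: 1})
     + D({z: 1, t: 1}) - D({z: 2}))
print(sp.expand(B / 2))                      # the expanded bilinear form of Eq. (8)
```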

The organization of the paper is as follows: In Sect. 2, the bilinear neural network method (BNNM) is introduced and its procedural steps are described in detail, together with the corresponding tensor formulas for obtaining analytical solutions of nonlinear partial differential equations. In Sect. 3, a single-hidden-layer neural network model with a “4-3-1” configuration is employed to derive lump solutions and fractal soliton wave solutions of Eq. (1). In Sect. 4, a double-hidden-layer neural network with a “4-2-2-1” architecture is utilized to obtain superposed soliton periodic wave solutions and bright-dark solitons of Eq. (1). The dynamic characteristics of these solutions are depicted through line plots, three-dimensional graphs, contour plots, and density maps. Finally, Sect. 5 summarizes the manuscript.

2 BNNM and its corresponding tensor formula

This study aims to find exact analytical solutions of the bilinear Eq. (8). To achieve this, we emulate the structure of neural network algorithms and develop a framework specifically designed for deriving solutions. The tensor formula of this nonlinear neural network is presented below:

$$\begin{aligned} f=W_{l_n,f}F_{l_n}(\xi _{l_n}), \end{aligned}$$
(9)

where \(W_{a,b}\) is the weight coefficient from neuron a to neuron b. The function F denotes a generalized activation function, which may be chosen arbitrarily; however, in the final layer, F must satisfy \(F_{l_n}(\xi )\ge 0\). The set \(l_n=\{m_{n-1}+1,m_{n-1}+2,\ldots ,m_n\}\) corresponds to the space of the nth layer in the neural network model. The parameter \(\xi _{l_i}\) is defined as follows:

$$\begin{aligned} \xi _{l_i}=W_{l_{i-1},l_i}F_{l_{i-1}}(\xi _{l_{i-1}})+b_{l_i},\quad i=1,2,\cdots ,n, \end{aligned}$$
(10)

In Eq. (10), \(l_{0}=\{x_{1},x_{2},\ldots ,x_{n}\}\), \(l_{1}=\{1,2,\ldots ,m_{1}\}\), ..., \(l_{i}=\{m_{i-1}+1,m_{i-1}+2,\ldots ,m_{i}\}\) \((i=2,3,\ldots ,n-1)\), and b denotes a threshold, which can be treated as a constant. The tensor model of this neural network is visually represented in Fig. 1.
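For readers who prefer code to index notation, the following minimal SymPy sketch (our illustration; the helper name bnnm_output and the layer encoding are not from the original paper) implements the forward pass of Eqs. (9)-(10): each layer computes \(\xi _{l_i}=W F_{l_{i-1}}(\xi _{l_{i-1}})+b_{l_i}\), and the output neuron combines the activations of the last hidden layer linearly, with an optional output threshold such as the \(b_4\) appearing later in Eq. (11):

```python
import sympy as sp

def bnnm_output(inputs, layers, out_weights, out_bias=0):
    """Tensor formula (9)-(10): every layer computes xi = W*F(xi_prev) + b and the
    output neuron combines the activations of the last hidden layer linearly."""
    act = sp.Matrix(inputs)                          # F on the input layer l_0 is the identity
    for W, b, F in layers:
        xi = sp.Matrix(W) * act + sp.Matrix(b)       # Eq. (10)
        act = sp.Matrix([Fk(v) for Fk, v in zip(F, xi)])
    return (sp.Matrix([out_weights]) * act)[0] + out_bias   # Eq. (9), plus an output threshold

# usage: the "4-3-1" model of Sect. 3 with quadratic activations reproduces Eq. (11)
x, y, z, t = sp.symbols('x y z t')
W1 = [[sp.Symbol(f'w_{v}{i}') for v in ('t', 'x', 'y', 'z')] for i in (1, 2, 3)]
b1, b2, b3, b4 = sp.symbols('b1 b2 b3 b4')
w1f, w2f, w3f = sp.symbols('w_1f w_2f w_3f')
square = lambda v: v**2
f = bnnm_output([t, x, y, z], [(W1, [b1, b2, b3], [square] * 3)], [w1f, w2f, w3f], b4)
```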

Fig. 1 Network model for Eq. (9) and the single hidden layer network model

The primary steps for employing the BNNM to obtain exact analytical solutions of nonlinear partial differential equations (PDEs) are as follows:

Step 1:

By applying the bilinear transformation (6), the original Eq. (1) is converted into the bilinear form (7), which is equivalent to Eq. (8). The transformed equation contains a greater number of nonlinear terms, making it better suited to the nonlinear characteristics inherent in neural networks.

Step 2:

Substituting expression (9) into Eq. (8) results in a complex equation.

Step 3:

By setting the coefficients of each term in this equation to zero, an underdetermined system of nonlinear algebraic equations is obtained.

Step 4:

Utilizing Maple software for symbolic computation on this set of nonlinear equations enables the derivation of the solutions for the coefficients.

Step 5:

By substituting the solutions for the coefficients and the neural network tensor formula (9) into the bilinear transformation (6), the exact analytical solution of the nonlinear partial differential equation (PDE) can be obtained.

Step 6:

By selecting appropriate parameters and functions from the exact analytical solutions derived through the aforementioned steps, one can use Maple software to create three-dimensional plots, contour maps, heat maps, etc., to demonstrate the dynamical characteristics of the solution.
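As an illustration of Steps 2-4, the sketch below (written in SymPy; the paper itself carries out this step with Maple, and SymPy's solver may be considerably slower on the resulting system) substitutes the quadratic “4-3-1” ansatz of Eq. (11) into the expanded bilinear equation (8), collects the coefficients of the monomials in (x, y, z, t), and passes the resulting underdetermined algebraic system to a symbolic solver:

```python
import sympy as sp

x, y, z, t = sp.symbols('x y z t')
wt1, wx1, wy1, wz1 = sp.symbols('w_t1 w_x1 w_y1 w_z1')
wt2, wx2, wy2, wz2 = sp.symbols('w_t2 w_x2 w_y2 w_z2')
wt3, wx3, wy3, wz3 = sp.symbols('w_t3 w_x3 w_y3 w_z3')
b1, b2, b3, b4, w1f, w2f, w3f = sp.symbols('b1 b2 b3 b4 w_1f w_2f w_3f')

# Step 2: the "4-3-1" ansatz (11) with quadratic activations ...
xi1 = wt1*t + wx1*x + wy1*y + wz1*z + b1
xi2 = wt2*t + wx2*x + wy2*y + wz2*z + b2
xi3 = wt3*t + wx3*x + wy3*y + wz3*z + b3
f = w1f*xi1**2 + w2f*xi2**2 + w3f*xi3**2 + b4

# ... substituted into the expanded bilinear equation (8)
d = sp.diff
bilinear = (f*d(f, x, 3, y) - d(f, x, 3)*d(f, y) - 3*d(f, x)*d(f, x, 2, y)
            + 3*d(f, x, 2)*d(f, x, y)
            + f*d(f, x, t) - d(f, x)*d(f, t)
            + f*d(f, y, t) - d(f, y)*d(f, t)
            + f*d(f, z, t) - d(f, z)*d(f, t)
            - f*d(f, z, 2) + d(f, z)**2)

# Step 3: every coefficient of the polynomial in (x, y, z, t) must vanish
coeff_eqs = [sp.Eq(c, 0) for c in sp.Poly(sp.expand(bilinear), x, y, z, t).coeffs()]

# Step 4: solve the underdetermined nonlinear system for the weights and thresholds
# (this call can be slow in SymPy; the paper performs this step with Maple's solver)
solutions = sp.solve(coeff_eqs, dict=True)
```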

Most classical test function construction methods can be viewed as single-hidden-layer neural networks and are therefore contained in BNNM. The strength of BNNM lies in its ability to construct multiple hidden layers simultaneously, giving it improved representational and generalization capabilities; however, it is also crucial to pay attention to issues such as overfitting. In this paper, we use BNNM with a “4-3-1” single-hidden-layer neural network model and a “4-2-2-1” double-hidden-layer neural network model to derive exact analytical solutions of the generalized KP equation.

3 Lump and fractal soliton wave solutions by a single hidden layer neural network model

3.1 Lump solution

After multiple attempts, it has been found that a single-hidden-layer neural network (Fig. 1b) with various activation functions can yield lump solutions (Fig. 2a), periodic solutions (Fig. 2b), breather solutions (Fig. 2c), extended three-soliton solutions (Fig. 2d), and so on.

A “4-3-1” neural network model has been selected, where the input layer \(l_0\) consists of 4 neurons and the hidden layer \(l_1\) comprises 3 neurons. The architecture of this neural network model is illustrated in Fig. 2e.

Fig. 2 Single hidden layer network model with partial exact analytical solutions

By analyzing the structure of the “4-3-1” neural network, we have derived:

$$\begin{aligned} \begin{aligned}&{f=w_{1,f}F_1(\xi _1)+w_{2,f}F_2(\xi _2)+w_{3,f}F_3(\xi _3)+b_4,} \\&\left\{ \begin{aligned}&{\xi _1=w_{t,1}t+w_{x,1}x+w_{y,1}y+w_{z,1}z+b_1}, \\&{\xi _2=w_{t,2}t+w_{x,2}x+w_{y,2}y+w_{z,2}z+b_2}, \\&{\xi _3=w_{t,3}t+w_{x,3}x+w_{y,3}y+w_{z,3}z+b_3}, \end{aligned} \right. \end{aligned} \end{aligned}$$
(11)

where \(b_{k}\) \((k=1,2,3,4)\) and \(w_{i,j}\) \((i=x,y,z,t,1,2,3,\ j=1,2,3,f)\) are real constants, and \(F_i(\xi _i)\) \((i=1,2,3)\) are the activation functions to be determined.

First, choose the most common classical test functions: \(l_0=\{x,y,z,t\}\), \(l_1=\{1,2,3\}\), \(F_1(\xi _1)={\xi _1}^2\), \(F_2(\xi _2)={\xi _2}^2\), and \(F_3(\xi _3)={\xi _3}^2\), as in Fig. 2a. By substituting expression (11) into Eq. (8), a complex equation is obtained. Setting the coefficients of each term in this equation to zero results in 61 algebraic equations. Utilizing Maple for symbolic computation to solve these algebraic equations, several solutions were obtained; four different solutions are shown below:

Case 1:

$$\begin{aligned} \begin{aligned}&b_1 = \frac{b_3 w_{z,1}}{w_{z,3}}, w_{1,f} = -\frac{w_{3,f} w_{z,3}^2}{w_{z,1}^2},\\&w_{t,1} = 0, w_{t,2} = 0, w_{t,3} = 0, \\&w_{x,1} = \frac{w_{x,3} w_{z,1}}{w_{z,3}}, w_{y,1} = \frac{w_{y,3} w_{z,1}}{w_{z,3}}, \\&w_{y,2} = 0, w_{z,2} = 0. \end{aligned} \end{aligned}$$
(12)

Case 2:

$$\begin{aligned} \begin{aligned}&b_2 = \frac{b_1 w_{2,f} w_{z,2}^2 + b_1 w_{3,f} w_{z,3}^2 - b_3 w_{3,f} w_{z,1} w_{z,3}}{w_{2,f} w_{z,1} w_{z,2}},\\&w_{1,f} = -\frac{w_{2,f} w_{z,2}^2 + w_{3,f} w_{z,3}^2}{w_{z,1}^2}, \\&w_{t,1} = 0, \quad w_{t,2} = 0, \quad w_{t,3} = 0, \\&w_{y,2} = \frac{w_{y,1} w_{z,2}}{w_{z,1}}, w_{y,3} = \frac{w_{y,1} w_{z,3}}{w_{z,1}},\\&w_{x,2} \\&= \frac{w_{x,1} w_{z,2}^2 w_{2,f} + w_{3,f} w_{x,1} w_{z,3}^2 - w_{3,f} w_{x,3} w_{z,1} w_{z,3}}{w_{2,f} w_{z,1} w_{z,2}}. \end{aligned} \end{aligned}$$
(13)

Case 3:

$$\begin{aligned} \begin{aligned}&w_{1,f} = -\frac{w_{2,f} w_{z,2}^2}{w_{z,1}^2}, w_{t,1} = 0, \\&w_{t,2} = 0, w_{t,3} = 0, w_{x,1} = \frac{w_{x,2} w_{z,1}}{w_{z,2}},\\&w_{y,2} = \frac{w_{y,1} w_{z,2}}{w_{z,1}},\\&w_{y,3} =-\frac{w_{2,f}^2 w_{z,2}^2 \left( b_1^2 w_{z,2}^2 - 2 b_1 b_2 w_{z,1} w_{z,2} + b_2^2 w_{z,1}^2\right) }{3 w_{3,f}^2 w_{x,3}^{3} w_{z,1}^2}, \\&w_{z,3} = 0. \end{aligned} \end{aligned}$$
(14)

Case 4:

$$\begin{aligned} \begin{aligned}&w_{t,1} = 0, w_{x,1} = 0, w_{x,2}\\&\quad =-\frac{{w_{z,2}} \left( {w_{t,2}}-{w_{z,2}}\right) }{{w_{t,2}}},\\&\quad w_{x,3} = 0, w_{y,1} =0, w_{y,2} =0,\\&w_{y,3}=-\frac{{w_{z,2}}{w_{t,3}} \left( {w_{t,2}}-{w_{z,2}}\right) }{{w_{t,2}}^{2}}, \\&\quad w_{z,1} = 0, w_{z,3} = \frac{{w_{t,3}} {w_{z,2}}}{{w_{t,2}}}. \end{aligned} \end{aligned}$$
(15)

By substituting Case 4 into Eq. (11), we can obtain the lump solution of the generalized (3+1)-dimensional KP equation through the bilinear transformation (6):

$$\begin{aligned} \begin{aligned}&u=-\frac{4w_{2,f}\Phi _1w_{z,2} \left( {w_{t,2}}-{w_{z,2}}\right) }{w_{t,2}f}, \\&\left\{ \begin{aligned}&f=w_{1,f}b_1^2 + w_{2,f}\Phi _1^2 + w_{3,f}\Phi _2^2 + b_4, \\&\Phi _1=tw_{t,2} - \frac{w_{z,2}(w_{t,2} - w_{z,2})x}{w_{t,2}} + w_{z,2}z + b_2, \\&\Phi _2=tw_{t,3} - \frac{w_{z,2}w_{t,3}(w_{t,2} - w_{z,2})y}{w_{t,2}^2} \\&+ \frac{w_{z,2}w_{t,3}z}{w_{t,2}} + b_3. \end{aligned} \right. \end{aligned}\nonumber \\ \end{aligned}$$
(16)

With the assistance of Maple software, Eq. (16) was substituted into Eq. (1); the left-hand side of Eq. (1) evaluates to 0, which verifies Eq. (16). To analyze the dynamic characteristics of the bilinear neural network method and briefly discuss its evolutionary features, we carried out a series of assignment experiments; to make the final result conform to the laws of physics, the following appropriate parameters are introduced into Eq. (16):

$$\begin{aligned} \begin{aligned}&b_1 = -5, b_2 = 2, b_3 = 6, b_4 = -6, \\&w_{1,f} = 3, w_{2,f} = -4,\\&w_{3,f} = -3, w_{t,2} = 1, w_{t,3} = 5, w_{z,2} = -5. \end{aligned} \end{aligned}$$
(17)
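As a cross-check of the Maple verification described above, the following SymPy sketch (our illustration) builds the lump solution (16) with the parameter choice (17) and confirms that it satisfies Eq. (1):

```python
import sympy as sp

x, y, z, t = sp.symbols('x y z t')

# parameter choice of Eq. (17)
b1, b2, b3, b4 = -5, 2, 6, -6
w1f, w2f, w3f = 3, -4, -3
wt2, wt3, wz2 = 1, 5, -5

# lump solution (16): u = 2 (ln f)_x with f quadratic in Phi1 and Phi2
Phi1 = t*wt2 - wz2*(wt2 - wz2)*x/wt2 + wz2*z + b2
Phi2 = t*wt3 - wz2*wt3*(wt2 - wz2)*y/wt2**2 + wz2*wt3*z/wt2 + b3
f = w1f*b1**2 + w2f*Phi1**2 + w3f*Phi2**2 + b4
u = 2*sp.diff(sp.log(f), x)

# residual of Eq. (1): u_ty + u_tx + u_tz - u_zz + 3(u_x u_y)_x + u_xxxy
d = sp.diff
residual = (d(u, t, y) + d(u, t, x) + d(u, t, z) - d(u, z, 2)
            + 3*d(d(u, x)*d(u, y), x) + d(u, x, 3, y))

print(sp.simplify(residual))   # prints 0, confirming that (16) solves Eq. (1)
```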
Fig. 3 3d plots with y = 2, t = 0, curve plots, density plot, and contour plot of Eq. (16)

The lump solution, as depicted in Fig. 3, exhibits distinct features. In Fig. 3a–c, with y = 2 and t = 0 held constant, an anomalous wave with protrusions is observed: it appears rapidly against a constant background, peaks at intermediate times, and disappears within a very short time. It is well known that \(u \rightarrow 0\) when the quadratic functions tend to positive or negative infinity. With t = 0 and z = 0 fixed, Fig. 3d shows the curve plot of the lump solution; u exhibits a peak and a trough along the variable x within a very short range, and changing y leaves the curve unchanged, which implies that y does not affect the profile of u versus x. Figure 3e and f respectively illustrate the density plot and contour plot of this waveform. These waveforms are a direct consequence of the chosen activation functions. The lump solution has a wide range of applications in several fields of physics, such as describing isolated waves, vortices, or other localized structures.

3.2 Fractal soliton wave solution

In addition to classical activation functions, interacting activation functions were also tested. We choose \(l_0=\{x,y,z,t\}\), \(l_1=\{1,2,3\}\), \(F_1(\xi _1)={\xi _1}^2\), \(F_2(\xi _2)={\xi _2}^2\), and \(F_3(\xi _3)=\exp (\xi _3)\), as shown in Fig. 2f. By substituting Eq. (11) into Eq. (8), a complex equation is obtained. Setting the coefficients of each term in this equation to zero results in 61 algebraic equations. Utilizing Maple for symbolic computation to solve these algebraic equations, a series of different solutions is obtained, six of which are shown below:

Case 1:

$$\begin{aligned} \begin{aligned}&b_2 = \frac{b_1w_{y,2}}{w_{y,1}}, w_{2,f} \\&= -\frac{w_{1,f}w_{y,1}^2}{w_{y,2}^2}, w_{t,2} = \frac{w_{t,1}w_{y,2}}{w_{y,1}}, w_{t,3} = 0,\\&w_{x,1} = \frac{w_{x,2}w_{y,1}}{w_{y,2}}, \\&w_{x,3} = 0, w_{y,3} = 0, w_{z,2} = \frac{w_{y,2}w_{z,1}}{w_{y,1}}, \\&w_{z,3} = 0. \end{aligned} \end{aligned}$$
(18)

Case 2:

$$\begin{aligned} \begin{aligned}&w_{2,f} = -\frac{w_{1,f} w_{y,1}^2}{w_{y,2}^2},\\&w_{t,3} = 0, w_{x,1} = -w_{y,1}, w_{x,2} = -w_{y,2},\\&w_{x,3} = 0, w_{y,3} = 0, w_{z,1} = 0, w_{z,2} = 0, w_{z,3} = 0. \end{aligned} \end{aligned}$$
(19)

Case 3:

$$\begin{aligned} \begin{aligned}&w_{1,f} = -\frac{w_{2,f} w_{y,2}^2}{w_{x,1}^2}, w_{t,1} = 0, w_{t,2} = w_{z,2}, w_{t,3} = 0, \\&w_{x,2} = -w_{y,2}, w_{x,3} = 0, \\&w_{y,1} = -w_{x,1}, w_{y,3} = 0, w_{z,1} = 0, w_{z,3} = 0. \end{aligned} \end{aligned}$$
(20)

Case 4:

$$\begin{aligned} \begin{aligned}&b_1 = \frac{b_2 w_{t,1}}{w_{t,2}}, w_{1,f} = -\frac{w_{2,f} w_{t,2}^2}{w_{t,1}^2},\\&w_{t,3} = 0, w_{x,1} = \frac{w_{t,1} w_{x,2}}{w_{t,2}}, \\&w_{x,3} = 0, w_{y,1} = \frac{w_{y,2} w_{t,1}}{w_{t,2}},\\&w_{y,3} = 0, w_{z,2} = \frac{w_{t,2} w_{z,1}}{w_{t,1}}, w_{z,3} = 0. \end{aligned} \end{aligned}$$
(21)

Case 5:

$$\begin{aligned} \begin{aligned}&w_{1,f} = -\frac{w_{2,f} w_{z,2}^2}{w_{z,1}^2}, w_{t,2} = \frac{w_{t,1} w_{z,2}}{w_{z,1}}, w_{t,3} = 0, \\&w_{x,1} = -\frac{\left( w_{t,1} w_{x,2} + w_{t,1} w_{z,2} - w_{z,1} w_{z,2}\right) w_{z,1}}{w_{t,1} w_{z,2}}, \\&w_{x,3} = 0, w_{y,1} = \frac{w_{x,2} w_{z,1}}{w_{z,2}}, \\&w_{y,2} = -\\&\frac{w_{t,1} w_{x,2} + w_{t,1} w_{z,2} - w_{z,1} w_{z,2}}{w_{t,1}}, w_{y,3} = 0, w_{z,3} = 0. \end{aligned} \end{aligned}$$
(22)

Case 6:

$$\begin{aligned} \begin{aligned}&w_{1,f} = -\frac{w_{2,f} w_{x,2} (w_{t,2} w_{x,2}+w_{t,2} w_{z,2}-w_{z,2}^{2})}{w_{t,2} w_{x,1}^{2}}, \\&w_{t,1} = 0, w_{x,3} = 0, \\&w_{y,1}=-w_{x,1}, w_{y,2} = -\frac{w_{t,2} w_{x,2}+w_{t,2} w_{z,2}-w_{z,2}^{2}}{w_{t,2}}, \\&w_{y,3}=-\frac{w_{t,3} w_{z,2} (w_{t,2}-w_{z,2})}{w_{t,2}^{2}},\\&w_{z,1} = 0, w_{z,3} = \frac{w_{z,2} w_{t,3}}{w_{t,2}}. \end{aligned}\nonumber \\ \end{aligned}$$
(23)

By substituting Case 6 into Eq. (11), the fractal soliton wave solution of the generalized (3+1)-dimensional KP equation can be obtained through a bilinear transformation (6):

$$\begin{aligned} \begin{aligned}&u=\frac{2 \left( -\frac{2\Phi _1 \left( w_{x,1} x-w_{x,1} y+b_{1}\right) }{w_{t,2} w_{x,1}}+2 w_{2,f} \Phi _2w_{x,2}\right) }{f}, \\&\left\{ \begin{aligned}&f=-\frac{\Phi _1 \left( w_{x,1} x-w_{x,1} y+b_{1}\right) ^{2}}{w_{t,2} w_{x,1}^{2}}\\&+w_{2,f} \Phi _2^{2}+w_{3,f}\Phi _3+b_{4}, \\&\Phi _1=w_{2,f} w_{x,2} \left( w_{t,2} w_{x,2}+w_{t,2} w_{z,2}-w_{z,2}^{2}\right) , \\&\Phi _2=t w_{t,2}+w_{x,2} x-\frac{\left( w_{t,2} w_{x,2}+w_{t,2} w_{z,2} -w_{z,2}^{2}\right) y}{w_{t,2}}\\&+w_{z,2} z+b_{2},\\&\Phi _3=e^{w_{t,3} t-\frac{w_{t,3} w_{z,2} \left( w_{t,2}-w_{z,2}\right) y}{w_{t,2}^{2}}+\frac{w_{z,2} w_{t,3} z}{w_{t,2}}+b_{3}}. \end{aligned} \right. \end{aligned} \end{aligned}$$
(24)
Fig. 4 3d plots with y = 2, t = 0, curve plots, density plot, and contour plot of Eq. (24)

With the assistance of Maple software, substituting Eq. (24) into Eq. (1) demonstrates that the left-hand side of the equation equals zero, thus proving that the solution (24) is one of the exact analytical solutions of Eq. (1) and, at the same time, confirming the validity of the BNNM. Compared to traditional neural network methods, which typically yield only approximate solutions, the bilinear neural network method demonstrates superior accuracy. To further analyze its dynamical characteristics and briefly discuss its evolutionary features, we again carried out a series of assignment experiments; to make the final result conform to the laws of physics, the following appropriate parameters are incorporated into Eq. (24):

$$\begin{aligned} \begin{aligned}&b_1 = 1, b_2 = 5, b_3 = -5, b_4 = 5, w_{2,f} = -2, w_{3,f} = 1, \\&w_{t,2} = -5, w_{t,3} = -4, w_{x,1} = 6, w_{x,2} = 3, w_{z,2} = -1. \end{aligned} \end{aligned}$$
(25)

The fractal soliton wave solution is depicted in Fig. 4. Specifically, Fig. 4a–c present 3-d plots from various perspectives, showing the wave structure at a specific instant (t = 0). This structure is characterized by its continuity and localization in 3-d space, while also displaying fractal properties, such as exhibiting similar patterns across different scales. The formation of this waveform can be attributed to a precise balance between the nonlinear and dispersive terms of the nonlinear partial differential equation. This balance is achieved through the activation functions used in the BNNM: two quadratic functions and an exponential function. The quadratic functions likely enhance the nonlinear characteristics of the wave pattern, while the exponential function produces localized amplification in space, leading to a high concentration of energy in specific regions.

Figures 4d–f show curve plots of the fractal soliton wave solution, detailing the wave dynamics at different time points with a fixed z-axis value and varying y-sections. As time progresses, the waveform undergoes shifts and deformations. Notably, the plot at y = 0 clearly displays distinct peaks and troughs. The bright areas in the density plot, Fig. 4g, reveal regions of concentrated energy density, likely corresponding to wave peaks intensified by the exponential part of the activation function. Conversely, the darker areas might indicate regions with lower energy or diminishing waves. The contour plot in Fig. 4h further reveals the subtle variations of the wave in two-dimensional space, crucial for capturing the detailed structure of the waveform at specific times and locations.

Overall, this solution represents a wave pattern that stably propagates in a nonlinear medium, maintaining a specific shape in three-dimensional space while consistently displaying dynamic behavior over time. Such wave structures and behaviors may find relevance in various physical contexts, such as turbulence phenomena in fluid dynamics, electromagnetic wave propagation in plasma physics, and light waveguide modes in optics.

4 Superposition of soliton periodic wave solutions and bright-dark solitons by a two-hidden layer neural network model

4.1 Periodic wave solutions

A double-hidden-layer neural network model with a “4-2-2-1” configuration is chosen. In this model, the input layer \(l_0\) remains unchanged with four neurons, and there are two hidden layers (\(l_1\) and \(l_2\)), each containing two neurons. The structure is illustrated in Fig. 5a.

Fig. 5 “4-2-2-1” double hidden layer neural network model of Eq. (26), obtained by choosing \(F_1(\xi _1)=\xi _1, F_2(\xi _2)=\xi _2^2, F_3(\xi _3)=e^{-\xi _3},F_4(\xi _4)=e^{\xi _4}\) and \(F_1(\xi _1)=\xi _1, F_2(\xi _2)=\xi _2, F_3(\xi _3)=\xi _3^2,F_4(\xi _4)=\xi _4^2\)

Fig. 6 3d plots with y = 2, t = 2, line plot, density plot, and contour plot of Eq. (28)

By choosing \(F_1(\xi _1)=\xi _1, F_2(\xi _2)=\xi _2^2, F_3(\xi _3)=e^{-\xi _3}, F_4(\xi _4)=e^{\xi _4}\) as shown in Fig. 5b, the following can be obtained:

$$\begin{aligned} \begin{aligned}&f=w_{3,f}e^{-\xi _3}+w_{4,f}e^{\xi _4}+b_4, \\&\left\{ \begin{aligned}&\xi _3=w_{1,3}\xi _1+w_{2,3}{\xi _2}^2, \\&\xi _4=w_{1,4}\xi _1+w_{2,4}{\xi _2}^2, \\&\xi _1=w_{t,1}t+w_{x,1}x+w_{y,1}y+w_{z,1}z, \\&\xi _2=w_{t,2}t+w_{x,2}x+w_{y,2}y+w_{z,2}z, \end{aligned} \right. \end{aligned} \end{aligned}$$
(26)

where \(b_4\) and \(w_{i,j}(i=x,y,z,t,1,2,3,4,j=1,2,3,4,f)\) are real constants.

Substituting Eq. (26) into Eq. (8) results in a complex equation. By setting the coefficients of each term in the equation to zero, a system of 212 coefficient equations is obtained. These equations were solved using Maple software through symbolic computation, resulting in 38 sets of solutions. Below, one set of solutions is presented as an example:

$$\begin{aligned} \begin{aligned}&w_{2,3} = -w_{2,4}, w_{t,1} = -\frac{w_{1,3}^{2} w_{x,1}^{3}w_{y,1} + 2 w_{1,4} \, w_{x,1}^{3} w_{y,1} w_{1,3} + w_{x,1}^{3} w_{y,1} \, w_{1,4}^{2} - w_{z,1}^{2}}{w_{x,1} + w_{y,1} + w_{z,1}},\\&w_{t,2} = 0, w_{x,2} = 0, w_{z,2} = 0,b_{1} = 1,b_{2} = 2,b_{3} = 3,b_{4} = 0, \end{aligned} \end{aligned}$$
(27)

substituting (27) into Eq. (26) and applying the bilinear transformation Eq. (6), the analytical solution of the gKP equation can be obtained:

$$\begin{aligned} \begin{aligned}&u=\frac{2 \left( -w_{x,1} w_{1,3}\Phi _1+w_{x,1} w_{1,4}\Phi _2 w_{4,f}\right) }{f}, \\&\left\{ \begin{aligned}&f=\Phi _1 w_{3,f}+\Phi _2w_{4,f},\\&\Phi _1={e}^{-\left( -\frac{t \left( w_{1,3}^{2} w_{x,1}^{3} w_{y,1}+2 w_{1,4} \,w_{x,1}^{3} w_{y,1} w_{1,3}+w_{x,1}^{3} w_{y,1} \,w_{1,4}^{2}-w_{z,1}^{2}\right) }{w_{x,1}+w_{y,1}+w_{z,1}}+w_{x,1} x+w_{y,1} y+w_{z,1} z\right) w_{1,3}+w_{y,2}^{2} y^{2} w_{2,4}}, \\&\Phi _2={e}^{\left( -\frac{t \left( w_{1,3}^{2} w_{x,1}^{3} w_{y,1}+2 w_{1,4} \,w_{x,1}^{3} w_{y,1} w_{1,3}+w_{x,1}^{3} w_{y,1} \,w_{1,4}^{2}-w_{z,1}^{2}\right) }{w_{x,1}+w_{y,1}+w_{z,1}}+w_{x,1} x+w_{y,1} y+w_{z,1} z\right) w_{1,4}+w_{y,2}^{2} y^{2} w_{2,4}}. \end{aligned} \right. \end{aligned} \end{aligned}$$
(28)

With the assistance of Maple software, the following appropriate parameters and functions will be incorporated into Eq. (28):

$$\begin{aligned} \begin{aligned}&w_{1,3} = -6, w_{1,4} = 5, w_{2,4} = -1,\\&w_{3,f} = 6, w_{4,f} = -2, \\&w_{x,1} = 2, w_{y,1} = 6,\\&w_{z,1} = -2, w_{y,2} = 4. \end{aligned} \end{aligned}$$
(29)

Figure 6 displays the superposed periodic wave solution. Figures 6a–c provide 3-d plots from different angles, clearly showing the interaction solutions formed under the influence of a double-layer neural network. In the first hidden layer, \(\xi _1\) and \(\xi _2^2\) maintain the fundamental characteristics of the wave solution and introduce nonlinearity into the solution. In the second hidden layer, \(e^{-\xi _3}\) and \(e^{\xi _4}\) correspond to the wave peaks and troughs that evolve over time, respectively. These activation functions work together to create a mathematical model capable of precisely describing complex wave phenomena in nonlinear media. Figure 6d presents a line plot of x concerning t, with y and z held constant at 2. It clearly illustrates the periodic nature of the waveform. The density plot, Fig. 6e, and the contour plot, Fig. 6f, offer another perspective on the distribution of wave intensity on a two-dimensional plane. In the density plot, the variation in color shades represents different intensities of the wave, while the contour plot uses the density of lines to depict these changes. These two diagrams help us understand the energy distribution of the wave in space at a specific point in time. Such solutions can represent the coexistence of multiple wave modes, reflecting the complex wave phenomena that might occur in nonlinear media. This type of wave structure has broad applications in both theoretical and experimental physics. For example, it is used in describing crystal vibrations in solid-state physics, surface waves in fluid dynamics, and wave modes in plasma physics.

4.2 Bright-dark solitons

By choosing \(F_1(\xi _1)=\xi _1, F_2(\xi _2)=\xi _2, F_3(\xi _3)=\xi _3^2,F_4(\xi _4)=\xi _4^2\) as shown in Fig. 5c, the following can be obtained:

$$\begin{aligned} \begin{aligned}&{f=w_{3,f}\xi _3^2+w_{4,f}\xi _4^2+b_4}, \\&\left\{ \begin{aligned}&{\xi _3=w_{1,3}\xi _1+w_{2,3}\xi _2}, \\&{\xi _4=w_{1,4}\xi _1+w_{2,4}\xi _2}, \\&{\xi _1=w_{t,1}t+w_{x,1}x+w_{y,1}y+w_{z,1}z}, \\&{\xi _2=w_{t,2}t+w_{x,2}x+w_{y,2}y+w_{z,2}z}, \end{aligned} \right. \end{aligned} \end{aligned}$$
(30)

where \(b_4\) and \(w_{i,j}(i=x,y,z,t,1,2,3,4,j=1,2,3,4,f)\) are real constants. Substituting Eq. (30) into Eq. (8) results in a complex equation. Setting the coefficients of each term in the equation to zero yields a system of 11 coefficient equations. These equations were solved using Maple software through symbolic computation, leading to 43 sets of solutions. As an illustrative example, one set of solutions is presented below:

$$\begin{aligned} \begin{aligned} w_{2,3}&= -\frac{w_{1,3} \left( 2 w_{1,4} w_{x,1}+w_{2,4} w_{x,2}\right) }{w_{1,4} w_{x,2}},\\ w_{3,f}&= -\frac{w_{1,4}^{2} w_{4,f}}{w_{1,3}^{2}}, w_{t,2} = 0,\\ w_{y,1}&= -\frac{w_{t,1} w_{x,1}+w_{z,1} w_{t,1}-w_{z,1}^{2}}{w_{t,1}}, \\ w_{y,2}&= -w_{x,2}, w_{z,2} = 0, \end{aligned} \end{aligned}$$
(31)

substituting (31) into Eq. (30) and applying the bilinear transformation (6), the analytical solution of the gKP equation can be obtained:

$$\begin{aligned} \begin{aligned}&u=\frac{2 \left( -\frac{2 \left( \Phi _1 w_{1,3}-\Phi _2\right) w_{1,4}^{2}\Phi _3}{w_{1,3}^{2}}+2 \left( \Phi _1 w_{1,4}+\left( w_{x,2} x-w_{x,2} y\right) w_{2,4}\right) w_{4,f} \left( w_{x,1} w_{1,4}+w_{x,2} w_{2,4}\right) \right) }{f},\\&\left\{ \begin{aligned}&f=-\frac{\left( \Phi _1 w_{1,3}-\Phi _2\right) ^{2} w_{1,4}^{2} w_{4,f}}{w_{1,3}^{2}}+\left( \Phi _1 w_{1,4}+\left( w_{x,2} x-w_{x,2} y\right) w_{2,4}\right) ^{2} w_{4,f}+b_{4}, \\&\Phi _1=t w_{t,1}+w_{x,1} x-\frac{\left( w_{t,1} w_{x,1}+w_{z,1} w_{t,1}-w_{z,1}^{2}\right) y}{w_{t,1}}+w_{z,1} z, \\&\Phi _2=\frac{\left( w_{x,2} x-w_{x,2} y\right) w_{1,3} \left( 2 w_{1,4} w_{x,1}+w_{2,4} w_{x,2}\right) }{w_{1,4} w_{x,2}}, \\&\Phi _3= w_{4,f} \left( w_{x,1} w_{1,3}-\frac{w_{1,3} \left( 2 w_{x,1} w_{1,4}+w_{x,2} w_{2,4}\right) }{w_{1,4}}\right) . \end{aligned} \right. \end{aligned} \end{aligned}$$
(32)

With the assistance of Maple software, the following appropriate parameters and functions will be incorporated into Eq. (32):

$$\begin{aligned} \begin{aligned}&b_{4} = -3, w_{1,3} = 6, w_{1,4} = -2, \\&w_{2,4} = -4, w_{4,f} = -4, \\&w_{t,1} = -3, w_{x,1} = 2, w_{x,2} = 3, w_{z,1} = -1. \end{aligned} \end{aligned}$$
(33)
Fig. 7 3d plots with y = 2, t = 2, density plot, and contour plot of Eq. (32)

Figure 7 presents the bright-dark soliton solution. Figures 7a–d are 3-d representations from varying perspectives, revealing the intricate structure of soliton solutions within a double-layer neural network model. The interplay between the linear activation functions in the first layer and the quadratic activation functions in the second layer contributes to the formation of soliton waves. The localized energy distribution is evident in the bright regions indicative of bright solitons, where energy is highly concentrated, while dark regions exhibit lower energy, corresponding to dark solitons. The density plot, Fig. 7e, and the contour plot, Fig. 7f, depict the amplitude of the bright-dark soliton on a 2-d plane. In Fig. 7e, color variations not only demonstrate differences in amplitude but also reveal the peaks and troughs within the waveform. The continuity of color changes allows for the observation of either smooth or abrupt transitions within the soliton wave. Fig. 7f uses the density of lines to correspond to the gradient changes within the wave. Such soliton wave structures hold significant importance in the realm of physics research, particularly within the fields of optical fiber communication, nonlinear optical materials, and condensed matter physics, where the interaction, stability, and propagation characteristics of solitons are critical points of study.

5 Conclusion

In this work, the bilinear neural network method (BNNM) is adopted to derive exact analytical solutions of the generalized (3+1)-dimensional KP equation. Single-layer “4-3-1” and double-layer “4-2-2-1” neural network structures, with appropriate activation functions and parameter values, were inserted into the bilinearly transformed Eq. (8). This yielded lump solutions, fractal soliton solutions, superposed periodic wave solutions, and bright-dark soliton solutions. The dynamic properties of these solutions were illustrated through three-dimensional graphs, curve plots, density maps, and contour maps. These results emphasize the essential role of nonlinear partial differential equations in characterizing the complex dynamics of the real world, underscoring their importance for a deeper understanding of nonlinear phenomena in various fields including aerodynamics, optics, acoustics, thermal conduction, fluid dynamics, and classical mechanics. BNNM has clear advantages over traditional symbolic computation for obtaining exact analytical solutions of the generalized (3+1)-dimensional KP equation. Traditional symbolic methods require a separate derivation for each solution, whereas BNNM can compute multiple analytical solutions in parallel and can obtain different interaction solutions through its multilayer neural network architecture, which will accelerate the discovery of further exact analytical solutions in the future. This study thus fills a gap in the existing literature on the generalized (3+1)-dimensional KP equation.

We aim to further refine our neural network models by expanding their breadth and depth. Specifically, we plan to explore single-layer networks with configurations such as “4-4-2” and “4-5-2”, as well as more complex multi-layer networks such as “4-3-3-3-2” and “4-2-2-2-2-2-1”, incorporating new activation functions. In addition to refining these models, our future work will apply these methodologies to derive exact analytical solutions of the extended equations related to the generalized (3+1)-dimensional KP equation introduced earlier. At the same time, we will leverage the capabilities of the refined models to broaden our research scope, attempting to apply BNNM to other areas such as fractional partial differential equations.