Yuntao Jin, Zhengjie Zhang*, Baitong Chang, Rui Cao, Hanqing Yu, Yefan Sun, Xinhua Liu and Shichun Yang
School of Transportation Science and Engineering, Beihang University, Beijing 102206, China. E-mail: zhengjie_zhang@buaa.edu.cn; yt_jin@buaa.edu.cn; changbaitong@buaa.edu.cn; crcaorui@buaa.edu.cn; hanqingyu@buaa.edu.cn; zy2213111@buaa.edu.cn; liuxinhua19@buaa.edu.cn; yangshichun@buaa.edu.cn
First published on 25th August 2025
As the basis for many functions of the battery management system (BMS), such as state estimation and thermal runaway warning, stable sampling data are crucial for the safe operation of electric vehicles (EVs). In this paper, a sampling fault diagnosis method for power battery data in cloud platforms is proposed based on a residual network (ResNet) and a bi-directional long short-term memory (BiLSTM) neural network, which can effectively identify abnormalities in the battery sampling data and recognize the failure modes. Firstly, through the analysis of fault data and sampling circuits of real EVs, four typical failure modes are selected for fault injection experiments. A physical simulation model of the fault circuit is established, and the corresponding mathematical empirical model is derived from it. Then, based on the understanding of the abnormal data distribution pattern, two fault diagnosis algorithms are developed: one based on a threshold and one based on the ResNet–BiLSTM neural network. Finally, the algorithms are tested on a simulation dataset and a real-vehicle dataset. The results show that both algorithms have high effectiveness and accuracy, with the latter exhibiting strong fault diagnosis capability. In summary, the proposed sampling fault diagnosis method is feasible and provides a theoretical basis for future multi-type fault diagnosis of BMSs.
The cloud platform for power battery data normally performs functions such as state of health (SoH) estimation, fault diagnosis and remaining useful life (RUL) prediction based on information uploaded from the vehicle, all of which require accurate, real-time battery data.3 In the fault diagnosis of power battery systems, current research mainly focuses on faults of the battery itself, such as internal short circuit, overcharge, over-discharge and capacity loss.4 Li et al. developed a thermal runaway warning algorithm for abnormal heat production using vehicle state, driving behavior and local weather as inputs.5 Jiang et al. used a state representation methodology (SRM) for battery fault diagnosis based on original cell voltages, which captured subtle changes in the battery and enabled rapid fault identification.6 Zhang et al. built a deep learning fault identification model based on a dynamical autoencoder using voltage, current, temperature, state of charge and other signals as inputs, with a dataset comprising over 690 000 charging snippets from 347 vehicles’ battery packs.7 These studies apply popular data-driven techniques while taking the authenticity and reliability of the data collected by sensors for granted.8
However, in actual use, factors such as vehicle/cloud-end communication interruption, sampling signal failure and cloud platform component abnormality may cause data to be missing, damaged or misaligned.9 Owing to long-term operation and severe working environments, sensor faults are inevitable within the lifespan of a BMS, even though the probability of a fault occurring is on the order of one part per million.10 Therefore, it is essential to better understand the failure mechanisms of power electronic components and to explore innovative approaches to increase the reliability of power electronic circuits and systems.11 Cloud platforms can usually interpolate and infill, or directly reject, missing and damaged data. However, misaligned data are often difficult to distinguish directly from normal data, which can easily lead to false alarms and distorted state estimation.12 In engineering practice, identifying signal errors caused by sampling faults has become a critically needed function for cloud-side data platforms. Currently, sensor fault diagnosis research is divided into three stages: detection, identification and quantification. All three stages rely on a high-fidelity model of the battery sampling circuit to explore the failure mechanism of the system.13,14 Meanwhile, to address the scarcity of samples caused by the low probability of the failure event itself, researchers have generated customized failure sample datasets by establishing physical simulation models or data-driven models of the research object. This balances the number of positive and negative samples and enables the development of diagnosis algorithms.15,16
Throughout the lifespan of EVs, onboard sampling circuit failures caused by bumpy driving, humid environments or aging devices are extremely common, with specific failure modes such as short or open circuits in wiring harnesses, components or connectors. In the most serious scenario, such as the direct breakdown of the sampling chip, the data cannot be collected on the vehicle or in the cloud, and the failure is detected directly by the system at that time; this case is outside the scope of this paper. This paper mainly focuses on the abnormalities of sampling circuits, aiming to reduce the impact of data acquisition errors on the BMS functions. In Section 2 of this paper, different fault injection simulation experiments are performed on a sampling circuit board from real-vehicle BMSs. The corresponding failure physical models are established using MATLAB/Simscape software, and the mathematical empirical models for different failure modes are summarized. The experiments and models provide a theoretical basis for the subsequent development of sampling fault diagnosis algorithms. In Section 3, a threshold algorithm for identifying anomaly patterns in the data and a deep learning algorithm for detecting anomalies in temporal data are developed, and the algorithms are tested and validated on a 30-vehicle-scale dataset of real EVs. Section 4 presents the results and discussion, and the conclusion is summarized in Section 5.
The research object of this paper is a sampling board equipped with an LTC6811 sampling chip, which is widely used onboard at present. It can simultaneously measure the voltage of up to 12 cells, with a measurement range of 0–5.5 V per channel and an error of less than 1.2 mV. When the sampling chip is in operation, the voltage difference between each branch (i.e., C0, C1 and C2) and the reference ground, as shown in Fig. 2(a), is denoted Usample,i, with the branches indexed from nearest to farthest from the reference ground. The cell sampling voltage Ui is obtained as the difference of adjacent measurements, as in eqn (1). In addition, Ubat,i denotes the true terminal voltage of each cell in this paper. With normal sampling circuits, the terminal voltage of the battery Ubat,i should be equal to the sampling voltage Ui. When a fault occurs, the sampling voltage is offset from the true battery voltage.
Ui = Usample,i − Usample,i−1, (i = 1, 2, 3, …, 12) | (1) |
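As a minimal sketch of eqn (1), the differencing of branch voltages can be written as follows. The function name and the example branch voltages are illustrative assumptions and are not taken from the experiments.

```python
def cell_voltages(u_sample):
    """Return cell sampling voltages U_i = U_sample,i - U_sample,i-1.

    u_sample[0] is the branch at the reference ground (C0) and
    u_sample[i] is the voltage of branch Ci relative to that ground.
    """
    return [u_sample[i] - u_sample[i - 1] for i in range(1, len(u_sample))]


# Hypothetical branch voltages for three cells in series (volts).
branches = [0.0, 3.5, 7.25, 11.0]
print(cell_voltages(branches))
```

Because every cell voltage is a difference of two branch measurements, an error on a single branch propagates into the two cells that share it, which is the behaviour the fault models in this section describe.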
Fig. 2 Sampling board hardware circuit topology: (a) differential filter circuit and (b) ground filter circuit.
The sampling fault studied in this paper is mainly related to the sampling board, whose hardware topology currently comes in two main kinds: ground filtering and differential filtering, as shown in Fig. 2. The ground filtering circuit connects all the filter capacitor branches of the cells to ground, which reduces the crosstalk between branch currents found in the differential filtering circuit, and adds a voltage regulator diode to each branch to avoid the energy impact when a large current passes through. In practice, ground filtering is widely used because it achieves better voltage ripple suppression, even though its cost is higher.
This section mainly focuses on the more complex ground filtering circuit to carry out fault injection experiments. The experimental setup for simulating sampling faults is depicted in Fig. 3 and comprises a constant voltage power source (used to simulate battery inputs), the sampling host computer and the BMS sampling board. Using a battery module in the experiments carries a risk of inducing short circuits, which could lead to fire or explosion. Therefore, this study initially employs a constant voltage power source equipped with a safety circuit for the simulation. Subsequently, using a calibrated fault model, this paper investigates the behavior of abnormal data under real vehicle conditions. The experimental setup uses a sampling board equipped with two sampling chips capable of simultaneously collecting 24 channels of cell voltage data. In this study, we specifically focus on conducting fault injection experiments on 12 selected channels of voltage associated with one of the chips, while the remaining 12 channels of grounding are not subjected to investigation.
Based on the statistics and analysis of real-vehicle faults and different device failure modes, the injection and evaluation of four representative fault types, namely, sampling harness breakage, equalization loop closure, filter capacitor breakdown and voltage regulator diode breakdown, were completed, respectively. Furthermore, a simplified physical model was constructed according to the topology of the circuit using MATLAB/Simscape software, and the circuit analysis and modeling part of this paper can be found in the Appendix. To streamline computations in the model, only the first six cells’ voltages that were affected by faults were retained. Building upon this foundation, a substantial dataset comprising both fault and normal scenarios was generated through simulations. This dataset serves as a valuable resource for training and validating deep learning algorithms of anomaly detection. In the following, we outline the simulation experiments of each type of fault injection and elucidate the method used for establishing the corresponding fault modes. The fault injection locations were strategically chosen, focusing on the middle cells of the circuit or selecting the first and last cells.
![]() | (2) |
![]() | (3) |
Assume that the cell voltage collected from the faulty branch at this time is Un, where n = 2, 3, …, N − 1. The physical model of fault injection is built using MATLAB/Simscape software according to the hardware failure mechanism. The simulation results align with the experimental data, leading to a more universally applicable mathematical empirical model, as shown in eqn (4).
Un − Ubat,n = Ubat,n+1 − Un+1 = Uoverhang | (4) |
Similarly, the cell voltages at the time of disconnecting the first or last branches can be obtained through simulation, as shown in Fig. 4(b) and (d), respectively. The first and last cell sampling voltages in the above two disconnected cases are respectively shown as follows:
U1 = Ubat,1 − Uoverhang | (5) |
Ulast = Ubat,last + Uoverhang | (6) |
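The empirical model of eqns (4)–(6) can be sketched in a few lines to generate harness-breakage samples. The branch indexing used here (branch b lies between cell b and cell b + 1, with branch 0 below cell 1) and the numerical value of Uoverhang are assumptions made for illustration only.

```python
def inject_harness_break(u_bat, branch, u_overhang):
    """Apply the empirical harness-breakage model of eqns (4)-(6).

    u_bat      : true cell voltages, u_bat[0] being cell 1
    branch     : broken branch, 0 (below cell 1) .. N (above cell N)
    u_overhang : voltage offset of the floating branch (assumed known)
    """
    n_cells = len(u_bat)
    if not 0 <= branch <= n_cells:
        raise ValueError("branch index out of range")
    u = list(u_bat)
    if branch >= 1:
        # Cell below the broken branch: U_n = U_bat,n + U_overhang.
        u[branch - 1] += u_overhang
    if branch < n_cells:
        # Cell above the broken branch: U_n+1 = U_bat,n+1 - U_overhang.
        u[branch] -= u_overhang
    return u


# Hypothetical pack of four cells at 3.5 V with a 0.25 V offset.
print(inject_harness_break([3.5, 3.5, 3.5, 3.5], branch=2, u_overhang=0.25))
```

For an interior branch the two affected cells deviate symmetrically, so their sum is unchanged, whereas a break at the first or last branch shifts only one cell, as in eqns (5) and (6).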
![]() | (7) |
Kirchhoff's voltage law is then used to calculate the measured voltages of the first four affected branches, respectively, as shown in eqn (8).
![]() | (8) |
The sampled voltage value of the corresponding cell is determined by the difference between the voltages of two adjacent circuits, as shown in the following equation.
![]() | (9) |
From the calculation results of eqn (9) and the experimental results, it can be found that the voltage of the faulty cell is lower than the reference voltage. Simultaneously, as illustrated in Fig. 5(c), the voltages of the two adjacent cells surpass the reference voltage, with the latter cell exhibiting a more notable voltage shift. From eqn (9), a line resistance RL,2 of 0.1915 Ω for branch 2 and RL,3 of 0.2533 Ω for branch 3 can also be calculated. The simulation results derived from the fault physical model align with the experimental results, enabling the extension to establish the mathematical empirical model for equalization loop closure faults, as illustrated in eqn (10).
![]() | (10) |
The same principle is deduced for the case of an equalization fault in the first cell, as shown in Fig. 5(b):
![]() | (11) |
The sampled voltage values of the abnormal cell 1 and cell 2 can be obtained as shown in the following equation:
![]() | (12) |
As shown in Fig. 5(d), when an equalization loop closure fault occurs in the last battery cell, only its own voltage measurement is affected, and the voltage value is as follows:
![]() | (13) |
The RC filter on the sampling board mainly serves to reduce the interference of environmental noise with the voltage signal. Under normal operating conditions, the grounded filter capacitor is equivalent to an open circuit. Therefore, this paper simulates filter capacitor breakdown as a short-circuit failure mode. Similarly, a 0 Ω resistor is used to replace the filter capacitor of the branch corresponding to cell 2, which grounds that branch; thus, its measured voltage is 0 V. When two adjacent regulator diodes are in the conduction state, the voltage difference between the first branch and 0 V exceeds the forward conduction voltage of the diode, denoted as Upos_lead, and the measured voltage is about Upos_lead = 0.76 V. The voltage difference between the third branch and 0 V is greater than the reverse conduction voltage of the diode, denoted as Uneg_lead, and the measured voltage is about Uneg_lead = 7.5 V. The measured voltages of the first four affected branches are shown in eqn (14).
![]() | (14) |
The sampled voltage value of the corresponding cell is determined by the difference between the voltages of two adjacent circuits, as shown in the following equation:
![]() | (15) |
Specifically, despite the inability of the voltage difference between the fourth branch and the third branch to induce reverse conduction in the diode, the sampling voltage surpasses the upper and lower protection limits configured for the chip. Consequently, it is constrained to a range between 0 and 5.5 V. Similarly, the sampling voltages obtained from the other branches subject to these limitations are as follows:
![]() | (16) |
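The cell-2 case described above can be illustrated with a short sketch. It is one reading of the text rather than a reproduction of eqns (14)–(16): branch 2 is taken as grounded, branches 1 and 3 are taken at the quoted conduction voltages, the remaining branches are assumed to keep their healthy values, and every difference is limited to the 0–5.5 V measurement range. The function names and the 3.5 V cell voltage are hypothetical.

```python
U_POS_LEAD = 0.76        # forward conduction voltage quoted in the text (V)
U_NEG_LEAD = 7.5         # reverse conduction voltage quoted in the text (V)
U_MIN, U_MAX = 0.0, 5.5  # measurement limits of the sampling chip (V)


def clamp(u):
    """Limit a sampled voltage to the chip measurement range."""
    return min(max(u, U_MIN), U_MAX)


def capacitor_fault_cell2(u_bat):
    """Sampled cell voltages when the filter capacitor of branch 2 is shorted.

    Assumes at least four cells and that branches beyond the third keep
    their healthy voltage (no further diode conduction).
    """
    if len(u_bat) < 4:
        raise ValueError("sketch expects at least four cells")
    branch = [0.0]                    # branch 0 is the reference ground
    for u in u_bat:
        branch.append(branch[-1] + u)  # healthy branch voltages
    branch[1] = U_POS_LEAD            # pulled to the forward conduction voltage
    branch[2] = 0.0                   # shorted to ground
    branch[3] = U_NEG_LEAD            # held at the reverse conduction voltage
    return [clamp(branch[i] - branch[i - 1]) for i in range(1, len(branch))]


print(capacitor_fault_cell2([3.5] * 6))
```

Under these assumptions the faulty cell reads 0 V, the cell before it reads the forward conduction voltage, and the two following cells saturate at the upper limit, while the cells further away are unaffected.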
As shown in Fig. 6(c), the simulation results are consistent with the experimental results. The physical model of the filter capacitor fault is expressed as a mathematical empirical model. The sampling voltage of the faulty cell remains at 0 V, while the voltage of the preceding kth cell deviates by a magnitude of k forward conduction voltages. Normal voltage levels are reinstated when the measured voltage falls below the cumulative forward conduction voltages. Conversely, the voltage of the subsequent kth cell deviates by k reverse conduction voltages, until the normal value is restored when the measured voltage is less than the cumulative reverse conduction voltages, as depicted in eqn (17).
![]() | (17) |
In this case, the cell sampling voltages can be further derived as follows:
![]() | (18) |
When the fault is injected into the first cell, all cells other than the first two have normal voltages. The faulty voltages are as follows (as shown in Fig. 6(b)):
![]() | (19) |
Similarly, after the fault is injected into the last cell, the sampling voltage of the first cell remains normal, while the voltages of all the other cells are 0 V (as shown in Fig. 6(d)).
![]() | (20) |
![]() | (21) |
The sampling voltages of the different cells can be obtained by taking the difference between the voltages of two adjacent circuits, as in the following equation (see Fig. 7(c)).
![]() | (22) |
The resulting mathematical empirical model is as follows:
![]() | (23) |
When a voltage regulator diode breakdown fault is injected into the first cell (as shown in Fig. 7(b)), the sampling voltages of the first two cells are as follows:
![]() | (24) |
Injecting a fault into the last cell yields the following sampling voltages for the last two cells (as shown in Fig. 7(d)):
![]() | (25) |
In summary, by conducting fault injection experiments and replicating faults in the physical model, empirical mathematical models corresponding to the four failure modes (i.e., sampling harness breakage, equalization loop closure, filter capacitor breakdown and voltage regulator diode breakdown) can be developed. This enables the generation of a substantial volume of simulated failure data for subsequent algorithmic development. Meanwhile, the method can also be used to explore other possible device failure patterns. This paper only discusses the above four types of typical failures that have occurred in the real vehicle and test process.
Fig. 8 The development of the sampling fault diagnosis algorithm in this paper, following the “test–model–algorithm–validation” procedure.
No. | Fault type | Fault causes |
---|---|---|
1 | Bias fault | Stable deviations between the sensor output and the true value due to bias currents in the circuit, such as in this paper when the sampling fault occurs at the non-start and non-end cells. |
2 | Impulse fault | The sensor is disturbed by a certain pulse signal, such as a faulty connection between cells in the battery pack. An abnormal pulse signal will appear when the vehicle vibration reaches a certain amplitude. |
3 | Drift fault | Deviation of the output due to performance degradation, temperature drift, zero drift, etc., such as capacity diving and electrolyte leakage of the battery. |
4 | Periodic fault | Sensors disturbed by a certain periodic signal cause the measured value to show a periodic trend, typically occurring in rotating components such as motors and bearings. |
5 | Open circuit fault | Sensor outputs are maximized due to disconnection of the power source system, component damage, etc., such as the filter capacitor breakdown and voltage regulator diode breakdown faults involved in this paper. |
6 | Short circuit fault | Sensor outputs are close to zero due to component breakdown, device sticking, etc., such as the filter capacitor breakdown and voltage regulator diode breakdown faults involved in this paper. |
According to Section 2, the data distribution pattern resulting from various failure types can be summarized. The sampling faults are mainly characterized by the following features:
(1) Failure of adjacent battery cells. Since the cell sampling voltage is obtained by differential calculations, the failure of a cell at any position other than the first and last two cells usually affects the sampling voltage calculation results of the adjacent cells.
(2) Similarity of voltage differential sequences. From the mathematical model of sampling faults, it can be seen that the differences between the voltage of each faulty cell and the reference voltage (usually taken as the median or the mode of all cell voltages) are similar. Moreover, the sequence of voltage differences at adjacent moments of each faulty cell also follows a consistent pattern.
(3) Abnormal voltages can have fixed upper and lower limits. Sampling faults can hold the abnormal voltage stably at the upper or lower limit (5.5 V or 0 V). In contrast, battery internal failures (e.g., capacity diving, internal short circuits and electrolyte leakage) typically lead to a gradual deviation of the voltage data from that of normal cells rather than a stable value at a limit.
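To illustrate how these features can be turned into a threshold test, the sketch below flags every cell voltage whose deviation from the per-sample reference voltage exceeds a threshold, giving a 0/1 matrix of cells against time steps. It is not the paper's Algorithm 1; the function name, the matrix orientation and the 0.1 V threshold are assumptions chosen for the example.

```python
from statistics import median


def anomaly_matrix(voltages, threshold):
    """Flag entries that deviate from the per-sample reference voltage.

    voltages  : list of time steps, each a list of cell voltages (V)
    threshold : allowed absolute deviation from the reference (V)
    Returns a 0/1 matrix with one row per cell and one column per time step.
    """
    n_cells = len(voltages[0])
    flags = [[0] * len(voltages) for _ in range(n_cells)]
    for t, sample in enumerate(voltages):
        ref = median(sample)  # reference voltage at this time step
        for c, u in enumerate(sample):
            if abs(u - ref) > threshold:
                flags[c][t] = 1
    return flags


# Hypothetical data: cells 2 and 3 deviate symmetrically from step 1 onward.
data = [
    [3.5, 3.5, 3.5, 3.5],
    [3.5, 3.75, 3.25, 3.5],
    [3.5, 3.75, 3.25, 3.5],
]
for row in anomaly_matrix(data, threshold=0.1):
    print(row)
```

In such a matrix, a sampling fault that affects adjacent cells over consecutive samples shows up as a contiguous block of 1s, consistent with features (1) and (2).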
A threshold-based sampling fault diagnosis algorithm has been developed on the basis of the above patterns. The flow of the algorithm is shown in Fig. 9, and the pseudo-code is shown below:
The pseudo-code of finding the all-1 maximal rectangle in the matrix mentioned in step 3 of Algorithm 1 is as follows:
Algorithm 2. Algorithm of finding the all-1 maximal rectangle |
---|
Result: The area of the all-1 maximal rectangle (max_area), the position of the upper left corner (max_pos), and the length and width of the coverage (max_size). |
Input: A matrix consisting of 0s and 1s with m rows and n columns. |
Initialize m, n, heights (n zeros), max_area, max_pos, and max_size. |
for i from 0 to m − 1: |
  Update heights for each column j using matrix [i][j] |
  Initialize stack with −1 |
  for j from 0 to n (taking heights [n] as 0 so that the stack is emptied): |
    while (stack top is not −1) and (heights [j] < heights [stack [−1]]): |
      – Pop the last element from stack and take its height as h |
      – Calculate w and update max_area, max_pos, and max_size if necessary |
    Append j to stack |
Return max_area, max_pos, max_size |
The algorithm traverses the matrix row by row, maintaining a one-dimensional array whose length equals the width of the input matrix. For each row, this auxiliary array stores the histogram of consecutive 1s in each column ending at that row. A monotonic stack of column indices, whose heights do not decrease from bottom to top, is then built over each histogram. When a column is popped, its height is the height of the candidate all-1 rectangle, and the width is the index of the current column minus the index of the new stack top minus one. If no column index remains on the stack, the width is the index j of the current column. The rectangle with the largest area is computed and selected as the required anomaly fragment.
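A runnable sketch of this histogram-and-stack procedure is given below. The function name and the return convention (upper-left corner as a row and column pair, size as height and width) are assumptions for illustration; the paper's own implementation may differ in detail.

```python
def max_all_ones_rectangle(matrix):
    """Return (max_area, max_pos, max_size) of the largest all-1 rectangle.

    max_pos is the (row, column) of the upper-left corner and max_size is
    (height, width); both are None when the matrix contains no 1.
    """
    if not matrix or not matrix[0]:
        return 0, None, None
    n = len(matrix[0])
    heights = [0] * n
    max_area, max_pos, max_size = 0, None, None
    for i, row in enumerate(matrix):
        # Histogram of consecutive 1s ending at row i, one bar per column.
        for j in range(n):
            heights[j] = heights[j] + 1 if row[j] == 1 else 0
        stack = [-1]  # sentinel; bar heights on the stack never decrease
        for j in range(n + 1):
            current = heights[j] if j < n else 0  # final 0 empties the stack
            while stack[-1] != -1 and current < heights[stack[-1]]:
                h = heights[stack.pop()]
                w = j - stack[-1] - 1
                if h * w > max_area:
                    max_area = h * w
                    max_pos = (i - h + 1, stack[-1] + 1)
                    max_size = (h, w)
            stack.append(j)
    return max_area, max_pos, max_size


# Hypothetical 0/1 anomaly matrix with a 2 x 2 block of flagged entries.
flags = [[0, 0, 0], [0, 1, 1], [0, 1, 1], [0, 0, 0]]
print(max_all_ones_rectangle(flags))
```

Each row is processed in time linear in the number of columns, because every column index is pushed and popped at most once, so the whole search costs on the order of m × n operations.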
While the logic of the threshold-based algorithm is relatively simple, it is imperative to determine the threshold size based on the data distribution pattern of the actual fault. It should be noted that various fault types may trigger the same threshold alarm, posing a challenge in identifying the specific cause of the fault. Therefore, it is necessary to develop a set of diagnosis algorithms which can realize fault recognition and cause inference at the same time.
CNNs have been widely used in pattern classification, object detection and object recognition, where efficient convolutional computation replaces complex image feature extraction operations. The network is mainly composed of convolutional layers, pooling layers, activation functions, etc. By combining and arranging different layers, classical network structures such as ResNet and the visual geometry group (VGG) network are created.22,23 The close connection between CNN layers and their ability to acquire spatial information make CNNs particularly suitable for image processing and understanding, with strong generalization ability and high computational efficiency. Fig. 10(a) shows the network structure of a ResNet block, which consists of a convolutional layer, a normalization layer (here, batch normalization), an activation function (here, ReLU) and a residual shortcut connection. The design of the residual block helps avoid vanishing and exploding gradients when the network is deepened, and speeds up the convergence of the network.
RNNs are mainly used to process time-series data. They are recursive along the evolutionary direction of the sequence and are formed by chaining recursive units together. Variant unit structures such as long short-term memory (LSTM) and the gated recurrent unit (GRU) have been designed to address the problems of temporal memory and dependence in sequence data.24 On this basis, researchers have found that an LSTM structure combining forward and backward processing can better capture bi-directional temporal dependencies. This variant is known as the Bi-LSTM network.25 The structure of the LSTM unit used in this paper is shown in Fig. 10(b). Its pivotal feature is a gating mechanism that controls the information transfer path through three gates: the input gate it, the forgetting gate ft and the output gate ot.26 The gates in the LSTM unit are soft, taking values between 0 and 1 that signify the proportion of information allowed through. The functions of these three gates are as follows:
(1) The forgetting gate ft controls how much information from the internal state ct−1 of the previous moment is forgotten;
(2) The input gate it controls how much information from the candidate state c̃t of the current moment is saved;
(3) The output gate ot controls how much information is passed from the internal state ct to the external state ht at the current moment.
The three gates and the candidate states are computed as follows:
![]() | (26) |
Through this gating strategy, the LSTM determines whether historical information from the training data should be absorbed, retained or discarded. Consequently, important historical information held in the memory unit c persists longer than short-term memory but not as long as long-term memory. This characteristic enables LSTM neural networks to outperform traditional RNNs.
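For reference, the gate and state updates of an LSTM unit are commonly written as below. This is the standard textbook formulation and may differ in notation from eqn (26); the input x_t, the weight matrices W and U, the biases b, the logistic function σ and the element-wise product ⊙ are symbols assumed here.

```latex
\begin{aligned}
i_t &= \sigma\left(W_i x_t + U_i h_{t-1} + b_i\right), \\
f_t &= \sigma\left(W_f x_t + U_f h_{t-1} + b_f\right), \\
o_t &= \sigma\left(W_o x_t + U_o h_{t-1} + b_o\right), \\
\tilde{c}_t &= \tanh\left(W_c x_t + U_c h_{t-1} + b_c\right), \\
c_t &= f_t \odot c_{t-1} + i_t \odot \tilde{c}_t, \\
h_t &= o_t \odot \tanh\left(c_t\right).
\end{aligned}
```

The fifth line makes the roles of the gates explicit: f_t scales what is kept from the previous internal state, i_t scales what is written from the candidate state, and o_t in the last line scales what is exposed as the external state.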
In this paper, the measured voltage signal is affected by the working conditions and noise. Relying solely on CNNs to extract the correlation among multiple cells would neglect the inherent temporal characteristics of the voltage data. While the RNN considers the temporal characteristics of voltage, it struggles to capture the adjacent-cell signal anomalies caused by sampling faults, and the network itself may suffer from vanishing and exploding gradients during training. The voltage data in this paper contain local spatial features arising from the interaction between faulty cells and adjacent cells, as well as multiscale temporal dependencies. Neither network structure alone can extract both kinds of features; thus, in this paper, the fault diagnosis model is constructed by connecting the ResNet network and the BiLSTM network in series.27 The ResNet–BiLSTM fault diagnosis model constructed in this paper is shown in Fig. 10(c). The structure of the model is as follows:
(1) Firstly, the algorithm stacks three residual blocks and one max-pooling layer, which is designed to extract local features. The residual block contains two sets of the 3 × 3 convolutional kernel layer, batch normalization layer and ReLU activation function layer, and uses the shortcut mechanism to accelerate model training.
(2) In order to capture long-term dependencies from a sequence of local features, a Bi-LSTM layer is introduced to build ResNet–BiLSTM networks following the ResNet layers. The model is composed of a 2-layer bidirectional LSTM network, where a single-layer LSTM network encompasses 128 hidden states.
(3) In Section 4, model ablation experiments illustrate that the 2D ResNet–BiLSTM network effectively learns numerous spatio-temporal features from the data. Furthermore, a dropout mechanism is integrated into the final linear layer of the classification network to prevent the model from succumbing to overfitting.
The performance of a deep learning neural network model depends strongly on its hyper-parameter settings. In this paper, grid search is used to optimize the network parameters, and the finalized settings are as follows: batch size = 64, learning rate = 0.001, and optimizer = Adam.
![]() | (27) |
![]() | (28) |
The kappa coefficient, also called the kappa index of agreement (KIA), is a statistic used to assess multi-class classification performance. In practical applications, its value generally lies in [0,1], and higher values represent better classification accuracy achieved by the model.
![]() | (29) |
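The extracted text does not preserve the metric equations, so the sketch below uses the standard definitions of accuracy, F1-score and Cohen's kappa computed from a confusion matrix. The function names and the example matrix are hypothetical and are not results from this paper.

```python
def accuracy_and_kappa(cm):
    """Accuracy and Cohen's kappa from a square confusion matrix.

    cm[r][c] counts samples of true class r predicted as class c.
    """
    k = len(cm)
    total = sum(sum(row) for row in cm)
    p_o = sum(cm[i][i] for i in range(k)) / total  # observed agreement
    row_tot = [sum(cm[i]) for i in range(k)]
    col_tot = [sum(cm[r][c] for r in range(k)) for c in range(k)]
    # Agreement expected by chance from the row and column marginals.
    p_e = sum(row_tot[i] * col_tot[i] for i in range(k)) / total ** 2
    kappa = (p_o - p_e) / (1 - p_e)  # undefined when p_e equals 1
    return p_o, kappa


def f1_score(cm, positive=1):
    """F1-score of one class, treating all other classes as negative."""
    tp = cm[positive][positive]
    fp = sum(cm[r][positive] for r in range(len(cm))) - tp
    fn = sum(cm[positive]) - tp
    return 2 * tp / (2 * tp + fp + fn)


# Hypothetical binary confusion matrix (class 0 = normal, class 1 = fault).
example = [[45, 5], [10, 40]]
acc, kappa = accuracy_and_kappa(example)
print(round(acc, 4), round(kappa, 4), round(f1_score(example), 4))
```

Unlike accuracy, kappa subtracts the agreement expected by chance, which is why it is a useful complement when the classes are imbalanced, as is the case for rare sampling faults.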
Ablation experiments are designed to explore the accuracy of models composed of different algorithmic blocks. These experiments assess the effectiveness of the proposed algorithm by comparing the results of the ResNet, BiLSTM and ResNet–BiLSTM algorithms.29 The results of the algorithms on the test set are shown in Fig. 11, where Fig. 11(a) shows the binary classification confusion matrices obtained by the different algorithms on the test set for determining whether a sampling fault occurs. Each row of the matrix represents the number of samples in a true class, and each column represents the number of samples in a predicted class. The more samples that fall on the diagonal of a confusion matrix, the better the classification of the model. The results of the confusion matrices are further quantified using the accuracy and F1-score indicators, as shown in Table 2, which shows that the results of the ResNet–BiLSTM algorithm are significantly better than those of the other three algorithms, with a classification accuracy of 98.86%. Table 2 also includes the basic CNN-LSTM algorithm as a benchmark, used to analyze the capability of the ResNet–BiLSTM algorithm in characterizing the spatio-temporal distribution of fault data. The results likewise demonstrate that the ResNet–BiLSTM algorithm exhibits marked advantages in both accuracy and classification performance. The last two columns of the table compare the computational time and model size required by the different algorithms for processing the test data. It is found that the ResNet–BiLSTM algorithm achieves superior results without significantly increasing the computational demand. Additionally, the F1-score attains a value of 0.9868, which indicates that the model achieves excellent classification results and sufficient stability.
It is also found that the simplest threshold recognition algorithm achieves high accuracy, and its indicators are even better than those of the BiLSTM algorithm. Since the threshold-based algorithm is unable to multi-classify the samples, the confusion matrices of the three neural network algorithms are further subdivided into multi-classification confusion matrices for seven types of data. Among these, the confusion matrix of the ResNet–BiLSTM algorithm is shown in Fig. 11(b). The true labels and the predicted labels correspond almost completely along the main diagonal of the matrix, indicating that the algorithm has good classification ability for different fault types. The figure also shows that no normal samples are misclassified, which can greatly reduce costs in real industrial application scenarios. Although 8 faulty samples are misclassified as normal, they may still be recognized by the fault diagnostic algorithm for the battery itself.30 Using the KIA as an evaluation indicator of multi-classification ability, it is found that the ResNet–BiLSTM algorithm is still better than the ResNet and BiLSTM algorithms, which indicates that the algorithm can accurately classify the failure modes. The inclusion of simulated internal short circuit and capacity degradation fault data makes it possible to test whether the algorithm can distinguish between sampling faults and battery faults. As can be seen from the confusion matrix, the number of false alarms between different fault types is relatively small. This is primarily because battery faults typically affect only a single cell and induce sustained voltage outliers, whereas sampling faults intermittently corrupt measurements across multiple channels and frequently produce distinctive anomalous values (such as 0 V). Fig. 11(c) illustrates the accuracy curves of the three neural networks on the training and test sets. The ResNet–BiLSTM algorithm obtains stable results at about the thirtieth epoch, and its accuracy on the training and test sets is significantly higher than that of the other algorithms. In summary, it is considered to have potential for practical engineering applications.
Algorithm | ACC | F1-score | KIA | Time (ms) | Model size (M) |
---|---|---|---|---|---|
Threshold-based | 0.8829 | 0.8794 | — | 22.2311 | — |
ResNet | 0.9643 | 0.9600 | 0.8850 | 118.3542 | 6.31 |
BiLSTM | 0.8100 | 0.7629 | 0.7783 | 92.5617 | 4.09 |
CNN-LSTM | 0.9443 | 0.9349 | 0.8533 | 102.1473 | 7.52 |
ResNet–BiLSTM | 0.9886 | 0.9868 | 0.8917 | 121.2940 | 8.72 |
Fig. 12(a) shows a sampling harness breakage fault occurring in a real vehicle. The voltages of cell 1 and cell 2 deviate as outliers with symmetrical deviations, while all the other cells behave consistently. Since the fault disappears at the end of the segment, it is hypothesized that the fault may be due to poor contact. Fig. 12(b) shows an equalization loop closure fault occurring in a real vehicle: cell 3 is shifted downward relative to the reference voltage, cell 2 and cell 4 are shifted upward relative to the reference voltage, and the shift amplitude of cell 2 is clearly smaller than that of cell 4. Consequently, the fault can be attributed to the closure of the equalization loop corresponding to cell 3.
Fig. 12(c) shows the test results of the threshold-based fault detection algorithm, which achieves an accuracy of 86.67% on the real-vehicle dataset, with 4 fault samples not correctly recognized. Fig. 12(d) illustrates that the accuracy of the ResNet–BiLSTM algorithm reaches 96.67%, and its KIA attains 0.9450. Importantly, no normal samples are erroneously classified as abnormal. Since sampling faults are rare events and the probability of a vehicle failing is less than 0.1%, the accuracy achieved by the ResNet–BiLSTM algorithm proposed in this paper on the real-vehicle test set can meet the application requirements in real-world scenarios.
The research on sampling faults in this paper follows the steps of “test–model–algorithm–validation”. The fault diagnosis algorithm developed can realize the accurate detection and classification of sampling faults in the cloud platform without increasing hardware redundancy. Furthermore, it can also better support the BMS in accomplishing more complex tasks such as state estimation and thermal runaway warning.
This journal is © The Royal Society of Chemistry 2025