BPNN based MECG elimination from the abdominal signal to extract fetal signal for continuous fetal monitoring

Fetal monitoring can help to recognize problems in the fetus at an early stage. This research work focuses on the design of a Back-propagation Neural Network (BPNN) and an Adaptive Linear Neural Network (ADALINE) to extract the Fetal Electrocardiogram (FECG) from the Abdominal ECG (AECG). The FECG is extracted to assess fetal well-being during pregnancy and to overcome some existing difficulties of fetal heart rate (FHR) monitoring systems. Different sets of ECG signals have been tested to validate the performance of the algorithm. The accuracy of QRS detection using the designed algorithm is 99%. This work further compares the performance and accuracy of various methods and finds that the developed algorithm gives the highest accuracy. The paper gives biomedical scientists, researchers, and end users a basis for extracting the FECG signal from the AECG signal for FHR monitoring, and provides information to help them develop more capable, flexible, and resourceful applications.


Introduction
FHR monitoring is routinely used to obtain significant information about the condition of a fetus during pregnancy and labor (HASAN et al., 2007). FECG characteristics (heart rate, dynamic behavior, and waveform) are useful in determining fetal life, fetal maturity, fetal development, and the existence of fetal distress or congenital heart disease (HASAN et al., 2007). FHR variations observed during pregnancy and labor have commonly been used as indirect indications of fetal condition. FHR monitoring can identify conditions that may lead to fetal and/or maternal mortality or morbidity (HELGASON et al., 2011). The FHR may change as the fetus responds to conditions in the uterus. An abnormal FHR or pattern may mean that the fetus is not getting enough oxygen or that there are other problems (HASAN et al., 2009). However, FHR abnormalities are unpredictable and may occur at any time. To monitor such abnormalities, ambulatory long-term monitoring of the FHR has proven to be a useful method, because the woman can maintain normal daily activities and work, and avoid unnecessary hospital stays.
The most familiar means of acquiring the FHR is Doppler ultrasound. The FECG can also be acquired for FHR measurement using surface electrodes on the mother's abdomen, whilst the fetal magneto-cardiogram (FMCG) is detectable using superconducting quantum interference device (SQUID) magnetometers, and fetal phonocardiography (FPCG) allows the heart sounds to be detected so that the FHR can be calculated (CROWE et al., 1995). Currently, Doppler ultrasound and FECG have proven to be reliable techniques for monitoring FHR (KHANDOKER et al., 2009). FHR monitoring using Doppler ultrasound is widely used and appropriate because an invasive test cannot be used daily. The advantage of the Doppler ultrasound technique is that a recording of the FHR is virtually assured. The disadvantages are that such systems require intermittent repositioning of the transducer and are only suitable for use by highly trained midwives. The ultrasound transducer is cumbersome and uncomfortable, and the procedure involves directing a 2 MHz signal towards the fetus. Because it actively transmits energy towards the fetus, Doppler ultrasound is not suitable for long periods of FHR monitoring (FUKUSHIMA et al., 1985; UNGUREANU et al., 2007). Doppler ultrasound techniques are also sensitive to movement, and most Doppler systems rely upon some form of averaging to produce their FHR data.
In contrast, methods utilizing the AECG have a greater prospect for long-term monitoring of FHR and fetal well-being using signal processing techniques. The AECG signal can also be used for antepartum non-invasive FHR determination through the detection of small fetal cardiac potentials at the surface of the maternal abdomen (SOLUM et al., 1980). The AECG can be used to produce true RR interval data, which is suitable for heart rate variability studies if required. Its advantage is that it is completely non-invasive and unobtrusive, has comparatively low power requirements, and can be used over extended periods (e.g. 24 hours). The method additionally allows the maternal heart rate (MHR) to be recorded, since the maternal ECG (MECG) is also detected from the AECG. Compared with Doppler ultrasound, using the AECG to extract the FECG provides additional information, although the accuracy of this technique is much lower than that of Doppler ultrasound (MARIA et al., 2001).
Recording and monitoring the FHR from electrodes on the maternal abdomen is the most convenient option for an ambulatory recorder, although it involves overcoming several difficulties. The main difficulty lies in detecting the fetal QRS complexes in the AECG signal, which consists of both the MECG and the FECG. This composite signal may also contain a relatively large amount of noise, and may be further distorted by muscle and breathing movements. A relatively weak FECG also causes difficulties. The signal strength of the MECG is usually many times that of the FECG. While this in itself may not be a problem if the noise level is low, it is definitely an issue if the maternal and fetal QRS complexes coincide. In that case the MECG completely overlaps the FECG so that only the maternal QRS complex is detectable. To overcome these problems, some multiple-lead algorithms use the thoracic MECG to cancel the MECG in the AECG and obtain the FECG (KHAMENE; NEGAHDARIPOUR, 2000), though this is inconvenient for the patient during long-term monitoring. Hence, to make the AECG suitable for the detection of the FECG, the signal-to-noise ratio (SNR) must be enhanced. This investigation therefore examines the possibility of constructing an ambulatory FHR recorder around the acquisition of the abdominal FECG. It attempts to overcome the above problems of the AECG method and to produce an FHR monitoring system capable of long-term ambulatory recording of the FHR.
A significant amount of work is being done to improve the SNR of the FECG signal (PARMAR; BHUVAN, 2010). Recent research shows that the nonlinear domain can be modeled more accurately with artificial intelligence technologies. Approaches such as fuzzy logic and moving average have been proposed to extract the fetal ECG from the abdominal ECG of a pregnant woman (HASAN et al., 2009; PARK et al., 1992). Among the different artificial intelligence tools, neural networks are increasingly applied to detect the FHR and extract the fetal ECG (PETRENAS et al., 2012). In this research, a BPNN has been designed. The BPNN is chosen mainly because it is adaptive to the nonlinear and time-varying features of the AECG signal (HASAN et al., 2011). It can be trained to recognize the normal waveform and filter out unnecessary artifacts.

Material and methods
The algorithm is a crucial part of processing the AECG to detect the R peaks in the QRS complexes of the MECG and FECG, from which the MHR and FHR are measured respectively. Many different techniques have been developed for FECG enhancement and detection from the AECG signal. The proposed methodology uses the AECG signal from the mother's abdomen; the algorithm must be capable of processing the AECG signal and continuously detecting the maternal and fetal QRS complexes. A neural network has been designed in this research to detect the QRS complex in the AECG signal. QRS complex detection is important because it allows the RR interval to be extracted for disease classification, so that physicians and clinicians can make correct decisions during the pregnancy. A neural network has been chosen mainly because it is adaptive to the nonlinear and time-varying features of the ECG signal (HASAN et al., 2011). It can be trained to recognize the normal waveform and filter out unnecessary artefacts from the ECG signal. In the designed algorithm, the AECG signal from the mother's abdomen is used for FHR extraction with artificial intelligence (AI). First, features are collected from the AECG and fed to the network to detect the QRS complex of the MECG and to remove noise in parallel. After this, AI is used again to extract the FECG by suppressing the MECG in the AECG for FHR monitoring.
For QRS complex detection, the neural network is first trained to memorize the characteristics of the R peak of a QRS complex. One distinct feature of the R peak is its large amplitude compared with other portions of the ECG, such as the P and T waves. The R peak has a high differentiation value because of the sudden, large change in amplitude over a short period. The duration for which the QRS complex exceeds a certain threshold is usually larger than for other parts of the signal. The R peak is also a maximum point, with a high positive differentiation value on the QR portion and a high negative one on the RS portion. If the differentiation signal is plotted, it crosses zero from positive to negative at the peak. In addition, the RR interval is usually almost constant for a particular ECG signal. If the point is the first point exceeding the threshold in a signal, a first-element flag is set. This lets the network learn that the first element can have a variable RR interval, since the RR interval calculated for it is the distance from the start of the signal. These attributes (amplitude, differentiation, duration, approximate RR interval, zero-crossing flag, and first-element flag) are used to train the network to recognize the R peak.
The flow chart of QRS complex detection is shown in Figure 1. The signal is raised to the power of two (squared) to emphasize samples with large amplitude and to reduce the effect of high-frequency but low-amplitude noise. Squaring also helps to detect peaks that have a low axis due to baseline drift: once the signal is squared, a negative peak with high amplitude can be detected when the positive peak is low.
The threshold is adaptive, i.e. it is updated according to the characteristics of the features. The maximum is chosen from the signal over an initialization period and is recalculated whenever a peak is found, to reflect the effect of the current peak. If the peak is low, the threshold is lowered for the next peak, and vice versa. This deals with the effect of baseline drift. The threshold is also lowered when no possible peak is detected over a maximum period, and the search is repeated over that period in case the peak has suddenly become low due to noise and baseline drift. In the algorithm, threshold1, threshold2 and threshold3 are defined in Equation (1), where Xmax is the maximum (chosen from the signal over an initialization period), threshold1 is 0.3875 times Xmax, threshold2 applies to the positive peak, and threshold3 applies to the negative peak. When a value greater than the threshold is found, the algorithm scans for the maximum over the duration in which the values stay above the threshold. The search for the next peak then starts from the end of that duration. For every possible peak found, the differentiation and the original amplitude at that point are calculated. There is a second threshold for the original peak, and the duration for which the original signal exceeds it is found. Since the R peak has the highest frequency content, the differentiation is computed. If the current differentiation value is positive and the next value is negative, the crossing-zero flag is set to one to indicate a maximum. If, on the other hand, the current differentiation value is negative and the next is positive, the crossing-zero flag is set to a negative value to indicate a minimum. The interval between the current point and the previous possible peak is then calculated. This gives the approximate RR interval used to determine whether the point is an R peak.
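The candidate-peak search described above can be sketched in Python as follows. This is only an illustration under stated assumptions, not the authors' implementation: the function name `find_candidate_peaks`, the length of the initialization period, and the rule used to update the running maximum (averaging the old maximum with the new peak) are hypothetical, and the lowering of the threshold after a maximum period without peaks is omitted for brevity. Only the factor 0.3875 comes from the text.

```python
def find_candidate_peaks(signal, init_len=256, factor=0.3875):
    """Return indices of candidate R peaks found with an adaptive threshold."""
    if not signal:
        return []
    squared = [x * x for x in signal]      # squaring emphasizes large amplitudes
    x_max = max(squared[:init_len])        # maximum over the initialization period
    threshold = factor * x_max             # threshold1 = 0.3875 * Xmax (from the text)
    peaks = []
    i, n = 0, len(squared)
    while i < n:
        if squared[i] > threshold:
            # Scan the run of samples above the threshold and keep its maximum.
            start = i
            while i < n and squared[i] > threshold:
                i += 1
            peak = max(range(start, i), key=lambda k: squared[k])
            peaks.append(peak)
            # Assumed update rule (not from the paper): a low peak lowers
            # the threshold for the next peak, a high peak raises it.
            x_max = 0.5 * (x_max + squared[peak])
            threshold = factor * x_max
        else:
            i += 1
    return peaks
```

Because the search runs on the squared signal, a large negative deflection is picked up in the same way as a positive one, which matches the motivation given for squaring.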
The parameters (amplitude of the original signal, duration over the second threshold, differentiation value, possible RR interval, and crossing-zero flag) are written to a file. The file is then re-opened to check for redundant points. The minimum RR interval (RR low) is set to 5 samples. The signal is sampled at 128 samples per second and the normal human heart rate is about 72 beats per minute, which is equivalent to 1.2 beats per second. The average RR interval is therefore about 128/1.2, or roughly 107 samples, per QRS complex. It is very unlikely that two QRS complexes occur within 5 samples. Therefore, among peaks that are less than 5 samples from the previous possible peak, the peak with the highest amplitude is kept and the rest are discarded.
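As a worked illustration of this arithmetic and of the redundancy check, the following Python sketch computes the expected number of samples per beat and merges candidate peaks that lie closer than the minimum RR interval. The function name `merge_close_peaks` and the use of the absolute amplitude for the comparison are assumptions made for this example.

```python
FS = 128                  # sampling rate in samples per second (from the text)
BPM = 72                  # normal heart rate in beats per minute (from the text)
samples_per_beat = FS * 60 / BPM   # 128 / 1.2, about 107 samples per RR interval


def merge_close_peaks(peaks, amplitudes, min_rr=5):
    """Keep only the largest peak among candidates closer than min_rr samples.

    peaks      -- sorted sample indices of candidate peaks
    amplitudes -- signal amplitudes, indexable by sample index
    """
    kept = []
    for idx in peaks:
        if kept and idx - kept[-1] < min_rr:
            # Too close to the previous candidate: keep the larger one.
            if abs(amplitudes[idx]) > abs(amplitudes[kept[-1]]):
                kept[-1] = idx
        else:
            kept.append(idx)
    return kept
```

With a typical RR interval of about 107 samples, a 5-sample minimum is far below any physiological beat spacing, so the merge only removes duplicate detections of the same QRS complex.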
All the remaining points are fed into the network for training so that it can remember the characteristics of an R peak. During simulation, data whose characteristics are close to the training data are identified as R peaks. The ECG signal is squared because, when a particular QRS complex has a significantly low axis (baseline drift), the squared signal equals the square of its absolute value and therefore takes negative amplitudes into account. Squaring is also used because it clearly separates the peak amplitude from low amplitudes, so that unnecessary artifacts can be removed. The squared amplitude is used to detect samples exceeding the threshold, whereas the amplitude of the original signal is used as the neural network input, because the sign of the amplitude must be taken into account. If the amplitude is positive, the point is more likely to be an R peak. However, when the maximum window size is exceeded and no positive peak has been detected, a negative peak with an appropriate RR interval is considered.
The R peak usually has a larger duration than the P wave, the T wave, and noise, so it can be easily detected from its duration. The R peak also usually occurs at quite consistent intervals, which is a further criterion for identifying it. If no possible peak is found after a certain duration, the threshold is reduced and the search is repeated, in case of baseline drift.
The QRS complex usually has the largest differentiation values compared with other portions of the ECG. However, there are times when noise or the T wave has a differentiation value even larger than that of the QRS complex. The R peak is the maximum point, so at this point the derivative should change from positive to negative. If it does, the crossing-zero flag is set.
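A minimal Python sketch of the crossing-zero flag, assuming a simple first difference as the differentiation, is given below. The function name `crossing_zero_flags` and the use of -1 for a minimum are illustrative choices; the text only states that the flag is set to one for a maximum and to a negative value for a minimum.

```python
def crossing_zero_flags(signal):
    """Flag local maxima with 1 and local minima with -1, using the
    sign change of the first difference between adjacent samples."""
    diff = [signal[i + 1] - signal[i] for i in range(len(signal) - 1)]
    flags = [0] * len(signal)
    for i in range(len(diff) - 1):
        if diff[i] > 0 and diff[i + 1] < 0:
            flags[i + 1] = 1      # derivative goes from positive to negative: maximum
        elif diff[i] < 0 and diff[i + 1] > 0:
            flags[i + 1] = -1     # derivative goes from negative to positive: minimum
    return flags
```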
Back propagation is the generalization of the Widrow-Hoff learning rule to multiple-layer networks and nonlinear differentiable transfer functions. Feed-forward networks often have one or more hidden layers of sigmoid neurons followed by an output layer of linear neurons. Multiple layers of neurons with nonlinear transfer functions allow the network to learn nonlinear and linear relationships between input and output vectors. The linear output layer lets the network produce values outside the range -1 to +1. The designed back-propagation network has two layers (HASAN et al., 2007). It is a two-layer tan-sigmoid/linear network. Each layer has a weight matrix W, a bias vector b, and an output vector a. The architecture of the multilayer feed-forward network is shown in Figure 2. In the notation of that figure, for example, $lw^{2,1}_{3,5}$ is the layer weight from hidden layer 1 to layer 2, namely the weight from the 5th output of layer 1 to the 3rd neuron in layer 2. P is the input vector, n is the output-layer value before the transfer function, and a is the actual output vector. The network was designed with 6 inputs, which are the amplitude, differentiation, duration, RR interval, zero-crossing flag, and first-element flag of each point that needs to be judged as an R peak or not. The network is trained to output 1 for an R peak and 0 for a non-R peak. There are 13 neurons in the hidden layer. There is no definite way of determining the right number of neurons in the hidden layer; here it is chosen based on Kolmogorov's theorem. Kolmogorov's theorem states that if the number of input neurons is m, and the inputs are scaled to lie in the region from 0 to 1, a network with only one hidden layer and 2m+1 neurons in this layer can exactly map these inputs to the outputs. There should not be any constraint on the output for this theorem to be applicable. Therefore, 2*6+1 = 13 neurons are chosen for the hidden layer. However, Kolmogorov's theorem does not specify whether this network is an optimum solution for the mapping, and a network with this number of hidden neurons may not be the simplest one that can do the mapping. Because the network already takes quite some time to train with 13 neurons, no further search for an optimized number was carried out. The network is created with the newff command for two-layer neural networks, where net is the network object, the number of neurons in the hidden layer is 13, and the hidden-layer transfer function is tansig.
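To make the 6-13-1 architecture concrete, the following Python sketch builds a network of that shape and evaluates its forward pass. It is an illustration only and does not reproduce the MATLAB newff call or the Levenberg-Marquardt training used in the paper: the function names `make_network` and `forward`, the dictionary layout, and the random initialization range are assumptions. The hidden layer uses tanh, which is mathematically the same function as tansig, and the output layer is linear (purelin).

```python
import math
import random


def make_network(n_in=6, seed=0):
    """Create weights for a network with 2*n_in + 1 hidden neurons and 1 output."""
    n_hidden = 2 * n_in + 1                      # 2m + 1 rule: 13 for 6 inputs
    rng = random.Random(seed)
    return {
        "iw": [[rng.uniform(-0.5, 0.5) for _ in range(n_in)]
               for _ in range(n_hidden)],        # input weights, 13 x 6
        "b1": [rng.uniform(-0.5, 0.5) for _ in range(n_hidden)],
        "lw": [rng.uniform(-0.5, 0.5) for _ in range(n_hidden)],  # layer weights, 1 x 13
        "b2": rng.uniform(-0.5, 0.5),
    }


def forward(net, p):
    """Forward pass: tanh hidden layer followed by a linear output neuron."""
    a1 = [math.tanh(sum(w * x for w, x in zip(row, p)) + b)
          for row, b in zip(net["iw"], net["b1"])]
    return sum(w * a for w, a in zip(net["lw"], a1)) + net["b2"]
```

In use, `p` would hold the six normalized features of one candidate point, and an output near 1 after training would indicate an R peak.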
The output layer has 1 neuron with the purelin transfer function. The training function is trainlm, the Levenberg-Marquardt algorithm, a fast second-order training method. Choosing the number of neurons in the hidden layer remains a challenge. If the number of neurons is too large, the network needs more storage and training is more complicated. The memory is distributed over a large number of weights, some of which may be insignificant to the overall performance, and the network may overfit. Overfitting means that the network cannot generalize when presented with slightly different inputs. If the number is too small, the network may not be able to learn the mapping adequately. The network is initialized with settings for the display interval, the maximum number of epochs, and the error goal. Preprocessing and postprocessing are used before and after training, because the ranges of values of the different parameters differ too much. For example, the RR interval is normally below 350, but the amplitude can be as large as 30,000. Preprocessing normalizes the inputs so that training becomes smoother and faster. The output is postprocessed to return it to the original range.
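The effect of preprocessing and postprocessing can be illustrated with a simple min-max scaling to the range [-1, 1]. The text does not state which normalization was used, so this scaling and the function names `fit_minmax`, `normalize`, and `denormalize` are assumptions for this sketch.

```python
def fit_minmax(values):
    """Return the minimum and maximum of the training values of one feature."""
    return min(values), max(values)


def normalize(x, lo, hi):
    """Map x from [lo, hi] to [-1, 1] (preprocessing)."""
    return 2.0 * (x - lo) / (hi - lo) - 1.0


def denormalize(y, lo, hi):
    """Map y from [-1, 1] back to [lo, hi] (postprocessing)."""
    return (y + 1.0) * (hi - lo) / 2.0 + lo
```

After such scaling, a feature such as the RR interval (below 350) and a feature such as the amplitude (up to 30,000) occupy the same numeric range, so neither dominates the weight updates during training.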
For FECG extraction, the fetal signal is contained within the AECG of the pregnant woman, together with the maternal signal. It is well known that the fetal signal is easily corrupted by noise because of its weak amplitude. In addition, the maternal signal is the dominant ECG in the AECG. The FECG therefore needs to be separated from the AECG of the pregnant woman.
The architecture of the neural network is designed using the adaptive filtering approach, that is, the combination of an ADALINE (adaptive linear network) and a TDL (tapped delay line). In a TDL, the input signal (MECG) enters and passes through N-1 delays, and the output of the TDL is an N-dimensional vector made up of the input signal at the current time and the previous input values, which is fed to the ADALINE. To keep the complexity low, N is set to 2. The adaptive linear filter network is shown in Figure 3: the MECG, which is the signal to be predicted and is closely related to the AECG, passes through 1 delay, and the current and delayed values are multiplied by the two corresponding initial weights. After the weighted values are added, the sum passes through the linear activation function. Finally, the output of the network is subtracted from the target (AECG), and the weights are updated at every step to reduce the difference between the network output and the target signal. The remaining difference is taken as the FECG, obtained by suppressing the MECG in the AECG. The ADALINE filter is better than conventional filtering because it suppresses the unwanted component rather than filtering the whole signal, and so can avoid eliminating the desirable signal. The initial weights were w 1,1 = 0 and w 1,2 = -2. For the input signal, 1000 samples of the maternal signal were fed into the network, and for the target signal, 1000 samples of the abdominal signal were used. Initially, the learning rate and momentum were set to 0.8 and 0.5 respectively. Both are adjustable, and changing them also affects the output of the network. The input (MECG) is denoted by the vector p, the desired output is the target (AECG), and the adaptive weight is denoted by the vector w. The learning rate is η and the momentum is m. The network output is the weighted sum of the tapped-delay-line outputs passed through the linear transfer function.
The learning rule updates the weights at each time step. The operation of the network is as follows. The MECG, which is the signal to be predicted, enters the network through the tapped delay line. The value that enters the network directly is the current value, and the output of the delay is the previous value. With N = 2, these two values are multiplied by the two weights, and the weighted values enter a summer and the linear transfer function. Since the target is the AECG of the pregnant woman, the network changes the weights at each time step to minimize the error. If the error were zero, the network output would be exactly equal to the target ECG. However, the best the network can predict is the MECG component of the AECG, because of the correlation between the two signals. Thus, the error does not go to zero; instead, it equals the fetal ECG, which is what we actually want to extract.
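The operation described above can be sketched in Python as a two-tap ADALINE trained with the plain Widrow-Hoff (LMS) rule. This is a hedged illustration rather than the authors' network: the function name `adaline_cancel` is hypothetical, no bias or momentum term is used, and the learning rate is set to a small assumed value (0.05) suited to an unnormalized toy signal instead of the 0.8 reported in the text. The initial weights (0, -2) follow the text.

```python
import math


def adaline_cancel(reference, target, lr=0.05, w_init=(0.0, -2.0)):
    """Suppress the component of `target` that is predictable from `reference`.

    reference -- maternal signal (network input, through a 1-sample delay line)
    target    -- abdominal signal (network target)
    Returns the error sequence (the estimate of the residual fetal signal)
    and the final weights.
    """
    w0, w1 = w_init
    prev = 0.0                      # delayed input, initially zero
    error = []
    for p, t in zip(reference, target):
        a = w0 * p + w1 * prev      # linear output from current and delayed input
        e = t - a                   # error = target minus prediction
        w0 += 2.0 * lr * e * p      # Widrow-Hoff (LMS) weight update
        w1 += 2.0 * lr * e * prev
        error.append(e)
        prev = p
    return error, (w0, w1)


# Toy usage: the abdominal signal is a scaled maternal signal plus a small,
# faster "fetal" component; the error sequence approximates that component.
maternal = [math.sin(2 * math.pi * k / 20) for k in range(4000)]
fetal = [0.1 * math.sin(2 * math.pi * k / 7) for k in range(4000)]
abdominal = [0.5 * m + f for m, f in zip(maternal, fetal)]
fetal_estimate, weights = adaline_cancel(maternal, abdominal)
```

The design choice here is that the filter never sees the fetal component at its input, so it can only learn to predict the maternal part of the target; whatever it cannot predict is left in the error.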

Results and discussion
The algorithm developed in the previous sections was used to extract the FECG from the AECG for FHR monitoring. In this research, the signals used for training and testing the neural network were downloaded from the MIT database website. The training set signals consist of both normal AECG and AECG with paroxysmal atrial fibrillation. The test signals consist of both normal and atrial fibrillation ECG.
The training set consists of 50 record sets. Each record set contains two 30-minute records with consecutive record names (e.g., n01 and n02), and two 5-minute "continuation" records with names ending in c (e.g., n01c and n02c). All four records in each record set are excerpts of longer continuous AECG recordings of a single subject. The 50 record sets come from 48 different subjects. The records with names beginning with p come from subjects who have paroxysmal atrial fibrillation (PAF). The second (even-numbered) record in each pair of 30-minute records contains the ECG immediately preceding an episode of PAF. Thus, for example, record p16 immediately precedes the episode of PAF in record p16c.
The first (odd-numbered) record of the set (for example, record p15) contains 30 minutes of the AECG during a period that is distant from any episode of PAF. Distant means that there is no PAF during the 45-minute period before or after the 30-minute record in this case. The corresponding 5-minute continuation record (e.g., record p15c) shows the minutes immediately following the PAF-distant record.
The records with names beginning with n come from subjects who do not have documented atrial fibrillation, either during the period from which the records were excerpted or at any other time. The subjects include healthy controls, patients referred for long-term ambulatory ECG monitoring, and patients in intensive care units.
The test set is constructed from 50 different subjects. As in the training set, pairs of consecutively numbered records come from the long-term AECG recording of a single subject. Approximately half of the record sets in the test set come from subjects with PAF. All signals are converted into text format and into graphical format in Excel files.
The downloaded *.dat files are in binary format. They contain the digitized ECGs with 16 bits per sample, least significant byte first in each pair, 128 samples per signal per second, samples from each channel alternating, and nominally 200 A/D units per millivolt.
The files are converted to text file format (*.txt) using a C++ program, confil.cpp. Since the total number of samples is large (128 samples per second = 7680 samples per minute, thus total samples = 7680*40 = 307200 samples per subject), only two thousand samples are converted into each text file (2000*4 = 8000 samples per subject). This is done to reduce the neural network training time.
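The binary layout described above can be decoded with a few lines of Python, shown here as a sketch only; it is not the confil.cpp program mentioned in the text. The function name `decode_dat` and the assumption of two interleaved channels are hypothetical, while the 16-bit little-endian samples and the nominal gain of 200 A/D units per millivolt follow the description.

```python
import struct


def decode_dat(raw, n_channels=2, gain=200.0):
    """Decode interleaved 16-bit little-endian samples into millivolts.

    raw        -- bytes read from a *.dat file
    n_channels -- number of interleaved channels (assumed to be 2 here)
    gain       -- A/D units per millivolt (nominally 200)
    Returns one list of millivolt values per channel.
    """
    n_samples = len(raw) // 2
    values = struct.unpack("<%dh" % n_samples, raw[:n_samples * 2])
    n_frames = n_samples // n_channels
    return [[values[f * n_channels + c] / gain for f in range(n_frames)]
            for c in range(n_channels)]
```

To convert only the first two thousand samples per channel, as in the text, one would read the first 2000 * 2 * n_channels bytes of the file and pass them to this function.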
QRS detection in the MECG was performed with the back-propagation neural network approach. To train the neural network, 20 different signals were used, and more than 10 signals were used for testing. Approximately 1,000 input-target pairs of the ECG signal, with different amplitudes, noise levels, and heart rates, were fed into the network. In each training session, the weight and bias values for both the hidden layer and the input layer were saved. When a simulation was unsatisfactory, the network was trained once more starting from the last saved weight and bias values, which improved the network and helped to reduce the number of training runs. Preprocessing and postprocessing were applied to the signal before and after training the network, because the ranges of values of the different parameters differ too much (HASAN et al., 2011). For example, the RR interval is normally below 350, but the amplitude can be as large as 30,000. Preprocessing normalizes the inputs so that training becomes smoother and faster. The output is postprocessed to return it to the original range. The testing output signal is shown in Figure 4. In this figure, there were 5 maternal QRS complexes in the input signal and all 5 R peaks were detected. There was neither a false R peak nor a missing R peak in the output, i.e. the algorithm was able to detect the R peaks of the QRS complexes correctly. On this basis, the QRS detection algorithm can be said to work at a satisfactory level.
Table 1 shows that the developed R-peak detection algorithm achieves 99% average accuracy. During testing on the T06 data set, only one false R peak was detected, which contributed a low false-positive percentage (0.09%). High amplitude and differentiation values were also observed, but they were within a reasonable range. During testing on another data set, T09, one R peak was missed, which contributed a low missed-peak percentage (0.09%); this was due to an RR interval smaller than the average RR interval in the training set. After detecting the QRS complex of the MECG in the AECG, the next task is to extract the FECG. For this purpose, an adaptive neural network has been designed using the adaptive filtering approach, that is, the combination of ADALINE and TDL. The MECG was given to the network as the input signal and the AECG was used as the target signal. Initially, the learning rate and momentum were chosen arbitrarily. Changing the learning rate and the momentum also affects the output of the network. Figure 5 shows the suppressed output, that is, the FECG obtained from the AECG after suppression of the MECG. In the figure, there were 4 QRS complexes in the AECG that were clearly visible. Around samples 50 to 100, the maternal and fetal QRS complexes overlapped, but the fetal signal was still clearly extracted in the output. Around samples 650 to 700, the same overlapping problem occurred, and again the algorithm was able to separate the fetal ECG from the abdominal ECG. Table 2 compares the accuracy of this work with that of work done by other researchers. According to this comparison, the proposed method offers better prospects in the field of FHR extraction.

Conclusion
An efficient system of FECG extraction for monitoring the FHR during pregnancy has been developed in this research using an artificial neural network approach. The results obtained from simulation in MATLAB show that the developed system can accurately extract the FECG from the AECG. Although methods used by previous researchers can also extract the FECG from the AECG accurately, they still suffer from familiar limitations that this research addresses. The approach is also entirely non-invasive, so it avoids the problems associated with invasive approaches. This research can serve as a reference for other work targeting FHR extraction and monitoring systems.

Figure 1. Flow chart of QRS complex detection algorithm.
In the notation of Figure 2, IW means input layer weight and LW means hidden layer weight. The notation $w^{a,b}_{m,n}$ means the weight of the connection from layer b to layer a; it is the weight for the m-th neuron at layer a from the n-th output of layer b. For example, $lw^{2,1}_{3,5}$ is the weight for the 3rd neuron at layer 2 from the 5th output of layer 1.
Among the training settings, the display interval means that the error is displayed once every 100 iterations. The maximum number of epochs for training is 800 and the goal is to reach an error of 1e-3. Each training session stops when either the maximum number of epochs or the error goal is reached. The network is trained with 20 signals. The total number of points fed into the network is around 1000 input-target pairs. The signals have different amplitudes, heart rates, and noise levels. The weight and bias values are saved for each training session. When the simulations are not satisfactory, the network is trained once more starting from the last saved weight and bias values. This can improve the network and reduce the number of training runs.

Figure 4. Testing signal for neural network.

Figure 5. Fetal ECG extracted from the abdominal ECG with the maternal ECG suppressed.
Table 1 summarizes the detection accuracy results for 10 signals. The accuracy of the algorithm is defined as below.

Table 2. Accuracy comparison for FHR extraction methods.