Detecting cardiac cycles#

Author: Nicolas Legrand nicolas.legrand@cfin.au.dk

Hide code cell source
%%capture
import sys
if 'google.colab' in sys.modules:
    ! pip install systole
import matplotlib.pyplot as plt
import seaborn as sns
import numpy as np
from systole.detection import interpolate_clipping, ecg_peaks
from systole.plots import plot_raw
from systole import import_dataset1, import_ppg
from systole import serialSim
from systole.recording import Oximeter

from bokeh.io import output_notebook
from bokeh.plotting import show
output_notebook()

sns.set_context('talk')
Loading BokehJS ...

This notebook focuses on the characterisation of cardiac cycles from the physiological signals we previously described (PPG and ECG). Here we only briefly mention different QRS-detection algorithms as related to the estimation of heart rate frequency, which is a commonly used measure in psychology and cognitive science. However, both the ECG and the PPG signals contain rich information beyond the beat-to-beat frequency that is relevant for cognitive and physiological modelling. All these approaches are not fully covered by Systole, but are implemented in other software (e.g. see [Makowski et al., 2021] for ECG components delineation).

In this notebook, we are going to review the peak detection algorithm, which future tutorials covering, for example, heart-rate variability will build upon. Here our intention is to provide a better intuition of what is happening in our peak detection algorithms, the corrections that can be applied, and the possible bias and artefacts that can emerge. However, you do not need a perfect understanding of all these steps if you want to apply peak detection to your data, and you might also consider skipping this part to proceed directly to the next tutorial focusing on artefact detection and correction.

Electrocardiography#

Because we will ultimately be interested in heart rate and its variability, our first goal will be to detect the R peaks. A large variety of algorithms have been proposed to extract the timing of R waves while controlling for signal noise and physiological variability. Reviewing and comparing all of the available methods is beyond the scope of this tutorial. Here, we are going to restrict our focus to some of the most popular methods, which also are available in the Python open-source ecosystem. We will use the systole.detection.ecg_peaks() function to perform R peak detection. This function can call a variety of peak detection algorithms among the following:

  • sleepecg (default)

  • hamilton

  • christov

  • engelse-zeelenberg

  • pan-tompkins

  • wavelet-transform

  • moving-average

The default method (sleepecg) uses a modified version of the Pan-Tompkins algorithm provided by the sleepecg package. The rest of the algorithms were implemented in the py-ecg-detectors module [Porr and Howell, 2019]. Systole uses a modified version of these implementations that runs with the Numba package for better performance, resulting in 7-30x faster estimation, depending on the algorithm.

The detection algorithm can be selected via the method parameter. In these tutorials, we will use the modified version of the pan-tompkins algorithm [Pan and Tompkins, 1985] from sleepecg as it is a fast, well-performing, and a commonly used algorithm for QRS detection.

Let’s first load an ECG recording. Here, we will select a 5-minute interval and compare the performances of the different algorithms supported by Systole.

# Import ECG recording
ecg_df = import_dataset1(modalities=['ECG'], disable=True)
signal = ecg_df[ecg_df.time.between(60, 360)].ecg.to_numpy()  # Select 5 minutes

Detecting R peaks#

from systole.detectors import pan_tompkins, hamilton, moving_average, christov, engelse_zeelenberg

Pan-Tompkins#

The Pan-Tompkins algorithm was introduced in 1985. It uses band-pass filtering and derivation to identify the QRS complex and apply an adaptive threshold to detect the R peaks on the filtered signal.

This is a very popular - maybe the most popular - method for R peak detection. One of the advantages is that it can easily be applied for online detection (see this implementation for example). As we can see in the timing report, this algorithm is also the fastest among those available in the Systole toolbox, making it ideal for applications such as online peak detection and biofeedback. You can also see that the algorithm’s peak detection is highly accurate as we do not see any artifacts in the estimated R-R time series. One notable exception is the first peak which has been estimated incorrectly. This is due to the filtering steps in the Pan-Tompkins algorithms and can be corrected by using a slightly longer signal than the range of interest.

%%timeit
peaks_pt = pan_tompkins(signal, sfreq=1000)
8.23 ms ± 851 µs per loop (mean ± std. dev. of 7 runs, 1 loop each)
show(
    plot_raw(signal, modality='ecg', detector='pan-tompkins', backend='bokeh', show_heart_rate=True)
)

Moving average#

%%timeit
peaks_wa = ecg_peaks(signal, sfreq=1000, method="moving-average")
8.24 ms ± 603 µs per loop (mean ± std. dev. of 7 runs, 1 loop each)
show(
    plot_raw(signal, modality='ecg', detector='moving-average', backend='bokeh', show_heart_rate=True)
)

Hamilton#

%%timeit
peaks_ha = hamilton(signal, sfreq=1000)
16.6 ms ± 1.42 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)
show(
    plot_raw(signal, modality='ecg', detector='hamilton', backend='bokeh', show_heart_rate=True)
)

Christov#

%%timeit
peaks_ch = christov(signal, sfreq=1000)
175 ms ± 3.69 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)
show(
    plot_raw(signal, modality='ecg', detector='christov', backend='bokeh', show_heart_rate=True)
)

Engelse-Zeelenberg#

%%timeit
peaks_ew = engelse_zeelenberg(signal, sfreq=1000)
55.5 ms ± 2.22 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)
show(
    plot_raw(signal, modality='ecg', detector='engelse-zeelenberg', backend='bokeh', show_heart_rate=True)
)

Photoplethysmography#

Online systolic peaks detection#

Let’s first simulate some PPG recording. Here, we will use the serialSim class from Systole in conjunction with the Oximeter recording class. This will simulate the presence of a pulse oximeter on the computer and provide the information from a pre-recorded signal in real-time. Simulating synthetic cardiac data in this way can be a great way to test your data analysis pipelines before collecting data - avoiding painful mistakes that can occur!

# Pulse oximeter simulation
ser = serialSim()

# Create an Oxymeter instance, initialize recording and record for 10 seconds
oxi = Oximeter(serial=ser, sfreq=75).setup()
oxi.read(20);
Reset input buffer
# Retrieve the data from the oxi class
times = np.array(oxi.times)
threshold = np.array(oxi.threshold)
recording = np.array(oxi.recording)
peaks = np.array(oxi.peaks)

This method uses the derivative to find peaks in the signal and select them based on and adaptive threshold based on the rolling mean and rolling standard deviation in a given time window.

Hide code cell source
fig, ax = plt.subplots(figsize=(15, 5))
ax.set_title("Oximeter recording")

ax.plot(times, threshold, linestyle="--", color="gray", label="Threshold", linewidth=1.5)
ax.fill_between(
    x=times, y1=threshold, y2=recording.min(), alpha=0.2, color="gray"
)
ax.plot(times, recording, label="Recording", color="#4c72b0", linewidth=1.5)
ax.fill_between(x=times, y1=recording, y2=recording.min(), color="w")
ax.scatter(x=times[np.where(peaks)[0]], y=recording[np.where(peaks)[0]],
           color="#c44e52", alpha=.6, label="Online estimation",
           edgecolors='k')
ax.set_ylabel("PPG level")
ax.set_xlabel("Time (s)")
ax.legend()
sns.despine()
../_images/483ebd3b24807af287e20acde69506c1f06f20b12149ea47f8aea2c447fa2e19.png

Offline systolic peaks detection#

ppg = import_ppg()

A simple online approach like the one we described is usually good enough to detect all the systolic peaks, provided that the subject is not moving too much. The package comes with two algorithms for systolic peaks detection presented below and that can be controlled through the method parameter of the systole.detection.ppg_peaks().

Rolling mean#

The current default is an adaptation of the rolling mean method proposed by [van Gent et al., 2019]. This method has the advantage of being fast, simple and performing well when the signal-to-noise ratio is good.

show(
    plot_raw(signal=ppg, modality="ppg", detector="rolling_average", sfreq=75, backend="bokeh")
)

The Multi-scale peak and trough detection algorithm#

Systole also includes a version of the Multi-scale peak and trough detection algorithm [Bishop and Ercole, 2018], which has been reported to be one of the best-performing algorithms in recent benchmark [Kotzen et al., 2021].

show(
    plot_raw(signal=ppg, modality="ppg", detector="msptd", sfreq=75, backend="bokeh")
)

Clipping artefacts#

Clipping is a form of distortion that can limit signal when it exceeds a certain threshold see the Wikipedia page. Some PPG devices can produce clipping artefacts when recording the PPG signal, for example, if a participant is pressing too hard on the finger clip. Here, we can see that some clipping artefacts are found between 100 and 150 seconds in the previous recording. The threshold value (here 255), is often set by the device and can easily be found depending on the manufacturer. These artefacts should be corrected before systolic peak detection. One way to correct these artefacts is to remove the portion of the signal where clipping artefacts occur and use cubic spline interpolation to reconstruct a plausible estimate of the real underlying signal. This is what the function interpolate_clipping() does [van Gent et al., 2019].

signal = ppg[ppg.time.between(110, 113)].ppg.to_numpy()  # Extract a portion of signal with clipping artefacts
clean_signal = interpolate_clipping(signal, min_threshold=0, max_threshold=255)  # Remove clipping segment and interpolate missing calues
Hide code cell source
plt.figure(figsize=(15, 5))
plt.plot(np.arange(0, len(clean_signal))/75, clean_signal, label='Corrected PPG signal', linestyle= '--', color='#c44e52')
plt.plot(np.arange(0, len(signal))/75, signal, label='Raw PPG signal', color='#4c72b0')
plt.legend()
plt.xlabel('Time (s)')
plt.ylabel('PPG level (a.u)')
sns.despine()
plt.tight_layout()
../_images/b850d24a3b50f21e712773c4d501e502b1d07e235057c3388cf50a0268e87583.png

This concludes the tutorial on using Systole to detect individual heartbeats and to reject common ECG and PPG artefacts. The next tutorial will go further into more nuanced issues in artefact detection and correction.