\IEEEtitleabstractindextext

[Uncaptioned image]

STAL: Spike Threshold Adaptive Learning Encoder for Classification of Pain-Related Biosignal Data

Freek Hens \orcidlink0009-0005-0405-0560    Mohammad Mahdi Dehshibi \orcidlink0000-0001-8112-5419 \IEEEmembershipSenior Member, IEEE    Leila Bagheriye \orcidlink0000-0003-1605-5850 \IEEEmembershipMember, IEEE    Mahyar Shahsavari \orcidlink0000-0001-7703-6835     \IEEEmembershipMember, IEEE    Ana Tajadura-Jiménez \orcidlink0000-0003-3166-3512 The paper is submitted for review on July 11, 2024. “This work was supported by the European Union’s Horizon 2020 program under grant No 101002711 for the BODYinTRANSIT project.” Freek Hens (e-mail: freek.hens@outlook.com), Mahyar Shahsavari, and Leila Bagheriye are with the Donders Institute for Brain, Cognition and Behavior at Radboud University in Nijmegen, The Netherlands. Mohammad Mahdi Dehshibi and Ana Tajadura-Jiménez are with the Department of Computer Science and Engineering at Universidad Carlos III de Madrid, Léganes, Spain.
Abstract
This paper presents the first application of spiking neural networks (SNNs) for the classification of chronic lower back pain (CLBP) using the EmoPain dataset. Our work has two main contributions. We introduce Spike Threshold Adaptive Learning (STAL), a trainable encoder that effectively converts continuous biosignals into spike trains. Additionally, we propose an ensemble of Spiking Recurrent Neural Network (SRNN) classifiers for the multi-stream processing of sEMG and IMU data. To tackle the challenges of small sample size and class imbalance, we implement minority over-sampling with weighted sample replacement during batch creation. Our method achieves outstanding performance with an accuracy of 80.43%, AUC of 67.90%, F1 score of 52.60%, and Matthews Correlation Coefficient (MCC) of 0.437, surpassing traditional rate-based and latency-based encoding methods. The STAL encoder shows superior performance in preserving temporal dynamics and adapting to signal characteristics. Importantly, our approach (STAL-SRNN) outperforms the best deep learning method in terms of MCC, indicating better balanced class prediction. This research contributes to the development of neuromorphic computing for biosignal analysis. It holds promise for energy-efficient, wearable solutions in chronic pain management.
{IEEEkeywords} Spike Train Encoder, Spiking Recurrent Neural Network, EmoPain, Chronic Pain

1 Introduction

\IEEEPARstart

Biosignals are pivotal in non-invasively analysing and understanding physiological processes and human body function [1]. Among these, surface electromyography (sEMG) and data from inertial measurement units (IMUs) offer a comprehensive representation of an individual’s motor behaviour [2]. sEMG measures the electrical activity of skeletal muscles, while IMUs directly quantify body segment movements. Analysing human motor behaviour has found widespread applications in several domains, particularly in chronic pain management [3].

Chronic lower back pain (CLBP) is a challenging condition that impacts millions of individuals worldwide, often resulting in reduced quality of life and significant economic strain. The precise detection and monitoring of CLBP are essential for developing effective pain management strategies [4]. However, the analysis of biosignals for this purpose presents several challenges. One primary challenge is signal contamination from various sources, including motion artefacts and crosstalk from nearby physiological processes [5]. Another lies in inter- and intra-individual variability due to physiological differences or factors like fatigue and stress [6]. Beyond these challenges, sEMG data are high-dimensional and multi-channel, with a non-stationary nature due to variations in muscle anatomy between individuals. Similarly, IMU signals present challenges such as drift and integration errors, sensitivity to sensor placement, and the complexity of interpreting 3D motion data [7].

Conventionally, researchers have addressed these challenges using handcrafted features and filtering techniques. While promising, these approaches often require domain expertise and may fail to capture the intricate dynamics of complex biosignals [8]. More recently, deep learning models such as convolutional and recurrent neural networks [9, 7] have demonstrated the potential to derive intricate features from raw data for CLBP classification. However, these models typically demand large, high-quality datasets for training and are computationally intensive.

Spiking Neural Networks (SNNs) offer a promising alternative for biosignal analysis. Inspired by biological neural systems, SNNs process information through discrete spike events, mimicking the communication mechanism of biological neurons. This biological plausibility is particularly relevant for sEMG and IMU analysis, as SNNs can capture the complex temporal dynamics of these signals more effectively [10]. Moreover, when deployed on neuromorphic hardware, SNNs can operate with significantly lower power consumption compared to other neural network architectures. This feature makes them ideal for resource-constrained environments like wearable devices [11].

Despite these advantages, a significant challenge in using SNNs for biosignal analysis lies in converting continuous analogue signals into binary spike trains – the native language of SNNs. Various encoding methods have been proposed, including rate-based, time-to-first-spike, and delta-modulated approaches. However, these methods often struggle to preserve the temporal dynamics of biosignals, resist inherent noise, and adapt to signal characteristics without violating biological constraints [12, 13].

In this paper, we present the first application of neuromorphic computing techniques to CLBP classification using sEMG and IMU data provided in the EmoPain dataset [3]. This dataset includes recordings from 25 healthy individuals and 21 subjects with CLBP performing a sequence of physical exercises. We introduce two key innovations:

  1. 1.

    Spike Threshold Adaptive Learning (STAL): A learnable encoder designed to convert continuous sEMG and IMU signals into spike trains that efficiently preserve temporal information.

  2. 2.

    An ensemble of Spiking Recurrent Neural Network (SRNN) classifiers: This multi-stream SRNN uses recurrent leaky integrate-and-fire neurons to process the spike trains and improves accuracy and robustness in classifying individuals as healthy or having CLBP with an ensemble of Random Forest classifiers.

By combining the biological plausibility and efficiency of SNNs with the proven performance of ensemble learning methods, our two-fold contribution provides opportunities for developing more effective, personalised rehabilitation strategies for individuals with CLBP. Furthermore, the proposed approach has potential broader impacts beyond CLBP classification. It could be adapted for other biosignal analysis tasks in healthcare, such as detecting neurodegenerative disorders or monitoring cardiovascular health. The low-power consumption of SNNs also makes this approach promising for long-term, continuous health monitoring using wearable devices.

The remainder of this paper is organised as follows: Section 2 reviews related work in biosignal encoding and studies that addressed challenges defined in the EmoPain database. Section 3 details our proposed method, including the STAL encoder and the ensemble of SRNN classifiers. Section 4 presents our experimental setup and results, comparing our method’s performance to existing approaches. Finally, Section 5 concludes the paper and discusses future research directions.

2 Literature Review

The use of spiking neural networks in biosignal processing has shown great promise, thanks to their biological plausibility, lower power consumption, and reduced training requirements compared to conventional deep learning methods [14]. However, their potential application in chronic pain detection and identification of pain-related protective behaviour is an unexplored area, presenting a unique opportunity in this domain.

Refer to caption
Figure 1: A schematic of the proposed architecture, including the Spike Threshold Adaptive Learning (STAL) encoder and Spiking Recurrent Neural Network (SRNN), which are responsible for classifying data from EmoPain [3] into Healthy and Chronic Lower Back Pain (CLBP) classes. The multimodal biosignals consist of sEMG sensors and derived features (i.e., Joint Energy and Joint Angle) from IMU sensors. These continuous signals convert into spike trains using the STAL encoder and are subsequently classified using an ensemble of SRNNs. The ensemble prediction is generated using a Random Forest meta classifier.

A critical component of using SNNs in biosignal processing is the conversion of continuous signals into discrete spike trains, known as spike encoding. This process is particularly challenging for EMG and IMU data due to their rapidly changing patterns and susceptibility to noise [7]. Several encoding methods have been proposed, each with distinct trade-offs in temporal resolution, noise robustness, and computational efficiency.

The use of rate-based encoding, though energy-efficient, often faces challenges in capturing the rapid changes inherent in EMG and IMU signals [15]. Conversely, phase coding provides noise resilience at the cost of complex synchronisation mechanisms [16]. Other techniques, including Send-on-Delta, Ben’s spiker algorithm [17], and Variational mode decomposition (VMD) [18], have their unique advantages in precise spike generation or direct encoding of analogue signals but frequently encounter difficulties in handling the intricate dynamics of in vivo EMG and IMU data and fail to fully leverage the spatial information in high-density EMG arrays.

Recent research has explored learnable encoders for multimodal biosignal analysis in SNNs [19], offering adaptability to specific signal characteristics. However, these approaches often need more focus on high temporal resolution, noise robustness, and computational efficiency, which are crucial for real-world CLBP applications using wearable sensors [20].

The limitations of existing encoding methods in handling the unique challenges of EMG and IMU data for CLBP classification motivate the development of a new approach. An ideal encoder for this purpose should retain temporal dynamics, resist inherent noise, and adjust to signal characteristics without violating biological constraints. Hence, we introduce Spike Threshold Adaptive Learning in this study, which is designed to address the unique requirements of CLBP classification in the EmoPain dataset.

3 Proposed Method

This section details the proposed two-stage approach for classifying chronic lower back pain using biosignals. In the first stage, a novel learnable encoder – Spike Threshold Adaptive Learning (STAL) – converts biosignals (sEMG, Joint Angle, and Joint Energy) into spike trains. In the second stage, independent spiking recurrent neural networks are used within a multi-stream ensemble classification framework to analyse the encoded spike trains. This ensemble approach combines the predictions from each modality-specific SRNN using a Random Forest classifier to perform classification. Figure 1 illustrates a schematic of the proposed architecture.

Let 𝐗T×c𝐗superscript𝑇𝑐\mathbf{X}\in\mathbb{R}^{T\times c}bold_X ∈ blackboard_R start_POSTSUPERSCRIPT italic_T × italic_c end_POSTSUPERSCRIPT denote the input data, where T𝑇Titalic_T is the number of time steps and c𝑐citalic_c is the number of channels. To maintain consistency in notation, we refer to the biosignal input dimensions as “channels,” irrespective of whether they represent physical sensor channels or derived features. The data is segmented into windows of size ω𝜔\omegaitalic_ω time steps, resulting in a collection of windowed data points denoted by 𝐗(ω)ω×csuperscript𝐗𝜔superscript𝜔𝑐\mathbf{X}^{(\omega)}\in\mathbb{R}^{\omega\times c}bold_X start_POSTSUPERSCRIPT ( italic_ω ) end_POSTSUPERSCRIPT ∈ blackboard_R start_POSTSUPERSCRIPT italic_ω × italic_c end_POSTSUPERSCRIPT for each window ω𝜔\omegaitalic_ω.

3.1 STAL: Spike Threshold Adaptive Learning

The STAL architecture, depicted in Fig. 2, consists of feature extraction and feature-to-spike conversion modules. The feature extraction module comprises two consecutive blocks, each containing a dense layer, dropout, ReLU activation, and batch normalisation. The ReLU activation function (ReLU()=max(0,)ReLU0\mathrm{ReLU}(\cdot)=\max(0,\cdot)roman_ReLU ( ⋅ ) = roman_max ( 0 , ⋅ )) introduces non-linearity, allowing the network to learn complex patterns. Dropout helps prevent overfitting by randomly setting a fraction of input units to 0 during training. Batch normalisation improves the stability and performance of the neural network by normalising the inputs to each layer.

Refer to caption
Figure 2: The STAL architecture consists of feature extraction and feature-to-spike conversion modules. Here, \mathbf{\mathcal{L}}caligraphic_L is the encoder loss function, formulated in Eq. (3.1).

To prepare for feature extraction, we reshape 𝐗(ω)superscript𝐗𝜔\mathbf{X}^{(\omega)}bold_X start_POSTSUPERSCRIPT ( italic_ω ) end_POSTSUPERSCRIPT into a feature vector 𝐗f1×ωcsubscript𝐗𝑓superscript1𝜔𝑐\mathbf{X}_{f}\in\mathbb{R}^{1\times\omega c}bold_X start_POSTSUBSCRIPT italic_f end_POSTSUBSCRIPT ∈ blackboard_R start_POSTSUPERSCRIPT 1 × italic_ω italic_c end_POSTSUPERSCRIPT. The first block processes 𝐗fsubscript𝐗𝑓\mathbf{X}_{f}bold_X start_POSTSUBSCRIPT italic_f end_POSTSUBSCRIPT through a dense layer 𝐙(1)superscript𝐙1\mathbf{Z}^{(1)}bold_Z start_POSTSUPERSCRIPT ( 1 ) end_POSTSUPERSCRIPT, resulting in 𝐙(1)1×superscript𝐙1superscript1\mathbf{Z}^{(1)}\in\mathbb{R}^{1\times\ell}bold_Z start_POSTSUPERSCRIPT ( 1 ) end_POSTSUPERSCRIPT ∈ blackboard_R start_POSTSUPERSCRIPT 1 × roman_ℓ end_POSTSUPERSCRIPT, where \ellroman_ℓ denotes the number of neurons in the layer. The second block applies similar operations to the output of the first block, producing 𝐙(2)1×superscript𝐙2superscript1\mathbf{Z}^{(2)}\in\mathbb{R}^{1\times\ell}bold_Z start_POSTSUPERSCRIPT ( 2 ) end_POSTSUPERSCRIPT ∈ blackboard_R start_POSTSUPERSCRIPT 1 × roman_ℓ end_POSTSUPERSCRIPT.

The feature-to-spike conversion module of the STAL converts the extracted features into surrogate spiking activity. This process begins with an element-wise repetition operation on the output of the second block using Eq. (1), expanding it to 𝐡1×ωψc𝐡superscript1𝜔𝜓𝑐\mathbf{h}\in\mathbb{R}^{1\times\omega\psi c}bold_h ∈ blackboard_R start_POSTSUPERSCRIPT 1 × italic_ω italic_ψ italic_c end_POSTSUPERSCRIPT, where ψ𝜓\psiitalic_ψ represents the number of spikes allowed at each time step.

𝐡=Repeat(𝐙(2),ψ)𝐡Repeatsuperscript𝐙2𝜓\mathbf{h}=\mathrm{Repeat}(\mathbf{Z}^{(2)},\psi)bold_h = roman_Repeat ( bold_Z start_POSTSUPERSCRIPT ( 2 ) end_POSTSUPERSCRIPT , italic_ψ ) (1)

A learnable threshold vector, denoted as 𝚽[ϕ1,ϕ2,,ϕωψc]𝚽superscriptsubscriptitalic-ϕ1subscriptitalic-ϕ2subscriptitalic-ϕ𝜔𝜓𝑐top\mathbf{\Phi}\triangleq\left[\phi_{1},\phi_{2},\cdots,\phi_{\omega\psi c}% \right]^{\top}bold_Φ ≜ [ italic_ϕ start_POSTSUBSCRIPT 1 end_POSTSUBSCRIPT , italic_ϕ start_POSTSUBSCRIPT 2 end_POSTSUBSCRIPT , ⋯ , italic_ϕ start_POSTSUBSCRIPT italic_ω italic_ψ italic_c end_POSTSUBSCRIPT ] start_POSTSUPERSCRIPT ⊤ end_POSTSUPERSCRIPT, is incorporated into the encoding function f:{0,1}:𝑓01f:\mathbb{R}\rightarrow\{0,1\}italic_f : blackboard_R → { 0 , 1 }. This function is designed as a differentiable surrogate in Eq. (2) to generate binary spike trains 𝐁{0,1}ω×ψ×c𝐁superscript01𝜔𝜓𝑐\mathbf{B}\in\{0,1\}^{\omega\times\psi\times c}bold_B ∈ { 0 , 1 } start_POSTSUPERSCRIPT italic_ω × italic_ψ × italic_c end_POSTSUPERSCRIPT. Each threshold, denoted by ϕi𝒰(a,b),i{1,2,,ωψc}formulae-sequencesimilar-tosubscriptitalic-ϕ𝑖𝒰𝑎𝑏for-all𝑖12𝜔𝜓𝑐\phi_{i}\sim\mathcal{U}(a,b),~{}\forall i\in\{1,2,\cdots,\omega\psi c\}italic_ϕ start_POSTSUBSCRIPT italic_i end_POSTSUBSCRIPT ∼ caligraphic_U ( italic_a , italic_b ) , ∀ italic_i ∈ { 1 , 2 , ⋯ , italic_ω italic_ψ italic_c }, is initialised from a uniform distribution with hyperparameters a,b+𝑎𝑏superscripta,b\in\mathbb{R}^{+}italic_a , italic_b ∈ blackboard_R start_POSTSUPERSCRIPT + end_POSTSUPERSCRIPT.

fsurr(𝐡;𝚽)=σ(α(𝐡𝚽))subscript𝑓surr𝐡𝚽𝜎𝛼𝐡𝚽f_{\mathrm{surr}}(\mathbf{h};\mathbf{\Phi})=\sigma\left(\alpha\left(\mathbf{h}% -\mathbf{\Phi}\right)\right)italic_f start_POSTSUBSCRIPT roman_surr end_POSTSUBSCRIPT ( bold_h ; bold_Φ ) = italic_σ ( italic_α ( bold_h - bold_Φ ) ) (2)

where σ()𝜎\sigma(\cdot)italic_σ ( ⋅ ) is the sigmoid function and α𝛼\alphaitalic_α controls the slope of the surrogate function. When α𝛼\alpha\rightarrow\inftyitalic_α → ∞, the function approaches a true binary step function, mimicking the behaviour of real spiking neurons.

To align the dimensions of the spike train with the input for subsequent computations, we reshape 𝐁𝐁\mathbf{B}bold_B into 𝐁^{1,2,,𝒦(ψ)}1×ωc^𝐁superscript12𝒦𝜓1𝜔𝑐\mathbf{\hat{B}}\in\{1,2,\cdots,\mathcal{K}(\psi)\}^{1\times\omega c}over^ start_ARG bold_B end_ARG ∈ { 1 , 2 , ⋯ , caligraphic_K ( italic_ψ ) } start_POSTSUPERSCRIPT 1 × italic_ω italic_c end_POSTSUPERSCRIPT using Eq. (3). Here, 𝒦(ψ)𝒦𝜓\mathcal{K}(\psi)caligraphic_K ( italic_ψ ) denotes the number of possible combinations arising from summing the activations across ψ𝜓\psiitalic_ψ spike positions at each time step and channel, capturing the temporal information about the order of spikes within 𝐁^^𝐁\mathbf{\hat{B}}over^ start_ARG bold_B end_ARG.

𝐁^i=j=1ψpj𝐁i,j,k,i{1,,ω},k{1,,c}formulae-sequencesubscript^𝐁𝑖superscriptsubscript𝑗1𝜓subscript𝑝𝑗subscript𝐁𝑖𝑗𝑘formulae-sequencefor-all𝑖1𝜔𝑘1𝑐\mathbf{\hat{B}}_{i}=\sum_{j=1}^{\psi}p_{j}\mathbf{B}_{i,j,k},\quad\forall i% \in\{1,\cdots,\omega\},k\in\{1,\cdots,c\}over^ start_ARG bold_B end_ARG start_POSTSUBSCRIPT italic_i end_POSTSUBSCRIPT = ∑ start_POSTSUBSCRIPT italic_j = 1 end_POSTSUBSCRIPT start_POSTSUPERSCRIPT italic_ψ end_POSTSUPERSCRIPT italic_p start_POSTSUBSCRIPT italic_j end_POSTSUBSCRIPT bold_B start_POSTSUBSCRIPT italic_i , italic_j , italic_k end_POSTSUBSCRIPT , ∀ italic_i ∈ { 1 , ⋯ , italic_ω } , italic_k ∈ { 1 , ⋯ , italic_c } (3)

where pjsubscript𝑝𝑗p_{j}italic_p start_POSTSUBSCRIPT italic_j end_POSTSUBSCRIPT represents the weight associated with the j𝑗jitalic_j-th spike position. This process incorporates the temporal information about the order of spikes into the final representation, as different spike orders will lead to different weighted sums and, consequently, different entries in 𝐁^^𝐁\mathbf{\hat{B}}over^ start_ARG bold_B end_ARG.

We introduce a customised loss function to update the parameters of the STAL encoder. This function consists of the Mutual Information term, inspired by [21], and the Sparsity term. The Mutual Information term (subscript\mathcal{L}_{\mathcal{MI}}caligraphic_L start_POSTSUBSCRIPT caligraphic_M caligraphic_I end_POSTSUBSCRIPT) measures the extent to which the encoder captures relevant information from the input data (𝐗fsubscript𝐗𝑓\mathbf{X}_{f}bold_X start_POSTSUBSCRIPT italic_f end_POSTSUBSCRIPT) at multiple levels of representation, i.e., the encoded information in the hidden layers (𝐙(1)superscript𝐙1\mathbf{Z}^{(1)}bold_Z start_POSTSUPERSCRIPT ( 1 ) end_POSTSUPERSCRIPT and 𝐙(2)superscript𝐙2\mathbf{Z}^{(2)}bold_Z start_POSTSUPERSCRIPT ( 2 ) end_POSTSUPERSCRIPT) and encoded spike train (𝐁^^𝐁\mathbf{\hat{B}}over^ start_ARG bold_B end_ARG). Meanwhile, the Sparsity term (𝒮subscript𝒮\mathcal{L}_{\mathcal{S}}caligraphic_L start_POSTSUBSCRIPT caligraphic_S end_POSTSUBSCRIPT) serves as a regularisation term, encouraging the generation of sparse spike trains that efficiently represent the input data, promoting resource efficiency and robustness against overfitting. Formally, the loss function is formulated as in Eq. (3.1).

subscript\displaystyle\mathcal{L}_{\mathcal{MI}}caligraphic_L start_POSTSUBSCRIPT caligraphic_M caligraphic_I end_POSTSUBSCRIPT =13([𝐗f;𝐙(1)]+[𝐗f;𝐙(2)]+[𝐗f;𝐁^]),absent13subscript𝐗𝑓superscript𝐙1subscript𝐗𝑓superscript𝐙2subscript𝐗𝑓^𝐁\displaystyle=-\dfrac{1}{3}\left(\mathcal{I}\left[\mathbf{X}_{f};\mathbf{Z}^{(% 1)}\right]+\mathcal{I}\left[\mathbf{X}_{f};\mathbf{Z}^{(2)}\right]+\mathcal{I}% \left[\mathbf{X}_{f};\mathbf{\hat{B}}\right]\right),= - divide start_ARG 1 end_ARG start_ARG 3 end_ARG ( caligraphic_I [ bold_X start_POSTSUBSCRIPT italic_f end_POSTSUBSCRIPT ; bold_Z start_POSTSUPERSCRIPT ( 1 ) end_POSTSUPERSCRIPT ] + caligraphic_I [ bold_X start_POSTSUBSCRIPT italic_f end_POSTSUBSCRIPT ; bold_Z start_POSTSUPERSCRIPT ( 2 ) end_POSTSUPERSCRIPT ] + caligraphic_I [ bold_X start_POSTSUBSCRIPT italic_f end_POSTSUBSCRIPT ; over^ start_ARG bold_B end_ARG ] ) ,
𝒮subscript𝒮\displaystyle\mathcal{L}_{\mathcal{S}}caligraphic_L start_POSTSUBSCRIPT caligraphic_S end_POSTSUBSCRIPT =λ𝐗f𝐁^1,absent𝜆subscriptnormsubscript𝐗𝑓^𝐁1\displaystyle=\lambda\cdot||\mathbf{X}_{f}-\mathbf{\hat{B}}||_{1},= italic_λ ⋅ | | bold_X start_POSTSUBSCRIPT italic_f end_POSTSUBSCRIPT - over^ start_ARG bold_B end_ARG | | start_POSTSUBSCRIPT 1 end_POSTSUBSCRIPT ,
\displaystyle\mathcal{L}caligraphic_L =+𝒮absentsubscriptsubscript𝒮\displaystyle=\mathcal{L}_{\mathcal{MI}}+\mathcal{L}_{\mathcal{S}}= caligraphic_L start_POSTSUBSCRIPT caligraphic_M caligraphic_I end_POSTSUBSCRIPT + caligraphic_L start_POSTSUBSCRIPT caligraphic_S end_POSTSUBSCRIPT (4)

where [;]\mathcal{I}\left[\cdot;\cdot\right]caligraphic_I [ ⋅ ; ⋅ ] computes the Mutual Information between tensors, ||||1||\cdot||_{1}| | ⋅ | | start_POSTSUBSCRIPT 1 end_POSTSUBSCRIPT is the L1 norm, and λ𝜆\lambdaitalic_λ is the strength coefficient controlling the level of sparseness imposed on the internal representations.

To calculate Mutual Information, the encoded information in the hidden layers and spike train need to have the same dimensions as the input data. We reshape the hidden layer outputs when their dimensions differ. If <ωc𝜔𝑐\ell<\omega croman_ℓ < italic_ω italic_c, the features will be repeated ωc𝜔𝑐\left\lfloor\frac{\ell}{\omega c}\right\rfloor⌊ divide start_ARG roman_ℓ end_ARG start_ARG italic_ω italic_c end_ARG ⌋ times. Conversely, if >ωc𝜔𝑐\ell>\omega croman_ℓ > italic_ω italic_c, average pooling will be used with a pool size and stride of ωc𝜔𝑐\left\lfloor\frac{\omega c}{\ell}\right\rfloor⌊ divide start_ARG italic_ω italic_c end_ARG start_ARG roman_ℓ end_ARG ⌋.

The role of 𝒮subscript𝒮\mathcal{L}_{\mathcal{S}}caligraphic_L start_POSTSUBSCRIPT caligraphic_S end_POSTSUBSCRIPT in regulating the sparsity of the spike train is essential. It uses L1 norm regularisation to calculate the difference between the input data (𝐗fsubscript𝐗𝑓\mathbf{X}_{f}bold_X start_POSTSUBSCRIPT italic_f end_POSTSUBSCRIPT) and the surrogate spike train (𝐁^^𝐁\mathbf{\hat{B}}over^ start_ARG bold_B end_ARG). This difference serves as an approximation of the number of spikes in the spike train. To limit the number of spikes to align approximately with the area of the input signal, Eq. (5) defines the impact of the sparsity term on the training process. This control over sparsity is achieved by taking the derivative of the sparsity term with respect to 𝐁^^𝐁\mathbf{\hat{B}}over^ start_ARG bold_B end_ARG.

𝒮𝐁^=λsign(𝐗f𝐁^).subscript𝒮^𝐁𝜆signsubscript𝐗𝑓^𝐁\dfrac{\partial\mathcal{L}_{\mathcal{S}}}{\partial\mathbf{\hat{B}}}=-\lambda% \cdot\mathrm{sign}(\mathbf{X}_{f}-\mathbf{\hat{B}}).divide start_ARG ∂ caligraphic_L start_POSTSUBSCRIPT caligraphic_S end_POSTSUBSCRIPT end_ARG start_ARG ∂ over^ start_ARG bold_B end_ARG end_ARG = - italic_λ ⋅ roman_sign ( bold_X start_POSTSUBSCRIPT italic_f end_POSTSUBSCRIPT - over^ start_ARG bold_B end_ARG ) . (5)

where sign()sign\mathrm{sign}(\cdot)roman_sign ( ⋅ ) is the sign function. This derivative encourages the output values to be close to 0 or 1, leading to a sparse output:

  1. 1.

    When 𝐁^>𝐗f^𝐁subscript𝐗𝑓\mathbf{\hat{B}}>\mathbf{X}_{f}over^ start_ARG bold_B end_ARG > bold_X start_POSTSUBSCRIPT italic_f end_POSTSUBSCRIPT, the derivative is negative, pushing output values towards 0.

  2. 2.

    When 𝐁^<𝐗f^𝐁subscript𝐗𝑓\mathbf{\hat{B}}<\mathbf{X}_{f}over^ start_ARG bold_B end_ARG < bold_X start_POSTSUBSCRIPT italic_f end_POSTSUBSCRIPT, the derivative is positive, pushing output values towards 1.

  3. 3.

    When 𝐁^=𝐗f^𝐁subscript𝐗𝑓\mathbf{\hat{B}}=\mathbf{X}_{f}over^ start_ARG bold_B end_ARG = bold_X start_POSTSUBSCRIPT italic_f end_POSTSUBSCRIPT, the derivative is 0, leaving output values unchanged.

3.2 Spiking Recurrent Neural Network

The proposed spiking recurrent neural network, depicted in Fig. 3, leverages the temporal dynamics of spike trains for chronic lower back pain classification. The SRNN processes the output of the STAL encoder (𝐁^^𝐁\mathbf{\hat{B}}over^ start_ARG bold_B end_ARG) through a network of Recurrent Leaky Integrate-and-Fire (R-LIF) neurons.

The network comprises L𝐿Litalic_L layers of R-LIF neurons, where L=2𝐿2L=2italic_L = 2 in our current implementation. The first layer (i.e., hidden layer) contains 500 neurons, and the second layer (i.e., readout layer) contains two neurons corresponding to the healthy and CLBP classes. All layers implement recurrent connections, allowing the network to capture complex temporal dependencies in the input spike trains.

Each R-LIF neuron is implemented using an exponentially decaying membrane potential, represented in discrete time steps. For each layer l{1,,L}𝑙1𝐿l\in\{1,\cdots,L\}italic_l ∈ { 1 , ⋯ , italic_L }, the membrane potential 𝐔l[t]superscript𝐔𝑙delimited-[]𝑡\mathbf{U}^{l}[t]bold_U start_POSTSUPERSCRIPT italic_l end_POSTSUPERSCRIPT [ italic_t ] and output spikes 𝐒l[t]superscript𝐒𝑙delimited-[]𝑡\mathbf{S}^{l}[t]bold_S start_POSTSUPERSCRIPT italic_l end_POSTSUPERSCRIPT [ italic_t ] at time step t𝑡titalic_t are computed by Eq. (3.2).

𝐔l[t+1]superscript𝐔𝑙delimited-[]𝑡1\displaystyle\mathbf{U}^{l}[t+1]bold_U start_POSTSUPERSCRIPT italic_l end_POSTSUPERSCRIPT [ italic_t + 1 ] =β𝐔l[t]+𝐈l[t+1]+absent𝛽superscript𝐔𝑙delimited-[]𝑡limit-fromsuperscript𝐈𝑙delimited-[]𝑡1\displaystyle=\beta\mathbf{U}^{l}[t]+\mathbf{I}^{l}[t+1]+= italic_β bold_U start_POSTSUPERSCRIPT italic_l end_POSTSUPERSCRIPT [ italic_t ] + bold_I start_POSTSUPERSCRIPT italic_l end_POSTSUPERSCRIPT [ italic_t + 1 ] +
𝐕l(𝐒l[t])𝐑l[t]𝐔thr,superscript𝐕𝑙superscript𝐒𝑙delimited-[]𝑡superscript𝐑𝑙delimited-[]𝑡subscript𝐔thr\displaystyle\mathbf{V}^{l}(\mathbf{S}^{l}[t])-\mathbf{R}^{l}[t]\mathbf{U}_{% \mathrm{thr}},bold_V start_POSTSUPERSCRIPT italic_l end_POSTSUPERSCRIPT ( bold_S start_POSTSUPERSCRIPT italic_l end_POSTSUPERSCRIPT [ italic_t ] ) - bold_R start_POSTSUPERSCRIPT italic_l end_POSTSUPERSCRIPT [ italic_t ] bold_U start_POSTSUBSCRIPT roman_thr end_POSTSUBSCRIPT ,
𝐒l[t+1]superscript𝐒𝑙delimited-[]𝑡1\displaystyle\mathbf{S}^{l}[t+1]bold_S start_POSTSUPERSCRIPT italic_l end_POSTSUPERSCRIPT [ italic_t + 1 ] =Θ(𝐔l[t+1]𝐔thr),absentΘsuperscript𝐔𝑙delimited-[]𝑡1subscript𝐔thr\displaystyle=\Theta(\mathbf{U}^{l}[t+1]-\mathbf{U}_{\mathrm{thr}}),= roman_Θ ( bold_U start_POSTSUPERSCRIPT italic_l end_POSTSUPERSCRIPT [ italic_t + 1 ] - bold_U start_POSTSUBSCRIPT roman_thr end_POSTSUBSCRIPT ) , (6)

where β=0.99𝛽0.99\beta=0.99italic_β = 0.99 is the membrane potential decay rate chosen to enhance the LIF neurons’ capacity to capture temporal dependencies in input spike trains by promoting longer memory. 𝐈l[t]superscript𝐈𝑙delimited-[]𝑡\mathbf{I}^{l}[t]bold_I start_POSTSUPERSCRIPT italic_l end_POSTSUPERSCRIPT [ italic_t ] is the input current for layer l𝑙litalic_l. The recurrent connections, 𝐕l(𝐒l[t])superscript𝐕𝑙superscript𝐒𝑙delimited-[]𝑡\mathbf{V}^{l}(\mathbf{S}^{l}[t])bold_V start_POSTSUPERSCRIPT italic_l end_POSTSUPERSCRIPT ( bold_S start_POSTSUPERSCRIPT italic_l end_POSTSUPERSCRIPT [ italic_t ] ), capture the impact of spikes from all interconnected neurons within layer l𝑙litalic_l. The binary vector 𝐒l[t]superscript𝐒𝑙delimited-[]𝑡\mathbf{S}^{l}[t]bold_S start_POSTSUPERSCRIPT italic_l end_POSTSUPERSCRIPT [ italic_t ] represents the output spikes of all neurons in layer l𝑙litalic_l at time t𝑡titalic_t. We set the membrane threshold 𝐔thrsubscript𝐔thr\mathbf{U}_{\mathrm{thr}}bold_U start_POSTSUBSCRIPT roman_thr end_POSTSUBSCRIPT to 1 in this study. The binary reset-by-subtraction mechanism 𝐑l[t]superscript𝐑𝑙delimited-[]𝑡\mathbf{R}^{l}[t]bold_R start_POSTSUPERSCRIPT italic_l end_POSTSUPERSCRIPT [ italic_t ] is set to 1 when the threshold is reached and 0 otherwise, effectively subtracting 𝐔thrsubscript𝐔thr\mathbf{U}_{\mathrm{thr}}bold_U start_POSTSUBSCRIPT roman_thr end_POSTSUBSCRIPT (which is 1 in this case) from the membrane potential. Lastly, Θ()Θ\Theta(\cdot)roman_Θ ( ⋅ ) refers to the Heaviside step function.

Refer to caption
Figure 3: The SRNN architecture. The spike trains 𝐁^^𝐁\mathbf{\hat{B}}over^ start_ARG bold_B end_ARG of each data type (sEMG, Energy, Angle) are classified separately. T𝑇Titalic_T represent the total time steps, c𝑐citalic_c are the channels, τ𝜏\tauitalic_τ is the time dimension of the Recurrent Leaky Integrate-and-Fire (R-LIF) neurons.

All layers implement all-to-all connectivity. The input currents and recurrent connections for each layer are computed using Eq. (3.2).

𝐈1[t]superscript𝐈1delimited-[]𝑡\displaystyle\mathbf{I}^{1}[t]bold_I start_POSTSUPERSCRIPT 1 end_POSTSUPERSCRIPT [ italic_t ] =𝐖{0,1}𝐁^[t],absentsuperscript𝐖01^𝐁delimited-[]𝑡\displaystyle=\mathbf{W}^{\{0,1\}}\mathbf{\hat{B}}[t],= bold_W start_POSTSUPERSCRIPT { 0 , 1 } end_POSTSUPERSCRIPT over^ start_ARG bold_B end_ARG [ italic_t ] ,
𝐈l[t]superscript𝐈𝑙delimited-[]𝑡\displaystyle\mathbf{I}^{l}[t]bold_I start_POSTSUPERSCRIPT italic_l end_POSTSUPERSCRIPT [ italic_t ] =𝐖{l1,l}𝐒l1[t],for l>1,formulae-sequenceabsentsuperscript𝐖𝑙1𝑙superscript𝐒𝑙1delimited-[]𝑡for 𝑙1\displaystyle=\mathbf{W}^{\{l-1,l\}}\mathbf{S}^{l-1}[t],\quad\text{for }l>1,= bold_W start_POSTSUPERSCRIPT { italic_l - 1 , italic_l } end_POSTSUPERSCRIPT bold_S start_POSTSUPERSCRIPT italic_l - 1 end_POSTSUPERSCRIPT [ italic_t ] , for italic_l > 1 ,
𝐕l(𝐒l[t])superscript𝐕𝑙superscript𝐒𝑙delimited-[]𝑡\displaystyle\mathbf{V}^{l}(\mathbf{S}^{l}[t])bold_V start_POSTSUPERSCRIPT italic_l end_POSTSUPERSCRIPT ( bold_S start_POSTSUPERSCRIPT italic_l end_POSTSUPERSCRIPT [ italic_t ] ) =𝐖recl𝐒l[t],for l1,,L,formulae-sequenceabsentsuperscriptsubscript𝐖rec𝑙superscript𝐒𝑙delimited-[]𝑡for 𝑙1𝐿\displaystyle=\mathbf{W}_{\mathrm{rec}}^{l}\mathbf{S}^{l}[t],\quad\text{for }l% \in{1,\cdots,L},= bold_W start_POSTSUBSCRIPT roman_rec end_POSTSUBSCRIPT start_POSTSUPERSCRIPT italic_l end_POSTSUPERSCRIPT bold_S start_POSTSUPERSCRIPT italic_l end_POSTSUPERSCRIPT [ italic_t ] , for italic_l ∈ 1 , ⋯ , italic_L , (7)

where 𝐖{0,1}n1×csuperscript𝐖01superscriptsubscript𝑛1𝑐\mathbf{W}^{\{0,1\}}\in\mathbb{R}^{n_{1}\times c}bold_W start_POSTSUPERSCRIPT { 0 , 1 } end_POSTSUPERSCRIPT ∈ blackboard_R start_POSTSUPERSCRIPT italic_n start_POSTSUBSCRIPT 1 end_POSTSUBSCRIPT × italic_c end_POSTSUPERSCRIPT, 𝐖{l1,l}nl×nl1superscript𝐖𝑙1𝑙superscriptsubscript𝑛𝑙subscript𝑛𝑙1\mathbf{W}^{\{l-1,l\}}\in\mathbb{R}^{n_{l}\times n_{l-1}}bold_W start_POSTSUPERSCRIPT { italic_l - 1 , italic_l } end_POSTSUPERSCRIPT ∈ blackboard_R start_POSTSUPERSCRIPT italic_n start_POSTSUBSCRIPT italic_l end_POSTSUBSCRIPT × italic_n start_POSTSUBSCRIPT italic_l - 1 end_POSTSUBSCRIPT end_POSTSUPERSCRIPT, and 𝐖reclnl×nlsuperscriptsubscript𝐖rec𝑙superscriptsubscript𝑛𝑙subscript𝑛𝑙\mathbf{W}_{\mathrm{rec}}^{l}\in\mathbb{R}^{n_{l}\times n_{l}}bold_W start_POSTSUBSCRIPT roman_rec end_POSTSUBSCRIPT start_POSTSUPERSCRIPT italic_l end_POSTSUPERSCRIPT ∈ blackboard_R start_POSTSUPERSCRIPT italic_n start_POSTSUBSCRIPT italic_l end_POSTSUBSCRIPT × italic_n start_POSTSUBSCRIPT italic_l end_POSTSUBSCRIPT end_POSTSUPERSCRIPT are learnable weight matrices. Here, nlsubscript𝑛𝑙n_{l}italic_n start_POSTSUBSCRIPT italic_l end_POSTSUBSCRIPT represents the number of neurons in layer l𝑙litalic_l, with n1=500subscript𝑛1500n_{1}=500italic_n start_POSTSUBSCRIPT 1 end_POSTSUBSCRIPT = 500 and n2=2subscript𝑛22n_{2}=2italic_n start_POSTSUBSCRIPT 2 end_POSTSUBSCRIPT = 2 in our architecture, and 𝐁^[t]^𝐁delimited-[]𝑡\mathbf{\hat{B}}[t]over^ start_ARG bold_B end_ARG [ italic_t ] represents the input at time step t𝑡titalic_t.

The SRNN processes the input over τ𝜏\tauitalic_τ time steps, where τ=5𝜏5\tau=5italic_τ = 5 is chosen to capture the temporal dynamics of the R-LIF neurons. The network’s output is computed using Eq. (3.2), which accumulates the spikes and membrane potentials of the output layer over all time steps.

𝐒outsubscript𝐒out\displaystyle\mathbf{S}_{\mathrm{out}}bold_S start_POSTSUBSCRIPT roman_out end_POSTSUBSCRIPT =[𝐒L[1],𝐒L[2],,𝐒L[τ]],absentsuperscript𝐒𝐿delimited-[]1superscript𝐒𝐿delimited-[]2superscript𝐒𝐿delimited-[]𝜏\displaystyle=\left[\mathbf{S}^{L}[1],\mathbf{S}^{L}[2],\cdots,\mathbf{S}^{L}[% \tau]\right],= [ bold_S start_POSTSUPERSCRIPT italic_L end_POSTSUPERSCRIPT [ 1 ] , bold_S start_POSTSUPERSCRIPT italic_L end_POSTSUPERSCRIPT [ 2 ] , ⋯ , bold_S start_POSTSUPERSCRIPT italic_L end_POSTSUPERSCRIPT [ italic_τ ] ] ,
𝐔outsubscript𝐔out\displaystyle\mathbf{U}_{\mathrm{out}}bold_U start_POSTSUBSCRIPT roman_out end_POSTSUBSCRIPT =[𝐔L[1],𝐔L[2],,𝐔L[τ]],absentsuperscript𝐔𝐿delimited-[]1superscript𝐔𝐿delimited-[]2superscript𝐔𝐿delimited-[]𝜏\displaystyle=\left[\mathbf{U}^{L}[1],\mathbf{U}^{L}[2],\cdots,\mathbf{U}^{L}[% \tau]\right],= [ bold_U start_POSTSUPERSCRIPT italic_L end_POSTSUPERSCRIPT [ 1 ] , bold_U start_POSTSUPERSCRIPT italic_L end_POSTSUPERSCRIPT [ 2 ] , ⋯ , bold_U start_POSTSUPERSCRIPT italic_L end_POSTSUPERSCRIPT [ italic_τ ] ] , (8)

where the final classification is determined by the neuron with the highest cumulative activity across these τ𝜏\tauitalic_τ time steps in the output layer.

To train the SRNN, we use the cross-entropy loss function, which is particularly suitable for our binary classification task between healthy and CLBP classes. The cross-entropy loss \mathcal{L}caligraphic_L is defined as in Eq. (9).

SRNN=i=1Cyilog(y^i)subscriptSRNNsuperscriptsubscript𝑖1𝐶subscript𝑦𝑖subscript^𝑦𝑖\mathcal{L}_{\mathrm{SRNN}}=-\sum_{i=1}^{C}y_{i}\log(\hat{y}_{i})caligraphic_L start_POSTSUBSCRIPT roman_SRNN end_POSTSUBSCRIPT = - ∑ start_POSTSUBSCRIPT italic_i = 1 end_POSTSUBSCRIPT start_POSTSUPERSCRIPT italic_C end_POSTSUPERSCRIPT italic_y start_POSTSUBSCRIPT italic_i end_POSTSUBSCRIPT roman_log ( over^ start_ARG italic_y end_ARG start_POSTSUBSCRIPT italic_i end_POSTSUBSCRIPT ) (9)

where C𝐶Citalic_C is the number of classes (in our case, C=2𝐶2C=2italic_C = 2), yisubscript𝑦𝑖y_{i}italic_y start_POSTSUBSCRIPT italic_i end_POSTSUBSCRIPT is the true label (0 or 1) for class i𝑖iitalic_i, and y^isubscript^𝑦𝑖\hat{y}_{i}over^ start_ARG italic_y end_ARG start_POSTSUBSCRIPT italic_i end_POSTSUBSCRIPT is the predicted probability for class i𝑖iitalic_i. The predicted probabilities are obtained by applying a softmax function to the accumulated output 𝐒outsubscript𝐒out\mathbf{S}_{\mathrm{out}}bold_S start_POSTSUBSCRIPT roman_out end_POSTSUBSCRIPT and 𝐔outsubscript𝐔out\mathbf{U}_{\mathrm{out}}bold_U start_POSTSUBSCRIPT roman_out end_POSTSUBSCRIPT. To handle the non-differentiable nature of spikes during backpropagation, we employ a surrogate gradient function. Specifically, we use the gradient of the shifted arc-tan function in Eq. (10).

SU=1π1(1+(πUα2)2).𝑆𝑈1𝜋11superscript𝜋𝑈𝛼22\frac{\partial S}{\partial U}=\frac{1}{\pi}\frac{1}{\left(1+\left(\pi U\frac{% \alpha}{2}\right)^{2}\right)}.divide start_ARG ∂ italic_S end_ARG start_ARG ∂ italic_U end_ARG = divide start_ARG 1 end_ARG start_ARG italic_π end_ARG divide start_ARG 1 end_ARG start_ARG ( 1 + ( italic_π italic_U divide start_ARG italic_α end_ARG start_ARG 2 end_ARG ) start_POSTSUPERSCRIPT 2 end_POSTSUPERSCRIPT ) end_ARG . (10)

This surrogate gradient allows us to backpropagate the loss through the network and update the weights of the SRNN, enabling effective training despite the discrete nature of spike events.

3.3 Multi-Stream Ensemble Classification

We have implemented a multi-stream ensemble classification approach using separate STAL-SRNN pipelines to leverage the diverse information captured by various biosignal modalities. This tailored approach provides customised encoding and classification strategies for each type of data to compensate for the information loss during continuous-to-spike conversion and leverages cross-modal interactions. For each data modality m{sEMG,Angle,Energy}𝑚sEMGAngleEnergym\in\{\mathrm{sEMG},\mathrm{Angle},\mathrm{Energy}\}italic_m ∈ { roman_sEMG , roman_Angle , roman_Energy }, we train an independent STAL-SRNN pipeline formulated in Eq. (11).

𝐘m=𝒫m(𝐁^m;𝜽m),subscript𝐘𝑚subscript𝒫𝑚subscript^𝐁𝑚subscript𝜽𝑚\mathbf{Y}_{m}=\mathcal{P}_{m}(\mathbf{\hat{B}}_{m};\boldsymbol{\theta}_{m}),bold_Y start_POSTSUBSCRIPT italic_m end_POSTSUBSCRIPT = caligraphic_P start_POSTSUBSCRIPT italic_m end_POSTSUBSCRIPT ( over^ start_ARG bold_B end_ARG start_POSTSUBSCRIPT italic_m end_POSTSUBSCRIPT ; bold_italic_θ start_POSTSUBSCRIPT italic_m end_POSTSUBSCRIPT ) , (11)

where 𝐁^msubscript^𝐁𝑚\mathbf{\hat{B}}_{m}over^ start_ARG bold_B end_ARG start_POSTSUBSCRIPT italic_m end_POSTSUBSCRIPT is the output of the STAL encoder for modality m𝑚mitalic_m, as defined in Eq. (3). Here, 𝒫m()subscript𝒫𝑚\mathcal{P}_{m}(\cdot)caligraphic_P start_POSTSUBSCRIPT italic_m end_POSTSUBSCRIPT ( ⋅ ) denotes the STAL-SRNN pipeline for that modality, 𝜽msubscript𝜽𝑚\boldsymbol{\theta}_{m}bold_italic_θ start_POSTSUBSCRIPT italic_m end_POSTSUBSCRIPT represents the learnable parameters of the pipeline, and 𝐘m2subscript𝐘𝑚superscript2\mathbf{Y}_{m}\in\mathbb{R}^{2}bold_Y start_POSTSUBSCRIPT italic_m end_POSTSUBSCRIPT ∈ blackboard_R start_POSTSUPERSCRIPT 2 end_POSTSUPERSCRIPT is the output prediction (i.e., probabilities for Healthy and CLBP classes) for modality m𝑚mitalic_m. Each STAL-SRNN pipeline is trained independently using the cross-entropy loss function, as described in Section 3.2.

The outputs from these individual pipelines serve as features for the meta-classifier, which is implemented using a Random Forest classifier [22]. This classifier is comprised of multiple decision trees, each trained on a bootstrap sample of the training data using the Gini impurity function to measure the quality of a split. The final prediction is determined by majority voting among the trees using Eq. (12).

y^=mode𝐶{RFi(𝐘sEMG,𝐘Angle,𝐘Energy)}i=1Nt,^𝑦𝐶modesuperscriptsubscriptsubscriptRF𝑖subscript𝐘sEMGsubscript𝐘Anglesubscript𝐘Energy𝑖1subscript𝑁𝑡\hat{y}=\underset{C}{\mathrm{mode}}\left\{\mathrm{RF}_{i}\left(\mathbf{Y}_{% \mathrm{sEMG}},\mathbf{Y}_{\mathrm{Angle}},\mathbf{Y}_{\mathrm{Energy}}\right)% \right\}_{i=1}^{N_{t}},over^ start_ARG italic_y end_ARG = underitalic_C start_ARG roman_mode end_ARG { roman_RF start_POSTSUBSCRIPT italic_i end_POSTSUBSCRIPT ( bold_Y start_POSTSUBSCRIPT roman_sEMG end_POSTSUBSCRIPT , bold_Y start_POSTSUBSCRIPT roman_Angle end_POSTSUBSCRIPT , bold_Y start_POSTSUBSCRIPT roman_Energy end_POSTSUBSCRIPT ) } start_POSTSUBSCRIPT italic_i = 1 end_POSTSUBSCRIPT start_POSTSUPERSCRIPT italic_N start_POSTSUBSCRIPT italic_t end_POSTSUBSCRIPT end_POSTSUPERSCRIPT , (12)

where RFi()subscriptRF𝑖\mathrm{RF}_{i}(\cdot)roman_RF start_POSTSUBSCRIPT italic_i end_POSTSUBSCRIPT ( ⋅ ) represents the prediction of the i𝑖iitalic_i-th decision tree in the Random Forest. The final prediction y^^𝑦\hat{y}over^ start_ARG italic_y end_ARG is determined by majority voting among the Ntsubscript𝑁𝑡N_{t}italic_N start_POSTSUBSCRIPT italic_t end_POSTSUBSCRIPT trees.

4 Experiments and Results

4.1 Dataset and Preparation

The EmoPain dataset [3] is a multimodal collection of data from 46 participants, encompassing healthy individuals and those with CLBP. Participants performed various exercises (one-leg-stand, reach-forward, stand-to-sit, sit-to-stand, and bend-down) at normal and difficult levels, with transitions included to simulate real-life movements.

Data were collected using 18 wearable IMU sensors and four sEMG sensors. The IMU sensors recorded full-body 3D motion, from which 13 joint angle data (Joint Angle) and 13 angular energy data (Joint Energy) were derived. The sEMG sensors captured muscle activity data from the right and left upper and lower back muscle groups. The dataset includes self-reported pain intensity ratings (0-10) from CLBP individuals and labels for six pain-related behaviours provided by physiotherapists. We assigned positive labels to individuals reporting pain intensity greater than 5, resulting in 12 positive and 34 negative labels.

To prepare the data for analysis, we addressed the variation in recording lengths by trimming recordings that exceeded 18,000 frames by 6,000 frames from both ends. This method allowed us to balance standardisation with data preservation, reducing information loss and the impact of stress and fatigue [5]. We used normalisation techniques for all data modalities (sEMG, Angles, and Energies) to ensure consistent scaling and facilitate model training. Specifically, we normalised values to a 0-1 range by subtracting the minimum value and dividing by the maximum value within each sample. Additionally, we zero-padded the data to match the longest recording length (17,995 time steps). Lastly, we employed a sliding window approach with a 3,000 time-step window (50 seconds at 60 Hz), allowing the STAL to capture longer-term temporal dependencies and enhance robustness to noise.

4.2 Implementation Details

The experiments were conducted on a high-performance computing system equipped with an AMD EPYC 7302 16-core processor, 504GB of RAM, and an NVIDIA RTX A6000 GPU. The deep learning framework used was PyTorch version 2.3.1 [23]. The AdamW optimiser [24] was employed for both the STAL encoder and the SRNN classifier, with fixed learning rates of 5.0×1035.0superscript1035.0\times 10^{-3}5.0 × 10 start_POSTSUPERSCRIPT - 3 end_POSTSUPERSCRIPT and 7.5×1047.5superscript1047.5\times 10^{-4}7.5 × 10 start_POSTSUPERSCRIPT - 4 end_POSTSUPERSCRIPT, respectively, due to its effectiveness in handling sparse gradients, a common characteristic in SNN training.

The number of neurons in the hidden layers was chosen based on the size of the sliding window. Therefore, both 𝐙(𝟏)superscript𝐙1\mathbf{Z^{(1)}}bold_Z start_POSTSUPERSCRIPT ( bold_1 ) end_POSTSUPERSCRIPT and 𝐙(𝟐)superscript𝐙2\mathbf{Z^{(2)}}bold_Z start_POSTSUPERSCRIPT ( bold_2 ) end_POSTSUPERSCRIPT were designed with 3000 neurons each to effectively process the temporal information within the data. The SRNN classifier’s hidden layer was composed of 500 recurrent LIF neurons. This decision was deliberate to find a balance between computational efficiency, model performance, and generalisability. Additionally, to ensure that the model learns from all classes in the context of this imbalanced dataset with a small sample size and to mitigate any biasing effects, we applied minority over-sampling with weighted sample replacement during batch creation.

Table 1: performance of four encoders on the EmoPain Database.
Spike density \downarrow
Encoder Classifier Accuracy \uparrow AUC \uparrow F1 \uparrow MCC \uparrow Ensemble sEMG Energy Angle
Baseline methods
Rate-coding [17] SRNN 73.91% 0.500 0.000 0.000 0.167 0.068 0.584 0.703
Latency-coding [17] SRNN 71.74% 0.512 0.133 0.044 0.200 0.200 0.200 0.200
Proposed method
STAL–Stacked SRNN 80.43% 0.679 0.526 0.437 0.549 0.360 0.703 0.793
STAL–Vanilla SRNN 76.09% 0.596 0.353 0.270 0.021 0.007 0.439 0.528

An empirical study identified optimal hyperparameters for our STAL-SRNN architecture. The best configuration (see Fig. 4) included a batch size of 32 for sEMG, 8 for Energy, and 16 for Angle, along with a dropout rate of 0.5 and ψ=5𝜓5\psi=5italic_ψ = 5.

Refer to caption
Refer to caption
Figure 4: (a) The influence of batch size on the F1-score of the proposed STAL-SRNN for each modality. (b) The Spike density for the proposed architecture as a function of (ψ,p)𝜓𝑝(\psi,p)( italic_ψ , italic_p ). While the difference in spike density between (ψ,p)=(5,0.5)𝜓𝑝50.5(\psi,p)=(5,0.5)( italic_ψ , italic_p ) = ( 5 , 0.5 ) and (ψ,p)=(10,0.5)𝜓𝑝100.5(\psi,p)=(10,0.5)( italic_ψ , italic_p ) = ( 10 , 0.5 ) is less than 1% (0.549 vs 0.540), the former achieves better performance across other classification metrics.

The training process consisted of 30 epochs for the STAL encoder and 25 epochs for the SRNN classifier, with early stopping implemented to prevent overfitting. This training regime was carefully chosen to balance model convergence and computational efficiency. All the hyperparameters and settings determined through this empirical search were subsequently used consistently across all experiments in this study. This approach ensures the reproducibility of our results and allows for a fair comparison across different experimental conditions111The implementation is accessible for reproducibility purposes on https://github.com/freek1/emopain-stl..

4.3 Evaluation Metrics

We used both standard classification metrics and a sparsity measure tailored to SNNs to evaluate the effectiveness of the STAL-SRNN framework for CLBP classification on the EmoPain dataset. All metrics were assessed using leave-one-subject-out cross-validation to ensure a rigorous analysis, especially given the relatively small size of the EmoPain dataset.

The standard classification metrics employed in this study encompassed accuracy, Macro F1 score, Area Under the Curve (AUC), and Matthews Correlation Coefficient (MCC). These metrics were suggested in the EmoPain challenge [25] for comprehensive insights into the model’s performance across various aspects.

In addition to these standard metrics, we introduced a spike density measure to assess the efficiency of our encoded spike trains. This measure provides insight into the sparsity of our neural representations across our multimodal approach. The overall spike density is computed as the harmonic mean of the individual densities for each data modality, as shown in Eq. (4.3).

SpikeDensitySpikeDensity\displaystyle\mathrm{SpikeDensity}roman_SpikeDensity =3m1Dm,absent3subscript𝑚1subscript𝐷𝑚\displaystyle=\frac{3}{\sum_{m}\frac{1}{D_{m}}},= divide start_ARG 3 end_ARG start_ARG ∑ start_POSTSUBSCRIPT italic_m end_POSTSUBSCRIPT divide start_ARG 1 end_ARG start_ARG italic_D start_POSTSUBSCRIPT italic_m end_POSTSUBSCRIPT end_ARG end_ARG ,
Dmsubscript𝐷𝑚\displaystyle D_{m}italic_D start_POSTSUBSCRIPT italic_m end_POSTSUBSCRIPT =i=1Tj=1ψk=1cm𝐁m,i,j,kTψcm,absentsuperscriptsubscript𝑖1𝑇superscriptsubscript𝑗1𝜓superscriptsubscript𝑘1subscript𝑐𝑚subscript𝐁𝑚𝑖𝑗𝑘𝑇𝜓subscript𝑐𝑚\displaystyle=\frac{\sum_{i=1}^{T}\sum_{j=1}^{\psi}\sum_{k=1}^{c_{m}}\mathbf{B% }_{m,i,j,k}}{T\psi c_{m}},= divide start_ARG ∑ start_POSTSUBSCRIPT italic_i = 1 end_POSTSUBSCRIPT start_POSTSUPERSCRIPT italic_T end_POSTSUPERSCRIPT ∑ start_POSTSUBSCRIPT italic_j = 1 end_POSTSUBSCRIPT start_POSTSUPERSCRIPT italic_ψ end_POSTSUPERSCRIPT ∑ start_POSTSUBSCRIPT italic_k = 1 end_POSTSUBSCRIPT start_POSTSUPERSCRIPT italic_c start_POSTSUBSCRIPT italic_m end_POSTSUBSCRIPT end_POSTSUPERSCRIPT bold_B start_POSTSUBSCRIPT italic_m , italic_i , italic_j , italic_k end_POSTSUBSCRIPT end_ARG start_ARG italic_T italic_ψ italic_c start_POSTSUBSCRIPT italic_m end_POSTSUBSCRIPT end_ARG , (13)

where, 𝐁m{0,1}T×ψ×cmsubscript𝐁𝑚superscript01𝑇𝜓subscript𝑐𝑚\mathbf{B}_{m}\in\{0,1\}^{T\times\psi\times c_{m}}bold_B start_POSTSUBSCRIPT italic_m end_POSTSUBSCRIPT ∈ { 0 , 1 } start_POSTSUPERSCRIPT italic_T × italic_ψ × italic_c start_POSTSUBSCRIPT italic_m end_POSTSUBSCRIPT end_POSTSUPERSCRIPT represents the spike train for modality m𝑚mitalic_m, summed over T𝑇Titalic_T time steps, cmsubscript𝑐𝑚c_{m}italic_c start_POSTSUBSCRIPT italic_m end_POSTSUBSCRIPT channels, and ψ𝜓\psiitalic_ψ spikes per time step. The denominator represents the maximum possible number of spikes for that modality.

4.4 Experimental Results

Our analysis focuses on three main aspects: (1) the performance of different spike train encoding strategies with the SRNN classifier, (2) an ablation study to understand the impact of architectural choices and hyperparameters, and (3) a comparison with state-of-the-art deep learning models for chronic pain assessment.

4.4.1 Spike Train Encoders

We conducted an assessment of four spike train encoders using the SRNN classifier with leave-one-subject-out cross-validation on the EmoPain dataset. The term STAL–Stacked refers to the proposed encoder outlined in Section 3.1. STAL–Vanilla represents a simplified version of STAL–Stacked, wherein the input is directly connected to the Repeat layer, excluding the two feature extraction blocks. Rate-coding and Latency-coding are considered baseline methods that exclusively encode information in the spatial and temporal domains, respectively. The experimental results are presented in Table 1, and Fig. 5 shows the confusion matrices for each encoder.

Refer to caption
Figure 5: Confusion matrices of the encoders with the SRNN classifier.

The results indicate that STAL–Stacked outperformed all other models, achieving the highest accuracy of 80.43%, AUC of 0.679, F1 score of 0.526, and Matthews Correlation Coefficient (MCC) of 0.437. This suggests that STAL–Stacked offers the most robust and balanced performance among the encoders tested. STAL–Vanilla emerged as the second-best performer, achieving an accuracy of 76.09%, AUC of 0.596, F1 score of 0.353, and MCC of 0.270. Notably, STAL–Vanilla achieved this with the least dense spike trains (spike density of 0.021), indicating its potential for efficient computation.

In comparison, rate-coding and latency-coding methods showed lower performance across all metrics. Rate-coding achieved 73.91% accuracy but did not effectively differentiate between classes (F1 score and MCC of 0). Latency-coding performed slightly better in terms of F1 score and MCC but had the lowest overall accuracy at 71.74

4.4.2 Ablation Study

To assess the effectiveness of the STAL architecture and the impact of different design choices, we conducted an ablation study comparing STAL–Stacked and STAL–Vanilla and examining the effects of dropout probability.

The primary distinction between STAL–Stacked and STAL–Vanilla lies in the inclusion of feature extraction blocks. STAL–Stacked integrates two feature extractor blocks before the Repeat layer, while STAL–Vanilla directly connects the input to the Repeat layer. The superior performance of STAL–Stacked (80.43% accuracy compared to 76.09% for STAL–Vanilla) suggests that these additional layers contribute to a more effective feature representation. Additionally, STAL–Stacked employed a higher spike density (0.549) compared to STAL–Vanilla (0.021). This outcome indicates that the improved performance of STAL–Stacked is attributed to a more information-rich encoding, albeit at the cost of increased computational resources.

Both variants of the STAL method performed better than the baseline rate- and latency-based encoding approaches. This improvement can be attributed to STAL’s unique ability to encode spike trains both spatially and temporally at the same time. Specifically, STAL encodes spatially by learning a distinct threshold value ϕi𝚽subscriptitalic-ϕ𝑖𝚽\phi_{i}\in\mathbf{\Phi}italic_ϕ start_POSTSUBSCRIPT italic_i end_POSTSUBSCRIPT ∈ bold_Φ for each feature hi𝐡subscript𝑖𝐡h_{i}\in\mathbf{h}italic_h start_POSTSUBSCRIPT italic_i end_POSTSUBSCRIPT ∈ bold_h, as shown in Eq. (2). At the same time, the loss function encourages temporal encoding of information by promoting spikes at the end of the spike train to encode a high signal, as demonstrated by the positional encoding in Eq. (3). This dual encoding capability enables STAL to capture more complex patterns in the data, resulting in improved performance compared to the baseline methods.

In contrast, rate-coding encodes information in the spatial domain by proportionally correlating the probability sum of generated spikes for each input value while randomly positioning the spikes in the temporal domain. Similarly, latency-coding only encodes information temporally. This is evident from the uniform distribution in the spatial domain, where 1 spike is generated for each input, and the spike is strategically positioned in the temporal domain.

In our ablation study, we focused on examining the impact of dropout probability on the performance of our models. We found that in the STAL–Stacked architecture, a higher dropout probability (p=0.5)𝑝0.5(p=0.5)( italic_p = 0.5 ) had a positive effect on performance. This can be attributed to the presence of fully connected layers in the feature extraction blocks, which facilitated information distribution during dropout. Conversely, in STAL–Vanilla, a lower dropout probability (p=0.0)𝑝0.0(p=0.0)( italic_p = 0.0 ) was found to be optimal. This is because the absence of feature extraction layers in this architecture made it more susceptible to information loss during dropout.

4.4.3 Comparison with Deep Learning Models

To contextualise our results within the broader landscape of pain assessment methods, we compared our STAL-SRNN approach with state-of-the-art deep learning models on the EmoPain dataset (see Table 2). This comparison serves multiple purposes:

  1. 1.

    It establishes a performance baseline for spiking neural networks in chronic pain assessment, an area predominantly explored using traditional deep learning methods.

  2. 2.

    It highlights the trade-offs between deep learning models’ high performance and the potential efficiency advantages of spiking neural networks.

  3. 3.

    It shows that despite the inherent differences in input processing (continuous vs. spike-based), our STAL-SRNN can achieve competitive results in certain aspects.

Table 2: Comparison of the proposed spiking architecture with state-of-the-art deep learning models on the EmoPain dataset.
Methods AUC F1 MCC
Deep Learning
MiMT [26] 0.648 0.722 0.327
LSTM+GCN [27] 0.690 0.731 0.365
GRU-RNN [7] 0.855 0.765 0.411
L-SFAN [9] 0.849 0.772 0.510
Proposed Spiking Method
STAL-SRNN 0.679 0.526 0.437

Our STAL-SRNN demonstrates competitive performance on the EmoPain dataset, achieving an AUC of 0.679 for CLBP classification. While this AUC is lower than the deep learning approaches (GRU-RNN: 0.855, L-SFAN: 0.849), it is important to highlight that our method outperforms the GRU-RNN model in MCC (0.437 vs 0.411). The competitive MCC suggests that STAL-SRNN has the potential for balanced classification in this domain.

5 Conclusion and Future Work

In this study, we presented the first application of spiking neural networks for chronic lower back pain classification using the EmoPain dataset. Our work introduces two significant contributions to the field of neuromorphic computing and biosignal analysis. We introduced Spike Threshold Adaptive Learning (STAL), a novel, trainable encoder that effectively converts continuous biosignals into spike trains. Experimental results demonstrate that STAL could preserve crucial temporal dynamics and adapt to signal characteristics. We proposed an ensemble of Spiking Recurrent Neural Network (SRNN) classifiers designed to process surface Electromyography (sEMG) and Inertial Measurement Unit (IMU) data within a multi-stream architecture. This ensemble leverages a random forest classifier to efficiently harness the temporal processing capabilities inherent in SRNNs.

Our STAL-SRNN framework demonstrates outstanding performance, achieving an accuracy of 80.43%, AUC of 67.90%, F1 score of 52.60%, and Matthews Correlation Coefficient (MCC) of 0.437. Notably, it outperforms traditional rate-based and latency-based encoding methods and surpasses the best-performing deep learning approach in terms of MCC, indicating better-balanced class prediction. This achievement was particularly significant given the challenges posed by the small sample size and class imbalance in the EmoPain dataset.

The success of our approach could represent a significant step forward in applying neuromorphic computing techniques to chronic pain management challenges. By leveraging the inherent efficiency of SNNs, our method bridged the gap between efficient spike-based computation and complex biosignal analysis. The STAL-SRNN provides new possibilities for personalised, continuous health monitoring and intervention in chronic pain management.

Future work could explore extending the framework to simultaneously classify pain levels and identify specific pain-related behaviours, providing more comprehensive insights for pain management. Furthermore, exploring the deployment of the STAL-SRNN framework on neuromorphic hardware could fully leverage its energy efficiency for long-term, wearable applications.

References

  • [1] A. Supratak, C. Wu, H. Dong, K. Sun, and Y. Guo, Survey on Feature Extraction and Applications of Biosignals.   Springer International Publishing, 2016, pp. 161–182.
  • [2] N. Li, R. Zhou, B. Krishna, A. Pradhan, H. Lee, J. He, and N. Jiang, “Non-invasive Techniques for Muscle Fatigue Monitoring: A Comprehensive Survey,” ACM Computing Survey, vol. 56, no. 9, 2024.
  • [3] M. S. H. Aung, S. Kaltwang, B. Romera-Paredes, B. Martinez, A. Singh, M. Cella, M. Valstar, H. Meng, A. Kemp, M. Shafizadeh, A. C. Elkins, N. Kanakam, A. De Rothschild, N. Tyler, P. J. Watson, A. C. D. C. Williams, M. Pantic, and N. Bianchi-Berthouze, “The Automatic Detection of Chronic Pain-Related Expression: Requirements, Challenges and the Multimodal EmoPain Dataset,” IEEE Transactions on Affective Computing, vol. 7, no. 4, pp. 435–451, 2016.
  • [4] R. J. Yong, P. M. Mullins, and N. Bhattacharyya, “Prevalence of chronic pain among adults in the united states,” Pain, vol. 163, no. 2, pp. e328–e332, 2022.
  • [5] R. Fernandez Rojas, N. Brown, G. Waddington, and R. Goecke, “A systematic review of neurophysiological sensing for the assessment of acute pain,” npj Digital Medicine, vol. 6, no. 1, p. 76, 2023.
  • [6] N. Makaram, P. A. Karthick, and R. Swaminathan, “Analysis of Dynamics of EMG Signal Variations in Fatiguing Contractions of Muscles Using Transition Network Approach,” IEEE Transactions on Instrumentation and Measurement, vol. 70, pp. 1–8, 2021.
  • [7] M. M. Dehshibi, T. Olugbade, F. Diaz-de Maria, N. Bianchi-Berthouze, and A. Tajadura-Jiménez, “Pain Level and Pain-Related Behaviour Classification Using GRU-Based Sparsely-Connected RNNs,” IEEE Journal of Selected Topics in Signal Processing, vol. 17, no. 3, pp. 677–688, 2023.
  • [8] E. Farago, D. MacIsaac, M. Suk, and A. D. C. Chan, “A Review of Techniques for Surface Electromyography Signal Quality Analysis,” IEEE Reviews in Biomedical Engineering, vol. 16, pp. 472–486, 2023.
  • [9] J. Ortigoso-Narro, F. D. de Maria, M. M. Dehshibi, and A. Tajadura-Jiméne, “L-SFAN: Lightweight Spatially-focused Attention Network for Pain Behavior Detection,” pp. 1–9, 2024.
  • [10] F. Wang, W. M. Severa, and F. Rothganger, “Acquisition and Representation of Spatio-Temporal Signals in Polychronizing Spiking Neural Networks,” in Proceedings of the 7th Annual Neuro-Inspired Computational Elements Workshop.   Association for Computing Machinery, 2019.
  • [11] A. Sun, X. Chen, M. Xu, X. Zhang, and X. Chen, “Feasibility study on the application of a spiking neural network in myoelectric control systems,” Frontiers in Neuroscience, vol. 17, p. 1174760, 2023.
  • [12] D. Auge, J. Hille, E. Mueller, and A. Knoll, “A survey of encoding techniques for signal processing in Spiking Neural Networks,” Neural Processing Letters, vol. 53, no. 6, pp. 4693–4710, 2021.
  • [13] N. Perez-Nieves and D. F. M. Goodman, “Sparse Spiking Gradient Descent,” in Advances in Neural Information Processing Systems, 2021, pp. 11 795–11 808.
  • [14] M. Shahsavari, D. Thomas, A. Brown, and W. Luk, “Neuromorphic Design Using Reward-based STDP Learning on Event-Based Reconfigurable Cluster Architecture,” in International Conference on Neuromorphic Systems.   Association for Computing Machinery, 2021.
  • [15] J. M. Fabian, D. C. O’Carrol, and S. D. Wiederman, “Sparse spike trains and the limitation of rate codes underlying rapid behaviours,” Biology Letters, vol. 19, no. 5, p. 20230099, 2023.
  • [16] M. Shahsavari, D. Thomas, M. van Gerven, A. Brown, and W. Luk, “Advancements in spiking neural network communication and synchronization techniques for event-driven neuromorphic systems,” Array, vol. 20, p. 100323, 2023.
  • [17] J. K. Eshraghian, M. Ward, E. O. Neftci, X. Wang, G. Lenz, G. Dwivedi, M. Bennamoun, D. S. Jeong, and W. D. Lu, “Training Spiking Neural Networks Using Lessons From Deep Learning,” Proceedings of the IEEE, vol. 111, no. 9, pp. 1016–1054, 2023.
  • [18] E. Donati, M. Payvand, N. Risi, R. Krause, and G. Indiveri, “Discrimination of EMG Signals Using a Neuromorphic Implementation of a Spiking Neural Network,” IEEE Transactions on Biomedical Circuits and Systems, vol. 13, no. 5, pp. 795–803, 2019.
  • [19] P. Gong, P. Wang, Y. Zhou, and D. Zhang, “A Spiking Neural Network With Adaptive Graph Convolution and LSTM for EEG-Based Brain-Computer Interfaces,” IEEE Transactions on Neural Systems and Rehabilitation Engineering, vol. 31, pp. 1440–1450, 2023.
  • [20] R. D. Gurchiek, N. Donahue, N. M. Fiorentino, and R. S. McGinnis, “Wearables-Only Analysis of Muscle and Joint Mechanics: An EMG-Driven Approach,” IEEE Transactions on Biomedical Engineering, vol. 69, no. 2, pp. 580–589, 2021.
  • [21] A. El Ferdaoussi, J. Rouat, and E. Plourde, “Efficiency metrics for auditory neuromorphic spike encoding techniques using information theory,” Neuromorphic Computing and Engineering, vol. 3, no. 2, p. 024007, 2023.
  • [22] T. K. Ho, “Random decision forests,” in Proceedings of 3rd International Conference on Document Analysis and Recognition, vol. 1.   IEEE, 1995, pp. 278–282.
  • [23] A. Paszke, S. Gross, S. Chintala, G. Chanan, E. Yang, Z. DeVito, Z. Lin, A. Desmaison, L. Antiga, and A. Lerer, “Automatic differentiation in PyTorch,” in NIPS Autodiff Workshop, 2017.
  • [24] I. Loshchilov and F. Hutter, “Decoupled weight decay regularization,” in International Conference on Learning Representations, 2019, pp. 1–18.
  • [25] J. O. Egede, S. Song, T. A. Olugbade, C. Wang, C. D. C. Amanda, H. Meng, M. Aung, N. D. Lane, M. Valstar, and N. Bianchi-Berthouze, “Emopain challenge 2020: Multimodal pain evaluation from facial and bodily expressions,” in 2020 15th IEEE International Conference on Automatic Face and Gesture Recognition (FG 2020).   IEEE, 2020, pp. 849–856.
  • [26] T. Olugbade, N. Gold, A. C. d. C. Williams, and N. Bianchi-Berthouze, “A Movement in Multiple Time Neural Network for Automatic Detection of Pain Behaviour,” in International Conference on Multimodal Interaction.   Association for Computing Machinery, 2021, pp. 442––445.
  • [27] C. Wang, Y. Gao, A. Mathur, A. C. De C. Williams, N. D. Lane, and N. Bianchi-Berthouze, “Leveraging Activity Recognition to Enable Protective Behavior Detection in Continuous Data,” Proc. ACM Interact. Mob. Wearable Ubiquitous Technol., vol. 5, no. 2, 2021.