A Comprehensive Analysis of Real-World Accelerometer Data Quality in a Global Smartphone-based Seismic Network

Yawen Zhang1, Qingkai Kong2, Tao Ruan1, Qin Lv1 and Richard Allen3
1Department of Computer Science, University of Colorado Boulder, Boulder, Colorado, USA
2Lawrence Livermore National Laboratory, Livermore, California, USA
3Berkeley Seismological Laboratory, Berkeley, California, USA
Email: 1{tao.ruan, yawen.zhang, qin.lv}@colorado.edu, 2kongqk@berkeley.edu, 3rallen@berkeley.edu
Abstract

The proliferation of low-cost sensors in smartphones has facilitated numerous applications; however, large-scale deployments often encounter performance issues. Sensing heterogeneity, which refers to varying data quality due to factors such as device differences and user behaviors, presents a significant challenge. In this research, we perform an extensive analysis of 3-axis accelerometer data from the MyShake system, a global seismic network utilizing smartphones. We systematically evaluate the quality of approximately 22 million 3-axis acceleration waveforms from over 81 thousand smartphone devices worldwide, using metrics that represent sampling rate and noise level. We explore a broad range of factors influencing accelerometer data quality, including smartphone and accelerometer manufacturers, phone specifications (release year, RAM, battery), geolocation, and time. Our findings indicate that multiple factors affect data quality, with accelerometer model and smartphone specifications being the most critical. Additionally, we examine the influence of data quality on earthquake parameter estimation and show that removing low-quality accelerometer data enhances the accuracy of earthquake magnitude estimation.

Index Terms:
Mobile Sensing, Smartphone Seismic Network, Sensing Quality

I Introduction

Smartphones, equipped with a collection of sensors, have increasingly been employed in various applications [1, 2, 3, 4]. In recent years, researchers have been exploring the use of smartphones in disaster-related applications such as earthquake detection [5, 6, 7, 8, 9, 10]. Traditionally, earthquake monitoring relies on high-quality seismometers, and a seismic network with a sufficient density of seismometers is a prerequisite for developing an Earthquake Early Warning (EEW) system [5]. Such networks are expensive to deploy and maintain in many underdeveloped regions. While smartphone-based accelerometer data are primarily used for Human Activity Recognition (HAR) [11, 12, 13, 14], the wide availability of low-cost accelerometers on smartphones offers novel approaches for earthquake detection [15]. To build a global seismic network, the MyShake system (https://myshake.berkeley.edu/) has been developed to leverage accelerometers on smartphones to detect earthquake-like motions [16, 5]. The MyShake system was launched in 2016 by the University of California, Berkeley. To date, it has recorded over 1000 earthquakes worldwide [17]. The MyShake application (available in the iTunes App and Google Play stores) has been downloaded by more than 2.5 million users, and more than 500,000 phones interact with the system each day. MyShake also delivers USGS ShakeAlert earthquake early warning messages to California, Oregon, and Washington [18, 20]. Building upon MyShake, Google has recently announced the Android Earthquake Alerts System (https://blog.google/products/android/earthquake-detection-and-alerts/) [19], which uses billions of Android phones globally as mini seismometers for earthquake detection. This effort would greatly expand the capability of mobile sensing for earthquake detection.

Like many other mobile sensing applications, MyShake faces performance challenges arising from sensing heterogeneity when deployed at a large scale [21]. The collected accelerometer data vary in quality due to differences in devices, user behaviors, and other factors. Previous studies have shown that such sensing heterogeneity can significantly impair the performance of accelerometer-based applications [22, 23]. Currently, the MyShake system relies on seismologists to manually review all the waveforms and remove those with significant quality issues, e.g., missing data or spikes [24], which is impractical in large-scale deployments. It is therefore crucial to automate quality assessment and to gain a comprehensive understanding of the primary quality concerns associated with accelerometer data. Furthermore, accelerometer data quality may be influenced by a variety of factors, including the device, the accelerometer sensor, and the user’s location. Previous studies have investigated influencing factors such as the device and accelerometer sensor, but with only a small number of devices (e.g., 13 smartphones from 4 manufacturers in [22]). Real-world applications such as MyShake deal with a much larger number of devices (81 thousand smartphones) and accelerometer sensors. Our study represents the first comprehensive, large-scale analysis of accelerometer data quality. The findings should benefit not only MyShake but also other accelerometer-based applications.

To summarize, our study makes the following contributions:

  • We investigate real-world, large-scale 3-axis accelerometer data collected by the MyShake system, assessing their quality based on parameters such as sampling rate and noise level.

  • We explore an extensive range of factors, including smartphone manufacturer, accelerometer sensor, smartphone specifications (RAM, battery, etc.), geolocation, and trigger time, to understand their relationships with accelerometer data quality.

  • We assess the importance of various factors by employing them to predict accelerometer data quality. The findings suggest that the quality of accelerometer data is influenced by multiple factors, with the accelerometer model and smartphone specifications being the most important ones.

  • We examine the effect of accelerometer data quality on earthquake parameter estimation by applying quality control in real-world earthquake events. The results demonstrate that by filtering out poor-quality accelerometer data, the accuracy of earthquake magnitude estimations can be improved.

II Related Work

II-A New Approaches for Earthquake Detection

Traditionally, earthquake detection relies on high-quality seismometers. In recent years, there have been many studies that explore new ways to detect earthquakes. Sakaki et al. [25, 26] and Earle et al. [7] leverage Twitter data to estimate the locations of earthquake events, and build an earthquake reporting system upon that. Avvenuti et al. [27] make use of real-time Twitter messages to detect earthquake events, and mine the message content to discover knowledge about the consequences of those events. More recent studies explore the potential of utilizing low-cost accelerometers, which are found in devices such as smartphones and connected vehicles, for the purpose of earthquake detection [28, 9, 5, 8, 29]. MyShake is such an application that leverages these low-cost accelerometers in smartphones as seismometers to detect earthquake events.

II-B Accelerometer Data Quality

When deployed in the real world, mobile sensing applications usually face challenges arising from varying data quality. In the HAR application, Stisen et al. [22] analyze several types of sensing heterogeneities like sensor bias, sampling rate, and sampling rate instability. Martinez et al. [30] examine wearable data quality by looking into metrics related to data gaps, replacements, and wearable energy levels. A variety of sources can contribute to data quality variability, including the device, sensor, operating system (OS), user behavior, etc. Although previous studies have explored these sources, the scale of examination has been limited. For instance, Stisen et al. [22] investigate heterogeneity sources such as accelerometer sensor and CPU load across 13 phone models from four manufacturers. Min et al. [31] analyze device and temporal variability in a multi-device setting with an accelerometer dataset from 15 participants performing seven activities. In contrast to these studies, our study presents the first comprehensive, large-scale analysis of accelerometer data quality. Additionally, we thoroughly examine an extensive range of factors that could potentially contribute to data quality variability.

II-C Quality-Aware Mobile Crowdsensing

Various strategies have been proposed to address the issue of data quality variability in mobile sensing applications. One common approach involves data preprocessing techniques, such as outlier removal [32] or data interpolation [33], which aim to alleviate the effects of poor-quality data. In addition to preprocessing, some studies explicitly consider varying data quality and incorporate it into the modeling process. For instance, Stisen et al. train classifiers for datasets exhibiting diverse quality levels [22]. Khan et al. employ domain adaptation techniques to adapt models to different contexts (e.g., user, device type, device instance) [14]. Chuprov et al. [34] design a Genetic Algorithm (GA)-based approach for sensor selection and fusion. Our research represents an initial effort towards creating a quality-aware mobile sensing system for earthquake detection, with a focus on analyzing accelerometer data quality on a large scale and assessing the effects of data quality variability. We acknowledge the emergence of novel methods in this area and intend to explore their potential application in future work.

III MyShake System and Dataset Overview

III-A MyShake System

MyShake is a global, smartphone-based seismic network that collects data from accelerometers to detect earthquakes and potentially provide earthquake early warning (EEW) using crowdsourced information [5, 20]. The MyShake app employs a trained artificial neural network (ANN) model to identify earthquake-like movements on individual phones, and it does so only when the phone is stationary. If such movements are detected on a phone, a trigger (including location, time, and amplitude) is sent to a cloud server, where triggers from multiple devices are compiled and a network detection algorithm is used to confirm an earthquake [17]. In addition to each trigger, 3-axis accelerometer data are collected (as shown in Fig. 3). Specifically, 5-minute segments (1 minute before and 4 minutes after the trigger) of 3-component acceleration data are recorded and uploaded to the cloud server when the phone has access to WiFi and power. The system is designed to sample accelerometer data at 25 Hz, resulting in a time interval of approximately 40 msec between two samples. From all the uploaded waveforms, an earthquake waveform database is established [24]. Figure 1 illustrates the system architecture of MyShake.

Figure 1: MyShake system architecture

III-B Dataset Description

The MyShake earthquake waveform database provides the accelerometer dataset, comprising approximately 22 million anonymized waveforms collected from 81 thousand MyShake devices globally (all Android devices), as depicted in Fig. 2. These waveforms were collected between 2016 and 2019. Information about the phone brands and accelerometer sensor vendors for these devices was also collected. MyShake obtains user consent to gather their phones’ GPS locations and adds 1 km of random noise to the locations to safeguard user privacy. All 81 thousand devices include GPS location information.

Figure 2: Spatial distribution of MyShake devices (with 1 km random spatial noise)
(a) Good-quality waveforms
(b) Waveforms with missing data
(c) Waveforms with high noise levels
(d) Waveforms with missing data and high noise levels
Figure 3: Example waveforms with varying quality (Note: spikes near 0s represent earthquake-related triggers).

IV Accelerometer Data Quality

In this section, we first examine sample waveforms of varying quality. We then introduce the quality metrics employed to evaluate waveform data quality and present an analysis of the overall quality distribution.

IV-A Example Waveforms

Fig. 3 displays four sample waveforms with varying quality. The first example in Fig. 3(a) exhibits good quality: samples are collected consistently throughout the 5-minute duration, and the 3-axis acceleration variations are very small before the earthquake-related trigger (near 0s). The second example in Fig. 3(b) shows noticeable missing data (i.e., gaps between time intervals) and a significantly smaller total number of samples than the example in Fig. 3(a) (2773 vs. 8826). Fig. 3(c) showcases waveforms with high noise levels, as the acceleration values exhibit much larger variations than those in Fig. 3(a); the standard deviation of the noise is substantially larger (0.0022 g vs. 0.0006 g), and on the y-axis the noise-induced accelerations almost overshadow the earthquake-related accelerations. The example in Fig. 3(d) combines the issues observed in Fig. 3(b) and Fig. 3(c), featuring both missing data and high noise levels; its resolution is also low enough that the discrete resolution levels become visible.

IV-B Quality Metrics

Based on the analysis of example waveforms, we characterize waveform data quality using various metrics focused on sampling rate and noise level, which are highly relevant to earthquake detection performance. Sampling rate-based quality metrics have been extensively employed in other accelerometer-based applications as well [22, 35]. Specifically, we generate the following quality metrics:

  • Sampling rate: This includes the total number of samples (n_sample), the total number of pre-trigger samples (n_noise), and the standard deviation of time intervals between samples (std_dt).

  • Noise level: This includes the standard deviation of noise data for the x, y, and z components (std_x, std_y, std_z). Note that for the vertical component, gravity is removed.

We calculate six quality metrics for each 5-minute segment of 3-axis acceleration data. The sampling rate affects the number of valid data points collected for earthquake detection. Based on the system’s design, the expected values for n_sample and n_noise should be 7500 and 1500, respectively. std_dt serves as a measure of the sampling rate stability, with larger std_dt values indicating less stable time intervals throughout the recording period. The noise level can impact the accuracy of the recorded 3-axis acceleration data, which is crucial for estimating earthquake parameters. std_x/y/z provide a comparative measure of each device’s noise level.
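As a rough illustration of how these six metrics could be computed for one waveform, the following Python sketch assumes timestamps in integer milliseconds relative to the trigger (negative values are pre-trigger) and accelerations in g. The function name, the input layout, the choice of z as the vertical axis, the use of the population standard deviation, and removing gravity by subtracting the pre-trigger mean are all assumptions made for this example, not details confirmed by the MyShake implementation.

```python
from statistics import mean, pstdev

SAMPLING_HZ = 25       # design sampling rate stated in the text
PRE_TRIGGER_S = 60     # 1 minute before the trigger
POST_TRIGGER_S = 240   # 4 minutes after the trigger

# Expected sample counts implied by the design: 7500 in total, 1500 pre-trigger.
EXPECTED_N_SAMPLE = SAMPLING_HZ * (PRE_TRIGGER_S + POST_TRIGGER_S)
EXPECTED_N_NOISE = SAMPLING_HZ * PRE_TRIGGER_S


def quality_metrics(t_ms, x, y, z):
    """Compute the six quality metrics for one waveform (hypothetical helper).

    t_ms: timestamps in msec relative to the trigger (assumed layout).
    x, y, z: accelerations in g; z is assumed to be the vertical component.
    """
    pre = [i for i, t in enumerate(t_ms) if t < 0]   # pre-trigger (noise) samples
    dts = [b - a for a, b in zip(t_ms, t_ms[1:])]    # intervals between samples

    def noise_std(values, remove_offset=False):
        noise = [values[i] for i in pre]
        if not noise:
            return float("nan")
        if remove_offset:
            # Assumed stand-in for gravity removal: subtract the pre-trigger mean.
            offset = mean(noise)
            noise = [v - offset for v in noise]
        return pstdev(noise)

    return {
        "n_sample": len(t_ms),
        "n_noise": len(pre),
        "std_dt": pstdev(dts) if dts else float("nan"),
        "std_x": noise_std(x),
        "std_y": noise_std(y),
        "std_z": noise_std(z, remove_offset=True),
    }


# An ideal, perfectly regular 25 Hz recording with no motion (synthetic data).
t_ms = [-60_000 + 40 * i for i in range(EXPECTED_N_SAMPLE)]
flat = [0.0] * len(t_ms)
vertical = [1.0] * len(t_ms)   # constant 1 g on the assumed vertical axis
ideal = quality_metrics(t_ms, flat, flat, vertical)
```

For this synthetic recording the counts match the expected values (7500 and 1500) and all standard deviations are zero; real waveforms deviate from this baseline in the ways the metrics are meant to capture.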

IV-C Waveform Quality Distribution

Fig. 4 presents the cumulative distributions of six waveform quality metrics. Although the MyShake system is designed to sample data at 25 Hz, numerous waveforms display variations in terms of sampling rate. Regarding n_sample, approximately half of the waveforms contain around 7500 samples. For the remaining waveforms, there is a higher percentage of undersampling instances (31.1%) compared to oversampling instances (24.7%). The patterns of n_noise are roughly similar to those of n_sample. std_dt ranges from 0 to 10,000 msec, suggesting the presence of varying gap sizes between samples. As for noise level, std_x, std_y, and std_z exhibit comparable distributions. Most variations fall between 0.0005 and 0.01 g, but a very small portion of waveforms still demonstrate extremely large noise variations.
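The cumulative distributions discussed above can be approximated with an empirical CDF. The sketch below is a generic illustration on made-up values; the function names and the toy data are assumptions for this example and do not reproduce the MyShake figures.

```python
def ecdf(values):
    """Return (value, cumulative fraction) pairs of the empirical CDF."""
    xs = sorted(values)
    n = len(xs)
    return [(x, (i + 1) / n) for i, x in enumerate(xs)]


def fraction_at_or_below(values, threshold):
    """Fraction of observations that do not exceed the threshold."""
    return sum(v <= threshold for v in values) / len(values)


# Toy n_sample values (synthetic, for illustration only).
toy_n_sample = [7400, 7500, 7500, 7600]
toy_cdf = ecdf(toy_n_sample)
toy_share = fraction_at_or_below(toy_n_sample, 7500)
```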

Figure 4: Cumulative distribution function (CDF) of six waveform data quality metrics, with dashed orange lines representing the expected values of n_sample and n_noise. This figure highlights significant variations observed in waveform quality distributions.

V Factors Related to Accelerometer Data Quality

In this section, we explore potential factors influencing accelerometer-based sensing quality, such as phone and accelerometer manufacturer, accelerometer model, phone specifications, geolocation, and trigger time. It is important to note that, in addition to these observable factors, numerous unobserved factors like user behavior and local environment can also affect accelerometer data quality. Our study, however, primarily focuses on the observed factors.

V-A Smartphone Hardware

For each MyShake device, both smartphone and accelerometer sensor information was collected. Smartphone information consists of its manufacturer and specific model, for example, “samsung, galaxy a7”. Accelerometer sensor information consists of its manufacturer and specific model, for example, “st, lsm330” (“st” is short for STMicroelectronics, which is a well-known semiconductor manufacturer).

The raw information collected can be noisy, inconsistent, or even incorrect. Therefore, we perform preprocessing on the smartphone and accelerometer sensor information. After preprocessing, there are 513 phone manufacturers, 4036 phone models, 17 accelerometer manufacturers, and 419 accelerometer models. Among the 513 phone manufacturers, only 26 manufacturers have at least 100 MyShake devices, and they account for 96.3% of all devices (81 thousand in total). This suggests that most MyShake devices are associated with major players in the smartphone market. Consequently, we rank phone manufacturers and accelerometer manufacturers by the number of devices linked to them and examine the most representative ones.
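The ranking and coverage figures above amount to counting devices per manufacturer and summing the share of those above a cutoff. A minimal sketch follows, assuming one cleaned manufacturer string per device; the function name and the toy data are hypothetical, and the preprocessing step itself is not shown because the text does not specify it.

```python
from collections import Counter


def manufacturer_coverage(device_manufacturers, min_devices=100):
    """Rank manufacturers by device count and report the share covered
    by those with at least `min_devices` devices (hypothetical helper)."""
    if not device_manufacturers:
        return [], 0.0
    counts = Counter(device_manufacturers)
    ranked = counts.most_common()                      # descending by count
    major = [(name, n) for name, n in ranked if n >= min_devices]
    share = sum(n for _, n in major) / len(device_manufacturers)
    return major, share


# Synthetic example: one manufacturer label per device.
toy_devices = ["samsung"] * 150 + ["huawei"] * 120 + ["rare"] * 30
toy_major, toy_share = manufacturer_coverage(toy_devices)
```

In this toy input, two manufacturers pass the 100-device cutoff and together cover 90% of the devices.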

Fig. 5 displays the cumulative distributions of quality metrics for various phone manufacturers, revealing differences in both sampling rate and noise levels. Waveforms from certain phone manufacturers exhibit significant deviations compared to others. For example, the waveforms of htc phones demonstrate a deviation in std_dt. The median std_dt for htc phones is 39 msec, considerably larger than that of other manufacturers, such as samsung at 4 msec. A larger std_dt implies greater variations in time intervals (i.e., an unstable sampling rate). Another notable deviation is found in huawei, which displays different distributions in std_x, std_y, and std_z, suggesting that waveforms from huawei phones are more likely to exhibit high noise levels. These significant deviations point to potential quality issues in the waveforms collected by those phone manufacturers.

Fig. 6 presents the cumulative distributions of quality metrics for various accelerometer manufacturers, revealing greater variations in both sampling rate and noise levels compared to phone manufacturers. Regarding sampling rate, variations among accelerometer manufacturers are primarily observed in the undersampling aspect. Manufacturers such as memsic, mcube, and nxp exhibit a higher proportion of undersampling cases compared to others. Additionally, waveforms from memsic show significant deviations in the distributions of std_x, std_y, and std_z. Fig. 7 illustrates that even waveforms from the same accelerometer manufacturer but different models display variations in sampling interval and noise level. In this example, all five models are from STMicroelectronics and are ordered by release time. Newer generation models like lsm6dsl, lsm6dsm, and lsm6dso exhibit smaller variations in time intervals and noise levels compared to older models such as lis2dh and lis3dh. Interestingly, newer models within the same series tend to have slightly larger variations in noise levels (e.g., lis3dh vs. lis2dh).

Figure 5: Cumulative distribution function (CDF) of waveform quality metrics for the top 10 smartphone manufacturers. This figure highlights that there are differences in sampling rates and noise levels across various phone manufacturers, with significant deviations observed in htc and huawei phones.
Figure 6: Cumulative distribution function (CDF) of waveform quality metrics for the top 10 accelerometer manufacturers. This figure highlights that accelerometer manufacturers show greater variations in sampling rate and noise levels than phone manufacturers.
Figure 7: Boxplots of waveform quality metrics across various STMicroelectronics accelerometer models (ordered by release time). This figure highlights that even among different models from the same accelerometer manufacturer, such as STMicroelectronics, variations in sampling intervals and noise levels exist, with newer models generally exhibiting smaller variations in time intervals and noise levels.

V-B Smartphone Specifications

Due to the extensive number of phone models, examining the main differences between all of them can be challenging. To address this issue, we gather specifications for each phone model using their brand and model information from a public website called GSMArena (www.gsmarena.com), which provides detailed smartphone specifications. The queried specifications include release date, random access memory (RAM) size, and battery size. Out of the 4036 phone models, we successfully collected phone specifications for 1663 models.

Fig. 8 displays the cumulative distributions of quality metrics for phones released in different years, revealing variations in n_sample and n_noise. A noticeable pattern emerges, showing that newer phones are more likely to oversample. Fig. 9 presents the cumulative distributions of quality metrics for phones with different RAM sizes, demonstrating similar patterns to release year: phones with larger RAM are more likely to oversample. Notably, the 8 GB RAM size group deviates significantly in std_x, std_y, and std_z, suggesting that waveforms in this group tend to have larger variations in noise levels. Additionally, phones with different battery sizes exhibit smaller variations in sampling rate compared to release year and RAM size, indicating a lesser impact of battery size on sampling rate. Regarding noise level, battery sizes below 5000 mAh display similar distributions, while those above 5000 mAh are less likely to exhibit high noise levels.

Figure 8: Cumulative distribution function (CDF) of waveform quality metrics categorized by the release year of phones. This figure highlights that newer phones are more likely to oversample.
Figure 9: Cumulative distribution function (CDF) of waveform quality metrics categorized by phone RAM size (unit: GB). This figure highlights that phones with larger RAM are more likely to oversample.

V-C Geolocation

The GPS location of MyShake devices in this study represents a single snapshot in time (with 1 km random noise added). We associate each GPS location (latitude, longitude) with geographic layers, such as country boundaries and roads (https://www.naturalearthdata.com/downloads/10m-cultural-vectors/roads/), to obtain the geographic context (e.g., country, distance to highway) of the MyShake devices.

Using the country information, we calculate the total number of MyShake devices per country and select the top 10 countries. We then examine their hardware characteristics in terms of phone and accelerometer manufacturer composition. As illustrated in Fig. 10, a significant percentage of phones from the top 10 countries are samsung devices, with varying compositions of phone manufacturers across countries. The United States and India have the largest numbers of phone manufacturers (105 and 89, respectively). Fig. 11 displays the accelerometer manufacturer compositions, with most devices in the top 10 countries featuring sensors from three manufacturers: invensense, st, and bosch, albeit with different ratios. There are fewer variations in accelerometer manufacturer compositions among the top 10 countries compared to phone manufacturer compositions. As different countries have varying compositions of phone and accelerometer manufacturers, they also display differences in waveform quality. Fig. 12 shows that waveforms from India, Nepal, and Taiwan are more likely to be undersampled and have larger time intervals compared to other countries. Meanwhile, most countries have similar noise level distributions, with Chile standing out as having larger variations in noise levels.

Figure 10: Phone manufacturer distribution in the top 10 countries with MyShake devices. This figure highlights the varying compositions of phone manufacturers across countries.
Figure 11: Accelerometer manufacturer distribution in the top 10 countries with MyShake devices. This figure highlights that the top 10 countries with MyShake devices have most of their accelerometer sensors from three manufacturers (invensense, st, and bosch), with varying ratios.
Figure 12: Boxplots of waveform quality metrics for the top 10 countries with MyShake devices. This figure highlights that waveforms from India, Nepal, and Taiwan show a higher likelihood of undersampling, while most countries exhibit similar noise level distributions, except for Chile, which has larger variations in noise levels.

To assess the influence of local environments on ambient noise levels, we explore an additional geographic factor, specifically the proximity to highways. We focus on devices within the United States and calculate the distance of each device to the nearest primary and secondary highways. We then compare devices situated very close to highways (within 1 km) to those located at a considerable distance (beyond 50 km) from highways. As illustrated in Fig. 13, there is a notable disparity in noise levels between the two groups, with devices farther away from highways exhibiting relatively less variation in noise levels. This is particularly evident for the z-axis, where devices situated farther from highways demonstrate a lower median std_z compared to those in close proximity to highways.
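The grouping by highway proximity can be sketched as follows. This is a simplified illustration, not the geoprocessing used in the study: it assumes highways are represented by a list of sampled (latitude, longitude) points rather than polylines, uses the haversine great-circle distance, and treats the 1 km and 50 km cutoffs as inclusive and exclusive bounds, respectively. All names here are hypothetical.

```python
from math import asin, cos, radians, sin, sqrt

EARTH_RADIUS_KM = 6371.0  # mean Earth radius


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in km between two (lat, lon) points in degrees."""
    phi1, phi2 = radians(lat1), radians(lat2)
    dphi = phi2 - phi1
    dlmb = radians(lon2 - lon1)
    a = sin(dphi / 2) ** 2 + cos(phi1) * cos(phi2) * sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_KM * asin(sqrt(a))


def highway_group(device, highway_points, near_km=1.0, far_km=50.0):
    """Label a device 'near', 'far', or None (in between) by its distance
    to the closest sampled highway point (assumed representation)."""
    lat, lon = device
    nearest = min(haversine_km(lat, lon, h_lat, h_lon)
                  for h_lat, h_lon in highway_points)
    if nearest <= near_km:
        return "near"
    if nearest > far_km:
        return "far"
    return None


# Synthetic points on the equator, one degree of longitude is roughly 111 km.
one_degree_km = haversine_km(0.0, 0.0, 0.0, 1.0)
```

Devices labeled None fall between the two cutoffs and would be excluded from the comparison.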

Figure 13: Boxplots of waveform quality metrics of devices very close to (within 1 km) and farther away from (beyond 50 km) highways. This figure highlights the difference in noise levels between devices near highways and those farther away, with the latter group exhibiting less noise variation.

V-D Trigger Time

The quality of waveform data may also be influenced by the time of collection, which is closely associated with users’ daily routines and phone usage patterns. We determine the local hour for each waveform using its corresponding trigger timestamp. The variations in quality metrics across different hours are relatively minor, particularly when compared to hardware-related factors. Nonetheless, certain distinctions related to users’ behavior can still be observed. In Fig. 14, we evaluate waveform data quality across different time periods. Due to increased phone usage, waveforms gathered during the afternoon (12 - 6 pm) and evening (7 - 11 pm) exhibit larger time intervals and slightly higher noise levels compared to those collected during the morning (6 - 11 am).
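Mapping a trigger timestamp to a local hour and then to a time period can be sketched as below. The sketch assumes the trigger time is available as a timezone-aware UTC datetime and that a UTC offset for the device location is known; how the study derives local time is not stated. The hour boundaries follow the periods named above, read as inclusive clock hours, and the "night" label for the remaining hours is an assumption.

```python
from datetime import datetime, timedelta, timezone


def time_period(trigger_utc, utc_offset_hours):
    """Return the time-of-day period for a trigger (hypothetical helper).

    trigger_utc: timezone-aware datetime in UTC.
    utc_offset_hours: assumed known offset of the device's local time zone.
    """
    local = trigger_utc.astimezone(timezone(timedelta(hours=utc_offset_hours)))
    hour = local.hour
    if 6 <= hour <= 11:
        return "morning"
    if 12 <= hour <= 18:
        return "afternoon"
    if 19 <= hour <= 23:
        return "evening"
    return "night"  # hours 0-5, not named in the text


# Synthetic trigger at 20:00 UTC, viewed from three different offsets.
trigger = datetime(2020, 1, 1, 20, 0, tzinfo=timezone.utc)
periods = [time_period(trigger, off) for off in (-8, 0, 10)]
```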

Figure 14: Cumulative distribution function (CDF) of waveform quality metrics categorized by trigger time period. This figure highlights that waveforms collected during afternoon and evening hours exhibit larger time intervals and slightly higher noise levels.

VI Factor Importance

In this section, we predict waveform data quality by taking into account the potential impact factors identified earlier, with the aim of evaluating factor importance. Considering the availability of impact factors, we compile a dataset comprising waveforms that possess a complete set of features. This dataset includes approximately 10 million waveforms from 31 thousand MyShake devices, encompassing 845 phone models across 22 manufacturers, and 78 accelerometer models from 11 different manufacturers. Focusing on waveforms with quality issues, we develop classification models to identify these poor-quality waveforms.

As an initial step, we establish rules for quality control, which serve to distinguish between good- and poor-quality waveforms. Based on the distributions of quality metrics displayed in Fig. 4, we calculate percentiles for each metric and combine them at various levels. Specifically, for a given quality level k%, we compute the lower k% values of n_sample and n_noise, and the higher k% values of std_dt, std_x, std_y, and std_z. These values, which serve as thresholds, are then combined into a condition designed to identify waveforms exhibiting significant undersampling or high noise level issues. We test k values of 10, 15, 20, and 25, where higher k values classify more waveforms as poor-quality.
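The percentile-based rule can be sketched as follows. The text does not say how the six thresholds are combined, so this example assumes a logical OR (a waveform is poor-quality if any metric crosses its threshold) with strict inequalities, and uses a nearest-rank percentile; these choices and all function names are assumptions for illustration.

```python
from math import ceil

LOWER_IS_WORSE = ("n_sample", "n_noise")
HIGHER_IS_WORSE = ("std_dt", "std_x", "std_y", "std_z")


def percentile(values, q):
    """Nearest-rank percentile for 0 < q <= 100."""
    ordered = sorted(values)
    rank = max(1, ceil(q * len(ordered) / 100))
    return ordered[rank - 1]


def qc_thresholds(waveforms, k):
    """Thresholds at quality level k%: the lower k% value for the sample
    counts and the upper k% value for the standard deviations."""
    thresholds = {}
    for name in LOWER_IS_WORSE:
        thresholds[name] = percentile([w[name] for w in waveforms], k)
    for name in HIGHER_IS_WORSE:
        thresholds[name] = percentile([w[name] for w in waveforms], 100 - k)
    return thresholds


def is_poor_quality(waveform, thresholds):
    """Assumed combination rule: any single metric beyond its threshold."""
    return (any(waveform[m] < thresholds[m] for m in LOWER_IS_WORSE)
            or any(waveform[m] > thresholds[m] for m in HIGHER_IS_WORSE))


# Synthetic metrics: waveform i has every metric equal to i (i = 1..100).
all_metrics = LOWER_IS_WORSE + HIGHER_IS_WORSE
toy = [{m: i for m in all_metrics} for i in range(1, 101)]
toy_thresholds = qc_thresholds(toy, 10)
n_poor = sum(is_poor_quality(w, toy_thresholds) for w in toy)
```

Under an OR rule, raising k widens every threshold at once, which is consistent with the statement that higher k values classify more waveforms as poor-quality.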

For quality prediction, we divide the entire dataset into training (70%) and test (30%) sets. We employ a Random Forest binary classifier and utilize weight balancing to address the class imbalance issue (i.e., more good-quality than poor-quality waveforms). For categorical features, such as phone and accelerometer manufacturers, we convert them into numerical values using one-hot encoding. To assess classification performance, we rely on metrics such as precision, recall, and F1-score. Specifically, a true positive (TP) occurs when the predicted poor-quality waveform is from the poor-quality group; otherwise, it is a false positive (FP). When a waveform from the poor-quality group is not predicted as a poor-quality waveform by the model, it is considered a false negative (FN). Using these definitions, we compute the evaluation metrics accordingly:

precision = \frac{|TP|}{|TP| + |FP|}    (1)
recall = \frac{|TP|}{|TP| + |FN|}    (2)
F1 = \frac{2}{precision^{-1} + recall^{-1}}    (3)
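For concreteness, Eqs. (1) to (3) can be computed directly from true and predicted labels. In the sketch below, label 1 denotes a poor-quality waveform (the positive class) and 0 a good-quality one; the function names, the label encoding, and the convention of returning 0 for an undefined ratio are assumptions for this example.

```python
def precision_recall(y_true, y_pred):
    """Precision and recall for the positive class (label 1 = poor quality)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall


def f1_score(precision, recall):
    """Harmonic mean of precision and recall, as in Eq. (3)."""
    if precision <= 0 or recall <= 0:
        return 0.0
    return 2 / (1 / precision + 1 / recall)


# Synthetic labels: 2 true positives, 1 false positive, 2 false negatives.
y_true = [1, 1, 1, 0, 0, 0, 0, 1]
y_pred = [1, 1, 0, 1, 0, 0, 0, 0]
p, r = precision_recall(y_true, y_pred)
f1 = f1_score(p, r)
```

As a quick check of Eq. (3), a precision of 0.84 and a recall of 0.70 give an F1-score of about 0.76.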

Table I presents the classification performance at varying quality control levels (k%). As the quality control level increases, a larger number of waveforms are categorized into the poor-quality group, making it easier for the model to detect them. The 25% quality control level achieves the best prediction performance with an F1-score of 0.76. The 10% level focuses on more extreme cases of undersampling and/or high noise levels, and the impact factors can still provide reasonable predictions for such cases. It is important to note that we do not expect a very high F1-score, as our prediction only includes observable features, while numerous unobservable features could also affect waveform quality.

TABLE I: Comparison of classification performances with different quality control (QC) levels.
QC Level | Precision | Recall | F1-score
10%      | 0.58      | 0.61   | 0.60
15%      | 0.65      | 0.61   | 0.63
20%      | 0.77      | 0.64   | 0.70
25%      | 0.84      | 0.70   | 0.76

We further compare prediction performance using various feature sets, applying the 10% quality control level to target extreme cases. Based on the prediction performances shown in Table II, the most important feature sets are accelerometer model and phone specifications, which include key information about the sensor and phone resources. The location feature set ranks as the third most important. Accelerometer and phone manufacturer features are less effective in predicting poor-quality waveforms, with the time feature being the least important.

TABLE II: Comparison of prediction performances with different feature sets (“Acc” is short for accelerometer).
| Feature Sets (#) | Precision | Recall | F1-score |
| --- | --- | --- | --- |
| All (284) | 0.58 | 0.61 | 0.60 |
| Acc model (78) | 0.44 | 0.63 | 0.52 |
| Phone specifications (3) | 0.48 | 0.56 | 0.51 |
| Location (169) | 0.31 | 0.80 | 0.45 |
| Acc manufacturer (11) | 0.36 | 0.50 | 0.42 |
| Phone manufacturer (22) | 0.45 | 0.39 | 0.42 |
| Time (1) | 0.31 | 0.50 | 0.38 |

VII Impact of Accelerometer Data Quality

In this section, we implement quality control measures for real-world earthquake events to evaluate the influence of accelerometer data quality on earthquake parameter estimation.

We examine four earthquake events (outlined in Table III) as case studies. These events vary in magnitude and in the number of recorded waveforms. We use the method of [24] to estimate earthquake magnitude, which relies on the peak-to-peak amplitude and the time span between peaks in the seismic waveforms. Fig. 15 compares the absolute magnitude estimation errors obtained at different quality control levels with those obtained without quality control. In general, applying quality control to filter out poor-quality waveforms helps reduce the absolute error in magnitude estimation. A 25% quality control level yields the smallest errors for all four earthquake events. However, the effects of quality control differ among these events. The Borrego and Oklahoma earthquakes exhibit a similar pattern, with higher quality control levels leading to further error reduction. The Morocco earthquake, which has only 6 waveforms, is an exception to the general trend at the 10% and 15% quality control levels. The Berkeley earthquake, with its relatively small magnitude, shows less variation in errors across quality control levels.

TABLE III: Characteristics of the four selected earthquake events.
| Event Name | Time | Magnitude | # Waveforms |
| --- | --- | --- | --- |
| Borrego | 2016-06-10 | M5.2 | 103 |
| Berkeley | 2018-01-04 | M4.4 | 63 |
| Oklahoma | 2016-09-03 | M5.8 | 16 |
| Morocco | 2016-03-15 | M5.6 | 6 |
Figure 15: Comparison of absolute magnitude estimation errors with different quality control levels versus no quality control.
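The comparison in this section can be sketched as follows: drop the waveforms flagged as poor quality, aggregate the remaining per-waveform magnitude estimates into an event magnitude, and compare the absolute error against the catalog magnitude. This is only an illustrative sketch. The per-waveform estimates are made-up numbers, the median aggregation is an assumption, and the magnitude estimation method of [24] itself is not reproduced here.

```python
from statistics import median
from typing import Sequence


def event_magnitude(estimates: Sequence[float],
                    is_poor: Sequence[bool],
                    apply_qc: bool) -> float:
    """Aggregate per-waveform magnitude estimates into one event magnitude.

    With apply_qc=True, waveforms flagged as poor quality are discarded first.
    The median is an assumed aggregation rule, chosen here for robustness.
    """
    kept = [m for m, poor in zip(estimates, is_poor) if not (apply_qc and poor)]
    if not kept:
        raise ValueError("no waveforms left after quality control")
    return median(kept)


def absolute_error(estimated: float, catalog: float) -> float:
    """Absolute difference between estimated and catalog magnitude."""
    return abs(estimated - catalog)


# Hypothetical data: five waveforms from one event with catalog magnitude 5.2.
catalog_magnitude = 5.2
estimates = [5.1, 5.3, 5.2, 6.4, 6.6]
is_poor = [False, False, False, True, True]  # e.g. flagged at some k% level

err_no_qc = absolute_error(event_magnitude(estimates, is_poor, False), catalog_magnitude)
err_qc = absolute_error(event_magnitude(estimates, is_poor, True), catalog_magnitude)
print(f"without QC: {err_no_qc:.2f}, with QC: {err_qc:.2f}")
```

For events with very few waveforms, such as the Morocco case, removing even one or two waveforms changes the aggregate substantially, which is one plausible reason the effect of quality control is less stable there.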

VIII Conclusions and Future Work

In this study, we conduct a comprehensive analysis of accelerometer data quality from a global smartphone-based seismic network known as MyShake. We investigate quality issues in the collected waveform data, employing various metrics to assess their sampling rates and noise levels. Additionally, we explore diverse factors, such as phone and accelerometer manufacturer, phone specifications, geolocation, and time, examining their correlation with data quality. Utilizing these factors, we develop quality classification models to identify poor-quality waveforms and assess the importance of various impact factors. Finally, by applying various quality control levels to the collected waveforms, we reveal the influence of data quality on earthquake parameter estimation and present strategies for mitigating these effects.

Limitations and Future Work. In this analysis, we examine approximately four years of accelerometer data gathered by the MyShake system, with waveforms from global devices continuing to accumulate. We focus on key factors influencing accelerometer data quality, recognizing that additional factors and conditions warrant further exploration. Understanding the various impact factors on sensing quality can inform the development of improved strategies to address sensing heterogeneity in real-world applications. Our current quality control analysis investigates only four earthquake events. As future work, we aim to include a broader range of earthquake events and assess the effects of data quality on the estimation of their key parameters. By considering diverse event characteristics, we can better understand data quality impacts and design more efficient methods to address real-world quality issues. We aspire to make the MyShake system quality-aware, ultimately monitoring data quality in real time and selectively integrating the data into the application pipeline.

Acknowledgment

The California Governor’s Office of Emergency Services (Cal OES) funded MyShake through grant 6142-2018 to the Berkeley Seismological Laboratory. Part of this work was performed under the auspices of the U.S. Department of Energy by Lawrence Livermore National Laboratory under Contract Number DE-AC52-07NA27344. Any opinions, findings, conclusions, or recommendations expressed in this publication are those of the authors and do not necessarily reflect those of the supporting agencies. This is LLNL Contribution Number LLNL-JRNL-847732. We thank the global MyShake users and the MyShake team at Berkeley for this wonderful project.

References

  • [1] S. Tan, X. Wang, G. Maier, and W. Li, “Riding quality evaluation through mobile crowd sensing,” in 2016 IEEE International Conference on Pervasive Computing and Communications (PerCom). IEEE, 2016, pp. 1–6.
  • [2] A. Allouch, A. Koubâa, T. Abbes, and A. Ammar, “Roadsense: Smartphone application to estimate road conditions using accelerometer and gyroscope,” IEEE Sensors Journal, vol. 17, no. 13, pp. 4231–4238, 2017.
  • [3] M. Elhamshary, M. Youssef, A. Uchiyama, H. Yamaguchi, and T. Higashino, “Crowdmeter: Congestion level estimation in railway stations using smartphones,” in 2018 IEEE International Conference on Pervasive Computing and Communications (PerCom). IEEE, 2018, pp. 1–12.
  • [4] R. Wang, W. Wang, M. Obuchi, E. Scherer, R. Brian, D. Ben-Zeev, T. Choudhury, J. Kane, M. Hauser, M. Walsh et al., “On predicting relapse in schizophrenia using mobile sensing in a randomized control trial,” in 2020 IEEE International Conference on Pervasive Computing and Communications (PerCom). IEEE, 2020, pp. 1–8.
  • [5] Q. Kong, R. M. Allen, L. Schreier, and Y.-W. Kwon, “MyShake: A smartphone seismic network for earthquake early warning and beyond,” Science Advances, vol. 2, no. 2, p. e1501055, 2016.
  • [6] R. M. Allen and D. Melgar, “Earthquake early warning: Advances, scientific challenges, and societal needs,” Annual Review of Earth and Planetary Sciences, vol. 47, pp. 361–388, 2019.
  • [7] P. Earle, M. Guy, R. Buckmaster, C. Ostrum, S. Horvath, and A. Vaughan, “OMG Earthquake! Can Twitter Improve Earthquake Response?” Seismological Research Letters, vol. 81, no. 2, pp. 246–251, 2010. [Online]. Available: http://srl.geoscienceworld.org/cgi/doi/10.1785/gssrl.81.2.246
  • [8] S. E. Minson, B. A. Brooks, C. L. Glennie, J. R. Murray, J. O. Langbein, S. E. Owen, T. H. Heaton, R. A. Iannucci, and D. L. Hauser, “Crowdsourced earthquake early warning,” Science Advances, vol. 1, no. 3, p. e1500036, 2015.
  • [9] F. Finazzi, “The earthquake network project: Toward a crowdsourced smartphone-based earthquake early warning system,” Bulletin of the Seismological Society of America, vol. 106, no. 3, pp. 1088–1099, 2016.
  • [10] R. Bossu, F. Roussel, L. Fallou, M. Landès, R. Steed, G. Mazet-Roux, A. Dupont, L. Frobert, and L. Petersen, “LastQuake: From Rapid Information to Global Seismic Risk Reduction,” International Journal of Disaster Risk Reduction, vol. 28, pp. 32–42, 2018. [Online]. Available: http://linkinghub.elsevier.com/retrieve/pii/S2212420918302097
  • [11] J. R. Kwapisz, G. M. Weiss, and S. A. Moore, “Activity recognition using cell phone accelerometers,” ACM SigKDD Explorations Newsletter, vol. 12, no. 2, pp. 74–82, 2011.
  • [12] R. Xu, S. Zhou, and W. J. Li, “MEMS accelerometer based nonspecific-user hand gesture recognition,” IEEE Sensors Journal, vol. 12, no. 5, pp. 1166–1173, 2011.
  • [13] A. Bayat, M. Pomplun, and D. A. Tran, “A study on human activity recognition using accelerometer data from smartphones,” Procedia Computer Science, vol. 34, pp. 450–457, 2014.
  • [14] M. A. A. H. Khan, N. Roy, and A. Misra, “Scaling human activity recognition via deep learning-based domain adaptation,” in 2018 IEEE International Conference on Pervasive Computing and Communications (PerCom). IEEE, 2018, pp. 1–9.
  • [15] R. M. Allen, “Transforming earthquake detection?” Science, vol. 335, no. 6066, pp. 297–298, 2012.
  • [16] Q. Kong, R. M. Allen, and L. Schreier, “MyShake: Initial observations from a global smartphone seismic network,” Geophysical Research Letters, vol. 43, no. 18, pp. 9588–9594, 2016.
  • [17] Q. Kong, R. Martin-Short, and R. M. Allen, “Toward global earthquake early warning with the MyShake smartphone seismic network, part 1: Simulation platform and detection algorithm,” Seismological Research Letters, vol. 91, no. 4, pp. 2206–2217, 2020.
  • [18] J. A. Strauss, Q. Kong, S. Pothan, S. Thompson, R. F. Mejia, S. Allen, S. Patel, and R. M. Allen, “MyShake citizen seismologists help launch dual-use seismic network in California,” Frontiers in Communication, p. 32, 2020.
  • [19] R. M. Allen and M. Stogaitis, “Global growth of earthquake early warning,” Science, vol. 375, no. 6582, pp. 717–718, 2022.
  • [20] R. M. Allen, Q. Kong, and R. Martin-Short, “The MyShake platform: A global vision for earthquake early warning,” Pure and Applied Geophysics, vol. 177, no. 4, pp. 1699–1712, 2020.
  • [21] Q. Kong, Q. Lv, and R. M. Allen, “Earthquake early warning and beyond: Systems challenges in smartphone-based seismic network,” in Proceedings of the 20th International Workshop on Mobile Computing Systems and Applications, 2019, pp. 57–62.
  • [22] A. Stisen, H. Blunck, S. Bhattacharya, T. S. Prentow, M. B. Kjærgaard, A. Dey, T. Sonne, and M. M. Jensen, “Smart devices are different: Assessing and mitigating mobile sensing heterogeneities for activity recognition,” in Proceedings of the 13th ACM Conference on Embedded Networked Sensor Systems, 2015, pp. 127–140.
  • [23] H. Blunck, S. Bhattacharya, A. Stisen, T. S. Prentow, M. B. Kjærgaard, A. Dey, M. M. Jensen, and T. Sonne, “Activity recognition on smart devices: Dealing with diversity in the wild,” GetMobile: Mobile Computing and Communications, vol. 20, no. 1, pp. 34–38, 2016.
  • [24] Q. Kong, S. Patel, A. Inbal, and R. M. Allen, “Assessing the sensitivity and accuracy of the MyShake smartphone seismic network to detect and characterize earthquakes,” Seismological Research Letters, vol. 90, no. 5, pp. 1937–1949, 2019.
  • [25] T. Sakaki, M. Okazaki, and Y. Matsuo, “Earthquake shakes Twitter users: Real-time event detection by social sensors,” in Proceedings of the 19th International Conference on World Wide Web, 2010, pp. 851–860.
  • [26] ——, “Tweet analysis for real-time event detection and earthquake reporting system development,” IEEE Transactions on Knowledge and Data Engineering, vol. 25, no. 4, pp. 919–931, 2012.
  • [27] M. Avvenuti, S. Cresci, A. Marchetti, C. Meletti, and M. Tesconi, “EARS (earthquake alert and report system): A real time decision support system for earthquake crisis management,” in Proceedings of the 20th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, 2014, pp. 1749–1758.
  • [28] S. Dashti, J. D. Bray, J. Reilly, S. Glaser, A. Bayen, and E. Mari, “Evaluating the reliability of phones as seismic monitoring instruments,” Earthquake Spectra, vol. 30, no. 2, pp. 721–742, 2014.
  • [29] “Connected cars detect Mexico earthquake,” 2017. [Online]. Available: https://www.geotab.com/blog/connected-cars-detect-earthquake/
  • [30] G. J. Martinez, S. M. Mattingly, S. Mirjafari, S. K. Nepal, A. T. Campbell, A. K. Dey, and A. D. Striegel, “On the quality of real-world wearable data in a longitudinal study of information workers,” in 2020 IEEE International Conference on Pervasive Computing and Communications Workshops (PerCom Workshops). IEEE, 2020, pp. 1–6.
  • [31] C. Min, A. Montanari, A. Mathur, and F. Kawsar, “A closer look at quality-aware runtime assessment of sensing models in multi-device environments,” in Proceedings of the 17th Conference on Embedded Networked Sensor Systems, 2019, pp. 271–284.
  • [32] M. Zhang, T. Wo, and T. Xie, “A platform solution of data-quality improvement for internet-of-vehicle services,” in 2018 IEEE International Conference on Pervasive Computing and Communications (PerCom). IEEE, 2018, pp. 1–7.
  • [33] X. Yin, G. Shen, X. Wang, and W. Shen, “Mitigating sensor differences for phone-based human activity recognition,” in 2016 IEEE International Conference on Systems, Man, and Cybernetics (SMC). IEEE, 2016, pp. 3550–3555.
  • [34] S. Chuprov, L. Reznik, I. Khokhlov, and K. Manghi, “Multi-modal sensor selection with genetic algorithms,” in 2022 IEEE Sensors. IEEE, 2022, pp. 1–4.
  • [35] M. Janidarmian, A. Roshan Fekr, K. Radecka, and Z. Zilic, “A comprehensive analysis on wearable acceleration sensors in human activity recognition,” Sensors, vol. 17, no. 3, p. 529, 2017.