Detects and flags potential data quality issues in activity stream data, including HR/power spikes, GPS drift, and identifies steady-state segments suitable for physiological metrics calculation.
Arguments
- streams
A data frame containing activity stream data with time-series measurements. Expected columns:
time
(seconds),heartrate
(bpm),watts
(W),velocity_smooth
orspeed
(m/s),distance
(m).- sport
Type of activity (e.g., "Run", "Ride"). Default "Run".
- hr_range
Valid heart rate range as c(min, max). Default c(30, 220).
- pw_range
Valid power range as c(min, max). Default c(0, 1500).
- max_run_speed
Maximum plausible running speed in m/s. Default 7.0 (≈2:23/km).
- max_ride_speed
Maximum plausible riding speed in m/s. Default 25.0 (≈90 km/h).
- max_accel
Maximum plausible acceleration in m/s². Default 3.0.
- max_hr_jump
Maximum plausible HR change per second (bpm/s). Default 10.
- max_pw_jump
Maximum plausible power change per second (W/s). Default 300.
- min_steady_minutes
Minimum duration (minutes) for steady-state segment. Default 20.
- steady_cv_threshold
Coefficient of variation threshold for steady-state (\%). Default 8.
Value
A data frame identical to streams
with additional flag columns:
- flag_hr_spike
Logical. TRUE if HR is out of range or has excessive jump.
- flag_pw_spike
Logical. TRUE if power is out of range or has excessive jump.
- flag_gps_drift
Logical. TRUE if speed or acceleration is implausible.
- flag_any
Logical. TRUE if any quality flag is raised.
- is_steady_state
Logical. TRUE if segment meets steady-state criteria.
- quality_score
Numeric 0-1. Proportion of clean data (1 = perfect).
Details
This function performs several quality checks:
HR/Power Spikes: Flags values outside physiological ranges or with sudden jumps (Delta HR > 10 bpm/s, Delta P > 300 W/s).
GPS Drift: Flags implausible speeds or accelerations based on sport type.
Steady-State Detection: Identifies segments with low variability (CV < 8%) lasting >= 20 minutes, suitable for EF/decoupling calculations.
The function is sport-aware and adjusts thresholds accordingly. All thresholds are configurable to accommodate different athlete profiles and data quality.
Examples
if (FALSE) { # \dontrun{
# Create sample activity stream data
stream_data <- data.frame(
time = seq(0, 3600, by = 1),
heartrate = rnorm(3601, mean = 150, sd = 10),
watts = rnorm(3601, mean = 200, sd = 20),
velocity_smooth = rnorm(3601, mean = 3.5, sd = 0.3)
)
# Flag quality issues
flagged_data <- flag_quality(stream_data, sport = "Run")
# Check summary
summary(flagged_data$quality_score)
table(flagged_data$flag_any)
} # }