Real-time personal trainer on the SMACK stack



@honzam399 Jan Machacek 

@anirvan_c Anirvan Chakraborty 

© 2016 Cake Solutions Limited CC BY-NC-SA 4.0
Automated personal trainer - muvr
• Suggests the sequence of exercise sessions
• Suggests exercises in a session, including exercise parameters (e.g. weight, repetitions, …)
• Provides tips on proper exercise form
• With additional hardware (smartwatch, smart clothes), muvr provides
  • Completely unobtrusive exercise experience
  • More accurate tips on proper exercise form
• With over–fitting, it is usable for physiotherapy
Architecture
Privacy
The technologies—iOS
• Learns the users’ behaviour
  • Exercise sessions
  • Exercises within exercise session
  • Short–term prediction of [scalar] labels for the exercises
• Performs the real–time analysis of the incoming sensor data
  • Advised by the expected behaviour
  • Signal processing to compute repetitions / strokes
  • Forward–propagation to label the exercise
• Submits all recorded sensor data and confirmed (!) labels per session
• Handles offline / travel modes
  • Synchronises the data across the user’s devices using iCloud
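The signal-processing step that computes repetitions could look roughly like the sketch below. It assumes a single-axis accelerometer trace sampled at 50 Hz. The function names, the moving-average window and the mean-crossing rule are illustrative assumptions, not muvr's actual algorithm.

```python
import math


def moving_average(samples, window):
    """Smooth the signal with a simple trailing moving average."""
    smoothed = []
    running = 0.0
    for i, value in enumerate(samples):
        running += value
        if i >= window:
            running -= samples[i - window]
        smoothed.append(running / min(i + 1, window))
    return smoothed


def count_repetitions(samples, window=5):
    """Count upward crossings of the mean in the smoothed signal.

    Assumption: each repetition of a periodic exercise pushes the
    signal through its mean exactly once in the upward direction.
    """
    smoothed = moving_average(samples, window)
    mean = sum(smoothed) / len(smoothed)
    crossings = 0
    for previous, current in zip(smoothed, smoothed[1:]):
        if previous < mean <= current:
            crossings += 1
    return crossings


# Synthetic trace: 5 repetitions, one per second, sampled at 50 Hz.
SAMPLING_RATE_HZ = 50
signal = [math.sin(2 * math.pi * (t / SAMPLING_RATE_HZ) - math.pi / 2)
          for t in range(5 * SAMPLING_RATE_HZ)]
repetitions = count_repetitions(signal)
```

Real sensor data is noisier than this synthetic sine wave, so a production implementation would likely need stronger filtering and a minimum distance between counted repetitions.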
The technologies—Akka
• Reactive services for user profiles, model parameters, and sensor data
• CQRS/ES implementation, which helps to
  • Handle peaks in load
  • Handle failures of individual nodes
  • Reason about the scope of the mutable state we keep
• Uses Cassandra for its journal and snapshot stores
  • The written values are binary “blobs”
• Writes the sensor data to Cassandra
  • Writes the sensor data in “readable” form; it can be read outside the Akka / Scala world
• Reads the model and exercise parameters from Cassandra
  • It selects the best / newest model parameters to serve to the mobile app
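Selecting the "best / newest" model parameters could be expressed as below. This is a minimal sketch: the `ModelParameters` fields, the model identifiers and the rule (highest accuracy first, newest as tie-breaker) are assumptions for illustration; the slides do not say how the two criteria are weighed.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass(frozen=True)
class ModelParameters:
    model_id: str          # hypothetical identifier
    trained_at: datetime   # when the training program saved the model
    accuracy: float        # result written by the evaluation program


def select_model(candidates):
    """Pick the most accurate model; break ties by the newest one."""
    if not candidates:
        return None
    return max(candidates, key=lambda m: (m.accuracy, m.trained_at))


candidates = [
    ModelParameters("arms-v1", datetime(2016, 1, 10), 0.91),
    ModelParameters("arms-v2", datetime(2016, 2, 1), 0.94),
    ModelParameters("arms-v3", datetime(2016, 2, 15), 0.94),
]
best = select_model(candidates)
```

Here `arms-v2` and `arms-v3` tie on accuracy, so the newer `arms-v3` is served.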
The technologies—Spark
• Distributed computation framework
  • “Big data” tasks
  • Integrates extremely well with Cassandra
• Reads and processes the profiles and sensor data
  • Identifies clusters of users based on their profile information
  • Slices the sensor inputs by sensor types
  • Writes the results to another store
• Runs in batches
  • Executes by schedule (typically once a day)
The technologies—neon
• A machine learning framework, including
  • “The usual” suspects in tensor algebra
  • Signal processing
  • Different ML approaches
• Training and evaluation programs
  • Both programs terminate either upon discovering the perfect model or when their budget is up
  • Reads clustered training and testing data from the Spark job
  • Writes the model parameters and evaluation result to Cassandra
The technologies—Cassandra
• Underpins the entire platform
  • Journal and snapshot store for Akka
  • Sensor data store
  • Model parameter store
  • “Summary” store
• High availability
• No single point of failure
• High read and write throughput
• Replication factor
• Tuneable consistency level
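How replication factor and tuneable consistency interact can be shown with a little arithmetic. The sketch below uses the standard rule that a read is guaranteed to see the latest acknowledged write when the number of replicas read plus the number written exceeds the replication factor. The function names are illustrative.

```python
def quorum(replication_factor):
    """Number of replicas that make up a QUORUM."""
    return replication_factor // 2 + 1


def reads_see_latest_write(replication_factor, write_replicas, read_replicas):
    """True when every read set overlaps every write set (R + W > RF)."""
    return read_replicas + write_replicas > replication_factor


RF = 3
q = quorum(RF)

# QUORUM writes plus QUORUM reads overlap in at least one replica.
strong = reads_see_latest_write(RF, write_replicas=q, read_replicas=q)

# ONE / ONE is faster but may return stale data.
weak = reads_see_latest_write(RF, write_replicas=1, read_replicas=1)
```

With a replication factor of 3, a quorum is 2 replicas, so QUORUM reads and writes tolerate the loss of one node while still overlapping.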
Spark & Cassandra
• Group the sensor data into n clusters by user profile with biometric ID
• Expand the sensor data
  • Slices of the sensor data by combinations of accelerometer, gyroscope, heart rate, targeted muscle group strain gauges, …
  • 1 user = 1 MiB from one sensor per hour; but 4 sensors expand into 4! MiB
• Trivial tasks
  • The most popular user–contributed exercises
  • The most popular exercise sessions and exercises within the sessions
  • The most effective (by overall fitness improvement, weight loss, muscle mass gain, …) exercise sessions
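Slicing the sensor data by combinations of sensors can be enumerated as in the sketch below. It assumes a slice is an unordered, non-empty subset of the available sensors; the sensor names are taken from the examples on the slide and the function name is hypothetical. It does not reproduce the slide's size estimate; it only shows how the number of slices grows with the number of sensors.

```python
from itertools import combinations

# Sensor names are illustrative; the slide lists these four as examples.
SENSORS = ["accelerometer", "gyroscope", "heart_rate", "strain_gauge"]


def sensor_slices(sensors):
    """Return every non-empty, unordered combination of the sensors."""
    slices = []
    for size in range(1, len(sensors) + 1):
        slices.extend(combinations(sensors, size))
    return slices


slices = sensor_slices(SENSORS)
```

Under this assumption, n sensors produce 2^n - 1 slices, so four sensors yield 15 and every additional sensor roughly doubles the work for the Spark job.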
Production ML
Take the data from Cassandra (written there by the Spark jobs) and:
• Split into training and test datasets
• Fit models for various sensor types
• Save model parameters
• Evaluate the newly fitted models, and re-evaluate old data
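One way to split the data into training and test datasets is sketched below. It assumes rows are `(user_id, payload)` pairs and assigns each user to one side by hashing the user ID, so a user's sessions never appear in both sets and the assignment stays stable between daily runs. The row shape, the function names and the 20% test fraction are assumptions, not details from the talk.

```python
import hashlib


def assign_split(user_id, test_fraction=0.2):
    """Deterministically assign a user to 'train' or 'test'."""
    digest = hashlib.sha256(user_id.encode("utf-8")).digest()
    # Map the first 8 bytes of the hash to a number in [0, 1).
    bucket = int.from_bytes(digest[:8], "big") / 2 ** 64
    return "test" if bucket < test_fraction else "train"


def split_rows(rows, test_fraction=0.2):
    """Split (user_id, payload) rows into training and test lists."""
    train, test = [], []
    for user_id, payload in rows:
        side = assign_split(user_id, test_fraction)
        (test if side == "test" else train).append((user_id, payload))
    return train, test


rows = [("alice", 1), ("bob", 2), ("alice", 3), ("carol", 4)]
train, test = split_rows(rows)
```

Splitting by user rather than by row avoids leaking one person's movement patterns from the training set into the test set.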
Production ML
• We are using a convolutional network
  • 2 seconds of sensor data input (e.g. a @ 50 Hz for accelerometer; a, g @ 50 Hz for accelerometer + gyroscope; u, l @ 10 Hz for smart clothes)
  • The exercise classes as the outputs
• The training program
  • CNN in neon
  • Loads the mini–batches from Cassandra
  • Fits the model; evaluates the fitted model
  • Saves the model parameters into Cassandra
• The re–evaluation program
  • Re–evaluates past n models against the latest training dataset; computing accuracy, precision, recall, f1
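The four figures the re-evaluation program computes can be derived from expected and predicted labels as in this sketch. It treats one exercise class as the positive label; the exercise names and the function name are made up for illustration.

```python
def classification_metrics(expected, predicted, positive_label):
    """Compute accuracy, precision, recall and F1 for one exercise class."""
    pairs = list(zip(expected, predicted))
    tp = sum(1 for e, p in pairs if e == positive_label and p == positive_label)
    fp = sum(1 for e, p in pairs if e != positive_label and p == positive_label)
    fn = sum(1 for e, p in pairs if e == positive_label and p != positive_label)
    correct = sum(1 for e, p in pairs if e == p)

    accuracy = correct / len(pairs) if pairs else 0.0
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


expected = ["curl", "curl", "press", "press", "curl", "row"]
predicted = ["curl", "press", "press", "press", "curl", "curl"]
metrics = classification_metrics(expected, predicted, "curl")
```

For a multi-class model the same function would be called once per exercise class, and the per-class results averaged if a single number is needed.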
Having code is jolly good
Running it
• Simplicity
• Ease of orchestration
• Ease of development
• Support for polyglot frameworks and components
• Cost-effective resource utilisation
Docker
• Deploy reliably & consistently
• Execution is fast and lightweight
• Simplicity
• Developer-friendly workflow
• Fantastic community
Dockerize Cassandra Dev Environment
• Super low memory settings in cassandra-env.sh
  • MAX_HEAP_SIZE="128M"
  • HEAP_NEWSIZE="24M"
• Remove caches in dev mode in cassandra.yaml
  • key_cache_size_in_mb: 0
  • reduce_cache_sizes_at: 0
  • reduce_cache_capacity_to: 0
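An entry-point script could apply the cache settings above to the Cassandra configuration file before starting the node. The sketch below does this with plain text rewriting. The helper name and the line-matching approach are assumptions, and it only handles top-level `key: value` lines, not nested YAML.

```python
import re

# Values from the slide; intended for a development container only.
DEV_OVERRIDES = {
    "key_cache_size_in_mb": "0",
    "reduce_cache_sizes_at": "0",
    "reduce_cache_capacity_to": "0",
}


def apply_overrides(yaml_text, overrides):
    """Rewrite top-level `key: value` lines, appending keys that are missing."""
    lines = yaml_text.splitlines()
    remaining = dict(overrides)
    for index, line in enumerate(lines):
        # Match `key:` at column 0, optionally commented out with `#`.
        match = re.match(r"^(?:#\s?)?([A-Za-z_]+):", line)
        if match and match.group(1) in remaining:
            key = match.group(1)
            lines[index] = f"{key}: {remaining.pop(key)}"
    lines.extend(f"{key}: {value}" for key, value in remaining.items())
    return "\n".join(lines) + "\n"


sample = "cluster_name: 'dev'\nkey_cache_size_in_mb:\n# reduce_cache_sizes_at: 0.85\n"
patched = apply_overrides(sample, DEV_OVERRIDES)
```

Which of these keys a given Cassandra release accepts depends on its version, so the overrides should be checked against the configuration file shipped with the image.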
Dockerize Cassandra Production
• Use host networking (--net=host) for better network performance
• Put the data, commitlog and saved_caches directories on volumes mounted from the underlying host
• Run Cassandra in the foreground (-f)
• Tune JVM heap for optimal size
• Tune JVM garbage collector for your workload
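The production settings above translate into a `docker run` invocation along the lines of the sketch below, which only builds the argument list and does not execute anything. The image name, host directory, heap values and the `/var/lib/cassandra` layout inside the container are assumptions. Whether `cassandra -f` must be passed explicitly depends on the image's entry point, and the heap variables take effect only if the image's `cassandra-env.sh` honours them.

```python
def cassandra_docker_command(image, host_dir, max_heap="8G", heap_new="800M"):
    """Build the argument list for `docker run`; nothing is executed here."""
    container_dir = "/var/lib/cassandra"  # assumed data root inside the image
    command = ["docker", "run", "--detach", "--net=host"]
    # Keep data, commit log and saved caches on the host file system.
    for name in ("data", "commitlog", "saved_caches"):
        command += ["--volume", f"{host_dir}/{name}:{container_dir}/{name}"]
    command += [
        # Placeholder heap sizes, not recommendations.
        "--env", f"MAX_HEAP_SIZE={max_heap}",
        "--env", f"HEAP_NEWSIZE={heap_new}",
        image,
        "cassandra", "-f",  # run in the foreground so Docker tracks the process
    ]
    return command


command = cassandra_docker_command("example/cassandra", "/mnt/cassandra")
```

The resulting list can be passed to `subprocess.run` or joined into a shell command once the values have been adjusted for the target hosts.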
Mesos
• Distributed systems kernel
• Scales to 10,000s of nodes
• Depends on ZooKeeper for fault tolerance and high availability
• Creates a highly available, scalable single resource pool
• Automatic failover
• Ease of management
• Simple to operate
• Support for Docker containers
Mesos architecture
image source: https://assets.digitalocean.com/articles/mesosphere/mesos_architecture.png
Cassandra on Mesos
• Running Cassandra as Docker containers
• Custom Dockerfile and entry-point script to control Cassandra configuration
• Marathon to initialize and control the containers
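A Marathon application definition for the Cassandra containers might be assembled as in the sketch below. The field names follow the Marathon app JSON format as commonly documented for Docker containers, but they should be verified against the Marathon version in use. The app ID, image name, resource figures and paths are hypothetical.

```python
import json


def marathon_app(image, instances, cpus=2.0, mem_mib=8192):
    """Return a Marathon app definition as a plain dictionary."""
    return {
        "id": "/cassandra",
        "instances": instances,
        "cpus": cpus,
        "mem": mem_mib,
        "container": {
            "type": "DOCKER",
            "docker": {"image": image, "network": "HOST"},
            "volumes": [{
                "containerPath": "/var/lib/cassandra",
                "hostPath": "/mnt/cassandra",
                "mode": "RW",
            }],
        },
        # At most one Cassandra node per Mesos agent.
        "constraints": [["hostname", "UNIQUE"]],
    }


app = marathon_app("example/cassandra", 3)
app_json = json.dumps(app, indent=2)
```

The JSON document would then be submitted to Marathon's REST API, which starts the requested number of instances and restarts them when they fail.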
Cost-effective resources in AWS
• Embrace AWS spot instances
• About 50-60% cheaper than on-demand instances
• Can be reclaimed without notice if outbid
• Run dev and staging on spot instances
• Run Spark jobs on spot instances
Thanks!
Twitter: @cakesolutions

Tel: 0845 617 1200
Email: enquiries@cakesolutions.net
Jobs: http://www.cakesolutions.net/careers
