OSCON keynote: Working hard to keep it simple
- 2. The Challenge The world of mainstream software is changing: Moore’s law is now achieved by increasing the number of cores, not clock cycles. Huge-volume workloads require horizontal scaling. “PPP” Grand Challenge. Data from Kunle Olukotun, Lance Hammond, Herb Sutter, Burton Smith, Chris Batten, and Krste Asanovic
- 3. Concurrency and Parallelism Parallel programming: execute programs faster on parallel hardware. Concurrent programming: manage concurrent execution threads explicitly. Both are too hard!
- 4. The Root of The Problem Non-determinism caused by concurrent threads accessing shared mutable state. It helps to encapsulate state in actors or transactions, but the fundamental problem stays the same. So, non-determinism = parallel processing + mutable state To get deterministic processing, avoid the mutable state! Avoiding mutable state means programming functionally . var x = 0 async { x = x + 1 } async { x = x * 2 } // can give 0, 1, 2
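The three possible results of the slide's snippet can be checked by enumerating schedules. The sketch below is an illustration, not code from the talk: it assumes each `async` block behaves as a separate read step followed by a write step, and the names `InterleavingDemo`, `Read`, `Write` and `run` are made up for this example. Because every schedule is evaluated purely, the enumeration itself is deterministic.

```scala
object InterleavingDemo {
  // Assumption: each task is two atomic steps, read x into a local, then write f(local) back.
  sealed trait Step
  final case class Read(task: Int) extends Step
  final case class Write(task: Int) extends Step

  // The two updates from the slide: task 0 computes x + 1, task 1 computes x * 2.
  val updates: Vector[Int => Int] = Vector(_ + 1, _ * 2)

  // All ways to merge two step sequences while preserving each sequence's own order.
  def interleavings(a: List[Step], b: List[Step]): List[List[Step]] = (a, b) match {
    case (Nil, _) => List(b)
    case (_, Nil) => List(a)
    case (x :: xs, y :: ys) =>
      interleavings(xs, b).map(x :: _) ++ interleavings(a, ys).map(y :: _)
  }

  // Run one schedule without any real threads: the state is (shared x, per-task local copy).
  def run(schedule: List[Step]): Int = {
    val (finalX, _) = schedule.foldLeft((0, Map.empty[Int, Int])) {
      case ((x, locals), Read(t))  => (x, locals + (t -> x))
      case ((_, locals), Write(t)) => (updates(t)(locals(t)), locals)
    }
    finalX
  }

  // Final values of x over all six schedules.
  val outcomes: Set[Int] =
    interleavings(List(Read(0), Write(0)), List(Read(1), Write(1))).map(run).toSet
}
```

With immutable values there is nothing to interleave: `(0 + 1) * 2` and `(0 * 2) + 1` are two different expressions, each with exactly one result.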
- 5. Space vs Time Time (imperative/concurrent) Space (functional/parallel)
- 6. Scala is a Unifier Agile, with lightweight syntax Object-Oriented Scala Functional Safe and performant, with strong static typing
- 7. Scala is a Unifier Agile, with lightweight syntax Parallel Object-Oriented Scala Functional Sequential Safe and performant, with strong static typing
- 9. Some adoption vectors: Web platforms Trading platforms Financial modeling Simulation Fast to first product, scalable afterwards
- 11. Different Tools for Different Purposes Parallelism: Parallel Collections, Collections, Distributed Collections, Parallel DSLs. Concurrency: Actors, Software transactional memory, Akka, Futures.
- 13. A class ... ... in Java: public class Person { public final String name; public final int age; Person(String name, int age) { this.name = name; this.age = age; } } ... in Scala: class Person(val name: String, val age: Int)
- 14. ... and its usage ... in Java: import java.util.ArrayList; ... Person[] people; Person[] minors; Person[] adults; { ArrayList<Person> minorsList = new ArrayList<Person>(); ArrayList<Person> adultsList = new ArrayList<Person>(); for (int i = 0; i < people.length; i++) (people[i].age < 18 ? minorsList : adultsList).add(people[i]); minors = minorsList.toArray(new Person[0]); adults = adultsList.toArray(new Person[0]); } ... in Scala: val people: Array[Person] val (minors, adults) = people partition (_.age < 18) Callouts: a simple pattern match ((minors, adults)), an infix method call (partition), a function value (_.age < 18)
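A runnable version of the Scala side of this slide, with made-up sample data (the names and ages are illustrative only). It also spells out the three features the slide's callouts point at, as a rough desugaring rather than what the compiler literally emits.

```scala
object PartitionDemo {
  // Same shape as the slide's Person; a case class is used here only for readable equality and printing.
  final case class Person(name: String, age: Int)

  // Hypothetical sample data.
  val people: Array[Person] =
    Array(Person("Ann", 12), Person("Bob", 40), Person("Cy", 17), Person("Di", 18))

  // The slide's one-liner: pattern match on the result tuple, infix call, placeholder function.
  val (minors, adults) = people partition (_.age < 18)

  // The same thing written out step by step.
  val isMinor: Person => Boolean = (p: Person) => p.age < 18   // the function value
  val both: (Array[Person], Array[Person]) = people.partition(isMinor) // ordinary method call
  val minorsExplicit: Array[Person] = both._1                  // what the tuple pattern binds
  val adultsExplicit: Array[Person] = both._2
}
```

Note that `partition` preserves the original order within each half, and that 18 counts as an adult because the predicate is strict.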
- 15. Going Parallel ... in Java: ? ... in Scala: val people: Array[Person] val (minors, adults) = people.par partition (_.age < 18)
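In the Scala versions current at the time of the talk, `.par` is part of the standard library; from Scala 2.13 on, parallel collections live in a separate module, so the one-liner needs an extra dependency there. To stay standard-library-only, the sketch below hand-rolls the same idea with futures: split the data into chunks, partition each chunk on a thread pool, then concatenate the halves in order. It is an illustration of the principle, not how the parallel collections library is implemented, and `ChunkedPartition`, `parPartition` and `chunkSize` are names invented for this example.

```scala
import scala.concurrent.{Await, Future}
import scala.concurrent.ExecutionContext.Implicits.global
import scala.concurrent.duration._

object ChunkedPartition {
  final case class Person(name: String, age: Int)

  // Partition xs by p, working on chunks of chunkSize elements concurrently.
  def parPartition[A](xs: Vector[A], chunkSize: Int)(p: A => Boolean): (Vector[A], Vector[A]) = {
    // One future per chunk; each works on its own immutable slice, so no shared mutable state.
    val perChunk: List[Future[(Vector[A], Vector[A])]] =
      xs.grouped(chunkSize).toList.map(chunk => Future(chunk.partition(p)))

    // Future.sequence keeps the chunks in their original order.
    val results = Await.result(Future.sequence(perChunk), 10.seconds)

    // Concatenate the per-chunk halves left to right.
    results.foldLeft((Vector.empty[A], Vector.empty[A])) {
      case ((yes, no), (y, n)) => (yes ++ y, no ++ n)
    }
  }

  // Hypothetical sample data, as in the slide's minors/adults example.
  val people: Vector[Person] =
    Vector(Person("Ann", 12), Person("Bob", 40), Person("Cy", 17), Person("Di", 18))
  val (minors, adults) = parPartition(people, chunkSize = 2)(_.age < 18)
}
```

Because the predicate is pure and the chunks are recombined in order, the result matches the sequential `partition` regardless of how the futures are scheduled.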
- 16. Actors for Concurrent Programming A simple message-oriented programming model for multi-threading. Serializes access to shared resources using queues and function passing. Makes it easier for programmers to create reliable concurrent processing. Removes many sources of contention, races, locking and deadlocks.
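The "serialize access using a queue" idea can be sketched without any actor library. The toy below is not the Scala actors or Akka API; `MiniActorDemo`, `CounterActor`, `ask` and the message types are invented for illustration. A single-threaded executor plays the role of the mailbox, so the counter is only ever touched by one thread even though several threads send messages.

```scala
import java.util.concurrent.{Callable, Executors}

object MiniActorDemo {
  sealed trait Msg
  case object Increment extends Msg
  final case class Add(n: Int) extends Msg

  final class CounterActor {
    // The executor's internal queue is the mailbox; its single worker handles one message at a time.
    private val mailbox = Executors.newSingleThreadExecutor()
    // Mutable state, but only ever read or written on the mailbox thread.
    private var count = 0

    // Fire-and-forget send, named after the conventional actor "tell" operator.
    def !(msg: Msg): Unit = mailbox.execute(new Runnable {
      def run(): Unit = msg match {
        case Increment => count += 1
        case Add(n)    => count += n
      }
    })

    // Reading the state is also a message, queued behind everything sent earlier.
    def ask(): Int = mailbox.submit(new Callable[Int] { def call(): Int = count }).get()

    def stop(): Unit = mailbox.shutdown()
  }

  // Four sender threads each send 1000 increments, then one Add(10) follows.
  def runDemo(): Int = {
    val actor = new CounterActor
    val senders = (1 to 4).map { _ =>
      new Thread(new Runnable {
        def run(): Unit = (1 to 1000).foreach(_ => actor ! Increment)
      })
    }
    senders.foreach(_.start())
    senders.foreach(_.join())
    actor ! Add(10)
    val total = actor.ask()
    actor.stop()
    total
  }
}
```

No locks appear in user code: the senders never see `count`, they only enqueue messages, which is the property the slide attributes to the actor model.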
- 17. Going further: Parallel DSLs But how do we keep a bunch of Fermis happy? How to find and deal with 10000+ threads in an application? Parallel collections and actors are necessary but not sufficient for this. Our bet for the mid-term future: parallel embedded DSLs. Find parallelism in domains: physics simulation, machine learning, statistics, ... Joint work with Kunle Olukotun, Pat Hanrahan @ Stanford. EPFL side funded by ERC.
- 18. EPFL / Stanford Research [Diagram layers] Applications: Virtual Worlds, Personal Robotics, Data informatics, Scientific Engineering. Domain Specific Languages: Physics (Liszt), Scripting, Probabilistic (RandomT), Machine Learning (OptiML), Rendering. DSL Infrastructure: Domain Embedding Language (Scala) with Polymorphic Embedding, Staging, Static Domain Specific Opt.; Parallel Runtime (Delite, Sequoia, GRAMPS) with Dynamic Domain Spec. Opt., Locality Aware Scheduling, Task & Data Parallelism. Heterogeneous Hardware: OOO Cores, SIMD Cores, Threaded Cores, Specialized Cores. Hardware Architecture: Programmable Hierarchies, Scalable Coherence, Isolation & Atomicity, On-chip Networks, Pervasive Monitoring.
- 19. Example: Liszt - A DSL for Physics Simulation Mesh-based numeric simulation. Huge domains: millions of cells. Example: unstructured Reynolds-averaged Navier-Stokes (RANS) solver. Figure labels: Fuel injection, Transition, Thermal, Turbulence, Turbulence, Combustion
- 20. Liszt as Virtualized Scala // calculating scalar convection (Liszt) val Flux = new Field[Cell,Float] val Phi = new Field[Cell,Float] val cell_volume = new Field[Cell,Float] val deltat = .001 ... untilconverged { for(f <- interior_faces) { val flux = calc_flux(f) Flux(inside(f)) -= flux Flux(outside(f)) += flux } for(f <- inlet_faces) { Flux(outside(f)) += calc_boundary_flux(f) } for(c <- cells(mesh)) { Phi(c) += deltat * Flux(c) / cell_volume(c) } for(f <- faces(mesh)) Flux(f) = 0.f } Diagram labels: DSL Library, AST, Optimisers, Generators, Schedulers, …, Hardware (GPU, Multi-Core, etc.)
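To make the shape of that loop concrete, here is a plain-Scala toy on a one-dimensional "mesh". It is not Liszt and uses none of its API: the mesh (cell i and cell i+1 share interior face i), the flux law `k * (phi(i) - phi(i + 1))`, and the names `ToyConvection`, `step` and `untilConverged` are all assumptions made for this sketch. What it keeps from the slide is the structure: accumulate a flux per face, subtract it from the inside cell and add it to the outside cell, then update each cell by `deltat * Flux / volume` and repeat until the values stop changing.

```scala
object ToyConvection {
  // One time step on a 1D row of cells; interior face f sits between cell f and cell f + 1.
  def step(phi: Vector[Double], volume: Vector[Double], dt: Double, k: Double): Vector[Double] = {
    val flux = Array.fill(phi.length)(0.0)      // reset to zero each step, as on the slide
    for (f <- 0 until phi.length - 1) {
      val q = k * (phi(f) - phi(f + 1))         // made-up flux law: flow from higher to lower value
      flux(f) -= q                              // leaves the "inside" cell
      flux(f + 1) += q                          // enters the "outside" cell
    }
    // Per-cell update; each cell only reads its own accumulated flux.
    phi.indices.map(c => phi(c) + dt * flux(c) / volume(c)).toVector
  }

  // Repeat steps until the largest per-cell change drops below tol (or maxSteps is reached).
  def untilConverged(phi0: Vector[Double], volume: Vector[Double], dt: Double, k: Double,
                     tol: Double, maxSteps: Int): Vector[Double] = {
    var phi = phi0
    var steps = 0
    var delta = Double.MaxValue
    while (delta > tol && steps < maxSteps) {
      val next = step(phi, volume, dt, k)
      delta = next.zip(phi).map { case (a, b) => math.abs(a - b) }.max
      phi = next
      steps += 1
    }
    phi
  }
}
```

Even in this toy, the face loop writes to two neighbouring cells per iteration, which is the kind of mesh-induced data dependency that a DSL compiler has to analyse before it can parallelise the loop safely.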
- 21. Follow us on twitter: @typesafe scala-lang.org typesafe.com
Editor's Notes
- This leads to our vision: applications driven by a set of interoperable DSLs. We are developing DSLs to provide evidence of their effectiveness in extracting parallel performance. But we are also very interested in empowering others to easily build such DSLs, so we are investing heavily in developing frameworks and runtimes to make parallel DSL development easier. And the goal is to run single-source programs on a variety of very different hardware targets.
- Liszt is another language we have implemented. It is designed to support the creation of solvers for mesh-based partial differential equations. Problems in this domain typically simulate complex physical systems such as fluid flow or mechanics by breaking up space into discrete cells. A typical mesh may contain hundreds of millions of these cells (here we are visualizing a scramjet designed to work at hypersonic speeds). Liszt is an ideal candidate for a DSL because, while the problems are large and highly parallel, the mesh introduces many data dependencies that are difficult to reason about, which makes writing solvers tedious.