This document traces the history and development of Heroku Postgres, a database-as-a-service offering. It began as a simple Sinatra app running on Heroku that provisioned Postgres databases for users, and over time grew more sophisticated, adding monitoring processes, a state-machine workflow, and a distributed queue for continuous monitoring; the state model (with states like available and unavailable) was inspired by gaming. The service focuses on durability, shipping WAL segments to multiple datacenters, and on availability, using replication, followers, and forks.
[B6] Heroku Postgres - hgmnz
1. Heroku Postgres
The Tale of Conceiving and Building a Leading Cloud Database Service
Harold Giménez
@hgmnz
2. Heroku Postgres
• Database-as-a-service
• Cloud
• Fully managed
• Over 2 years in production
• From tiny blogs to Super Bowl commercials
Heroku Postgres is a Database as a Service provider
We provision and run databases in cloud infrastructure
It is fully managed, always on and available
It has been in production for over 2 years, and has powered everything from personal blogs to sites backing Super Bowl commercials.
3. Heroku origins
Heroku is born with a vision of increasing developer productivity and agility.
Anyone remember heroku garden? While that product no longer exists, that vision remains
part of our core culture.
We want to enable developers to bring their creations to market as fast and pleasantly as
possible.
4. focus on rails
heroku got in the business of running web applications. As with any startup, it focused on
doing one thing well, and for heroku that was running rails applications.
The approach empowered developers like never before. As a Heroku customer myself back then, I was excited to put hobby apps on the internet on a regular basis. It was so easy.
5. rails apps need a database
Clearly, rails apps need a database. Rails got really good at doing CRUD, after all.
6. web apps need a database
but this is true of any web application
7. thankfully postgres was chosen
The story was something like
“Hey, we need a database. What should we use?”
Heroku was a very small team. The security expert happened to speak up and recommended Postgres, for its correctness track record and fine-grained user role management.
8. otherwise I wouldn’t be here
I’ve been a Postgres user for years and know it is vastly superior to other open source RDBMS
projects. If Postgres had not been chosen, I wouldn’t be here.
9. “let’s make a production grade postgres service”
Heroku would give you a free database whenever you create an app.
One database server would hold a bunch of users.
But this is not sufficient for serious production applications that require exclusive access to more resources and higher availability.
10.
This is our team’s mascot. It is a slide from IBM used in marketing materials in the 70s.
It’s funny how this vision was not true back then, but we are making it a reality over 30 years
later.
12. Heroku Postgres v.0.pre.alpha
• A sinatra app implementing the heroku addons API
• create servers
• install postgres service
• create databases for users - a “Resource”
• Sequel talks to postgres
• stem talks to AWS
Let’s talk about the tools used to build the very first version of Heroku Postgres.
It was built in Ruby.
Sinatra is used to expose the APIs.
Sequel is used to talk to postgres databases, as well as serving as an ORM.
stem was built for this project - a very minimalistic and pleasant interface to the AWS APIs.
stem was made available as open source software.
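A minimal sketch of that first version's shape (illustrative, not the real codebase): Sinatra exposing the add-on provisioning endpoint, Sequel doing the bookkeeping.

require 'sinatra'
require 'sequel'
require 'securerandom'
require 'json'

DB = Sequel.connect(ENV['DATABASE_URL'])  # the app's own bookkeeping database

# called by Heroku when a user adds the addon to an app
post '/heroku/resources' do
  resource = {
    database: "d#{SecureRandom.hex(4)}",
    username: "u#{SecureRandom.hex(4)}",
    password: "p#{SecureRandom.hex(4)}",
    state:    'available'
  }
  DB[:resources].insert(resource)  # Sequel as a query interface
  # ...then create the role and database on a server: stem boots the
  # instance, and Sequel connects to the new postgres to run the DDL...
  { id: resource[:database] }.to_json
end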
13. Two main entities
There are two main entities in this application
14. Resource
{
  database: 'd4f9wdf02',
  port: 5432,
  username: 'uf0wjasdf',
  password: 'pf14fhjas',
  created_at: '2012-05-02',
  state: 'available'
}
A resource encapsulates a database, the actual tangible resource that customers buy. A
customer only cares about the database URL, used to connect to it.
15. Server
{
  elastic_ip: '192.168.0.1',
  instance_id: 'i-2efjoiads',
  ami: 'pg-prod',
  availability_zone: 'us-east-1',
  created_at: '2012-05-02',
  state: 'booting'
}
A server is the physical box where the resource is installed.
Customers don’t have direct access to it. It’s for our own bookkeeping and maintenance.
It includes an IP address, availability zone, and other AWS related attributes.
16. ...and a thin admin web interface
erb templates in sinatra endpoints
The early application also had an admin interface, right in the very same codebase as erb
templates within some sinatra HTTP endpoints.
17. We are just an add-on
The Heroku Postgres offering is just a heroku addon.
18.
There are numerous components to Heroku, one of which is the addons system.
Heroku Postgres is an addon just like any other third party is an addon (such as Papertrail or
Sendgrid). We don’t utilize any backdoors of any kind, and instead interface with the rest of
Heroku in the same way other addon providers do.
This is a great position to be in, because as consumers of the heroku addons ecosystem, we help drive its evolution.
19. we run on
Furthermore, the entire Heroku Postgres infrastructure runs on Heroku itself.
20. the simplest thing
that could possibly work,
but no less
Simplicity is key to building any sort of system, and in this case, the initial version of the
Heroku Postgres management app was as simple as it could be.
This allows us to modify behavior and evolve as quickly as possible, on a smaller, more pleasant code base.
21. We’ve come a long way since then
Fast forward a few years, and we are now managing a very large number of databases,
keeping them alive, and creating new ones at a higher rate than ever.
This requires more sophisticated processes and managers.
Let’s dive into how it works today
22. Monitoring and Workflow
Monitoring and Workflow are key to this type of system.
23. draw inspiration from gaming
In programming we often draw inspiration from a number of things.
A good example is OOP itself, which is inspired by the way messages are sent between
organisms in a biological ecosystem
The project lead (@pvh) has a background in gaming.
Imagine the bad guy in a Diablo game. He’s just wandering around doing nothing, because
there’s nothing to attack around him. At some point, he sees your character and charges
toward you. You battle the Diablo. He fights back, and finally you kill him. He dies a slow and
painful death.
There are many ways to model these kinds of systems. One can be an events based system,
where observers listen on events that are occurring and react to them appropriately. You
could also load all objects that need monitoring and process that queue. This either gets too
complex easily, or doesn’t scale at all because of memory constraints and size of the
workload.
A state machine is another good way to model this. A state machine is, at heart, an entity that
is fed some inputs, and in return it takes some action, and then may or may not transition to
a different state.
The bad guy is in a `wandering around` state when nothing is around him. As soon as he sees your character, he enters a `battle` state, and so on.
We model what happens in real life, which is that we observe our environment, register it, and
react to it.
24.
class Resource
  def feel
    observations.create(
      Feeler.new(self).current_environment
    )
  end
end

class Feeler
  def current_environment
    {
      service_available?: service_available?,
      open_connections: open_connections,
      row_count: row_count,
      table_count: table_count,
      seq_scans: seq_scans,
      index_scans: index_scans
    }
  end
end
This is what the actual source code looks like.
A Resource has a #feel method, which stores an observation based on what the Feeler sees.
A Feeler is an object that observes the current environment around it. It checks things like whether the service is available, how many connections are open, and many more health checks.
25.
class Resource
  include Stateful

  state :available do
    unless service_available?
      transition :unavailable
    end
  end
end

resource = Resource.new
resource.transition :available
resource.feel
resource.tick
puts resource.state
# 'unavailable'
26.
module Stateful
  def self.included(base)
    base.extend ClassMethods
  end

  module ClassMethods
    def state(name, &block)
      states[name] = block
    end

    def states; @states ||= {}; end
  end

  def tick
    self.instance_eval(
      &self.class.states[self.state.to_sym]
    )
  end

  def transition(state)
    # log and assign new state
  end
end
In terms of workflow, we built an extremely simple state machine system.
It allows you to define states via the `state` method which takes an arbitrary block of code to
execute when invoked via the `#tick` method.
27. resource.feel
resource.tick
Need to do this all the time
We first call #feel on an object, and then call #tick on it.
Feel stores new observed information, while #tick uses this information to make system
decisions, such as transitioning to other states, sending alerts, and much more.
We must run these two methods continuously
28. db1 db2 db3 db4 db5 db6 db7 db8 db9 ... dbn
db1.feel
db1.tick
One way to run it continuously is via a work queue.
29. db2 db3 db4 db5 db6 db7 db8 db9 ... dbn db1
db2.feel enqueue(db1)
db2.tick
We create a queue and place all active resources on it. A set of workers pulls jobs from the queue, invokes feel and tick, and then enqueues them again.
This is in essence a poorly implemented distributed ring buffer, and it's served us well.
30. QueueClassic
http://github.com/ryandotsmith/queue_classic
Our queue is implemented on top of the QueueClassic gem, which is a queue system built in
Ruby on top of Postgres with some interesting characteristics.
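A minimal sketch of that ring on top of queue_classic (Monitor and the Resource lookup are illustrative names, not the production code):

require 'queue_classic'

class Monitor
  def self.run(resource_id)
    resource = Resource[resource_id]        # Sequel-style primary-key lookup
    return unless resource                  # deleted resources drop out of the ring
    resource.feel                           # record a fresh observation
    resource.tick                           # act on it, maybe transitioning state
    QC.enqueue('Monitor.run', resource_id)  # back of the line: the "ring"
  end
end

# seed the ring once with every active resource
Resource.where(active: true).each { |r| QC.enqueue('Monitor.run', r.id) }

Worker processes then simply drain the queue, invoking Monitor.run for each job.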
31.
Let’s look at some of the states on our resource class. A resource can go through these
states.
One very important aspect of this system is idempotency. The system must be designed in such a way that each state can be run any number of times without affecting the end result.
Examples where this is not immediately obvious are the creating and deprovisioning states.
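As a sketch (reusing the state-block style from slide 25; the helper predicates are hypothetical), an idempotent creating state checks each step before performing it, so the block can safely run any number of times:

state :creating do
  create_server    unless server_exists?       # safe to re-run after a crash
  install_postgres unless postgres_installed?
  create_database  unless database_exists?
  transition :available if service_available?
end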
32. Durability and Availability
Let’s talk about how we handle durability and availability of databases.
33.
In Postgres, as in other similar systems, when you issue a write transaction, it first writes the transaction to what's called the Write-Ahead Log (WAL), and only then does it write to the data files.
This ensures that all data committed to the system exists first in the WAL stream.
34.
Of course, if the WAL stream is on the same physical disks as the data files, there’s a high
risk of data loss.
Many opt to place the WAL segments on a separate disk from the data files. This is a great first step (and one we also take).
But really, we don't consider data to be durable until the WAL segments are replicated across many data centers.
We ship WAL segments to multi-datacenter storage every 60 seconds. We use WAL-E, a Python WAL archiver written at Heroku and now available as open source.
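At the Postgres level, this kind of continuous archiving looks roughly like the following (a sketch based on WAL-E's documented usage; the envdir path is illustrative, not Heroku's actual configuration):

# postgresql.conf
wal_level       = archive
archive_mode    = on
archive_timeout = 60    # force a segment switch at least every 60 seconds
archive_command = 'envdir /etc/wal-e.d/env wal-e wal-push %p'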
35.
Now that the WAL segments are out of the box, we can do many other tricks.
For example, creating a “follower” is as easy as fetching the WAL segments from the
distributed storage, and replaying these logs on a brand new server - once it has caught up,
we set up direct streaming replication between primary and follower.
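In Postgres 9.x terms, the follower's bootstrap is roughly this (a sketch; host and user are placeholders):

# recovery.conf on the new follower
standby_mode     = 'on'
restore_command  = 'envdir /etc/wal-e.d/env wal-e wal-fetch "%f" "%p"'
# once caught up from archived WAL, stream directly from the primary:
primary_conninfo = 'host=<primary-host> user=<replication-user>'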
36.
Similarly, a fork of a database pulls down the WAL segments from distributed storage and replays them on a new server.
Once it's caught up, instead of setting up streaming replication as in the follow case, this new server starts producing WAL segments of its own (when write transactions occur on it). So now the fork is set up to ship WAL segments to distributed storage, just like its leader.
37. Continuous Protection
• Write-Ahead Log segments shipped to durable storage every 60 seconds
• We can replay these logs on a new server to recover your data
• https://github.com/heroku/WAL-E
This is what we call Continuous Protection.
Having WAL segments always available is a primary concern of ours, as it allows us to easily
rebuild a server’s data state, and can be updated continuously as opposed to capturing full
backups of the system.
38. Need a more flexible object model
Now, the introduction of all of these functions required us to rethink our object model.
39. timeline
We have the concept of a timeline
A timeline at time = zero contains no data, no commits.
40. participant
Participants are attached to a timeline. Participants can write data to the timeline.
43. resource
A resource is what our users get. It maps to a URL. A resource is attached to one participant.
44. follower
This allows us to model followers easily.
A follower is just a participant on the same timeline as its leader.
The difference is that followers can't write to the timeline. Only one participant can write to the timeline: the follower's leader (or primary).
45. fork
When we fork a database, it creates its own timeline. The new timeline has now diverged from its parent and is writable, so it will create its own path.
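As a sketch of the model (class names and attributes are illustrative, not the production schema), in the style of the Ruby shown earlier:

class Timeline
  attr_reader :parent, :participants

  def initialize(parent: nil)
    @parent       = parent
    @participants = []
  end

  # at most one participant may write to a given timeline
  def writer
    participants.find(&:writer?)
  end
end

class Participant
  def initialize(timeline, writer: false)
    @timeline = timeline
    @writer   = writer
    timeline.participants << self
  end

  def writer?
    @writer
  end
end

# a follower is a read-only participant on the leader's timeline
leader_timeline = Timeline.new
leader          = Participant.new(leader_timeline, writer: true)
follower        = Participant.new(leader_timeline)

# a fork branches a new, writable timeline off its parent
fork_timeline = Timeline.new(parent: leader_timeline)
fork_writer   = Participant.new(fork_timeline, writer: true)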
46. disaster
Finally, this system can be used during the event of catastrophic hardware failure.
When a database's hardware fails completely, instead of trying to recover the server itself, it's best to create a new node and "STONITH" the failed one (http://en.wikipedia.org/wiki/STONITH).
48. recovery
And once it is caught up and ready to go, we tie the resource to it.
So, the user only sees a blip in availability, but behind the scenes they are actually sitting on
entirely new hardware, like magic.
49. big project
Needless to say, this has become a big project over time.
52. modularize and build APIs
So it's time to spread out responsibilities by modularizing the system and building APIs for the pieces to talk to each other.
53.
What we’ve built is a constellation of heroku apps. We may split this even further in the
future.
54. gain in agility
This gains us agility.
The test suites of each individual project are much smaller now, which improves our ability to develop more quickly.
It also means that each component can be deployed individually. For example, a deploy to the
admin front end UI has no effect on the main system’s APIs.
55. composable services
It also allows us to build better abstractions at the systemic level, which improves our ability to compose services.
For example, a system that provisions and manages servers from our infrastructure provider
can be used by many other consumers, not only heroku postgres.
56. independently scalable
They can furthermore be scaled individually. Some parts of the system handle different loads and response times than others, so we can now tune our system operations around clearly decoupled subsystems.
57. Logging and Metrics
Finally, I’d like to talk about visibility into our app.
58. log generation
First, let’s talk about logging.
59.
In Heroku, there’s a service called Logplex (it’s open source).
Your application sends logs to a specific channel on the logplex service (it uses Capability-Based Security).
Then, one or more consumers can “drain” the logs for that channel.
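Adding a drain is a one-liner with the Heroku toolbelt (the URL and app name here are placeholders):

$ heroku drains:add https://logs.example.com/ingest --app myapp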
61. how should you log?
Having this logging infrastructure available, let’s talk about how to make best use of it.
62.
post "/work" do
  puts "starting to do work"
  worker = Worker.new(params)
  begin
    worker.lift_things_up
    worker.put_them_down
  rescue WorkerError => e
    puts "Fail :( #{e.message}"
    status 500
  end
  puts "done doing work"
  status 200
end
This is an example of terrible logging.
63.
$ heroku logs --tail
2012-07-28T02:43:35 [web.4] starting to do work
2012-07-28T02:43:35 [web.4] Fail :( invalid worker, nothing to do
2012-07-28T02:43:35 heroku[router] POST myapp.com/work dyno=web.4 queue=0 wait=0ms service=14ms status=500 bytes=643
There’s no structure to these logs, so it can’t be easily read and interpreted by a computer.
64. bad logging
• What exactly happened?
• When did it happen?
• How long did it take?
• How many times has it happened?
65. good logging
• parseable
• consistent
• plentiful
66.
post "/work" do
  log(create_work: true, request_id: uuid) do
    worker = Worker.new(params.merge(uuid: uuid))
    begin
      worker.lift_things_up
      worker.put_them_down
    rescue WorkerError => e
      log_exception(e, create_work: true)
    end
  end
end

helpers do
  def uuid
    @uuid ||= SecureRandom.uuid
  end
end
Instead, let’s do some more structured logging.
Also note how every request gets a UUID. This is critical for tying together all the logs for a given request.
67.
require 'scrolls'

module App
  module Logs
    extend self

    def log(data, &block)
      Scrolls.log(with_env(data), &block)
    end

    def log_exception(exception, data)
      # pass the exception itself through to scrolls
      Scrolls.log_exception(with_env(data), exception)
    end

    def with_env(data)
      { environment: ENV['RACK_ENV'] }.merge(data)
    end
  end
end
On the prior slide, we saw the `log` and `log_exception` methods.
This is a small module that provides those methods. It is a wrapper for the `scrolls` (open
source) gem.
Scrolls provides a framework for structured logging.
This module merely adds our environment name to the logs, which is useful for parsing later.
68.
$ heroku logs --tail
2012-07-28T02:43:35 [web.4] create_work request_id=afe2-f0d at=start
2012-07-28T02:43:35 [web.4] create_work request_id=afe2-f0d at=exception message=invalid worker, nothing to do
2012-07-28T02:43:35 [web.4] create_work request_id=afe2-f0d at=finish elapsed=53
2012-07-28T02:43:35 heroku[router] POST myapp.com/work dyno=web.4 queue=0 wait=0ms service=14ms status=500 bytes=643
Now our logs look like this.
Easy to parse, and still easy to read by a human.
69. log consumption
Let’s talk about consuming those logs, which should make it clear why structured logging is
so important.
70. (this is the fun part)
71.
As mentioned before, it’s possible to set up multiple log drains.
The heroku toolbelt has a utility to print out logs to your terminal (accessible via heroku logs
--tail).
But why stop there? You can have as many drains as you want!
We can set up a drain that stores data locally for further analysis and metrics generation.
Here, a postgres database is set up, and logs are stored in it using the key-value data type called hstore.
73.
Now that the data is stored in a postgres database, we can use SQL to query it and generate some metrics.
We have a process that continuously queries this database and sends aggregated results to a
metrics collection service (third party).
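For example (an illustrative query against a hypothetical events table, not our actual schema), hstore makes the structured log fields directly queryable:

-- events(received_at timestamptz, data hstore)
SELECT date_trunc('minute', received_at) AS minute,
       count(*)                          AS requests,
       avg((data -> 'elapsed')::int)     AS avg_elapsed
FROM events
WHERE data ? 'create_work'
  AND data ? 'elapsed'
GROUP BY 1
ORDER BY 1;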
74. good logging
metrics
alerts
Visibility into your system starts with good logging
Great logs enable easy metrics collection
Metrics lead to system alerts.
75. current tooling
• still using sequel and sinatra
• fog displaced stem
• backbone.js for web UIs
• fernet for auth tokens, valcro for validations
• python, go and bash in some subsystems
So to wrap up, our current tooling includes these pieces of technology
76. lessons
• managing databases is hard
• start simple
• extract (and share) reusable code
• separate concerns into services
• learn to love your event stream
77. thanks!
@hgmnz
@herokupostgres