Sync Framework
Synchronize Your Data On-Premises and to the Cloud
Sameh Samir
Senior Software Engineer
Architecture and Infrastructure Team
MedStreaming LLC
What We Will Talk About
• Brief on Microsoft Sync Framework
• Why I’d Need Synchronization
• Synchronization Ecosystem: The Concert
• Framework Components
• Responsibilities
• Participants
• Application Scenarios: Offline
• Application Scenarios: Collaboration
• How It Works
• Change Tracking
• Conflict Resolution
• Concepts
• Sync Scenarios: On-Premises Two Tier Architecture
• Demo: Synchronizing Data - 2-Tier Architecture
• Sync Scenarios: In the cloud – N Tier Architecture
• Demo: Synchronizing Data : N-Tier Architecture
• Choosing Primary Keys
• Tracing
• Demo: Sync with SQL Azure
Brief on Microsoft Sync Framework
• Microsoft data synchronization platform
• Enables collaboration and occasionally connected application (OCA, offline) scenarios
• Announced in MIX 2008
• August 2008 – V1.0
• April 2009 – V2.0
• August 2010 – V2.1
• Q1 2011 – V3.0 (Expected)
Why Would I Need Synchronization?
• Offline Availability
• Lack of offline availability may be frustrating for some users, but it can be a disaster for others (retail store POS, medical systems)
• Access to Full Client Capabilities
• H/W intensive applications (Imaging, Medical, 3D, Media Processing, POS Station,
etc…)
• User Experience
• Asynchronous processing improves usability, but you still have to wait
• Cache management becomes a headache if you cache everything
• Mobility
• Demand for mobile accessibility is increasing
• Mobile accessibility is a must for some businesses
• Mobile internet is still not cheap
Qualities of MSF
• Ease of use
• High Level of Customization
• Data and Transport Agnostic Sync Functionality
• Built-in Providers
• Extensibility
• Custom Providers Framework
Synchronization Ecosystem: The Concert
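[Diagram: the Sync Application drives the Sync Orchestrator, which passes changes between two Sync Providers; each Sync Provider exchanges changes with its own Data Store. The Sync Runtime box carries the labels Metadata Interpretation Tools, Provider Services, and MD (metadata) Store.]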
Framework Components
[Diagram: layered component stack, from the runtime up to end-to-end solutions]
• Sync Runtime (Orchestration)
• Basic Building Blocks: Knowledge, Version, Change Enumeration, Conflict Detection, Metadata Storage Service
• Built-In Providers: Anchor-based Providers, Simple Providers, Full Enumeration Providers, SQL Sync Provider, SQL CE Sync Provider, File Sync Provider, Feed Sync Provider, Db Sync Provider
• End to End Solutions: IDE Integration, ADO Sync Services, Sync for OData, other MS & 3rd party providers / solutions
Responsibilities
Developer:
• The application
• The data store
• The data transfer protocol
Sync Framework:
• Synchronization session, or manager
• The synchronization runtime
Sync Framework, or the Developer:
• The sync provider
• The metadata store
Participants
• Full Participants: Devices that allow
developers to create applications and new
data stores directly on the device. E.g.
Windows Phone, laptop
• Partial Participants: Devices that have the
ability to store data either in the existing data
store or another data store on the device but
do not have the ability to launch executables.
E.g. thumb drives or SD Cards.
• Simple Participants: Devices that are only
capable of providing information when
requested. These devices cannot store or
manipulate new data. E.g. RSS Feeds and
web services.
Application Scenarios: Offline
• All clients sync through a single hub (Server)
• Suitable for Occasionally Connected
Applications (OCA)
• Single point of failure
• The most common scenario, and the easiest to implement
Application Scenarios: Collaboration
• Suitable for applications where users need to share data (e.g. notes, documents, calendars, project info)
• Each client can sync with other clients or with a
central server
• Avoids a single point of failure
• Offloads the sync processing from the server to the clients, and thus provides more scalability
• Less common and more complex to implement
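How It Works
Enumeration
[Sequence diagram between the Sync Orchestrator, the Sync Provider (Provider Framework with Runtime), the Data Store, and the Metadata Store]
• The Sync Orchestrator calls GetChangeBatch on the Sync Provider
• Is the metadata up to date? If not, bring the metadata up to date:
• Enumerate all objects ("Here’s one: Id=‘foo’, LMT=5pm", where LMT is the last modified time)
• What was it last time? Classify the object as New, Updated, or Same
• Update metadata
• What’s missing? Record deletes
• Metadata is up to date: enumerate changes
• All done!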
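The enumeration pass can be illustrated with a minimal Python sketch. It is not the Sync Framework API: `MetadataStore`, `enumerate_changes`, and the dictionary-based data store are hypothetical names chosen for illustration, and the LMT is modelled as a plain integer. The sketch only shows the idea of comparing each object with what the metadata recorded last time and treating anything missing as a delete.

```python
from dataclasses import dataclass, field


@dataclass
class MetadataStore:
    """Hypothetical metadata store: item id -> LMT recorded at the previous pass."""
    last_seen: dict = field(default_factory=dict)


def enumerate_changes(data_store: dict, metadata: MetadataStore) -> dict:
    """Classify every item as new, updated, or deleted and bring metadata up to date.

    data_store maps item id -> (value, lmt).
    """
    changes = {"new": [], "updated": [], "deleted": []}

    # Enumerate all objects and ask "what was it last time?"
    for item_id, (_value, lmt) in data_store.items():
        previous = metadata.last_seen.get(item_id)
        if previous is None:
            changes["new"].append(item_id)
        elif previous != lmt:
            changes["updated"].append(item_id)
        # Same LMT as last time: nothing to report.
        metadata.last_seen[item_id] = lmt  # update metadata

    # What's missing? Anything the metadata knows but the store no longer has.
    for item_id in list(metadata.last_seen):
        if item_id not in data_store:
            changes["deleted"].append(item_id)
            del metadata.last_seen[item_id]  # simplification: no tombstone is kept

    return changes
```

A real provider would typically keep a tombstone for each deleted item so that the delete can be sent to other replicas later; the sketch drops the entry to stay short.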
How It Works (Cont.)
Applying Changes
[Sequence diagram between the Sync Orchestrator, the Sync Provider (Provider Framework with Runtime), the Data Store, and the Metadata Store]
• The Sync Orchestrator calls ProcessChangeBatch on the Sync Provider
• Bring the metadata up to date first, as in enumeration: enumerate all objects ("Here’s one: Id=‘foo’, LMT=5pm"), check what each was last time (New, Updated, or Same), update metadata, and record deletes
• Get versions
• Update item id=‘foo’: LMT was 1pm, new data is ‘bar’, new LMT=8pm
• Check LMT and write
• Update metadata
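The "check LMT and write" step is essentially an optimistic concurrency check. The following Python sketch assumes a dictionary-based data store and metadata store; `apply_change` and its parameters are hypothetical names for illustration, not Sync Framework calls.

```python
def apply_change(data_store, metadata, item_id, expected_lmt, new_value, new_lmt):
    """Write new_value only if the local item still has the LMT the source expected.

    data_store maps item id -> (value, lmt); metadata maps item id -> lmt.
    Returns "applied" or "conflict".
    """
    current = data_store.get(item_id)
    current_lmt = current[1] if current is not None else None

    # Check LMT: if the item changed locally after versions were read,
    # writing now would silently overwrite that change.
    if current_lmt != expected_lmt:
        return "conflict"

    data_store[item_id] = (new_value, new_lmt)  # write the new data
    metadata[item_id] = new_lmt                 # update metadata
    return "applied"
```

For example, applying "LMT was 1pm, new data is 'bar', new LMT=8pm" succeeds only while the stored item still carries the 1pm LMT; a second attempt with the same expected LMT is reported as a conflict instead of being written.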
Change Tracking
• Change tracking provides a list of changes made from one point in time to
another.
• Commonly implemented using rowversions and triggers, plus a “deleted”
table
• The major disadvantages are:
• Changes are required to the schema to add columns and tables
• Triggers are fired for each change made, which has performance implications.
• SQL Server 2008 has built-in change tracking, implemented without
rowversions and triggers
• The Sync Framework database sync providers take advantage of SQL Server 2008 change tracking and provide the following advantages:
• No schema changes are required
• Triggers are not required for tracking changes
• All of the logic for tracking changes is internal to the SQL Server engine
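To make the trigger-based approach concrete, here is a small sketch using Python's built-in `sqlite3` module rather than SQL Server. The table names (`customer`, `customer_deleted`, `sync_clock`), the `row_version` column, and the `changes_since` helper are all assumptions made for illustration. It shows both costs named above: the schema gains an extra column and extra tables, and every change fires a trigger.

```python
import sqlite3

SCHEMA = """
CREATE TABLE sync_clock (tick INTEGER NOT NULL);
INSERT INTO sync_clock VALUES (0);

CREATE TABLE customer (
    id INTEGER PRIMARY KEY,
    name TEXT NOT NULL,
    row_version INTEGER NOT NULL DEFAULT 0   -- extra tracking column
);
CREATE TABLE customer_deleted (               -- extra "deleted" table
    id INTEGER PRIMARY KEY,
    row_version INTEGER NOT NULL
);

CREATE TRIGGER customer_ins AFTER INSERT ON customer BEGIN
    UPDATE sync_clock SET tick = tick + 1;
    UPDATE customer SET row_version = (SELECT tick FROM sync_clock) WHERE id = NEW.id;
END;
CREATE TRIGGER customer_upd AFTER UPDATE OF name ON customer BEGIN
    UPDATE sync_clock SET tick = tick + 1;
    UPDATE customer SET row_version = (SELECT tick FROM sync_clock) WHERE id = NEW.id;
END;
CREATE TRIGGER customer_del AFTER DELETE ON customer BEGIN
    UPDATE sync_clock SET tick = tick + 1;
    INSERT OR REPLACE INTO customer_deleted VALUES (OLD.id, (SELECT tick FROM sync_clock));
END;
"""


def open_tracked_db() -> sqlite3.Connection:
    """Create an in-memory database with trigger-based change tracking."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    return conn


def changes_since(conn: sqlite3.Connection, anchor: int):
    """Return (upserts, deletes, new_anchor) for changes made after the given anchor."""
    upserts = conn.execute(
        "SELECT id, name, row_version FROM customer "
        "WHERE row_version > ? ORDER BY row_version",
        (anchor,),
    ).fetchall()
    deletes = conn.execute(
        "SELECT id, row_version FROM customer_deleted "
        "WHERE row_version > ? ORDER BY row_version",
        (anchor,),
    ).fetchall()
    new_anchor = conn.execute("SELECT tick FROM sync_clock").fetchone()[0]
    return upserts, deletes, new_anchor
```

A caller stores the returned anchor after each sync and passes it back next time, which yields the list of changes from one point in time to another. The sketch does not handle an ID that is deleted and later reinserted; the old tombstone would remain in `customer_deleted`.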
Conflict Resolution
• Conflicts occur when two or more databases make a change to
the same piece of data
• There are a variety of ways to resolve these conflicts:
• Last change to come in wins
• Highest priority user wins
• Manual selection
• Sync Framework provides conflict detection and resolution
capabilities out of the box
• SQL Server 2008 makes it easier to identify conflicts.
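The three resolution strategies listed above can be sketched as interchangeable policy functions. This is an illustration in Python, not the Sync Framework's conflict API; the `Change` type, its fields, and the convention that a larger `user_priority` means higher priority are assumptions. "Last change to come in wins" is interpreted here as the change with the later modification time, with ties going to the incoming change.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class Change:
    value: str
    modified_at: int     # hypothetical version or timestamp of the change
    user_priority: int   # assumed convention: larger number = higher priority


def last_change_wins(local: Change, incoming: Change) -> Change:
    """Keep whichever change was made later; ties go to the incoming change."""
    return incoming if incoming.modified_at >= local.modified_at else local


def highest_priority_wins(local: Change, incoming: Change) -> Change:
    """Keep the change made by the higher-priority user; ties keep the local change."""
    return incoming if incoming.user_priority > local.user_priority else local


def manual_selection(
    local: Change,
    incoming: Change,
    choose: Callable[[Change, Change], Change],
) -> Change:
    """Defer to a callback, for example a dialog that asks the user to pick."""
    return choose(local, incoming)
```

Keeping the policies as plain functions with the same shape makes it easy to select one per table or per application without touching the code that detects the conflict.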
Concepts
• Sync Scope:
• Set of tables that will be available for synchronization
• Sync Group:
• Group of tables that must be synchronized as a single unit (transaction)
• Ensures data consistency
• Provisioning a Server
• Get the server ready for change tracking
• Add change tracking columns and triggers for SQL Server 2005
• Enable the change tracking feature for a set of tables in a SQL Server 2008 database
• Can be done programmatically or through the “Configure Data Synchronization” wizard
Sync Scenarios
On-Premises (Two-Tier Architecture)
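[Diagram: on the Client, the Sync Application hosts the Sync Orchestrator and two Sync Providers. One provider works against the local Data Store; the other connects directly to the Data Store on the Data Server.]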
Synchronizing Data - 2-Tier Architecture
Sync Scenarios
In The Cloud (N-Tier Architecture)
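[Diagram: on the Client, the Sync Application hosts the Sync Orchestrator and a Sync Provider for the local Data Store. A Proxy stands between the client-side Sync Orchestrator and the server-side Sync Provider, which works against the Data Store on the Data Server.]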
Synchronizing Data - N-Tier Architecture
Table Key Selection
Take it seriously, or else
Table Key Selection : The Problem
[Diagram: Client 1 and Client 2 both start with the same rows, Customer 1 … Customer 100. Each client then inserts a new row locally and assigns it ID 101 (Customer 101 …). When the two clients synchronize, the two different rows with ID 101 collide.]
Duplicate Key Conflict
Table Key Selection : Solutions
1. Use GUID instead of auto incremented IDs
• Solves the primary key collisions that are possible with auto-increment columns
• Increased index size leads to increased query time
• Causes a fragmented clustered index, which also affects query processing time
• Fragmentation can be reduced in SQL Server by using the NEWSEQUENTIALID function to generate ordered GUIDs
2. ID Ranges
• Split available IDs into segments
• Assign each client a unique segment
• Client can ask for more ID ranges
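Options 1 and 2 can be sketched in a few lines of Python. The class names `IdRangeAllocator` and `ClientIdSource`, the segment size, and the in-process call between client and server are assumptions made for illustration; in a real deployment the client would request a new segment from the server over the network while it is connected.

```python
import uuid


def new_guid_key() -> str:
    """Option 1: a random GUID needs no coordination between clients."""
    return str(uuid.uuid4())


class IdRangeAllocator:
    """Option 2, server side: hand out non-overlapping ID segments."""

    def __init__(self, start: int = 1, segment_size: int = 1000):
        self._next_start = start
        self._segment_size = segment_size

    def allocate(self) -> range:
        segment = range(self._next_start, self._next_start + self._segment_size)
        self._next_start += self._segment_size
        return segment


class ClientIdSource:
    """Option 2, client side: draw IDs from the current segment, ask for more when it runs out."""

    def __init__(self, allocator: IdRangeAllocator):
        self._allocator = allocator
        self._ids = iter(allocator.allocate())

    def next_id(self) -> int:
        try:
            return next(self._ids)
        except StopIteration:
            # Segment exhausted: request a fresh one from the server.
            self._ids = iter(self._allocator.allocate())
            return next(self._ids)
```

With a segment size of 2 and a starting ID of 101, the first client receives 101 and 102, the second receives 103 and 104, and neither can produce the duplicate 101 shown in the problem slide. The trade-off is that a client that exhausts its segment while offline cannot insert until it reconnects, so segments should be sized generously.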
Table Key Selection : Solutions (Cont.)
3. Compound Keys
• Use compound key that includes a client identifier
4. Use Business Key as ID
• Use unique business keys (e.g. national ID number, SSN, barcode)
• May affect query performance if the key type is not numeric
5. Online Insert
• Insert directly to the server
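The compound key option (3) can be demonstrated with Python's built-in `sqlite3` module. The table layout, the column names `client_id` and `local_id`, and the `merge_clients` helper are hypothetical; the point is only that two clients can both use local ID 101 without colliding once the client identifier is part of the primary key.

```python
import sqlite3


def merge_clients(rows_by_client: dict) -> list:
    """Insert each client's locally numbered rows into one server table.

    rows_by_client maps client id -> list of (local_id, name).
    The server table is keyed by (client_id, local_id).
    """
    server = sqlite3.connect(":memory:")
    server.execute(
        "CREATE TABLE customer ("
        " client_id TEXT NOT NULL,"
        " local_id INTEGER NOT NULL,"
        " name TEXT NOT NULL,"
        " PRIMARY KEY (client_id, local_id))"
    )
    for client_id, rows in rows_by_client.items():
        server.executemany(
            "INSERT INTO customer (client_id, local_id, name) VALUES (?, ?, ?)",
            [(client_id, local_id, name) for local_id, name in rows],
        )
    return server.execute(
        "SELECT client_id, local_id, name FROM customer ORDER BY client_id, local_id"
    ).fetchall()
```

The cost is a wider key: every foreign key that references the table has to carry both columns.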
Enable Tracing
Sync To SQL Azure
MSF: Sync your Data On-Premises And To The Cloud - dotNetwork Gathering, Oct 2010.
Call To Action
Azure Table Sync Library (azuretablesynclib.codeplex.com)
An open source project that aims to create custom data sync providers
for the following sync scenarios:
1. Azure Table Storage <-> SQL Server / Express
2. Azure Table Storage <-> SQL CE
3. Azure Table Storage <-> SQL Azure
4. Azure Table Storage <-> Azure Table Storage
Keep in Touch
Email: sameh.sami@gmail.com
Blog: www.Cloudy-Ideas.net / www.sameh-samir.net
Twitter: twitter.com/sameh_samir
LinkedIn: linkedin/in/samehsamir


Editor's Notes

  1. 3 – 5 Minutes
  2. 3 – 5 Minutes
  3. 7 min max
     • Title: tech rush over the last 4 yrs (MSDN site). I can’t learn it all: some techs are important, some are for show, and others are mandatory
     • Offline availability: centralized data + web access do the trick. Checking mail from the web vs. checking from an offline client: you can read, write, and interact offline, and you may not even notice a connection drop. This is acceptable for normal users, but what about a POS station or a medical system?
     • H/W intensive applications (imaging, medical, 3D, media processing, POS station, etc.)
     • User experience: SL is a client technology that runs inside the browser. Async techs (Ajax) improve usability, but you still have to wait (Gmail as an example). Rich UI
     • Mobility: day after day we can access more through mobile (email, messaging, social networks)
  4. 3 – 5 Minutes
  6. 3 – 5 Minutes
     • Version: what tells us the last modification time
     • Sync Knowledge: the typical solution would be to send all sync versions from the destination to the source, which is very inefficient. Instead, a single compact data structure is used, which we call knowledge
     • Metadata Store: interact with metadata stores, which can be inside the data store or in a separate data store
  7. Difference between anchor-based enumeration and full enumeration
  9. Scenarios: VPN headache; port forwarding disadvantages; out-of-enterprise hosting and public clients
  10. Code executing on the server allows for dynamic filtering
  11. 3 – 5 Minutes
  12. 3 – 5 Minutes