2. INTRODUCTION
Objectives of Euclid Datamodel 101
Slidecast dedicated to Euclid data modelers & developers
Help you understand what is expected and how to do it
Released as multiple episodes over time
1st episode: high-level overview of tools and process
2nd episode: the TIPS example
Following episodes: zoom on technical points
3. INTRODUCTION
Objectives and contents of this presentation
Get an overview of the data modeling process
Understand the data model workflow
Know where to find information
Know what tools are available
No complex details and technical information here…
… but high-level information and pointers to the right direction.
4. SUMMARY
The data modeling process
1 - Understand Euclid DataModel
Why use a Euclid DataModel? Why choose XML? What is XML
Schema? What are the Euclid-specific XML rules my schema shall
comply with? How is the DataModel SVN repository structured? How are
XML namespaces structured?
2 - Create your own DataModel
What should my DataModel contain? What software can I use to write
XML? How can I check if my DataModel is correct?
3 - Use the DataModel in your own code
How can I use the data model in my code? How can I use XML data
bindings? Can I get pre-configured tools all at once?
5. Why use a Euclid DataModel?
The Euclid mission relies heavily on data transfer and manipulation
Data consistency between OUs, workflows, pipelines, and storage is a key point
Your DataModel will be:
- used to structure the EAS db
- manipulated by your pipeline code
Your data products will be:
- stored on EAS
- queryable from EAS
- transmitted to/from pipelines by IAL
[Figure: diagram with DESIGN TIME and RUN TIME parts, showing the DataModel, EAS, IAL, SDC pipeline code, and compliant data products in/out]
6. Why choose XML?
The XML language brings many benefits:
Easy to read and understand by humans and machines
<coord>
<x>12.05</x>
<y>3.1</y>
</coord>
Many tools available to create, control and check xml
Strong type/namespace control and definition
Widely used and supported across the world
Self-contained: expresses both the data and the data structure
XML was chosen over many other alternatives
Find information
- W3Schools tutorials:
http://www.w3schools.com/schema/
7. What is XML Schema?
Two file formats you should be familiar with:
- XSD (XML Schema): describes the data structure
- XML: contains the actual data
<coord>
<x>12.05</x>
<y>3.1</y>
</coord>
complies with
<xs:element name="coord">
<xs:complexType>
<xs:sequence>
<xs:element name="x" type="xs:float" />
<xs:element name="y" type="xs:float" />
</xs:sequence>
</xs:complexType>
</xs:element>
Find information
- W3Schools tutorials: http://www.w3schools.com/schema/
- Highlights on XML/XSD (DM Workshop):
http://euclid.roe.ac.uk/attachments/download/2744/Workshop_Nov2013_XSD_XML%20-%202.02.ppt
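The "complies with" relation between the coord XML and its XSD can be illustrated with a small Python sketch. This is a toy check written for this one example, using only the standard library: it handles a single root element with an xs:sequence of xs:float children and nothing else. The helper names `expected_children` and `complies` are hypothetical, and the xs:schema wrapper is added here because the slide shows only a fragment. Real validation should be done with a full validator such as the XML editors recommended later.

```python
import xml.etree.ElementTree as ET

# Namespace of XML Schema itself, in ElementTree's {uri}tag notation.
XS = "{http://www.w3.org/2001/XMLSchema}"

# The slide's XSD fragment, wrapped in an xs:schema root (added for this sketch).
XSD_TEXT = """<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:element name="coord">
    <xs:complexType>
      <xs:sequence>
        <xs:element name="x" type="xs:float"/>
        <xs:element name="y" type="xs:float"/>
      </xs:sequence>
    </xs:complexType>
  </xs:element>
</xs:schema>"""

XML_TEXT = "<coord><x>12.05</x><y>3.1</y></coord>"


def expected_children(xsd_text, root_name):
    """Return [(name, type)] declared in the xs:sequence of the named root element."""
    schema = ET.fromstring(xsd_text)
    for element in schema.findall(f"{XS}element"):
        if element.get("name") == root_name:
            sequence = element.find(f"{XS}complexType/{XS}sequence")
            return [(c.get("name"), c.get("type"))
                    for c in sequence.findall(f"{XS}element")]
    raise ValueError(f"no top-level element named {root_name!r}")


def complies(xml_text, xsd_text):
    """Toy check: same child names in the same order, floats where xs:float is declared."""
    root = ET.fromstring(xml_text)
    try:
        expected = expected_children(xsd_text, root.tag)
    except ValueError:
        return False
    children = list(root)
    if [c.tag for c in children] != [name for name, _ in expected]:
        return False
    for child, (_, type_name) in zip(children, expected):
        if type_name == "xs:float":
            try:
                float(child.text)
            except (TypeError, ValueError):
                return False
    return True
```

Calling `complies(XML_TEXT, XSD_TEXT)` accepts the slide's document, while a document with a non-numeric `<x>` or with the children in the wrong order is rejected.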
8. What are the Euclid-specific XML rules my schema shall comply with?
Need for a fully consistent DataModel: everybody should follow the same rules
Among existing rules:
- XML Schema file name
- XML file name
- Single root element
- Element identifier name
- Numeric type restriction
- Recursive definitions
- Target namespaces
- Encoding
- Unqualified namespaces
- …
Rules are still in development; feedback is welcome and changes might be required
Find information
- Official Euclid XML rules: http://euclid.roe.ac.uk/dmsf/eucrma?folder_id=47
- DM Workshop presentation: http://euclid.roe.ac.uk/attachments/download/2762/DM-Rules.pdf
9. How is the DataModel SVN repo structured?
Classic SVN structure
- trunk: latest stable work
- branches: parallel development of specific features
- tags: official releases
Dictionary and Interfaces for your products
- Dictionary: definition of the complexTypes and elements of your product's entire DataModel
- Interfaces: definition of the data exchanged between components. Only one root element per type, which you can see as a variable to access a product.
EC/SGS/ST/4-2-05-DM/schema
Find information
- DMWorkshop svn presentation: http://euclid.roe.ac.uk/projects/eucrma/wiki/20131411DMWSconf
- Dictionary of types: https://apceucliddev.in2p3.fr/jenkins/job/Dictionary/ws/eXist/dictionary.html
- Configuration management & best practices: http://euclid.roe.ac.uk/projects/eucrma/wiki#Configuration-management
10. How are XML namespaces structured?
Under Dictionary and Interfaces, 4 top-level namespaces
- bas: common definitions shared by everyone
- ins: instrument specific definitions
- pro: OU-specific definitions
- sys: system specific definitions (storage, processing…)
/pro sub-levels
- one directory per OU
- one responsible custodian per directory
EC/SGS/ST/4-2-05-DM/schema
Find information
- DMWorkshop svn presentation: http://euclid.roe.ac.uk/projects/eucrma/wiki/20131411DMWSconf
- Dictionary of types: https://apceucliddev.in2p3.fr/jenkins/job/Dictionary/ws/eXist/dictionary.html
- Configuration management & best practices: http://euclid.roe.ac.uk/projects/eucrma/wiki#Configuration-management
11. What should my DataModel contain?
Your DataModel should contain (must have):
- definitions of pipeline inputs
- definitions of output products
- definitions of intermediate elements used in your code
Your DataModel can use:
- new elements you define
- already existing elements
- dataContainers for files with no specific definition
<sgs:dataContainer> fields: ID, Filename, StorageNode, Path
Find information
- Fits DataModel (see dictionary and interfaces): schema/trunk/Dictionary/pro/sim/euc-test-ousim-tips.xsd
- DM Workshop DataContainer presentation: http://euclid.roe.ac.uk/attachments/download/2765
- DM wiki homepage: http://euclid.roe.ac.uk/projects/eucrma/wiki
12. What software can I use to write XML?
Of course, any text editor allows you to simply read and write XML
One of these two powerful XML development environments is recommended:
- Altova XMLSpy (license from 400€)
- Oxygen XML Editor (license from $99, 30-day free trial)
Project-oriented browsing, handles dependencies between files
Content completion for elements, attributes & values
XML validation and detection of errors
Schema modeling with graph representation
Find information
- Altova XMLSpy: http://www.altova.com/xmlspy.html
- Oxygen XML Editor: http://www.oxygenxml.com/
13. How can I check if my DataModel is correct?
Use Oxygen or XMLSpy to validate your XML and XML Schema files
Well-formed XML: correct language syntax
Document validation: XML conforms to the XML Schema definition
Use Euclid Data Model Checker tools
Check compliance with Euclid DataModel rules
Python module & scripts available in Euclid SVN (EC/SGS/ST/4-2-05-DM/tools/trunk/DataModelChecker)
Find information
- Altova XMLSpy: http://www.altova.com/xmlspy.html
- Oxygen XML Editor: http://www.oxygenxml.com/
- Official Euclid XML rules: http://euclid.roe.ac.uk/dmsf/eucrma?folder_id=47
- DM Workshop presentation: http://euclid.roe.ac.uk/attachments/download/2762/DM-Rules.pdf
- DataModelChecker readme (SVN): EC/SGS/ST/4-2-05-DM/tools/trunk/DataModelChecker/doc
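As a rough illustration of the difference between a well-formedness check and a rule check, here is a minimal Python sketch using only the standard library. It is not the Euclid DataModelChecker and does not reproduce its interface: `is_well_formed` and `count_root_elements` are hypothetical helpers, and reading the "single root element" rule as "exactly one top-level xs:element declaration per schema file" is an assumption made for this example, not the official wording of the rule.

```python
import xml.etree.ElementTree as ET

# Namespace of XML Schema itself, in ElementTree's {uri}tag notation.
XS = "{http://www.w3.org/2001/XMLSchema}"


def is_well_formed(text):
    """Return True if the text parses as XML (correct language syntax)."""
    try:
        ET.fromstring(text)
        return True
    except ET.ParseError:
        return False


def count_root_elements(xsd_text):
    """Count top-level xs:element declarations in a schema (hypothetical rule check)."""
    schema = ET.fromstring(xsd_text)
    return len(schema.findall(f"{XS}element"))


# Example schema with two top-level elements; under the assumed reading of the
# "single root element" rule, a checker would flag this file.
TWO_ROOTS = """<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:element name="a" type="xs:float"/>
  <xs:element name="b" type="xs:float"/>
</xs:schema>"""
```

A file can pass the first check and still fail the second: `TWO_ROOTS` is well-formed XML, yet `count_root_elements(TWO_ROOTS)` is not 1.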
14. How can I use the DataModel in my code?
In your pipeline code, you might want to:
- Read and modify existing XML files
- Produce new XML files
- Manipulate data as specified in the DataModel (no XML file)
Multiple ways to do that:
- Manually parse XML files (must be avoided)
- Use XPath and XML libraries (Python lxml)
- Use bindings generation (preferred way)
Bindings:
- XML Schema elements become class definitions
- XML product becomes an object instance
[Figure: pipeline code uses the Data Model, with XML files going in and out]
Find information
- XML data bindings resources: http://www.rpbourret.com/xml/XMLDataBinding.htm
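The library-based option can be sketched in a few lines of Python. The slide names lxml; this example uses the standard library's xml.etree.ElementTree instead, whose API is largely compatible with lxml.etree but supports only a subset of XPath. The coord document is the one from the earlier slides, and the helper names `shift_x` and `make_coord` are made up for this illustration.

```python
import xml.etree.ElementTree as ET

SOURCE = "<coord><x>12.05</x><y>3.1</y></coord>"


def shift_x(xml_text, offset):
    """Read an existing coord document, modify its x value, and serialize it again."""
    root = ET.fromstring(xml_text)
    x = root.find("./x")  # simple XPath-style lookup
    x.text = str(float(x.text) + offset)
    return ET.tostring(root, encoding="unicode")


def make_coord(x, y):
    """Produce a new coord document from two numbers."""
    root = ET.Element("coord")
    ET.SubElement(root, "x").text = str(x)
    ET.SubElement(root, "y").text = str(y)
    return ET.tostring(root, encoding="unicode")
```

Note that the element names and the string-to-float conversions are hard-coded in the calling code here, so nothing ties them to the XSD. Generated bindings remove that duplication.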
15. How do I use XML data bindings?
Two XML binding libraries available for Euclid
- For Python, based on PyXB
- For C++, based on CodeSynthesis XSD
First step: generate classes from the DataModel
[Figure: the DataModel XML Schema (.xsd files) is turned into C++ classes (.hxx & .cxx) and Python classes (.py); scripts shown: generateStubs.py, generate_allbindings.sh]
Second step: use generated classes in your own code
Create and access elements as you would with ordinary classes/objects
Find information
- Python Bindings library: (SVN)/EC/SGS/ST/4-2-05-DM/tools/trunk/PythonBinding
- C++ Bindings library: (SVN)/EC/SGS/ST/4-2-05-DM/tools/trunk/CppBinding
- DMWorkshop Python bindings presentation: http://euclid.roe.ac.uk/attachments/download/2734
- DMWorkshop C++ bindings presentation: http://euclid.roe.ac.uk/attachments/download/2745 & http://euclid.roe.ac.uk/attachments/download/2773
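To give a feel for what working with a binding looks like, here is a hand-written Python stand-in for a class that a generator might produce from the coord schema of the earlier slides. It is not output from PyXB or CodeSynthesis XSD and does not follow their actual APIs; the class `Coord` and its methods `from_xml` and `to_xml` are assumptions made for illustration only.

```python
from dataclasses import dataclass
import xml.etree.ElementTree as ET


@dataclass
class Coord:
    """Stand-in for a generated class: the schema element becomes a class."""
    x: float
    y: float

    @classmethod
    def from_xml(cls, xml_text):
        """Build an object instance from a coord XML document."""
        root = ET.fromstring(xml_text)
        if root.tag != "coord":
            raise ValueError(f"expected <coord>, got <{root.tag}>")
        return cls(x=float(root.findtext("x")), y=float(root.findtext("y")))

    def to_xml(self):
        """Serialize the object back to a coord XML document."""
        root = ET.Element("coord")
        ET.SubElement(root, "x").text = repr(self.x)
        ET.SubElement(root, "y").text = repr(self.y)
        return ET.tostring(root, encoding="unicode")
```

Pipeline code then works with plain attributes: `c = Coord.from_xml(text)`, `c.y = 4.0`, `c.to_xml()`. With real bindings the class is regenerated whenever the schema changes, so the code and the DataModel stay in step.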
16. Can I get pre-configured tools all at once?
We are building a virtual machine you can use on your own computer
Based on Scientific Linux 6 (OS supported for Euclid)
Linked to Euclid CODEEN yum repository for package installation
Linked to Euclid SVN for source code checkin/checkout
Containing:
- Required software libraries
- Pre-configured development environment
- C++ & Python bindings generation libraries
- Data Model Checker tools
- … and more
Still in development, hopefully available soon
Find information
- CODEEN yum packages list: https://apceuclidrepo.in2p3.fr/nexus/content/groups/el6.euclid/
- Virtualbox virtualization tool: https://www.virtualbox.org/
- VMWare virtualization tool: http://www.vmware.com/fr/products/player/
17. In the next episode…
TIPS DataModel from its creation to the pipeline code
Stay tuned!