Meletis Belsis - Workflow based Incident Management Model
1. Workflow Based Security Incident Management
Meletis A. Belsis (1), Alkis Simitsis (2), Stefanos Gritzalis (1)
(1) University of the Aegean
Dept. of Information and Communication Systems Eng.
meletis_belsis@yahoo.com, sgritz@aegean.gr
(2) National Technical University of Athens
Dept. of Electrical and Computer Engineering
asimi@dbnet.ece.ntua.gr
2. M. Belsis, A. Simitsis, S. Gritzalis @ PCI'05, Volos, 13/11/2005 2
Outline
Introduction
Incident Collection
ETL Workflows
System Architecture for the Incident Management
Conclusions
3. Introduction
A security incident is a set of events that involves an attack or series of attacks at one or more sites (John D. Howard)
Security incidents are not a one-step process
a security incident is a set of events
it involves an attack or a series of attacks
at one or more sites
it may involve one or more criminals
it may take place at different times
it may originate from different geographical locations
Storing such incident information is an invaluable tool for users, administrators, and managers.
4. Background
Today many incident databases exist
Most of them follow the Balkanised Model
Examples include
IBM’s VuLDA
NIST ICAT
Ohio University IDB
Many efforts have been made to form a central approach to incident information storage
CERT/CC
Europe S3000
Open Vulnerability and Assessment Language (OVAL)
CERIAS Incident Response Database (CIRDB)
Incident Object Description and Exchange Format (IODEF)
5. Background
Figure: IODEF Incident Data Model
6. Motivation
Current incident databases use different schemas and formats.
Today experts and law enforcement units require the complete picture of an incident before taking decisions.
Unfortunately, forcing experts around the world to use a common structure is difficult, if possible at all.
What is needed is an infrastructure that can collect and integrate information from different incident databases
Delivering such an infrastructure requires solving a number of problems
gathering
export snapshots/differentials
transportation
transformations
cleaning issues
efficient loading
7. Contributions
We employ advanced database techniques to tackle the problem of designing a centralized incident DBMS
We identify the main problems underlying the population of a central incident database
We propose a method based on ETL workflows for the incremental maintenance of such a centralized database
We present a framework for incident correlation that keeps track of a full attack whose component incidents are stored in different databases
8. Outline
Introduction
Incident Collection
ETL Workflows
System Architecture for the Incident Management
Conclusions
9. Incident Collection
10. Incident Collection
In terms of the transformation tasks, there are two
main classes of problems
conflicts and problems at the schema level
data level transformations (i.e., at the instance level)
More specifically
Naming conflicts
homonyms
synonyms
Structural conflicts
Data formatting
String Problems
‘Hewlett Packard’ vs. ‘HP’ vs. ‘Hioulet Pakard’
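The naming and string conflicts listed above can be illustrated with a minimal Python sketch. It is not the authors' implementation: the field names in `FIELD_SYNONYMS`, the vendor lists, and the `0.6` similarity cutoff are all hypothetical values chosen for illustration. Synonymous field names are mapped to one canonical name, known abbreviations are resolved through an alias table, and misspellings are matched approximately with the standard library's `difflib`.

```python
import difflib

# Hypothetical schema-level synonyms: different source names, one canonical field.
FIELD_SYNONYMS = {
    "attacker_ip": "source_address",
    "src_ip": "source_address",
    "manufacturer": "vendor",
}

# Hypothetical instance-level reference data.
VENDOR_ALIASES = {"hp": "Hewlett Packard"}
CANONICAL_VENDORS = ["Hewlett Packard", "Microsoft", "Cisco Systems"]


def normalize_vendor(raw, cutoff=0.6):
    """Map an abbreviation or misspelling to a canonical vendor name."""
    key = " ".join(raw.lower().split())  # lower-case and collapse whitespace
    if key in VENDOR_ALIASES:
        return VENDOR_ALIASES[key]
    by_lower = {name.lower(): name for name in CANONICAL_VENDORS}
    # Approximate matching; the cutoff is an assumption and needs tuning.
    matches = difflib.get_close_matches(key, list(by_lower), n=1, cutoff=cutoff)
    return by_lower[matches[0]] if matches else raw


def normalize_record(record):
    """Rename synonymous fields and clean the vendor value."""
    out = {}
    for field, value in record.items():
        canonical = FIELD_SYNONYMS.get(field, field)
        out[canonical] = normalize_vendor(value) if canonical == "vendor" else value
    return out
```

Homonyms (the same name with different meanings in different sources) cannot be resolved by a lookup table alone; a per-source mapping would be needed, which this sketch omits.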
11. Incident Collection
A problem
the time window for populating the centralized database is too small to repeat the same job more than once
... a solution
instead of extracting, transforming, and loading all the data, we are interested only in those incident records that have changed since the last execution of the process
this means that we are interested only in the incident data that are
newly inserted
updated
deleted
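The incremental idea above can be sketched as a comparison of two snapshots of a source database. This is a minimal illustration, assuming each snapshot is a dictionary keyed by a hypothetical incident identifier; the function name `diff_snapshots` is not from the presentation.

```python
def diff_snapshots(previous, current):
    """Return (inserted, updated, deleted) records between two snapshots.

    Both arguments map an incident id to its record (any comparable value).
    """
    inserted = {k: current[k] for k in current.keys() - previous.keys()}
    deleted = {k: previous[k] for k in previous.keys() - current.keys()}
    updated = {
        k: current[k]
        for k in current.keys() & previous.keys()
        if current[k] != previous[k]
    }
    return inserted, updated, deleted


# Usage: only the three differential sets need to travel to the central database.
old = {1: {"type": "dos"}, 2: {"type": "scan"}, 3: {"type": "worm"}}
new = {1: {"type": "dos"}, 2: {"type": "probe"}, 4: {"type": "phishing"}}
inserted, updated, deleted = diff_snapshots(old, new)
```

Comparing full snapshots is only one way to obtain differentials; sources that keep change logs or timestamps could export them directly.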
12. Outline
Introduction
Incident Collection
ETL Workflows
System Architecture for the Incident Management
Conclusions
13. M. Belsis, A. Simitsis, S. Gritzalis @ PCI'05, Volos, 13/11/2005 13
ETL Workflows
μαζί με το σχήμα λες τα εξής:
In this figure we abstractly describe the general framework for
ETL workflows.
In the left side, we can observe the original data stores (Sources)
that are involved in the overall process. Typically, data sources
are relational databases and files.
The data from these sources are extracted by specialized
routines or tools, which provide either complete snapshots or
differentials of the data sources.
Then, these data are propagated to the data staging area (DSA)
where they are transformed and cleaned before being loaded into
the data warehouse. Intermediate results, again in the form of
(mostly) files or relational tables are part of the data staging area.
The central database DW is depicted in the right part of the figure
and comprises the target data stores. The loading of the central
warehouse is performed by the loading activities depicted on
the right side, just before the DW data store.
ETL Workflows
More information can be found at:
http://www.dblab.ntua.gr/~asimi/
ETL Workflows
Extraction-Transformation-Loading (ETL) tools
can be used to facilitate the population of a centralized
incident database from several different incident DBs
are pieces of software responsible for the extraction of data
from several sources, their cleansing, their customization,
their transformation in order to fit business needs, and finally,
their loading into a central DB
their most prominent tasks include
the identification of relevant information at the source side
the extraction of this information
the transportation of this information to the Data Staging Area
(DSA), where all the transformations take place
the transformation (i.e., customization and integration) of the
information coming from multiple sources into a common format
the cleaning of the resulting data set, on the basis of database
and business rules
the propagation and loading of the data to a central DB
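The tasks listed above can be sketched as a small pipeline. This is a minimal illustration under stated assumptions, not the authors' tool: the two source layouts, the field names, the cleaning rule, and the target table `incident` are all hypothetical, and an in-memory SQLite database stands in for the central DB.

```python
import sqlite3

# Two hypothetical source incident DBs with different schemas and formats.
SOURCE_A = [{"id": 1, "attack": "Port Scan ", "reported": "2005-11-13"}]
SOURCE_B = [
    {"incident_no": 7, "category": "dos", "date": "13/11/2005"},
    {"incident_no": 8, "category": "", "date": "12/11/2005"},
]

def extract():
    """Identify and extract the relevant records at the source side."""
    for row in SOURCE_A:
        yield "A", row
    for row in SOURCE_B:
        yield "B", row

def transform(source, row):
    """Bring records from each source into one common format."""
    if source == "A":
        return {"source": "A", "source_id": row["id"],
                "kind": row["attack"], "day": row["reported"]}
    d, m, y = row["date"].split("/")  # source B uses DD/MM/YYYY
    return {"source": "B", "source_id": row["incident_no"],
            "kind": row["category"], "day": f"{y}-{m}-{d}"}

def clean(record):
    """Apply an assumed business rule: a record must have a non-empty kind."""
    record["kind"] = record["kind"].strip().lower()
    return record if record["kind"] else None

def load(conn, records):
    """Propagate the cleaned records to the central DB."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS incident ("
        "source TEXT, source_id INTEGER, kind TEXT, day TEXT, "
        "PRIMARY KEY (source, source_id))"
    )
    conn.executemany(
        "INSERT OR REPLACE INTO incident "
        "VALUES (:source, :source_id, :kind, :day)",
        records,
    )
    conn.commit()

def run_etl(conn):
    # The list plays the role of the data staging area (DSA).
    staging = [transform(src, row) for src, row in extract()]
    cleaned = [r for r in map(clean, staging) if r is not None]
    load(conn, cleaned)
    return len(cleaned)

conn = sqlite3.connect(":memory:")
loaded = run_etl(conn)
rows = conn.execute(
    "SELECT source, source_id, kind, day FROM incident "
    "ORDER BY source, source_id"
).fetchall()
```

Keeping each task in its own function mirrors the separation between extraction, staging, transformation, cleaning, and loading described on the slide.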
System Architecture
The proposed system is based on the
OMG’s CORBA architecture.
CORBA allows for the addition of new
services on demand.
CORBA is transparent to client
applications, OS, and platform.
Registered law enforcement units will
be able to access incident information
through the Web
Data will be collected from
CSIRT databases on a daily basis
www.dcs.fmph.uniba.sk
System Architecture
Incident data are protected
during transit using
CORBA’s Security Service
Protocol (SECP) with the SSL
protocol
The final CORBA security API
will provide Security at level 3
with Common Secure
Interoperability at level 0, in order
to disallow privilege delegation.
Conclusions
This research delivers a framework for automated
incident information collection.
The collection and correlation of incident-related data
is vital
Incident data collected from different sources need to
be cleaned and homogenized before being centrally
stored.
We try to minimize the time window between the
appearance of an incident and its worldwide
publication.
Automated correlation of incident information will allow
law enforcement units to pursue the criminals
Future Work
Select an incident structure able to store information
received from diverse databases
We are currently reviewing two potential candidates: IODEF
and IDM.
Optimization of the ETL process to enable incident
information correlation during the collection process
Correlation of information stored on the central
database using data mining techniques
Allow the public community to securely access
incident information using personalized database
views
Thank You!