www.eudat.eu
EUDAT receives funding from the European Union's Horizon 2020 programme - DG CONNECT e-Infrastructures. Contract No. 654065
Using the EGI Fed-Cloud for Data Analysis
Giuseppe La Rocca
giuseppe.larocca@egi.eu
Technical Outreach Expert
EUDAT Summer School, 3-7 July 2017, Crete
Agenda
• Background information about EGI
– Mission and infrastructure
– Members & partners
– EGI services
• Introduction of the EGI Federated Cloud Infrastructure
– Architecture
• Linking EUDAT services to the EGI Fed-Cloud

EGI: A sustainable e-Infrastructure for Open Science
• EGI Foundation: www.egi.eu/about/egi-foundation/
• Major national e-Infrastructures: 22 NGIs + 1 EIRO (CERN)
• EGI is a federation of over 300 computing and data centres spread across 56 countries in Europe and worldwide

International Partnerships
• Africa and Arabia: Council for Scientific and Industrial Research, South Africa
• India: Centre for Development of Advanced Comp.
• China: Inst. of HEP, Chinese Academy of Sciences
• Latin America: Universidade Federal do Rio de Janeiro
• Ukraine: Ukrainian National Grid
• USA
• Canada
• Asia Pacific Region: Academia Sinica at Taiwan

EGI today
• 23 Cloud providers, 300+ HTC providers
• 15 types of services
• 1.7 Million jobs/day
• 2.6 Billion CPU hours/year
• 48,000+ users

EGI serves researchers and innovators
Axis: size of individual groups
• ESFRIs, FET flagships: WLCG, ELI, CTA, ELIXIR, EPOS, EISCAT_3D, BBMRI, CLARIN, LOFAR, EMSO, LifeWatch, ICOS, CORBEL, ENVRIplus, …
• Multinational communities (e.g. H2020 projects): VRE projects, OpenDreamKit, WeNMR, DRIHM, VERCE, MuG, AgINFRA, CMMST, LSGC, SuperSites Exploitation, Environmental sci., neuGRID, …
• ‘Long tail of science’: PeachNote, CEBA Galaxy eLab, Semiconductor design, Main-belt comets, Quantum physics studies, Virtual imaging (LS), Bovine tuberculosis spread, Convergent evol. in genomes, Geography evolution, Seafloor seismic waves, 3D liver maps with MRI, Metabolic rate modelling, Genome alignment, Tapeworm infection in fish, …
• Industry, SMEs: Agroknow, CloudEO, CloudSME, Ecohydros, gnubila, Sinergise, SixSq, TEISS, Terradue, Ubercloud, …

Access to EGI resources: Virtual Organisations
VO memberships and resources access with X.509 certificates
• VO 1 (site a, b, c)
• VO 2 (site x, y, z, b)
1. Generic VOs, such as fedcloud.egi.eu: test VOs, “incubator” for new users
2. Community/discipline-specific VOs, e.g. Chipster, Highthroughputseq, EISCAT, etc.
3. Training VO (training.egi.eu): running hands-on training in the cloud (about any software!)
Browse and search VOs at http://operations-portal.egi.eu/vo/search
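
The relationship between VOs and sites on this slide can be sketched as a small lookup table. This is only an illustration: the VO and site names are the placeholders from the slide, and `sites_for` and `vos_at` are hypothetical helpers, not part of any EGI tool.

```python
# Placeholder VO -> sites mapping taken from the slide (not real site names).
VO_SITES = {
    "VO 1": {"a", "b", "c"},
    "VO 2": {"x", "y", "z", "b"},
}

def sites_for(vo):
    """Return the sites that support the given VO."""
    return sorted(VO_SITES.get(vo, set()))

def vos_at(site):
    """Return the VOs supported by a site; one site can serve several VOs."""
    return sorted(vo for vo, sites in VO_SITES.items() if site in sites)

print(vos_at("b"))  # ['VO 1', 'VO 2']
```

Site b appears in both VOs, which is the point of the diagram: a resource centre can support more than one community at the same time.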

Resources allocation to Virtual Organisations (VOs)
[Diagram] Actors: project/community representing the VO; negotiator; grid, cloud, storage and application providers
[Diagram] Labels: service requirements (type, number, size, cost, availability, etc.); conditions; Service Level Agreement; Operation Level Agreement; performance reports; support; training; satisfaction review (every 6 months); send list of publications

Resources committed
https://www.egi.eu/federation/committed-resources/
Customer    Start        End          Service Type                      Providers
BioISI      Aug. 2016    Jan. 2018    Cloud Compute
NBIS/BILS   Dec. 2015    Dec. 2017    Cloud Compute
DARIAH      Apr. 2016    Sept. 2017   Cloud Compute, Online Storage
DRIHM       Jan. 2016    Jan. 2018    High-Throughput
D4Science   Sept. 2016   Dec. 2017    Cloud Compute
EMSODEV     Sept. 2016   Dec. 2017    Cloud Compute
EXTraS      May 2016     Jan. 2018    Cloud Compute
LSGC        May 2016     Jan. 2018    High-Throughput, Cloud Compute
MoBrain     Jan. 2016    Jan. 2018    High-Throughput, Cloud Compute
Peachnote   Apr. 2016    Sept. 2017   Cloud Compute, Online Storage
Terradue    Jan. 2016    Jan. 2018    Cloud Compute, Online Storage

The EGI services – A wide offer of services for Research and Innovation

The EGI Service Catalogue
www.egi.eu/services

High-Throughput Compute
Execute thousands of computational tasks to analyse large datasets
Main features of High-Throughput Compute:
• Access to high-quality computing resources
• Integrated monitoring and accounting tools to provide information about the availability and resource consumption
• Workload and data management tools to manage all computational tasks
• Large amounts of processing capacity over long periods of time
See High-Throughput Compute for service information and request

Powered by High-Throughput Compute: HADDOCK
• A web portal offering tools for structural biologists
• Used to model the structure of proteins and other molecules
• So far, HADDOCK has processed more than 130,000 submissions from over 7,500 scientists
• World-wide: > 120,000 CPU cores from 41 sites (EGI & OSG)
[Diagram] Components: HADDOCK Portal; Workload manager (DIRAC); EGI Clusters (CPU and GPU)

Cloud Compute
Run virtual machines on-demand with complete control over the computing resources
• Execute compute- and data-intensive workloads
• Host long-running services (e.g. web servers or databases)
• Create disposable testing and development environments
• Select virtual machine configurations to fit your requirements
• Manage your Cloud Compute resources in a flexible way with integrated monitoring and accounting capabilities
See Cloud Compute for service information and request

Powered by Cloud Compute
• When a human cell meets Salmonella: K. Förstner, Univ. Würzburg, used Cloud Compute to run a pipeline for the analysis of sequencing data. Nature (doi:10.1038/nature16547)
• The EXTraS project: implementing four software pipelines to harvest data collected on board ESA’s space observatory XMM-Newton

Cloud Container Compute
Run Docker containers in a lightweight virtualised environment
Main features of Cloud Container Compute:
• On-demand provisioning
• Lightweight environment for maximised performance
• Standard interface to deploy on multiple service providers
• Interoperable and transparent
• Removes friction between development and operations environments
See Cloud Container Compute for service information and request

Summary and comparison of the “Compute services”
High-Throughput
• For batch compute “jobs”
• Jobs must be grid-enabled
• To run parallel-based applications on large scale resource providers
Cloud Compute
• For compute- or data-intensive tasks and hosting online services
• For batch and interactive compute
• Full flexibility with SW
• Lower IT costs, reduce infrastructure complexity, enhance flexibility and deliver high-level services
• Easy to scale up according to customer’s needs
Cloud Container
• For compute- or data-intensive tasks and hosting online services
• Most light-weight
• Fast VM/application start-up
• Containers isolate applications from the underlying infrastructure

Online Storage
Store, share and access your files and their metadata on a global scale
Main features of Online Storage:
• Assign global identifiers to files
• Access highly-scalable storage from anywhere
• Control the data you share
• Organise your data using a flexible hierarchical structure
See Online Storage for service information and request

Archive Storage
Back up your data for the long term and future use in a secure environment
Main features of Archive Storage:
• Store large amounts of data
• Free up your online storage
• Store data for long-term retention
See Archive Storage for service information and request

The EGI Federated Cloud Infrastructure

The EGI Federated Cloud Infrastructure
• Grid of clouds!
• Unified user interfaces
• Harmonised operational behaviour
• Clouds and their interconnections are based on open standards, open technologies

Benefits, technologies
Harmonised operation
• Cloud registry
• Information system
• Virt. Machine marketpl.
• Usage accounting
• Access control
Uniform user interfaces
• VM and block storage management: OCCI on every site; OpenStack Nova on OS sites
• Object storage management (optional): CDMI on any site; OpenStack SWIFT on OS sites
Federation types: standard-based federation; OpenStack federation

Federated Cloud Model
[Diagram] Components: Community Platforms; AppDB; VMOps; IaaS Federated Access Tools; EGI AAI; three sites, each with a Cloud Management Framework and an IaaS API
EGI Federation services: Accounting, Monitoring, Configuration Database, Information Discovery, VM Marketplace

A view on the current infrastructure
Today:
• 23 providers from 14 NGIs
– 15 OpenStack
– 7 OpenNebula
– 1 Synnefo
• VOs: 34
– Catch-all VOs: 7
– Domain-spec: NGS, …

Different modus operandi
• Compute and data intensive workloads
– Batch and interactive (e.g. Jupyter Notebooks) with scalable and customized environments
– Examples: The Genetics of Salmonella Infections, The Chipster platform
• Service Hosting
– Long-running services (e.g. web server, database, application server)
– Examples: NBIS Web Services, Peachnote analysis platform, The VERCE platform
• Datasets repository
– Store and manage large datasets (in a storage volume)
• Disposable and testing environments
– Host training environments, test applications
– Examples: Events conducted on the cloud-based EGI Training Infrastructure

How to access the EGI FedCloud?
Access to the resources:
• Obtain a personal X.509 access certificate from a recognised Certification Authority. Terena Certificate Service (online): https://www.digicert.com/sso
• Join the fedcloud.egi.eu VO, which serves as a test ground for users to try the EGI cloud and to prototype and validate applications.
[Diagram] You; CA (obtain certificate: once; renew certificate: annually); Virtual Organisation with VO manager, membership service and user database (register; join VO: once); cloud sites (DB replication once a day; use resources)
Remarks: After the 6-month long membership in the fedcloud.egi.eu VO, you will need to move to a production VO, or establish a new VO.

How to interact with the EGI FedCloud?
• Open Standards Realm
– Uses OCCI 1.2 interface
– Ruby and Java SDKs available
– Simple CLI tool for managing resources
• OpenStack Realm
– Native OpenStack API with VOMS AuthN/AuthZ
– Plugin for Python SDK and OpenStack CLI

A typical workflow
VO Manager:
• Endorses available images
• Includes images in the VO

The EGI Applications Database
• The EGI Applications Database (AppDB) is a central service that stores and provides information about:
– Software solutions in the form of native software products, virtual appliances and/or software appliances,
– The programmers and scientists who are involved, and
– Publications derived from the registered solutions.
Virtual Appliances

Two different storage solutions
The EGI Federated Cloud Infrastructure offers two different storage solutions
EGI FedCloud Storage:
• Block Storage
• Object Storage

Block Storage
Persistent block level storage to use with VMs
Simple usage
• Use as any other block device from VMs
• Snapshotable
High Performance
• Consistent and low-latency performance
• SSDs (in some sites)
Scale to your needs
• From GB to TB
• Create and attach to VMs on demand

Object Storage
Data storage infrastructure for storing and retrieving data from anywhere at any time
API Access
• Simple REST APIs for managing and accessing data
Scalable
• Store as much data as needed
• Get accounted only for the space used
Sharing
• Define ACLs on each object, share your data publicly

Block Storage vs Object Storage
Access
• Block Storage: only from within a VM, and only at the same site where the VM is located
• Object Storage: from any device connected to the internet
Sharing
• Block Storage: not possible
• Object Storage: possible (data can be kept private or public)
Accounting
• Block Storage: for the entire volume, regardless of how much of it is actually used in the VM
• Object Storage: only for the data stored
Integration
• Block Storage: POSIX access, easy with any application capable of writing/reading files from a local disk
• Object Storage: requires a client to be integrated within the application
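
The accounting difference in this comparison can be made concrete with a short sketch. The function name `accounted_gb` and the example sizes are assumptions chosen for illustration; actual accounting is done by the EGI accounting service, not by user code.

```python
def accounted_gb(kind, volume_gb=0.0, stored_gb=0.0):
    """Space accounted to the user, following the comparison above."""
    if kind == "block":
        # The whole volume is accounted, whether or not it is full.
        return volume_gb
    if kind == "object":
        # Only the data actually stored is accounted.
        return stored_gb
    raise ValueError(f"unknown storage kind: {kind!r}")

# Hypothetical numbers: 120 GB of data, held either in a 500 GB volume
# or in an object store.
print(accounted_gb("block", volume_gb=500, stored_gb=120))   # 500
print(accounted_gb("object", stored_gb=120))                 # 120
```

With these numbers the block volume is accounted at 500 GB while the object store is accounted at 120 GB, so a sparsely filled volume costs more in accounting terms than the same data kept as objects.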

Block Storage: OCCI
• OCCI (Open Cloud Computing Interface) is an OGF standard API to facilitate interoperable access to cloud resources
• Block storage in FedCloud is managed via OCCI:
– Create/Delete volumes
– Attach/Detach (link/unlink in OCCI terms) to VMs
– Once attached, use as any other disk in the VM

Object Storage: CDMI
• FedCloud object storage is managed via CDMI (Cloud Data Management Interface)
– RESTful API for operations on storage objects
– Developed by SNIA, now ISO/IEC 17826
• Very flexible API, based on capabilities:
– Object basic capabilities (create/get/delete/list)
– Object ACLs
– Import from external sources, export as filesystems
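
As a rough illustration of the "create" capability, the sketch below builds (without sending) a CDMI request for a new data object, using only the Python standard library. The endpoint `https://cdmi.example.org`, the container name and the helper `cdmi_create_object` are hypothetical, and the specification version 1.0.2 is an assumption; a real FedCloud endpoint would also require authentication, which is not shown.

```python
import json
from urllib.request import Request

def cdmi_create_object(endpoint, path, text, version="1.0.2"):
    """Build (but do not send) a CDMI request that creates a data object."""
    body = json.dumps({"mimetype": "text/plain", "value": text}).encode("utf-8")
    return Request(
        url=f"{endpoint.rstrip('/')}/{path.lstrip('/')}",
        data=body,
        method="PUT",
        headers={
            # CDMI requests announce the spec version and use CDMI media types.
            "X-CDMI-Specification-Version": version,
            "Content-Type": "application/cdmi-object",
            "Accept": "application/cdmi-object",
        },
    )

# Hypothetical endpoint and container, for illustration only.
req = cdmi_create_object("https://cdmi.example.org", "/mycontainer/hello.txt", "hello")
print(req.get_method(), req.full_url)
```

Passing the request to `urllib.request.urlopen` would actually send it; that step is left out here because it needs a reachable server and valid credentials.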

State of the art: Block Storage
• Block storage is supported on all FedCloud CMFs and sites
                                   OpenStack   OpenNebula   Synnefo
OCCI basic operations              Yes         Yes          Yes
OCCI advanced (resize, snapshot)   No          No           No
Native API advanced                Yes         Partial      Yes

State of the art: Object Storage
• CDMI support
– CDMI server framework by Synnefo
– Ongoing effort to support OpenStack
– Basic client available
• Native APIs allow basic and advanced capabilities
                        OpenStack     Synnefo   OpenNebula
CDMI basic operations   In Progress   Yes       N/A
Native API              Yes           Yes       N/A

How to manage datasets in the EGI Federated Cloud?
[Diagram] Data providers, each with a local dataset
VO Manager:
• Endorses available images
• Includes images in the VO

The EGI DataHub
A Data as a Service (DaaS) to implement the EGI Open Data Platform (ODP)
• EGI Open Data Platform (ODP)
– Support EC Open Data Cloud vision
– Integrate different data repositories available in a distributed environment
– Offer the functionalities to make data open and link them to Open Data Catalogues
• OneData
– Software stack for distributed data management platform

Open Data Platform – The big picture
[Diagram] Users and external services: EGI User 1 (VO x); EGI User 2 (Onedata space); Anonymous User 1; Anonymous User 2; Space Manager; Community Portal; DOI Registrar (e.g. DataCite)
[Diagram] Open Data Platform interfaces: Web GUI, POSIX, HTTP, OAI-PMH, CDMI, REST
[Diagram] Platform components: Space Manager; Open Data Manager; Metadata Registry; OAI-PMH Data Provider; Authentication and Authorization
[Diagram] Storage: EGI Site 1; EGI Site 2; EGI Site 3; Cloud storage; EUDAT
[Diagram] Other labels: Long Term Retention; Generate AIP package for abc

Open Data Platform - Interfaces
• GUI: web based; easy data management and sharing, access control; publication of data items and collections
• REST: advanced data and collection management; API for integration with community tools and portals
• CDMI: standard data management operations; advanced metadata queries; integration with future data management applications
• POSIX: enables direct mounting of spaces in the local filesystem without full data transfer
• OAI-PMH: OAI Data Provider interface; Dublin Core metadata by default; more complex metadata can be registered in ODP manually
• HTTP: direct download of open data from URLs
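
To show what harvesting through the OAI-PMH interface looks like on the wire, here is a minimal sketch that builds a request URL. The base URL `https://odp.example.org/oai_pmh` and the helper `oai_pmh_url` are hypothetical; `verb`, `ListRecords` and the `oai_dc` (Dublin Core) metadata prefix come from the OAI-PMH protocol itself.

```python
from urllib.parse import urlencode

def oai_pmh_url(base_url, verb, **params):
    """Build an OAI-PMH request URL; arguments are passed as query parameters."""
    query = urlencode({"verb": verb, **params})
    return f"{base_url}?{query}"

# Hypothetical endpoint; oai_dc is the Dublin Core format that every
# OAI-PMH repository must support.
print(oai_pmh_url("https://odp.example.org/oai_pmh", "ListRecords", metadataPrefix="oai_dc"))
# https://odp.example.org/oai_pmh?verb=ListRecords&metadataPrefix=oai_dc
```

A harvester would fetch this URL over HTTP and parse the XML response; other verbs such as `Identify` or `GetRecord` are built the same way.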

Linking EUDAT services to the EGI Federated Cloud

How to link EUDAT services to EGI FedCloud
Create your VM topology with the EGI VMOps dashboard
– Access the EGI VMOps dashboard and create your VM to interact with EUDAT
– Select the proper VO
– Select the VM image
– Select one of the available providers
The first time you access the EGI VMOps dashboard you need to set up your profile

How to link EUDAT services to EGI FedCloud
– Select the VM flavour
– Start the VM and wait until it is in “running” status
– When the VM is in “running” status, click on View Details

How to link EUDAT services to EGI FedCloud
– Check VM details
– Download the SSH key, change its permissions and access the VM
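
The key-permission step can be sketched as follows. The user name `cloudadm` is inferred from the home directory used later in this deck, while the IP address, the temporary stand-in key file and the helper `prepare_ssh` are assumptions made for illustration.

```python
import os
import stat
import tempfile

def prepare_ssh(key_path, user, host):
    """Restrict the key to its owner and return the ssh command to run."""
    os.chmod(key_path, stat.S_IRUSR | stat.S_IWUSR)  # same as chmod 600
    return ["ssh", "-i", key_path, f"{user}@{host}"]

# Demo with a throwaway file standing in for the downloaded key.
with tempfile.NamedTemporaryFile(delete=False) as fake_key:
    key_path = fake_key.name
cmd = prepare_ssh(key_path, "cloudadm", "203.0.113.10")
print(" ".join(cmd))
os.remove(key_path)
```

OpenSSH refuses private keys that are readable by other users, which is why the permissions are tightened before connecting.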

How to link EUDAT services to EGI FedCloud
Install the EUGridPMA PGP key for apt:
]$ sudo su -
]$ wget -q -O - https://dist.eugridpma.info/distribution/igtf/current/GPG-KEY-EUGridPMA-RPM-3 | apt-key add -
Add the following lines to your /etc/apt/sources.list file:
#### EGI Trust Anchor Distribution ####
deb http://repository.egi.eu/sw/production/cas/1/current egi-igtf core
Install the ca-policy-egi-core package:
]$ sudo apt-get update
]$ sudo apt-get install -y ca-policy-egi-core

How to link EUDAT services to EGI FedCloud
Install clients in the VM to manipulate files:
]$ sudo apt-get install software-properties-common
]$ sudo add-apt-repository ppa:maarten-kooyman-6/ppa
]$ sudo apt-get update
]$ sudo apt-get install uberftp
]$ sudo apt-get install globus-gass-copy
Copy your certificate under /tmp/ to access the EUDAT server
• You need to have been granted access to the B2STAGE/B2SAFE instances
– Send the DN of your digital certificate to the B2STAGE and B2SAFE support teams
Manipulating files on B2STAGE/B2SAFE with UberFTP
]$ uberftp eudat-b2stage.pdc.kth.se
For more details, please refer to: https://linux.die.net/man/1/uberftp

How to link EUDAT services to EGI FedCloud
Manipulating files on B2STAGE/B2SAFE with globus-url-copy
• Create a simple text file in your $HOME and save it as text.txt
Upload files from the VM to the B2STAGE instance
]$ globus-url-copy -vb -cred <X509_USER_PROXY> \
file:///home/cloudadm/text.txt \
gsiftp://eudat-b2stage.pdc.kth.se/eudat.se/projects/eudat-summerschool/text.txt
Download files from B2STAGE to the VM
]$ globus-url-copy -vb -cred <X509_USER_PROXY> \
gsiftp://eudat-b2stage.pdc.kth.se/eudat.se/projects/eudat-summerschool/text.txt \
file:///home/cloudadm/text2.txt
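
When the same transfers need to be scripted, the two commands above can be generated from one helper. This is a sketch only: `transfer_cmd` is a hypothetical function, the proxy path `/tmp/x509up_u1000` is an assumed location, and the host and remote directory are copied from the slide.

```python
import shlex

B2STAGE_HOST = "eudat-b2stage.pdc.kth.se"              # from the slide
REMOTE_DIR = "/eudat.se/projects/eudat-summerschool"   # from the slide

def transfer_cmd(proxy, local_path, remote_name, upload=True):
    """Build the globus-url-copy argument list for an upload or a download."""
    local = f"file://{local_path}"  # local_path must be absolute
    remote = f"gsiftp://{B2STAGE_HOST}{REMOTE_DIR}/{remote_name}"
    src, dst = (local, remote) if upload else (remote, local)
    return ["globus-url-copy", "-vb", "-cred", proxy, src, dst]

# Assumed proxy location, for illustration only.
cmd = transfer_cmd("/tmp/x509up_u1000", "/home/cloudadm/text.txt", "text.txt")
print(shlex.join(cmd))
```

On a VM with the client installed, the list can be handed to `subprocess.run(cmd, check=True)`; building it as a list avoids shell quoting problems with unusual file names.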

How to link EUDAT services to EGI FedCloud
Delete your VM when you are done!

Documentation and wiki
https://wiki.egi.eu/wiki/Federated_Cloud_user_support
Do you need any support? Please contact us at: support@egi.eu
www.eudat.eu
Thank you for your attention.

More Related Content

Using the EGI Fed-Cloud for Data Analysis - EUDAT Summer School (Giuseppe La Rocca, EGI)

  • 1. www.eudat.eu EUDAT receives funding from the European Union's Horizon 2020 programme - DG CONNECT e-Infrastructures. Contract No. 654065 Using the EGI Fed-Cloud for Data Analysis Giuseppe La Rocca giuseppe.larocca@egi.eu Technical Outreach Expert
  • 2. EUDAT Summer School, 3-7 July 2017, Crete Agenda • Background information about EGI – Mission and infrastructure – Members & partners – EGI services • Introduction of the EGI Federated Cloud Infrastructure – Architecture • Linking EUDAT service to the EGI Fed-Cloud
  • 3. EUDAT Summer School, 3-7 July 2017, Crete EGI: A sustainable e-Infrastructure for Open Science • Major national e-Infrastructures: 22 NGIs + 1 EIRO (CERN) EGI is a federation of over 300 computing and data centres spread across 56 countries in Europe and worldwide www.egi.eu/about/egi-foundation/ EGI Foundation
  • 4. EUDAT Summer School, 3-7 July 2017, Crete Africa and Arabia Council for Scientific and Industrial Research, South Africa India Centre for Development of Advanced Comp. China Inst. Of HEP Chinese Academy of Sciences Latin America Universida de Federal do Rio de Janeiro Ukraine Ukrainian National Grid USA Canada Asia Pacific Region Academia Sinica at Taiwan International Partnerships
  • 5. EUDAT Summer School, 3-7 July 2017, Crete 23 Cloud providers, 300+ HTC providers 15 types of services 1.7 Million jobs/day 2.6 Billion CPU hours/year 48,000+ users EGI today
  • 6. EUDAT Summer School, 3-7 July 2017, Crete ESFRIs, FET flagships Size of individual groups Multinational communities, (e.g. H2020 projects) ‘Long tail of science’ WLCG ELI CTA ELIXIR EPOS EISCAT_3D BBMRI CLARIN LOFAR EMSO LifeWatch ICOS CORBEL ENVRIplus … VRE projects OpenDreamKit WeNMR DRIHM VERCE MuG AgINFRA CMMST LSGC SuperSites Exploitation Environmental sci. neuGRID … PeachNote CEBA Galaxy eLab Semiconductor design Main-belt comets Quantum pysics studies Virtual imaging (LS) Bovine tuberculosis spread Convergent evol. in genomes Geography evolution Seafloor seismic waves 3D liver maps with MRI Metabolic rate modelling Genome alignment Tapeworms infection on fish … Industry, SMEs Agroknow CloudEO CloudSME Ecohydros gnubila Sinergise SixSq TEISS Terradue Ubercloud … EGI serves researchers and innovations
  • 7. EUDAT Summer School, 3-7 July 2017, Crete VO 1 (site a, b, c) VO 2 (site x, y, z, b) 1. Generic VOs – such as fedcloud.egi.eu  Test VOs, “incubator” for new users 2. Community/discipline-specific VOs – e.g. Chipster, Highthroughtputseq, EISCAT, etc. 3. Training VO = training.egi.eu  Running hands on training in the cloud (about any software!) Browse and search VOs at http://operations-portal.egi.eu/vo/search Access to EGI resources: Virtual Organisations VO memberships and resources access with X.509 certificates
  • 8. EUDAT Summer School, 3-7 July 2017, Crete Project/Community representing the VO Negotiator Grid provider Cloud provider Operation Level Agreement Service Level Agreement Satisfaction review (every 6 months) Storage provider Service requirements Conditions Applic. provider Performance reports SupportTraining Type, number, size, cost, availability, etc. Resources allocation to Virtual Organisation (VOs) Send list of publications
  • 9. EUDAT Summer School, 3-7 July 2017, Crete Customer Start End Service Type Providers BioISI Aug. 2016 Jan. 2018 Cloud Compute NBIS/BILS Dec. 2015 Dec. 2017 Cloud Compute DARIAH Apr. 2016 Sept. 2017 Cloud Compute, Online Storage DRIHM Jan. 2016 Jan. 2018 High-Througput D4Science Sept. 2016 Dec. 2017 Cloud Compute EMSODEV Sept. 2016 Dec. 2017 Cloud Compute EXTraS May 2016 Jan. 2018 Cloud Compute LSGC May 2016 Jan. 2018 High-Throughput, Cloud Compute MoBrain Jan. 2016 Jan. 2018 High-Throughput, Cloud Compute Peachnote Apr 2016 Sept. 2017 Cloud Compute, Online Storage Terradue Jan. 2016 Jan. 2018 Cloud Compute, Online Storage #Resources committed https://ww.egi.eu/federation/committed-resources/
  • 10. EUDAT Summer School, 3-7 July 2017, Crete The EGI services – A wide offer of services for Research and Innovation
  • 11. EUDAT Summer School, 3-7 July 2017, Crete The EGI Service Catalogue www.egi.eu/services
  • 12. EUDAT Summer School, 3-7 July 2017, Crete Execute thousands of computational tasks to analyse large datasets • Access to high-quality computing resources • Integrated monitoring and accounting tools to provide information about the availability and resource consumption • Workload and data management tools to manage all computational tasks • Large amounts of processing capacity over long periods of time See High-Throughput Compute for service information and request Main features of High-Throughput Compute: High-Throughput Compute
  • 13. EUDAT Summer School, 3-7 July 2017, Crete Powered by High-Throughput Compute HADDOCK A web portal offering tools for structural biologists Used to model the structure of proteins and other molecules So far, HADDOCK processed + 130,000 submissions from over 7,500 scientists. Read more... World-wide: > 120’000 CPU cores from 41 sites (EGI & OSG) HADDOCK Portal EGI Clusters (CPU and GPU) Workload manager (DIRAC)
  • 14. EUDAT Summer School, 3-7 July 2017, Crete Run virtual machines on-demand with complete control over the computing resources • Execute compute- and data-intensive workloads • Host long-running services (e.g. web servers or databases) • Create disposable testing and development environments • Select virtual machine configurations to fit your requirements • Manage your Cloud Compute resources in a flexible way with integrated monitoring and accounting capabilities Cloud Compute See Cloud Compute for service information and request
  • 15. EUDAT Summer School, 3-7 July 2017, Crete When a human cell meets Salmonella K. Förstner, Univ. Würzburg, used Cloud Compute to run a pipeline for the analysis of sequencing data. Nature (doi:10.1038/nature16547) The EXTraS project Implement four software pipelines to harvest data collected on-board ESA’s space observatory XMM-Newton. Powered by Cloud Compute
  • 16. EUDAT Summer School, 3-7 July 2017, Crete Run Docker containers in a lightweight virtualised environment • On-demand provisioning • Lightweight environment for maximised performance • Standard interface to deploy on multiple service providers • Interoperable and transparent • Removes friction between development and operations environments. See Cloud Container Compute for service information and request Main features of Cloud Container Compute: Cloud Container Compute
  • 17. EUDAT Summer School, 3-7 July 2017, Crete Summary and comparison of the “Compute services” High-Throughput Cloud Compute Container Cloud • For batch compute “jobs” • Jobs must be grid- enabled • To run parallel- based applications on large scale resource providers • For compute- or data- intensive tasks and host online services • For batch and interactive compute • Full flexibility with SW • Lower IT costs, reduce infrastructure complexity, enhance flexibility and delivery high-level services • Easily to scale up according to customer’s need • For compute- or data- intensive tasks and host online services • Most light-weight • Fast VM/application start-up • Container isolate applications from the underlying infrastructure
  • 18. EUDAT Summer School, 3-7 July 2017, Crete Store, share and access your files and their metadata on a global scale • Assign global identifiers to files • Access highly-scalable storage from anywhere • Control the data you share • Organise your data using a flexible hierarchical structure Online Storage See Online Storage for service information and request Main features of Online Storage:
  • 19. EUDAT Summer School, 3-7 July 2017, Crete Back-up your data for the long term and future use in a secure environment Archive Storage Main features of Archive Storage: • Store large amount of data • Free up your online storage • Store data for long-term retention See Archive Storage for service information and request
  • 20. EUDAT Summer School, 3-7 July 2017, Crete The EGI Federated Cloud Infrastructure
  • 21. EUDAT Summer School, 3-7 July 2017, Crete The EGI Federated Cloud Infrastructure • Grid of clouds! • Unified user interfaces • Harmonised operational behaviour • Clouds and their interconnections are based on open standards, open technologies
  • 22. EUDAT Summer School, 3-7 July 2017, Crete Benefits, technologies Harmonised operation Cloud registry Information system Virt. Machine marketpl. Usage accounting Access control Uniform user interfaces - On every site OpenStack Nova - On OS sites CDMI - on any site • OpenStack SWIFT – on OS sites VM and block storage management: Object storage management (optional):Standard-based federation OpenStack federation
  • 23. EUDAT Summer School, 3-7 July 2017, Crete Federated Cloud Model EGI Federation services: Accounting, Monitoring, Configuration Database, Information Discovery, VM Marketplace EGI AAI Cloud Management Framework IaaS API Cloud Management Framework IaaS API Cloud Management Framework IaaS API IaaS Federated Access Tools Community PlatformsAppDB VMOps
  • 24. EUDAT Summer School, 3-7 July 2017, Crete A view on the current infrastructure Today: • 23 providers from 14 NGIs • 15 OpenStack • 7 OpenNebula • 1 Synnefo • VOs: 34 • Catch-all VOs: 7 • Domain-spec: NGS, …
  • 25. EUDAT Summer School, 3-7 July 2017, Crete Different modus operandi • Compute and data intensive workloads • Batch and interactive (e.g. Jupiter Notebooks) with scalable and customized environments • Examples: The Genetics of Salmonella Infections, The Chipster platform • Service Hosting • Long-running services (e.g. web server, database, application server) • Examples: NBIS Web Services, Peachnote analysis platform, The VERCE platform • Datasets repository • Store and manage large datasets (in a storage volume) • Disposable and testing environments • Host training environments, test applications • Examples: Events conducted on the cloud-based EGI Training Infrastructure
  • 26. EUDAT Summer School, 3-7 July 2017, Crete How to access the EGI FedCloud ? Access to the resources: Obtain a personal X.509 access certificate from a recognised Certification Authority. Terena Certificate Service: (online) https://www.digicert.com/sso Join the fedcloud.egi.eu VO serves as a test ground for users to try the EGI cloud and to prototype and validate applications. VIRTUAL ORGANISATION CA VO manager Obtain certificate: Once Renew certificate: Annually User database Cloud sites Membership service Join VO: Once DB replication (once a day) You Register Use resources Remarks: After the 6-month long membership in the fedcloud.egi.eu VO, you will need to move to a production VO, or establish a new VO.
  • 27. EUDAT Summer School, 3-7 July 2017, Crete • Open Standards Realm • Uses OCCI 1.2 interface • Ruby and Java SDKs available • Simple CLI tool for managing resources • OpenStack Realm • Native OpenStack API with VOMS AuthN/AuthZ • Plugin for python SDK and OpenStack CLI How to interact with the EGI FedCloud ?
  • 28. EUDAT Summer School, 3-7 July 2017, Crete A typical workflow VO Manager: Endorses available images Includes images in the VO
  • 29. EUDAT Summer School, 3-7 July 2017, Crete The EGI Applications Database • The EGI Application DataBase (AppDB) is a central service that stores and provides information about: • Software solutions in the form of native software products, virtual appliances and/or software appliances, • Programmers and the scientists who are involved, and • Publications derived from the registered solutions. Virtual Appliances
  • 30. EUDAT Summer School, 3-7 July 2017, Crete Two different storage solutions EGI FedCloud Storage Block Storage Object Storage The EGI Federated Cloud Infrastructures offers two different storage solutions
  • 31. EUDAT Summer School, 3-7 July 2017, Crete Block Storage Persistent block level storage to use with VMs • Use as any other block device from VMs • Snapshotable Simple usage • Consistent and low-latency performance • SSDs (in some sites) High Performance • From GB to TB • Create and attach to VMs on demand Scale to your needs VM
  • 32. EUDAT Summer School, 3-7 July 2017, Crete Object Storage Data storage infrastructure for storing and retrieving data from anywhere at any time • Simple REST APIs for managing and accessing data API Access • Store as much data as needed. • Get accounted only for the space used. Scalable • Define ACLs on each object, share publicly your data Sharing
  • 33. EUDAT Summer School, 3-7 July 2017, Crete Block Storage vs Object Storage Block Storage Object Storage Access only from within a VM only at the same site the VM is located from any device connected to the internet. Sharing not possible possible (data can be kept private or public) Accounting for the entire volume, regardless how much of it is actually used in the VM only for the data stored Integration POSIX access, easy with any application capable to write/read file from a local disk requires a client to be integrated within the application
  • 34. Block Storage: OCCI
      • OCCI (Open Cloud Computing Interface) is an OGF standard API to facilitate interoperable access to cloud resources
      • Block storage in FedCloud is managed via OCCI:
        – Create/delete volumes
        – Attach/detach (link/unlink in OCCI terms) to VMs
        – Once attached, use as any other disk in the VM
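The volume operations on this slide might look roughly as follows with the rOCCI command-line client (`occi`). This is a dry-run sketch that only prints the commands. The endpoint URL, proxy path, volume title and the `/compute/...` and `/storage/...` identifiers are hypothetical placeholders, and the option names are assumed from the rOCCI client as commonly documented, so check them against `occi --help` before use.

```shell
#!/bin/sh
# Dry-run sketch: prints rOCCI commands instead of executing them.
# ENDPOINT, PROXY and all resource IDs below are hypothetical placeholders.
ENDPOINT="${ENDPOINT:-https://occi.example.org:11443}"
PROXY="${PROXY:-/tmp/x509up_u1000}"

# Options shared by every call: endpoint plus X.509/VOMS proxy authentication.
occi_base() {
    printf 'occi --endpoint %s --auth x509 --user-cred %s --voms' "$ENDPOINT" "$PROXY"
}

# Create a volume: $1 = size in GB, $2 = title.
occi_create_volume() {
    printf "%s --action create --resource storage -t occi.storage.size='num(%s)' -t occi.core.title=%s\n" \
        "$(occi_base)" "$1" "$2"
}

# Attach (link) a volume to a VM: $1 = compute resource, $2 = storage resource.
occi_attach_volume() {
    printf '%s --action link --resource %s --link %s\n' "$(occi_base)" "$1" "$2"
}

# Delete a volume: $1 = storage resource.
occi_delete_volume() {
    printf '%s --action delete --resource %s\n' "$(occi_base)" "$1"
}

occi_create_volume 10 demo_volume
occi_attach_volume /compute/VM_ID /storage/VOLUME_ID
occi_delete_volume /storage/VOLUME_ID
```

Once a volume is linked, it shows up inside the VM as an ordinary block device that still has to be formatted and mounted before use.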
  • 35. Object Storage: CDMI
      • FedCloud object storage is managed via CDMI (Cloud Data Management Interface)
        – RESTful API for operations on storage objects
        – Developed by SNIA, now ISO/IEC 17826
      • Very flexible API, based on capabilities:
        – Basic object capabilities (create/get/delete/list)
        – Object ACLs
        – Import from external sources, export as filesystems
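As an illustration of the basic object capabilities (create/get/delete), the sketch below prints `curl` commands that carry the headers CDMI expects. It is a dry run under stated assumptions: the endpoint URL and the object path are hypothetical, the specification version is assumed to be 1.0.2, and authentication options are site-specific and left out.

```shell
#!/bin/sh
# Dry-run sketch: prints curl commands for basic CDMI data object operations.
# CDMI_ENDPOINT and the object paths are hypothetical placeholders;
# authentication is site-specific and deliberately omitted.
CDMI_ENDPOINT="${CDMI_ENDPOINT:-https://cdmi.example.org/cdmi}"
CDMI_VERSION="1.0.2"   # assumption: the version served by the site

# Create (or overwrite) a data object: $1 = object path, $2 = short text value.
cdmi_put_object() {
    printf "curl -X PUT -H 'X-CDMI-Specification-Version: %s' -H 'Content-Type: application/cdmi-object' -d '{\"mimetype\": \"text/plain\", \"value\": \"%s\"}' %s/%s\n" \
        "$CDMI_VERSION" "$2" "$CDMI_ENDPOINT" "$1"
}

# Read a data object back: $1 = object path.
cdmi_get_object() {
    printf "curl -X GET -H 'X-CDMI-Specification-Version: %s' -H 'Accept: application/cdmi-object' %s/%s\n" \
        "$CDMI_VERSION" "$CDMI_ENDPOINT" "$1"
}

# Delete a data object: $1 = object path.
cdmi_delete_object() {
    printf "curl -X DELETE -H 'X-CDMI-Specification-Version: %s' %s/%s\n" \
        "$CDMI_VERSION" "$CDMI_ENDPOINT" "$1"
}

cdmi_put_object demo/hello.txt hello
cdmi_get_object demo/hello.txt
cdmi_delete_object demo/hello.txt
```

The value is passed inline as JSON, which suits small text objects only; larger or binary payloads would need a different encoding or a non-CDMI content type.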
  • 36. State of the art: Block Storage
      • Block storage is supported on all FedCloud CMFs and sites
      • OCCI basic operations
        – OpenStack: Yes
        – OpenNebula: Yes
        – Synnefo: Yes
      • OCCI advanced operations (resize, snapshot)
        – OpenStack: No
        – OpenNebula: No
        – Synnefo: No
      • Native API advanced operations
        – OpenStack: Yes
        – OpenNebula: Partial
        – Synnefo: Yes
  • 37. State of the art: Object Storage
      • CDMI support
        – CDMI server framework by Synnefo
        – Ongoing effort to support OpenStack
        – Basic client available
      • Native APIs allow basic and advanced capabilities
      • CDMI basic operations
        – OpenStack: In progress
        – Synnefo: Yes
        – OpenNebula: N/A
      • Native API
        – OpenStack: Yes
        – Synnefo: Yes
        – OpenNebula: N/A
  • 38. How to manage datasets in the EGI Federated Cloud?
      (Figure: several data providers, each with a local dataset; the VO Manager endorses available images and includes images in the VO)
  • 39. The EGI DataHub
      A Data as a Service (DaaS) offering to implement the EGI Open Data Platform (ODP)
      • EGI Open Data Platform (ODP)
        – Support the EC Open Data Cloud vision
        – Integrate different data repositories available in a distributed environment
        – Offer the functionalities to make data open and link them to Open Data Catalogues
      • OneData
        – Software stack for a distributed data management platform
  • 40. Open Data Platform – The big picture
      (Figure: architecture diagram of the Open Data Platform)
      • Actors: EGI User 1 (VO x), EGI User 2 (Onedata space), Anonymous Users 1 and 2, Space Manager, Open Data Manager
      • Interfaces: Web GUI, POSIX, HTTP, OAI-PMH, CDMI, REST
      • Components: Community Portal, Metadata Registry, OAI-PMH Data Provider, Authentication and Authorization, Long Term Retention, AIP package generation
      • Connected resources: EGI Site 1, EGI Site 2, EGI Site 3, cloud storage, EUDAT, DOI Registrar (e.g. DataCite)
  • 41. Open Data Platform - Interfaces
      • GUI
        – Web based
        – Easy data management and sharing, access control
        – Publication of data items and collections
      • REST
        – Advanced data and collection management
        – API for integration with community tools and portals
      • CDMI
        – Standard data management operations
        – Advanced metadata queries
        – Integration with future data management applications
      • POSIX
        – Enables direct mounting of spaces in the local filesystem without full data transfer
      • OAI-PMH
        – OAI Data Provider interface
        – Dublin Core metadata by default
        – More complex metadata can be registered in ODP manually
      • HTTP
        – Direct download of open data from URLs
  • 42. Linking EUDAT services to the EGI Federated Cloud
  • 43. How to link EUDAT services to EGI FedCloud
      Create your VM topology with the EGI VMOps dashboard
        – Access the EGI VMOps dashboard and create your VM to interact with EUDAT
        – Select the proper VO
        – Select the VM image
        – Select one of the available providers
      The first time you access the EGI VMOps dashboard you need to set up your profile
  • 44. How to link EUDAT services to EGI FedCloud
        – Select the VM flavour
        – Start the VM and wait until it is in “running” status
        – When the VM is in “running” status, click on View Details
  • 45. How to link EUDAT services to EGI FedCloud
        – Check the VM details
        – Download the SSH key, change its permissions and access the VM
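The key-handling step can be sketched as below. Everything specific here is an assumption for illustration: the key file name, the `cloudadm` user (inferred from the `/home/cloudadm` paths in the globus-url-copy slide) and the IP address, which is a documentation-only address. A scratch file stands in for the downloaded key, and the login command is printed rather than run.

```shell
#!/bin/sh
# Sketch only: KEY_FILE, VM_USER and VM_IP are hypothetical placeholders.
# An empty scratch file stands in for the key downloaded from the dashboard.
WORK_DIR="$(mktemp -d)"
KEY_FILE="$WORK_DIR/vm_key.pem"
: > "$KEY_FILE"

# ssh refuses private keys that other users can read,
# so restrict the key to owner read/write only.
chmod 600 "$KEY_FILE"

VM_USER="cloudadm"     # assumption: inferred from the /home/cloudadm paths
VM_IP="192.0.2.10"     # documentation-only address; use the IP shown in View Details

# Print the login command instead of running it, since this VM is fictitious.
echo "ssh -i $KEY_FILE $VM_USER@$VM_IP"
```

On a real VM, replace the scratch file with the downloaded key and run the printed command directly.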
  • 46. How to link EUDAT services to EGI FedCloud
      Install the EUGridPMA PGP key for apt:
      ]$ sudo su -
      ]$ wget -q -O - https://dist.eugridpma.info/distribution/igtf/current/GPG-KEY-EUGridPMA-RPM-3 | apt-key add -
      Add the following lines to your /etc/apt/sources.list file:
      #### EGI Trust Anchor Distribution ####
      deb http://repository.egi.eu/sw/production/cas/1/current egi-igtf core
      Install the ca-policy-egi-core package:
      ]$ sudo apt-get update
      ]$ sudo apt-get install -y ca-policy-egi-core
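If the VM is set up by a script rather than by hand, the sources.list step can be made safe to repeat. The sketch below is one possible way to do that, not part of the original instructions: the helper name `add_repo_line` is hypothetical, and `SOURCES_FILE` defaults to a scratch file so the example can be tried without root.

```shell
#!/bin/sh
# Sketch: append the EGI trust anchor repository to an apt sources file
# only if it is not already present. SOURCES_FILE defaults to a scratch
# file; on the VM, set it to /etc/apt/sources.list and run as root.
SOURCES_FILE="${SOURCES_FILE:-$(mktemp)}"
REPO_LINE="deb http://repository.egi.eu/sw/production/cas/1/current egi-igtf core"

# add_repo_line is a hypothetical helper: $1 = sources file to update.
add_repo_line() {
    # grep -x matches whole lines, -F treats the pattern as a fixed string.
    if ! grep -qxF "$REPO_LINE" "$1" 2>/dev/null; then
        printf '%s\n' "#### EGI Trust Anchor Distribution ####" "$REPO_LINE" >> "$1"
    fi
}

add_repo_line "$SOURCES_FILE"
add_repo_line "$SOURCES_FILE"   # second call changes nothing

# Count how many times the repository line appears in the file.
grep -cxF "$REPO_LINE" "$SOURCES_FILE"
```

Running `apt-get update` afterwards is still required before installing `ca-policy-egi-core`.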
  • 47. How to link EUDAT services to EGI FedCloud
      Install clients in the VM to manipulate files:
      ]$ sudo apt-get install software-properties-common
      ]$ sudo add-apt-repository ppa:maarten-kooyman-6/ppa
      ]$ sudo apt-get update
      ]$ sudo apt-get install uberftp
      ]$ sudo apt-get install globus-gass-copy
      Copy your certificate under /tmp/ to access the EUDAT server
      • You need to have been granted access to the B2STAGE/B2SAFE instances
        – Send the DN of your digital certificate to the B2STAGE and B2SAFE support teams
      Manipulating files on B2STAGE/B2SAFE with UberFTP:
      ]$ uberftp eudat-b2stage.pdc.kth.se
      For more details, please refer to: https://linux.die.net/man/1/uberftp
  • 48. How to link EUDAT services to EGI FedCloud
      Manipulating files on B2STAGE/B2SAFE with globus-url-copy
      • Create a simple text file in your $HOME and save it as text.txt
      Upload files from the VM to the B2STAGE instance:
      ]$ globus-url-copy -vb -cred <X509_USER_PROXY> file:///home/cloudadm/text.txt gsiftp://eudat-b2stage.pdc.kth.se/eudat.se/projects/eudat-summerschool/text.txt
      Download files from B2STAGE to the VM:
      ]$ globus-url-copy -vb -cred <X509_USER_PROXY> gsiftp://eudat-b2stage.pdc.kth.se/eudat.se/projects/eudat-summerschool/text.txt file:///home/cloudadm/text2.txt
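After the upload and download, it can be worth confirming that the file survived the round trip unchanged. The sketch below is an optional check that is not part of the original slides: `verify_roundtrip` is a hypothetical helper, and the demo uses scratch files in place of the real text.txt and text2.txt.

```shell
#!/bin/sh
# Sketch: confirm that a downloaded copy matches the uploaded original.
# verify_roundtrip is a hypothetical helper: $1 = original, $2 = downloaded copy.
verify_roundtrip() {
    # cmp -s prints nothing and exits 0 only when both files are byte-identical.
    if cmp -s "$1" "$2"; then
        echo "OK: $2 matches $1"
    else
        echo "MISMATCH: $2 differs from $1" >&2
        return 1
    fi
}

# Demo with scratch files standing in for the transferred ones.
DEMO_DIR="$(mktemp -d)"
printf 'hello EUDAT\n' > "$DEMO_DIR/text.txt"
cp "$DEMO_DIR/text.txt" "$DEMO_DIR/text2.txt"
verify_roundtrip "$DEMO_DIR/text.txt" "$DEMO_DIR/text2.txt"

# On the VM, after both transfers have finished:
#   verify_roundtrip /home/cloudadm/text.txt /home/cloudadm/text2.txt
```

A non-zero exit status signals a mismatch, so the check can gate later steps in a script.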
  • 49. How to link EUDAT services to EGI FedCloud
      Delete your VM when you are done!
  • 50. Documentation and wiki
      https://wiki.egi.eu/wiki/Federated_Cloud_user_support
      Do you need any support? Please contact us at: support@egi.eu
  • 51. www.eudat.eu Thank you for your attention.