WP2: Data and Compute Cloud Platform
Marian Bubak, Piotr Nowakowski, Tomasz Bartyński, Jan Meizner, Adam Belloum, Spiros Koulouzis, Enric Sarries, Stefan Zasada, David Chang
WP2 in the VPH-Share framework
Mission: To develop a cloud platform which will enable
easy access to compute and data resources by providing
methods and services for:
1. Resource allocation management dedicated to VPH application scenarios;
2. An execution environment for flexible deployment of scientific software on the virtualized infrastructure;
3. Virtualized access to high performance computing (HPC) execution environments;
4. Cloud data access for very large objects and data transfer between services;
5. Data reliability and integrity to ensure sound use of biomedical data.
WP2: Objectives and tasks
A cloud platform enabling easy access to compute and data resources
•Scientific coordination of the development of VPH-Share cloud computing solutions (Task 2.0);
•Providing a means by which the Cloud resources available to the Project can be managed and allocated on demand (Task 2.1);
•Developing and deploying a platform which will manage such resources and deploy computational tasks in support of VPH-Share applications (Tasks 2.2 and 2.3);
•Ensuring reliable, managed access to large binary objects stored in various Cloud frameworks (Tasks 2.4 and 2.5);
•Ensuring secure operation of the platform developed in tasks 2.1-2.5 (Task 2.6);
•Gathering user requirements, liaising with application teams and advising on migration to Cloud computational and storage resources; testing WP2 solutions in the scope of real-world application workflows (Task 2.7);
•Collaborating with p-medicine for the purposes of sharing experience with Cloud computing technologies (Task 2.8).
Cloud computing
What is Cloud computing?
◦ „Unlimited” access to computing power and data storage;
◦ Virtualization technology (enables running many isolated operating systems on one physical machine);
◦ Lifecycle management (deploy/start/stop/restart);
◦ Scalability;
◦ Pay-per-use accounting model.
However, Cloud computing isn’t:
◦ …a magic platform to automatically scale your application up from your PC;
◦ …a secure place where sensitive data can be stored (this is why we require security and data anonymization…).
WP2 offer for workflows
Scale your applications in the Cloud („unlimited” computing power/reliable storage);
Utilize resources in a cost-effective way;
Install/configure each Atomic Service once – then use them multiple times in different workflows;
Many instances of Atomic Services can be instantiated automatically;
Large-scale computation can be delegated from the PC to the cloud/HPC;
Smart deployment: computation can be executed close to data (or the other way round);
Multitudes of operating systems to choose from;
Install whatever you want (root access to Cloud virtual machines).
Partner (PM): Description
CYFRONET (119): Coordination of the work package. Architecture of the cloud platform. Cloud execution environment and access to computational resources.
ATOS (57): Integrated authentication and authorization framework. Integration with VPH semantics (WP4).
UCL (30): Providing high-level access to virtualized HPC applications as services.
USFD (8): Integration of application workflows with the cloud platform. Coordinator of all flagship workflows.
UvA (50): Multiple-protocol data transfer between services. Integration of the ViroLab workflow with the cloud platform.
IOR (2): Integration of the VPHOP workflow with the cloud platform.
KCL (4): Integration of the euHeart workflow with the cloud platform.
UPF (4): Integration with relational data access (WP3) and user access systems (WP6). Integration of the @neurIST workflow with the cloud platform.
3 (new) words
Virtual Machine: A self-contained operating system image, registered in the Cloud framework and capable of being managed by VPH-Share mechanisms.
Atomic service: A VPH-Share application (or a component thereof) installed on a Virtual Machine and registered with the WP2 cloud management tools for deployment.
Appliance: A running instance of an atomic service, hosted in the Cloud and capable of being directly interfaced, e.g. by the workflow management tools or VPH-Share GUIs.
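The three definitions above can be sketched as plain data types. This is an illustrative Python sketch only; the class and field names are assumptions, not the actual Atmosphere data model.

```python
from dataclasses import dataclass

@dataclass
class VirtualMachine:
    """A self-contained OS image registered in the Cloud framework."""
    image_id: str
    os_name: str

@dataclass
class AtomicService:
    """A VPH-Share application (or component) installed on a VM."""
    name: str
    vm: VirtualMachine

@dataclass
class Appliance:
    """A running instance of an atomic service, directly interfaceable."""
    service: AtomicService
    endpoint: str  # e.g. an IP address returned after deployment

# One object of each kind, mirroring the relationships in the definitions
vm = VirtualMachine("img-001", "Ubuntu 10.04")
svc = AtomicService("segmentation", vm)
app = Appliance(svc, "10.0.0.5")
```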
[Figure: a raw OS template vs. a VPH-Share app. (or component) exposing external APIs on a Cloud host]
WP2 vision at a glance (1/3)
Installing a VPH-Share application in the Cloud (developer action):
•Upon the application developers’ request, the Atmosphere component
(developed in T2.1 and T2.2) spawns a fresh Virtual Machine, which resides in the Cloud and contains all the features typically expected of an „out of the box”
operating system (virtualized storage, standard libraries, root account, initial configuration, etc.). If needed, many such VMs can be spawned, each
encapsulating a single VPH-Share atomic service.
•It is the application developers’ task to install components of their applications on these templates so that they can be wrapped as atomic services.
•WP2 tools can then further manage the atomic services and deploy their instances (also called appliances) on Cloud resources as requested by the Workflow Composer.
[Figure: the Atmosphere UI and the T2.1 OS template registry on the Cloud platform – 1. Browse available OS templates; 2a. Create VM with selected OS; 2b. Spawn VM; 2c. Return VM IP; 4. Install required software]
Preparing a VPH-Share application for execution (user action):
•The user requests the Workflow Composer (via the WP6 Master UI) to execute an application.
•The Workflow Composer informs Atmosphere which atomic services to deploy in the Cloud so that the workflow may be executed.
•Atmosphere takes care of deploying the required assets and returns a list of service endpoints (typically IP addresses) whereupon workflow execution may commence.
•Atmosphere can be designed to „talk to” many different computing stacks, and thus interface both commercial and private Clouds – we are currently eyeing Eucalyptus, OpenStack, OpenNebula and Nimbus.
•Depending on the underlying Cloud computing stack, we expect to be able to define deployment heuristics, enabling optimization of resource usage.
[Figure: the VPH Master Interface, Workflow Composer tool, Atmosphere API (T2.2) and Cloud platform with its atomic service and cloud resource registries – 1. Log in to MI; 2. Execute application; 3a. Get atomic services; 3b. Spawn appliances; 3c. Return list of appliances; 4. Run workflow]
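The deployment flow above (steps 3a-3c) can be sketched as a thin abstraction over interchangeable cloud back-ends, in the spirit of Atmosphere "talking to" many computing stacks such as Eucalyptus, OpenStack, OpenNebula and Nimbus. All names here are hypothetical; the real Atmosphere API may differ.

```python
from abc import ABC, abstractmethod

class CloudStack(ABC):
    """Common interface hiding the differences between cloud stacks."""
    @abstractmethod
    def spawn(self, service_name: str) -> str:
        """Deploy one appliance and return its endpoint (IP)."""

class DummyStack(CloudStack):
    """Stand-in backend used here only to demonstrate the interface."""
    def __init__(self):
        self._next = 10
    def spawn(self, service_name: str) -> str:
        ip = f"192.168.0.{self._next}"
        self._next += 1
        return ip

def deploy_workflow(stack: CloudStack, services: list) -> dict:
    """Deploy the required atomic services, return service endpoints."""
    return {name: stack.spawn(name) for name in services}

endpoints = deploy_workflow(DummyStack(), ["mesher", "solver"])
```

A real backend would implement `spawn` against a concrete stack's API; the workflow layer stays unchanged.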
WP2 vision at a glance (2/3)
Managing binary data in Cloud storage (developer action):
•Atmosphere will contain a registry of binary data for use by VPH-Share
applications (T2.5) and assume responsibility for maintaining such data in Cloud storage.
•Appliances may produce data and register it with Atmosphere.
•Atmosphere will provide a query API where authorized applications may locate and retrieve Cloud-based data.
•If required, Atmosphere may also shift binary data between Cloud storage systems (not depicted).
•As an optional tool, we can develop a data registry browsing UI for integration with the VPH-Share Master Interface (not depicted).
•For access to the underlying Cloud storage resources, we intend to apply tools developed in Task 2.4.
[Figure: VPH-Share applications, the T2.5 binary data registry (Atmosphere API) and Cloud storage – 1a. Generate data; 1b. Save data in Cloud storage; 2. Inform Atmosphere (passing handle to stored data); 3a. Query; 3b. Retrieve handle; 4. Get data]
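The register/query/retrieve pattern above can be sketched as follows; the method names and handle format are illustrative assumptions, not the actual T2.5 API.

```python
class BinaryDataRegistry:
    """Toy registry of binary data stored in the Cloud (step 2 onwards)."""
    def __init__(self):
        self._entries = {}  # handle -> metadata

    def register(self, handle: str, location: str, owner: str):
        """Step 2: an appliance registers a stored object with Atmosphere."""
        self._entries[handle] = {"location": location, "owner": owner}

    def query(self, owner: str):
        """Step 3a: authorized applications locate their Cloud-based data."""
        return [h for h, m in self._entries.items() if m["owner"] == owner]

    def retrieve_handle(self, handle: str) -> str:
        """Step 3b: return the storage location for a handle."""
        return self._entries[handle]["location"]

reg = BinaryDataRegistry()
reg.register("lob-42", "s3://vph-share/lob-42", owner="workflow-A")
found = reg.query("workflow-A")
```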
WP2 vision at a glance (3/3)
Problem description
•If the appliance is to be accessed by external users, its corresponding atomic service needs a remote interface through which its functionality can be invoked. According to the Technical Annex, we expect that all such interfaces assume the form of Web Services (cf. p. 41) exposed by the hosts on which the VPH-Share appliances will reside.
•While Atmosphere can manage atomic services, it falls to application developers to encapsulate the functionality of their applications (or parts
thereof) in the form of Web Services. We believe that this is a crucial task to be addressed by Task 2.7 as soon as feasible.
[Figure: the VPH-Share Master Interface on the user host invoking application components and a visualization component on Cloud hosts – 1. Execute application; 2. Run calculations; 3. Prepare visualization; 4. Display output]
Issue: application interfaces
Problem description:
•Early studies suggest that we should adopt OpenNebula for private VPH-Share
cloud installations due to its simple yet effective design – however, we are eager to discuss the matter with WP2 partners.
•For larger private cloud deployments OpenStack (or, at least, its Object Store) seems a good choice.

Eucalyptus
Advantages: excellent compatibility with EC2; advanced network features; comes with its own Cloud storage engine (Walrus).
Drawbacks: overly complex architecture given the features it offers; advanced networking modes require a dedicated network switch with VLANs; poor Walrus functionality and performance; heavyweight software and communication protocols.

OpenNebula
Advantages: simple yet effective design; standard communication protocols (SSH); lightweight technology (Ruby + bash scripts); standard shared storage (NFS, SAN + cluster FS); contextualization support.
Drawbacks: poor network autoconfiguration features; no dedicated storage engine; minor but irksome bugs in configuration scripts.

OpenStack
Advantages: advanced features, including complex computing and (object) storage engines; relatively simple design given the rich feature set; support for various hypervisors (including KVM, Xen, MS Hyper-V); IPv6 support.
Drawbacks: advanced networking modes require a dedicated network switch with VLANs (much like Eucalyptus); highly complex architecture and convoluted deployment; questions regarding maturity.

Nimbus
Advantages: advanced storage engines (Cumulus- and LANTorrent-based); support for legacy technology (PBS); manageable architecture; EC2/S3 compatible.
Drawbacks: requires installation of heavyweight components on the HEAD node; largely a conglomerate of various technologies, which may cause maintenance issues.
Issue: selection of software stacks for private Cloud installations
T2.0 Scientific Management
Key issues:
Software development should be based on top quality research
Collaboration and exchange of research results within WP,
VPH-Share, and with related projects
Encouraging publications (FGCS, IEEE Computer, Internet
Computing, …)
Participation in conferences (EuroPar, e-Science, CCGrid, ICCS, …)
Organization of workshops
PhD and MSc research related to the project topics
Promotion of best software engineering practices and research
methodologies
Hosting an e-mail list, wiki and telcos (1 per month); managing
P2P contacts and WP meetings (semiannually)
Partners involved and contact persons:
CYFRONET: Marian Bubak, Maciej Malawski ({bubak,malawski}@agh.edu.pl) and task leaders
Main goals:
Overseeing the scientific progress and synchronisation of tasks
Interim (6 monthly) and annual reports
Main goal: Multicriterial optimization of computing
resource usage (private and public clouds as well as
the HPC infrastructure provided by T2.3);
Key issues:
• Applications (atomic services) and workflow
characteristics;
• Component Registry (T6.5);
• Workflow execution engine interfaces (T6.5);
• Atomic Services Cloud Facade interface (T6.3);
• Security (T2.6);
Partners involved and contact persons:
•CYFRONET (Tomasz Bartyński;
t.bartynski@cyfronet.pl);
•UCL (David Chang; d.chang@ucl.ac.uk);
•AOSAE (Enric Sarries;
enric.sarries@atosresearch.eu).
T2.1 Cloud Resource Allocation Management
T2.1 Deployment planning
Atmosphere will take into account application characteristics and
infostructure status to find an optimal deployment and allocation plan
which will specify:
•where to deploy atomic services (a partner’s private cloud site, public cloud infrastructure or a hybrid installation),
•whether the data should be transferred to the site where the atomic service is deployed or the other way around,
•how many atomic service instances should be started,
•whether it is possible to reuse predeployed AS (instances shared among workflows).
The deployment plan will be based on the analysis of:
•workflow and atomic service resource demands,
•volume and location of input and output data,
•load of available resources,
•cost of acquiring resources on private and public cloud sites,
•cost of transferring data between private and public clouds (also
between „availability zones” such as US and Europe ),
•cost of using cheaper instances (whenever possible and sufficient; e.g.
EC2 Spot Instances or S3 Reduced Redundancy Storage for some
noncritical (temporary) data),
•the public cloud provider’s billing model (Amazon charges for each started hour – thus, five 10-minute tasks run on five separate instances would cost five times as much as running them back-to-back on a single instance).
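The billing-model point can be illustrated with a small calculation, assuming hour-granularity billing as with EC2 on-demand instances:

```python
import math

def billed_hours(task_minutes, reuse_instance: bool) -> int:
    """Hour-granularity billing: each instance is charged for every
    started hour, whether or not the hour is fully used."""
    if reuse_instance:
        # run the tasks back-to-back on a single instance
        return math.ceil(sum(task_minutes) / 60)
    # one fresh instance per task
    return sum(math.ceil(m / 60) for m in task_minutes)

tasks = [10] * 5  # five 10-minute tasks
separate = billed_hours(tasks, reuse_instance=False)  # one instance per task
packed = billed_hours(tasks, reuse_instance=True)     # shared instance
```

With separate instances the five tasks are billed 5 hours; packed onto one instance they fit in 50 minutes and are billed a single hour.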
[Figure: 1. Workflow Execution informs Atmosphere (T2.1) about required AS and data; 3. Atmosphere collects computing, storage and networking statistics; 4. it analyzes them and prepares an optimal deployment across Cloud computing resources and Cloud storage]
T2.1 in scope of VPH-Share project
[Figure: 2. Atmosphere queries the Component Registry (T6.5) for metadata about required appliances and data]
Atmosphere will:
•receive requests from the Workflow Execution stating that a set of atomic services is required to process/produce certain data;
•query the Component Registry to determine the relevant AS and data characteristics;
•collect infostructure metrics.
T2.2 Cloud Execution Environment
Main goal: Deployment of atomic services in the Cloud
according to T2.1 specifications
Key issues:
• Cloud usage (public providers, private setups contributed by
partners; choice of Cloud computing platform);
• Interfacing infostructure:
• Public Cloud providers as well as private
(partner-operated) Cloud platforms built using heterogeneous
resources?
• Data Access services T2.4;
• Moving atomic services across the infostructure;
• Atomic Services Invoker interface (T6.3);
• Security (T2.6);
Partners involved and contact persons:
•CYFRONET (Tomasz Bartyński; t.bartynski@cyfronet.pl);
•UCL (David Chang; d.chang@uci.ac.uk);
T2.2 Deployment according to the plan from T2.1
[Figure: the Atmosphere API (T2.2) deploys appliances and moves data between Cloud computing resources and Cloud storage]
T2.2 will receive a deployment plan from T2.1
It will implement the deployment plan by instantiating atomic
services on private and/or public Clouds, and moving data using
T2.4 tools.
•It may be required to interface public and/or private clouds built
upon different platforms so the choice of Cloud API and client-side
library is important;
•Cyfronet is currently investigating Amazon EC2, the Open Cloud Computing Interface, OpenStack Compute (Nova), and the S3 and Swift storage APIs.
[Figure: the Atmosphere API (T2.2) monitors and scales atomic services]
T2.2 Monitoring and scaling infostructure
Atmosphere will monitor the usage of atomic services
Atomic Services will be scaled:
• new instances will be started for overloaded
services;
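A minimal sketch of this scaling rule, assuming a per-instance load metric in the range 0..1 and an illustrative overload threshold:

```python
OVERLOAD_THRESHOLD = 0.8  # assumed fraction of capacity

def scaling_decision(instances: dict) -> list:
    """Given per-(service, instance) loads, return the services whose
    average load exceeds the threshold and thus need a new instance."""
    per_service = {}
    for (service, _instance_id), load in instances.items():
        per_service.setdefault(service, []).append(load)
    return [s for s, loads in per_service.items()
            if sum(loads) / len(loads) > OVERLOAD_THRESHOLD]

loads = {("mesher", "i-1"): 0.95, ("mesher", "i-2"): 0.90,
         ("solver", "i-3"): 0.30}
to_scale = scaling_decision(loads)  # only "mesher" is overloaded
```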
T2.3 High Performance Execution Environment
Key issues:
Cloud computing provides an infrastructure on which to run so-called capacity workloads
Some workflows require access to high performance computing resources
The Cloud computing paradigm has not found wide uptake in HPC; it introduces performance overhead
Need to preinstall and optimise applications on HPC resources
Should we seek to integrate HPC access tightly with cloud computing or treat it separately?
Partners involved and contact persons:
UCL: David Chang (d.chang@ucl.ac.uk); Stefan Zasada (stefan.zasada@ucl.ac.uk)
Cyfronet: Tomasz Bartynski (t.bartynski@cyf-kr.edu.pl)
Main goals:
Provide virtualised access to high performance execution environments; seamlessly provide access to high performance computing to workflows that require more computational power than clouds can provide
Deploy and extend the Application Hosting Environment, which provides a set of web services to start and control applications on HPC resources
T2.3 High Performance Execution Environment
Virtualizing access to scientific applications
Tasks:
1. Refactor the AHE client API to provide similar/same calls as Eucalyptus/cloud APIs – access Grid/HPC resources in a similar way to the cloud
2. Integrate AHE (via its API) with the Resource Allocation Management system developed in T2.1 – AHE will publish load information from HPC resources
3. HPC typically uses pre-staged applications; UCL will build, optimise and host simulation codes in AHE
4. Extend AHE to stage/access data from the cloud data facilities developed in T2.4
5. Integrate AHE/ACD with the security framework developed in T2.6
Application Hosting Environment:
• Based on the idea of applications as stateful WSRF web services
• Lightweight hosting environment for running unmodified applications on grid and local resources
• Community model: an expert user installs and configures an application and uses the AHE to share it with others
• Launches applications on Unicore and Globus 4 grids by acting as an OGSA-BES and GT4 client
• Uses advanced reservation to schedule HPC into workflows
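Task 1 above (giving the AHE client cloud-like calls) can be sketched as a thin adapter; both classes below are hypothetical stand-ins, not the real AHE client API.

```python
class AHEClient:
    """Stand-in for an AHE-style client; submits a pre-staged
    application to an HPC resource and returns a job handle."""
    def submit(self, app_name: str) -> str:
        return f"job-{app_name}"

class HPCAsCloud:
    """Exposes HPC job submission through the same 'spawn' verb used
    for cloud appliances, so the T2.1 resource allocation layer can
    treat cloud and HPC back-ends uniformly."""
    def __init__(self, ahe: AHEClient):
        self._ahe = ahe

    def spawn(self, service_name: str) -> str:
        # Behind the cloud-like call, delegate to the HPC client
        return self._ahe.submit(service_name)

handle = HPCAsCloud(AHEClient()).spawn("namd")
```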
T2.4 Data Access for Large Binary Objects
Key issues:
Integrating the security framework (T2.6) to provide user- and application-level access control as well as ensuring storage privacy (a, b)
Maintaining and synchronizing multiple replicas on different loosely coupled resources (a)
Dealing with errors in storage systems as well as providing a uniform metadata model (a)
Providing a higher-level transport-protocol abstraction for fast transfers, checkpoints and parallel streaming (b)
Partners involved and contact persons:
UvA: Spiros Koulouzis (S.Koulouzis@uva.nl), Adam Belloum (A.S.Z.Belloum@uva.nl)
Main goals:
a) Federated cloud storage: uniform access to data storage
resources to transparently integrate multiple autonomous storage resources; as a result a file system will be provided as a service, optimizing data placement, storage utilization, speed, etc.
b) Transport protocols: efficient data transfers for replication,
migration, and sharing; to avoid centralisation bottlenecks, connection services will be deployed near the data
[Figure: LOB federated storage access – service and cloud clients reach Cloud storage via SOAP/REST through an operations layer (management, optimization), an abstraction layer and connection services]
T2.4 Data Access for Large Binary Objects
Federated Cloud Storage – Goals
Transparently integrate multiple autonomous storage resources
Provide virtually limitless capacity
Uniform access to data storage resources or a file-system-like view
Optimize:
◦ Data placement based on access frequency, latency, etc.
◦ Storage utilization/cost; this will have to make sure that storage space is used in the most efficient way
Provide a file system as a service; this will provide an intuitive and easy way to access the federated storage space
[Figure: LOB federated storage access – cloud clients reach data in Cloud storage through the operations layer (management, optimization) and the abstraction layer]
T2.4 Data Access for Large Binary Objects
Federated Cloud Storage – Issues
Clearing data handling requests with the security framework (T2.6)
◦ Ensuring privacy on the storage locations
◦ Obtaining authorization, authentication, access control, etc.
Synchronizing multiple replicas on different loosely
coupled resources
Determining the distance between replicas
Dealing with errors in storage systems beyond our control
Providing a uniform meta-data model
Defining utility functions for optimising multiple targets
such as space usage, access latency, cost, etc.
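The last point can be made concrete with a simple weighted utility; the weights and metrics are illustrative assumptions only.

```python
def utility(space_free: float, latency_ms: float, cost: float,
            w_space=1.0, w_latency=0.01, w_cost=0.5) -> float:
    """Higher is better: reward free space, penalize latency and cost.
    The weights encode the trade-off between the optimisation targets."""
    return w_space * space_free - w_latency * latency_ms - w_cost * cost

def best_replica_site(sites: dict) -> str:
    """Pick the storage site maximizing the combined utility."""
    return max(sites, key=lambda s: utility(**sites[s]))

sites = {
    "private-A": {"space_free": 0.7, "latency_ms": 5.0, "cost": 0.0},
    "public-B":  {"space_free": 0.9, "latency_ms": 40.0, "cost": 0.1},
}
chosen = best_replica_site(sites)  # private-A wins despite less free space
```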
T2.4 Data Access for Large Binary Objects
Transfer protocols – Goals
Interconnect resources for efficient transport
Investigate the state of the art of protocols designed for large-scale data transfers, such as UDT & GridFTP
Provide a higher-level protocol that will be capable of checkpointing and exploiting parallel streams to boost performance
Deploy connection services next to or near data resources
Use direct streaming
Take advantage of existing transfer protocols, such as HTTP(S)
Take advantage of third-party transfer abilities offered by underlying protocols, e.g. GridFTP
[Figure: connection services between a local FS and Cloud storage – transport and control (e.g. checkpoints) over UDT, HTTP(S) and GridFTP, beneath the LOB federated storage abstraction and operations layers]
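The checkpointing goal above can be sketched as a chunked copy that persists its progress, so an interrupted transfer resumes where it stopped rather than restarting. This is a toy in-memory sketch, not a real network protocol.

```python
def transfer(data: bytes, sink: bytearray, checkpoint: dict,
             chunk_size: int = 4, fail_after: int = None) -> bool:
    """Copy data in chunks, persisting progress in `checkpoint`.
    Returns True once the whole object has been delivered."""
    pos = checkpoint.get("offset", 0)
    sent = 0
    while pos < len(data):
        if fail_after is not None and sent >= fail_after:
            return False  # simulated network failure mid-transfer
        chunk = data[pos:pos + chunk_size]
        sink.extend(chunk)
        pos += len(chunk)
        sent += 1
        checkpoint["offset"] = pos  # checkpoint after every chunk
    return True

payload = b"large-binary-object"
sink, ckpt = bytearray(), {}
transfer(payload, sink, ckpt, fail_after=2)   # fails part-way through
done = transfer(payload, sink, ckpt)          # resumes from the checkpoint
```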
T2.4 Data Access for Large Binary Objects
Transfer protocols – Issues
In case of migrations, transfers, replications, etc. between different storage systems we have to consider:
◦ Appliances in the case of IaaS
◦ Web services in the case of PaaS
◦ They will act as connection services, enabling third-party transfers and usage of state-of-the-art transport protocols such as UDT and GridFTP, as well as torrent-like transport models
These connection services will have to:
◦ Encrypt data while in transit from one provider to another
◦ Resume failed transfers
◦ Enable checkpoints in transfers to increase fault tolerance
Use the framework provided by Atmosphere to maintain and deploy such services
T2.4 Data Access for Large Binary Objects
Interactions with other WPs & Tasks
[Figure: LOB federated storage access interacting with the Security Framework (Task 2.6: fine-grained ACL/authorization requests), Data Reliability and Integrity (Task 2.5: integrity/checksums on data), Cloud Execution Environment (Task 2.2: deployment/transport of system images), High Performance Execution Environment (Task 2.3: moving data from/to resources), Data Access and Mediation (Task 3.3: access to raw data) and Workflow Execution (Task 6.5: moving data/tasks around resources)]
T2.5 Data Reliability and Integrity
Main goals: Provide a mechanism which will keep track of binary data stored
in the Cloud infrastructure, monitor its availability; advise Atmosphere when instantiating atomic services and – when required – shift/replicate data
between clouds (in collaboration with T2.4);
Key issues: Establishing an API which can be used by applications and end users; deciding upon the supported set of cloud stacks.
Partners involved and contact persons:
•CYFRONET (Piotr Nowakowski; p.nowakowski@cyfronet.pl)
[Figure: the T2.5 binary data registry in Atmosphere sits on top of T2.4 LOB federated storage access to distributed Cloud storage (sample protocol stacks: Amazon S3, Walrus (Eucalyptus), ObjectStorage (OpenStack), Cumulus (Nimbus)); it registers files, gets metadata, migrates LOBs and gathers usage statistics, with end-user features (browsing, querying, direct access to data) exposed via the VPH Master Interface]
T2.5 Data Reliability and Integrity
Enforcing data integrity by means of:
•access control (requires integration with T2.6);
•access log (requires integration with T2.4).
Each operation performed by T2.4 tools can be logged in the Atmosphere registry for the purposes of establishing data provenance. Moreover,
Atmosphere can enforce fine-grained data access security by means of policies defined in Task 2.6.
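The access-log idea can be sketched as follows; the record fields are assumptions, not the actual Atmosphere registry schema.

```python
import time

class AccessLog:
    """Toy log of operations performed on registered data objects."""
    def __init__(self):
        self._log = []

    def record(self, user: str, handle: str, operation: str):
        """Log one T2.4 operation for later provenance analysis."""
        self._log.append({"user": user, "handle": handle,
                          "operation": operation, "at": time.time()})

    def provenance(self, handle: str):
        """All recorded operations that touched a given data object,
        in the order they happened."""
        return [e for e in self._log if e["handle"] == handle]

log = AccessLog()
log.record("alice", "lob-42", "write")
log.record("bob", "lob-42", "read")
history = log.provenance("lob-42")
```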
[Figure: operations on registered data (requested by users or workflow management tools via the VPH Master Interface) pass through T2.4 LOB federated storage access, which stores and marshals data in distributed Cloud storage; each access to registered data is logged in the T2.5 access log in Atmosphere]
T2.6 Security Framework
Main goals: A policy-driven access system for the security framework: an open-source access control solution based on fine-grained authorization policies. Components: Policy Enforcement, Policy Decision, Policy Management, and a registry of condition and effect definitions and values.
Key issues:
•Security by design
•Privacy & confidentiality of eHealthcare data
•Expressing eHealth requirements & constraints in security policies (compliance)
•Deploying security software in clouds (potential threats and inconsistencies).
Partners involved and contact persons:
•AOSAE (Enric Sàrries; enric.sarries@atosresearch.eu)
•CYFRONET (Jan Meizner; jan.meizner@cyfronet.pl)
•UCL (Ali Haidar; ali.haidar@ucl.ac.uk)
[Figure: VPH clients reach VPH services over the Internet through the VPH Security Framework]
T2.6 Security Framework
The Security Components will be located in the frontends
of the VPH deployment. There are 3 perimeters in VPH-Share:
◦ User Interfaces
◦ Applications (Workflow execution)
◦ Infostructure
◦ However, the design of the VPH-Share architecture is still not finalized. To prevent man-in-the-middle (impersonation) attacks, we must know where the boundaries lie, i.e. which perimeters to secure.
Security by design implies the following:
◦ Components must not bypass security (services/data not exposed to threats), each message must go to / come from the security framework
◦ Components must be trusted (well specified, standard, “known”) with respect to the security policies
◦ Administrative access to the Security Framework to modify the access rights to the component (system admins can configure security for their software)
The Security Components will provide the following features:
◦ Secure messaging between the frontends of each deployment/platform
◦ Authentication of VPH Users
◦ Resource access authorization (based on access control policies)
◦ Specifically, for policy-based access control, we will need the following components:
◦ Policy Enforcement Point: It “executes” the policies. It composes the access control request. Once it gets the authorization response from the Policy
Decision Point, it enforces it by allowing/denying the requested access.
◦ Policy Decision Point: It analyses the access control request, the policies
and the environment, and issues an access control decision: either “Permit” or “Deny”.
◦ Policy Administration Point: Deploys or provisions policies to the PDP. Enables reconfiguration of access rights to resources.
◦ Policy Information Point: Provides attribute-value pairs needed for the analysis of the access control requests and policies.
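A minimal sketch of the PEP/PDP interaction described above; the policy representation (a set of allowed role/resource/action triples) is an illustrative simplification of real policy languages such as XACML.

```python
class PolicyDecisionPoint:
    """Analyses an access control request against the policies and
    issues a decision: either "Permit" or "Deny"."""
    def __init__(self, policies):
        self._policies = policies  # set of allowed (role, resource, action)

    def decide(self, role: str, resource: str, action: str) -> str:
        if (role, resource, action) in self._policies:
            return "Permit"
        return "Deny"

class PolicyEnforcementPoint:
    """Composes the access control request and enforces the PDP's
    decision by allowing or denying the requested access."""
    def __init__(self, pdp: PolicyDecisionPoint):
        self._pdp = pdp

    def request(self, role: str, resource: str, action: str) -> bool:
        return self._pdp.decide(role, resource, action) == "Permit"

pdp = PolicyDecisionPoint({("clinician", "patient-data", "read")})
pep = PolicyEnforcementPoint(pdp)
allowed = pep.request("clinician", "patient-data", "read")
denied = pep.request("guest", "patient-data", "read")
```

In a full deployment the PAP would provision the policy set and a PIP would supply the attribute values; here they are folded into the constructor for brevity.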
T2.6 Security Framework
[Figure: a VPH user reaches the VPH services (RDB, compute and storage services) through the Policy Enforcement Point, backed by a Security Token Service, Policy Decision Point, Policy Administration Point and Policy Information Point; on the Cloud host, a secure proxy exposes the secured external APIs of a VPH-Share app. (or component) deployed by a developer from an Atmosphere OS template]
T2.6 Security Framework
Deploying secure appliances in the cloud presents a challenge:
◦ Many different OS templates
◦ Many different VPH-Share appliances
The best way to address this is, in our opinion:
◦ Security Components deployed in the OS templates of Atmosphere;
◦ When deploying a VPH service, the Security Component is configured to “proxy” it.
T2.7 Requirements Analysis and Integration of VPH Workflow Services with the Cloud Infrastructure
Main goals: Ensure integration with VPH-Share workflows and
deployment of VPH-Share atomic services on Cloud and HPC resources
provided by the partners
Key issues: Establishing workflow specification details and atomic
service requirements
Partners involved and contact persons:
•USFD (workflow coordination – cp. Richard Lycett;
Rich.Lycett@gmail.com);
•KCL (Cardiovascular Modeling; T5.5 – cp. TBD);
•UPF (Neurovascular Modeling; T5.6 – cp. TBD);
•IOR (Orthopaedic Modeling; T5.4 – cp. TBD);
•UvA (HIV Epidemiology; T5.5 – cp. TBD);
•CYFRONET (Marek Kasztelnik; m.kasztelnik@cyfronet.pl)
A preliminary questionnaire has been distributed at the kickoff meeting in
Sheffield; results are due by mid-May 2011.
T2.8 Joint Strategy for Cloud Computing between p-medicine and VPH-Share
Main goals:
•Exchange of information on Cloud computing and storage environments, with focus on how they may support distributed medical information systems;
•Exchange of technical knowledge pertaining to the exploitation of specific Cloud technologies;
•Joint assessment of the applicability of Cloud platforms to storing, processing and exposing data in the context of medical applications;
•Exchange of prototype software and detailed technical documentation thereof, with the possibility of cross-exploitation of synergies between both projects;
•Semiannual collaboration workshops of representatives of both Projects to support the above.
•1st VPH-Share/p-medicine meeting: about June 15, at UvA, to discuss (among others) D2.1 and D6.1 as well as plans for platform design.
Partners involved and contact persons:
•CYFRONET (cp. Marian Bubak; m.bubak@agh.edu.pl)
•USFD (cp. TBD)
•UvA (cp. Adam Belloum; A.S.Z.Belloum@uva.nl)
•UCL (cp. TBD)
WP2: Services
Service Name: Description
Atmosphere Computing Service Broker: A tool to instantiate new Virtual Machines in selected Cloud environments (enabling application developers to prepare atomic services), then store and manage the resulting service images; interface extensions for the VPH-Share Master Interface.
Atmosphere Atomic Service Deployment Tool: Deployment of available atomic services in the Cloud as requested by the workflow execution tools; interface extensions for the Workflow Execution tool (API).
Atmosphere Data Management Tool: A registry of binary data belonging to VPH-Share applications, enabling storage, querying and direct retrieval (interface extensions for the VPH-Share Master Interface and an API for VPH-Share applications).
Computing access extensions for Cloud stacks and HPC infrastructures: Pluggable support for interaction with various Cloud computing stacks and HPC infrastructures for the purposes of scheduling computations/deploying atomic service instances and retrieving results.
Data access extensions for Cloud stacks: Pluggable support for interaction with various Cloud computing stacks for the purposes of storing, processing and retrieving binary data objects.
Integrated security: Cross-domain component; ensures secure operation of all of the above.
Key WP2 interactions
[Figure: interactions between the Computing Service Broker (2.1), Cloud Execution Environment (2.2), HPC Execution Environment (2.3), Workflow Execution (6.5), Atomic Service Cloud Facade (6.3), VPH Master Interface (6.4), Data Integrity Services (2.5), Binary Data Access (2.4), Metadata Mgmt Services (4.2) and WP2 Security (2.6), spanning public/private Clouds and HPC infrastructure (e.g. DEISA); the numbered links correspond to the list below]
1. Workflow execution requests
2. Atomic Service preparation requests
3. AS creation and management UI
4. Invocations of Atmosphere back-end services
5. Data management UI (possibly integrated with Task 6.3)
6. Sharing Atomic Service metadata (common/distributed registry)
7. Sharing LOB metadata (common/distributed registry)
8. Preparation of HPC resources
9. Binary data processing requests
10. Instantiation and management of Appliances based on AS templates (AS template repository not depicted)
11. Execution of computational jobs
12. Low-level access to Cloud resources
13. Enactment of workflows using pre-instantiated Cloud and HPC resources (Appliances)
Used by – Description
•Task 6.5, Workflow Execution – Requesting deployment of atomic service instances (appliances) whenever a workflow is to be executed; interacts with Atmosphere (by means of a dedicated API) and with data access extensions (separate APIs provided by Task 2.4)
•Task 6.4, Master Interface – Visual management of atomic services and instantiation of additional virtual machines as requested by VPH-Share application developers (via plugin); direct access to binary data (via plugin); interacts with Atmosphere by means of dedicated APIs
•Tasks 5.4-5.7, VPH-Share applications – Used directly by application developers to deploy their applications in the VPH-Share infrastructure; also used indirectly by application users whenever a workflow is to be instantiated and executed or whenever binary data needs to be read from the underlying Cloud storage
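The first interaction above, the workflow execution tool asking for appliances before a run, can be sketched as a simple resolve-and-deploy step. The function name, payload shape and template registry below are illustrative assumptions, not the actual Task 2.1 interface.

```python
def deploy_appliances(workflow: dict, registry: dict) -> list:
    """Resolve each atomic service named in a workflow against a
    template registry and return one instance record per service,
    mimicking a deployment request from the workflow execution tool."""
    instances = []
    for service in workflow["atomic_services"]:
        if service not in registry:
            raise KeyError(f"no template registered for {service!r}")
        instances.append({
            "service": service,
            "template": registry[service],
            "status": "running",
        })
    return instances


# Hypothetical registry mapping atomic services to VM templates.
registry = {"segmentation": "tpl-101", "mesh-generation": "tpl-102"}
workflow = {"atomic_services": ["segmentation", "mesh-generation"]}

for inst in deploy_appliances(workflow, registry):
    print(inst["service"], inst["status"])
```

Failing fast on an unregistered service mirrors the dependency noted in the next table: deployment cannot proceed without the underlying template and compute extensions.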
Uses – Description
•Tasks 2.2 and 2.3, Cloud and HPC computing extensions – Internal dependency; Atmosphere requires a means of communication with the underlying resources, hence the dependency of Task 2.1 on Tasks 2.2 and 2.3
•Task 2.4, Data access for large binary objects – Internal dependency; Atmosphere requires a means of manipulating data in Cloud storage, hence the dependency of Task 2.5 on Task 2.4
•Tasks 5.4-5.7, VPH-Share applications – Input is required from application teams to establish requirements (functional and non-functional) with regard to the underlying computational and storage resources (public/private Cloud infrastructures, preferred deployment environments, operating systems, processors, memory, network bandwidth, storage capacity, etc.)
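The requirements listed above could be collected from each application team as a small structured descriptor that the platform validates before provisioning. The field names below are illustrative assumptions, not a project-mandated schema.

```python
# Illustrative requirements descriptor for one VPH-Share application
# (field names are assumptions, not a mandated schema).
requirements = {
    "application": "sample-workflow",
    "deployment": "private-cloud",   # public/private Cloud preference
    "operating_system": "linux",
    "cpu_cores": 8,
    "memory_gb": 16,
    "network_bandwidth_mbps": 100,
    "storage_capacity_gb": 500,
}


def validate(req: dict) -> list:
    """Return the sorted list of missing mandatory fields."""
    mandatory = {"application", "deployment", "operating_system",
                 "cpu_cores", "memory_gb", "storage_capacity_gb"}
    return sorted(mandatory - req.keys())


print(validate(requirements))  # -> []
```

A descriptor like this would make the Task 2.7 requirements-gathering exchanges machine-checkable rather than free-form.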
WP2: Interactions
WP2: Workflow Interactions (tbd)
Type – Description – Workflows
•Diagrams – Does the workflow have use case, data flow and component diagrams? If not, when can the workflow deliver them? (All)
•Development status – What is the current status of the workflow atomic services (concept, design, development of first prototype, first prototype, further development and versioning, release, deployment)? When is the first version of the workflow planned to be released? (All)
•Data – What kind of data will the workflow store (format/size)? Does the workflow need streaming? (All)
•Computation – What computational resources (desktop, cluster, grid, cloud) are required? How long does the workflow take to serve one request on a target (suitable) computing platform? Are the workflow elements stateless? (All)
•Operating system, licensing – What kind of operating system do the workflow atomic services require? What kind of licence does the workflow (and all required atomic services and libraries) have? (All)
•Communication protocol – Is the application remotely accessible? What protocols are used? (All)
OBJECTIVE – EVALUATED/DUE
1. Analysis of the state of the art, work package definition – D2.1 [PM03]
2. Architecture and design of the cloud platform – D2.2 [PM06], M1 [PM12]
3. 1st prototype of the cloud platform (Alpha Release) – D2.3 and M3 [PM12]
4. 2nd prototype of the cloud platform (Beta Release) – D2.4 and M7 [PM24]
5. Final deployment of the cloud platform (Candidate Release) – D2.5 and M11 [PM36]
6. Full integration of four workflows with the VPH infostructure – M12 [PM42]
7. Comprehensive collection of data sources accessible through the Candidate Release – M13 [PM42]
8. Final evaluation of the cloud platform (Maintenance Release) – D2.6, M15 and M16 [PM48]
WP2: Measurable Objectives
WP2: Mapping to Global Objectives
WP2 Goals:
•Analysis of the state of the art, work package definition
•Execution environment for flexible deployment of scientific software on the virtualized infrastructure
•Virtualized access to high performance (HPC) execution environments
•Cloud data access for very large objects and data transfer between services
•Security framework
•Data reliability and integrity to ensure sound use of biomedical data
•Exploiting synergies and exchanging experience with other similar projects, in particular with P-Medicine
VPH-Share Objectives:
•To develop and deploy the VPH infostructure through which the VPH community will be able to store, share, reuse and integrate data, information, knowledge and wisdom on the physiopathology of the human body.
•To develop a process by which models are formulated, analysed and annotated for integration into a workflow, able to exploit the VPH infostructure.
•To reach out, firstly to the VPH community, and then to the wider clinical and medical records communities, to ensure access to the widest possible range of data and tools, made possible by the effective and easy-to-use annotation tools and compute service access that this project will provide and/or promote.
D2.1: Analysis of the State of the Art and WP Definition (M3 – end of May 2011)
Requires contributions from technical developers regarding solutions considered for use in their tools.
Proposed TOC and responsible partners:
1. Introduction (incl. objectives and approach) CYFRONET
2. High-level overview of WP2 (incl. generalized view of WP2 architecture) CYFRONET
3. Key challenges in developing a Cloud platform for VPH-Share CYFRONET
4. Targeted SOTA for:
• Cloud Resource Allocation Management CYFRONET
• Cloud Application Execution CYFRONET
• Access to High-Performance Execution Environments UCL
• Access to large binary data on the Cloud UvA
• Data reliability and integrity CYFRONET
• Cloud Security frameworks AOSAE
5. Conclusions (incl. summary and references) CYFRONET
As part of Section 4, we ask each contributing partner to conform to the following schema:
•Problem statement (Why is this aspect important for VPH-Share?)
•SOTA description (along with an in-depth discussion of the advantages and drawbacks of available technologies)
•Recommendations for VPH-Share (Which technologies to adopt? Is it
necessary to extend them? If so – why and how?)
Deadline for contributions is May 6 (submit to p.nowakowski@cyfronet.pl).