Science Gateway Advanced Support Activities in PTI
Marlon Pierce
Indiana University
Indiana University's Advanced Science Gateway Support
OGCE Gateway Tool Adaptation & Reuse
[Diagram. Gateways and projects shown: LEAD, GridChem, UltraScan, BioVLab, OVP/RST/MIG, TeraGrid User Portal, ODI, Bio Drug Screen, EST Pipeline, Future Grid. OGCE team role: re-engineer, generalize, build, test and release. Tools shown: Experiment Builder, XRegistry Interface, GFac, XBaya, XRegistry, FTR, Eventing System, Resource Discovery Service, GPIR, File Browser, Workflow Suite, Gadget Container, GTLab, JavaScript CoG, Axis2 GFac, Axis2 Eventing System, Resource Prediction Service, Swarm (Swarm -> GFac), …]
Gateway Hosting Service
  • Allocatable TeraGrid resource providing virtual machine hosting of science gateways.
  • This has been a valuable resource for our group.
  • We should look for ways to expand its usage, such as supporting data collections.
Courtesy: Mike Lowe, Dave Hancock.
BioDrugScreen Portal
Support: Josh Rosen and Archit Kulshrestha
Collaboration: Samy Meroueh, IUPUI
BioDrugScreen
  • A computational drug discovery resource.
  • Contains millions of pre-docked and pre-scored complexes between thousands of targets from the human proteome and thousands of drug-like small molecules.
  • Allows drug researchers to develop their own scoring functions for calculating how well a drug will interact with a protein.
Small drug-like molecules from the NCI diversity set are docked into 205 proteasome protein target. The orange area is an identified target area of the protein. Visualization uses JMol. Docking is done with Amber on the TeraGrid. Proteins are obtained from the PDB. Samy maintains his own database of protein structures and small molecules.
Support Provided
  • Proteins that have not had their docking and scores calculated need to have these run on the TeraGrid.
  • A Web interface needed to be created so users can submit their own jobs.
  • We developed the interface between the site’s submissions and the TeraGrid using the Swarm Service.
  • Also prototyping GFac support.
  • Using Flash and JavaScript, we developed an improved data presentation for the ranking page.
UltraScan Gateway
Staff: Raminder Singh
Collaborators: Borries Demeler and Emre Brookes, UTHSCSA
UltraScan Science Gateway
  • A biophysics gateway for investigating properties and structure-function relationships of biological macromolecules, nanoparticles, polymers and colloids that are implicated in many diseases, including cancer.
  • High-resolution analysis and modeling of hydrodynamic data from an analytical ultracentrifuge.
  • TeraGrid serves as a backup, spill-over resource, yet UltraScan is still one of the heaviest users, consuming 1.75 million SUs in 6 months.
UltraScan Advanced Support
  • Porting to new architectures and parallel performance enhancements.
  • New workflow implementations, new grid computing and grid middleware support:
      • Reliability problems with WSGram
      • Missing job status
      • Only supports Gram4, needs porting to other middleware
      • Issues with data movement.
  • Need fault tolerance at all levels.
  • Users decide resources manually, need automated scheduling.
[Figure: Current Architecture]
UltraScan OGCE Integration
  • Enhance the Perl job submission daemon and monitoring with the OGCE GFac service.
  • Implement and iteratively enhance fault tolerance.
  • Port to community account usage with GridShib auditing support.
  • Support UNICORE to run jobs on other European and Australian resources.
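The fault-tolerance and automated-scheduling goals listed for UltraScan can be illustrated with a small retry-and-failover loop. This is only a sketch under stated assumptions, not the OGCE or GFac implementation: `submit_with_failover`, `SubmissionError`, `flaky_submit`, and the resource names are all hypothetical, and the submit function stands in for whatever middleware call a gateway actually uses.

```python
import time
from typing import Callable, Optional, Sequence


class SubmissionError(Exception):
    """Raised by a submit function when a resource rejects or loses a job."""


def submit_with_failover(
    job: str,
    resources: Sequence[str],
    submit: Callable[[str, str], str],
    max_attempts_per_resource: int = 3,
    base_delay_s: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> Optional[tuple]:
    """Try each resource in order, retrying with exponential backoff.

    Returns (resource, job_id) on success, or None if every attempt failed.
    """
    for resource in resources:
        for attempt in range(max_attempts_per_resource):
            try:
                return resource, submit(job, resource)
            except SubmissionError:
                # Wait base_delay_s * 2**attempt before the next attempt.
                sleep(base_delay_s * (2 ** attempt))
    return None


def flaky_submit(job: str, resource: str) -> str:
    """Hypothetical stand-in for a middleware call; one resource always fails."""
    if resource == "resource-a":
        raise SubmissionError("no job status returned")
    return f"{resource}:{job}"


# Skip real sleeping in the demo by passing a no-op sleep function.
result = submit_with_failover(
    "analysis-001", ["resource-a", "resource-b"], flaky_submit, sleep=lambda s: None
)
```

Ordering the resource list by preference gives a crude form of automated scheduling: the caller no longer picks a resource by hand, and a failing resource is skipped after a bounded number of retries.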
GridChem
Support: Suresh Marru, Raminder Singh
Collaborators: Sudhakar Pamidighantam, NCSA
GridChem Science Gateway
  • A chemistry/materials science gateway for running computational chemistry codes, workflows, and parameter sweeps.
  • Integrates molecular science applications and tools for community use.
  • 400+ users heavily using TeraGrid.
  • One of the consistent top 5 TeraGrid gateway users.
  • Supports all popular chemistry applications, including Gaussian, GAMESS, NWChem, QMCPack, Amber, MolPro, and CHARMM.
GridChem Advanced Support
  • GridChem supports single application executions
  • Advanced support request for supporting workflows
  • Improved fault tolerance
GridChem OGCE Integration
  • OGCE workflow tools wrapped Gaussian & CHARMM chemistry applications
  • Coupled butane workflow using Gaussian & CHARMM integration
  • 100-member Gaussian parametric sweeps
  • Integration with Pegasus workflow tools
  • Ye Fan, Master’s student
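A parametric sweep like the 100-member one mentioned above amounts to rendering one input file per parameter value. The sketch below shows the idea with a made-up template; the `dihedral_angle` parameter, the 0 to 180 range, and the input text are illustrative assumptions and do not reflect a real Gaussian input deck or the OGCE tooling.

```python
from string import Template


def sweep_values(start: float, stop: float, members: int) -> list:
    """Return `members` evenly spaced values from start to stop inclusive."""
    if members < 2:
        return [start]
    step = (stop - start) / (members - 1)
    return [start + i * step for i in range(members)]


def build_sweep(template: Template, values) -> dict:
    """Map a job name to the rendered input text for each sweep member."""
    return {
        f"member-{i:03d}": template.substitute(angle=f"{v:.2f}")
        for i, v in enumerate(values)
    }


# Hypothetical placeholder input; a real application input file would differ.
TEMPLATE = Template("title: butane torsion scan\ndihedral_angle = $angle\n")

# One rendered input per sweep member, keyed by job name.
jobs = build_sweep(TEMPLATE, sweep_values(0.0, 180.0, 100))
```

Each entry in `jobs` could then be handed to a submission service as an independent job, which is what makes sweeps a natural fit for grid resources.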
GridChem Using OGCE Tools
[Figure: Initial Structure and Optimized Structure]
GridChem using OGCE workflow tools to construct and execute CHARMM and Gaussian molecular chemistry models.
Future Grid User Portal
Support: Sidd Maini, Archit Kulshrestha
Future Grid User Portal
  • Our strategy is to build all components as Google Gadgets that interact with REST services.
  • These can live in iGoogle as well as in containers like Drupal.
  • Take advantage of OpenID and OAuth.
  • Initial target gadgets: Knowledge Base, Amazon EC2 clients, Inca clients.
  • Future work: services, gadgets, and workflows for managing machine images with xCAT.
Future Grid KnowledgeBase (FGKB)
  • Task: Develop FGKB Web App
  • Search KB Documents
  • Technology used: Adobe Flex / PHP / KB REST API
  • Current Status: Basic Search and Retrieval
  • Live URL: http://tinyurl.com/ykaa9gr
EC2 Client User Interface
  • Link: http://tinyurl.com/ylkohj7
  • See list of images available
  • Launch them
  • Terminate them
  • Next Step: Add more EC2 features, integrate with FutureGrid user database
Portal Embedded Gadgets
FutureGrid Machine Image Services
  • FutureGrid will use xCAT to dynamically create and manage clusters from preconfigured images, on both real hardware and virtual machines.
  • We are working to capture common xCAT tasks as scripts.
  • These scripts can be wrapped as secure services using OGCE’s GFac.
  • Several tasks can be linked together as workflows visually composed with OGCE’s XBaya.
  • You can still use Pegasus/Condor as the workflow engine.
OGCE Software for Science Gateways
Software and Architectural Approach
Science Gateways Layer Cake
[Diagram of stacked layers. Layer labels: User Interfaces; Gateway Abstraction Interfaces; Gateway Services; Resource Middleware; Compute Resources. Component labels: Web Enabled Desktop Applications; Web/Gadget Container; Web/Gadget Interfaces; Information Services; Application Abstractions; Application Monitoring; User Management; Fault Tolerance; Provenance & Metadata Management; Registry; Workflow System; Security; Auditing & Reporting; Cloud Interfaces; SSH & Resource Managers; Grid Middleware; Local Resources; Computational Grids; Computational Clouds. Color coding: OGCE gateway components; complementary gateway components; dependent resource provider components.]
Google Gadget-Based Science Gateways
  • PolarGrid
  • MyOSG
  • LEAD
GFac Current & Future Features
[Diagram. Labels: Apache Axis2; Globus; Registry Interface; Scheduling Interface; Input Handlers; Monitoring Interface; Campus Resources; Data Management Abstraction; Fault Tolerance; Output Handlers; Amazon; Eucalyptus; Job Management Abstraction; Auditing; Checkpoint Support; Unicore; Condor. Color coding: existing features; planned/requested features.]
OGCE Layered Workflow Architecture: Derived from LEAD Workflow System
  • Workflow Interfaces (Design & Definition): XBaya GUI (Composition, Deploying, Steering & Monitoring); Flex/Web Composition; Gadget Interface for Input Binding
  • Workflow Specification: Python; BPEL 2.0; Scufl; BPEL 1.0; Java Code; Pegasus DAG
  • Workflow Execution & Control Engines: Apache ODE; Condor DAGMan; Dynamic Enactor; Jython Interpreter; GBPEL; Taverna
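The split between a workflow specification and an execution engine can be illustrated with a toy example: the specification is a dependency graph, and the engine runs each task once its inputs are ready. This is a minimal sketch, not how XBaya, Apache ODE, or Condor DAGMan work internally; the task names and the `run_workflow` helper are hypothetical, and `graphlib` requires Python 3.9 or later.

```python
from graphlib import TopologicalSorter


def run_workflow(tasks, dependencies):
    """Run each task after all tasks it depends on; return outputs by name.

    `tasks` maps a task name to a callable. `dependencies` maps a task name
    to the ordered list of task names whose outputs it takes as arguments.
    """
    order = TopologicalSorter(dependencies).static_order()
    outputs = {}
    for name in order:
        inputs = [outputs[dep] for dep in dependencies.get(name, ())]
        outputs[name] = tasks[name](*inputs)
    return outputs


# Hypothetical three-step chain standing in for wrapped application services.
tasks = {
    "prepare": lambda: "structure",
    "optimize": lambda s: s + "->optimized",
    "analyze": lambda s: s + "->analyzed",
}
dependencies = {"optimize": ["prepare"], "analyze": ["optimize"]}

outputs = run_workflow(tasks, dependencies)
```

Because the graph is data rather than code, the same specification could in principle be handed to a different engine, which is the point of keeping the layers separate.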
Putting It All Together
Software Strategy
  • Focus on gadget container and tools for running science applications on grids and clouds.
  • Provide a tool set that can be used in whole or in part.
      • If you just want GFac, then you can use it without buying an entire framework.
  • Outsource security, information services, data and metadata, advanced job handling, etc. to other providers.
      • MyProxy, TG IIS, Globus, Condor, XMC Cat, iRods, etc.
Packaging, Building, and Testing
  • All builds are designed to be self-contained.
  • Use Apache Maven 2.x.
  • Download includes everything you need.
  • Portal, Axis Services, and XRegistry all build nightly on the NMI Build and Test facility at UW.
  • Several Linux platforms, Mac PPC, and Mac x86.
  • Java 1.5.
  • Apache JMeter test suite for the portal. Run against your installation.
  • Automated tests nightly.
Next Steps
  • Apache Incubator Project for XBaya, GFac and supporting workflow tools
  • WIYN ODI instrument pipeline and gateway
      • Robert Henschel and Scott Michael are leading the overall effort.
      • Suresh and Raminder are working 50% time through early April on technical evaluation of integrating NHPPS software with OGCE.
  • New collaboration: Craig Mattocks, UNC, will build a storm surge forecasting gateway
      • Broadly similar to LEAD and SCOOP
      • Archit will be the point of contact
  • Local gateway tutorial in early April
      • Sun Kim’s group, UNC Group, CGB
      • Others welcome
  • Gadget Container additional applications
      • OGCE grid gadgets packaged release, SimpleGrid
More Information, Acknowledgements
  • Website: www.collab-ogce.org
  • Blog/RSS Feed: collab-ogce.blogspot.com
  • Email: mpierce@cs.indiana.edu, smarru@cs.indiana.edu
  • Geoffrey Fox, Craig Stewart, and Dennis Gannon have spent years laying the foundation for this work.
Backup Slides
EST Assembly Pipeline
Support: Archit Kulshrestha, Chin Hua Kong
Collaborator: Qunfeng Dong, UNT
Our goal is to provide a Web service-based science portal that can handle the largest mRNA clustering problems.
Computation is outsourced to Grids (TeraGrid) and Clouds (Amazon)
Not provided by in-house clusters.
This is an open service, open architecture approach.
These codes have very different scheduling requirements.
http://swarm.cgb.indiana.edu

EST Assembly Pipeline
  • OGCE Swarm is used to intelligently submit thousands of jobs to compute resources of various sizes, such as workstations and Grid-enabled supercomputers.
  • TeraGrid’s BigRed, Ranger, and Cobalt: PACE, RepeatMasker
  • Support for job submission to the Cloud is being developed and will address the need for resources larger (in terms of available memory) than clusters currently available.
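Matching jobs to compute resources of different sizes, with available memory as the deciding factor, can be sketched as a simple routing function. This is an illustrative assumption about one possible policy, not a description of how Swarm schedules jobs; the `Resource` class, `route_jobs`, and the resource names and memory figures are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Resource:
    name: str
    memory_gb: int


def route_jobs(jobs, resources):
    """Assign each job to the smallest resource with enough memory.

    `jobs` maps a job name to its required memory in GB. Jobs that fit
    nowhere are listed under the key None so they can be held for a
    larger resource.
    """
    by_size = sorted(resources, key=lambda r: r.memory_gb)
    plan = {}
    for job, needed_gb in jobs.items():
        target = next((r.name for r in by_size if r.memory_gb >= needed_gb), None)
        plan.setdefault(target, []).append(job)
    return plan


# Hypothetical resources and job sizes, for illustration only.
resources = [
    Resource("workstation", 8),
    Resource("cluster-node", 32),
    Resource("large-memory", 256),
]
job_sizes = {"small": 4, "medium": 16, "huge": 128, "too-big": 512}

plan = route_jobs(job_sizes, resources)
```

Jobs that land under the `None` key correspond to the case the slide describes: work that needs more memory than any currently available cluster offers.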
UltraScan OGCE Integration
  • Enhance the Perl job submission daemon with the OGCE GFac service.
  • Enhance socket- and email-based job monitoring with the OGCE Eventing System.
  • Implement and iteratively enhance fault tolerance.
  • Port to community account usage with GridShib auditing support.
  • Support UNICORE to run jobs on other European and Australian resources.
OGCE-based UltraScan Development Architecture
[Diagram. Labels: Manual Process; Quarry Gateway Hosting Machine; UltraScan Middleware; GFac, Eventing System, Fault Tolerance; Europe & Australian Grids.]
Downloadable Gadgets
  • Future Grid KB
  • Image Manager
  • Experiment Browser
  • INCA Monitor
Deployment (in future)
  • iGoogle Gadgets
  • Image Management
  • FG INCA Monitor
  • Experiment Management
  • iPhone Application
Why Gadgets?
  • We have redesigned many OGCE components to work as gadgets.
  • Fugang Wang’s Cyberaide JavaScript gives us an API.
  • Framework and language independent.
  • Client-side HTML, CSS, and JavaScript, not server-side Java.
  • Integration and content under user’s control, not portal administrator’s.
  • Can be integrated into iGoogle and similar containers.
  • 140,000 published gadgets.
  • Joomla, Drupal, Liferay, etc.
  • We can potentially provide HUBzero gadgets.
OGCE-based UltraScan Development Architecture
[Diagram. Labels: Manual Process; Quarry Gateway Hosting Machine; UltraScan Middleware; GFac and supporting services; Europe & Australian Grids.]
BioDrugScreen Next Steps
  • We want to expand the user-generated function process, including the ability for a user to save a function and have multiple functions.
  • Interaction between users will be enhanced, allowing them to share their functions and findings.