EMI's Foundations

ARC: Heavy Expertise - Light-Weight Solutions

  • ARC is one of the three main middleware solutions developed and used by European researchers.
  • ARC provides an innovative computing service that is easy to deploy and use, optimised for data-intensive applications, and designed for highly efficient use of computing resources.
  • ARC offers standard-compliant, portable, platform independent, open source middleware.

By bringing this expertise, knowledge and know-how to EMI, ARC will help advance European middleware by enhancing its functionality and improving its usability.

ARC is designed and implemented as a reliable, efficient, highly portable and easy-to-handle middleware. It provides optimised CPU utilisation for serial data-intensive computational tasks by treating output data manipulation as an integral part of the computing service.

A key concept is the absence of a single point of failure. Combined with a stateful implementation of services, this ensures high stability of the system. ARC maintains a clear separation between local batch systems and the Grid, allowing any batch system to be integrated via plugins.

The middleware relies on a distributed, multi-rooted, dynamic information system that contributes to the overall redundancy and scalability of the infrastructure. Resource discovery and brokering are delegated to the client, thus avoiding a typical single point of failure.
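
Because the ARC information system is LDAP-based, its contents can be inspected with standard tools. The query below is an illustrative sketch: the host name is a placeholder, while the port, base DN and object class follow the conventions used by NorduGrid/ARC information services.

    # Query an ARC information service for the clusters it publishes
    # (host name is a placeholder; 2135 is the conventional grid LDAP port)
    ldapsearch -x -h giis.example.org -p 2135 \
        -b 'mds-vo-name=local,o=grid' '(objectClass=nordugrid-cluster)'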

ARC provides a light-weight, easy to install stand-alone client package for most popular platforms.
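
For illustration, a typical job lifecycle with the stand-alone client might look as follows. The command names are those of the ARC client suite (older releases shipped ng-prefixed equivalents such as ngsub and ngget), and the job description file is written in ARC's xRSL language:

    arcproxy               # create a short-lived proxy credential
    arcsub myjob.xrsl      # submit a job described in xRSL
    arcstat -a             # check the status of all active jobs
    arcget <job-id>        # retrieve the output of a finished job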

Another key aspect of ARC is its non-intrusiveness with respect to the underlying resources: it deploys as a thin layer, with software installed only on the front-end. It can thus co-exist with any other middleware or setup. To keep maintenance and operation simple, ARC collects all configuration in a single file.
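
As a sketch of what this single-file approach looks like, a minimal configuration might resemble the fragment below. The section and option names are indicative examples only; the ARC documentation remains the authoritative reference.

    [common]
    # placeholder front-end host: the only machine with ARC software installed
    hostname="arc-ce.example.org"
    # local batch system, integrated via a plugin (e.g. pbs, sge, condor)
    lrms="pbs"

    [grid-manager]
    controldir="/var/spool/arc/control"
    sessiondir="/var/spool/arc/session"

    [queue/main]
    name="main"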

Many different scientific communities currently use ARC. One of the most challenging, but also most successful, projects is the unique distributed Nordic Tier-1 for the LHC experiments ATLAS and ALICE.

Contributed by: Katarina Pajchel

 

dCache: The Success Story Continues Within EMI

dCache is a storage solution for storing huge amounts of data. Within WLCG, over 80 deployments throughout the world use dCache, some providing many petabytes of storage capacity for their local users. Together, these dCache deployments contribute some 60 PB of storage capacity to CERN's Large Hadron Collider experiments, more than half of the total storage capacity available to the LHC experiments.

dCache enables sites to combine a network of heterogeneous commodity disk servers into a coherent, managed storage service. This service is robust against failures and can integrate with tertiary storage, for example a site's tape facility, or with cloud storage backends.

dCache supports familiar protocols, such as HTTP and WebDAV, allowing people to use dCache with their favourite software. However, these protocols can introduce performance barriers that are unacceptable for some users. dCache therefore provides custom protocols to break through these barriers and satisfy such demanding requirements.
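
Because the WebDAV interface is plain HTTP(S), files stored in dCache can be fetched with off-the-shelf tools. The example below is an illustrative sketch: the host, port and path are placeholders, and the authentication options depend on how a site's WebDAV door is configured.

    # Download a file from a dCache WebDAV door over HTTPS,
    # authenticating with an X.509 grid proxy (all paths are placeholders)
    curl --cert /tmp/x509up_u1000 \
         --capath /etc/grid-security/certificates \
         -O https://dcache.example.org:2880/data/myfile.dat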

As an agile project, dCache is at the cutting edge of storage technology. NFS v4.1 / pNFS marks the storage industry's move towards high-throughput storage. dCache is the first grid storage software to support pNFS, thanks to the project's experience in this area and its participation in the development of the NFSv4.1 standard.
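
In practice, pNFS support means that a dCache instance can be mounted like an ordinary filesystem, after which applications read and write data with standard POSIX I/O. A minimal sketch, assuming a Linux client with NFS v4.1 support and using placeholder host and path names:

    # Mount a dCache NFS v4.1 (pNFS) door as a regular filesystem
    # (the exact mount options depend on the client's kernel version)
    mount -t nfs4 -o minorversion=1 dcache.example.org:/data /mnt/dcache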

Given dCache's track record of delivering high-performance storage and cutting-edge features, it is natural that the project is one of the founding partners of EMI. We anticipate that this close collaboration with the other middleware providers will yield numerous benefits both for the dCache project and for EMI users.

Contributed by: Paul Millar and Patrick Fuhrmann

 

gLite in EMI

gLite represents the next generation of middleware for distributed grid computing.

Born from the collaborative efforts of more than 80 people in 12 academic and industrial research centres as part of the EGEE series of projects, gLite today provides a robust framework of services on which to build applications for a Distributed Computing Infrastructure (DCI), tapping into powerful distributed computing and storage resources across the Internet.

During the 6 years of the EGEE projects, the gLite components have been progressively improved and made increasingly robust to satisfy the requirements of a large variety of research communities. The largest of these is the High Energy Physics community, with its need for a very demanding computing infrastructure for the simulation, processing and analysis of data from the Large Hadron Collider (LHC) experiments at CERN.

Today the gLite components routinely sustain the research activities of a large variety of communities on the world's largest production infrastructure, enabling transparent access to, and sharing of, more than 300 computing and storage facilities in Europe and worldwide.

The gLite middleware hides much of the complexity of operating such a large environment from the user by presenting standard, common interfaces.
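
For example, a user describes a job in gLite's Job Description Language (JDL) and hands it to the workload management tools, which take care of brokering, scheduling and data staging. The fragment below is an illustrative sketch of a minimal JDL file:

    // hostname.jdl: a minimal gLite job description (illustrative)
    Executable    = "/bin/hostname";
    StdOutput     = "std.out";
    StdError      = "std.err";
    OutputSandbox = {"std.out", "std.err"};

Such a job would typically be submitted with a single command, e.g. glite-wms-job-submit -a hostname.jdl.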

Distributed under a business-friendly open source licence, gLite integrates components from the best of the current major grid stacks, such as Condor and the Globus Toolkit, as well as components developed specifically for the LHC Computing Grid (LCG). The product is a best-of-breed, low-level middleware solution, compatible with popular schedulers such as PBS, Condor and LSF, and with the most popular data storage and archiving systems. It is built with interoperability in mind and provides a large variety of services that enable the creation, operation and use of a general DCI for all research fields.

The gLite stack combines low-level core services with a range of higher-level services. The gLite Grid services follow a Service Oriented Architecture and the most popular standards, such as those developed by the Open Grid Forum. The gLite stack is designed as a modular system, allowing sites and users to deploy and use different services according to their needs. In this way each community can tailor the system to its individual requirements.

gLite continues to add new features in all areas of the software stack, including constant improvements in security, better interfaces for data management and job submission, a re-factored information system, and many other improvements that make gLite increasingly efficient and easy to use.

The initial effort within EMI will be to provide packaged distributions for a wider selection of platforms and operating systems, more uniform interfaces to the resources, a rationalisation of the APIs, a plan for a coordinated infrastructure activity with the other middleware stacks, and many other technical improvements to enhance interoperability and the user experience.

Contributed by: Andrea Caltroni, Diana Cresti, Mirco Mazzucato

 

UNICORE: Seamless Access to Computational and Data Resources

The European Grid technology UNICORE has a history of more than 10 years. Originally initiated in the supercomputing domain, UNICORE is today a general-purpose Grid technology. In its most recent version, UNICORE 6, it follows the latest standards from the Grid and Web services world and offers a rich set of features to its users.

UNICORE is used in Grid infrastructures of very different natures and without limitations on the type of computing resource: from single PCs coupled together into a campus Grid, through interconnected cluster systems such as those of the EGI Grid infrastructure, to leadership-class HPC systems like those forming the PRACE Research Infrastructure. UNICORE contributes to EMI its knowledge of HPC and its expertise in non-intrusive integration into existing infrastructures.

UNICORE is supported by the UNICORE Forum; all software is available as open source under the BSD licence from the UNICORE website, while the software repository is hosted on SourceForge.

UNICORE's main features are:

  • Portability: Being Java-based, all UNICORE 6 needs to run is Java SE 1.6 or later.
  • Interoperability: UNICORE 6 uses HTTPS-based Web services as well as several common open Grid standards.
  • Security: Access to UNICORE 6 is governed by authentication through a HTTPS gateway and authorisation via a rule-based access control engine.
  • Service-orientation: All services accessible to users are implemented as Web services and cover, among others, use cases such as target system access and job management (a usage sketch follows this list).
  • Extensibility: The modular architecture and open-source character of UNICORE 6 provide for ease of extensibility.
  • Scalability: Multiple UNICORE 6 installations can be combined to form a distributed, multi-organisational Grid allowing for thousands of jobs.
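
As an illustration of the job-management use case mentioned above, a simple job can be submitted through the UNICORE commandline client (UCC). This is a sketch under assumptions: the job file below uses UCC's JSON-like job description syntax, the file name is a placeholder, and the client is assumed to be configured with valid credentials and a registry address.

    {
      Executable: "/bin/date"
    }

Saved as date.u, the job is submitted, executed and its output retrieved with:

    # submit the job, wait for completion and fetch the output
    ucc run date.u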

Contributed by: Daniel Mallmann