OSG News, March 2006
Meetings and Events
CDIGS Roadmap Workshop
March 30, 2006

Condor Week 2006
April 24–27, 2006

GGF17
May 9–12, 2006


The PPDG Common Project
Eileen Berman and Doug Olson
If you look up the word "common" in the dictionary, you will find the following definition: "of or relating to the community as a whole," i.e., for the common good. This describes the PPDG Common Project well: a group of people from eight institutions, working on many different projects, who are involved in providing services and software to support and enable grid infrastructure.

All of the PPDG Common Project activities include collaborators from outside PPDG and have had to find compromises between the needs of individual VOs and those of the greater grid community. The projects contribute directly to the OSG program of work, and many are included in the OSG 0.4.0 release. PRIMA and GUMS are key elements of the security and authorization infrastructure. The OSG Discovery Service is used as the main catalog registry system for VOMS servers. JobMon allows secure and authenticated access to jobs running anywhere on OSG. The Grid Exerciser is used on the Integration Testbed as an OSG distribution validation tool. The SRM-tester harness tests OSG Storage Element services for compliance with SRM Protocols v1.1 and v2.1.1.

An Accounting Service, which records a grid-wide view of a VO member's resource utilization, and a Resource Selection Service to support automatic selection of OSG resources based on job requirements are both scheduled to be part of OSG 0.6.0. Also under development are an Edge Services Framework to support deploying VO-specific services on many sites and the gPlazma authorization framework for storage, which is currently being tested within the dCache deployment.

While the PPDG project will end in June, members of the PPDG Common Project will continue to contribute to the OSG and grid computing in general. Planning is underway to transition the integration and support of Common Project activities from the PPDG era into the future.

Eileen Berman, Fermilab and Doug Olson, Berkeley Lab

From the Executive Director
OSG Executive Board
Members of the Executive Board met March 23 at Fermilab.
Dear OSG Consortium and Friends,

On March 6 we submitted the OSG proposal to the DOE SciDAC-2 solicitation. Members of the Executive Board were actively discussing proposals submitted by other PIs to understand how they may benefit OSG stakeholders and the OSG facility. I wrote 11 letters of support following the recommendation of members of the Council—thank you to those who volunteered to read the proposals.

The Executive Team has been meeting weekly by phone to discuss the status of VO use of available resources and the actions it requires, along with the operational and software issues that must be resolved to make the OSG facility more effective and robust. CDF is now able to use the Massachusetts Institute of Technology and University of Nebraska-Lincoln sites, D0 is maintaining its use of the University of Oklahoma's site, and ATLAS and CMS often run several hundred jobs each. LIGO is making progress on using OSG 0.4.0 sites.

In the next month we will focus on: enabling GADU to have a several-week-long run using up to 500 CPUs; enabling CDF to use additional sites; completing the release of OSG 0.4.1; supporting the smooth running of CMS and ATLAS jobs on OSG sites submitted through the WLCG interfaces; and making progress on interoperability with the TeraGrid. We also want to better understand how many of the available batch slots can actually be filled.

We are still effort-limited in many areas, including storage use and management, progress in policy and authorization usage, help for VOs to improve their throughput, and training and documentation. The Executive Board will review our progress in these areas at its next few meetings.

Sincerely,
Ruth Pordes, OSG Executive Director

Applications - Genome Analysis
Dinanath Sulakhe
GADU, the Genome Analysis and Database Update system, uses OSG resources to run computationally intensive tools such as BLAST, Blocks, and Chisel on all publicly available genome sequence data. These computations are performed periodically as the volume of the sequence data increases in the public domain. The results computed using OSG are fed into a variety of applications such as GNARE, PUMA2, Pathos, TarGet, and Chisel, used by over 2,400 researchers worldwide to study topics of scientific interest such as bioremediation, the use of microorganisms to clean up pollution.

In the last run in January, GADU processed 3.1 million protein sequences using various tools. The run processed 1,000 sequences per batch job on 10 CPUs (100 sequences per CPU) on a selected site. Each batch job took about three hours and was CPU and tool bound, with 300 kB of input generating about 10 MB of output data. The throughput was about 500 batch jobs per day. The complete task of running three tools for 3.1 million sequences was accomplished by running about 93,000 jobs (9,300 batch jobs or DAGs) and generating about 400 GB of data.
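The job counts quoted above can be reproduced with a few lines of arithmetic. The sketch below uses only the figures from the article; the variable names are illustrative, and it assumes that each batch job fans out into one job per CPU and that the quoted throughput is sustained for the whole run.

```python
# Figures quoted in the article for the January run.
TOTAL_SEQUENCES = 3_100_000   # protein sequences processed
SEQUENCES_PER_BATCH = 1_000   # sequences handled by one batch job (DAG)
CPUS_PER_BATCH = 10           # CPUs used per batch job (100 sequences each)
TOOLS = 3                     # tools run over the full sequence set
BATCHES_PER_DAY = 500         # reported throughput in batch jobs per day

# Batch jobs needed to cover every sequence once, for a single tool.
batches_per_tool = TOTAL_SEQUENCES // SEQUENCES_PER_BATCH

# Every tool is run over the full set of sequences.
total_batches = batches_per_tool * TOOLS

# Assumption: each batch job consists of one job per CPU.
total_jobs = total_batches * CPUS_PER_BATCH

# Assumption: the quoted throughput holds for the entire run.
days_at_quoted_rate = total_batches / BATCHES_PER_DAY

print(f"batch jobs per tool: {batches_per_tool}")
print(f"total batch jobs:    {total_batches}")
print(f"total jobs:          {total_jobs}")
print(f"days at quoted rate: {days_at_quoted_rate:.1f}")
```

Under these assumptions the totals match the 9,300 batch jobs and 93,000 jobs reported for the run, and the quoted throughput implies a run of a little under three weeks.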

GADU Performance
GADU's performance using grid resources to run BLAST.
GADU uses the Virtual Data System (VDS) to generate and submit the jobs as Condor DAGs. VDS helps with data provenance and tracing errors to specific sites. GADU submits batch jobs automatically to different sites using its internal site selection mechanism, which makes sure that no more than one job is waiting in the queue at any site, thus minimizing the load on the gatekeeper.
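The "no more than one job waiting per site" rule can be illustrated with a small sketch. This is not GADU's actual site selection code, which the article does not show; the `Site` record, the `assign` function, and the site and job names are hypothetical, and the real system would obtain queue state from the grid rather than from in-memory counters.

```python
from dataclasses import dataclass


@dataclass
class Site:
    """Hypothetical record of how many submitted jobs are waiting at a site."""
    name: str
    queued: int = 0


def assign(batch_jobs, sites, max_queued=1):
    """Run one submission pass.

    A site receives a job only if fewer than max_queued jobs are already
    waiting there. Returns (assignments, still_pending).
    """
    pending = list(batch_jobs)
    assignments = []
    for site in sites:
        if not pending:
            break
        if site.queued < max_queued:
            job = pending.pop(0)
            site.queued += 1  # the job now waits in this site's queue
            assignments.append((job, site.name))
    return assignments, pending


# Example: site_b already has a job waiting, so it is skipped in this pass.
sites = [Site("site_a"), Site("site_b", queued=1), Site("site_c")]
assignments, pending = assign(["dag_001", "dag_002", "dag_003"], sites)
print(assignments)
print(pending)
```

In this sketch a job that cannot be placed stays with the submitter until a later pass, so the waiting work accumulates on the submitting side instead of at a site's gatekeeper.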

During the last run in January, GADU VO jobs had access to only about 8-10 OSG sites and were not authenticated by a large number of sites. With the help of the GOC, we are working on getting more sites to authenticate GADU jobs. In the next update, scheduled for April, we intend to add new tools and continue running large updates every two months. GADU will also be running these tools regularly for small sets of sequences submitted by GNARE users.

Dinanath Sulakhe, Argonne National Laboratory
