OSG Newsletter – May 2013

*The DOE Grids Certificate Authority (CA) has stopped issuing certificates. Please use idmanager.opensciencegrid.org instead.

Predicting agricultural impacts of large-scale drought

Joshua Elliott, a fellow at the Computation Institute at the University of Chicago, and several colleagues are studying the effects of the 2012 drought on corn yields in the US. In a recent paper, the team argues that the drought's severity demonstrates a pressing need for better analytical tools.

The researchers are undertaking a model-based assessment of the 2012 US growing season using the parallel System for Integrating Impact Models and Sectors (pSIMS). The system is a high performance computing framework that fuses independent climate and agriculture models at large scales, producing 5-arcminute spatial resolution (about 10 km) simulations. The pSIMS framework is written in Swift, an open source parallel scripting language developed at the Computation Institute and prototyped through the University of Chicago Computing Cooperative (UC3) campus grid, the Open Science Grid (OSG), and the Extreme Science and Engineering Discovery Environment (XSEDE).
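
To make this concrete, the sketch below shows what a Swift script in this style might look like: each grid tile becomes an independent task, and the Swift runtime schedules the tasks in parallel across whatever resources are available. It is a hypothetical illustration, not actual pSIMS code; the psims_run executable, its flags, the tile ranges, and the output naming are all invented stand-ins.

    type file;

    // Hypothetical wrapper around a single-tile crop simulation;
    // "psims_run" and its flags are illustrative only.
    app (file out) runTile (int lat, int lon) {
      psims_run "--lat" lat "--lon" lon stdout=@out;
    }

    // One independent simulation per grid tile; Swift dispatches
    // the tasks in parallel across the campus grid, OSG, or XSEDE
    // as capacity allows.
    foreach lat in [0:99] {
      foreach lon in [0:99] {
        file yield <single_file_mapper;
                    file=@strcat("yield_", lat, "_", lon, ".out")>;
        yield = runTile(lat, lon);
      }
    }

Because every task is independent, the same script can scale from a single workstation to many thousands of cores without modification, which is what makes this style of framework a natural fit for high throughput resources like the OSG.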

“The goal is to create a probabilistic framework to estimate and forecast climate change impacts on a variety of temporal and spatial scales—ranging from local field scale all the way up to continental and global scales—and in time ranging from seasonal scales to multi-decadal climate change scales,” explains Elliott. “The framework runs several different models at once using the same set of assumptions and scenarios.”

Image: The median deviation of simulated 2012 county-level yields from linear trend, as a percentage of county-specific trend yields from 1979 to 2011.

The team is validating their model against county-level data released by the US Department of Agriculture. By fine-tuning the underlying mechanics, they hope to determine how soon accurate predictions can be made about an upcoming harvest. Extending that lead time will ultimately give farmers and decision makers more opportunities to adjust for adverse weather – and researchers more opportunities to simulate approaches to improving crop resilience.

“The Open Science Grid provides us with easy access to compute resources an order of magnitude larger than we otherwise could get from campus clusters,” Elliott says. Ultimately, the OSG gives Elliott compute cycles without the overhead that sometimes comes with large-scale resources, such as difficult custom architectures, required proposals (and subsequent wait times), and complicated access procedures. “Grad students without degrees in computer science—or in our case students from geoscience—are able to get accounts and get started on the OSG in relatively short order,” says Elliott.

Elliott also made a point of praising the OSG support system for researchers. “For example, in some cases, code may need to be optimized for distributed computing. The community is full of great people who can help solve all sorts of problems. Researchers should definitely take advantage of the OSG.”

Elliott notes that they have only scratched the surface of possible applications for high performance computing and big data in simulating crop yields and climate change impacts. He thinks this type of work is essential for helping stakeholders at all scales—from local farmers to agri-businesses to governments and even international aid agencies—respond to the challenges of drought, food insecurity, and climate change over the coming years and decades.

“Agricultural risk management and adaptation planning at seasonal to multi-decadal time scales has become more important than ever in the context of global socio-economic and environmental change,” Elliott says. “Countries around the globe will be facing difficult choices over the coming decades, and we’re hoping these tools and analyses can help. Resources like the Open Science Grid are essential if we want to solve big data challenges like this.”

“A world four degrees warmer than it is now is not a world that we’ve ever seen before,” Elliott says. “Studying years like 2012 in detail can potentially be very useful for helping us understand whether our models can hope to accurately capture the future.”

~ Rob Mitchum (communications manager, Computation Institute) and Greg Moore

*For more on Elliott’s research and its insights into the future of agriculture, read Rob Mitchum’s April 17 iSGTW newsletter article: http://www.isgtw.org/feature/historic-drought-yields-harvest-data


The Release of OASIS

The OSG Application and Installation Service (OASIS) was released for use by virtual organizations (VOs) on April 9, 2013. This new service, provided by the OSG Grid Operations Center (GOC), gives VOs a uniform way to distribute their software stacks. Based on the CERN Virtual Machine File System (CVMFS), OASIS provides a mechanism for a central software repository to be distributed reliably and securely to OSG compute resources. Distributing the software needed to run jobs has long been a challenge because participating institutions have adopted varying mechanisms for installing software on their resources.

OASIS is implemented on three machines operated by the GOC. An interactive login node is provided where designated representatives of participating VOs deposit their software. Once the VO OASIS Manager is satisfied with the content on the login node, a request to “publish” it is issued, and a few minutes later the content is available on all compute resources that use the OASIS repository.

The publication mechanism is implemented transparently at this level, ensuring that only one VO publishes at any given time and that a partially filled repository is never published.

Delivery of content is achieved through a multi-tiered design. The highest level of the hierarchy is the “Stratum-0”. Access to this level of the service is via the publish mechanism of the login node and is limited to VO OASIS Managers; this is the only tier that allows write access to the repository. Extensive technical and administrative security measures are in place to ensure the integrity of the content at this tier.

The next tier of the service, the “Stratum-1”, replicates the content of the Stratum-0 and allows read-only access from the outside world. Since this is the layer that makes content available to compute resources, reliability is paramount: several replicas are operated at geographically distributed locations to increase the availability of the service as a whole. The Stratum-1 is run at critical priority, the highest operational level at the GOC.

Finally, a worker-node client makes the content from the Stratum-1 appear as a directory tree on the node. Fail-over across different Stratum-1 replicas is easily configured at this level, as the sketch below illustrates.
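
For example, a worker node's CVMFS client can list several replica URLs in its CVMFS_SERVER_URL setting, and the client tries each in turn until one responds. The hostnames below are placeholders for illustration, not the actual OASIS Stratum-1 servers:

    # /etc/cvmfs/config.d/oasis.opensciencegrid.org.local
    # Placeholder hostnames: the client fails over to the next replica
    # in the list if the current one becomes unreachable.
    CVMFS_SERVER_URL="http://stratum1-a.example.edu:8000/cvmfs/oasis.opensciencegrid.org;http://stratum1-b.example.edu:8000/cvmfs/oasis.opensciencegrid.org"
    CVMFS_HTTP_PROXY="http://squid.example.edu:3128"

Once the client is configured, the repository appears under /cvmfs/oasis.opensciencegrid.org, and jobs reference VO software through ordinary file paths.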

As the service moves from the implementation phase to the adoption phase, simple means of making use of it will be developed, announced, and distributed. If you want to use OASIS to distribute or consume content, please contact the GOC at [email protected].

OASIS is the work of many people, and the GOC gratefully acknowledges the contributions of the OSG Technology Group, the staffs of Fermi National Accelerator Laboratory and Brookhaven National Laboratory, and the university members of the OSG.

~ Scott Teige


2013 OSG Annual Report Submitted

OSG submitted our annual progress report to the National Science Foundation (NSF) on April 6, 2013. This is the first report for the current five-year Open Science Grid project, which officially started on June 1, 2012. The same report was also sent to our program officer in the Department of Energy Office of Science on April 9, 2013.

In this report, we highlight the first year of work under the new project: addressing the needs of our stakeholders and creating broader access to distributed high throughput computing (DHTC) for researchers in the US. We report on the project's accomplishments, the contributions from our many partners in the OSG consortium, and how we are addressing future challenges.

The work of preparing the report is itself a distributed process (like OSG!), with contributions from the science experiments and campus research communities, the functional area coordinators, and the Executive Team; the Council Chair and Project Manager served as editors and compiled the report.

The 2013 OSG Annual Report is available as a public document in the OSG document database: http://osg-docdb.opensciencegrid.org/cgi-bin/ShowDocument?docid=1146. The report is extensive and provides details on the many facets of our work; the Executive Summary may provide a good synopsis for most of our community.

For more information, please contact Chander Sehgal.


HTCondor Week 

Many of the attendees and published talks at HTCondor Week came from, or are of great relevance and interest to, OSG participants. While the hallway conversations add to the impact, the material available on the web gives a good flavor of the value and breadth of high throughput computing activities in the community. A good summary is captured by two images from Miron Livny's talk.

Images: Condor Week 1; Condor Week 2 (from Miron Livny's talk).

~ On behalf of Miron Livny
OSG Technical Director and PI


From the OSG Blogs – Notes from the Campus Infrastructure Community Pegasus webinar

Mats Rynge gave a webinar on “Managing HTC Workflows with Pegasus” for the OSG Campus Infrastructures Community (http://www.campusgrids.org/). The recording is available from the WEBCASTS tab on that site, and the accompanying materials are posted on the Pegasus website. There were 18 participants.

~ Rob Gardner


From OSG Communications

Much of the work of the OSG is led and driven by the Area Coordinators: Mine Altunay for Security, Brian Bockelman for Technology, Tim Cartwright for Software, Chander Sehgal for User Support, Dan Fraser for Production and Campus Grids, Rob Quick for Operations, and Shawn McKee for Networking. At weekly meetings, the Area Coordinators present the work being done and discuss current issues and future needs. The information presented and the outcomes are a good summary not only of “OSG day-to-day” but of “OSG tomorrow.”

Agendas for the meetings are available online (ignore any certificate error messages).

Summaries of the meetings are published, and we include a snapshot of recent information here:

From the April 17th Technology meeting:

  1. OASIS was released on April 9 and is now being used by NOvA. Documentation needs to be improved as more sites deploy the service.
  2. HTCondor-CE – interacting with the production and software areas to devise a release/deployment schedule.
  3. glideinWMS usability – work on anonymous accounts and traceability is proceeding on schedule.

From the April 10th User Support meeting:

  1. OSG-XSEDE usage has increased to 1M hours per week. We are continuing to improve integration with the XSEDE infrastructure, most recently to automate user allocation policies, and are extending the available resources to include the Fermilab CMS Tier-1 (for users with certificates).
  2. Public Storage – usage by small communities continues to increase slowly. iRODS was recently used to deploy a large (20 GB) input data set for the Snowmass group. The operations and technology areas are helping prepare recommendations, due to the Executive Team in one month, on how to move forward with this service.

From April 3rd Security:

  Key accomplishments:
  1. Pakiti demo at the All-Hands Meeting. FermiGrid is interested in monitoring its services with Pakiti; additional sites are being sought.
  2. The OASIS/CVMFS security assessment was completed satisfactorily. The official CVMFS deployment date at the GOC is April 9.
  3. Completed a risk assessment of proxy certificates using MD5 and SHA-1: no major concerns, although a move to SHA-2 is still recommended.

From March 27th Campus Grid and Networking meeting:

  1. The latest Campus Infrastructure Community meeting, held at the OSG All-Hands Meeting, was well attended.
  2. BOSCO supports access to XSEDE resources for CMS through SSH rather than GRAM. It is stable at the scale of 5,000 jobs, with a target of 1.5M service units (SUs).
  3. Improvements and new features for perfSONAR and the Dashboard.

~ Ruth Pordes

OSG Communications