~ Greg Moore and Sarah Engel
In 2009, São Paulo State University in Brazil (Universidade Estadual Paulista, or UNESP) began operating GridUNESP, one of the largest multi-campus grid infrastructures in Latin America fully dedicated to scientific research. With 23 campuses distributed throughout the state of São Paulo, UNESP is the second-largest university in Brazil. A research group led by Sergio Novaes formed the Center for Scientific Computing (CSC), which now operates two main clusters: the São Paulo Research and Analysis Center (SPRACE), dedicated to CERN, and GridUNESP, dedicated to university researchers and students. GridUNESP was a spinoff of SPRACE, which provides computation and storage for the data produced by the Compact Muon Solenoid experiment at CERN.
GridUNESP is designed to take advantage of the distributed scientific research at UNESP. The grid consists of a central cluster located in São Paulo city, with seven secondary clusters at different UNESP campuses spread over the state. The eight sites are connected through the KyaTera research and education network for São Paulo, provided by FAPESP (Fundação de Amparo à Pesquisa do Estado de São Paulo) in partnership with Telefonica Brazil. GridUNESP is also connected to the US Internet2 Network through the Americas Lightpaths (AmLight) links.
The number of researchers using GridUNESP and its resources has been growing. In its four years of operation, GridUNESP has processed 6 million jobs and logged 46.5 million CPU hours. Outside researchers can also use the grid if they are affiliated with or connected to UNESP researchers. A major upgrade of the GridUNESP physical infrastructure is planned for 2014.
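For a rough sense of scale, the usage figures above can be turned into per-job and per-core averages. The short sketch below is only back-of-the-envelope arithmetic on the rounded numbers quoted in this article; the function names are illustrative, and the "four years" and "6 million" inputs are approximations, so the results should be read as estimates.

```python
# Figures quoted in the article for GridUNESP's first four years (rounded).
TOTAL_JOBS = 6_000_000
TOTAL_CPU_HOURS = 46_500_000
YEARS_OF_OPERATION = 4


def mean_cpu_hours_per_job(cpu_hours, jobs):
    """Average CPU hours consumed by a single job."""
    return cpu_hours / jobs


def mean_busy_cores(cpu_hours, years, hours_per_year=365 * 24):
    """Average number of cores kept busy around the clock (ignores leap days)."""
    return cpu_hours / (years * hours_per_year)


avg_job = mean_cpu_hours_per_job(TOTAL_CPU_HOURS, TOTAL_JOBS)
avg_cores = mean_busy_cores(TOTAL_CPU_HOURS, YEARS_OF_OPERATION)

print(f"{avg_job:.2f} CPU hours per job on average")
print(f"roughly {avg_cores:.0f} cores busy on average")
```

On these assumptions, a typical job used a little under eight CPU hours, and the logged work is equivalent to somewhat more than a thousand cores running continuously for four years.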
“This year is our first major upgrade of GridUNESP,” said Rogério Iope, a CSC systems engineer who began working with SPRACE in 2005 and has worked on the CSC team since 2009. “We have a budget of around $1 million to buy new hardware, install a new no-break [uninterruptible power supply] system, and upgrade our network link, allowing us to reach speeds up to 100 gigabits per second. This will also allow us to reach the US networks such as Internet2, and thus US Open Science Grid (OSG) resources, at 100 gigabits per second, helping our researchers to use OSG resources outside of Brazil.”
A new web portal is facilitating job submissions to GridUNESP. With development led by UNESP CSC team members Gabriel Winckler and Beraldo Costa, the GridUNESP portal offers an easy-to-use platform for scientists to manage the execution of computer simulations. Through a modern web interface, it lets researchers tap the full computing power of the grid with just a few clicks, from a computer, tablet, or even a phone. The platform also incorporates pre-installed applications for a variety of scientific areas, so a researcher without computing expertise can access all the resources. For more experienced users, it provides command-line tools and workflow execution. The portal is extensible through a RESTful API (a type of web service), and all code is available as open source under the GPLv3 license.
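To give a flavor of what scripting against a RESTful job portal can look like, here is a minimal sketch in Python. It is not the GridUNESP portal's actual API: the base URL, the `/jobs` path, the JSON field names, and the bearer-token header are all hypothetical placeholders chosen for illustration. The code only builds the HTTP request and never sends it.

```python
import json
import urllib.request


def build_job_request(base_url, token, application, arguments):
    """Build (but do not send) a POST request describing one job.

    The endpoint path and payload fields are hypothetical, not taken
    from the real GridUNESP portal.
    """
    payload = {"application": application, "arguments": arguments}
    return urllib.request.Request(
        url=f"{base_url.rstrip('/')}/jobs",  # hypothetical endpoint
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {token}",  # hypothetical auth scheme
        },
        method="POST",
    )


# Placeholder values; a real deployment would document its own.
request = build_job_request(
    "https://portal.example.org/api", "TOKEN", "example-app", ["--steps", "1000"]
)
print(request.get_method(), request.full_url)
```

In a real client, the request would then be sent with `urllib.request.urlopen(request)` and the response parsed according to whatever the portal's own documentation specifies.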
Another important milestone for the UNESP CSC team was establishing its own certificate authority. Together with ANSP (Academic Network of São Paulo), the team created a certificate authority for the state of São Paulo. According to Sergio Lietti, the team member responsible for the deployment, ANSPGridCA (ANSP Grid Certification Authority) began issuing certificates for GridUNESP in March 2013. Following nine months of testing, certificate production is now open to the entire academic community of the state of São Paulo, so researchers no longer need to rely on a US certificate authority.
“There is also a new thematic project being prepared to be submitted to FAPESP,” Iope said. “Our main partner—Federal University of São Carlos—is leading the project and is in the same state. Our aim is to build infrastructure for eScience in the state. We are planning to deploy distributed infrastructure at different universities, starting with UNESP and Federal University of São Carlos and in the long term involving other institutions. The idea is to work on cloud infrastructure that has integration between clouds and with grids, so we should be able to run OSG middleware on top of it. That would open opportunities statewide.”
From the very beginning, GridUNESP established a formal partnership with the OSG. The GridUNESP VO (virtual organization) was established in December 2009 as the first OSG VO outside the US. As a multipurpose VO, it includes projects from astronomy, biology and biophysics, biomedical engineering, chemistry, computer science, the geosciences, materials science, meteorology, and physics. This partnership enables GridUNESP to use the OSG middleware stack to integrate its computational resources and share them with other institutions.
In his 2014 GridUNESP VO annual report to the OSG, Iope writes that GridUNESP VO resource usage—measured against the mean occupancy of computing resources—has grown from 19% in 2010 to 53% in 2011, 74% in 2012, and a peak of 89% in 2013.
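The pace of that growth is easier to see as year-over-year changes. The following sketch simply subtracts consecutive values from the occupancy figures quoted above; the variable and function names are illustrative only.

```python
# Mean occupancy of GridUNESP VO computing resources, in percent,
# as quoted from the annual report.
occupancy = {2010: 19, 2011: 53, 2012: 74, 2013: 89}


def yearly_changes(series):
    """Return the change in percentage points from each year to the next."""
    years = sorted(series)
    return {
        current: series[current] - series[previous]
        for previous, current in zip(years, years[1:])
    }


changes = yearly_changes(occupancy)
for year, delta in changes.items():
    print(f"{year}: +{delta} percentage points")
```

The largest jump, 34 percentage points, came between 2010 and 2011, with smaller gains of 21 and 15 points in the two following years as the resources approached full occupancy.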
“The OSG has been a good opportunity for our researchers to have contact with the concept of grid computing,” added Iope. “At one time, each researcher had their own small cluster. The idea of deploying the grid showed researchers that it is possible to run jobs from a central system to distributed resources that may be spread out. Since UNESP is the most distributed university in Brazil, it was ideal to show that distributed infrastructure can be tied together.”
After establishing the formal partnership with OSG in 2009, GridUNESP researchers started using the OSG middleware stack and sharing resources with other institutions in the US. GridUNESP can run jobs from several different VOs in the US. “We have had a very strong partnership with OSG,” said Iope. “In 2010, we organized the São Paulo OSG School with the full support of the OSG Education, Outreach, and Training team and brought five OSG experts to Brazil. Our attendees learned how to maximize their research using grid resources.”
By 2011, local users started submitting more jobs. GridUNESP now sees more local usage than jobs coming from outside. When resources are free, however, OSG users from anywhere in the US can use them. “The upgrade that is happening this year,” said Iope, “will bring about increased usage and even more partnership with the OSG.”
Iope is one of the main collaborators on the GridUNESP upgrade and is deeply involved in the eScience project. He will be responsible for building the education and training infrastructure for new eScience users and has been busy formulating ideas. He plans to use a platform called HUBzero. First developed at Purdue University in the US, HUBzero is an open-source platform that enables scientists and researchers to create dynamic websites that support their activities.
“In the eScience project, we will need to provide dynamic content for researchers and for teachers,” Iope said. “This is my main involvement at the moment. We are also working closely with Intel in Brazil to deploy a Manycore Testing Lab (MTL). We will be deploying the first MTL outside the US. It will be tied to building an educational platform to use for teaching grad students about parallel computing.”
The team has until the end of April to submit the eScience proposal. Until then, Iope will be fully dedicated to completing his piece of the proposal, after learning how to deploy and use HUBzero (not to mention how to teach it to others) and integrating OSG middleware with the MTL. “This all depends on close interaction with the OSG development team,” noted Iope. “They are also interested in this integration of cloud and grid, and this is a good opportunity to start working together.”