The American Museum of Natural History Ramps Up Education on Research Computing

By: Sarah Matysiak

December 15, 2023

With a multi-day workshop, the museum strives to expand the scale of its educational and training services by bringing additional computing capacity to New York-area researchers and tapping into the power of high throughput computing (HTC).

After “falling in love with the system” during the 2023 OSG School, American Museum of Natural History (AMNH) bioinformatics specialist Dean Bobo wondered if he could jump on an offer to bring New York institutions’ and researchers’ attention to the OSPool, a pool of computing capacity freely available to researchers at U.S.-affiliated institutions. Research Facilitation Lead Christina Koch had mentioned the capacity of the National Science Foundation (NSF)-funded Partnership to Advance Throughput Computing (PATh) project to help institutions put on local trainings. So he reached out to Koch, and indeed the offer did stand!

The PATh project is committed to advancing the state of the art and adoption of high throughput computing (HTC). As part of this commitment, the project annually offers the OSG School at UW–Madison, which is open to participants who want to transform their research and scale out utilizing HTC. AMNH wanted to host a shortened version of the OSG School for their researchers with the help of the PATh team.

A Successful Workshop

Through Koch, Bobo connected with Research Computing Facilitator Rachel Lombardi, who helped him plan the OSPool workshop on the second day of the museum’s multi-day workshop. “It was for our own museum community, but for other outside institutions as well,” Bobo says. So, Bobo arranged a computational skills training on November 3 and 6 at the AMNH in New York, New York. This was the first time the museum arranged a multi-day workshop with one day centered around OSPool resources.

The first day of the two-day training included a workshop teaching basic computational skills to an audience of students from the museum’s graduate program, as well as graduate students and researchers from various institutions around New York City. About 20 people chose to attend the second day, which involved training on OSPool resources. That day, Lombardi led a workshop likened to an OSG School crash course, with lectures covering software and container basics, principles of job submission, troubleshooting, learning about the jobs a user is running, and information on the next steps researchers could take.

Rachel Lombardi during her presentation.

The workshop was a success, which Bobo measured through the number of eyes it opened, including “folks who are completely new to HTC but also people who are more experienced with high performance computing on our local HPCs. They realized the utility and the capabilities of the OSPool and the resources therein. Some folks after the workshop said that they would give it a shot, which is great for me to hear. I feel like all this work was worth it because there are going to be attempts to get their software and pipelines lifted over to the OSPool.”

Empowering the HTC Community

The AMNH is looking to start hosting more OSPool events, bringing an event inspired by the OSG School locally to New York, and this workshop was the first step toward future OSPool workshops. From leading a section of the workshop, Lombardi learned “what resources [the AMNH] would need from PATh facilitators to run its own OSPool trainings.” The goal is to “empower them to do these things [conduct training] without necessarily waiting for the annual OSG School,” notes Lombardi. Bobo picked up a few valuable lessons too. He gained insights about community outreach and a better understanding of instructing on HTC and utilizing OSPool capacity.

In this sense, the workshops the AMNH hosted — with support from PATh — reflected the ideal of “training the trainers” to scale out the facilitation effort and share computing capacity. “It won’t be sustainable to come in person and support a training for everyone who asks, so we’re thinking about how to develop and publish easy-to-use training materials that people could use on their own, a formal process of (remote) coaching and support, and even a ‘train the trainers’ program where we could build community among people who want to run an OSPool training,” Koch explains.

A Continuing Partnership

Even before arranging the two-day workshop, the AMNH already had a strong partnership with PATh and the OSG Consortium, which provides distributed HTC services to the research community, Bobo says. The museum contributes its spare CPU power to the OSPool, and museum staff as well as PATh system administrators and facilitators communicate regularly. So far the museum has contributed over 15.5 million core hours to the OSPool.

One way the museum wants to utilize the OSPool capacity is for a genomic surveillance tool that surveys the population dynamics of diseases like COVID-19, RSV, influenza, or other emerging diseases. “We’ve been using this method of diversity called K Hill. We’re looking to port that software into the OSPool because it’s computationally expensive to do this every day, but that becomes feasible with the OSPool. We would like to make this tool a public resource, but we would have to work with the PATh facilitators to figure out if this is logistically possible. We want to make our tools ported to the OSPool so that you don’t need your own dedicated cluster to run an analysis,” Bobo explains.

Future Directions

When asked what’s in store for the future of this partnership, Bobo says he wants it to grow by putting on workshops that mirror the OSG School as a means of generating proximity and convenience for investigators in New York for whom the school may be out of reach. “We are so enthusiastic about building and continuing our relationship with the PATh project. I’m looking forward to developing a workshop that we run here at the museum. In our first year, getting help from the facilitators whom I’m familiar with would be really helpful, and this is something that I’m looking forward to doing subsequent to our first workshop to get there. There’s definitely more coming from our collaboration,” Bobo elaborates.

The PATh facilitators aim to give community members the resources they need to learn about the OSPool and control workload placement at the Access Points, Lombardi explains. Attending and arranging trainings at this workshop with the AMNH was one of the ways they upheld this goal. “I feel like we hit the nail on the head with this event set up in that we provided OSPool as a resource and they provided a lot of valuable input and feedback; it’s like a two-way street.”