Following the Long Tail of Economics: Amit Gandhi Uses the OSG to Outdo Parallel Computing

Amit Gandhi is an assistant professor of economics at the University of Wisconsin-Madison. His specialty is industrial organization, and his research involves using data from industry to estimate the parameters in economic models. Once he knows those parameters (in other words, once he recovers the economic model from the data), he can use the model for policy analysis, answering policy-relevant questions such as what the likely effect of a regulation will be.

Among other things, Gandhi is studying the long-tail phenomenon: the full range of product variety in a given market, which is consistent with the choices consumers actually face. His goal is to produce economic models of demand and supply that capture the full richness of product variety in the marketplace. Traditional economic analysis has focused on competition among the top handful of brands or products in an industry. In reality, industries are characterized by a much larger mass of product variety than standard models have allowed, and this variety is important for understanding competition and economic forces. The concept of “the long tail” is credited to Chris Anderson, author of “The Long Tail: Why the Future of Business is Selling Less of More.” According to Anderson, products in low demand or those that have a low sales volume can collectively make up a market share that rivals or exceeds the relatively few current bestsellers.

Gandhi’s recent research has generalized traditional models to take into account the entire pattern of product variety in an industry (i.e., both the head and the tail of demand), but one key cost is that estimating these models against data is computationally intensive. The model makes a prediction, the data seem to say something, and the econometrics underlying Gandhi’s methods tries to reconcile the two and minimize the discrepancies. Optimization is especially challenging because the problem becomes increasingly non-linear as models become more complicated. Gandhi maintains that the best way to solve non-linear problems like these is a grid search, because it is the only way to know for sure that you have found the “best” answer. People stopped doing grid searches, however, once the number of points to search became too large.

For the models Gandhi has recently designed to study massive product variety, the econometric procedure requires “testing” any candidate parameter value to see whether it is an accepted member of a confidence set (a confidence interval of parameter values), and hence the estimation problem takes the form of an exhaustive grid search.
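The procedure described above can be illustrated with a minimal Python sketch. It is not Gandhi's actual estimator: the linear toy model, the sum-of-squares statistic, the data, the critical value, and the function names `in_confidence_set` and `grid_search_confidence_set` are all assumptions made for illustration. The only point it shows is the structure: every candidate parameter value on the grid is tested, and the accepted ones form the confidence set.

```python
import itertools

def in_confidence_set(theta, data, critical_value):
    """Hypothetical acceptance test for one candidate parameter vector.

    Assumes a toy linear model y = a + b*x and accepts theta when the sum of
    squared discrepancies between prediction and data is small enough.
    """
    a, b = theta
    stat = sum((y - (a + b * x)) ** 2 for x, y in data)
    return stat <= critical_value

def grid_search_confidence_set(grid_axes, data, critical_value):
    """Exhaustively test every point on the Cartesian grid of parameter values."""
    return [
        theta
        for theta in itertools.product(*grid_axes)
        if in_confidence_set(theta, data, critical_value)
    ]

# Made-up data lying exactly on y = 1 + 2x (illustration only).
data = [(0.0, 1.0), (1.0, 3.0), (2.0, 5.0)]

# One axis per parameter: candidate values for a, then for b.
axes = [[0.0, 0.5, 1.0, 1.5, 2.0], [1.0, 1.5, 2.0, 2.5, 3.0]]

accepted = grid_search_confidence_set(axes, data, critical_value=1.0)
for theta in accepted:
    print(theta)
```

The number of tests is the product of the axis lengths, so it grows quickly with both the grid resolution and the number of parameters. That growth is what makes an exhaustive search expensive on a single machine.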

Fortunately for Gandhi, UW-Madison’s Center for High Throughput Computing (CHTC) offers advanced computing resources for researchers at the university. The Open Science Grid is one of the key tools the CHTC uses to help with complex problems. In particular, Gandhi works with the CHTC to use the OSG for structural estimations of his economic models.

By searching the grid with high throughput computing on the OSG, Gandhi has found that he can break a problem up into independent jobs that run on separate nodes; the jobs need to communicate only once, after all of them have finished. The ability to call upon thousands of nodes has revolutionized his research, as computation is no longer limited by the constraints of parallel computing (which uses the Message Passing Interface). This approach allows him to gain a more immediate understanding of the likely parameter values, and greatly facilitates the econometric estimation of problems that otherwise would not be feasible.
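The reason a grid search suits this style of computing can be sketched in a few lines of Python. The sketch is illustrative only: it runs the "jobs" one after another in a single process rather than submitting them to the OSG, and the acceptance rule and the names `split_into_jobs`, `run_job`, and `merge_results` are hypothetical. It shows that each chunk of the grid can be evaluated without any knowledge of the others, and that results are combined in a single step at the end.

```python
import itertools

def split_into_jobs(points, n_jobs):
    """Deal the grid points into n_jobs independent chunks (round-robin)."""
    return [points[i::n_jobs] for i in range(n_jobs)]

def run_job(chunk, accept):
    """Work done by one job: test only its own points, with no
    communication with any other job."""
    return [theta for theta in chunk if accept(theta)]

def merge_results(per_job_results):
    """The single combining step, performed after every job has finished."""
    return sorted(theta for result in per_job_results for theta in result)

def accept(theta):
    """Hypothetical stand-in for the real confidence-set test."""
    a, b = theta
    return a * a + b * b <= 1.0

# A 5 x 5 grid of candidate parameter values (made up for illustration).
points = list(itertools.product([-1.0, -0.5, 0.0, 0.5, 1.0], repeat=2))

jobs = split_into_jobs(points, n_jobs=4)
merged = merge_results(run_job(chunk, accept) for chunk in jobs)

# The chunked search gives the same answer as searching the grid serially.
serial = sorted(theta for theta in points if accept(theta))
print(merged == serial)
```

Because no job waits on another, the chunks could in principle be sent to as many machines as are available, which is the property the article attributes to the high throughput approach.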

Gandhi believes an approach combining CHTC resources and the OSG is superior to a departmental cluster approach. “Those clusters are expensive and soon become antiquated and hard to maintain,” he notes. “The CHTC/OSG approach is flexible—they can adjust.” The CHTC has also helped Gandhi with his submit scripts, so he can concentrate on his work. “For me,” he says, “CHTC is critical. I don’t have time to think about the computing end of it.”

Gandhi thinks economists could gain a great deal from learning about the advantages of grid computing over parallel computing. He also observes that many disciplines have common fundamental problems that are similar in structure. Ultimately, he would like to see disciplines learn from each other’s computational solutions. This is starting to happen through shared approaches to using resources such as the OSG, but researchers are only now beginning to realize how broad the implications could be.

~ Greg Moore and Sarah Engel