Collectively, this work develops a novel non-parametric data structure (a “sparse distribution over permutations”) to capture and predict individual choice. The data structure allows us to seamlessly integrate choice signals from heterogeneous sources, including transactions, social data, and web logs. The algorithms to learn these models are scalable (map-reducible). Their efficacy is demonstrated through a variety of case studies.
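To make the data structure concrete, here is a minimal sketch, assuming the usual reading of a distribution over permutations as a choice model: each permutation is a preference ranking, and a customer type picks the offered item that appears earliest in its ranking. The function name `choice_probabilities`, the item labels, and the weights are hypothetical illustrations, not taken from the papers, and the learning algorithms themselves are not shown.

```python
from typing import Dict, Hashable, Iterable, Tuple

# A ranking lists items from most preferred to least preferred.
Ranking = Tuple[Hashable, ...]


def choice_probabilities(
    model: Dict[Ranking, float], offer_set: Iterable[Hashable]
) -> Dict[Hashable, float]:
    """Probability that each offered item is chosen under the model.

    `model` maps a small number of rankings to weights that sum to 1
    (this is what makes the distribution "sparse").
    """
    offered = set(offer_set)
    probs = {item: 0.0 for item in offered}
    for ranking, weight in model.items():
        # The chosen item is the first offered item in this ranking.
        for item in ranking:
            if item in offered:
                probs[item] += weight
                break
    return probs


# Hypothetical model: three rankings carry all of the probability mass.
model = {
    ("a", "b", "c"): 0.5,
    ("b", "c", "a"): 0.3,
    ("c", "a", "b"): 0.2,
}

# Offer only items b and c and ask how demand splits between them.
probs = choice_probabilities(model, ["b", "c"])
```

Because only a handful of rankings carry weight, predicting choice for any offer set costs one pass over those rankings, whatever the source of the signals used to fit them.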
“Network” structure provides a powerful lens for processing seemingly unstructured data. This body of work proposes new notions of network structure (‘centralities’). In real networks, these notions allow us to learn the nature of a dynamic on the network (e.g., who started a rumor? How fast and far is it likely to spread?). In synthetic networks, these notions give us a different approach to understanding and aggregating choices. In addition, we develop map-reducible algorithms to learn and compute these centralities.
Effective crowd-sourcing requires developing (a) useful interfaces to seek information, (b) statistical models to capture human uncertainty, and (c) efficient inference algorithms to ‘aggregate’ the collective wisdom of crowds. These papers develop such a framework in the context of the classical model of Dawid and Skene (1979) and are accompanied by scalable algorithms that yield optimal performance in terms of information processing.
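For readers unfamiliar with the Dawid-Skene model, the sketch below shows the textbook expectation-maximization (EM) estimator for it: each worker has a confusion matrix, each task has a hidden true class, and the two are estimated alternately. This is a baseline illustration only, not the algorithms developed in these papers. The function name `dawid_skene`, the `smoothing` parameter, and the toy data are assumptions made for the example.

```python
def dawid_skene(labels, num_classes, iterations=20, smoothing=0.01):
    """EM for the Dawid-Skene model.

    labels: list of (task, worker, label) triples, label in range(num_classes).
    Returns (posteriors per task, confusion matrix per worker).
    """
    tasks = sorted({t for t, _, _ in labels})
    workers = sorted({w for _, w, _ in labels})

    # Initialise each task's posterior with its vote fractions.
    post = {t: [0.0] * num_classes for t in tasks}
    for t, _, l in labels:
        post[t][l] += 1.0
    for t in tasks:
        total = sum(post[t])
        post[t] = [p / total for p in post[t]]

    for _ in range(iterations):
        # M-step: class priors and per-worker confusion matrices,
        # where conf[w][k][l] estimates P(worker w says l | true class k).
        prior = [sum(post[t][k] for t in tasks) / len(tasks)
                 for k in range(num_classes)]
        conf = {w: [[smoothing] * num_classes for _ in range(num_classes)]
                for w in workers}
        for t, w, l in labels:
            for k in range(num_classes):
                conf[w][k][l] += post[t][k]
        for w in workers:
            for k in range(num_classes):
                row_total = sum(conf[w][k])
                conf[w][k] = [c / row_total for c in conf[w][k]]

        # E-step: posterior over each task's true class.
        like = {t: list(prior) for t in tasks}
        for t, w, l in labels:
            for k in range(num_classes):
                like[t][k] *= conf[w][k][l]
        for t in tasks:
            total = sum(like[t])
            post[t] = [x / total for x in like[t]]
    return post, conf


# Toy data: workers A and B agree; worker C always gives the opposite answer.
answers = {"A": [0, 1, 0, 1], "B": [0, 1, 0, 1], "C": [1, 0, 1, 0]}
labels = [(t, w, l) for w, ls in answers.items() for t, l in enumerate(ls)]

post, conf = dawid_skene(labels, num_classes=2)
predictions = [max(range(2), key=lambda k: post[t][k]) for t in sorted(post)]
```

In this toy run the estimator sides with the two agreeing workers and learns that worker C's answers are systematically flipped, which a plain majority vote would not model. The product of likelihoods in the E-step can underflow with many workers per task, so a practical version would work in log space.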
Large Scale Computation
These papers discuss frameworks for large-scale computing in two contexts: (a) graph processing and (b) massively scaled optimization. The latter is accompanied by implementations built on top of open-source libraries (such as MPICH) that allow us to solve fairly general optimization problems at terabyte scale.