I’m a second year MS/PhD student at the University of Massachusetts Amherst. I am also a member of the Autonomous Learning Lab (ALL) and am fortunate to be advised by Prof. Philip Thomas. My primary interest is in continual learning, a branch of Artificial Intelligence, which aims at teaching machines new concepts over time. My research is mostly at the intersection of reinforcement learning, optimization, and machine learning. I enjoy reading and looking out for inspirations from neuroscience as well.

I completed my B.Tech in Computer Science at VIT University in 2017, where Prof. Nithya Darisini was my mentor. I was also fortunate to spend most of my senior year at IIT-Madras under the guidance of Prof. B. Ravindran. In my junior year, I had great learning experiences at the Indian Defence Research and Development Organization(IRDE, DRDO) under Dr. J.P. Singh and later at the University of Technology, Troyes, France under Prof. Babiga Birregah. During my sophomore days, I was introduced to machine learning research by Prof. James Davis and Rajan Vaish through Aspiring Researcher Challenge, a large-scale research initiative by professors from the Univ. of California and Stanford. Before that, I was a national level basketball player in India and used to play all day everyday.





Learning Action Representations for Reinforcement Learning
Yash Chandak, Georgios Theocharous, James Kostas, Scott Jordan, Philip Thomas

Abstract | Arxiv

Most model-free reinforcement learning methods leverage state representations (embeddings) for generalization, but either ignore structure in the space of actions or assume the structure is provided a priori. We show how a policy can be decomposed into a component that acts in a low-dimensional space of action representations and a component that transforms these representations into actual actions. These representations improve generalization over large, finite action sets by allowing the agent to infer the outcomes of actions similar to actions already taken. We provide an algorithm to both learn and use action representations and provide conditions for its convergence. The efficacy of the proposed method is demonstrated on large-scale real-world problems.



Reinforcement Learning with a Dynamic Action Set
Yash Chandak, Georgios Theocharous, James Kostas, Philip Thomas
Continual Learning workshop at the Thirty-second Conference on Neural Information Processing Systems (NIPS 2018)


Reinforcement learning has been successfully applied to many sequential decision making problems, where the set of possible actions (possible decisions) is fixed. However, in many real-world settings, the set of possible actions can change over time. We present a model-free method to continually adapt to a dynamic set of possible actions. We show how a policy can be decomposed into an internal policy that acts in a space of action representations and a reward-independent component that transforms these representations into actual actions. These representations not only make the internal policy parameterization invariant to the cardinality of the action set, but also improve generalization by allowing the agent to infer the outcomes of actions similar to actions already taken. We provide an algorithm to autonomously adapt to this dynamic action set by exploiting structure in the space of actions using supervised learning while learning the internal policy using policy gradient. The efficacy of the proposed method is demonstrated on large-scale real-world continual learning problems.


Fusion Graph Convolutional Networks
Priyesh Vijayan Yash Chandak, Mitesh Khapra, Balaraman Ravindran
14th International Workshop on Machine Learning with Graphs, 24th ACM SIGKDD Conference on Knowledge Discovery and Data Mining (KDD 2018).

Abstract | Arxiv | Code

Semi-supervised node classification in attributed graphs, i.e., graphs with node features, involves learning to classify unlabeled nodes given a partially labeled graph. Label predictions are made by jointly modeling the node and its' neighborhood features. State-of-the-art models for node classification on such attributed graphs use differentiable recursive functions that enable aggregation and filtering of neighborhood information from multiple hops. In this work, we analyze the representation capacity of these models to regulate information from multiple hops independently. From our analysis, we conclude that these models despite being powerful, have limited representation capacity to capture multi-hop neighborhood information effectively. Further, we also propose a mathematically motivated, yet simple extension to existing graph convolutional networks (GCNs) which has improved representation capacity. We extensively evaluate the proposed model, F-GCN on eight popular datasets from different domains. F-GCN outperforms the state-of-the-art models for semi-supervised learning on six datasets while being extremely competitive on the other two.


HOPF: Higher Order Propagation Framework for Deep Collective Classification
Priyesh Vijayan Yash Chandak, Mitesh Khapra, Balaraman Ravindran
Eighth International Workshop on Statistical Relational AI at the 27th International Joint Conference on Artificial Intelligence (IJCAI 2018).

Abstract | Arxiv | Code

Given a graph where every node has certain attributes associated with it and some nodes have labels associated with them, Collective Classification (CC) is the task of assigning labels to every unlabeled node using information from the node as well as its neighbors. It is often the case that a node is not only influenced by its immediate neighbors but also by higher order neighbors, multiple hops away. Recent state-of-the-art models for CC learn end-to-end differentiable variations of Weisfeiler-Lehman (WL) kernels to aggregate multi-hop neighborhood information. In this work, we propose a Higher Order Propagation Framework, HOPF, which provides an iterative inference mechanism for these powerful differentiable kernels. Such a combination of classical iterative inference mechanism with recent differentiable kernels allows the framework to learn graph convolutional filters that simultaneously exploit the attribute and label information available in the neighborhood. Further, these iterative differentiable kernels can scale to larger hops beyond the memory limitations of existing differentiable kernels. We also show that existing WL kernel-based models suffer from the problem of Node Information Morphing where the information of the node is morphed or overwhelmed by the information of its neighbors when considering multiple hops. To address this, we propose a specific instantiation of HOPF, called the NIP models, which preserves the node information at every propagation step. The iterative formulation of NIP models further helps in incorporating distant hop information concisely as summaries of the inferred labels. We do an extensive evaluation across 11 datasets from different domains. We show that existing CC models do not provide consistent performance across datasets, while the proposed NIP model with iterative inference is more robust.



On Optimizing Human-Machine Task Assignment
Andreas Veit, Michael Wilber, Rajan Vaish, Serge Belongie, James Davis, Others
The thrid AAAI Conference on Human Computation and Crowdsourcing (wip) (HCOMP 2015).

Abstract | Arxiv

When crowdsourcing systems are used in combination with machine inference systems in the real world, they benefit the most when the machine system is deeply integrated with the crowd workers. However, if researchers wish to integrate the crowd with "off-the-shelf" machine classifiers, this deep integration is not always possible. This work explores two strategies to increase accuracy and decrease cost under this setting. First, we show that reordering tasks presented to the human can create a significant accuracy improvement. Further, we show that greedily choosing parameters to maximize machine accuracy is sub-optimal, and joint optimization of the combined system improves performance.


Lab Talks


Improving Generalization by Learning Action Representations for Reinforcement Learning.
Autonomous Learning Lab, UMass, 2018.


Faster convergence for Q-Learning using Zap-Q.
Autonomous Learning Lab, UMass, 2017.


Life-long learning, overcoming catastrophic forgetting in Neural Netowrks.
RISE lab, IIT-Madras, 2017.