Research

Publications

nuSAM: Memory-Efficient Sharpness-Aware Minimization via Nuclear Norm Constraints
Thomas Pethick, Parameswaran Raman, Lenon Minorics, Mingyi Hong, Shoham Sabach, Volkan Cevher
Transactions on Machine Learning Research, 2025
HLAT: High-quality Large Language Model Pre-trained on AWS Trainium
Haozheng Fan, Hao Zhou, Guangtai Huang, Parameswaran Raman, Xinwei Fu, Gaurav Gupta, Dhananjay Ram, Yida Wang, Jun Huan
IEEE International Conference on Big Data, 2024
EMC^2: Efficient MCMC Negative Sampling for Contrastive Learning with Global Convergence
Chung-Yiu Yau, Hoi-To Wai, Parameswaran Raman, Soumajyoti Sarkar, Mingyi Hong
International Conference on Machine Learning (ICML), 2024
Variance-reduced Zero Order Optimization for LLM Fine-tuning
Tanmay Gautam, Youngsuk Park, Hao Zhou, Parameswaran Raman, Wooseok Ha
International Conference on Machine Learning (ICML), 2024
MADA: Meta-Adaptive Optimizers through hyper-gradient Descent
Kaan Ozkara, Can Karakus, Parameswaran Raman, Mingyi Hong, Shoham Sabach, Branislav Kveton, Volkan Cevher
International Conference on Machine Learning (ICML), 2024
Krylov Cubic Regularized Newton: A Subspace Second-Order Method with Dimension-Free Convergence Rate
Ruichen Jiang, Parameswaran Raman, Shoham Sabach, Aryan Mokhtari, Mingyi Hong, Volkan Cevher
International Conference on Artificial Intelligence and Statistics (AISTATS), 2024
Contractive Error Feedback for Gradient Compression
Bingcong Li, Shuai Zheng, Parameswaran Raman, Anshumali Shrivastava, Georgios B. Giannakis
Preprint, 2022
DS-FACTO: Doubly Separable Factorization Machines
Parameswaran Raman, S.V.N. Vishwanathan
Scaling Multinomial Logistic Regression via Hybrid Parallelism
Parameswaran Raman, Sriram Srinivasan, Shin Matsushima, Xinhua Zhang, Hyokun Yun, S.V.N. Vishwanathan
ACM SIGKDD Conference on Knowledge Discovery and Data Mining (KDD), 2019
Accepted as Oral Presentation (9.16% acceptance rate).
Extreme Stochastic Variational Inference: Distributed and Asynchronous
Parameswaran Raman, Jiong Zhang, Shihao Ji, Hsiang-Fu Yu, S.V.N. Vishwanathan, Inderjit S. Dhillon (* equally contributed)
International Conference on Artificial Intelligence and Statistics (AISTATS), 2019
Ranking via Robust Binary Classification
Hyokun Yun, Parameswaran Raman, S.V.N. Vishwanathan
Advances in Neural Information Processing Systems (NeurIPS), 2014
Optimization on the surface of the (Hyper)-Sphere
Parameswaran Raman, Jiasen Yang
Tech Report, 2014
Relevancy Prediction of Micro-blog Questions in an Educational Setting
Mariheida Córdova Sánchez, Parameswaran Raman, Luo Si, Jason Fish
Proceedings of the 7th International Conference on Educational Data Mining (EDM), 2014

Open Source Software

Hybrid-Parallel stochastic optimization algorithm for Multinomial Logistic Regression with a large number of data points and a large number of classes.
Hybrid-Parallel variational inference algorithm for Mixture of Exponential Family models with a large number of data points and mixture components.
Robust and scalable ranking algorithm for large data (both learning to rank and latent collaborative retrieval).

Code is released under the Apache License, Version 2.0.

PhD Thesis

Hybrid-Parallel Parameter Estimation for Bayesian and Frequentist Models

Distributed parameter estimation algorithms in machine learning come in two main flavors: data parallel, where the data is distributed across multiple workers, and model parallel, where the model parameters are partitioned across multiple workers. Neither approach is desirable on its own, since each replicates either the data or the model parameters. To scale to arbitrary sizes, it is imperative to distribute both. However, this is not possible in many machine learning problems because the optimization problem is tightly coupled (e.g., through the log-partition function in multinomial logistic regression). In this thesis, I develop alternative reformulations of various large-scale machine learning problems that enable Hybrid-Parallelism (simultaneous data and model parallelism). Moreover, since each worker needs access to only a subset of the data and a subset of the parameters while performing parameter updates, bulk synchronization can be avoided. I demonstrate how to apply these ideas to four types of popular models: (1) Multinomial Logistic Regression for a large number of classes and examples, (2) Mixture of Exponential Families for a large number of examples and clusters, (3) Latent Collaborative Retrieval for a large number of users and items, and (4) Factorization Machines for a large number of examples and features.
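
As a rough illustration of simultaneous data and model parallelism, the sketch below assigns each worker one data block and one parameter block per round and rotates the parameter blocks between rounds. It is a generic block-rotation schedule, not the algorithm from the thesis: the function name `hybrid_schedule` and the rotation rule are assumptions made for this example, and the rounds here are synchronous for simplicity, whereas the thesis is about avoiding bulk synchronization.

```python
def hybrid_schedule(num_workers):
    """Return one assignment per round (hypothetical helper, for illustration only).

    Each assignment is a list of (data_block, param_block) pairs, one per worker.
    Worker p always keeps data block p; in round r it works on parameter
    block (p + r) mod num_workers.
    """
    return [
        [(p, (p + r) % num_workers) for p in range(num_workers)]
        for r in range(num_workers)
    ]


rounds = hybrid_schedule(3)
for r, assignment in enumerate(rounds):
    # Within a round no two workers share a data block or a parameter block,
    # so their updates touch disjoint pieces of both the data and the model.
    print(f"round {r}: {assignment}")
```

After `num_workers` rounds every data block has been paired with every parameter block exactly once, and no worker ever holds a full copy of either the data or the parameters.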

Other publications during Masters and prior

Participatory Design Process for an In-Vehicle Affect Detection and Regulation System for Various Drivers
Myounghoon “Philart” Jeon, Parameswaran Raman, Jung-Bin Yim, J B, Bruce N. Walker
Proceedings of the 13th International ACM SIGACCESS Conference on Computers and Accessibility (ASSETS), 2011
ENGIN (Exploring Next Generation IN-vehicle INterfaces): Drawing a New Conceptual Framework through Iterative Participatory Processes
Myounghoon “Philart” Jeon, Jonathan Schuett, Jung-Bin Yim, Parameswaran Raman, Bruce N. Walker
Proceedings of the 3rd International Conference on Automotive User Interfaces and Interactive Vehicular Applications (AutomotiveUI), 2011
Advanced Auditory Menus for Universal Access to Electronic Devices
Myounghoon “Philart” Jeon, Benjamin Davison, Jeff Wilson, Parameswaran Raman, Bruce N. Walker
Proceedings of International Technology & Persons with Disabilities Conference (CSUN), 2010
Reducing repetitive development tasks in auditory menu displays with the auditory menu library
Parameswaran Raman, Benjamin Davison, Myounghoon “Philart” Jeon, Bruce N. Walker
Proceedings of the 16th International Conference on Auditory Display (ICAD), 2010
Target Score Prediction in the game of Cricket
Sethuraman K, Parameswaran Raman, Vijay Ramakrishnan
Tech Report, 2010
PiX-C, Pictures: Express & Communicate (Augmenting Communication with Visual Input for Children in the Autism Spectrum)
Narayanan Ramakrishnan, Parameswaran Raman, Manohar Ganesan, Gourab Kar, Gregory D. Abowd
Poster at 23rd ACM Symposium on User Interface Software and Technology (UIST), 2010
PINE-guided cache replacement policy for location-dependent data in mobile environment
Mary Magdalene Jane, Parameswaran Raman, Nadarajan R, Maytham Safar
Proceedings of the First International Conference on Pervasive Technologies Related to Assistive Environments (PETRA), 2008
Weighted Angular Distance Based Cache Replacement Strategy for Location-Dependent Data in Wireless Environment
Parameswaran Raman, Raghavendra Prasad, Nadarajan R, Mary Magdalene Jane
Proceedings of the DCCA Conference, Jordan, 2007

Parameswaran Raman
