Research

Publications

  1. nuSAM: Memory-Efficient Sharpness-Aware Minimization via Nuclear Norm Constraints
    Thomas Pethick, Parameswaran Raman, Lenon Minorics, Mingyi Hong, Shoham Sabach, Volkan Cevher
    Transactions on Machine Learning Research, 2025

  2. HLAT: High-quality Large Language Model Pre-trained on AWS Trainium
    Haozheng Fan, Hao Zhou, Guangtai Huang, Parameswaran Raman, Xinwei Fu, Gaurav Gupta, Dhananjay Ram, Yida Wang, Jun Huan
    IEEE International Conference on Big Data, 2024

  3. EMC^2: Efficient MCMC Negative Sampling for Contrastive Learning with Global Convergence
    Chung-Yiu Yau, Hoi-To Wai, Parameswaran Raman, Soumajyoti Sarkar, Mingyi Hong
    International Conference on Machine Learning (ICML), 2024

  4. Variance-reduced Zero Order Optimization for LLM Fine-tuning
    Tanmay Gautam, Youngsuk Park, Hao Zhou, Parameswaran Raman, Wooseok Ha
    International Conference on Machine Learning (ICML), 2024

  5. MADA: Meta-Adaptive Optimizers through hyper-gradient Descent
    Kaan Ozkara, Can Karakus, Parameswaran Raman, Mingyi Hong, Shoham Sabach, Branislav Kveton, Volkan Cevher
    International Conference on Machine Learning (ICML), 2024

  6. Krylov Cubic Regularized Newton: A Subspace Second-Order Method with Dimension-Free Convergence Rate
    Ruichen Jiang, Parameswaran Raman, Shoham Sabach, Aryan Mokhtari, Mingyi Hong, Volkan Cevher
    International Conference on Artificial Intelligence and Statistics (AISTATS), 2024

  7. Contractive Error Feedback for Gradient Compression
    Bingcong Li, Shuai Zheng, Parameswaran Raman, Anshumali Shrivastava, Georgios B. Giannakis
    Preprint, 2022

  8. DS-FACTO: Doubly Separable Factorization Machines
    Parameswaran Raman, S.V.N. Vishwanathan

  9. Scaling Multinomial Logistic Regression via Hybrid Parallelism
    Parameswaran Raman, Sriram Srinivasan, Shin Matsushima, Xinhua Zhang, Hyokun Yun, S.V.N. Vishwanathan
    ACM SIGKDD Conference on Knowledge Discovery and Data Mining (KDD), 2019
    Accepted as Oral Presentation (9.16% acceptance rate).

  10. Extreme Stochastic Variational Inference: Distributed and Asynchronous
    Parameswaran Raman, Jiong Zhang, Shihao Ji, Hsiang-Fu Yu, S.V.N. Vishwanathan, Inderjit S. Dhillon (* equally contributed)
    International Conference on Artificial Intelligence and Statistics (AISTATS), 2019

  11. Ranking via Robust Binary Classification
    Hyokun Yun, Parameswaran Raman, S.V.N. Vishwanathan
    Advances in Neural Information Processing Systems (NeurIPS), 2014

  12. Optimization on the surface of the (Hyper)-Sphere
    Parameswaran Raman, Jiasen Yang
    Tech Report, 2014

  13. Relevancy Prediction of Micro-blog Questions in an Educational Setting
    Mariheida Córdova Sánchez, Parameswaran Raman, Luo Si, Jason Fish
    Proceedings of the 7th International Conference on Educational Data Mining (EDM), 2014

Open Source Software

  • Hybrid-Parallel stochastic optimization algorithm for Multinomial Logistic Regression with a large number of data points and a large number of classes.
  • Hybrid-Parallel variational inference algorithm for Mixture of Exponential Family models with a large number of data points and mixture components.
  • Robust and scalable ranking algorithm for large data (both learning to rank and latent collaborative retrieval).

Code is released under the Apache License, Version 2.0.

PhD Thesis

Hybrid-Parallel Parameter Estimation for Bayesian and Frequentist Models

Distributed parameter estimation algorithms in machine learning come in two main flavors: data parallel, where the data is distributed across multiple workers, and model parallel, where the model parameters are partitioned across multiple workers. Neither approach is desirable on its own, since data parallelism replicates the model parameters and model parallelism replicates the data. To scale to arbitrary sizes, it is imperative to distribute both; however, this is not possible in many machine learning problems because the optimization problem is tightly coupled (e.g., through the log-partition function in multinomial logistic regression). In this thesis, I develop alternative reformulations for various large-scale machine learning problems in order to enable Hybrid-Parallelism (simultaneous data and model parallelism). Moreover, since each worker only needs access to a subset of the data and a subset of the parameters while performing parameter updates, bulk synchronization can be avoided. I demonstrate how to apply these ideas to four types of popular models: (1) Multinomial Logistic Regression for a large number of classes and examples, (2) Mixture of Exponential Families for a large number of examples and clusters, (3) Latent Collaborative Retrieval for a large number of users and items, and (4) Factorization Machines for a large number of examples and features.
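
To make the idea of simultaneous data and model parallelism concrete, here is a minimal sketch of one possible block schedule. It is an illustration under stated assumptions, not the algorithm from the thesis: the names `hybrid_schedule` and `check_schedule` are hypothetical, and the rotation rule (worker p pairs data block p with parameter block (p + r) mod P in round r) is one simple choice among many. It assumes the objective has already been reformulated so that an update needs only one data block and one parameter block at a time.

```python
from itertools import product


def hybrid_schedule(num_workers):
    """Return one list per round of (data_block, param_block) pairs, indexed by worker.

    Assumed rotation rule: in round r, worker p keeps data block p and
    works on parameter block (p + r) mod num_workers.
    """
    return [
        [(p, (p + r) % num_workers) for p in range(num_workers)]
        for r in range(num_workers)
    ]


def check_schedule(schedule, num_workers):
    """Check that rounds are conflict-free and that every pair is visited once."""
    for assignments in schedule:
        param_blocks = [param for _, param in assignments]
        # No two workers may hold the same parameter block in the same round.
        if len(set(param_blocks)) != num_workers:
            return False
    visited = [pair for assignments in schedule for pair in assignments]
    all_pairs = set(product(range(num_workers), repeat=2))
    return len(visited) == len(all_pairs) and set(visited) == all_pairs


schedule = hybrid_schedule(3)
for r, assignments in enumerate(schedule):
    print(f"round {r}: {assignments}")
```

In this sketch each worker stores a single data block and holds a single parameter block at any moment, so neither the data nor the parameters are replicated. Within a round no two workers hold the same parameter block, so their updates cannot conflict, and after P rounds every data block has met every parameter block.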

Other publications during Masters and prior

  1. Participatory Design Process for an In-Vehicle Affect Detection and Regulation System for Various Drivers
    Myounghoon “Philart” Jeon, Parameswaran Raman, Jung-Bin Yim, J B, Bruce N. Walker
    Proceedings of the 13th International ACM SIGACCESS Conference on Computers and Accessibility (ASSETS), 2011

  2. ENGIN (Exploring Next Generation IN-vehicle INterfaces): Drawing a New Conceptual Framework through Iterative Participatory Processes
    Myounghoon “Philart” Jeon, Jonathan Schuett, Jung-Bin Yim, Parameswaran Raman, Bruce N. Walker
    Proceedings of the 3rd International Conference on Automotive User Interfaces and Interactive Vehicular Applications (AutomotiveUI), 2011

  3. Advanced Auditory Menus for Universal Access to Electronic Devices
    Myounghoon “Philart” Jeon, Benjamin Davison, Jeff Wilson, Parameswaran Raman, Bruce N. Walker
    Proceedings of International Technology & Persons with Disabilities Conference (CSUN), 2010

  4. Reducing repetitive development tasks in auditory menu displays with the auditory menu library
    Parameswaran Raman, Benjamin Davison, Myounghoon “Philart” Jeon, Bruce N. Walker
    Proceedings of the 16th International Conference on Auditory Display (ICAD), 2010

  5. Target Score Prediction in the game of Cricket
    Sethuraman K, Parameswaran Raman, Vijay Ramakrishnan
    Tech Report, 2010

  6. PiX-C, Pictures: Express & Communicate (Augmenting Communication with Visual Input for Children in the Autism Spectrum)
    Narayanan Ramakrishnan, Parameswaran Raman, Manohar Ganesan, Gourab Kar, Gregory D. Abowd
    Poster at 23rd ACM Symposium on User Interface Software and Technology (UIST), 2010

  7. PINE-guided cache replacement policy for location-dependent data in mobile environment
    Mary Magdalene Jane, Parameswaran Raman, Nadarajan R, Maytham Safar
    Proceedings of the First International Conference on Pervasive Technologies Related to Assistive Environments (PETRA), 2008

  8. Weighted Angular Distance Based Cache Replacement Strategy for Location-Dependent Data in Wireless Environment
    Parameswaran Raman, Raghavendra Prasad, Nadarajan R, Mary Magdalene Jane
    Proceedings of the DCCA Conference, Jordan, 2007