Rameen Mahmood

PhD candidate, New York University

I'm a PhD candidate in the Electrical and Computer Engineering Department at NYU Tandon, advised by Danny Huang in mLab. I earned my BSc in Electrical Engineering (Honors) from NYU Abu Dhabi.

My research lies at the intersection of machine learning and computer networks, with a focus on clinical and health applications. I also work with large language models for identifying devices on the network.

Research

Large language models for identifying devices on the network

Reframes IoT device and vendor identification as a language modeling problem over noisy, heterogeneous network metadata, combining large-scale high-fidelity labeling, curriculum-based instruction tuning of quantized LLMs, and adversarial stress testing for robustness to sparsity, protocol drift, and mimicry.

Network traffic analysis + machine learning for digital health

Building transformer-based models with per-user adapters that use passively observed network traffic from home routers and personal devices as a non-invasive signal for behavior, cognition, and health.

IoT measurement and security

Large-scale measurement of consumer IoT devices, adversarial robustness of device-fingerprinting systems, and the privacy implications of the smart home.

Publications

2026

From packets to patterns: interpreting encrypted network traffic as longitudinal behavioral signals

Rameen Mahmood, Omar El Shahawy, Souptik Barua, Zachary Beattie, Jeffrey Kaye, Xuhai “Orson” Xu, Chao-Yi Wu, Danny Yuxing Huang

preprint, 2026

arXiv

Human behavior is difficult to observe continuously at scale, yet it leaves measurable traces in everyday device use. We test whether encrypted smartphone network traffic — a ubiquitous, always-on, passive sensing modality — can passively capture behavioral patterns related to sleep, stress, and loneliness. We model shared behavioral structure using a transformer backbone with per-user adapters, allowing the model to represent both typical individual behavior and deviations from it. To make these representations interpretable, we apply a sparse autoencoder to extract behavioral features corresponding to distinct patterns of activity. We relate these features to sleep disturbance, stress, and loneliness using generalized estimating equations with Mundlak decomposition, separating between-person differences from within-person changes over time. We find that the three outcomes reflect distinct temporal structures: stress is primarily associated with stable between-person differences, loneliness with within-person variation, and sleep disturbance with a combination of both. Notably, these within-person dynamics are not captured by predefined network-traffic features, demonstrating the value of learned representations for longitudinal behavioral sensing. These results establish encrypted network traffic as a viable passive sensing modality, revealing interpretable behavioral dynamics — particularly deviations from an individual's baseline — that are not visible in raw traffic features.

What's on My Network? Using Large Language Models to Identify Real-World IoT Devices at Scale

Rameen Mahmood, Tousif Ahmed, Sai Teja Peddinti, Danny Yuxing Huang

ACM CoNEXT 2026

PDF

The rapid expansion of IoT devices has outpaced current identification methods, creating significant risks for security, privacy, and network accountability. These challenges are heightened in open-world environments, where traffic metadata is often incomplete, noisy, or intentionally obfuscated. We introduce a semantic inference pipeline that reframes device identification as a language modeling task over heterogeneous network metadata. To construct reliable supervision, we generate high-fidelity vendor labels for the IoT Inspector dataset, the largest real-world IoT traffic corpus, using an ensemble of large language models guided by mutual-information and entropy-based stability scores. We then instruction-tune a quantized LLaMA 3.1 8B model with curriculum learning to support generalization under sparsity and long-tail vendor distributions. Our model achieves 98.25% top-1 accuracy and 90.73% macro accuracy across 2,015 vendors while maintaining resilience to missing fields, protocol drift, and adversarial manipulation. Evaluation on an independent IoT testbed, coupled with explanation quality and adversarial stress tests, demonstrates that instruction-tuned LLMs provide a scalable and interpretable foundation for real-world device identification.

2025

Digital phenotyping via passive network traffic monitoring: feasibility and acceptability in university students

Rameen Mahmood, Donghan Hu, Annabelle David, Zachary Beattie, Jeffrey Kaye, Nabil Alshurafa, Lou Haux, Josiah Hester, Andrew Kiselica, Shinan Liu, Chenxi Qiu, Chao-Yi Wu, Danny Yuxing Huang

JMIR Formative Research, 2025

link

A formative study assessing the feasibility and acceptability of passively capturing encrypted smartphone network traffic from university students to monitor day-to-day digital behavior relevant to mental health. Over a two-week prospective deployment at NYU, 38 students enrolled, 29 provided valid network data, 27 remained active for more than five days, and 25 completed exit interviews. Acceptability was evaluated using the System Usability Scale, NASA Task Load Index, and semi-structured interviews. Beyond feasibility, exploratory analyses link traffic-derived features — timing, intensity, and regularity of use — to aspects of digital behavior relevant to health and daily functioning.

Network traffic as a scalable ethnographic lens for understanding university students' AI usage screen time

Donghan Hu, Rameen Mahmood, Annabelle David, Danny Yuxing Huang

preprint, 2025

arXiv

AI-driven applications have become woven into students' academic and creative workflows, influencing how they learn, write, and produce ideas. Conventional survey and interview methods are limited by recall bias and underreporting of habitual behaviors, while ethnographic methods face challenges of scale and reproducibility. We introduce a privacy-conscious approach that repurposes VPN-based network traffic analysis as a scalable ethnographic technique for examining students' real-world engagement with AI tools. By capturing anonymized metadata rather than content, the method enables fine-grained behavioral tracing while safeguarding personal information. A three-week field deployment with university students reveals fragmented, short-duration interactions across multiple tools and devices, with intense bursts of activity coinciding with exam periods — patterns mirroring institutional rhythms of academic life.

RouterSense: a passive, network-based health monitoring system for in-home patients

Rameen Mahmood, Danny Yuxing Huang

AAIC 2025

project page

RouterSense turns the existing home Wi-Fi router into a passive, low-cost, long-term sensor for behavioral and cognitive health in older adults. By analyzing only metadata from device-router communications — without accessing packet contents — the system captures longitudinal patterns of in-home activity and detects slow drift from a personalized behavioral baseline, with the goal of surfacing early markers of cognitive change.

2024

RouterSense: a passive, network-based health monitoring system for in-home patients

Rameen Mahmood, Danny Yuxing Huang

AAAI Fall Symposium 2024

PDF

Your router as Fitbit: health monitoring with network traffic

Rameen Mahmood, Danny Yuxing Huang

IEEE-EMBS BSN 2024 · poster

PDF

Assisted reproductive technology dataset of embryo time-lapse images and clinical data

Dmytro Zhylko, Raquel Del Gallego, Sarah Pardo, Rameen Mahmood, Ya Tung Hsieh, Salma Selim, Daniela Nogueira, Ibrahim El-Khatib, Barbara Lawrenz, Human M. Fatemi, Farah E. Shamout

medRxiv, 2024

medRxiv

Version 1.0 of the Assisted Reproductive Technology (ART) Dataset is a multi-modal fertility dataset from treatments performed at the ART Fertility Clinic in Abu Dhabi between 2015 and 2022. The data combine electronic health records and embryo development image sequences captured with the Vitrolife EmbryoScope time-lapse system, providing detailed treatment, morphology, and pregnancy outcome information. The final processed dataset consists of 14,776 embryos from 1,810 patients across 2,500 treatments. The dataset supports the development of machine learning models for automated analysis of embryo development and viability to assist clinical decision-making.

News

May 2026 · New preprint on encrypted network traffic as longitudinal behavioral signals.
Apr 2026 · LLMs for IoT device identification accepted to ACM CoNEXT 2026.
Mar 2026 · Digital phenotyping paper accepted to JMIR Formative Research.
Nov 2025 · Guest lecture for ECE-GY 9113 (Big Data) on scalable data pipelines and LLM/ML integration for large-scale analytics.
Nov 2025 · Presented RouterSense at the Society for Neuroscience (SfN) 2025.
Nov 2025 · Talk at the University of Chicago: “Decoding the digital home” (Network Operations and Internet Security Lab).
Oct 2025 · Presented our LLM-based IoT identification work at NYC Privacy Day, Cornell Tech.
Oct 2025 · New preprint on network traffic as an ethnographic lens for AI tool practices.
Sept 2025 · New preprint on digital phenotyping via passive network traffic monitoring.
Sept 2025 · New preprint on LLMs for IoT device identification; provisional patent filed.
Apr 2025 · Guest lecture in ECE-GY 9383: Network Security (Spring 2025).
Mar 2025 · RouterSense accepted to AAIC 2025.
Nov 2024 · Presented RouterSense at the AAAI 2024 Fall Symposium on AI for Aging in Place.
Oct 2024 · Presented “Your router as Fitbit” at IEEE-EMBS BSN 2024.
Sept 2024 · Guest lecture in CUSP-GX 8083: Big Data Management & Analysis.
Mar 2024 · Presented “Decoding digital footprints of the visually impaired” at NSBE 2024.