aboutwritingtalksworkeducation
11/19
Our paper on semantic foundations for stream progressing will appear at POPL 2025!

Hi, I'm Shadaj

I explore the boundaries of programming

Hi, I'm Shadaj

I explore the boundaries of programming

I'm a fourth-year PhD student co-advised by Alvin Cheung and Joe Hellerstein in the Sky Computing Lab at UC Berkeley. I design programming paradigms that make distributed systems modular, correct, accessible, and performant. My research spans foundational language design, program optimization, and formal methods. I'm grateful to be supported by an NSF Graduate Research Fellowship.

Previously, I received my BS in EE/CS at Berkeley (go bears!). I've interned at Amazon ARG, Google Brain, Facebook, Apollo GraphQL, Khan Academy, and Coursera, leading projects ranging from Rust backends for Dafny to open-source GraphQL IDEs.

I co-organize the SF Systems Club, which brings together community members with meetups and coffee chats. When not researching, you'll find me playing the sitar, editing travel photos, or watching the latest season of Survivor.

Latest News

Nov 2024
Our paper on semantic foundations for stream progressing will appear at POPL 2025!
Jun 2024
Presented upcoming work at PLDI: choreographic programming for streaming systems and foundations for contextual program rewriting
Nov 2023
Our paper on automatic optimization of distributed protocols was accepted at SIGMOD 2024!
Apr 2023
Honored to have received the Tong Leong Lim Pre-Doctoral Prize at UC Berkeley!
Nov 2022
Our vision paper that lays out a research direction for CRDT data stores based on the CALM theorem has been accepted to VLDB 2023!
Oct 2022
Katara, our system for synthesizing CRDTs, is now open-source and will appear at OOPSLA 2022!
Aug 2021
Started my PhD at UC Berkeley in the RISELab, working on the Hydro project under Alvin Cheung and Joe Hellerstein
Apr 2021
Received the EECS Major Citation award!
Mar 2021
Awarded the NSF Graduate Research Fellowship!

Students Advised and Mentored

I've had the great pleasure of mentoring several talented undergraduate students at Berkeley. In my group, students lead research projects that stand on their own. Their projects have been written up as academic papers and upstreamed as production-ready contributions. I primarily recruit students from the Diversifying Access to Research in Engineering program.

  • Tyler Hou: theoretical foundations for e-graphs and applications in query optimization (third prize, POPL 2024 Student Research Competition)
  • Ryan Alameddine (c/o 2025 → Databricks): compiling async/await functions to streaming dataflow
  • Nick Jiang: support for Kubernetes deployments in Hydro Deploy
  • Amrita Rajan (c/o 2023 → Doordash): language features for static data in the Hydro Datalog compiler
  • David Wei (c/o 2022 → founding engineer → Waymo): formally verifying algebraic properties for CRDTs

Publications

POPL 2025

Flo: a Semantic Foundation for Progressive Stream Processing

Shadaj Laddad, Alvin Cheung, Joseph M. Hellerstein, Mae Milano

Existing streaming languages have a variety of semantic models and guarantees that are often incompatible. In this paper, we identify two general yet precise semantic properties: streaming progress and eager execution. We formally define these properties in the context of Flo, a parameterized streaming language that abstracts over dataflow operators and the underlying structure of streams. To demonstrate the generality of our properties, we show how key ideas from representative streaming and incremental computation systems—Flink, LVars, and DBSP—have semantics that can be modeled in Flo and guarantees that map to our properties.

arXiv / PDF

CP 2024

Suki: Choreographed Distributed Dataflow in Rust

Shadaj Laddad, Alvin Cheung, Joseph M. Hellerstein

Programming models for distributed dataflow have long focused on analytical workloads that allow the runtime to dynamically place and schedule compute logic. Meanwhile, models that enable fine-grained control over placement, such as actors, make global optimization difficult. In this extended abstract, we present Suki, an embedded Rust DSL that lets developers implement streaming dataflow with explicit placement of computation. Key to this choreographic programming approach is our use of staged programming, which lets us expose a high-level Rust API while compiling local compute units into individual binaries with zero-overhead.

arXiv / PDF

EGRAPHS 2024

Towards Relational Contextual Equality Saturation

Tyler Hou, Shadaj Laddad, Joseph M. Hellerstein

PDF

SIGMOD 2024

Optimizing Distributed Protocols with Query Rewrites

David Chu, Rithvik Panchapakesan, Shadaj Laddad, Lucky E. Katanahas, Chris Liu, Kaushik Shivakumar, Natacha Crooks, Joseph M. Hellerstein, Heidi Howard

Distributed protocols such as 2PC and Paxos lie at the core of many systems in the cloud, but standard implementations do not scale. This paper presents an approach for scaling any distributed protocol by applying rule-driven rewrites, borrowing from query optimization. Manual rule-driven applications of decoupling and partitioning improve the throughput of 2PC by 5x and Paxos by 3x, and match state-of-the-art throughput in recent work.

Technical Report

VLDB 2023

Keep CALM and CRDT On

Shadaj Laddad*, Conor Power*, Mae Milano, Alvin Cheung, Natacha Crooks, Joseph M. Hellerstein

CRDT guarantees extend only to data updates; observations of CRDT state are unconstrained and unsafe. We propose an agenda that embraces the simplicity of CRDTs, but provides richer, more uniform guarantees. We extend CRDTs with a query model that reasons about which queries are safe without coordination by applying monotonicity results from the CALM Theorem.

VLDB / PDF

VLDB 2023 (Invited Award Paper)

Towards Auto-Generated Data Systems

Alvin Cheung, Maaz Bin Safeer Ahmad, Brandon Haynes, Chanwut Kittivorawong, Shadaj Laddad, Xiaoxuan Liu, Chenglong Wang, Cong Yan

VLDB / PDF

EGRAPHS 2023

Optimizing Stateful Dataflow with Local Rewrites

Shadaj Laddad, Conor Power, Tyler Hou, Alvin Cheung, Joseph M. Hellerstein

Optimizing a stateful dataflow language is a challenging task. There are strict correctness constraints for preserving properties expected by downstream consumers, a large space of possible optimizations, and complex analyses that must reason about the behavior of the program over time. But with e-graphs, we can dramatically simplify the process of building a correct optimizer while yielding more consistent results! In this short paper, we discuss our early work using e-graphs to develop an optimizer for the Hydroflow dataflow language.

arXiv / PDF

ApPLIED 2023 (Invited Paper)

Initial Steps Toward a Compiler for Distributed Programs

Joseph M. Hellerstein, Shadaj Laddad, Mae Milano, Conor Power, Mingwei Samuel

ACM / PDF

OSCAR 2023

Code Transpilation for Hardware Accelerators

Yuto Nishida, Sahil Bhatia, Shadaj Laddad, Hasan Genc, Yakun Sophia Shao, Alvin Cheung

arXiv / PDF

OOPSLA 2022

Katara: Synthesizing CRDTs with Verified Lifting

Shadaj Laddad, Conor Power, Mae Milano, Alvin Cheung, Joseph M. Hellerstein

In this paper, we present Katara, a program synthesis-based system that takes sequential data type implementations and automatically synthesizes verified CRDT designs from them. Katara is able to automatically synthesize CRDTs for a wide variety of scenarios, from reproducing classic CRDTs to synthesizing novel designs based on specifications in existing literature. Crucially, our synthesized CRDTs are fully, automatically verified, eliminating entire classes of common errors and reducing the process of producing a new CRDT from a painstaking paper proof of correctness to a lightweight specification.

ACM / PDF / GitHub

ASE 2021

VizSmith: Automated Visualization Synthesis by Mining Data-Science Notebooks

Rohan Bavishi, Shadaj Laddad, Hiroaki Yoshida, Mukul R. Prasad, Koushik Sen

We built a tool called VizSmith which enables code reuse for visualizations by mining visualization code from Kaggle notebooks and creating a database of 7176 reusable Python functions. Given a dataset, columns to visualize and a text query from the user, VizSmith searches this database for appropriate functions, runs them and displays the generated visualizations to the user.

IEEE / PDF

GPCE 2020

Fluid Quotes: Metaprogramming across Abstraction Boundaries with Dependent Types

Shadaj Laddad and Koushik Sen

We introduce Fluid Quotes, a new technique that uses dependent types to let users pass quotes through abstraction boundaries in runtime code while splicing them ahead-of-time. This technique enables new metaprogramming capabilities by eliminating the traditional requirement of co-locating parameter expressions with call-sites. Fluid Quotes capture not only source code but also associated runtime context to ensure correctness.

ACM / PDF

Fellowships and Awards