Author ORCID Identifier
0000-0003-1002-5677
Date of Award
5-31-2026
Document Type
Open Access Dissertation
Degree Name
Doctor of Philosophy (PhD)
Department
Computer Science
First Advisor
Tiago Cogumbreiro
Abstract
GPUs are essential to modern computing but notoriously difficult to program correctly. Static analysis tools can verify data-race freedom, but their over-approximations produce spurious reports of data races that do not occur, limiting their practical usefulness. This dissertation establishes when Memory Access Protocols (MAPs), a compositional abstraction modeling memory access behavior between synchronization barriers, can be simultaneously sound and complete. We first prove that MAP-based analysis is sound and complete for well-typed Jaminan programs (those without data-dependent array indexing). This result establishes the theoretical boundary: completeness is achievable when data-dependent control flow and indexing are absent. Jaminan extends MAPs with symbolic variable binders and execution traces to handle real GPU programs that may not satisfy the typing discipline. These mechanisms enable precise reasoning about which alarms correspond to real races versus approximation artifacts. We introduce a formal theory of partial completeness, proving that for large classes of real-world programs, every reported alarm is guaranteed to be a true positive. Programs are classified as Control-Independent (CI) when symbolic variables do not affect control flow, and Data-Independent (DI) when they do not affect array indices. The True Positives Theorem establishes that for CIDI programs, every detected data race is guaranteed to be real, pinpointing the root causes of over-approximations: control-dependent alarms may arise from unreachable code, while data-dependent alarms may result from index imprecision. We implement these contributions in FaialAA, a static analysis tool built on LLVM and Z3. Empirical evaluation demonstrates the practical impact of partial completeness: analyzing over 400 real-world kernels shows that 59.5% are CIDI, meaning our analysis achieves perfect precision for the majority of GPU programs.
On the CAV'14 benchmark dataset, FaialAA confirms more data-race-free kernels while producing 1.9× fewer alarms than state-of-the-art tools. FaialAA discovers 10 previously undocumented data races, including 6 missed by all competing tools. In real-world bug scenarios from OpenMM and Nvidia Megatron-LM, FaialAA correctly classifies both buggy and fixed versions in 5 out of 6 cases, while competing tools handle at most 2. The theoretical results are mechanically verified in Coq with 15,900 lines of code and over 750 theorems. This work demonstrates that perfect precision is achievable for large classes of real programs, transforming static analysis from a tool that produces false alarms into one that delivers trustworthy, actionable results.
Recommended Citation
Liew, Zhen Rong, "Static Data-Race Detection for GPU Programs: Behavioral Types with Partial Completeness Guarantees" (2026). Graduate Doctoral Dissertations. 1148.
https://scholarworks.umb.edu/doctoral_dissertations/1148
Comments
Free and open access to this work is made available to the UMass Boston community by ScholarWorks at UMass Boston. Those not on campus and those without a UMass Boston campus username and password may gain access to this work through Interlibrary Loan. If you have a UMass Boston campus username and password and would like to download this work from off-campus, click on the “Off-Campus Users” button.