Date of Award


Document Type

Campus Access Thesis

Degree Name

Master of Science (MS)



First Advisor

Todd Riley

Second Advisor

Jill A. Macoska

Third Advisor

Kourosh Zarringhalam


High-throughput sequencing is one of the most promising tools available to researchers united by the common goals of personalized medicine. The first step in developing targeted interventions for genetic diseases is identifying the genomic features common to a particular phenotype. Identifying a unique transcriptional signature through comparative analysis of RNA-Seq data provides a glimpse of the regulatory machinery responsible for the presentation of a diseased phenotype. However, developing a standardized downstream analysis procedure for identifying clinically informative features from the massive sequence libraries produced during a single RNA-Seq experiment remains an open area of research, and such analysis requires computational resources that necessitate the use of a high-performance computing cluster. These circumstances create a skills-gap bottleneck that requires molecular biologists to acquire extensive knowledge of computer programming and software engineering. This bottleneck is further exacerbated by the difficulty of identifying high-confidence, population-level bio-markers from small-sample experiments with low statistical power. The Tailor pipeline was developed to address these issues and to facilitate bio-marker discovery from comparative RNA-Seq analyses between two or more conditions. Tailor is operated via two-word commands that simplify the use of high-performance computing clusters. It produces a visualization of the salient features of an RNA-Seq data set, along with sorted, human-readable files listing potential bio-marker calls identified from hypothesis tests of the pooled expression levels between two or more conditions.
In a recent comparative analysis, Tailor analyzed RNA extracted from urine samples provided by patients with fibrosis-associated lower urinary tract syndrome (LUTS) and by a non-symptomatic control group, and identified 370 genes and 30 biochemical pathways that were significantly differentially expressed between the groups. Repeated analysis with other commonly used software packages refined this list to 44 bio-markers that may serve as noninvasive diagnostics for fibrosis-associated LUTS and as potential targets for drug development. Tailor's sensitivity to differential gene expression profiles allows biologists to identify the causal genetic mechanisms that contribute to diseased phenotypes from noninvasive tests.


Free and open access to this Campus Access Thesis is made available to the UMass Boston community by ScholarWorks at UMass Boston. Those not on campus and those without a UMass Boston campus username and password may gain access to this thesis through resources like ProQuest Dissertations & Theses Global or through Interlibrary Loan. If you have a UMass Boston campus username and password and would like to download this work from off-campus, click on the "Off-Campus UMass Boston Users" link above.

Additional Files

judell-supplemental-data.pdf (2279 kB)