# Impacted network

For each TSG in each cancer, we mapped the t-value of each gene onto a comprehensive human reference interaction network (denoted by $G$, with $g$ nodes). This network was built from the HPRD and STRING protein interaction databases. With nodes weighted by their t-values, we applied Random Walk with Restart (RWR) as follows:

$$p_{m+1} = rWp_m + (1 - r)p_0,$$

where $r = 0.5$; $p_0$, $p_m$, and $p_{m+1}$ are vectors of length $g$ at time points 0, $m$, and $m+1$; and $W$ is the column-normalized adjacency matrix of the network $G$. At time 0, each element of $p_0$ represents the t-value of the corresponding gene. When the iteration converged (e.g., the difference between $p_m$ and $p_{m+1}$ was $< 1 \times 10^{-3}$), the elements of the final $p_{m+1}$ represent the probabilities that the walker arrives at the corresponding nodes, referred to as $p^S = \{p^S_i\}$, $i = 1, \ldots, g$.

As a control, we first applied RWR with all nodes equally weighted, i.e., $p_{0,i} = 1/g$, $i = 1, \ldots, g$. We refer to the results from this setting as the baseline probabilities, $p^{BASE} = \{p^B_i\}$, $i = 1, \ldots, g$. For each TSG × cancer event, the probability of visiting each node in the network at the steady state was standardized against this baseline:

$$z^B_i = \frac{p^S_i - \mathrm{mean}(p^{BASE})}{\mathrm{sd}(p^{BASE})}.$$

To further evaluate significance, we randomly permuted the labels of the inactivated and WT samples 10,000 times, yielding 10,000 sets of t-values. The resampled labels thus had no relationship with the TSG inactivation status. For each random set, we applied RWR following the above strategy, yielding 10,000 random sets of $p^S$ for each TSG, denoted by $p^p = \{p_{i,j}\}$, $i = 1, \ldots, g$; $j = 1, \ldots, 10000$. For each gene, these probability values formed the null distribution, from which we calculated a second z-score:

$$z^p_i = \frac{p^S_i - \mathrm{mean}(p^p_i)}{\mathrm{sd}(p^p_i)}.$$

We considered a gene impacted only if $z^B_i \geq 2$ and $z^p_i \geq 2$, corresponding to 2 standard deviations from the center. Furthermore, only genes whose shortest path to the corresponding TSG was $\leq 2$ were included.
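The RWR iteration and the two z-scores described above can be sketched in a few lines of plain Python. This is an illustrative sketch under stated assumptions, not the authors' implementation: the 5-node network and the t-values are made up, the function names (`column_normalize`, `rwr`, `baseline_z`, `permutation_z`) are hypothetical, the starting vector is built from absolute t-values rescaled to sum to 1, convergence is measured with the L1 norm, and the sample standard deviation is used. The text does not specify any of these choices.

```python
import statistics

def column_normalize(adj):
    """Scale each column of a square adjacency matrix to sum to 1."""
    g = len(adj)
    col_sums = [sum(adj[i][j] for i in range(g)) for j in range(g)]
    return [[adj[i][j] / col_sums[j] if col_sums[j] else 0.0
             for j in range(g)] for i in range(g)]

def rwr(W, p0, r=0.5, tol=1e-3, max_iter=1000):
    """Iterate p_{m+1} = r*W*p_m + (1 - r)*p0 until the change is below tol."""
    g = len(p0)
    p = list(p0)
    for _ in range(max_iter):
        nxt = [r * sum(W[i][j] * p[j] for j in range(g)) + (1 - r) * p0[i]
               for i in range(g)]
        # Assumption: the "difference" between iterates is the L1 norm.
        if sum(abs(a - b) for a, b in zip(nxt, p)) < tol:
            return nxt
        p = nxt
    return p

def baseline_z(p_s, p_base):
    """z^B_i = (p^S_i - mean(p^BASE)) / sd(p^BASE), one value per node."""
    mu = statistics.mean(p_base)
    sd = statistics.stdev(p_base)  # assumption: sample standard deviation
    return [(x - mu) / sd for x in p_s]

def permutation_z(p_s, null_runs):
    """z^p_i = (p^S_i - mean(p^p_i)) / sd(p^p_i), using a per-gene null.

    null_runs is a list of RWR result vectors, one per label permutation.
    """
    z = []
    for i, x in enumerate(p_s):
        null_i = [run[i] for run in null_runs]
        z.append((x - statistics.mean(null_i)) / statistics.stdev(null_i))
    return z

# Hypothetical toy network with edges 0-1, 0-2, 0-3, 3-4.
adj = [
    [0, 1, 1, 1, 0],
    [1, 0, 0, 0, 0],
    [1, 0, 0, 0, 0],
    [1, 0, 0, 0, 1],
    [0, 0, 0, 1, 0],
]
t_values = [4.0, -1.0, 0.5, 2.0, -0.5]  # made-up t-values, one per node

W = column_normalize(adj)
g = len(adj)

# Assumption: absolute t-values rescaled to sum to 1 form the start vector.
abs_t = [abs(t) for t in t_values]
p0 = [t / sum(abs_t) for t in abs_t]

p_s = rwr(W, p0)                  # weighted walk
p_base = rwr(W, [1.0 / g] * g)    # equally weighted control
z_b = baseline_z(p_s, p_base)
```

In a full analysis, `null_runs` would hold one `rwr` result per label permutation (10,000 in the text), and a gene would be kept only if both z-scores are at least 2 and it lies within two steps of the TSG in the network.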

QUANTIFICATION AND STATISTICAL ANALYSES

Somatic mutation, CNV, and RNA-sequencing analyses were based on 7248 samples with qualified data. Definitions of significance for various statistical tests are described and referenced in their respective sections in the Method Details.

Characterizing lymphoma incidence and disparities for a cancer center catchment region

To appear in: Clinical Lymphoma, Myeloma and Leukemia

Please cite this article as: Ayers AA, Lyu L, Dance K, Ward KC, Flowers CR, Koff JL, McCullough LE, Characterizing lymphoma incidence and disparities for a cancer center catchment region, Clinical Lymphoma, Myeloma and Leukemia (2019), doi: https://doi.org/10.1016/j.clml.2019.06.009.

Original Study

Authors: Amy A. Ayers, MPH (a)†; Lin Lyu, MD, PhD (a,c)†; Kaylin Dance (a,c); Kevin C. Ward, PhD, MPH (c,d); Christopher R. Flowers, MD, MSc (a,b); Jean L. Koff, MD, MSc (a,b); Lauren E. McCullough, PhD, MSPH (a,c)
† These authors contributed equally to this work.

(a) Winship Cancer Institute, 1365-C Clifton Road, Atlanta, Georgia 30322, United States of America

(b) Department of Hematology and Medical Oncology, Emory University School of Medicine, 100 Woodruff Circle, Atlanta, Georgia 30322, United States of America