class: left, middle, inverse, title-slide .title[ # Robustness of colorectal cancer screening
A stress test of US colonoscopy screening guidelines ] .author[ ###
Pedro Nascimento de Lima, Carolyn Rutter, Christopher Maerzluft, Jonathan Ozik and Nicholson Collier
] .date[ ###
INFORMS, Oct 16th 2023
slides: pedrodelima.com/talks
]

---

# Agenda

**1. Why**: Why worry about policy robustness?

- Colorectal cancer (CRC) screening is widely recommended, and guidelines rely on modeling.
- But consensus cannot be taken for granted: the NordICC trial and conflicting guidelines reveal disagreement and renewed skepticism towards screening.

**2. What we do**: A stress test of CRC screening strategies

- Bayesian inference of natural history parameters
- Robustness analysis of policy recommendations

---

# Conflicts of Interest

- No conflicts of interest to report

---

# Funding statement

- This research was supported by grant U01-CA253913 from the National Cancer Institute (NCI) as part of the Cancer Intervention and Surveillance Modeling Network (CISNET).
- This research used resources of the Argonne Leadership Computing Facility, which is a DOE Office of Science User Facility supported under Contract DE-AC02-06CH11357. This research was completed with resources provided by the Laboratory Computing Resource Center at Argonne National Laboratory.
- The content is solely the responsibility of the authors and does not necessarily represent the official views of the National Institutes of Health, the Argonne National Laboratory, the Fred Hutchinson Cancer Center or the RAND Corporation.

---
class: center, middle

# Why

**Modeling must address skepticism around CRC screening guidelines**

---

# Modeling informs cancer screening policy

- CISNET simulation models are used by the United States Preventive Services Task Force (USPSTF) to assess screening strategies
- In 2021, the recommended age to start colorectal cancer (CRC) screening was lowered from 50 to 45
- ~15 million colonoscopies are performed in the US every year

--

<br>
<br>

It seemed there was clear messaging and widespread consensus around CRC screening guidelines, until ...
---
class: center, middle

# 2022: Cancer screening makes headlines

<img src="data:image/png;base64,#video-01.png" width="700px" />

https://www.youtube.com/watch?v=8FBSHLy2Jzc

---

# 2022: Cancer screening makes headlines

NordICC trial 10-year results released.

- To no one's surprise: telling people to get a colonoscopy doesn't prevent cancer (the quintessential non-compliance problem for interventions that are not "pleasing")
- But getting a colonoscopy reduced mortality by ~50%
- Outcomes are roughly in line with model predictions (see [van den Berg 2023](https://www.sciencedirect.com/science/article/pii/S001650852304773X?via%3Dihub)), yet the causes of the low ITT effects remain unexplained

---
class: center, middle

# 2023: Cancer screening makes headlines (again)

<img src="data:image/png;base64,#video-02.png" width="700px" />

https://www.cnn.com/2023/08/28/health/cancer-screenings-extend-life-wellness/index.html

---

# 2023: Cancer screening makes headlines (again)

The American College of Physicians (ACP) releases new CRC screening guidelines based on an evidence review. Contrary to current USPSTF recommendations:

- It discourages screening from age 45 to 49
- It also discourages annual FIT
- By our calculations, that could erase ~1/5 of the benefit of screening
- We and many others responded [to the guidance statement](https://www.acpjournals.org/doi/10.7326/M23-0779); how the ACP guidelines will change screening practice is unclear.

---
class: center, middle

#### CRC screening is still widely recommended, with ample support from ACS

#### But consensus cannot be taken for granted

#### Can we demonstrate CRC screening guidelines are **robust** to uncertainties?
---

# The need to address early onset of colorectal cancer

- Colorectal cancer (CRC) incidence has increased among adults younger than 50 years
- Adults born around 1990 have **double** the risk of colon cancer and **quadruple** the risk of rectal cancer compared to those born in 1950 (Siegel, 2017)
- Age-Period-Cohort modeling identifies a birth cohort effect underlying the increase in incidence
- Screening policy needs to be responsive to that

---
class: center, middle

## Goals

### 1. Quantify uncertainty around adenoma birth-cohort effects given increased early onset

### 2. Investigate the robustness of CRC screening recommendations

---

# How CRC models work

- Cancer screening models simulate the natural history of colorectal cancer, assuming all cancers come from the adenoma pathway.

<img src="data:image/png;base64,#crcmodels.png" width="100%" />

*“Estimation of Benefits, Burden, and Harms of Colorectal Cancer Screening Strategies: Modeling Study for the US Preventive Services Task Force.” JAMA - Journal of the American Medical Association 315(23): 2595–2609.*

---
class: center, middle

# Part I

**Bayesian calibration of Birth Cohort Effects**

---

# We use Incremental Mixture Approximate Bayesian Computation (IMABC) for calibration

Given:

- A set of calibration targets `\(y\)`.
- A vector of parameters `\(\theta\)` and their prior distribution `\(\pi (\theta)\)`.
- A model `\(M(\theta)\)` that maps parameters to observable targets `\(\hat y\)`.

Do the following:

- Draw parameters `\(\theta_i\)` from the prior distribution `\(\pi (\theta)\)` and simulate the model `\(M(\theta_i)\)` iteratively, keeping the draws whose predictions fall within error bounds `\(\hat y \in y \pm \epsilon_y\)`.
- The resulting set of parameters approximates the posterior distribution of `\(f(\theta)\)`.
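
The accept/reject step above can be sketched in a few lines of Python. This is plain rejection ABC on a toy model, not IMABC itself (IMABC adds the batch-sequential mixture updates), and every name here (`toy_model`, `abc_rejection`, the targets and tolerances) is a hypothetical stand-in rather than anything from CRC-SPIN.

```python
import random


def toy_model(theta):
    """Hypothetical stand-in for M(theta): maps one parameter to two predicted targets."""
    return [theta, 2.0 * theta]


def abc_rejection(model, prior_sampler, targets, tolerances, n_draws, seed=0):
    """Keep prior draws whose predictions all fall within y +/- epsilon_y."""
    rng = random.Random(seed)
    accepted = []
    for _ in range(n_draws):
        theta = prior_sampler(rng)   # draw theta_i from the prior pi(theta)
        y_hat = model(theta)         # simulate M(theta_i)
        within_bounds = all(
            abs(pred - y) <= eps
            for pred, y, eps in zip(y_hat, targets, tolerances)
        )
        if within_bounds:
            accepted.append(theta)
    return accepted


# Illustrative targets and tolerances (assumed values, not calibration data).
targets = [1.0, 2.0]
tolerances = [0.1, 0.2]

posterior_sample = abc_rejection(
    toy_model,
    prior_sampler=lambda rng: rng.uniform(0.0, 3.0),  # assumed uniform prior
    targets=targets,
    tolerances=tolerances,
    n_draws=5000,
)
print(len(posterior_sample), "draws accepted out of 5000")
```

The accepted draws play the role of the approximate posterior sample; with a microsimulation in place of `toy_model`, each call is expensive, which is why the batch-sequential design and HPC resources matter.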
[IMABC](https://projecteuclid.org/journals/annals-of-applied-statistics/volume-13/issue-4/Microsimulation-model-calibration-using-incremental-mixture-approximate-Bayesian-computation/10.1214/19-AOAS1279.full) provides an algorithm to perform ABC in a batch-sequential fashion. This process is computationally intensive and requires HPC for microsimulation models.

---

# How to address early onset in CRCSPIN?

We concurrently investigate two mechanisms that could explain higher CRC incidence among young adults:

### 1. Adenomas might initiate before age 20 ("young onset")

- Calibrate each model specification separately.

### 2. Recent birth cohorts might have increased risk of adenoma initiation (birth cohort effects)

- Add birth cohort effects to the model, and add calibration targets to inform them.

---

# Adenoma risk model now includes birth cohort effects

.pull-left[

- This analysis now allows birth-cohort effects in our adenoma risk model.
- Individual log-risk `\(\ln(\Psi_{ia})\)` of person `\(i\)` at age `\(a\)` depends on their sex, age, and **year of birth**.

$$
`\begin{aligned}
\ln(\Psi_{ia}) =&\ \alpha_{0i} + \\
&\ \alpha_1 sex_i + \\
&\ \sum_{y \in I}{\alpha_y BC_{iy}} + \\
&\ \alpha_{k_0} \delta(a \geq k_0) \min(a-k_0, k_1-k_0) + \\
&\ \alpha_{k_1} \delta(a \geq k_1) \min(a-k_1, k_2-k_1) + \\
&\ \alpha_{k_2} \delta(a \geq k_2) \min(a-k_2, k_3-k_2) + \\
&\ \alpha_{k_3} \delta(a \geq k_3) (a-k_3) \\
\end{aligned}`
$$

]

--

.pull-right[

Like prior CRCSPIN versions (2.5 and below):

- `\(\delta(x) = 1\)` when `\(x\)` is true and 0 otherwise
- `\(\alpha_{0i} \sim N(A_a, \sigma_a)\)` allows variation in risk.
- Parameters `\(A_a, \sigma_a, \alpha_1, \alpha_2, \alpha_{k_0}, \alpha_{k_1}, \alpha_{k_2}\)`, and `\(\alpha_{k_3}\)` require estimation.

The model is now more flexible:

- Ages at which the slope of risk changes `\(k_0, k_1, k_2, k_3\)` are now model inputs.
We test two model specifications: **BC-20** `\((k_0 = 20)\)` and **BC-10** `\((k_0 = 10)\)`
- `\(BC_{iy}\)` is an indicator variable for the birth cohort `\(y\)` of person `\(i\)`. `\(\exp(\alpha_y)\)` is the **incidence risk ratio** of cohort `\(y\)` relative to 1940.

]

---

# Adenoma risk posteriors vary by model

.pull-left[

- Shaded areas represent 50% and 95% Highest-Density Regions (HDRs) for cumulative adenoma risk by ages 25 and 80.
- HDR interpretation: An X% HDR contains X% of the mass of a pdf *such that all points within those bounds are more likely than any points outside the HDR*.
- The model that allows earlier adenoma initiation (BC-10) implies a higher cumulative risk of adenoma initiation by age 25.
- The two models imply non-overlapping beliefs about adenoma risk.
- BC-10 is compatible with a broader range of outcomes for risk.

]

.pull-right[

<img src="data:image/png;base64,#2023-10-11-INFORMS-Robustness-CRC-Screening_files/figure-html/unnamed-chunk-4-1.png" width="100%" />

]

---

# Strong birth cohort effects after 1940

.pull-left[

- Shaded areas represent the posterior distribution of adenoma incidence risk ratios by cohort year and model.
- A grey vertical line marks the 1940 birth cohort, which is used as the reference cohort.
- BC-10 and BC-20 refer to the models with a minimum age at adenoma initiation of 10 and 20 years, respectively.
- The assumption of no birth cohort effects is implausible, particularly after 1940.
- Risk ratios differ between the two models (lower if adenomas are allowed to initiate earlier).

]

.pull-right[

<img src="data:image/png;base64,#2023-10-11-INFORMS-Robustness-CRC-Screening_files/figure-html/unnamed-chunk-5-1.png" width="100%" />

]

---
class: center, middle

# Part II

**Robustness analysis of CRC colonoscopy screening**

---

# Stress-testing CRC screening guidelines

- We now have a posterior distribution for natural history parameters (including birth cohort effects) for two specifications of CRC-SPIN (BC-10 and BC-20).
- They capture uncertainty about the disease process and recent changes in risk.
- In an appendix to the paper, I demonstrate the plausibility of four *colonoscopy sensitivity scenarios* (Very Low, Low, Baseline, High).
- **How robust are our recommendations to those uncertainties?**

---

# Experimental design

The full experimental design combines:

- The 26 colonoscopy screening strategies assessed in the 2021 USPSTF analysis.
- Two model specifications (BC-20 and BC-10, sampling 500 points from each posterior distribution).
- Four colonoscopy sensitivity scenarios (Very Low, Low, Baseline, High).
- Each run simulated for 2 million individuals.
- Report prediction intervals for each strategy conditional on each scenario for Benefit (life-years gained, LYG), Burden (number of colonoscopies), and Incremental Cost Effectiveness Ratio (ICER).

---

# LYG estimates depend crucially on assumptions

.left-column[

LYG estimates for strategy 45-70, 10 vary substantially across scenarios:

- High Sens: 410 (95% PI [310, 559]) LYG / 1000 people
- Very Low Sens: 364 (95% PI [273, 502]) LYG / 1000 people

]

.right-column[

<img src="data:image/png;base64,#2023-10-11-INFORMS-Robustness-CRC-Screening_files/figure-html/unnamed-chunk-6-1.png" width="100%" />

]

---

# Computing the cost-effectiveness frontier over models' posterior distributions

- Select strategies for the cost-effectiveness frontier separately for each combination of model specification and sensitivity scenario, using posterior means.
- To be selected for the Pareto frontier, a strategy had to be non-dominated and not extended dominated by other strategies (i.e., also no worse than a linear combination of Pareto-efficient strategies).
- After selecting non-dominated strategies *in expectation*, for each scenario, compute LYG, number of colonoscopies, and ICER for the strategies on the Pareto frontier, for each parameter set in the posterior distribution.
- Report prediction intervals for LYG, number of colonoscopies, and ICER.
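
A minimal Python sketch of the frontier selection described above, assuming each strategy is summarized by its posterior-mean burden (colonoscopies per 1000) and benefit (LYG per 1000). The function name `efficient_frontier` and all numbers are hypothetical illustrations, not results from the analysis, and the ratio computed here is burden per LYG gained.

```python
def efficient_frontier(strategies):
    """Return frontier strategies as (name, burden, benefit), ordered by burden.

    strategies: dict mapping name -> (burden, benefit).
    Drops dominated strategies, then extended-dominated ones.
    """
    # Sort by burden; for ties, the higher benefit comes first.
    ordered = sorted(strategies.items(), key=lambda kv: (kv[1][0], -kv[1][1]))

    # Step 1: drop dominated strategies (more burden without more benefit).
    nondominated = []
    best_benefit = float("-inf")
    for name, (burden, benefit) in ordered:
        if benefit > best_benefit:
            nondominated.append((name, burden, benefit))
            best_benefit = benefit

    # Step 2: drop extended-dominated strategies, so that incremental
    # ratios strictly increase along the frontier.
    frontier = []
    for cand in nondominated:
        while len(frontier) >= 2:
            _, c1, b1 = frontier[-2]
            _, c2, b2 = frontier[-1]
            _, c3, b3 = cand
            # Cross-multiplied test of ratio(2 vs 1) >= ratio(3 vs 2).
            if (c2 - c1) * (b3 - b2) >= (c3 - c2) * (b2 - b1):
                frontier.pop()  # middle point lies on or below the chord
            else:
                break
        frontier.append(cand)
    return frontier


# Hypothetical posterior-mean outcomes per 1000 people: (colonoscopies, LYG).
example = {
    "no screening": (0, 0),
    "A": (2000, 200),
    "B": (3000, 220),  # extended dominated by a mix of A and C
    "C": (4000, 300),
    "D": (4500, 290),  # dominated by C
    "E": (6000, 330),
}

frontier = efficient_frontier(example)
frontier_names = [name for name, _, _ in frontier]

# Incremental ratio of each frontier strategy relative to the previous one.
ratios = [
    (c2 - c1) / (b2 - b1)
    for (_, c1, b1), (_, c2, b2) in zip(frontier, frontier[1:])
]
print(frontier_names, ratios)
```

In this toy example, B and D drop out, and the incremental ratios increase along the remaining strategies. Repeating the outcome calculation for every posterior draw, with the frontier membership held fixed, would give the prediction intervals.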
---
class: full-slide-fig

<img src="data:image/png;base64,#2023-10-11-INFORMS-Robustness-CRC-Screening_files/figure-html/unnamed-chunk-7-1.png" width="100%" />

---

# Incremental cost-effectiveness ratios more stable

.left-column[

- ICER estimates are also somewhat similar across model specifications but vary across sensitivity levels.

]

.right-column[

<img src="data:image/png;base64,#2023-10-11-INFORMS-Robustness-CRC-Screening_files/figure-html/unnamed-chunk-8-1.png" width="100%" />

]

---

# Takeaways

- Cost-effectiveness measures varied substantially within and across scenarios.
- But the Pareto frontier seems robust across scenarios and models.
- Uncertainty from the natural history parameters seems as relevant as colonoscopy sensitivity assumptions.
- Models represent mutually exclusive assumptions about adenoma initiation, but those differences did not change cancer screening recommendations.
- The paper reports Bayesian prediction intervals for effectiveness, burden, and efficiency ratios, accounting for natural history uncertainty.

---
class: center, middle

# Credits

**Carolyn Rutter & Chris Maerzluft (Fred Hutchinson)**

*indispensable mentoring, CRCSPIN maintenance*

**Jonathan Ozik and Nick Collier (ANL)**

*for all the HPC expertise and support*

---
class: center, middle

*Conclusion*

# CRC screening strategies robust to a wide range of uncertainties and assumptions

Slides & pre-print available

[pedrodelima.com/talks](https://pedrodelima.com/talks)

plima@rand.org

@PedroNdeLima

---
class: center, middle

# Backup slides

---

# Calibration targets date back to the 1800s

<img src="data:image/png;base64,#cohorts.png" width="100%" />

---

# Estimating birth cohort effects

- Ideally, we would estimate `\(\alpha_y\)` for each birth-cohort year within the set of years `\(I = [1880,1881,...,1975]\)`.
- But this would result in a substantial expansion of the parameter space.
Instead:

- Define a set of years at which the birth cohort effect will be estimated: 1880, 1910, 1940, 1955, 1970, and 1975.
- 1940 is a reference point at which the effect is zero.
- The minimum and maximum dates are chosen based on the available calibration targets.
- The intermediate knots were chosen to be evenly spaced, with a finer resolution after 1940.
- Produce a smooth, monotonic interpolation between these points to define birth cohort effects between the years used as knots, using a method based on piecewise rational functions (Stineman 1980, Johannesson 2018).
- More details: See our [pre-print](https://www.medrxiv.org/content/10.1101/2023.03.07.23286939v1)

---

# Colonoscopy sensitivity assumptions
.footnote[

\* *Sensitivity for 6-9 mm and >=10 mm adenomas is compatible with the low-sensitivity scenarios in Rutter et al. (2021). Sensitivity for <=5 mm adenomas is justified in Nascimento de Lima (2022).*

\*\* *Following scenarios used in the Knudsen et al. (2016) and (2021) Task Force runs.*

]

---

# Stress-testing CRC strategies requires high-performance computing

- Experimental design defined to fit a ~200,000 core-hour budget.
- Combining 2 model specifications, 500 natural history parameter sets for each model, 26 screening strategies, 1 “No-Screening” scenario used as the comparator, and four colonoscopy sensitivity scenarios results in *105,000* unique model runs.
- Simulating 2 million people for each run requires simulating 210 billion life histories and over **0.63 trillion adenomas**.
- Experiment conducted on the Theta Supercomputer using a couple thousand concurrent nodes.
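
The run count can be checked with a short Python calculation. The slide does not spell out how the factors combine; one reading that reproduces 105,000 is that the no-screening comparator is run once per parameter set, since colonoscopy sensitivity cannot affect a scenario with no colonoscopies. That reading is an assumption, and the variable names are illustrative.

```python
n_models = 2                # BC-10 and BC-20
n_param_sets = 500          # posterior draws per model
n_strategies = 26           # colonoscopy screening strategies
n_sensitivity = 4           # Very Low, Low, Baseline, High
people_per_run = 2_000_000

# Screening runs vary over every factor.
screening_runs = n_models * n_param_sets * n_strategies * n_sensitivity

# Assumption: the no-screening comparator does not depend on colonoscopy
# sensitivity, so it is needed only once per model and parameter set.
no_screening_runs = n_models * n_param_sets

total_runs = screening_runs + no_screening_runs
life_histories = total_runs * people_per_run

print(f"{total_runs:,} runs, {life_histories:,} simulated life histories")
```

Under this reading, the totals match the 105,000 runs and 210 billion life histories reported above.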