Using R in Clinical Trials

If you’re involved in the review and analysis of clinical trial data, you have probably heard how useful R can be to a clinical research team and its organization. You may also have heard about the ongoing SAS® vs R debate in the programming world without being sure why it matters in clinical trials. In this post, we’ll discuss how R can be used in the clinical trial industry and the value it can offer data analysts.


What is R?

Though it can be used in a variety of settings, R is most commonly associated with scientific computing and statistics. In fact, that’s how it got its start: Ross Ihaka and Robert Gentleman created it at the University of Auckland in the early 1990s as an implementation of the S language. These days, one of its growing uses is analyzing data collected during clinical trials (CTs), the studies in which researchers evaluate whether new drugs and other treatments are safe and effective.


R is an integrated suite of software facilities for data manipulation, calculation and graphical display. It includes:

  • an effective data handling and storage facility,

  • a suite of operators for calculations on arrays, in particular matrices,

  • a large, coherent, integrated collection of intermediate tools for data analysis,

  • graphical facilities for data analysis and display either on-screen or on hardcopy, and

  • a well-developed, simple and effective programming language which includes conditionals, loops, user-defined recursive functions and input and output facilities.[1]
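To make the list above more concrete, here is a minimal sketch in base R that touches data handling, matrix operators, and a user-defined recursive function. The data frame `vitals`, its values, and the helper `factorial_rec` are hypothetical and exist only for illustration.

```r
# Hypothetical toy data: six subjects in two arms (made-up values)
vitals <- data.frame(
  subject = sprintf("S%02d", 1:6),
  arm     = rep(c("Placebo", "Active"), each = 3),
  sbp     = c(128, 135, 131, 122, 119, 125)  # systolic blood pressure, mmHg
)

# Data handling: keep only the subjects whose reading is above 125 mmHg
high_sbp <- subset(vitals, sbp > 125)

# Matrix operators: cross-product of a small design matrix
X   <- cbind(intercept = 1, active = as.numeric(vitals$arm == "Active"))
XtX <- t(X) %*% X

# Programming language: a recursive function built on a conditional
factorial_rec <- function(n) {
  if (n <= 1) 1 else n * factorial_rec(n - 1)
}

print(high_sbp)
print(XtX)
print(factorial_rec(5))
```

Graphics are left out to keep the sketch short; base R functions such as `plot()` and `boxplot()` cover that part of the list.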


What are the benefits of programming in R?

SAS® has been a standard programming language for clinical trials for years, but there are many benefits to using R. First, it’s free, whereas SAS® requires a paid license. SAS® offers some great functionality for clinical trial design and execution, but you might decide the extra cost over R isn’t worth it. Second, R has an active user community. If you have a question about how to use R in an upcoming clinical trial, someone has probably already answered it on Stack Overflow or another forum. Third, SAS® isn’t always the most convenient choice for small data sets. That said, both tools offer similar features and capabilities when it comes to producing tables, listings, and figures (TLFs).
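As a rough illustration of the "table" part of TLFs, the sketch below builds a small summary of age by treatment arm using only base R. The data frame `demog`, its values, and the helper `summarise_age` are hypothetical; a production table would normally follow the layout in the study’s statistical analysis plan.

```r
# Hypothetical demographics data (made-up values, for illustration only)
demog <- data.frame(
  arm = c("Active", "Active", "Active", "Placebo", "Placebo", "Placebo"),
  age = c(54, 61, 47, 58, 49, 66)
)

# Summary statistics for one group: subject count, mean, standard deviation
summarise_age <- function(x) c(n = length(x), mean = mean(x), sd = sd(x))

# Split ages by arm, summarise each group, and put one arm on each row
age_table <- t(sapply(split(demog$age, demog$arm), summarise_age))

print(round(age_table, 1))
```

The result is a plain numeric matrix with one row per arm, which can then be passed to whichever reporting tool the team uses.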


Compare R to SAS

One of the most immediate differences between SAS and R is that SAS is proprietary software while R is an open-source programming language. Many users prefer open-source software for its flexibility. There are also thousands of add-on packages available from CRAN (the Comprehensive R Archive Network), so you can customize your statistical analyses to your needs. That said, base R is primarily a command-line environment rather than a menu-driven tool, although development environments such as RStudio add a graphical interface around it. If you’re familiar with SAS, using R will largely be a matter of getting used to a different interface. Many users find that once they have put in some practice time with R’s syntax, it is as easy to use as other analytics programs.

Where R really excels is reporting and export. SAS supports various output formats, including PDF and Excel documents. With add-on packages such as R Markdown, R can render results to LaTeX, Markdown, HTML, Word, and PowerPoint, and other packages handle data formats such as XML, JSON, and YAML. If you work with multiple formats every day, producing them all from one program reduces the time spent switching between tools.
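A minimal sketch of exporting results from R might look like the following. It writes a CSV with base R and, if the CRAN package jsonlite happens to be installed, a JSON file as well. The `results` data frame and its values are hypothetical, and the files go to temporary paths chosen only for this example.

```r
# Hypothetical summary results (made-up values, for illustration only)
results <- data.frame(
  arm      = c("Active", "Placebo"),
  n        = c(3L, 3L),
  mean_age = c(54.0, 57.7)
)

# CSV export needs nothing beyond base R
csv_path <- tempfile(fileext = ".csv")
write.csv(results, csv_path, row.names = FALSE)

# JSON export relies on the jsonlite package from CRAN, installed once with
# install.packages("jsonlite"). The check keeps the script runnable without it.
if (requireNamespace("jsonlite", quietly = TRUE)) {
  json_path <- tempfile(fileext = ".json")
  jsonlite::write_json(results, json_path, pretty = TRUE)
}

# Read the CSV back to confirm the round trip
round_trip <- read.csv(csv_path)
print(round_trip)
```

Other formats follow the same pattern: install the relevant package from CRAN, then call its writer function on the same data frame.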


Using R to support interim reviews

In clinical trials, R has been used for statistical analysis and reporting. Frequent reviews of interim data enable clinical research teams to identify and act upon issues more quickly. Traditional tools such as SAS are powerful and have served clinical research teams well, but interim analyses built on them can be slow to turn around and often involve repetitive programming, so writing and reviewing code can become a tedious burden.

Tools such as R and Python can make it easier to analyze clinical trial data quickly. Combined with packaged functions and interactive front ends, they can put statistical modeling and simulation within reach of team members who are not programmers.

Sharing code, documentation, simulations, and visualizations through cloud-based platforms also reduces workload, because team members get web-based access regardless of location.


With tools such as Shiny, an open-source web application framework built with R, you can create interactive data exploration tools that help clinical trial teams identify trends and make informed decisions while the data are still coming in.
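As a rough idea of what such a tool can look like, here is a minimal Shiny sketch that lets a reviewer pick a treatment arm and see a lab value plotted over visits. The `labs` data, its values, and the helper `filter_arm` are hypothetical. The app is only built if the shiny package is installed, and it is not launched automatically.

```r
# Hypothetical interim lab data (made-up values, for illustration only)
labs <- data.frame(
  arm   = rep(c("Active", "Placebo"), each = 4),
  visit = rep(c("Week 0", "Week 4", "Week 8", "Week 12"), times = 2),
  alt   = c(30, 34, 41, 45, 31, 30, 32, 31)  # ALT, U/L
)

# Plain function, so the filtering logic can be checked without the app
filter_arm <- function(data, arm) data[data$arm == arm, ]

if (requireNamespace("shiny", quietly = TRUE)) {
  # User interface: a title, an arm selector, and a plot area
  ui <- shiny::fluidPage(
    shiny::titlePanel("Interim ALT review"),
    shiny::selectInput("arm", "Treatment arm", choices = unique(labs$arm)),
    shiny::plotOutput("alt_plot")
  )

  # Server logic: redraw the plot whenever the selected arm changes
  server <- function(input, output) {
    output$alt_plot <- shiny::renderPlot({
      d <- filter_arm(labs, input$arm)
      plot(seq_along(d$alt), d$alt, type = "b", xaxt = "n",
           xlab = "Visit", ylab = "ALT (U/L)")
      axis(1, at = seq_along(d$alt), labels = d$visit)
    })
  }

  app <- shiny::shinyApp(ui, server)
  # Launch from an interactive session with: shiny::runApp(app)
}
```

Keeping the data filtering in an ordinary function, separate from the Shiny code, makes it easier to test and review independently of the user interface.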

Another advantage of using tools like R for interim analysis is that the planned analyses can be run on the data accumulated so far, well before the study ends, so what you learn can inform the design of your next study sooner.


Where can I find more information about using R for clinical trial data analysis?

For more detailed information, check out the R Project website. You can also plan on attending one of the useR! conferences. Many online resources cover R and its individual packages, and several books on R programming are available.

[1] https://www.r-project.org/about.html