Closes #168 University post #213

Merged
merged 20 commits into from
Sep 16, 2024
12 changes: 11 additions & 1 deletion inst/WORDLIST.txt
@@ -107,6 +107,7 @@ CKD
CMCAT
CMSEQ
COHORTC
CoP
COUNTRYL
COUNTRYN
COVID
@@ -252,6 +253,7 @@ Novo
Novotel
ORobErYX
Olo
OPM
nqJsLSLd
nrow
NUMCYC
@@ -268,6 +270,7 @@ param
PARAM
PARAMCD
PARCAT
PDD
PHUSE
PHUSE’s
POSIXct
@@ -276,6 +279,7 @@ PPT
PRE
PRSEQ
PRURITUS
Parashar
Patil
Pavel
Pawel
@@ -309,6 +313,7 @@ RStudioTableContest
Rbasel
Reactiveness
Representable
Retinopathy
Rhinoverse
RightTool
Rimler
@@ -348,6 +353,7 @@ Stagg
StefanThoma
Stoilova
Straub
Subfield
Sumesh
Survivorship
TBC
@@ -403,6 +409,7 @@ WG
WTBL
WebAssembly
WebR
Welwyn
WnpvVgmyE
XANO
XPORT
@@ -434,6 +441,7 @@ admiralroche
admiraltemplate
admiralvaccine
adnca
adoe
adrian
adsl
adtte
@@ -687,6 +695,7 @@ michaelrimler
michaelrimler
mmThh
mmaechler
modelling
modles
modularity
modularization
@@ -706,6 +715,7 @@ nqJsLSLd
nrow
nyone
onboarding
ophtha
opifex
optimised
organisations
@@ -1072,4 +1082,4 @@ accountabilities
curation
requestor
pageId
viewpage
viewpage
74 changes: 74 additions & 0 deletions posts/2024-09-03_university_undergraduate_report/appendix.R
@@ -0,0 +1,74 @@
library(dplyr)
# markdown helpers --------------------------------------------------------

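# Build a level-2 heading tagged as a Quarto appendix, followed by its content
markdown_appendix <- function(name, content) {
  paste(paste("##", name, "{.appendix}"), " ", content, sep = "\n")
}

# Build an inline markdown link: [text](path)
markdown_link <- function(text, path) {
  paste0("[", text, "](", path, ")")
}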



# worker functions --------------------------------------------------------

# Link to the post's source file on the repository host
insert_source <- function(repo_spec, name,
                          collection = "posts",
                          branch = "main",
                          host = "https://github.com",
                          text = "Source",
                          file_name) {
  path <- paste(
    host,
    repo_spec,
    "tree",
    branch,
    collection,
    name,
    file_name,
    sep = "/"
  )
  return(markdown_link(text, path))
}

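insert_timestamp <- function(tzone = Sys.timezone()) {
  time <- lubridate::now(tzone = tzone)
  # format() applies `tz` and `usetz` directly; as.character() may not
  # honour them for date-times on newer R versions
  stamp <- format(time, tz = tzone, usetz = TRUE)
  return(stamp)
}

# The session info page is shared by the whole blog, so the repo arguments
# are accepted for a consistent interface but are not used to build the link
insert_lockfile <- function(repo_spec, name,
                            collection = "posts",
                            branch = "main",
                            host = "https://github.com",
                            text = "Session info") {
  path <- "https://pharmaverse.github.io/blog/session_info.html"

  return(markdown_link(text, path))
}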



# top level function ------------------------------------------------------

insert_appendix <- function(repo_spec, name, collection = "posts", file_name) {
  appendices <- paste(
    markdown_appendix(
      name = "Last updated",
      content = insert_timestamp()
    ),
    " ",
    markdown_appendix(
      name = "Details",
      content = paste(
        insert_source(repo_spec, name, collection, file_name = file_name),
        # get renv information,
        insert_lockfile(repo_spec, name, collection),
        sep = ", "
      )
    ),
    sep = "\n"
  )
  knitr::asis_output(appendices)
}

Original file line number Diff line number Diff line change
@@ -0,0 +1,49 @@
---
title: "Undergraduate University Statistics Report using pharmaverse data"
author:
- name: "Syon Parashar"
description: |
A short journal highlighting how I was able to understand and apply the pharmaverse package in R
date: "2024-09-02"
categories:
image:
---

<!--------------- typical setup ----------------->

```{r setup, include=FALSE}
long_slug <- "2024-09-02_university_undergraduate_report"
# renv::use(lockfile = "renv.lock")
```

<!--------------- post begins here ----------------->

```{r, echo = FALSE}
link::auto()
```

As part of my placement year as a Data Sciences Industrial Placement student in Biostatistics at Roche Products, Welwyn, I was required to produce a "Business Project" and present it to the entire Product and Drug Development (PDD) department. From February, I started brainstorming ideas for my project, and decided to design training in RStudio for new Biostatisticians. For maximum efficiency, I tied my business project to a quantitative project report, due August 2024, for my undergraduate degree in Mathematics, Operational Research and Statistics at Cardiff University.

The quantitative project report investigates statistical analyses of preliminary clinical trial data in RStudio, following the trainStats program I authored to help ease new Biostatisticians into the industry. The program was built around the needs of people who are keen to pursue a career in Biostatistics, and was inspired by some exploratory analyses I had done during my placement year with the Ophthalmology Precision Medicine (OPM) group. It covers both hard skills such as programming, data visualisation and modelling and soft skills like logic, reasoning and communication.

I had a smooth experience with pharmaverse throughout my business and university projects. I was introduced to the package by Ross Farrugia in the "Synthetic Data CoP" Google space. The package was very easy to read and use, with excellent documentation on the pharmaverseadam website. As I was planning to share aggregated outputs (such as tables, listings and graphs) from clinical datasets externally with the university, even historical clinical data could not be used, since external use of confidential data did not align with Roche's FAIR Data principles.

Throughout the trainStats documentation, I have primarily used the "adoe_ophtha" dataset to allow for a variety of exploratory statistical analyses, ranging from producing boxplots of the spread of data by visit day and computing standard deviations and confidence intervals for endpoints to programming linear regression models and patient profiles. As the "adoe_ophtha" ADaM dataset contains visit day, active arm and endpoint data, it was ideal for training purposes. In addition, I used the "adsl" dataset to encourage trainStats users to join and merge datasets, taking into account patient demographics such as age.

The objective of trainStats was not solely to investigate the data in detail and learn statistical theory, but to highlight the application of fundamental mathematical theory to the pharmaceutical sector as a whole and to help users familiarise themselves with ADaM datasets. My favourite element of the package was that the format of both synthetic ADaM datasets was incredibly similar to that of a true clinical trial ADaM for a study in Ophthalmology.

To further develop and improve the pharmaverse package, I believe including more endpoints in the "adoe_ophtha" dataset would be invaluable for future applications and statistical analyses. ADOE datasets often have several endpoints, but the "adoe_ophtha" dataset includes only two clinical parameters, namely "Central Subfield Thickness" and "Diabetic Retinopathy Severity Scale". In addition, since the data is synthetic and randomly generated, the outputs showed no significant correlations or trends from a statistical perspective in terms of disease progression or measures of central tendency. Although the emphasis in this case was on understanding logic and reasoning whilst programming the statistical outputs, I experienced difficulties analysing the data quantitatively in my university report due to the high variation in the data. Going forward, if there is a method to simulate the data less randomly, that may be more useful for future analyses on pharmaverse data.

Overall, my experience of using the pharmaverse and pharmaverseadam packages for the first time was excellent. The package was convenient to use in RStudio and clearly formatted for multi-purpose use. I would definitely recommend pharmaverse to anyone in the industry who is required to produce a piece of project work or any analysis or summary for external use, or who is keen to publish articles and papers in their area of pharma for the wider community, in a safe and responsible manner regarding external use of data. I would like to thank Ross Farrugia for introducing me to the package, and especially Edoardo Mancini for talking me through the package and supporting me throughout the business project and university report.

<!--------------- appendices go here ----------------->

```{r, echo=FALSE}
source("appendix.R")
insert_appendix(
repo_spec = "pharmaverse/blog",
name = long_slug,
# file_name should be the name of your file
file_name = list.files() %>% stringr::str_subset("\\.qmd$") %>% first()
)
```
Original file line number Diff line number Diff line change
@@ -0,0 +1,49 @@
---
title: "Undergraduate University Statistics Report using pharmaverse data"
author:
- name: "Syon Parashar"
description: |
A short journal highlighting how I was able to understand and apply the pharmaverse package in R
date: "2024-09-02"
categories: ["ADAM", "TLG", "ADOE", "Submissions", "Community"]
image: "pharmaverse.png"
---

<!--------------- typical setup ----------------->

```{r setup, include=FALSE}
long_slug <- "2024-09-02_university_undergraduate_report"
# renv::use(lockfile = "renv.lock")
```

<!--------------- post begins here ----------------->

```{r, echo = FALSE}
link::auto()
```

As part of my placement year as a Data Sciences Industrial Placement student in Biostatistics at Roche Products, Welwyn, I was required to produce a "Business Project" and present it to the entire Product and Drug Development (PDD) department. From February, I started brainstorming ideas for my project, and decided to design training in RStudio for new Biostatisticians. For maximum efficiency, I tied my business project to a quantitative project report, due August 2024, for my undergraduate degree in Mathematics, Operational Research and Statistics at Cardiff University.

The quantitative project report investigates statistical analyses of preliminary clinical trial data in RStudio, following the trainStats program I authored to help ease new Biostatisticians into the industry. The program was built around the needs of people who are keen to pursue a career in Biostatistics, and was inspired by some exploratory analyses I had done during my placement year with the Ophthalmology Precision Medicine (OPM) group. It covers both hard skills such as programming, data visualization and modelling and soft skills like logic, reasoning and communication.

I had a smooth experience with pharmaverse throughout my business and university projects. I was introduced to the package by Ross Farrugia in the "Synthetic Data CoP" Google space. The package was very easy to read and use, with excellent documentation on the pharmaverseadam website. As I was planning to share aggregated outputs (such as tables, listings and graphs) from clinical datasets externally with the university, even historical clinical data could not be used, since external use of confidential data did not align with Roche's FAIR Data principles.

Throughout the trainStats documentation, I have primarily used the "adoe_ophtha" dataset to allow for a variety of exploratory statistical analyses, ranging from producing boxplots of the spread of data by visit day and computing standard deviations and confidence intervals for endpoints to programming linear regression models and patient profiles. As the "adoe_ophtha" ADaM dataset contains visit day, active arm and endpoint data, it was ideal for training purposes. In addition, I used the "adsl" dataset to encourage trainStats users to join and merge datasets, taking into account patient demographics such as age.
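
To give a flavour of these analyses, here is a minimal base R sketch. It is not the trainStats code: the two small data frames are invented stand-ins for "adsl" and "adoe_ophtha", and the column names (`USUBJID`, `AVISITN`, `AVAL`, `AGE`) follow common ADaM conventions that are assumed here rather than taken from the real datasets.

```r
# Toy stand-ins for ADSL and ADOE; every value is invented for illustration.
adsl_toy <- data.frame(
  USUBJID = c("S1", "S2", "S3", "S4"),
  AGE     = c(61, 70, 55, 66)
)
adoe_toy <- data.frame(
  USUBJID = rep(c("S1", "S2", "S3", "S4"), each = 2),
  AVISITN = rep(c(1, 2), times = 4),
  AVAL    = c(300, 290, 320, 310, 280, 275, 310, 295)
)

# Join subject-level demographics onto the endpoint records.
ad <- merge(adoe_toy, adsl_toy, by = "USUBJID")

# Sample size, mean, SD and a t-based 95% confidence interval for one visit.
summarise_visit <- function(x) {
  n <- length(x)
  m <- mean(x)
  s <- sd(x)
  half_width <- qt(0.975, df = n - 1) * s / sqrt(n)
  c(n = n, mean = m, sd = s, lower = m - half_width, upper = m + half_width)
}

# One row of summary statistics per visit.
by_visit <- do.call(rbind, lapply(split(ad$AVAL, ad$AVISITN), summarise_visit))

# Linear regression of the endpoint on visit and age.
fit <- lm(AVAL ~ AVISITN + AGE, data = ad)
```

Printing `by_visit` and `summary(fit)` gives the per-visit table and the model coefficients; with real data the same steps would start from the pharmaverseadam datasets instead of the toy data frames.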

The objective of trainStats was not solely to investigate the data in detail and learn statistical theory, but to highlight the application of fundamental mathematical theory to the pharmaceutical sector as a whole and to help users familiarise themselves with ADaM datasets. My favorite element of the package was that the format of both synthetic ADaM datasets was incredibly similar to that of a true clinical trial ADaM for a study in Ophthalmology.

To further develop and improve the pharmaverse package, I believe including more endpoints in the "adoe_ophtha" dataset would be invaluable for future applications and statistical analyses. ADOE datasets often have several endpoints, but the "adoe_ophtha" dataset includes only two clinical parameters, namely "Central Subfield Thickness" and "Diabetic Retinopathy Severity Scale". In addition, since the data is synthetic and randomly generated, the outputs showed no significant correlations or trends from a statistical perspective in terms of disease progression or measures of central tendency. Although the emphasis in this case was on understanding logic and reasoning whilst programming the statistical outputs, I experienced difficulties analysing the data quantitatively in my university report due to the high variation in the data. Going forward, if there is a method to simulate the data less randomly, that may be more useful for future analyses on pharmaverse data.
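
One possible way to simulate data "less randomly" is to build a known signal into it. The sketch below is an illustration only and does not describe how pharmaverseadam generates its data: each simulated subject gets their own baseline, every visit adds a fixed average change, and a small amount of noise is layered on top. All parameter values and the column names `SUBJ`, `AVISITN` and `AVAL` are hypothetical.

```r
set.seed(2024)  # fixed seed so the sketch is reproducible

n_subj <- 50
visits <- 0:4

# Hypothetical parameters, chosen only for illustration.
baseline_mean <- 350  # average baseline value
subject_sd    <- 30   # between-subject spread at baseline
slope         <- -10  # average change per visit
noise_sd      <- 5    # within-subject measurement noise

# One random baseline per subject, shared across that subject's visits.
intercepts <- rnorm(n_subj, mean = baseline_mean, sd = subject_sd)

# One row per subject and visit.
sim <- expand.grid(AVISITN = visits, SUBJ = seq_len(n_subj))
sim$AVAL <- intercepts[sim$SUBJ] + slope * sim$AVISITN +
  rnorm(nrow(sim), mean = 0, sd = noise_sd)

# A regression on visit should recover a slope close to the one built in.
fit <- lm(AVAL ~ AVISITN, data = sim)
estimated_slope <- coef(fit)[["AVISITN"]]
```

Because the trend is put in deliberately, trainees can check whether their analysis recovers it, which is not possible when every value is drawn independently.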

Overall, my experience of using the pharmaverse and pharmaverseadam packages for the first time was excellent. The package was convenient to use in RStudio and clearly formatted for multi-purpose use. I would definitely recommend pharmaverse to anyone in the industry who is required to produce a piece of project work or any analysis or summary for external use, or who is keen to publish articles and papers in their area of pharma for the wider community, in a safe and responsible manner regarding external use of data. I would like to thank Ross Farrugia for introducing me to the package, and especially Edoardo Mancini for talking me through the package and supporting me throughout the business project and university report.

<!--------------- appendices go here ----------------->

```{r, echo=FALSE}
source("appendix.R")
insert_appendix(
repo_spec = "pharmaverse/blog",
name = long_slug,
# file_name should be the name of your file
file_name = list.files() %>% stringr::str_subset("\\.qmd$") %>% first()
)
```