Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closes #125 xportr 0.4.0 blog #126

Merged
merged 9 commits into from
Apr 2, 2024
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
2 changes: 2 additions & 0 deletions .lycheeignore
Original file line number Diff line number Diff line change
Expand Up @@ -14,3 +14,5 @@ https://github.com/pharmaverse/blog/pkgs/container/blog
https://stackoverflow.com/*
https://stackoverflow.com/questions/12688717/round-up-from-5#comment110611119_12688836
https://appsilon.com/docker-vs-podman-vs-singularity/amp/
file:///home/runner/work/blog/blog/posts/2024-03-01_rhino_shiny_app_validation/Appsilon.com
https://rsc.niaid.nih.gov/clinical-research-sites/daids-adverse-event-grading-tables
2 changes: 1 addition & 1 deletion R/CICD.R
Original file line number Diff line number Diff line change
Expand Up @@ -8,7 +8,7 @@ spelling::spell_check_files(list.files(pattern = ".*.qmd$", recursive = TRUE),
)

# now check those words and whether or not they are really mistakes.
# once you fixed all mistaked you can:
# once you fixed all mistakes you can:
words <- spelling::spell_check_files(list.files(pattern = ".*.qmd$", recursive = TRUE),
ignore = readr::read_lines("inst/WORDLIST.txt")
)
Expand Down
29 changes: 29 additions & 0 deletions inst/WORDLIST.txt
Original file line number Diff line number Diff line change
Expand Up @@ -653,4 +653,33 @@ QKz
scm
ytJ
yv
ztamazonaws
Appsilon's
atorusresearch
ci
datasetjson
deepdive
Ege
eu
feedbacks
IoT
Javascript
JSON's
Kamil
klmr
lifecycle
linters
magrittr
NotUsed
opifex
overreliance
PoC
shinytest
Taşlıçukur
tempdir
thinkr
utm
Visualisation
Żyła
amazonaws
zt
Binary file added media/xportr.png
Loading
Sorry, something went wrong. Reload?
Sorry, we cannot display this file.
Sorry, this file is invalid so it cannot be displayed.
Binary file added media/xportr_design_flow.png
Loading
Sorry, something went wrong. Reload?
Sorry, we cannot display this file.
Sorry, this file is invalid so it cannot be displayed.
78 changes: 78 additions & 0 deletions posts/2024-03-29_xportr_0_4_0/appendix.R
Original file line number Diff line number Diff line change
@@ -0,0 +1,78 @@
# markdown helpers --------------------------------------------------------

markdown_appendix <- function(name, content) {
paste(paste("##", name, "{.appendix}"), " ", content, sep = "\n")
}
markdown_link <- function(text, path) {
paste0("[", text, "](", path, ")")
}



# worker functions --------------------------------------------------------

insert_source <- function(repo_spec, name,
collection = "posts",
branch = "main",
host = "https://github.com",
text = "source code") {
path <- paste(
host,
repo_spec,
"tree",
branch,
collection,
name,
"code_sections.qmd",
sep = "/"
)
return(markdown_link(text, path))
}

insert_timestamp <- function(tzone = Sys.timezone()) {
time <- lubridate::now(tzone = tzone)
stamp <- as.character(time, tz = tzone, usetz = TRUE)
return(stamp)
}

insert_lockfile <- function(repo_spec, name,
collection = "posts",
branch = "main",
host = "https://github.com",
text = "R environment") {
path <- paste(
host,
repo_spec,
"tree",
branch,
collection,
name,
"renv.lock",
sep = "/"
)
return(markdown_link(text, path))
}



# top level function ------------------------------------------------------

insert_appendix <- function(repo_spec, name, collection = "posts") {
appendices <- paste(
markdown_appendix(
name = "Last updated",
content = insert_timestamp()
),
" ",
markdown_appendix(
name = "Details",
content = paste(
insert_source(repo_spec, name, collection),
insert_lockfile(repo_spec, name, collection),
sep = ", "
)
),
sep = "\n"
)
knitr::asis_output(appendices)
}
Binary file added posts/2024-03-29_xportr_0_4_0/xportr.png
Loading
Sorry, something went wrong. Reload?
Sorry, we cannot display this file.
Sorry, this file is invalid so it cannot be displayed.
145 changes: 145 additions & 0 deletions posts/2024-03-29_xportr_0_4_0/xportr_0_4_0.qmd
Original file line number Diff line number Diff line change
@@ -0,0 +1,145 @@
---
title: "xportr 0.4.0"
author:
- name: Sadchla Mascary
description: "xportr is a tool tailor-made for clinical programmers, enabling them to generate CDISC-compliant XPT files for SDTM and/or ADaM. It streamlines the preparation on xpt files, ensuring they are primed for submission to clinical data validation applications or regulatory agencies."
date: "2024-03-29"
# please do not use any non-default categories.
# You can find the default categories in the repository README.md
categories: [ADaMs, SDTMs, xportr]
# feel free to change the image
image: "xportr.png"

---

<!--------------- typical setup ----------------->

```{r setup, include=FALSE}
long_slug <- "2024-03-29_xportr_0_4_0"
# renv::use(lockfile = "renv.lock")
```

<!--------------- post begins here ----------------->

# `xpt` Meets [`xportr`](https://atorus-research.github.io/xportr/)!
In the pharmaceuticals and healthcare industries, it is crucial to maintain a standard structure for data exchange and regulatory submissions, enter `xpt` datasets! `xpt` datasets are binary files that are typically created by SAS software, they contain structured data, including variables, labels, and metadata. In order to develop `xpt` formatted files in R, let's introducing you to [`xportr`](https://atorus-research.github.io/xportr/).

# What is [`xportr`](https://atorus-research.github.io/xportr/)?

[`xportr`](https://atorus-research.github.io/xportr/) is the result of collaboration among developers from various domains within the pharmaceutical industry. [click here for a full list](https://atorus-research.github.io/xportr/authors.html). This collaborative effort ensures that [`xportr`](https://atorus-research.github.io/xportr/) meets the diverse needs and requirements of users across various domains. By leveraging insights and expertise from different stakeholders, [`xportr`](https://atorus-research.github.io/xportr/) continues to evolve and improve, maintaining its relevance and effectiveness in the ever-changing landscape of data management and analysis. [`xportr`](https://atorus-research.github.io/xportr/) is currently on version 0.4.0 (you can keep up with all of its improvements by following the [`Changelog here`](https://atorus-research.github.io/xportr/news/index.html)) and with its meticulous design flow, users can expect to develop `xpt` files that have been vetted by many checks in the backend and which are CDISC compliant.
![](xportr_design_flow.png){fig-align="center" width="800"}

# Why [`xportr`](https://atorus-research.github.io/xportr/)?

Did I mention that [`xportr`](https://atorus-research.github.io/xportr/) is an indispensable R package for creating `xpt` datasets? This toolkit offers functions to apply metadata information to your CDISC-compliant datasets created in R, conduct strict checks, and offer valuable feedbacks about formats, format lengths, etc. In this blog post, we'll explore some high level capabilities of [`xportr`](https://atorus-research.github.io/xportr/), and its development.

# How does it work?

You too can experience some of the benefits of [`xportr`](https://atorus-research.github.io/xportr/) by taking a [`deep dive`](https://atorus-research.github.io/xportr/articles/deepdive.html) at [`xportr`](https://atorus-research.github.io/xportr/)!

There are data deliverables and supporting documentation needed for a successful submission to Health Authorities, while you can find a walk through on how you can use `xportr` [`here`](https://atorus-research.github.io/xportr/articles/deepdive.html#what-goes-in-a-submission-to-a-health-authority), let's look at a high level example:

```{r, message = FALSE, echo = FALSE }
library(xportr)
library(magrittr)
```

Looking at an `adsl` and the define metadata information
```{r, message = FALSE, warnings = FALSE }
adsl <- data.frame(
Subj = as.character(123, 456, 789),
Different = c("a", "b", "c"),
Val = c("1", "2", "3"),
Param = c("param1", "param2", "param3")
)
```

and the define metadata information
```{r, message = FALSE, warnings = FALSE}
metadata <- data.frame(
dataset = "test",
variable = c("Subj", "Param", "Val", "NotUsed"),
type = c("numeric", "character", "numeric", "character"),
format = NA,
order = c(1, 3, 4, 2)
)
```

Users can use the [`xportr_metadata()`](https://atorus-research.github.io/xportr/reference/metadata.html) function along with [`xportr_type()`](https://atorus-research.github.io/xportr/reference/xportr_type.html) and [`xportr_order()`](https://atorus-research.github.io/xportr/reference/xportr_order.html) to not only apply metadata attributes to your data but also receive feedback on your data based on your spec. Additionally, some of those functions performing valuable checks like:

* Variable names must start with a letter (not an underscore), be comprised of only uppercase letters (A-Z), numerals (0-9) and be free of non-ASCII characters, symbols, and underscores.
* Allotted length for each column containing character (text) data should be set to the maximum length of the variable used across all data sets (≤ 200)
* Coerces variables to only numeric or character types
* Display format support for numeric float and date/time values
* Variables names are ≤ 8 characters.
* Variable labels are ≤ 40 characters.
* Data set labels are ≤ 40 characters.
* Presence of non-ASCII characters in Variable Names, Labels or data set labels.

For example:
```{r, warnings = FALSE}
adsl %>%
xportr_metadata(metadata, "test") %>%
xportr_type() %>%
xportr_order()
```

In addition to the functions mentioned above, users have access to other functionalities such as [`xportr_write()`](https://atorus-research.github.io/xportr/reference/xportr_write.html), which facilitates the conversion of a local data frame into an `xpt` file. For instance, if we utilize the ADSL dataframe defined earlier, we can create a variable `var_spec` using the following syntax.
```{r eval=FALSE}
var_spec <- data.frame(
dataset = "adsl",
label = "Subject-Level Analysis Dataset",
data_label = "ADSL"
)
```

Users can then create their ADSL `xpt` file in the following manner:
```{r eval=FALSE}
xportr_write(adsl,
path = paste0(tempdir(), "/adsl.xpt"),
domain = "adsl",
metadata = var_spec,
strict_checks = FALSE
)
```

[`xportr`](https://atorus-research.github.io/xportr/) also includes other important functions such as:

* [`xportr_length()`](https://atorus-research.github.io/xportr/reference/xportr_length.html) to assign the SAS length to a specified data frame.

* [`xportr_label()`](https://atorus-research.github.io/xportr/reference/xportr_length.html) to enable users to assign variable labels from metadata at the variable level to a specified data frame.

* [`xportr_format()`](https://atorus-research.github.io/xportr/reference/xportr_format.html), which allow users to apply a SAS format from a variable level metadata to a given data frame
Further details on all [`xportr`](https://atorus-research.github.io/xportr/) functions can be found [`here`](https://atorus-research.github.io/xportr/reference/index.html).


# Key Features of [`xportr`](https://atorus-research.github.io/xportr/):

- **Efficient Handling**: [`xportr`](https://atorus-research.github.io/xportr/) offers efficient avenues to produce `xpt` files, allowing users to manipulate clinical datasets with ease and export in `xpt` format in R.
- **Data Integrity**: [`xportr`](https://atorus-research.github.io/xportr/) preserves data integrity by retaining variable labels, formats, and other metadata during data export operations.
- **User-Friendly Interface**: [`xportr`](https://atorus-research.github.io/xportr/) provides a user-friendly interface with intuitive functions for generating `xpt` datasets and performing data manipulation tasks.

# About the other kid on the blog, (`JSON`)! Is The Future `xpt` or JSON?

`xpt` datasets are standardized, structured files used for storing clinical trial data, making them essential for regulatory submissions and data analysis. They remain indispensable in certain industries and with advancements in data management and analysis tools, the future of `xpt` datasets lies in improved efficiency, compatibility, and ensuring seamless integration with emerging technologies and data formats.

As data continues to play a central role in decision-making and innovation, the future of JavaScript Object Notation (JSON) datasets is worth exploring. JSON datasets offer flexibility, simplicity, and interoperability, making them increasingly popular for data storage and exchange. In fact, R users can refer to [`this article`](https://www.atorusresearch.com/datasetjson-0-0-1-release/) for a summary about JSON and the R package [`datasetjson`](https://atorus-research.github.io/datasetjson/) available on [`CRAN`](https://cloud.r-project.org/web/packages/datasetjson/index.html). JSON's lightweight and human-readable format make it suitable for a wide range of applications, including web development, APIs, IoT, and data storage. While `xpt` datasets remain prevalent in certain industries, there's a growing interest in JSON datasets across various domains. Although we will not be exploring JSON datasets, this formatted data type was worth mentioning in this blog.

# Let's Recap!

This blog post highlighted the benefits of using the [`xportr`](https://atorus-research.github.io/xportr/) package, showcased an example of [`xportr`](https://atorus-research.github.io/xportr/) function, and discussed the growing interest of JSON datasets in the data landscape. We've also discussed [`xportr`](https://atorus-research.github.io/xportr/)'s development through collaborations between developers from different parts of the industry, highlighting its versatility and relevance in various domains. Which gives me a nice transition to my call to action!! If you haven't used [`xportr`](https://atorus-research.github.io/xportr/) before, then consider this blog as your official invite to test it! Reap the benefit of collaborative efforts across industries and the pharma world. If are already a user of [`xportr`](https://atorus-research.github.io/xportr/) or you are experiencing some issues with a particular functions, then chances are someone else could either be experiencing a similar issue or have an answer so reach out to the team by creating an issue [`here`](https://github.com/atorus-research/xportr/issues).

As we look towards the future, embracing diverse data formats and technologies will be key to unlocking the full potential of data-driven innovation. Championing tools like [`xportr`](https://atorus-research.github.io/xportr/) and staying abreast of emerging data formats, researchers, analysts, and regulatory professionals can unlock new opportunities for insights and innovation in their respective fields.


**Disclaimer**
This blog contains opinions that are of the authors alone and do not necessarily reflect the strategy of their respective organizations.
<!--------------- appendices go here ----------------->

```{r, echo=FALSE}
source("appendix.R")
insert_appendix(
repo_spec = "pharmaverse/blog",
name = long_slug
)
```
Loading
Sorry, something went wrong. Reload?
Sorry, we cannot display this file.
Sorry, this file is invalid so it cannot be displayed.
Loading