Skip to content
/ xmd Public

RMarkdown script scheduler and web server to run adhoc scripts - eXtended r MarkDown

Notifications You must be signed in to change notification settings

mattwg/xmd

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

15 Commits
 
 

Repository files navigation

XMD

XMD stands for eXtended R MarkDown. XMD extends the reach of R and RMarkdown scripts beyond the domain of R experts and closer to business users. It also provides a convenient infrastructure to deploy RMarkdown scripts for repeated execution and scheduling.

RMarkdown powered by knitr is a very cool tool for data scientists working in R - and in particular R Studio.

In the words of it's creators:

"R Markdown is an authoring format that enables easy creation of dynamic documents, presentations, and reports from R. It combines the core syntax of markdown (an easy-to-write plain text format) with embedded R code chunks that are run so their output can be included in the final document. R Markdown documents are fully reproducible (they can be automatically regenerated whenever underlying R code or data changes)."

XMD is the server infrastructure that supports the following User Stories:

  • As a data scientist I would like to enable colleagues and collaborators with limited or no experience of R to be able to rerun my RMarkdown scripts (which generate HTML output) so that they can self serve and run parameterised analytic processes without my help
  • As a data scientist I would like to be able to schedule any data munging, statistical modelling or visualisation process in the language in which I created it (R obviously!) so that I can be more productive and not have to rework and revalidate code in the languages of production
  • As a data scientist I produce lots of RMarkdown scripts and I want to publish them to XMD as quickly as seamlessly as possible without having to log in to the server or execute any manual process - one command to deploy!I might be a good data scientist but my Unix / Ubuntu admin skills and web development skills are limited and whilst I can do it these tasks waste a lot of my time
  • As a data scientist who works with big data my analysis processes often take a long time to run - when a user runs an XMD process I want to provide users with email updates as the job executes so they can start the job and forget about it confident in the knowledge that it is running
  • Since users do not always want to log into the XMD web site to start an XMD job they should be able to start a job by calling a RESTful API so they can control the execution of XMD jobs from their own tools or processes

XMD is not:

  • An interactive reporting tool like RShiny or Tableau
  • A tool for writing RMarkdown code
  • A business intelligence tool
  • A replacement for R, RMarkdown or knitr
  • Anything to do with XML!

Hope that explains what XMD is - I have a working prototype running - this project is about rewriting it and sharing it with the wider community. There were too many hacks in XMD1.0 ;-)

I also want to rewrite the entire server using Python - I chose Perl first time round but my trendy colleagues don't have experience in that and Perl is too hard to read - Python is more cool from a data science point of view than Perl and more readable - so I am hoping this will encourage more collaboration!

Cheers, Matt

About

RMarkdown script scheduler and web server to run adhoc scripts - eXtended r MarkDown

Resources

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published