Finish strategy post

ChrisBeeley · Jun 18, 2024 · 43c376a
1 parent b7f6fe4

Showing 277 changed files with 9,967 additions and 3,279 deletions.
diff --git a/...01-20-data-science-accelerator-lesson-one-build-a-pipeline-and-ship-the-code.md b/...01-20-data-science-accelerator-lesson-one-build-a-pipeline-and-ship-the-code.md
@@ -3,20 +3,17 @@ author: chrisbeeley
 categories:
 - Uncategorized
 date: "2019-01-20T17:41:39Z"
-guid: https://chrisbeeley.net/?p=1186
-id: 1186
 title: Data science accelerator lesson one- build a pipeline and ship the code!
-url: /?p=1186
 ---
 
 My exciting news is that I was accepted onto the [data science accelerator](https://www.gov.uk/government/publications/data-science-accelerator-programme/introduction-to-the-data-science-accelerator) and have been doing it since late December. My project, basically, is all about using natural language processing to better understand the patient experience data that we collect (and, if I have time, the staff experience data too). Here are the goals:
 
-1\) Using an unsupervised technique, generate a novel way of categorising the text data to give us a different perspective. We already have tagged data but I would like to interrogate the usefulness of the tags that we have used  
-2\) a. Generate a system that, given a comment or set of comments, can find other comments within the data that are semantically similar. Note that this system will need to run live on the server, since it would be impossible to store the semantic similarity of every comment to every other comment  
-3\) b. Generate a system that, instead of searching by word, searches by similarity to that word  
-3\) Produce a supervised learning algorithm which can be trained on a sample of tagged comments and then produce tags for comments that it has not previously seen  
-4\) a. Produce a sentiment analysis function that can tag every comment in a database with how positive or negative it is  
-4\) b. Produce reporting functions that can compute overall sentiment for a group of documents (e.g. of a particular service area) and optionally describe the change in sentiment over time
+1) Using an unsupervised technique, generate a novel way of categorising the text data to give us a different perspective. We already have tagged data but I would like to interrogate the usefulness of the tags that we have used  
+    1) Generate a system that, given a comment or set of comments, can find other comments within the data that are semantically similar. Note that this system will need to run live on the server, since it would be impossible to store the semantic similarity of every comment to every other comment  
+    2) Generate a system that, instead of searching by word, searches by similarity to that word  
+3) Produce a supervised learning algorithm which can be trained on a sample of tagged comments and then produce tags for comments that it has not previously seen  
+    1) Produce a sentiment analysis function that can tag every comment in a database with how positive or negative it is  
+    2) Produce reporting functions that can compute overall sentiment for a group of documents (e.g. of a particular service area) and optionally describe the change in sentiment over time
 
 I’m not really sure if I’m going to get through all of it but I’ve made a decent start. I’ve made a [Trello board](https://trello.com/b/GlmtpsqB/data-science-accelerator) and there’s a [GitHub](https://github.com/ChrisBeeley/naturallanguageprocessing) too.
 

diff --git a/...t/post/2019-02-03-dplyr-function-to-replace-nested-ifelse-like-sql-case-when.md b/...t/post/2019-02-03-dplyr-function-to-replace-nested-ifelse-like-sql-case-when.md
@@ -3,10 +3,7 @@ author: chrisbeeley
 categories:
 - Uncategorized
 date: "2019-02-03T17:45:39Z"
-guid: https://chrisbeeley.net/?p=1188
-id: 1188
 title: Dplyr function to replace nested ifelse (like SQL CASE WHEN)
-url: /?p=1188
 ---
 
 I hate nested ifelse statements in R code. I absolutely hate them. They are just ugly and difficult to read. I found this lovely function in dplyr called case\_when (which is like the SQL CASE WHEN if you know that) that banished them forever. It’s easier if I just show you (this is from the documentation)

diff --git a/content/post/2019-02-17-citing-r-packages-in-rmarkdown.md b/content/post/2019-02-17-citing-r-packages-in-rmarkdown.md
@@ -3,10 +3,7 @@ author: chrisbeeley
 categories:
 - Uncategorized
 date: "2019-02-17T15:46:33Z"
-guid: https://chrisbeeley.net/?p=1190
-id: 1190
 title: Citing R packages in RMarkdown
-url: /?p=1190
 ---
 
 I haven’t really done much in the way of citing papers in the last couple of years, I’ve spent my time either messing around with Shiny servers or databases or writing RMarkdown reports- and being horribly ill, of course, haha!- see this blog, passim. Don’t worry, I’m as fit as a flea now 🙂

diff --git a/content/post/2019-03-03-a-world-of-plotthedots-and-what-else.md b/content/post/2019-03-03-a-world-of-plotthedots-and-what-else.md
@@ -3,10 +3,7 @@ author: chrisbeeley
 categories:
 - Uncategorized
 date: "2019-03-03T09:26:42Z"
-guid: https://chrisbeeley.net/?p=1195
-id: 1195
-title: 'A world of #plotthedots and&#8230; what else?'
-url: /?p=1195
+title: 'A world of #plotthedots and what else?'
 ---
 
 ![image](https://chrisbeeley.net/wp-content/uploads/2019/02/image.png)

diff --git a/content/post/2019-03-17-suppress-console-output-with-ggplot-purrr-and-rmarkdown.md b/content/post/2019-03-17-suppress-console-output-with-ggplot-purrr-and-rmarkdown.md
@@ -3,10 +3,7 @@ author: chrisbeeley
 categories:
 - Uncategorized
 date: "2019-03-17T18:27:09Z"
-guid: https://chrisbeeley.net/?p=1198
-id: 1198
 title: Suppress console output with ggplot, purrr, and RMarkdown
-url: /?p=1198
 ---
 
 So [I posted a while back](https://chrisbeeley.net/?p=1143) about producing several plots at once with RMarkdown and purrr and how to suppress the console output in the document.
@@ -18,8 +15,6 @@ For ggplot, you need to excellent function walk() which is like map() except it
 Bish bash bosh. Easy
 
 ```
-<pre class="brush: r; title: ; notranslate" title="">
-
 ```{r, message = FALSE, echo = FALSE}
 
 library(tidyverse)
@@ -32,5 +27,4 @@ walk(c("Plot1", "Plot 2", "Plot 3"), function(x) {
   print(p)
 })
 ```
-
 ```
diff --git a/content/post/2019-03-19-producing-rmarkdown-reports-with-plumber.md b/content/post/2019-03-19-producing-rmarkdown-reports-with-plumber.md
@@ -3,10 +3,7 @@ author: chrisbeeley
 categories:
 - Uncategorized
 date: "2019-03-19T18:56:03Z"
-guid: https://chrisbeeley.net/?p=1209
-id: 1209
 title: Producing RMarkdown reports with Plumber
-url: /?p=1209
 ---
 
 I wasn’t going to post this until I got it working on the server but I’ve got the wrong train ticket and am stuck in London St Pancras until 7pm so I thought I’d be productive and put it up now.

diff --git a/content/post/2019-03-31-adding-line-returns-in-rmarkdown-in-a-loop.md b/content/post/2019-03-31-adding-line-returns-in-rmarkdown-in-a-loop.md
@@ -3,10 +3,7 @@ author: chrisbeeley
 categories:
 - Uncategorized
 date: "2019-03-31T16:48:05Z"
-guid: https://chrisbeeley.net/?p=1212
-id: 1212
 title: Adding line returns in RMarkdown in a loop
-url: /?p=1212
 ---
 
 Another one that’s for me when I forget. The internet seems strangely reluctant to tell me how to do this, yet [here it is](https://stackoverflow.com/questions/49561077/creating-a-new-line-within-an-rmarkdown-chunk) buried in the answer to something else.

diff --git a/...19-04-05-the-future-of-patient-experience-data-at-nottinghamshire-healthcare.md b/...19-04-05-the-future-of-patient-experience-data-at-nottinghamshire-healthcare.md
@@ -3,10 +3,7 @@ author: chrisbeeley
 categories:
 - Uncategorized
 date: "2019-04-05T10:50:20Z"
-guid: https://chrisbeeley.net/?p=1214
-id: 1214
 title: The future of patient experience data at Nottinghamshire Healthcare
-url: /?p=1214
 ---
 
 I’ve talked about my plans for patient experience data in quite a few different forums recently, but it just occurred to me that it isn’t written down anywhere.

diff --git a/content/post/2019-05-25-a-note-for-kindle-readers-of-my-shiny-book.md b/content/post/2019-05-25-a-note-for-kindle-readers-of-my-shiny-book.md
@@ -3,10 +3,7 @@ author: chrisbeeley
 categories:
 - Uncategorized
 date: "2019-05-25T14:34:00Z"
-guid: https://chrisbeeley.net/?p=1216
-id: 1216
 title: A note for Kindle readers of my Shiny book
-url: /?p=1216
 ---
 
 Someone has been in touch with me to say that the Kindle version has no chapter numbers and they were finding it difficult to figure out which bit of the book went with which bit in the repository.

diff --git a/...ng-together-and-productionising-code-at-nottinghamshire-healthcare-nhs-trust.md b/...ng-together-and-productionising-code-at-nottinghamshire-healthcare-nhs-trust.md
@@ -3,11 +3,8 @@ author: chrisbeeley
 categories:
 - Uncategorized
 date: "2019-07-02T16:28:58Z"
-guid: https://chrisbeeley.net/?p=1218
-id: 1218
 title: Python, working together, and productionising code at Nottinghamshire Healthcare
   NHS Trust
-url: /?p=1218
 ---
 
 Someone just asked me a question on Twitter and I was most of the way through what would have been several tweets before I thought perhaps it would work better as a blog post. I’m going to answer the question first, and then talk some more about the context and what I’m hoping to achieve where I am and (with the advent of more collaborative working through the ICS) in the wider system. So I’ve been using scikit-learn to train a classifier to predict non attendance at healthcare appointments and the question to me was “What made you choose Python rather than R for this one? Which classifier are you using?”.

diff --git a/content/post/2019-08-08-the-analysts-manifesto.md b/content/post/2019-08-08-the-analysts-manifesto.md
@@ -3,10 +3,7 @@ author: chrisbeeley
 categories:
 - Uncategorized
 date: "2019-08-08T14:53:23Z"
-guid: https://chrisbeeley.net/?p=1222
-id: 1222
-title: The analysts&#8217; manifesto
-url: /?p=1222
+title: The analysts' manifesto
 ---
 
 I was at an event a little while ago and there was talk of change coming for healthcare analysts. With the advent of population health management we were going to finally get the recognition we as a profession deserve and would get the training and tools necessary to deliver the improvements we all know that data can give.

diff --git a/content/post/2019-10-31-add-label-to-shinydashboard.md b/content/post/2019-10-31-add-label-to-shinydashboard.md
@@ -3,10 +3,7 @@ author: chrisbeeley
 categories:
 - Uncategorized
 date: "2019-10-31T11:38:12Z"
-guid: http://chrisbeeley.net/?p=1230
-id: 1230
 title: Add label to shinydashboard
-url: /?p=1230
 ---
 
 I feel like I’m sticking my neck out a bit here, and there’s a simple way to doing this that I haven’t found, but I’ve looked pretty hard and “add label to sidebar shiny dashboard” has basically no Google juice at all, and I should know because I’ve been staring at it for half an hour.
@@ -15,10 +12,9 @@ Sometimes you want to add a simple, static label to a shinydashboard sidebar. If
 
 You can add something dynamic with sidebarMenuOutput, and you could do that as a long way round, but I got to thinking that there must be a simple way of doing it. I ended up looking at my shinydashboard in developer view in Chrome, and just stealing the markup from there. Once you’ve done that it’s very simple.
 
-<div class="wp-block-syntaxhighlighter-code ">```
-<pre class="brush: r; title: ; notranslate" title="">
+```
 div(class = "shiny-input-container", 
       p(paste0("Data updated: ", date_update))
 ```
 
-</div>This code just shows a pre-calculated value of the date when the data was updated. I stole this idea from a colleague because sometimes the cron job that updates the data chokes and nobody notices for a while.
+This code just shows a pre-calculated value of the date when the data was updated. I stole this idea from a colleague because sometimes the cron job that updates the data chokes and nobody notices for a while.
diff --git a/...11-06-authenticating-using-ldap-from-active-directory-using-shiny-server-pro.md b/...11-06-authenticating-using-ldap-from-active-directory-using-shiny-server-pro.md
@@ -3,10 +3,7 @@ author: chrisbeeley
 categories:
 - Uncategorized
 date: "2019-11-06T18:03:08Z"
-guid: http://chrisbeeley.net/?p=1242
-id: 1242
 title: Authenticating using LDAP from Active Directory using Shiny Server Pro
-url: /?p=1242
 ---
 
 Right, I promised I would write this a very long time ago and I still haven’t done it. I’ve just got back from the NHS-R community conference (see an upcoming post) and I’m in a sharing mood, and besides someone just asked me about this on Twitter.

diff --git a/content/post/2019-12-08-nhs-r-conference.md b/content/post/2019-12-08-nhs-r-conference.md
@@ -3,10 +3,7 @@ author: chrisbeeley
 categories:
 - Uncategorized
 date: "2019-12-08T12:36:07Z"
-guid: http://chrisbeeley.net/?p=1257
-id: 1257
 title: NHS-R conference
-url: /?p=1257
 ---
 
 So I recently just got back from the NHS-R community conference, which was amazing of course, and it’s got me in the mood to share, so I’m writing a few blog posts. I’ve got some more in depth stuff to say about where I think NHS-R is/ should be going, but this is the “feels” one.

diff --git a/content/post/2019-12-16-my-superpower-a-talk-i-gave-about-being-ill.md b/content/post/2019-12-16-my-superpower-a-talk-i-gave-about-being-ill.md
@@ -3,10 +3,7 @@ author: chrisbeeley
 categories:
 - Uncategorized
 date: "2019-12-16T09:25:00Z"
-guid: http://chrisbeeley.net/?p=1263
-id: 1263
 title: My superpower (a talk I gave about being ill)
-url: /?p=1263
 ---
 
 So this post is nothing to do with R, or Linux, or statistics, or any of the usual stuff. It’s about me. It’s more than possible that you’re not very interested in that so consider yourself warned.

diff --git a/content/post/2020-01-19-decade-round-up-post.md b/content/post/2020-01-19-decade-round-up-post.md
@@ -3,10 +3,7 @@ author: chrisbeeley
 categories:
 - Uncategorized
 date: "2020-01-19T09:58:21Z"
-guid: http://chrisbeeley.net/?p=1273
-id: 1273
 title: Decade round up post
-url: /?p=1273
 ---
 
 I’ve got into the habit of writing a yearly roundup blog post (see this blog, passim), based on a suggested framework by David Allen. Since this is the end of a decade I thought it would be fun to do one for the whole decade.

diff --git a/...post/2020-02-18-converting-words-on-a-survey-dataset-to-numbers-for-analysis.md b/...post/2020-02-18-converting-words-on-a-survey-dataset-to-numbers-for-analysis.md
@@ -3,10 +3,7 @@ author: chrisbeeley
 categories:
 - Uncategorized
 date: "2020-02-18T13:41:50Z"
-guid: http://chrisbeeley.net/?p=1287
-id: 1287
 title: Converting words on a survey dataset to numbers for analysis
-url: /?p=1287
 ---
 
 As always, I have very little time for blogging (sorry) but I just came up with a neat way of converting “Strongly Agree”, “Always”, all that stuff that you get on survey based datasets into numbers ready for analysis. It’s automatic, so it will play havoc with any word based questions- analyse them in a separate script.

diff --git a/content/post/2020-02-25-rapidly-find-the-mean-of-survey-questions.md b/content/post/2020-02-25-rapidly-find-the-mean-of-survey-questions.md
@@ -3,10 +3,7 @@ author: chrisbeeley
 categories:
 - Uncategorized
 date: "2020-02-25T14:18:00Z"
-guid: http://chrisbeeley.net/?p=1292
-id: 1292
 title: Rapidly find the mean of survey questions
-url: /?p=1292
 ---
 
 Following on from the last blog post, I’ve got quite a nice way of generating lots of means from a survey dataset. This one relies on the fact that I’m averaging questions that go 2.1, 2.2, 2.3, and 3.1, 3.2, 3.3, so I can look for all questions that start with “2.”, “3.”, etc.

diff --git a/content/post/2020-03-04-the-great-survey-munge.md b/content/post/2020-03-04-the-great-survey-munge.md
@@ -3,10 +3,7 @@ author: chrisbeeley
 categories:
 - Uncategorized
 date: "2020-03-04T13:51:00Z"
-guid: http://chrisbeeley.net/?p=1307
-id: 1307
 title: The Great Survey Munge
-url: /?p=1307
 ---
 
 As I mentioned on Twitter the other day, I have this rather ugly spreadsheet that comes from some online survey software that requires quite a lot of cleaning in order to upload it to the database. I had an old version written in base R but the survey has changed so I’ve updated it to tidyverse.

diff --git a/content/post/2020-03-23-nhs-data-science-and-software-licensing.md b/content/post/2020-03-23-nhs-data-science-and-software-licensing.md
@@ -3,10 +3,7 @@ author: chrisbeeley
 categories:
 - Uncategorized
 date: "2020-03-23T13:04:45Z"
-guid: http://chrisbeeley.net/?p=1336
-id: 1336
 title: NHS data science and software licensing
-url: /?p=1336
 ---
 
 I’m writing something about software licensing and IP in NHS data science projects at the moment. I don’t think I ever dreamed about doing this, but I’ve noticed that a lot of people working in data science and related fields are confused about some of the issues and I would like to produce a set of facts (and opinions) which are based on a thorough reading of the subject and share them with interested parties. It’s a big job but I thought I’d trail a bit of it here and there as I go. Here’s the summary at the end of the licences section.

diff --git a/content/post/2020-03-27-data-science-for-human-beings.md b/content/post/2020-03-27-data-science-for-human-beings.md
@@ -3,10 +3,7 @@ author: chrisbeeley
 categories:
 - Uncategorized
 date: "2020-03-27T10:46:57Z"
-guid: http://chrisbeeley.net/?p=1341
-id: 1341
 title: Data science for human beings
-url: /?p=1341
 ---
 
 Someone just emailed me to ask me about getting into data science. They knew all the usual stuff, linear algebra, Python, all that stuff, so I thought I’d talk about the other side of data science. It’s all stuff I say whenever I talk about data science, but I’ve never written it down so I thought I may as well blog it.

diff --git a/content/post/2020-04-03-app-r-and-global-r.md b/content/post/2020-04-03-app-r-and-global-r.md
@@ -4,11 +4,7 @@ categories:
 - Uncategorized
 date: "2020-04-03T18:14:49Z"
 guid: http://chrisbeeley.net/?p=1344
-id: 1344
-spay_email:
-- ""
 title: app.R and global.R
-url: /?p=1344
 ---
 
 I’m doing some Shiny training this year and I want to teach whatever the new thinking is so I’ve been reading Hadley Wickham’s online book [Mastering Shiny](https://mastering-shiny.org/). There’s a couple of things that I’ve noticed where Shiny is moving on, so if you want to keep up to date I suggest you have a look. I’m going to pick out a few here. Firstly, note that in Shiny 1.5 (which is not released at the time of writing) all code in the R/ directory will be sourced automatically. This is a very good idea, I’ve got loads of source(“useful\_code.R”, local = TRUE) lines in some of my applications, so it gets rid of all that.

diff --git a/content/post/2020-05-26-productionising-r-at-nottinghamshire-healthcare.md b/content/post/2020-05-26-productionising-r-at-nottinghamshire-healthcare.md
@@ -3,10 +3,7 @@ author: chrisbeeley
 categories:
 - Uncategorized
 date: "2020-05-26T17:18:50Z"
-guid: http://chrisbeeley.net/?p=1366
-id: 1366
 title: Productionising R at Nottinghamshire Healthcare
-url: /?p=1366
 ---
 
 I’m hopeful we’re moving into a bit of a new phase with using R in my Trust so I thought I’d outline the direction of travel, to see if it chimes with anyone else and just to keep people up to date about what we’re doing.

diff --git a/content/post/2020-06-10-rstudio-connect-behind-the-firewall.md b/content/post/2020-06-10-rstudio-connect-behind-the-firewall.md
@@ -3,10 +3,7 @@ author: chrisbeeley
 categories:
 - Uncategorized
 date: "2020-06-10T16:18:43Z"
-guid: http://chrisbeeley.net/?p=1388
-id: 1388
 title: RStudio Connect behind the firewall
-url: /?p=1388
 ---
 
 This is part II of what would otherwise have been a far-too-long post about configuring RStudio Connect. A bit of back story, particularly for those of you who might have hit this from a Google search (which does happen, JetPack tells me) and don’t know who the heck I am and what I do all day. Here’s what I said in part I:

diff --git a/content/post/2020-06-10-rstudio-connect-in-the-cloud.md b/content/post/2020-06-10-rstudio-connect-in-the-cloud.md
@@ -3,10 +3,7 @@ author: chrisbeeley
 categories:
 - Uncategorized
 date: "2020-06-10T16:09:06Z"
-guid: http://chrisbeeley.net/?p=1380
-id: 1380
 title: RStudio Connect in the cloud
-url: /?p=1380
 ---
 
 I’ve been using RStudio stuff on the server for a long time. I started using Shiny community edition back in 2013 for an application that is totally open and so doesn’t need authenticating. Then two years ago I started deploying Shiny applications that people authenticated to behind our Trust firewall using Shiny Pro. I have wanted to use RStudio Connect for a long time but it was hard to get the funding together for it given how things are with austerity since the banking crisis.