Commit

Finish strategy post
ChrisBeeley committed Jun 18, 2024
1 parent b7f6fe4 commit 43c376a
Showing 277 changed files with 9,967 additions and 3,279 deletions.
@@ -3,20 +3,17 @@ author: chrisbeeley
categories:
- Uncategorized
date: "2019-01-20T17:41:39Z"
guid: https://chrisbeeley.net/?p=1186
id: 1186
title: Data science accelerator lesson one- build a pipeline and ship the code!
url: /?p=1186
---

My exciting news is that I was accepted onto the [data science accelerator](https://www.gov.uk/government/publications/data-science-accelerator-programme/introduction-to-the-data-science-accelerator) and have been doing it since late December. My project, basically, is all about using natural language processing to better understand the patient experience data that we collect (and, if I have time, the staff experience data too). Here are the goals:

1\) Using an unsupervised technique, generate a novel way of categorising the text data to give us a different perspective. We already have tagged data but I would like to interrogate the usefulness of the tags that we have used
2\) a. Generate a system that, given a comment or set of comments, can find other comments within the data that are semantically similar. Note that this system will need to run live on the server, since it would be impossible to store the semantic similarity of every comment to every other comment
3\) b. Generate a system that, instead of searching by word, searches by similarity to that word
3\) Produce a supervised learning algorithm which can be trained on a sample of tagged comments and then produce tags for comments that it has not previously seen
4\) a. Produce a sentiment analysis function that can tag every comment in a database with how positive or negative it is
4\) b. Produce reporting functions that can compute overall sentiment for a group of documents (e.g. of a particular service area) and optionally describe the change in sentiment over time
1) Using an unsupervised technique, generate a novel way of categorising the text data to give us a different perspective. We already have tagged data but I would like to interrogate the usefulness of the tags that we have used
1) Generate a system that, given a comment or set of comments, can find other comments within the data that are semantically similar. Note that this system will need to run live on the server, since it would be impossible to store the semantic similarity of every comment to every other comment
2) Generate a system that, instead of searching by word, searches by similarity to that word
3) Produce a supervised learning algorithm which can be trained on a sample of tagged comments and then produce tags for comments that it has not previously seen
1) Produce a sentiment analysis function that can tag every comment in a database with how positive or negative it is
2) Produce reporting functions that can compute overall sentiment for a group of documents (e.g. of a particular service area) and optionally describe the change in sentiment over time

I’m not really sure if I’m going to get through all of it but I’ve made a decent start. I’ve made a [Trello board](https://trello.com/b/GlmtpsqB/data-science-accelerator) and there’s a [GitHub](https://github.com/ChrisBeeley/naturallanguageprocessing) too.

@@ -3,10 +3,7 @@ author: chrisbeeley
categories:
- Uncategorized
date: "2019-02-03T17:45:39Z"
guid: https://chrisbeeley.net/?p=1188
id: 1188
title: Dplyr function to replace nested ifelse (like SQL CASE WHEN)
url: /?p=1188
---

I hate nested ifelse statements in R code. I absolutely hate them. They are just ugly and difficult to read. I found this lovely function in dplyr called case\_when (which is like the SQL CASE WHEN if you know that) that banished them forever. It’s easier if I just show you (this is from the documentation)
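
As a minimal sketch of the difference (this is not the documentation example the post goes on to quote; the `scores` vector and the band labels are made up for illustration), the same banding can be written both ways:

```r
library(dplyr)

# Hypothetical data: scores to be sorted into bands
scores <- c(1, 3, 5)

# Nested ifelse: each extra band adds another level of nesting
nested <- ifelse(scores <= 2, "low",
                 ifelse(scores <= 4, "medium", "high"))

# case_when: conditions are checked top to bottom, first match wins
banded <- case_when(
  scores <= 2 ~ "low",
  scores <= 4 ~ "medium",
  TRUE        ~ "high"
)
```

Both versions give the same character vector; the `case_when()` one reads as a flat list of rules, much like a SQL `CASE WHEN` block.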
3 changes: 0 additions & 3 deletions content/post/2019-02-17-citing-r-packages-in-rmarkdown.md
@@ -3,10 +3,7 @@ author: chrisbeeley
categories:
- Uncategorized
date: "2019-02-17T15:46:33Z"
guid: https://chrisbeeley.net/?p=1190
id: 1190
title: Citing R packages in RMarkdown
url: /?p=1190
---

I haven’t really done much in the way of citing papers in the last couple of years, I’ve spent my time either messing around with Shiny servers or databases or writing RMarkdown reports- and being horribly ill, of course, haha!- see this blog, passim. Don’t worry, I’m as fit as a flea now 🙂
@@ -3,10 +3,7 @@ author: chrisbeeley
categories:
- Uncategorized
date: "2019-03-03T09:26:42Z"
guid: https://chrisbeeley.net/?p=1195
id: 1195
title: 'A world of #plotthedots and… what else?'
url: /?p=1195
title: 'A world of #plotthedots and what else?'
---

![image](https://chrisbeeley.net/wp-content/uploads/2019/02/image.png)
@@ -3,10 +3,7 @@ author: chrisbeeley
categories:
- Uncategorized
date: "2019-03-17T18:27:09Z"
guid: https://chrisbeeley.net/?p=1198
id: 1198
title: Suppress console output with ggplot, purrr, and RMarkdown
url: /?p=1198
---

So [I posted a while back](https://chrisbeeley.net/?p=1143) about producing several plots at once with RMarkdown and purrr and how to suppress the console output in the document.
@@ -18,8 +15,6 @@ For ggplot, you need to excellent function walk() which is like map() except it
Bish bash bosh. Easy

```
<pre class="brush: r; title: ; notranslate" title="">
```{r, message = FALSE, echo = FALSE}
library(tidyverse)
@@ -32,5 +27,4 @@ walk(c("Plot1", "Plot 2", "Plot 3"), function(x) {
print(p)
})
```

```
@@ -3,10 +3,7 @@ author: chrisbeeley
categories:
- Uncategorized
date: "2019-03-19T18:56:03Z"
guid: https://chrisbeeley.net/?p=1209
id: 1209
title: Producing RMarkdown reports with Plumber
url: /?p=1209
---

I wasn’t going to post this until I got it working on the server but I’ve got the wrong train ticket and am stuck in London St Pancras until 7pm so I thought I’d be productive and put it up now.
@@ -3,10 +3,7 @@ author: chrisbeeley
categories:
- Uncategorized
date: "2019-03-31T16:48:05Z"
guid: https://chrisbeeley.net/?p=1212
id: 1212
title: Adding line returns in RMarkdown in a loop
url: /?p=1212
---

Another one that’s for me when I forget. The internet seems strangely reluctant to tell me how to do this, yet [here it is](https://stackoverflow.com/questions/49561077/creating-a-new-line-within-an-rmarkdown-chunk) buried in the answer to something else.
@@ -3,10 +3,7 @@ author: chrisbeeley
categories:
- Uncategorized
date: "2019-04-05T10:50:20Z"
guid: https://chrisbeeley.net/?p=1214
id: 1214
title: The future of patient experience data at Nottinghamshire Healthcare
url: /?p=1214
---

I’ve talked about my plans for patient experience data in quite a few different forums recently, but it just occurred to me that it isn’t written down anywhere.
@@ -3,10 +3,7 @@ author: chrisbeeley
categories:
- Uncategorized
date: "2019-05-25T14:34:00Z"
guid: https://chrisbeeley.net/?p=1216
id: 1216
title: A note for Kindle readers of my Shiny book
url: /?p=1216
---

Someone has been in touch with me to say that the Kindle version has no chapter numbers and they were finding it difficult to figure out which bit of the book went with which bit in the repository.
@@ -3,11 +3,8 @@ author: chrisbeeley
categories:
- Uncategorized
date: "2019-07-02T16:28:58Z"
guid: https://chrisbeeley.net/?p=1218
id: 1218
title: Python, working together, and productionising code at Nottinghamshire Healthcare
NHS Trust
url: /?p=1218
---

Someone just asked me a question on Twitter and I was most of the way through what would have been several tweets before I thought perhaps it would work better as a blog post. I’m going to answer the question first, and then talk some more about the context and what I’m hoping to achieve where I am and (with the advent of more collaborative working through the ICS) in the wider system. So I’ve been using scikit-learn to train a classifier to predict non attendance at healthcare appointments and the question to me was “What made you choose Python rather than R for this one? Which classifier are you using?”.
5 changes: 1 addition & 4 deletions content/post/2019-08-08-the-analysts-manifesto.md
@@ -3,10 +3,7 @@ author: chrisbeeley
categories:
- Uncategorized
date: "2019-08-08T14:53:23Z"
guid: https://chrisbeeley.net/?p=1222
id: 1222
title: The analysts&#8217; manifesto
url: /?p=1222
title: The analysts' manifesto
---

I was at an event a little while ago and there was talk of change coming for healthcare analysts. With the advent of population health management we were going to finally get the recognition we as a profession deserve and would get the training and tools necessary to deliver the improvements we all know that data can give.
8 changes: 2 additions & 6 deletions content/post/2019-10-31-add-label-to-shinydashboard.md
@@ -3,10 +3,7 @@ author: chrisbeeley
categories:
- Uncategorized
date: "2019-10-31T11:38:12Z"
guid: http://chrisbeeley.net/?p=1230
id: 1230
title: Add label to shinydashboard
url: /?p=1230
---

I feel like I’m sticking my neck out a bit here, and there may be a simple way of doing this that I haven’t found, but I’ve looked pretty hard and “add label to sidebar shiny dashboard” has basically no Google juice at all, and I should know because I’ve been staring at it for half an hour.
@@ -15,10 +12,9 @@ Sometimes you want to add a simple, static label to a shinydashboard sidebar. If

You can add something dynamic with sidebarMenuOutput, and you could do that as a long way round, but I got to thinking that there must be a simple way of doing it. I ended up looking at my shinydashboard in developer view in Chrome, and just stealing the markup from there. Once you’ve done that it’s very simple.

<div class="wp-block-syntaxhighlighter-code ">```
<pre class="brush: r; title: ; notranslate" title="">
```
div(class = "shiny-input-container",
p(paste0("Data updated: ", date_update)))
```

</div>This code just shows a pre-calculated value of the date when the data was updated. I stole this idea from a colleague because sometimes the cron job that updates the data chokes and nobody notices for a while.
This code just shows a pre-calculated value of the date when the data was updated. I stole this idea from a colleague because sometimes the cron job that updates the data chokes and nobody notices for a while.
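
For context, here is a rough sketch of where such a label might sit in a complete dashboard. The menu item, the dashboard title and the way `date_update` is computed are all assumptions for illustration; only the `div()`/`p()` markup comes from the post.

```r
library(shiny)
library(shinydashboard)

# Hypothetical value; in a real app this would be set when the data is refreshed
date_update <- format(Sys.Date(), "%d %B %Y")

sidebar <- dashboardSidebar(
  sidebarMenu(
    menuItem("Summary", tabName = "summary")  # placeholder menu entry
  ),
  # Static label, reusing the class Shiny gives its input containers
  div(class = "shiny-input-container",
      p(paste0("Data updated: ", date_update))
  )
)

ui <- dashboardPage(
  dashboardHeader(title = "Label demo"),
  sidebar,
  dashboardBody()
)

server <- function(input, output, session) {}

# shinyApp(ui, server)  # uncomment to run interactively
```

Because the label is plain markup rather than an output, nothing is needed on the server side to display it.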
@@ -3,10 +3,7 @@ author: chrisbeeley
categories:
- Uncategorized
date: "2019-11-06T18:03:08Z"
guid: http://chrisbeeley.net/?p=1242
id: 1242
title: Authenticating using LDAP from Active Directory using Shiny Server Pro
url: /?p=1242
---

Right, I promised I would write this a very long time ago and I still haven’t done it. I’ve just got back from the NHS-R community conference (see an upcoming post) and I’m in a sharing mood, and besides someone just asked me about this on Twitter.
3 changes: 0 additions & 3 deletions content/post/2019-12-08-nhs-r-conference.md
@@ -3,10 +3,7 @@ author: chrisbeeley
categories:
- Uncategorized
date: "2019-12-08T12:36:07Z"
guid: http://chrisbeeley.net/?p=1257
id: 1257
title: NHS-R conference
url: /?p=1257
---

So I recently just got back from the NHS-R community conference, which was amazing of course, and it’s got me in the mood to share, so I’m writing a few blog posts. I’ve got some more in depth stuff to say about where I think NHS-R is/ should be going, but this is the “feels” one.
@@ -3,10 +3,7 @@ author: chrisbeeley
categories:
- Uncategorized
date: "2019-12-16T09:25:00Z"
guid: http://chrisbeeley.net/?p=1263
id: 1263
title: My superpower (a talk I gave about being ill)
url: /?p=1263
---

So this post is nothing to do with R, or Linux, or statistics, or any of the usual stuff. It’s about me. It’s more than possible that you’re not very interested in that so consider yourself warned.
3 changes: 0 additions & 3 deletions content/post/2020-01-19-decade-round-up-post.md
@@ -3,10 +3,7 @@ author: chrisbeeley
categories:
- Uncategorized
date: "2020-01-19T09:58:21Z"
guid: http://chrisbeeley.net/?p=1273
id: 1273
title: Decade round up post
url: /?p=1273
---

I’ve got into the habit of writing a yearly roundup blog post (see this blog, passim), based on a suggested framework by David Allen. Since this is the end of a decade I thought it would be fun to do one for the whole decade.
@@ -3,10 +3,7 @@ author: chrisbeeley
categories:
- Uncategorized
date: "2020-02-18T13:41:50Z"
guid: http://chrisbeeley.net/?p=1287
id: 1287
title: Converting words on a survey dataset to numbers for analysis
url: /?p=1287
---

As always, I have very little time for blogging (sorry) but I just came up with a neat way of converting “Strongly Agree”, “Always”, all that stuff that you get on survey based datasets into numbers ready for analysis. It’s automatic, so it will play havoc with any word based questions- analyse them in a separate script.
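
The post's own method is in the part not shown in this diff. Purely as an illustration of the general idea, and not the author's code, one way to do an automatic conversion is a named lookup vector applied to every character column. The lookup values, column names and data below are all made up.

```r
library(dplyr)

# Hypothetical lookup from response wording to score
lookup <- c(
  "Strongly Disagree" = 1, "Disagree" = 2, "Neither" = 3,
  "Agree" = 4, "Strongly Agree" = 5,
  "Never" = 1, "Sometimes" = 3, "Always" = 5
)

# Made-up survey data, including one free-text column
survey <- tibble(
  q1 = c("Agree", "Strongly Agree"),
  q2 = c("Always", "Never"),
  comments = c("Great service", "Too slow")
)

# Convert every character column; anything not in the lookup becomes NA
numeric_survey <- survey %>%
  mutate(across(where(is.character), ~ unname(lookup[.x])))
```

The `comments` column comes out as all `NA`, which is the kind of havoc with word-based questions that the post warns about.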
@@ -3,10 +3,7 @@ author: chrisbeeley
categories:
- Uncategorized
date: "2020-02-25T14:18:00Z"
guid: http://chrisbeeley.net/?p=1292
id: 1292
title: Rapidly find the mean of survey questions
url: /?p=1292
---

Following on from the last blog post, I’ve got quite a nice way of generating lots of means from a survey dataset. This one relies on the fact that I’m averaging questions that go 2.1, 2.2, 2.3, and 3.1, 3.2, 3.3, so I can look for all questions that start with “2.”, “3.”, etc.
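
A minimal sketch of the prefix idea, assuming the question columns are literally named "2.1", "2.2" and so on. The data and the `mean_2`/`mean_3` names are made up, and the post's actual code is not shown in this diff.

```r
library(dplyr)

# Made-up survey data with question numbers as column names
survey <- tibble(
  `2.1` = c(1, 2), `2.2` = c(3, 4), `2.3` = c(5, 6),
  `3.1` = c(2, 2), `3.2` = c(4, 4)
)

# starts_with() matches a literal prefix, so "2." picks up 2.1, 2.2, 2.3
survey$mean_2 <- rowMeans(select(survey, starts_with("2.")), na.rm = TRUE)
survey$mean_3 <- rowMeans(select(survey, starts_with("3.")), na.rm = TRUE)
```

Each new column holds the per-respondent mean for one section of the survey.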
3 changes: 0 additions & 3 deletions content/post/2020-03-04-the-great-survey-munge.md
@@ -3,10 +3,7 @@ author: chrisbeeley
categories:
- Uncategorized
date: "2020-03-04T13:51:00Z"
guid: http://chrisbeeley.net/?p=1307
id: 1307
title: The Great Survey Munge
url: /?p=1307
---

As I mentioned on Twitter the other day, I have this rather ugly spreadsheet that comes from some online survey software that requires quite a lot of cleaning in order to upload it to the database. I had an old version written in base R but the survey has changed so I’ve updated it to tidyverse.
@@ -3,10 +3,7 @@ author: chrisbeeley
categories:
- Uncategorized
date: "2020-03-23T13:04:45Z"
guid: http://chrisbeeley.net/?p=1336
id: 1336
title: NHS data science and software licensing
url: /?p=1336
---

I’m writing something about software licensing and IP in NHS data science projects at the moment. I don’t think I ever dreamed about doing this, but I’ve noticed that a lot of people working in data science and related fields are confused about some of the issues and I would like to produce a set of facts (and opinions) which are based on a thorough reading of the subject and share them with interested parties. It’s a big job but I thought I’d trail a bit of it here and there as I go. Here’s the summary at the end of the licences section.
3 changes: 0 additions & 3 deletions content/post/2020-03-27-data-science-for-human-beings.md
@@ -3,10 +3,7 @@ author: chrisbeeley
categories:
- Uncategorized
date: "2020-03-27T10:46:57Z"
guid: http://chrisbeeley.net/?p=1341
id: 1341
title: Data science for human beings
url: /?p=1341
---

Someone just emailed me to ask me about getting into data science. They knew all the usual stuff, linear algebra, Python, all that stuff, so I thought I’d talk about the other side of data science. It’s all stuff I say whenever I talk about data science, but I’ve never written it down so I thought I may as well blog it.
4 changes: 0 additions & 4 deletions content/post/2020-04-03-app-r-and-global-r.md
@@ -4,11 +4,7 @@ categories:
- Uncategorized
date: "2020-04-03T18:14:49Z"
guid: http://chrisbeeley.net/?p=1344
id: 1344
spay_email:
- ""
title: app.R and global.R
url: /?p=1344
---

I’m doing some Shiny training this year and I want to teach whatever the new thinking is so I’ve been reading Hadley Wickham’s online book [Mastering Shiny](https://mastering-shiny.org/). There’s a couple of things that I’ve noticed where Shiny is moving on, so if you want to keep up to date I suggest you have a look. I’m going to pick out a few here. Firstly, note that in Shiny 1.5 (which is not released at the time of writing) all code in the R/ directory will be sourced automatically. This is a very good idea, I’ve got loads of source(“useful\_code.R”, local = TRUE) lines in some of my applications, so it gets rid of all that.
@@ -3,10 +3,7 @@ author: chrisbeeley
categories:
- Uncategorized
date: "2020-05-26T17:18:50Z"
guid: http://chrisbeeley.net/?p=1366
id: 1366
title: Productionising R at Nottinghamshire Healthcare
url: /?p=1366
---

I’m hopeful we’re moving into a bit of a new phase with using R in my Trust so I thought I’d outline the direction of travel, to see if it chimes with anyone else and just to keep people up to date about what we’re doing.
@@ -3,10 +3,7 @@ author: chrisbeeley
categories:
- Uncategorized
date: "2020-06-10T16:18:43Z"
guid: http://chrisbeeley.net/?p=1388
id: 1388
title: RStudio Connect behind the firewall
url: /?p=1388
---

This is part II of what would otherwise have been a far-too-long post about configuring RStudio Connect. A bit of back story, particularly for those of you who might have hit this from a Google search (which does happen, JetPack tells me) and don’t know who the heck I am and what I do all day. Here’s what I said in part I:
3 changes: 0 additions & 3 deletions content/post/2020-06-10-rstudio-connect-in-the-cloud.md
@@ -3,10 +3,7 @@ author: chrisbeeley
categories:
- Uncategorized
date: "2020-06-10T16:09:06Z"
guid: http://chrisbeeley.net/?p=1380
id: 1380
title: RStudio Connect in the cloud
url: /?p=1380
---

I’ve been using RStudio stuff on the server for a long time. I started using Shiny community edition back in 2013 for an application that is totally open and so doesn’t need authenticating. Then two years ago I started deploying Shiny applications that people authenticated to behind our Trust firewall using Shiny Pro. I have wanted to use RStudio Connect for a long time but it was hard to get the funding together for it given how things are with austerity since the banking crisis.