Update README.md
bethanyhgardner authored Feb 10, 2024
1 parent 1743780 commit 44eabb0
Showing 1 changed file with 89 additions and 2 deletions.

# Lingthusiasm Vowel Plots

## About

This repository includes the data, code, and [tutorial website](https://bethanyhgardner.github.io/lingthusiasm-vowel-plots/) accompanying Lingthusiasm's bonus episode about vowel plots (coming in March 2024!).

If you use this tutorial to make your own plots, I’d love to see them! If you have questions about this material, feel free to get in touch with me by posting on the discussion page or sending me an email. To see if I’m currently taking freelance contracts for data visualizations or other related tasks, send me an email (this GitHub username @ gmail.com).

## Includes
```
├── 1_find_words.qmd
├── 2_annotate_audio.qmd
├── 3_plot_vowels.qmd
```

The code for the tutorials, written in Python (`1_find_words.qmd` and `2_annotate_audio.qmd`) and R (`3_plot_vowels.qmd`) and rendered using [Quarto](https://quarto.org/).

```
├── audio
└── └──words
```

The audio data. Although `1_find_words.qmd` downloads all of the episodes from YouTube and `2_annotate_audio.qmd` refers to recordings of the Wells Lexical Set that Gretchen and Lauren made for me, only the .wav files that trim out individual words, plus the .TextGrid files annotating the vowel location in each word, are tracked in this repository.
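
As a rough illustration of what "trimming out" a word involves, here is a minimal sketch that copies a time span from one .wav file into another using only Python's standard `wave` module. The function name `trim_wav` and its parameters are hypothetical and are not taken from the tutorial code, which may do this differently.

```python
import wave


def trim_wav(src_path, dst_path, start_s, end_s):
    """Copy the audio between start_s and end_s (in seconds) to a new .wav file.

    Hypothetical helper for illustration; returns the number of frames written.
    """
    with wave.open(src_path, "rb") as src:
        params = src.getparams()
        rate = src.getframerate()
        total = src.getnframes()
        # Clamp both ends so out-of-range timestamps do not raise an error.
        start_frame = min(max(0, int(start_s * rate)), total)
        end_frame = min(max(start_frame, int(end_s * rate)), total)
        src.setpos(start_frame)
        frames = src.readframes(end_frame - start_frame)

    with wave.open(dst_path, "wb") as dst:
        # Reuse the source format (channels, sample width, frame rate).
        # The frame count in the header is corrected when the file is closed.
        dst.setparams(params)
        dst.writeframes(frames)

    return end_frame - start_frame
```

Given a word's start and end time from the timestamp data, a call such as `trim_wav("episode.wav", "word.wav", 12.3, 12.9)` would write just that word to its own file (the file names here are placeholders).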

```
├── data
│ ├── captions.csv
│ ├── formants.csv
│ ├── timestamps_all.csv
│ ├── timestamps_annotate.csv
└── └──transcripts.csv
```

Data files from the various stages of finding vowels in the Lingthusiasm episodes, then annotating those words and the Wells Lexical Set recordings:

* Downloading the episode transcripts from the Lingthusiasm website (`transcripts.csv`).
* Downloading the captions—which don't always have speaker labels and aren't always proofread, but do have timestamps—from YouTube (`captions.csv`).
* Finding target words in the transcript data and matching them to timestamps in the caption data (`timestamps_all.csv` and `timestamps_annotate.csv`).
* Extracting F1 and F2 after annotating the location of the vowel in each word in Praat (`formants.csv`). This is the data used in the plots.
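
To give a sense of how a table of F1 and F2 measurements can be summarized before plotting, here is a minimal sketch that averages formants per speaker and vowel. The column names (`speaker`, `vowel`, `f1`, `f2`) and the sample rows are assumptions made up for this example; check the header of `formants.csv` for the real names and values.

```python
import csv
import io
from collections import defaultdict
from statistics import mean

# Made-up rows for illustration only. The real formants.csv may use
# different column names and will contain different values.
SAMPLE = """speaker,vowel,f1,f2
A,FLEECE,300,2300
A,FLEECE,320,2250
A,TRAP,800,1700
B,FLEECE,280,2400
"""


def mean_formants(rows):
    """Average F1 and F2 for each (speaker, vowel) pair."""
    groups = defaultdict(list)
    for row in rows:
        key = (row["speaker"], row["vowel"])
        groups[key].append((float(row["f1"]), float(row["f2"])))
    return {
        key: (mean(f1 for f1, _ in values), mean(f2 for _, f2 in values))
        for key, values in groups.items()
    }


means = mean_formants(csv.DictReader(io.StringIO(SAMPLE)))
for (speaker, vowel), (f1, f2) in sorted(means.items()):
    print(speaker, vowel, f1, f2)
```

To run this on the real data, the `io.StringIO(SAMPLE)` argument could be swapped for an open file handle on `data/formants.csv`, with the column names adjusted to match.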

```
├── plots
│ ├── 1_means_original.png
│ ├── 2_means_flipped.png
│ ├── 3_individual_points.png
│ ├── 4_words_episodes.png
│ ├── 4_words_lexical_set.png
│ ├── 5_ellipses.png
│ ├── gretchen_vowels_ep.png
│ ├── gretchen_words_ls.png
│ ├── lauren_vowels_ep.png
│ ├── lauren_words_ls.png
└── └── paired_vowels_ep.png
```

PNG files for all of the plots created in `3_plot_vowels.qmd`.

```
├── docs
├── index.qmd
└── _quarto.yml
```

Website files.


```
├── _environment.yml
```

Python environment info (using conda).

```
├── renv.lock
```

R environment info (using {renv}).

```
├── resources
│ ├── ipa_chart.png
│ ├── lingthusiasm_logo_circle.png
│ ├── lingthusiasm_logo_tagline.png
│ ├── praat_screenshot.png
│ ├── theme.css
└── └── wells_lexical_set.jpg
```

Images used in the tutorial, Lingthusiasm logos included in the plots, and a CSS theme edited to make the website use the Lingthusiasm green color.
