code debugging
sammo3182 committed Aug 14, 2023
1 parent 124f7b3 commit f2c914e
Showing 3 changed files with 75 additions and 57 deletions.
94 changes: 47 additions & 47 deletions .Rhistory
@@ -1,50 +1,3 @@
year_from <- ls_index[1]
year_to <- ls_index[2]
}
# Convert the input to a data.frame for later merging
df_input <- data_input %>% as.data.frame()
names(df_input) <- ls_index[1]
data_output <-
select(region_table, !!ls_index) %>%
distinct() %>%
left_join(df_input, .) %>%
# using left_join to keep the order of the input data
pull(!!year_to)
# Because '2pinyin' can not be used as a variable name
if (topinyin) {
if (is.character(data_output)) {
data_output <-
py(
char = data_output,
dic = pydic(convert_to = "toneless", dic = "pinyin2")
)
}
}
return(data_output)
}
# original geocodes. It's 2019 version
corruption$prefecture_id
knitr::opts_chunk$set(message = FALSE, warning = FALSE)
library(regioncode)
data("corruption")
# original geocodes. It's 2019 version
corruption$prefecture_id
corruption <- sample_n(na.exclude(corruption), 10)
corruption <- dplyr::sample_n(na.exclude(corruption), 10)
View(corruption)
set.seed(313)
corruption <- dplyr::sample_n(na.exclude(corruption), 10)
View(corruption)
set.seed(313)
corruption <- dplyr::sample_n(na.exclude(corruption), 10) %>%
select(province:countyid)
library(dplry)
library(dplyr)
set.seed(313)
corruption <- dplyr::sample_n(na.exclude(corruption), 10) %>%
select(province:countyid)
set.seed(313)
corruption <- dplyr::sample_n(na.exclude(corruption), 10) %>%
select(province:county_id)
View(corruption)
save(corruption, "../data/corruption.rda")
@@ -510,3 +463,50 @@ View(region_table)
names(region_data)
names(region_table)
load("D:/Seafile/WW_research/01_Research/R_package/regioncode/R/sysdata.rda")
library(regioncode)
load("D:/Seafile/WW_research/01_Research/R_package/regioncode/R/sysdata.rda")
names(region_data)
save(corruption, region_data, file = here::here("R", "sysdata.rda"))
load("D:/Seafile/WW_research/01_Research/R_package/regioncode/R/sysdata.rda")
knitr::opts_chunk$set(message = FALSE, warning = FALSE)
if(!require(regioncode)) install.packages("regioncode")
library(regioncode)
data("corruption")
# Original 2019 version
corruption$prefecture_id
# 1999 version
regioncode(data_input = corruption$prefecture_id,
convert_to = "code", # default set
year_from = 2019,
year_to = 1989)
# The original name
corruption$prefecture
# Codes to name
regioncode(data_input = corruption$prefecture_id,
convert_to = "name",
year_from = 2019,
year_to = 1989)
# Name to codes of the same year
regioncode(data_input = corruption$prefecture,
convert_to = "code",
year_from = 2019,
year_to = 2019)
# Name to name of a different year
regioncode(data_input = corruption$prefecture,
convert_to = "name",
year_from = 2019,
year_to = 1989)
tibble(
preference = corruption$prefecture,
rank = regioncode(data_input = corruption$prefecture,
year_from = 2011,
year_to = 1989,
convert_to="rank")
)
tidyr::tibble(
preference = corruption$prefecture,
rank = regioncode(data_input = corruption$prefecture,
year_from = 2011,
year_to = 1989,
convert_to="rank")
)
1 change: 1 addition & 0 deletions .Rproj.user/shared/notebooks/paths
@@ -1,3 +1,4 @@
D:/Seafile/WW_research/01_Research/R_package/regioncode/DESCRIPTION="F2A6D482"
D:/Seafile/WW_research/01_Research/R_package/regioncode/R/corruption.R="A63D693B"
D:/Seafile/WW_research/01_Research/R_package/regioncode/R/globals.R="8687F590"
D:/Seafile/WW_research/01_Research/R_package/regioncode/R/regioncode.R="4D95B977"
37 changes: 27 additions & 10 deletions vignettes/regioncode-vignette.Rmd
@@ -23,9 +23,10 @@ editor_options:
knitr::opts_chunk$set(message = FALSE, warning = FALSE)
if(!require(regioncode)) install.packages("regioncode")
library(tidyverse)
```

Inspired by Vincent Arel-Bundock's amazing [`countrycode`](https://joss.theoj.org/papers/10.21105/joss.00848) package, we created `regioncode`, a package to achieve similar functions but specifically within China.
Inspired by Vincent Arel-Bundock's [`countrycode`](https://joss.theoj.org/papers/10.21105/joss.00848) package, we created `regioncode`, a package to achieve similar functions but specifically within China.
`regioncode` aims to enable seamless conversion among regions' formal names, commonly used names, and administrative division codes in modern China (1986--2019).

## Why `regioncode`?
@@ -47,15 +48,19 @@ To install:

# Basic Usage

We use a randomly drawn sample of Yuhua Wang's [`China’s Corruption Investigations Dataset`](https://dataverse.harvard.edu/dataset.xhtml?persistentId=doi:10.7910/DVN/9QZRAD) to illustrate how the package works.
In the data, the division codes were recorded with the 2019 version, and we added prefectural abbreviations for the sake of illustration.
We use a random sample from the [`China's Corruption Investigations Dataset`](https://dataverse.harvard.edu/dataset.xhtml?persistentId=doi:10.7910/DVN/9QZRAD) to illustrate how the functions work.

In the `regioncode` package, we name administrative division codes `code`, regions' formal names `name`, and their commonly used abbreviations `sname`.

In the `regioncode` package, we name administrative division codes `code`, regions' formal names `name`, and their commonly used abbreviations ("short names") `sname`.
The current version enables mutual conversion between any pair of them.
To do so, users just need to pass a character vector of names or a numeric vector of geocodes into the function.
In the current version, the function can produce three types of output at both the prefectural and provincial levels: codes (`code`), names (`name`) and area (`area`, such as 华北, 东北, 华南, etc.).
One just needs to specify which type of the output they want in the argument `convert_to` and corresponding years of the input and output.
For example, the following codes convert the 2019 geocodes in the `corruption` data to their 1989 version:
To do so, users just need to pass a character vector of names or a numeric vector of geocodes into the function and specify the desired type of output in the argument `convert_to`.
In the current version, three types of output are available:

1. Geocodes (`code`)
1. Names of the given cities/provinces (`name`)
1. Areas to which the given cities/provinces belong (`area`, such as 华北, 东北, 华南, etc.)

The following example converts the 2019 geocodes in the toy data to their 1989 version:

```{r code2code}
library(regioncode)
@@ -65,11 +70,23 @@ data("corruption")
# Original 2019 version
corruption$prefecture_id
# 1999 version
regioncode(data_input = corruption$prefecture_id,
# 1989 version
temp <- regioncode(data_input = corruption$prefecture_id,
convert_to = "code", # default set
year_from = 2019,
year_to = 1989)
# tibble(
# code2019 = corruption$prefecture_id,
# code1989 = regioncode(data_input = corruption$prefecture_id,
# convert_to = "code", # default set
# year_from = 2019,
# year_to = 1989),
# name1989 = regioncode(data_input = corruption$prefecture_id,
# convert_to = "name",
# year_from = 2019,
# year_to = 1989)
# )
```

Note that if a region was initially geocoded in, e.g., 1989 and was incorporated into a new region in 2019, the new region's geocode is used from then on.
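
The lookup behind such a conversion can be pictured as an order-preserving join against a crosswalk table. The sketch below is only an illustration in base R: the codes are made up, and both the `crosswalk` table and the `convert_code()` helper are hypothetical, not `regioncode`'s internal data or API. It shows two old codes that were absorbed into one new region resolving to that region's code, with the output kept in the order of the input.

```r
# Toy crosswalk with made-up codes (hypothetical, not real division codes).
# Old regions 101 and 102 were later absorbed into new region 201.
crosswalk <- data.frame(
  code_old = c(101, 102, 103),
  code_new = c(201, 201, 203)
)

# Hypothetical helper: look up each input code while preserving input order.
# Codes absent from the crosswalk come back as NA.
convert_code <- function(x, table, from, to) {
  table[[to]][match(x, table[[from]])]
}

input <- c(103, 101, 999, 102)
result <- convert_code(input, crosswalk, from = "code_old", to = "code_new")
print(result)
```

Using `match()` here plays the same role as joining the input onto the lookup table from the left: every input element yields exactly one output element, and unmatched codes surface as `NA` rather than being dropped.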