Implement walking of directories
Recurse into directories using BurntSushi's walkdir crate. This means
the following is now possible:

    unicop src/

Also set the default to '.' (i.e. recursively checking the current
directory).
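
For illustration, the behaviour described above (default to '.', recurse,
keep going on errors) can be sketched with the standard library alone.
This is not the code in this commit, which uses walkdir; the names
`roots_or_default` and `walk` are hypothetical, and unlike walkdir this
sketch simply skips symlinks and other special files:

```rust
use std::fs;
use std::io;
use std::path::{Path, PathBuf};

/// Returns the paths to check: the CLI arguments, or "." when none were given.
fn roots_or_default(args: Vec<String>) -> Vec<String> {
    if args.is_empty() {
        vec![String::from(".")]
    } else {
        args
    }
}

/// Recursively visits `path`, pushing regular files onto `files`.
/// I/O errors are pushed onto `errors` so that one unreadable entry
/// does not stop the rest of the walk.
fn walk(path: &Path, files: &mut Vec<PathBuf>, errors: &mut Vec<io::Error>) {
    let meta = match fs::symlink_metadata(path) {
        Ok(meta) => meta,
        Err(err) => return errors.push(err),
    };
    if meta.is_file() {
        files.push(path.to_path_buf());
    } else if meta.is_dir() {
        match fs::read_dir(path) {
            Err(err) => errors.push(err),
            Ok(entries) => {
                for entry in entries {
                    match entry {
                        Ok(entry) => walk(&entry.path(), files, errors),
                        Err(err) => errors.push(err),
                    }
                }
            }
        }
    }
    // Anything else (symlinks, sockets, ...) is skipped in this sketch.
}
```

A caller would pass each element of
`roots_or_default(env::args().skip(1).collect())` to `walk`, then hand
every collected file to `check_file` and print the collected errors.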

I've read in the documentation that tree-sitter has a language-detection
framework that grammars can plug into, but unfortunately it doesn't seem
to be exposed in the library. It might still be a good idea to adopt it,
as it would make it easier to add new languages. This commit contains a
stripped-down version that only looks at file extensions (tree-sitter's
language detection also uses regexps, but none of the current grammars
define any).
gregoire-mullvad committed Jul 24, 2024
1 parent fdf15aa commit fb9b3b6
Showing 4 changed files with 50 additions and 13 deletions.
1 change: 1 addition & 0 deletions Cargo.lock

1 change: 1 addition & 0 deletions Cargo.toml
@@ -26,6 +26,7 @@ unic-ucd-block = "0.9.0"
 unic-char-range = "0.9.0"
 toml = "0.8.14"
 serde = { version = "1.0.203", features = ["derive"] }
+walkdir = "2.5.0"

 [dev-dependencies]
 trycmd = "0.15.5"
8 changes: 8 additions & 0 deletions README.md
@@ -1,5 +1,13 @@
 # Unicop

+## Usage
+
+```sh,ignore
+unicop [FILES]...
+```
+
+Where `[FILES]...` is a list of files or directories to check (default: `.`).
+
 ## Example

 ```console
53 changes: 40 additions & 13 deletions src/main.rs
@@ -1,5 +1,6 @@
 use std::env;
 use std::fs;
+use std::path::Path;

 use miette::{miette, LabeledSpan, NamedSource, Severity};
 use unic_ucd_name::Name;
@@ -8,26 +9,36 @@ mod config;
 mod rules;

 fn main() {
-    for arg in env::args().skip(1) {
-        check_file(&arg);
+    let mut args: Vec<String> = env::args().skip(1).collect();
+    if args.is_empty() {
+        args = vec![String::from(".")]
+    }
+    for arg in args {
+        for entry in walkdir::WalkDir::new(arg) {
+            match entry {
+                Err(err) => eprintln!("{:}", err),
+                Ok(entry) if entry.file_type().is_file() => check_file(entry.path()),
+                Ok(_) => {}
+            }
+        }
     }
 }

-fn check_file(arg: &str) {
-    let src = fs::read_to_string(arg).unwrap();
-    let nsrc = NamedSource::new(arg, src.clone());
+fn check_file(path: &Path) {
+    let Some(lang) = detect_language(path) else {
+        return;
+    };
+    let filename = path.display().to_string();
+    let src = fs::read_to_string(path).unwrap();
+    let named_source = NamedSource::new(&filename, src.clone());
     let mut parser = tree_sitter::Parser::new();
-    parser
-        .set_language(&tree_sitter_javascript::language())
-        .expect("Error loading JavaScript grammar");
-    // parser
-    //     .set_language(&tree_sitter_python::language())
-    //     .expect("Error loading Python grammar");
+    parser.set_language(&lang).expect("Error loading grammar");
     let tree = parser.parse(&src, None).unwrap();
     if tree.root_node().has_error() {
         println!(
             "{:?}",
-            miette!(severity = Severity::Warning, "{}: parse error", arg).with_source_code(nsrc)
+            miette!(severity = Severity::Warning, "{}: parse error", filename)
+                .with_source_code(named_source.clone())
         );
     }
     for (off, ch) in src.char_indices() {
@@ -52,7 +63,23 @@ fn check_file(arg: &str) {
             chname,
             node.kind()
         )
-        .with_source_code(NamedSource::new(arg, src.clone()));
+        .with_source_code(named_source.clone());
         println!("{:?}", report);
     }
 }
+
+// Tree-sitter grammars include some configurations to help decide whether the language applies to
+// a given file.
+// Unfortunately, neither the language-detection algorithm nor the configurations are included in
+// the Rust crates. So for now we have a simplified language-detection with hard-coded
+// configurations.
+// See https://tree-sitter.github.io/tree-sitter/syntax-highlighting#language-detection
+fn detect_language(path: &Path) -> Option<tree_sitter::Language> {
+    match path.extension()?.to_str()? {
+        // https://github.com/tree-sitter/tree-sitter-javascript/blob/master/package.json
+        "js" | "mjs" | "cjs" | "jsx" => Some(tree_sitter_javascript::language()),
+        // https://github.com/tree-sitter/tree-sitter-python/blob/master/package.json
+        "py" => Some(tree_sitter_python::language()),
+        _ => None,
+    }
+}
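
The hard-coded `match` in `detect_language` could also be written as a
data table, which is closer in shape to the `file-types` lists in the
grammars' package.json files and might make adding a language a one-line
change. The following is only a sketch under stated assumptions: `Lang` is
a hypothetical stand-in for `tree_sitter::Language` so that the example
compiles without the grammar crates, and `FILE_TYPES` and
`detect_by_extension` are illustrative names that do not appear in this
commit.

```rust
use std::path::Path;

/// Hypothetical stand-in for `tree_sitter::Language`.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Lang {
    JavaScript,
    Python,
}

/// One entry per grammar; the extensions are the ones hard-coded in this commit.
const FILE_TYPES: &[(Lang, &[&str])] = &[
    (Lang::JavaScript, &["js", "mjs", "cjs", "jsx"]),
    (Lang::Python, &["py"]),
];

/// Looks up the language by file extension (case-sensitive).
/// Returns `None` for unknown, missing, or non-UTF-8 extensions.
fn detect_by_extension(path: &Path) -> Option<Lang> {
    let ext = path.extension()?.to_str()?;
    FILE_TYPES
        .iter()
        .find(|(_, exts)| exts.iter().any(|e| *e == ext))
        .map(|(lang, _)| *lang)
}
```

With this table, `detect_by_extension(Path::new("src/app.mjs"))` yields
`Some(Lang::JavaScript)`, while `README.md` and extensionless files such as
`Makefile` yield `None` and would be skipped by `check_file`.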
