Removes comments from parquet files that have C# source code as their content

fasterinnerlooper/CommentRemover

CommentRemover

Comment Remover strips comments from parquet files that contain C# code. It was written to process the files at hf.co/datasets/microsoft/LCC_csharp, but it could be adapted to other parquet datasets with minor modifications to the DataReader class, specifically lines 40 and 41.
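
To illustrate the core idea, here is a minimal Python sketch of comment stripping for a single C# source string. It is not the repository's implementation, and the function names `strip_csharp_comments` and `_skip_literal` are hypothetical. It assumes only `//` and `/* */` comments, regular and verbatim string literals, and char literals need to be recognized; raw string literals (`"""..."""`) and comments inside interpolated-string expressions are not handled.

```python
def _skip_literal(source: str, start: int, verbatim: bool) -> int:
    """Return the index just past the string/char literal opening at source[start]."""
    quote = source[start]
    i, n = start + 1, len(source)
    while i < n:
        ch = source[i]
        if verbatim and ch == '"' and source[i + 1:i + 2] == '"':
            i += 2  # in a verbatim string, "" is an escaped quote
        elif not verbatim and ch == "\\":
            i += 2  # skip the backslash and the character it escapes
        elif ch == quote:
            return i + 1
        else:
            i += 1
    return n  # unterminated literal: consume the rest of the input


def strip_csharp_comments(source: str) -> str:
    """Remove // and /* */ comments while leaving literals untouched."""
    out = []
    i, n = 0, len(source)
    while i < n:
        ch = source[i]
        nxt = source[i + 1:i + 2]
        if ch == "/" and nxt == "/":
            # Line comment: drop everything up to (not including) the line break.
            while i < n and source[i] not in "\r\n":
                i += 1
        elif ch == "/" and nxt == "*":
            # Block comment: replace with one space so adjacent tokens stay apart.
            end = source.find("*/", i + 2)
            i = n if end == -1 else end + 2
            out.append(" ")
        elif ch == '"' or ch == "'":
            # Copy the whole literal verbatim so comment markers inside it survive.
            prefix = source[max(0, i - 2):i]
            verbatim = ch == '"' and (prefix.endswith("@") or prefix == "@$")
            end = _skip_literal(source, i, verbatim)
            out.append(source[i:end])
            i = end
        else:
            out.append(ch)
            i += 1
    return "".join(out)
```

Applying a function like this to a parquet dataset would mean mapping it over whichever column holds the source text; the column name depends on the dataset and is not specified here.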
