Conveyor is a lightweight multithreaded file processing library.
Think of it as a simple way to apply a function or method to every line of one or more files.
Typical use cases for this library include:
- A file-wide map function (see Example Usage).
- Dropping or adding lines (see Split Lines Example).
- Splitting a file into separate files (see Animal Sorter Example).
- Counting occurrences of certain things (see Rune Counter Example).
...
go get github.com/fgehrlicher/conveyor
Redact all occurrences of a given email address:
package main

import (
	"log"
	"os"
	"strings"

	"github.com/fgehrlicher/conveyor"
)

func main() {
	// Create the output file.
	resultFile, err := os.Create("redacted_data.txt")
	if err != nil {
		log.Fatal(err)
	}
	defer resultFile.Close()

	// Instantiate a new ConcurrentWriter which wraps the resultFile handle.
	// The ConcurrentWriter type is a small thread-safe wrapper for
	// io.Writer that keeps the chunk output in order.
	w := conveyor.NewConcurrentWriter(resultFile, true)

	// Split the input file into chunks of 512 bytes with
	// the concurrent writer as the output ChunkWriter.
	chunks, err := conveyor.GetChunksFromFile("data.txt", 512, w)
	if err != nil {
		log.Fatal(err)
	}

	// Create and execute a Queue with 4 workers and the Redact function as LineProcessor.
	result := conveyor.NewQueue(chunks, 4, conveyor.LineProcessorFunc(Redact)).Work()

	// Print the number of processed lines.
	log.Printf("processed %d lines", result.Lines)
}

// Text that should be redacted.
const emailToRedact = "[email protected]"

// Redact replaces every occurrence of emailToRedact with a run of "x"
// characters of the same length, so the line length stays unchanged.
func Redact(line []byte, metadata conveyor.LineMetadata) ([]byte, error) {
	result := strings.ReplaceAll(
		string(line),
		emailToRedact,
		strings.Repeat("x", len(emailToRedact)),
	)
	return []byte(result), nil
}
Additional Examples:
- Rune Counter counts and prints the number of occurrences of certain runes.
- Animal Sorter sorts .csv entries by field and divides them into separate files.
- Split Lines replaces every space with a line break.
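As a rough idea of what the per-line logic in two of these examples could look like, the following standard-library sketch shows a space-to-line-break replacement and a concurrency-safe rune counter. It is not the code from the linked examples: `splitLine`, `runeCounter`, and their methods are hypothetical names, and the functions take only the line bytes, so they would need to be adapted to the `LineProcessor` signature shown in the usage example above.

```go
package main

import (
	"bytes"
	"fmt"
	"sync"
)

// splitLine replaces every space in line with a line break.
func splitLine(line []byte) []byte {
	return bytes.ReplaceAll(line, []byte(" "), []byte("\n"))
}

// runeCounter (hypothetical) counts occurrences of a fixed set of runes.
// The mutex makes it safe to call Count from several workers at once.
type runeCounter struct {
	mu     sync.Mutex
	counts map[rune]int
}

func newRuneCounter(runes ...rune) *runeCounter {
	c := &runeCounter{counts: make(map[rune]int, len(runes))}
	for _, r := range runes {
		c.counts[r] = 0
	}
	return c
}

// Count tallies the tracked runes in line and returns line unchanged.
func (c *runeCounter) Count(line []byte) []byte {
	// Count locally first so the lock is held only for the merge.
	local := make(map[rune]int)
	for _, r := range string(line) {
		local[r]++
	}
	c.mu.Lock()
	defer c.mu.Unlock()
	for r := range c.counts {
		c.counts[r] += local[r]
	}
	return line
}

// Get returns the current count for r, or 0 if r is not tracked.
func (c *runeCounter) Get(r rune) int {
	c.mu.Lock()
	defer c.mu.Unlock()
	return c.counts[r]
}

func main() {
	fmt.Printf("%q\n", splitLine([]byte("a b c")))

	c := newRuneCounter('a', 'e')
	c.Count([]byte("a great tree"))
	fmt.Println(c.Get('a'), c.Get('e'))
}
```

Counting into a local map before taking the lock keeps contention low when many workers process lines at the same time.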
TODO
TODO
TODO