Grapnel

A repository of tools for converting several content types to plain text.

CLI

Install

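To install the command-line tool, run the following (on Go 1.18 or newer, where `go get` no longer installs binaries, use `go install github.com/zbioe/grapnel@latest` instead):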

go get github.com/zbioe/grapnel

Usage

Pass the content type with the -t flag to parse the input directly as that type:

cat pdf/testdata/valid.pdf | grapnel -t pdf
cat html/testdata/valid.html | grapnel -t html

Or omit the type, in which case grapnel reads all of the input and tries to detect the type:

cat pdf/testdata/valid.pdf | grapnel
cat html/testdata/valid.html | grapnel
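
To illustrate what type detection can look like when no -t flag is given, here is a minimal sketch built on the standard library's `http.DetectContentType`. This is an assumption about the general approach, not grapnel's actual implementation; `detectType` is a hypothetical helper name.

```go
package main

import (
	"fmt"
	"net/http"
	"strings"
)

// detectType is a hypothetical helper (not part of grapnel). It sniffs the
// leading bytes of data and returns "pdf", "html", or "" when the content
// is neither.
func detectType(data []byte) string {
	// DetectContentType inspects at most the first 512 bytes.
	ct := http.DetectContentType(data)
	switch {
	case ct == "application/pdf":
		return "pdf"
	case strings.HasPrefix(ct, "text/html"):
		return "html"
	default:
		return ""
	}
}

func main() {
	samples := [][]byte{
		[]byte("%PDF-1.4\n"),
		[]byte("<html><body>hi</body></html>"),
		[]byte("just text"),
	}
	for _, s := range samples {
		fmt.Printf("%q -> %q\n", s, detectType(s))
	}
}
```

Sniffing only needs the first few hundred bytes, so a detector built this way could buffer a small prefix of the input rather than all of it.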

PDF

Receives a PDF as a []byte or an io.Reader and converts it to text with pdftotext.

Example

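Create a file main.go with the following content:

package main

import (
	"fmt"
	"os"

	"github.com/zbioe/grapnel/pdf"
)

func main() {
	// Read a PDF from standard input and convert it to plain text.
	text, err := pdf.ToTextFromReader(os.Stdin)
	if err != nil {
		// Report the error on stderr so it does not mix with the text output.
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}
	fmt.Print(text)
}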

Run it on the command line:

go run main.go < pdf/testdata/valid.pdf
curl -Ls "http://www.orimi.com/pdf-test.pdf" | go run main.go

Requirements

The pdftotext program (shipped with Poppler and Xpdf) must be installed.

HTML

Receives HTML as bytes or a reader and converts it to text.

Example

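Create a file main.go with the following content:

package main

import (
	"fmt"
	"os"

	"github.com/zbioe/grapnel/html"
)

func main() {
	// Read HTML from standard input and convert it to plain text.
	text, err := html.ToTextFromReader(os.Stdin)
	if err != nil {
		// Report the error on stderr so it does not mix with the text output.
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}
	fmt.Print(text)
}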

Run it on the command line:

go run main.go < html/testdata/valid.html
curl -Ls "https://reddit.com/" | go run main.go
