GPT-4 updated the README to include both utilities
Showing 3 changed files with 77 additions and 30 deletions.
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -1,50 +1,97 @@ | ||
# NAME | ||
openai-tokens-count - counts the number of tokens in text files according to a specified OpenAI model | ||
# openai-tokens | ||
|
||
This Python package provides utilities for working with OpenAI model tokens. The tools count the tokens in text files and output the first N tokens of text files, using the tokenization of a specified OpenAI model. | ||
|
||
NOTE: This code is not affiliated with or supported by OpenAI. | ||
|
||
## Installation | ||
|
||
To install the package, clone the repository from GitHub: | ||
|
||
# SYNOPSIS | ||
``` | ||
openai-tokens-count [options] file... | ||
git clone https://github.com/alestic/openai-tokens.git | ||
cd openai-tokens | ||
pip install . | ||
``` | ||
|
||
# DESCRIPTION | ||
openai-tokens-count reads the specified text files and computes the number of tokens for each file as per the OpenAI model's specifications. | ||
## Usage | ||
|
||
The package currently includes the following tools: | ||
|
||
1. `openai-tokens-count` | ||
2. `openai-tokens-head` | ||
|
||
### openai-tokens-count | ||
|
||
If no file is specified, or if the file is -, openai-tokens-count reads from standard input. | ||
Counts the number of tokens in text files according to a specified OpenAI model. | ||
|
||
The number of tokens and file name are then printed to standard output. | ||
``` | ||
usage: openai-tokens-count [options] file... | ||
``` | ||
|
||
`openai-tokens-count` reads the specified text files and computes the number of tokens for each file as per the OpenAI model's specifications. If no file is specified, or if the file is -, `openai-tokens-count` reads from standard input. The number of tokens and file name are then printed to standard output. | ||
|
||
#### Options | ||
|
||
# OPTIONS | ||
## --model MODEL_NAME | ||
Specifies the OpenAI model to use for counting tokens. Defaults to "gpt-4-0314". | ||
- `--model MODEL_NAME`: Specifies the OpenAI model to use for counting tokens. Defaults to "gpt-4-0314". | ||
- `file`: The text file to count tokens in. Multiple files can be specified. If no file is provided or if the file is '-', `openai-tokens-count` reads from standard input. | ||
|
||
## file | ||
The text file to count tokens in. Multiple files can be specified. If no file is provided or if the file is '-', openai-tokens-count reads from standard input. | ||
#### Examples | ||
|
||
# EXAMPLES | ||
Count tokens in a single file: | ||
``` | ||
./openai-tokens-count example.txt | ||
openai-tokens-count example.txt | ||
``` | ||
|
||
Count tokens in multiple files: | ||
``` | ||
./openai-tokens-count file1.txt file2.txt | ||
openai-tokens-count file1.txt file2.txt | ||
``` | ||
|
||
Count tokens in standard input: | ||
``` | ||
cat example.txt | ./openai-tokens-count | ||
cat example.txt | openai-tokens-count | ||
``` | ||
|
||
Count tokens using a different model: | ||
``` | ||
./openai-tokens-count --model "gpt-3.5-turbo-0301" example.txt | ||
openai-tokens-count --model "gpt-3.5-turbo-0301" example.txt | ||
``` | ||
|
||
### openai-tokens-head | ||
|
||
Outputs the first `--tokens COUNT` tokens from the input file(s) or stdin. | ||
|
||
``` | ||
usage: openai-tokens-head [options] file... | ||
``` | ||
|
||
`openai-tokens-head` reads the specified text files and outputs the first `--tokens COUNT` tokens according to the OpenAI model's specifications. With more than one FILE, precede each with a header giving the file name. If no file is specified, or if the file is -, `openai-tokens-head` reads from standard input. | ||
|
||
#### Options | ||
|
||
- `-n, --tokens COUNT`: Output the first COUNT tokens. If COUNT is 0, output nothing. | ||
- `--model MODEL_NAME`: Specifies the OpenAI model to use for tokenizing. Defaults to "gpt-4-0314". | ||
- `file`: The text file to get tokens from. Multiple files can be specified. If no file is provided or if the file is '-', `openai-tokens-head` reads from standard input. | ||
|
||
#### Examples | ||
|
||
Output the first 100 tokens from a file: | ||
``` | ||
openai-tokens-head -n 100 example.txt | ||
``` | ||
|
||
Output the first 50 tokens using a different model: | ||
``` | ||
openai-tokens-head --model "gpt-3.5-turbo-0301" -n 50 example.txt | ||
``` | ||
|
||
## Authors | ||
|
||
- Written by GPT-4. | ||
- Prompt engineering by Eric Hammond. | ||
- Some code Copyright (c) 2023 OpenAI | ||
|
||
# AUTHORS | ||
Written by GPT-4. | ||
Prompt engineering by Eric Hammond. | ||
Some code Copyright (c) 2023 OpenAI | ||
## License | ||
|
||
# DATE | ||
2023-06-20 | ||
This project is licensed under the terms of the MIT license. |
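
The behaviour the README describes for `openai-tokens-count` and `openai-tokens-head` can be illustrated with a minimal sketch. This is not the package's code: the function names, the `"<count> <name>"` output format, and the whitespace tokenizer are all assumptions made for illustration. A whitespace split is only a stand-in, and a real OpenAI tokenizer will give different boundaries and counts.

```python
import io
from typing import Callable, Iterable, List, Sequence, Tuple, TextIO


def whitespace_encode(text: str) -> List[str]:
    """Hypothetical stand-in tokenizer: one token per whitespace-separated word."""
    return text.split()


def whitespace_decode(tokens: Sequence[str]) -> str:
    """Inverse of the stand-in tokenizer (original spacing is not preserved)."""
    return " ".join(tokens)


def count_tokens(text: str, encode: Callable[[str], Sequence] = whitespace_encode) -> int:
    """Return how many tokens `encode` produces for `text`."""
    return len(encode(text))


def head_tokens(
    text: str,
    count: int,
    encode: Callable[[str], Sequence] = whitespace_encode,
    decode: Callable[[Sequence], str] = whitespace_decode,
) -> str:
    """Return the text of the first `count` tokens; a count of 0 yields nothing."""
    if count <= 0:
        return ""
    return decode(encode(text)[:count])


def count_streams(
    streams: Iterable[Tuple[str, TextIO]],
    encode: Callable[[str], Sequence] = whitespace_encode,
) -> List[str]:
    """Build one "<count> <name>" line per input stream (format is an assumption)."""
    return [f"{count_tokens(stream.read(), encode)} {name}" for name, stream in streams]


if __name__ == "__main__":
    sample = io.StringIO("the quick brown fox")
    print("\n".join(count_streams([("example.txt", sample)])))
    print(head_tokens("the quick brown fox", 2))
```

Because the tokenizer is passed in as a callable, a model-specific encoder could replace the stand-in. For example, assuming the `tiktoken` library is installed, the `encode` and `decode` methods of the object returned by `tiktoken.encoding_for_model` could be supplied here. The README does not say which library the package itself uses.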