Parrot GPT is a Python library that enables you to convert bibliographic metadata between various schemas using OpenAI’s large language models through its API. The library is not limited to specific schemas, but currently, some restrictions apply to input and output schemas.
- License: MIT
- Documentation: https://parrot-gpt.readthedocs.io
The following table shows some examples of metadata formats supported by Parrot GPT:
Schema |
---|
DATS |
cff |
crossref_xml |
JATS |
BioSchema |
Codemeta |
RIF-CS |
EDMI |
DCAT |
DCAT-AP |
DataCite |
DataCite-XML |
DataCite-JSON |
Crossref |
schema.org |
bibtex |
DC-XML |
DC-JSON |
Dublin Core |
Install parrot_gpt using pip:
$ pip install parrot_gpt
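Use the cli.py script to transform metadata using a selected large language model and prompt type:
$ python -m parrot_gpt.cli --model MODEL --prompt-type PROMPT_TYPE --input-file INPUT_FILE --output-file OUTPUT_FILE [OPTIONS]
Where: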
- MODEL is the large language model to use (e.g., turbo, gpt3)
- PROMPT_TYPE is the type of input prompt (e.g., enrich, translate, crosswalk, peer_review)
- INPUT_FILE is the input metadata file
- OUTPUT_FILE is the output metadata file
- OPTIONS are optional arguments, such as --initial_schema, --target_schema, and --venue (for peer review)
For example:
$ export OPENAI_API_KEY={OPENAI_API_KEY}
$ python -m parrot_gpt.cli --model gpt3 --prompt-type translate --input-file input.xml --output-file output.json --initial_schema crossref --target_schema datacite
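If you want to drive the command line from a script (for example, to convert many files), the invocation above can be assembled programmatically. The following is a minimal sketch, not part of Parrot GPT itself: `build_command` is a hypothetical helper name, and it only uses the flags that appear in the example above.

```python
import shlex
import sys


def build_command(model, prompt_type, input_file, output_file,
                  initial_schema=None, target_schema=None):
    """Assemble the argument list for `python -m parrot_gpt.cli`.

    Hypothetical helper: it mirrors the flags shown in the example
    above and does not call Parrot GPT itself.
    """
    cmd = [
        sys.executable, "-m", "parrot_gpt.cli",
        "--model", model,
        "--prompt-type", prompt_type,
        "--input-file", input_file,
        "--output-file", output_file,
    ]
    # Schema options are appended only when they are provided.
    if initial_schema:
        cmd += ["--initial_schema", initial_schema]
    if target_schema:
        cmd += ["--target_schema", target_schema]
    return cmd


cmd = build_command("gpt3", "translate", "input.xml", "output.json",
                    initial_schema="crossref", target_schema="datacite")
# Show the command as a shell-quoted string for inspection.
print(shlex.join(cmd))
```

To actually run the command, pass the list to `subprocess.run(cmd, check=True)` with OPENAI_API_KEY set in the environment. Passing a list rather than a single string avoids shell quoting problems with file names that contain spaces.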
You can also use Parrot GPT in your Python code:
from collections import namedtuple

from parrot_gpt import ParrotGpt
from parrot_gpt.model_interface import GPT3Model

# Container for the options that the command line would otherwise supply
Arguments = namedtuple('Arguments', 'prompt_type initial_schema target_schema')
args = Arguments(prompt_type="translate", initial_schema="crossref", target_schema="datacite")

# Path to the input metadata file
input_metadata = "path/to/your/input_file"

# Convert the metadata with the GPT-3 model and print the result
parrot_gpt = ParrotGpt(GPT3Model())
output = parrot_gpt.serialize(input_metadata, args)
print(output)
The following large language models are supported:
- turbo: GPT-3.5 Model
- gpt3: GPT-3 Model
The following prompt types are supported:
- enrich: Enriches the input metadata file
- translate: Translates the metadata file to another schema
- crosswalk: Generates a crosswalk between two schemas
- peer_review: Generates a peer review report for the input file
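When building the arguments in Python, it can help to check the prompt type against the list above before making an API call. The following is a minimal sketch under that assumption: `SUPPORTED_PROMPT_TYPES` and `make_arguments` are hypothetical names, not part of the Parrot GPT API, and the `Arguments` tuple simply repeats the shape used in the Python example.

```python
from collections import namedtuple

# Prompt types as listed in this README (hypothetical constant name).
SUPPORTED_PROMPT_TYPES = ("enrich", "translate", "crosswalk", "peer_review")

# Same field layout as the Arguments tuple in the Python usage example.
Arguments = namedtuple("Arguments", "prompt_type initial_schema target_schema")


def make_arguments(prompt_type, initial_schema=None, target_schema=None):
    """Return an Arguments tuple, rejecting unknown prompt types early."""
    if prompt_type not in SUPPORTED_PROMPT_TYPES:
        raise ValueError(
            f"Unsupported prompt type {prompt_type!r}; "
            f"expected one of {', '.join(SUPPORTED_PROMPT_TYPES)}"
        )
    return Arguments(prompt_type, initial_schema, target_schema)


args = make_arguments("translate", initial_schema="crossref",
                      target_schema="datacite")
print(args.prompt_type, args.initial_schema, args.target_schema)
```

Failing fast on a misspelled prompt type avoids spending an API request on input the library would not recognize.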
Contributions are welcome! Please check the issues page for any existing discussions, or create a new one if you have any suggestions or ideas.
This project is licensed under the MIT License. See the LICENSE file for details.