gptcli - overview
Table of Contents
- CHANGELOG
- EXAMPLES
- Introduction
- Disclaimer
- Getting started
- Token count
- How to get and use OpenAI Api Key
- Credits
Introduction
gptcli: Terminal Chat Completion client for OpenAI's LLM models written in Go.
This utility is a kind of Swiss Army knife for AI developers used to working in a text-based terminal.
It has no chat features; it uses the chat endpoint as a completion endpoint. Its sole purpose is to quickly craft prompts, check payload and/or response formats and token usage, collect and analyze data, try the same prompt with different parameter settings, and so on.
Prompts can be taken from web pages (technical details). Note: dynamic web pages are not supported.
It can return the embedding of a prompt (technical details).
Embeddings can be used with the companion utility eucos.
Request/response sessions can be exported to a CSV file (technical details).
Groups of requests can be sent asynchronously using the Batch API (technical details).
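For example, a couple of minimal invocations using the --embed and --csv options documented in the Help section below (the CSV filename is just a placeholder):
# embed two prompts and append the vectors to a csv file
gptcli --embed "first text,,second text"
# export a request/response session to a custom CSV file
gptcli --csv session.csv "Summarize the plot of Romeo and Juliet"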
More usage examples here.
All of the examples in this guide use the Bash shell on Linux systems, but can run on other platforms as well.
To follow this guide, a basic knowledge of the OpenAI APIs is required.
Presets
There are some basic presets, such as summarization or sentiment analysis, that can be used as a reference or starting point.
Presets come with all parameters already set, but they can be changed (technical details).
Every preset is executed against a given prompt. For example, the summary preset summarizes the user prompt, the ner preset executes a Named Entity Recognition task on the user's text prompt, and so on.
Presets can also be used with web pages. If you use the --web URL option with a preset, the web page content becomes the prompt (technical details).
Presets list
To list all presets, use the --list-presets option:
gptcli --list-presets
Output:
presets:
absurdity (write nonsense, using PROMPT as starting point)
baby (write stories for kids using user PROMPT as inspiration)
contrarian (try to find any flaw in PROMPT, eg. a marketing plan, a poem, source code, etc...)
description (try to describe PROMPT context, such as a place, an object, etc...)
elegant (write a story inspired by PROMPT, trying to use an elegant style)
headline (a headline of user PROMPT)
ner (Named Entity Recognition of user PROMPT)
offensive (user PROMPT analysis for offenses directed 'at': ignore bad words used as break-in)
poem (write a poem inspired by PROMPT)
presentationbare (converts PROMPT to barebone presentation - can be used as PROMPT for 'presentationcode' preset)
presentationcode (converts PROMPT from 'presentationbare' preset to Python code needed to create a PowerPoint basic presentation)
semiotic (try to analyze PROMPT with Umberto Eco's semiotic rules)
sentiment (sentiment analysis of user PROMPT: positive/negative)
sentimentneutral (sentiment analysis of user PROMPT: positive/negative/neutral)
stylecasual (rewrites user PROMPT in a more casual style)
styleformal (rewrites user PROMPT in a more formal style)
stylenews (rewrites user PROMPT in the style of a newscaster)
summary (summary of user PROMPT)
summarybrief (brief summary of user PROMPT)
summarybullet (bulleted summary of user PROMPT)
table (try to find a data pattern from PROMPT, to output as table)
tablecsv (try to find a data pattern from PROMPT, to output as table, CSV formatted)
terzarima (try to use Dante Alighieri's 'Divina Commedia' style using user PROMPT as inspiration)
tutor (give a big picture about PROMPT topic)
visiondescription (describe an image uploaded with '--vision IMAGEFILE' option - in PROMPT specify language (Italian, English, etc...))
💡 gptcli --preset poem --stream "Little white cat"
gptcli --preset summarybullet --stream --web "https://www.examples.com"
More presets usage examples here.
Models
To list all supported OpenAI chat completion models, use the --list-models option:
# actual query to AI model
gptcli --list-models
Output:
chat/completion models:
gpt-3.5-turbo
gpt-3.5-turbo-0125
gpt-3.5-turbo-0301
gpt-3.5-turbo-0613
gpt-3.5-turbo-1106
gpt-3.5-turbo-16k
gpt-3.5-turbo-16k-0613
gpt-4
gpt-4-0125-preview
gpt-4-0314
gpt-4-0613
gpt-4-1106-preview
gpt-4-1106-vision-preview
gpt-4-32k-0314
gpt-4-turbo
gpt-4-turbo-2024-04-09
gpt-4-turbo-preview
gpt-4-vision-preview
gpt-4o
gpt-4o-2024-05-13
gpt-4o-mini
gpt-4o-mini-2024-07-18
embedding models:
text-embedding-3-large
text-embedding-3-small
text-embedding-ada-002
ref: https://platform.openai.com/docs/models
With the -m, --model option it's possible to choose which model to use. The endpoint may use an equivalent, but more capable and up-to-date, model. You can check which model was actually used with the --model-used option.
Note: your models list may be different.
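For example, a minimal query that selects a model and reports which model actually served the request:
# actual query to AI model
gptcli -m gpt-4 --model-used "In about 15 words write a poem about a little white cat"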
Moderation
It's also possible to use OpenAI's Moderation endpoint, using the command line options --moderation, for a simple result (safe, not safe), --moderation-bool, for a boolean result (false, true), or --moderation --response-json, for a full report.
Example query:
# actual query to AI model
gptcli --moderation-bool "software developers are ☢ 🌀🗲 🤮❌💩🔪🚨🚫"
Result:
true
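For the full report, combine --moderation with --response-json:
# actual query to AI model
gptcli --moderation --response-json "software developers are ☢ 🌀🗲 🤮❌💩🔪🚨🚫"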
Disclaimer
I enjoy developing these utilities just for my personal use, so use them at your own risk.
If you want to develop your own solution written in Go and based on OpenAI's models, I suggest you use the go-openai library. For other languages, check the official OpenAI documentation.
Getting started
Binary
Prebuilt binary packages for Linux, Windows and MacOS can be downloaded from here.
Compile from source
If you prefer, you can compile from source.
Prerequisite: a working Go development environment.
Clone this repo and build:
git clone https://gitlab.com/ai-gimlab/gptcli.git
cd gptcli
go mod init gptcli && go mod tidy
go build .
Default settings
- model: gpt-4o-mini
- model embeddings: text-embedding-3-small
- model vision: gpt-4o
- model moderation: text-moderation-latest
- function call if no functions are required: none
- function call if functions are required: auto
- parallel tool calls: true
- temperature: 0.0
- top_p: 1.0
- presence_penalty: 0.0
- frequency_penalty: 0.0
- seed: null
- vision detail: auto
- max response tokens: 1000
- max text response tokens for vision: 300
- number of responses: 1
- response format: text
- stream response mode: false
- connection timeout: 120 sec
- connection timeout stream mode (chunk, total): 5 sec, 180 sec
- number of retries: 0
- wait before retry: 10 sec
- logprobs: false
- top logprobs: 0
- embeddings csv filename: data.csv
- embeddings encoding format: float
- batch input filename: batch_input.jsonl
- batch url: selected based on the task, either completions or embeddings
- batch (prefix) id: request
- batch endpoint: /v1/chat/completions
- batch list limit: 20
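These values can be printed at any time with the --defaults option:
gptcli --defaults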
Presets have their own default settings. Use the --preview option to view the settings for a given preset (technical details).
Usage
Basically, a prompt is enough:
gptcli "PROMPT"
More usage examples here.
Help
Use the --help option to view all options:
gptcli --help
Output:
Usage: gptcli [OPTIONS] "PROMPT"
Terminal Chat Completion client for OpenAI's models
Defaults:
model: gpt-4o-mini
model embeddings: text-embedding-3-small
model vision: gpt-4o
model moderation: text-moderation-latest
function call if no functions are required: none
function call if functions are required: auto
parallel function calls: true
temperature: 0.0
top_p: 1.0
presence_penalty: 0.0
frequency_penalty: 0.0
seed: null
vision detail: auto
max response tokens: 1000
max text response tokens for vision: 300
number of responses: 1
response format: text
stream response mode: false
connection timeout: 120 sec
connection timeout stream mode (chunk, total): 5 sec, 180 sec
number of retries: 0
wait before retry: 10 sec
logprobs: false
top logprobs: 0
embeddings csv filename: data.csv
embeddings encoding format: float
batch input filename: batch_input.jsonl
batch url: selected based on the task, either completions or embeddings
batch (prefix) id: request
batch endpoint: /v1/chat/completions
batch list limit: 20
Order matters:
first OPTIONS
then PROMPT
Notes:
. PROMPT must always be enclosed in "double" quotes.
. When using [-m, --model] option it is possible that the endpoint
uses an equivalent, but more capable and up-to-date model.
You can check which model was used with option '--model-used'
Online Documentation: <https://gitlab.com/ai-gimlab/gptcli#gptcli---overview>
OPTIONS:
Global:
--defaults prints the program default values
-l, --list-models list names of all available models and exit
-m, --model=MODEL select model
use '--list-models' to list all available models
-p, --preview preview request payload and exit
--response-raw print full response body raw
--response-json print full response body json formatted
--help print this help
--version output version information and exit
Network:
--retries=RETRIES number of connection retries if network timeout occur
use the '--retries-wait' option to insert a pause between retries
--retries-wait=SECONDS in conjunction with '--retries RETRIES' option insert a pause between retries
--status check endpoint status before sending any request
exit program if any issue occur at endpoint side
--timeout=SECONDS network connection timeout, in seconds
apply to a complete request/response session
SECONDS = 0 means no timeout
--timeout-chunk=SECONDS stream mode network connection timeout, in seconds
apply to every streamed chunk of response
API Auth: ref: <https://gitlab.com/ai-gimlab/gptcli#how-to-get-and-use-openai-api-key>
-f, --file=APIKEY_FILE file with OpenAI API key (OPENAI_API_KEY=your_apikey)
Chat/Completion/Vision API: ref: <https://platform.openai.com/docs/api-reference/chat>
-a, --assistant="PROMPT" [role]: 'assistant' message PROMPT
must be used with '--previous-prompt' option
-c, --count=SELECT prints how many tokens have been used
can also print the word count of the AI response
SELECT can be 'in', 'out', 'total' (in + out) or 'words'
SELECT can also be any comma separated combination of above,
without any space, without trailing comma:
in,out
in,out,total,words
etc...
with options '--moderation' or '--moderation-bool' it has no effect
--csv=CSV_FILE export request/response data, csv formatted, to CSV_FILE
no stream mode, no multiple response, no moderation
- if CSV_FILE does not exist: ask permission to create a new one
- if CSV_FILE exist: append data
--fingerprint print the system fingerprint of model backend configuration
--fp=VALUE frequency penalty: VALUE range from -2.0 to 2.0
--format=FORMAT the format that the model must output
FORMAT can be either 'text' or 'json_object' (or simply 'json')
--function='{JSON}' json object containing function definition
must be enclosed in single quotes
no stream mode
use '--function-examples' to view how to compose json object
support multiple functions
ref: <https://gitlab.com/ai-gimlab/gptcli/-/tree/main/examples?ref_type=heads#functions>
--function-call="MODE" how the model responds to function calls
MODE can be:
- "auto" (model choose if calling function)
- "none" (model does not call a function)
- "required" (model is forced to call one or more functions)
- "FUNCTION_NAME" (always call function "FUNCTION_NAME")
MODE must always be enclosed in single or double quotes
--function-examples print some examples of function json object and exit
--function-noparallel disable parallel function calling
--list-presets list all available predefined tasks and exit
--lb='{"ID": BIAS}' logit bias: BIAS range from -100 to 100
json object: maps token IDs to a BIAS value
must be enclosed in single quotes
keys must always be enclosed in double quotes
eg: --lb '{"1234": -10, "4567": 20, "890": -90}'
--logprobs return log probabilities of the output tokens in tabular format
- in conjunction with the '--response-json' or '--response-raw' options
logprobs are included in the response, json formatted
- with '--stream' option logprobs are printed only if used
in conjunction with the '--response-json' or '--response-raw' options
--model-used which model was used at endpoint side
--name=NAME NAME of the author of the request
--pp=VALUE presence penalty: VALUE range from -2.0 to 2.0
--preset=PRESET_NAME predefined tasks, such as summarization, sentiment analysis, etc...
use '--list-presets' to list all available PRESET_NAMEs and their purpose
--preset-system print PRESET_NAME predefined system message and exit
--previous-prompt=PROMPT in conjunction with the '--assistant' option simulates a single round of chat
in conjunction with the '--tool' option simulates a single round of function call
-r, --response-tokens=TOKENS maximum number of response tokens
range from 0 to model max tokens minus input tokens
ref: <https://platform.openai.com/docs/models> for models context size
--responses=NUMBER how many responses to generate for each input message
no multiple responses in stream mode, moderation mode and csv export
--seed=NUMBER integer number
multiple requests with same prompt, seed and params should return similar/equal result
--stop="STOP,..." comma separated list of stop sequences, up to 4 sequences,
without trailing comma, enclosed in single or double quotes
eg: "STOP1,STOP2,..."
--stream mimic ChatGPT behavior
start printing completion before the full completion is finished
-s, --system="PROMPT" [role]: system message PROMPT
-t, --temperature=VALUE VALUE range from 0.0 to 2.0
--tool='[{JSON}]' json object containing model response to a 'tool call', such as function call
must be used in conjunction with '--previous-prompt' option
must begin and end with square bracket and enclosed in single quotes
ref: <https://gitlab.com/ai-gimlab/gptcli/-/tree/main/examples?ref_type=heads#tools>
--tool-content='{JSON}' json object containing result from user application function
must be enclosed in single quotes
ref: <https://gitlab.com/ai-gimlab/gptcli/-/tree/main/examples?ref_type=heads#tools>
--top-logprobs=NUMBER number of most likely tokens to return at each token position
NUMBER is an integer between 0 and 20
must be used in conjunction with the '--logprobs' option
- in conjunction with the '--response-json' or '--response-raw' options
logprobs are included in the response, json formatted
- with '--stream' option logprobs are printed only if used
in conjunction with the '--response-json' or '--response-raw' options
--top-p=VALUE top_p: VALUE range from 0.0 to 1.0
--user=USER USER unique identifier representing your end-user
useful in organizations, for any end-user policy violations
-v, --vision=IMAGE_FILE answer questions about IMAGE_FILE
IMAGE_FILE can be a local file or an url link to file
no tool and web option if vision is requested
--vision-detail=DETAIL vision resolution of image processing
DETAIL can be 'low' (disable high res) or 'high' (enable high res)
--web=URL any web page URL (dynamic web pages are not supported)
URL must always be enclosed in single or double quotes
in user PROMPT ask only for the context of the page. Eg.:
- WRONG: gptcli --web "https://www.example.com" "From the following web page {QUESTION}"
- OK: gptcli --web "https://www.example.com" "Extract a list of cat names"
used in conjunction with option '--preset', URL content becomes the user PROMPT. Eg:
- gptcli --preset ner --web "https://www.example.com"
execute NER preset using content from "https://www.example.com" web page
--web-select=SELECTOR select a section of web page, then strip html tags
SELECTOR can be any HTML element, such as 'h2', 'p', 'body', 'article', etc...
SELECTOR can also be any class or id attribute (eg.: '.myClass', '#myID')
lastly SELECTOR can be 'NONE' keyword: no html tag stripping
- Note: NONE can use a lot of tokens
SELECTOR must always be enclosed in single or double quotes
default selectors are 'main', if exist, or 'body'
--web-test print an output of text extracted from web page and exit
useful to find the right HTML selector (check --web-select option)
with the right SELECTOR less tokens are used
Embeddings API: ref: <https://platform.openai.com/docs/api-reference/embeddings>
-e, --embed return PROMPT embedding
PROMPT must always be enclosed in "double" quotes
multiple PROMPTs must be separated by double commas
eg: gptcli --embed "PROMPT_1,,PROMPT_2,,...,,PROMPT_N"
return a csv table with columns 'id, text, embedding'
append table to file 'data.csv' or create new one
- to save in a different file use '--csv FILENAME' option
- in conjunction with the '--response-json' or '--response-raw' options
response is not saved to file, instead output json data to stdout
--embed-dimensions=VALUE change embedding vector size to VALUE dimensions
--encoding-format set embedding encoding format to base64 (default: float)
response is not saved to file, instead output json data to stdout
Moderation API: ref: <https://platform.openai.com/docs/api-reference/moderations>
--moderation use 'moderation' endpoint: [safe/not safe] result
identify content that OpenAI's policies prohibit
safe: compliant with OpenAI's policies (return code: 0)
not safe: violate OpenAI's policies (return code: 125)
use option '--response-json' for full report
--moderation-bool equivalent to '--moderation', but with [true/false] result
false: compliant with OpenAI's policies (return code: 0)
true: violate OpenAI's policies (return code: 125)
Batch API: ref: <https://platform.openai.com/docs/guides/batch/batch-api>
PREPARE BATCH:
--batch-id=PREFIX prefix for unique custom id value to reference results after completion
a sequential number is appended to PREFIX:
- PREFIX = 'myrequest'
resulting custom_id are 'myrequest-1, myrequest-2, ..., myrequest-n'
it is used in conjunction with '--batch-prepare' option
--batch-input=JSONL_FILE batch input file containing requests created with '--batch-prepare' option
- if JSONL_FILE does not exist: ask permission to create a new one
- if JSONL_FILE exist: append data
--batch-prepare create a '.jsonl' file with batch requests
default JSONL filename: 'batch_input.jsonl' (append mode)
use '--batch-input=JSONL_FILE' option to specify your own JSONL file
CREATE BATCH:
--batch-create=FILE_ID create a new batch
FILE_ID is the ID of a JSONL file uploaded with the '--fupload=FILENAME' option
--batch-endpoint=ENDPOINT endpoint to be used for all requests in the batch
ENDPOINT can be one of '/v1/chat/completions' or '/v1/embeddings'
it is used in conjunction with '--batch-create=FILE_ID' option
--batch-metadata='{JSON}' json object containing custom metadata for the batch
must be enclosed in single quotes
it is used in conjunction with '--batch-create=FILE_ID' option
INFO:
--batch-list list batches
--batch-listlimit=LIMIT limit the number of returned batches from '--batch-list' option
--batch-listafter=BATCH_ID from '--batch-list' option return only batches after BATCH_ID
--batch-status=BATCH_ID status check of BATCH_ID created previously with '--batch-create=FILE_ID' option
CANCEL BATCH:
--batch-cancel=BATCH_ID cancels an in-progress BATCH_ID
SAVE BATCH RESULTS:
--batch-result=JSONL_FILE JSONL_FILE where to save the results when batch is complete
equivalent to '--fsave=FILENAME' File API option
must be used in conjunction with '--fretrieve=OUTPUT_FILE_ID'
OUTPUT_FILE_ID is the 'output_file_id' field of the json batch object
File API: ref: <https://platform.openai.com/docs/api-reference/files>
UPLOAD/DELETE FILE:
--fupload=FILENAME upload FILENAME
can be used in conjunction with '--fpurpose=PURPOSE' option
--fdelete=FILE_ID delete file FILE_ID
INFO:
--finfo=FILE_ID retrieve information about FILE_ID
--flist list uploaded files
used in conjunction with '--fpurpose=PURPOSE' option returns files with the given purpose
--fpurpose=PURPOSE intended use of the uploaded file
PURPOSE must be one of 'assistants', 'vision', 'batch', 'batch_output' and 'fine-tune'
can be used in conjunction with '--flist' or '--fupload=FILENAME' options
Note: a file can be uploaded for any purpose, but gptcli is able to use only the 'batch' purpose
RETRIEVE/SAVE FILE:
--fretrieve=FILE_ID retrieve FILE_ID content
use '--fsave=FILENAME' or shell redirection to save the content
--fsave=FILENAME save FILE_ID content retrieved with '--fretrieve=FILE_ID' option to FILENAME
must be used in conjuction with '--fretrieve=FILE_ID' option
💡 gptcli --count in,out,total --model gpt-4 "Summarize the following text: {TEXT}"
gptcli --batch-prepare --batch-id "image-caption" --vision "image.jpeg" "Create a caption for the image"
Informational options
Some options output information only, without making any requests:
- --function-examples: print an example of function json object
- -l, --list-models: list names of all available models
- --list-presets: list all available predefined tasks
- --preset-system: print predefined system message for a given preset - must be used with --preset PRESET_NAME option
- -p, --preview: preview request payload, json formatted
- --web-test: print an output of text extracted from web page - must be used with --web URL option
- --help: print help
- --version: print version information
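A couple of examples (no request is sent to the AI model):
# print the predefined system message of the 'summary' preset
gptcli --preset summary --preset-system
# print the text that would be extracted from a web page
gptcli --web-test --web "https://www.example.com"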
In the following example some parameters are set, such as logit bias (--lb '{"39203": -6}'), temperature (--temperature 0.8) and a system message (--system "be polite"). In addition, two responses are requested (--responses 2). Using the -p, --preview option, you can verify that the resulting payload is the one you want, without performing any request:
# preview payload
gptcli --preview --lb '{"39203": -6}' --temperature 0.8 --responses 2 --system "be polite" "In about 15 words write a poem about a little white cat"
Output:
{
"messages": [
{
"role": "system",
"content": "be polite"
},
{
"role": "user",
"content": "In about 15 words write a poem about a little white cat"
}
],
"logit_bias": {
"39203": -6
},
"model": "gpt-4o-mini",
"temperature": 0.8,
"top_p": 1,
"presence_penalty": 0,
"frequency_penalty": 0,
"n": 2,
"max_tokens": 1000,
"stream": false
}
Data analysis options
Some options give a more verbose glimpse of what's going on during requests:
- -c, --count: prints how many tokens have been used - check Token count section for more
- --csv: export request and response data, csv formatted, to file - useful for data analysis (technical details)
- --fingerprint: print the system fingerprint of model backend configuration
- --model-used: which model was used at endpoint side
- --moderation in conjunction with --response-json: print full moderation report
- --response-json: print full response, json formatted - useful to find the right key to unmarshal
- --response-raw: print the unformatted raw response - useful to understand how stream mode behaves
- --logprobs and --top-logprobs: print the log probabilities of the output tokens
The same example as above, but without the -p, --preview option, because this time we actually want to execute the request. We use the --response-json option to view all response details:
# actual query to AI model
gptcli --response-json --lb '{"39203": -6}' --temperature 0.8 --responses 2 --system "be polite" "In about 15 words write a poem about a little white cat"
Response:
{
"id": "chatcmpl-0123456789abcabcabcabcabcabch",
"object": "chat.completion",
"created": 634953600,
"model": "gpt-4o-mini",
"choices": [
{
"index": 0,
"message": {
"role": "assistant",
"content": "Graceful white feline,\nPaws tiptoe softly, a delicate\ndance of elegance and charm."
},
"finish_reason": "stop"
},
{
"index": 1,
"message": {
"role": "assistant",
"content": "Graceful and pure,\nA little white cat,\nBringing joy to our hearts,\nIn every gentle pat."
},
"finish_reason": "stop"
}
],
"usage": {
"prompt_tokens": 26,
"completion_tokens": 44,
"total_tokens": 70
}
}
The AI answer contains all the information you need to understand what's going on under the hood:
"usage":
contain actual tokens usage"choices":
contain responses [0] and [1], because we had requested 2 responses with--responses 2
option"model":
contain which model was used at endpoint side- etc...
More details on OpenAI API documentation page.
More usage examples here.
Token count
The -c, --count=SELECT option returns the count of used tokens and AI response words. SELECT can be in, out, total (in + out), words (output words) or any combination, such as in,out,words, in,total, etc...
There are differences between query mode results and results in preview and web-test modes.
Query Mode: token usage is included in the response, so it represents the real (and official) usage.
Preview Mode: the --preview option doesn't actually execute any request, therefore the information on output tokens, total tokens and output words is not available. Input tokens (the user prompt) cannot be measured exactly, but the result is a good approximation of the real usage.
Web Test Mode: in conjunction with the --web-test option, the -c, --count=SELECT option returns the actual number of tokens that would be used for a given web page. The count refers to the raw web page, without the addition of messages, such as the system message, or other payload items. It represents the real web page usage.
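Two sketches based on the modes above (the URL is a placeholder):
# preview mode: estimate input tokens without sending the request
gptcli --count in --preview "In about 15 words write a poem about a little white cat"
# web test mode: count the tokens of the text extracted from a web page
gptcli --count in --web-test --web "https://www.example.com"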
More tokens/words usage examples here.
How to get and use OpenAI Api Key
Get your API key from the OpenAI site (pricing).
There are three ways to supply the API key:
[1] default
Create a file named .env and insert the following line:
OPENAI_API_KEY='sk-YOUR-API-KEY'
Copy the .env file to the $HOME/.local/etc/ folder.
[2] environment variable
If you prefer, export OPENAI_API_KEY as an environment variable:
export OPENAI_API_KEY='sk-YOUR-API-KEY'
[3] use '-f, --file' option
You can also supply your own key file, containing the line OPENAI_API_KEY='sk-YOUR-API-KEY', and pass its path as an argument to the -f, --file option.
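For example, assuming the key file is saved as $HOME/openai.env (the path is just a placeholder):
gptcli --file "$HOME/openai.env" "In about 15 words write a poem about a little white cat"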
Credits
This project is made possible thanks to the use of the following libraries and the precious work of those who create and maintain them. Of course thanks also to all those who create and maintain the AI models.