October 2025
chatgpt.sh [-bb|-cc|-dd|-qq] [opt..] [PROMPT|TEXT_FILE|PDF_FILE]
chatgpt.sh -i [opt..] [S|M|L][hd] [PROMPT]  #dall-e-3
chatgpt.sh -i [opt..] [X|L|P][high|medium|low] [PROMPT]  #gpt-image
chatgpt.sh -i [opt..] [X|L|P][high|medium|low] [PNG_FILE]
chatgpt.sh -i [opt..] [X|L|P][high|medium|low] [PNG_FILE] [MASK_FILE] [PROMPT]
chatgpt.sh -w [opt..] [AUDIO_FILE|.] [LANG] [PROMPT]
chatgpt.sh -W [opt..] [AUDIO_FILE|.] [PROMPT-EN]
chatgpt.sh -z [opt..] [OUTFILE|FORMAT|-] [VOICE] [SPEED] [PROMPT]
chatgpt.sh -bccWwz [opt..] -- [PROMPT] -- [stt_arg..] -- [tts_arg..]
chatgpt.sh -l [MODEL]
chatgpt.sh -TTT [-v] [-m[MODEL|ENCODING]] [INPUT|TEXT_FILE|PDF_FILE]
chatgpt.sh -HPP [/HIST_NAME|.]
chatgpt.sh -HPw

This script acts as a wrapper for ChatGPT, DALL-E, STT (Whisper), and TTS endpoints from OpenAI. Various service providers such as LocalAI, Ollama, Anthropic, Mistral AI, GoogleAI, Groq AI, GitHub Models, Novita, xAI, and DeepSeek APIs are supported.
By default, the script runs in single-turn chat completion mode, processing INPUT directly when no options are set.
Handles single-turn and multi-turn modes, pure text and native chat completions, image generation and editing, speech-to-text, and text-to-speech models.
Positional arguments are read as a single PROMPT. Some functions such as Whisper (STT) and TTS may handle optional positional parameters before the text prompt itself.
Responses API calls (may be used with options -cc).
Limited support. Set a valid model with “--model [name]”.
Chat mode in text completions (used with
options -wzvv).
Chat mode in chat completions (used with
options -wzvv).
Continue from (resume) last session (cmpls/chat).
Single-turn session of plain text completions.
Multi-turn session of plain text completions with history support.
Edit first input from stdin or file (cmpls/chat).
With options -eex, edit last text editor buffer from
cache.
Exit on first run (even with options -cc).
Response streaming.
Unset response streaming.
Generate images given a prompt. Set option -v to not open response.
Create variations of a given image.
Edit image with mask and prompt (required).
Insert text rather than completing only. May be set twice for multi-turn.
Use “[insert]” to indicate where the language model should insert text (`instruct’ and Mistral `code’ models).
-S .[PROMPT_NAME], -.[PROMPT_NAME]
Load, search for, or create custom prompt.
Set .[PROMPT] to load prompt silently.
Set ,[PROMPT] to single-shot edit prompt.
Set ,,[PROMPT] to edit the prompt template
file.
Set “.?” or “.list” to list all prompt files.
-S, --awesome /[AWESOME_PROMPT_NAME]
Set or search for an awesome-chatgpt-prompt(-zh).
Set // or %% instead to refresh cache.
-T, --tiktoken
Count input tokens with python Tiktoken (ignores special tokens).
Set twice to print tokens, thrice to list available encodings.
Set the model or encoding with option -m.
It heeds options -ccm.
Transcribe audio file speech into text. LANG is optional. A prompt that matches the speech language is optional. Speech will be transcribed or translated to the target LANG.
Set twice to phrase or thrice for word-level timestamps (-www).
With options -vv, stop the voice recorder on automatic silence detection.
Translate audio file speech into English text.
Set twice to phrase or thrice for word-level timestamps (-WWW).
Synthesise speech from text prompt. Takes a voice name, speed and text prompt.
Set option -v to not play response automatically.
Toggle multiline prompter, <CTRL-D> flush.
Cat prompter, <CTRL-D> flush.
Edit prompt in text editor.
Set twice to run the text editor interface a single time for the first user input.
Set options -eex to edit last buffer from cache.
Transparent colour of image mask. Def=black.
Fuzz intensity can be set with [VAL%]. Def=0%.
Unset model max response tokens (chat cmpls only).
-NUM
Maximum number of response tokens. Def=4096.
A second number in the argument sets model capacity.
Model capacity token value. Def=auto, Fallback=8000.
Presence penalty (cmpls/chat, -2.0 - 2.0).
Frequency penalty (cmpls/chat, -2.0 - 2.0).
Best of results, must be greater than option -n (cmpls).
Def=1.
--effort [high|medium|low|minimal] (OpenAI)
Amount of effort in reasoning models.
TTS out-file format. Def=mp3.
Seed for deterministic sampling (integer).
Top_k value (local-ai, ollama, google).
How long the model will stay loaded into memory (Ollama).
Language MODEL name. Def=gpt-5/gpt-3.5-turbo-instruct.
Set MODEL name as “.” to pick from the list.
Set the multimodal model type.
Number of results. Def=1.
Top_p value, nucleus sampling (cmpls/chat, 0.0 - 1.0).
Restart sequence string (cmpls).
Start sequence string (cmpls).
Stop sequences, up to 4. Def="<|endoftext|>".
Set an instruction text prompt. It may be a text file.
Prepend the current date and time (timestamp) to the instruction prompt.
Temperature value (cmpls/chat/stt), (0.0 - 2.0, stt 0.0 - 1.0). Def=0.
Unset context truncation parameter (Responses API).
--verbosity, --verb [high|medium|low]
Model response verbosity level (OpenAI).
TTS voice name. OpenAI or PlayAI (Groq) voice names. Def=echo, Aaliyah-PlayAI.
-H [/HIST_NAME]
Edit history file with text editor or pipe to stdout.
A history file name can be optionally set as argument.
-P [/HIST_NAME]
Print out last history session.
Set twice to print commented out history entries, inclusive. Heeds
options -bccdrR.
These are aliases to -HH and -HHH, respectively.
Temporary cache location. Defaults to subdirectory in
$CACHEDIR, $TMPDIR, or /tmp.
Ignore user configuration file.
Edit configuration file with text editor, if it exists.
$CHATGPTRC="~/.chatgpt.conf".
Dump template configuration file to stdout.
Anthropic integration (cmpls/chat). Also see --think.
DeepSeek integration (cmpls/chat).
GitHub Models integration (chat).
Google Gemini integration (cmpls/chat).
Groq AI integration (chat).
LocalAI integration (cmpls/chat).
Mistral AI integration (chat).
Novita AI integration (cmpls/chat).
Reset service integrations.
Ollama server integration (cmpls/chat).
xAI Grok integration (cmpls/chat).
The API key to use.
Set or unset response folding (wrap at white spaces).
Print the help page.
Print OpenAI usage status (requires envar
$OPENAI_ADMIN_KEY).
Disable colour output. Def=auto.
List models or print details of MODEL.
Log file. FILEPATH is required.
Enable markdown rendering in response. Software is optional: bat, pygmentize, glow, mdcat, or mdless.
Disable markdown rendering.
Copy response to clipboard.
Less interface verbosity.
Sleep after response in voice chat (-vvbccw).
With options -bccwv, sleep after response. With
options -bccwzvv, stop recording voice input on silence
detection and play TTS response right away.
May be set multiple times.
Dump raw JSON request block (debug).
Print script version.
Tiktoken for token count (cmpls/chat, python).
Unset tiktoken use (cmpls/chat, python).
Print JSON data of the last responses.
Set option -c to start a multi-turn chat mode via
text completions with history support. This option
works with instruct models, defaults to gpt-3.5-turbo-instruct
if none set.
Set options -cc to start the chat mode via
native chat completions. This mode defaults to the
gpt-5 model, which is optimised to follow instructions.
On options -bb, the Responses API endpoint is set
preferentially.
In chat mode, some options are automatically set to un-lobotomise the bot.
While using other providers, mind that options -c,
-cc, and -bb set different endpoints! These
options must be set according to the model capabilities!
Set option -C to resume (continue from)
last history session, and set option -E to exit on the
first response (even in multi turn mode).
Option -d starts a single-turn session in plain
text completions, no history support. This does not set further
options automatically, such as instruction or temperature.
To run the script in text completions with multi-turn mode and history support, set command line options -dd.
Set text completion models such as gpt-3.5-turbo-instruct.
Set option -q for insert mode in
single-turn and option -qq for multi-turn. The flag
“[insert]” must be present in the middle of the input prompt.
Insert mode completes between the end of the text preceding the flag and the beginning of the text succeeding it.
Insert mode works with `instruct’ and Mistral `code’ models.
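For instance, an insert-mode prompt can be assembled like this (a minimal sketch; the commented invocation is hypothetical and assumes an instruct-capable model, as noted above):

```shell
# The "[insert]" flag marks where the model should generate text,
# completing between the preceding and the succeeding context.
prompt='def add(a, b):
    [insert]
    return result'

# Hypothetical single-turn insert-mode invocation:
#   chatgpt.sh -q -m gpt-3.5-turbo-instruct "$prompt"
printf '%s\n' "$prompt"
```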
Responses API is a superset of Chat Completions API. Set command line
option -b (with -cc), or set
options -bb for multi-turn.
To activate it during multi-turn chat, set
/responses [model], where model is the name of a
model which works with the Responses API. Aliased to
/resp [model] and -b [model]. This can be
toggled.
Limited support.
The SYSTEM INSTRUCTION prompt may be set with option -S
or via envars $INSTRUCTION and
$INSTRUCTION_CHAT.
Option -S sets an INSTRUCTION prompt (the initial
prompt) for text cmpls, and chat cmpls. A text file path may be supplied
as the single argument. Also see CUSTOM / AWESOME
PROMPTS section below.
To create and reuse a custom prompt, set the prompt name as a command
line option, such as “-S .[_prompt_name_]” or
“-S ,[_prompt_name_]”.
When the operator is a comma “,”, single-shot editing will be available after loading the prompt text. Use double “,,” to actually edit the template file itself!
Note that loading a custom prompt will also change to its respectively-named history file.
Alternatively, set the first positional argument with the operator and the prompt name after any command line options, such as “chatgpt.sh -cc .[_prompt_name_]”. This loads the prompt file unless an instruction was set with command line options.
To prepend the current date and time to the instruction prompt, set
command line option --time.
For TTS gpt-4o-tts model type instructions, set command line
option -S "[instruction]" when invoking the script with
option -z only (stand-alone TTS mode). Alternatively, set
envar $INSTRUCTION_SPEECH.
Note that for audio models such as gpt-4o-audio, the
user can control tone and accent of the rendered voice output with a
robust `INSTRUCTION’ as usual.
Minimal INSTRUCTION to behave like a chatbot is
given with chat options -cc, unless otherwise explicitly
set by the user.
In chat mode, if no INSTRUCTION is set, a minimal instruction is given, and some options are automatically set, such as increased temperature and presence penalty, in order to un-lobotomise the bot. With cheap and fast models of text cmpls, such as Curie, the `best_of’ option may be worth setting (to 2 or 3).
Prompt engineering is an art in itself. Study carefully how to craft the best prompts to get the most out of text, code, and chat cmpls models.
Certain prompts may return empty responses. Maybe the model has nothing to further complete input or it expects more text. Try trimming spaces, appending a full stop/ellipsis, resetting temperature, or adding more text.
Prompts ending with a space character may result in lower quality output. This is because the API already incorporates trailing spaces in its dictionary of tokens.
Note that the model’s steering and capabilities require prompt engineering to even know that it should answer the questions.
Set model with “-m [MODEL]”, with
MODEL as its name, or set it as “.” to pick from the
model list.
List models with option -l or run /models
in chat mode.
Set maximum response tokens with option
“-NUM” or “-M NUM”. This
defaults to 4096 tokens and 25000 for reasoning
models, or disabled when running on chat completions and responses
endpoints.
If a second NUM is given to this option, maximum model
capacity will also be set. The option syntax takes the form of
“-NUM/NUM”, and “-M
NUM-NUM”.
Model capacity (maximum model tokens) can be set more
intuitively with option “-N NUM”,
otherwise model capacity is set automatically for known models or to
8000 tokens as fallback.
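As a rough sketch of the budget these numbers define (assuming, as is usual, that the response allowance is reserved out of the total capacity):

```shell
# Token budget implied by "-M 4096-16000" (equivalently "-N 16000"):
max_response=4096      # maximum response tokens
model_capacity=16000   # total model token capacity

# Tokens left for the instruction and the rebuilt history context:
context_budget=$((model_capacity - max_response))
echo "$context_budget"   # 11904
```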
Option -y sets python tiktoken instead of the default
script hack to preview token count. This option makes token count
preview accurate and fast (we fork tiktoken as a coprocess for fast
token queries). Useful for rebuilding history context independently from
the original model used to generate responses.
Option -w transcribes audio speech from
mp3, mp4, mpeg, mpga, m4a,
wav, webm, flac and ogg files. First
positional argument must be an AUDIO/VOICE file. Optionally,
set a TWO-LETTER input language (ISO-639-1) as the
second argument. A PROMPT may also be set to guide the model’s style, or
continue a previous audio segment. The text prompt should match the
speech language.
Note that option -w can also translate: when the speech language differs from the set LANG code, the output is translated into the target LANG.
Option -W translates speech stream to
English text. A PROMPT in English may be set to guide
the model as the second positional argument.
Set these options twice to have phrasal-level timestamps, options -ww and -WW. Set thrice for word-level timestamps.
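For example (file name hypothetical):

```shell
# Transcribe an English recording, guiding the model with a matching prompt:
chatgpt.sh -w interview.mp3 en 'A calm, technical interview.'

# The same input with phrase-level timestamps:
chatgpt.sh -ww interview.mp3 en

# Translate speech in any language into English text:
chatgpt.sh -W interview.mp3 'A calm, technical interview.'
```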
Combine options -wW with
options -bcc to start chat with voice
input (Whisper) support. Additionally, set
option -z to enable text-to-speech (TTS)
models and voice out.
Option -z synthesises voice from text (TTS models). Set
a voice as the first positional parameter (“alloy”,
“echo”, “fable”, “onyx”, “nova”, or
“shimmer”). Set the second positional parameter as the
voice speed (0.25 - 4.0), and, finally the
output file name or the format, such as
“./new_audio.mp3” (“mp3”, “wav”,
“flac”, “opus”, “aac”, or “pcm16”);
or set “-” for stdout.
Do mind that PlayAI (supported by Groq AI) has different output formats such as “mulaw” and “ogg”, as well as different voice names such as Aaliyah-PlayAI, Adelaide-PlayAI, Angelo-PlayAI, etc.
Set options -zv to not play received
output.
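For example (voice, speed, and output file in the order described above; file name hypothetical):

```shell
# Synthesise speech with the "nova" voice at slightly reduced speed:
chatgpt.sh -z nova 0.9 ./greeting.mp3 'Hello there, how are you today?'

# Write raw audio to stdout and pipe it straight to a player:
chatgpt.sh -z - 'Hello there!' | mpv -
```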
Audio models, such as gpt-4o-audio, deal with audio input and output directly.
To activate the microphone recording function of the script, set
command line option -w.
Otherwise, the audio model accepts any compatible audio file (such as
mp3, wav, and opus).
These files can be added to be loaded at the very end of the user prompt
or added with chat command “/audio
path/to/file.mp3”.
To activate the audio synthesis output mode of an audio model, make
sure to set command line option -z!
Option -i generates images according to
text PROMPT. If the first positional argument is an IMAGE file,
then generate variations of it. If the first positional
argument is an IMAGE file and the second a MASK file
(with alpha channel and transparency), and a text PROMPT (required),
then edit the IMAGE according to MASK
and PROMPT. If MASK is not provided, IMAGE must have
transparency.
The size of output images may be set as the first positional parameter in the command line:
gpt-image: “1024x1024” (L, Large, Square), “1536x1024” (X, Landscape), or “1024x1536” (P, Portrait).
dall-e-3: “1024x1024” (L, Large, Square), “1792x1024” (X, Landscape), or “1024x1792” (P, Portrait).
dall-e-2: “256x256” (Small), “512x512” (M, Medium), or “1024x1024” (L, Large).
A parameter “high”, “medium”, “low”, or “auto” may also be appended to the size parameter to set image quality with gpt-image, such as “Xhigh” or “1536x1024high”. Defaults=1024x1024auto.
The parameter “hd” or “standard” may also be set for image quality with dall-e-3.
For dall-e-3, optionally set the generation style as either “natural” or “vivid” as one of the first positional parameters at command line invocation.
Note that the user needs to verify their organisation to use gpt-image models!
See IMAGES section below for more information on inpaint and outpaint.
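Putting the size and quality parameters together (prompts and file names hypothetical; model selection with -m shown for clarity):

```shell
# gpt-image: landscape output at high quality:
chatgpt.sh -i Xhigh 'a watercolour of a lighthouse at dusk'

# dall-e-3: HD square image in the "vivid" style:
chatgpt.sh -i -m dall-e-3 Lhd vivid 'a watercolour of a lighthouse at dusk'

# Variations of an existing square PNG:
chatgpt.sh -i image.png
```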
Given a prompt, the model will return one or more predicted
completions. For example, given a partial input, the language model will
try completing it until probable “<|endoftext|>”, or
other stop sequences (stops may be set with
-s "[stop-seq]").
Restart and start sequences may be
optionally set. Restart and start sequences are not set automatically if
the chat mode of text completions is not activated with
option -c.
Readline is set to work with multiline input and
pasting from the clipboard. Alternatively, set option -u to
enable pressing <CTRL-D> to flush input! Or set
option -U to set cat command as input
prompter.
Bash bracketed paste is enabled, meaning multiline input may be
pasted or typed, even without setting options -uU
(v25.2+).
Language model SKILLS can be activated with specific prompts, see https://platform.openai.com/examples.
Set option -c to start chat mode of text completions. It
keeps a history file, and keeps new questions in context. This works
with a variety of models. Set option -E to exit on
response.
Set the double option -cc to start chat completions
mode. More recent models are also the best option for many non-chat use
cases.
The default chat format is “Q & A”. The restart sequence “\nQ: ” and the start text “\nA:” are injected for the chat bot to work well with text cmpls.
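The injection can be sketched as plain string assembly (a simplification of the script’s actual history handling):

```shell
restart='\nQ: '   # prefixes the user's turn
start='\nA:'      # primes the bot's answer

# Assemble one chat turn as sent to the text-completion endpoint:
transcript=$(printf '%b%s%b' "$restart" 'What is a mutex?' "$start")
printf '%s\n' "$transcript"
```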
In multi-turn interactions, special prefixes allow prompt manipulation:
:PROMPT - Prepends text to the current user prompt before sending.
::PROMPT - Prepends text to the system instruction for the current turn.
::: - Re-injects the original system instruction into the request, useful for reinforcing instructions after long conversations.
Entering exactly triple colons “:::” reinjects a system instruction prompt into the current request. This is useful to reinforce the instruction when the model’s context has been truncated.
The options -bccwz may be combined to have voice
recording input and synthesised voice output, especially nice with chat
modes. When setting flag -w or flag -z, the
first positional parameters are read as STT or TTS arguments. When
setting both flags -wz, add a double hyphen to set first
STT, and then TTS arguments.
Set chat mode, plus voice-in transcription language code and text prompt, and the TTS voice-out option argument:
chatgpt.sh -bccwz en 'transcription prompt' -- nova
To send an image or URL to vision models, either set the image with the “!img” command with one or more file paths / URLs:
chatgpt.sh -cc -m gpt-4-vision-preview '!img path/to/image.jpg'
Alternatively, set the image paths / urls at the end of the text prompt interactively:
chatgpt.sh -cc -m gpt-4-vision-preview
[...]
Q: In this first user prompt, what can you see? https://i.imgur.com/wpXKyRo.jpeg
Make sure file paths containing spaces are backslash-escaped!
The user may add a filepath or URL to the end of the prompt. The file is then read and the text content added to the user prompt. This is a basic text feature that works with any model.
chatgpt.sh -cc
[...]
Q: What is this page: https://example.com
Q: Help me study this paper. ~/Downloads/Prigogine\ Perspective\ on\ Nature.pdf
In the second example, the PDF will be dumped as text.
For PDF text dump support, poppler/abiword is required.
For doc and odt files, LibreOffice is
required. See the Optional Packages section.
Also note that file paths containing white spaces must be backslash-escaped, or the file path must be preceded by a pipe `|’ character.
Multiple images and audio files may be added to the request in this way!
While in chat mode, the following commands can be invoked to change parameters and manage sessions.
Commands are prefixed with “!” or “/” and are usually equivalent. Commands with a trailing colon (:) add their output to the current prompt buffer. Commands marked with ‡ execute as suffix commands; see examples below.

| Misc | Commands | |
|---|---|---|
| -S | [PROMPT] | Overwrite the system prompt. |
| -S: | : [PROMPT] | Prepend to current user prompt. |
| -S:: | :: [PROMPT] | Prepend to system prompt. |
| -S::: | ::: | Reset (inject) system prompt into request. |
| -S. | -. [NAME] | Load and edit custom prompt. |
| -S/ | !awesome [NAME] | Load and edit awesome prompt (english). |
| -S% | !awesome-zh [NAME] | Load and edit awesome prompt (chinese). |
| -Z | !last | Print last raw JSON or the processed text response. |
| !# | !save [PROMPT] | Save current prompt to shell history. ‡ |
| ! | !r, !regen | Regenerate last response. |
| !! | !rr | Regenerate response, edit prompt first. |
| !g: | !!g: [PROMPT] | Ground user prompt with web search results. ‡ |
| !i | !info [REGEX] | Information on model and session settings. |
| !!i | !!info | Monthly usage stats (OpenAI). |
| !j | !jump | Jump to request, append start seq primer (text cmpls). |
| !!j | !!jump | Jump to request, no response priming. |
| !cat | - | Cat prompter as one-shot, <CTRL-D> flush. |
| !cat | !cat: [TXT|URL|PDF] | Cat text, PDF file, or dump URL. |
| !clot | !!clot | Flood the TTY with patterns, as visual separator. |
| !dialog | - | Toggle the “dialog” interface. |
| !img | !media [FILE|URL] | Add image, media, or URL to prompt. |
| !md | !markdown [SOFTW] | Toggle markdown rendering in response. |
| !!md | !!markdown [SOFTW] | Render last response in markdown. |
| !rep | !replay | Replay last TTS audio response. |
| !res | !resubmit | Resubmit last STT recorded audio in cache. |
| !p | !pick [PROMPT] | File picker, appends filepath to user prompt. ‡ |
| !pdf | !pdf: [FILE] | Convert PDF and dump text. |
| !photo | !!photo [INDEX] | Take a photo, optionally set camera index (Termux). ‡ |
| !sh | !shell [CMD] | Run shell command and edit stdout (make request). ‡ |
| !sh: | !shell: [CMD] | Same as !sh and insert stdout into current prompt. |
| !!sh | !!shell [CMD] | Run interactive shell command and return. |
| !time | !date | Add timestamp to the start of user prompt. ‡ |
| !url | !url: [URL] | Dump URL text or YouTube transcript text. |
| Script | Settings and UX | |
|---|---|---|
| !fold | !wrap | Toggle response wrapping. |
| -F | !conf | Runtime configuration form TUI. |
| -g | !stream | Toggle response streaming. |
| -h | !help [REGEX] | Print help or grep help for regex. |
| -l | !models [NAME] | List language models or show model details. |
| -o | !clip | Copy responses to clipboard. |
| -u | !multi | Toggle multiline prompter. <CTRL-D> flush. |
| -uu | !!multi | Multiline, one-shot. <CTRL-D> flush. |
| -U | -UU | Toggle cat prompter or set one-shot. <CTRL-D> flush. |
| -V | !debug | Dump raw request block and confirm. |
| -v | - | Toggle interface verbose modes. |
| -x | !ed | Toggle text editor interface. |
| -xx | !!ed | Single-shot text editor. |
| -y | !tik | Toggle python tiktoken use. |
| !q | !quit | Exit. Bye. |
| Model | Settings | |
|---|---|---|
| !Nill | -Nill | Unset max response tokens (chat cmpls). |
| !NUM | -M [NUM] | Maximum response tokens. |
| !!NUM | -N [NUM] | Model token capacity. |
| -a | !pre [VAL] | Presence penalty. |
| -A | !freq [VAL] | Frequency penalty. |
| -b | !responses [MOD] | Responses API request (experimental). |
| best | !best-of [NUM] | Best-of n results. |
| -j | !seed [NUM] | Seed number (integer). |
| -K | !topk [NUM] | Top_k. |
| -m | !mod [MOD] | Model by name, empty to pick from list. |
| -n | !results [NUM] | Number of results. |
| -p | !topp [VAL] | Top_p. |
| -r | !restart [SEQ] | Restart sequence. |
| -R | !start [SEQ] | Start sequence. |
| -s | !stop [SEQ] | One stop sequence. |
| -t | !temp [VAL] | Temperature. |
| -w | !rec [ARGS] | Toggle voice-in STT. Optionally, set arguments. |
| -z | !tts [ARGS] | Toggle TTS chat mode (speech out). |
| !blk | !block [ARGS] | Set and add custom options to JSON request. |
| !effort | - [MODE] | Reasoning effort: minimal, high, medium, or low (OpenAI). |
| !think | - [NUM] | Thinking budget: tokens (Anthropic). |
| !ka | !keep-alive [NUM] | Set duration of model load in memory (Ollama). |
| !verb | !verbosity [MODE] | Model verbosity level (high, medium, or low). |
| !vision | !audio, !multimodal | Toggle multimodality type. |
| Session | Management | |
|---|---|---|
| -C | - | Continue current history session (see !break). |
| -H | !hist [NUM] | Edit history in editor or print the last n history entries. |
| -P | -HH, !print | Print session history. |
| -L | !log [FILEPATH] | Save to log file. |
| !c | !copy [SRC_HIST] [DEST_HIST] | Copy session from source to destination. |
| !f | !fork [DEST_HIST] | Fork current session and continue from destination. |
| !k | !kill [NUM] | Comment out n last entries in history file. |
| !!k | !!kill [[0]NUM] | Dry-run of command !kill. |
| !s | !session [HIST_NAME] | Change to, search for, or create history file. |
| !!s | !!session [HIST_NAME] | Same as !session, break session. |
| !u | !unkill [NUM] | Uncomment n last entries in history file. |
| !!u | !!unkill [[0]NUM] | Dry-run of command !unkill. |
| !br | !break, !new | Start new session (session break). |
| !ls | !list [GLOB|.|pr|awe] | List history files with “glob” in name; Files: “.”; Prompts: “pr”; Awesome: “awe”. |
| !grep | !sub [REGEX] | Grep sessions and copy session to hist tail. |
| !tmp | !!tmp | Fork session to a temporary cache. |
Examples: “/temp 0.7”, “!mod gpt-5”, “-p 0.2”, “/session HIST_NAME”, “[PROMPT] /pick”, “/sh”, and “Translate this to French /sh”.
Some options can be disabled and excluded from the request by setting “-1” as the argument (bypass with “-1.0”), such as “!presence -1”, “-a -1”, and “-t -1”.
To regenerate response, type in the command
“!regen” or a single exclamation mark or forward slash in
the new empty prompt. In order to edit the prompt before the request,
try “!!” (or “//”).
The “/pick” command opens a file picker (usually a
command-line file manager). The selected file path will be appended to
the current prompt in editing mode.
The “/sh” and “/pick” commands may be run
when typed at the end of the current prompt, such as “[PROMPT]
/sh”, which opens a new shell instance to execute commands
interactively. Shell command or file dumps are appended to the current
prompt.
Any “!CMD” not matching a chat command is executed by
the shell as an alias for “!sh CMD”. Note that this
shortcut only works with operator exclamation mark.
Envar $BLOCK_USR can be set to raw model options in
JSON syntax, according to each API, to be injected in the request block.
Alternatively, run command “!block [ARGS]” during
chat mode.
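For example, the fragment must form valid JSON once wrapped in the request’s outer braces (option names here follow the example above; whether a given name is accepted depends on the API in use):

```shell
# Inject raw options into the JSON request block:
export BLOCK_USR='"seed": 33, "dimensions": 1024'

# Sanity-check: the fragment must be valid JSON once wrapped in braces.
printf '{%s}' "$BLOCK_USR" | python3 -m json.tool >/dev/null && echo 'valid JSON'
```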
A history file can hold a single session or multiple sessions. When
it holds a single session, the name of the history file and the session
are the same. However, in case the user breaks a session, the last one
(the tail session) of that history file is always loaded when the resume
option -C is set.
The script uses a TSV file to record entries, which is kept
at the script cache directory (“~/.cache/chatgptsh/”). The
tail session of the history file can always be read and
resumed.
Run command “/list [glob]” with optional “glob”
to list session / history “tsv” files. When glob is
“.” list all files in the cache directory; when “pr”
list all instruction prompt files; and when “awe” list all
awesome prompts.
A new history file can be created or changed to with command
“/session [HIST_NAME]”, in which
HIST_NAME is the file name or path of a history file.
On invocation, when the first positional argument to the script
follows the syntax “/[HIST_NAME]”, the command
“/session” is assumed (with
options -bccCdPP).
To continue from an old session, type a dot “.” or “/.” as the first positional argument on the command line at invocation.
The above command is a shortcut of “/copy current current”. In fact, there are multiple commands to copy and resume from an older session (the dot means current session): “/copy . .”, “/fork .”, “/sub”, and “/grep [REGEX]”.
From the command line on invocation, simply type “.” as
the first positional argument.
It is possible to copy sessions of a history file to another file
when a second argument is given to the “/copy” command.
Mind that forking a session will change to the destination history file and resume from it as opposed to just copying it.
To edit chat context at run time, the history file may be modified
with the “/hist” command (also good for context
injection).
Delete history entries or comment them out with “#”.
When the argument to option -S starts with a full stop,
such as “-S .my_prompt”, load, search
for, or create my_prompt prompt file. If two full stops are
prepended to the prompt name, load it silently. If a comma is used
instead, such as “-S ,my_prompt”,
edit the prompt file, and then load it.
When the argument to option -S starts with a slash or a percent sign, such as “-S /linux_terminal”, search for an awesome-chatgpt-prompt(-zh) (by Fatih KA and PlexPt).
Set “//” or “%%” to refresh local cache. Use
with davinci and gpt-3.5+ models.
These options also set corresponding history files automatically.
Please note and make sure to backup your important custom prompts!
They are located at “~/.cache/chatgptsh/” with the
extension “.pr”.
An image can be created given a text prompt. A text PROMPT of the desired image(s) is required. The maximum length is 1000 characters.
This script also supports xAI image generation model with invocation
“chatgpt.sh --xai -i -m grok-2-image-1212 "[prompt]"”.
Variations of a given IMAGE can be generated. The IMAGE to use as the basis for the variations must be a valid PNG file, less than 4MB and square.
To edit an IMAGE, a MASK file may be optionally provided. If MASK is not provided, IMAGE must have transparency, which will be used as the mask. A text prompt is required.
If ImageMagick is available, input IMAGE and MASK will be checked and processed to fit dimensions and other requirements.
A transparent colour must be set with
“-@[COLOUR]” to create the mask.
Defaults=black.
By default, the COLOUR must be exact. Use the `fuzz’ option to match colours that are close to the target colour. This can be set with “-@[VALUE%]” as a percentage of the maximum possible intensity, for example “-@10%black”.
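For instance (file name and prompt hypothetical):

```shell
# Pixels within 10% fuzz of black in image.png become the transparent
# mask region for the edit:
chatgpt.sh -i -@10%black image.png 'add a full moon in the night sky'
```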
An alpha channel is generated with ImageMagick from any image with the set transparent colour (defaults to black). In this way, it is easy to make a mask with any black and white image as a template.
In-painting is achieved setting an image with a MASK and a prompt.
Out-painting can also be achieved manually with the aid of this
script. Paint a portion of the outer area of an image with
alpha, or a defined transparent colour which
will be used as the mask, and set the same colour in the script
with option -@. Choose the best result amongst many results
to continue the out-painting process step-wise.
Transcribes audio file or voice record into the set language. Set a two-letter ISO-639-1 language code (en, es, ja, or zh) as the positional argument following the input audio file. A prompt may also be set as last positional parameter to help guide the model. This prompt should match the audio language.
If the last positional argument is “.” or “last” exactly, it will resubmit the last recorded audio input file from cache.
Note that if the audio language is different from the set language code, output will be in the language code (translation).
Translates audio into English. A text to guide the model’s style or continue a previous audio segment may optionally be set as the last positional argument. This prompt should be in English.
Setting the temperature has an effect: the higher it is, the more random the output.
For LocalAI integration, run the script with
option --localai, or set environment
$OPENAI_BASE_URL with the server Base URL.
For Mistral AI set environment variable
$MISTRAL_API_KEY, and run the script with
option --mistral or set $OPENAI_BASE_URL
to “https://api.mistral.ai/”. Prefer setting command line
option --mistral for complete integration.
For Ollama, set option -O (--ollama), and
set $OLLAMA_BASE_URL if the server URL is different
from the defaults.
Note that model management (downloading and setting up) must follow the Ollama project guidelines and own methods.
For Google Gemini, set environment variable
$GOOGLE_API_KEY, and run the script with the command
line option --google.
For Groq, set the environmental variable $GROQ_API_KEY.
Run the script with option --groq. Transcription (Whisper)
endpoint available.
For Anthropic, set envar $ANTHROPIC_API_KEY and run the
script with command line option --anthropic.
For GitHub Models, set $GITHUB_TOKEN and invoke the script with option --github.
For Novita AI integration, set the environment variable
$NOVITA_API_KEY and use the --novita option
(legacy).
Likewise, for xAI Grok, set environment $XAI_API_KEY
with its API key.
And for DeepSeek API, set environment $DEEPSEEK_API_KEY
with its API key.
Run the script with option --xai and also with option -cc (chat completions).
Some models also work with native text completions. For that, set
command-line option -c instead.
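Putting the above together, hypothetical invocations per provider (keys are placeholders):

```shell
export GROQ_API_KEY="your-key"      && chatgpt.sh --groq "Hello"
export ANTHROPIC_API_KEY="your-key" && chatgpt.sh --anthropic "Hello"
export XAI_API_KEY="your-key"       && chatgpt.sh --xai -cc "Hello"
```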
BLOCK_USR
Extra options for the request JSON block (e.g. “"seed": 33, "dimensions": 1024”).
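Since the extra options are injected into the request body, the fragment must form valid JSON once wrapped in the payload’s braces. A quick way to sanity-check a fragment before use, relying on jq (already a script dependency):

```shell
# Hypothetical extra request options
export BLOCK_USR='"seed": 33, "dimensions": 1024'

# Wrap in braces and parse: a syntax error here would also break the request
printf '{ %s }' "$BLOCK_USR" | jq .
```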
Script cache directory base.
Path to the user configuration file.
Defaults="~/.chatgpt.conf"
Path to a history / session TSV file (script-formatted).
Initial instruction message.
Initial instruction or system message in chat mode.
TTS transcription model instruction (gpt-4o-tts models).
LC_ALL
Default instruction language in chat mode.
MOD_CHAT, MOD_IMAGE, MOD_AUDIO,
MOD_SPEECH, MOD_LOCALAI, MOD_OLLAMA,
MOD_MISTRAL, MOD_AUDIO_MISTRAL, MOD_GOOGLE,
MOD_GROQ, MOD_AUDIO_GROQ, MOD_SPEECH_GROQ,
MOD_ANTHROPIC, MOD_GITHUB, MOD_NOVITA,
Set default model for each endpoint / provider.
OPENAI_BASE_URL
Main Base URL setting. Alternatively, provide a URL_PATH parameter with the full URL path to disable endpoint auto-selection.
Base URLs for each service provider: LOCALAI, OLLAMA, MISTRAL, GOOGLE, ANTHROPIC, GROQ, GITHUB, NOVITA, XAI, and DEEPSEEK.
OPENAI_API_KEY
PROVIDER_API_KEY
Keys for OpenAI, Gemini, Mistral, Groq, Anthropic, GitHub Models, Novita, xAI, and DeepSeek APIs.
Output directory for received image and audio.
RESTART
Restart and start sequences. May be set to null.
Restart="\nQ: " Start="\nA:" (chat mode)
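In the user configuration file, the sequences could be customised along these lines (values are illustrative; "\n" is expanded to a newline in restart/start sequences):

```shell
# Illustrative custom restart/start sequences for ~/.chatgpt.conf
RESTART="\nUser: "
START="\nAssistant:"
```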
VISUAL
Text editor for external prompt editing.
Defaults="vim"
Clipboard set command, e.g. “xsel -b”, “pbcopy”.
Audio player command, e.g. “mpv --no-video --vo=null”.
Audio recorder command, e.g. “sox -d”.
To ground a user prompt with search results, run chat command
“/g [prompt]”.
Default search provider is Google. To select a different search
provider, run “//g [prompt]” and choose amongst
Google, DuckDuckGo, or Brave.
Running “//g [prompt]” will always use the in-house
solution instead of any service provider specific web search tool.
A cli-browser is required, such as w3m, elinks, links, or lynx.
Use the in-house solution above, or select models with “search” in the name, such as “gpt-4o-search-preview”.
export BLOCK_USR='"search_parameters": {
"mode": "auto",
"max_search_results": 10
}'
chatgpt.sh --xai -cc -m grok-3-latest
Check more search parameters at the xAI API documentation: https://docs.x.ai/docs/guides/live-search.
export BLOCK_USR='"tools": [{
"type": "web_search_20250305",
"name": "web_search",
"max_uses": 5
}]'
chatgpt.sh --ant -cc -m claude-opus-4-0
Check more web search parameters at Anthropic API docs: https://docs.anthropic.com/en/docs/build-with-claude/extended-thinking.
export BLOCK_CMD='"tools": [ { "google_search": {} } ]'
chatgpt.sh --goo -cc -m gemini-2.5-flash-preview-05-20
Check more web search parameters at Google AI API docs: https://ai.google.dev/gemini-api/docs/grounding?lang=rest.
The colour scheme may be customised. A few themes are available in the template configuration file.
A small colour library is available for the user conf file to personalise the theme colours.
The colour palette is composed of $Red, $Green, $Yellow, $Blue, $Purple, $Cyan, $White, $Inv (invert), and $Nc (reset) variables.
Bold variations are defined as $BRed, $BGreen, etc, and background colours can be set with $On_Yellow, $On_Blue, etc.
Alternatively, raw escaped colour sequences, such as \u001b[0;35m and \u001b[1;36m, may be set.
Theme colours are the named variables Colour1 through
Colour11, and may be set with colour-named variables or raw
escape sequences (these must not change the cursor position).
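A theme override in the user conf file might look like this (the variable pairings are illustrative):

```shell
# Illustrative theme overrides for the user conf file
Colour1=$BRed            # bold red from the colour library
Colour2=$'\u001b[1;36m'  # raw escape sequence: bold cyan
```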
User configuration is stored in ~/.chatgpt.conf. Its location can be set with envar $CHATGPTRC.
The script cache directory is ~/.cache/chatgptsh/ and may contain the following file types:
Backup Recommendation: It is strongly recommended to back up session record files (tsv) and prompt files (pr), as well as the configuration file (chatgpt.conf), to preserve session history, custom prompts, and settings.
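A minimal backup along those lines, copying the cache directory and the user configuration file (the destination path is an assumption):

```shell
# Copy the cache directory (tsv/pr files) and the user config to a backup location
mkdir -p ~/backup
cp -a ~/.cache/chatgptsh ~/.chatgpt.conf ~/backup/
```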
Press <CTRL-X CTRL-E> to edit command line in text editor from readline.
Press <CTRL-J> or <CTRL-V CTRL-J> for newline in readline.
Press <CTRL-L> to redraw readline buffer (user input) on screen.
During cURL requests, press <CTRL-C> once to interrupt the call.
Press <CTRL-\> to exit from the script (send QUIT signal), or “Q” in user confirmation prompts.
Stdin text is appended to any existing command line PROMPT.
Input sequences “\n” and “\t” are treated specially (as escaped newlines and tabs) only in restart, start, and stop sequences!
The moderation endpoint can be accessed by setting the model name to omni-moderation-latest (or text-moderation-latest).
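For example (the prompt text is illustrative):

```shell
# Query the moderation endpoint by setting the moderation model name
chatgpt.sh -m omni-moderation-latest "Text to be classified."
```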
For complete model and settings information, refer to OpenAI API docs at https://platform.openai.com/docs/.
See the online man page and chatgpt.sh usage examples
at: https://gitlab.com/fenixdragao/shellchatgpt.
Bash shell
cURL and JQ
Optional packages for specific features:
Base64 - Image endpoint, vision models
Python - Modules tiktoken, markdown, bs4
ImageMagick/fbida - Image edits and variations
SoX/Arecord/FFmpeg - Record input (STT, Whisper)
mpv/SoX/Vlc/FFplay/afplay - Play TTS output
xdg-open/open/xsel/xclip/pbcopy - Open images, set clipboard
W3M/Lynx/ELinks/Links - Dump URL text
bat/Pygmentize/Glow/mdcat/mdless - Markdown support
termux-api/termux-tools/play-audio - Termux system
poppler/gs/abiword/ebook-convert/LibreOffice - Dump PDF or Doc as text
dialog/kdialog/zenity/osascript/termux-dialog - File picker
yt-dlp - Dump YouTube captions
The script objective is to implement some of the features of OpenAI API version 1. As text is the only universal interface, voice and image features will only be partially supported, and not all endpoints or options will be covered.
This project does not support “Function Calling”, “Structured Outputs”, “Real-Time Conversations”, “Agents/Operators”, “MCP Servers”, nor “video generation / editing” capabilities.
Support for “Responses API” is limited and experimental at this point.
Reasoning (thinking) and answers from certain API services may not have a distinct separation of output due to JSON processing constraints.
The Bash “read” builtin may not correctly display input buffers larger than the TTY screen size during editing. The input buffer itself remains unaffected. Use the text editor interface for editing big prompts.
If readline screws up your current input buffer, try pressing <CTRL-L> to force it to redisplay and refresh the prompt properly on screen.
File paths containing spaces may not work correctly in the chat interface. Make sure to backslash-escape filepaths with white spaces.
Folding the response at white spaces may not work correctly if the terminal tabstop setting has been changed. Reset it with command “tabs -8” or “reset” before starting the script, or set one of these in the script configuration file.
If folding does not work well at all, try exporting envar
$COLUMNS before script execution.
Bash truncates input on “\000” (null).
Garbage in, garbage out. An idiot savant.
The script logic resembles a bowl of spaghetti code after a cat fight.