GoAI Transcriber
GoAI Transcriber is a tool that uses OpenAI's Whisper model to transcribe audio files. It supports several audio formats; .m4a files are converted to .mp3 before processing.
Features
- Transcribes audio files using OpenAI's Whisper model.
- Supports the following audio formats: .mp3, .mp4, .mpeg, .mpga, .m4a, .wav, .webm.
- Automatically converts .m4a files to .mp3 for processing.
- Provides a REST API for uploading and transcribing audio files.
- Includes Swagger documentation for easy API exploration.
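The format rules above can be sketched as a small extension check. This is only an illustration: the names `supportedExts`, `isSupported`, and `needsConversion` are hypothetical and are not taken from the project's code, and case-insensitive matching is an assumption.

```go
package main

import (
	"fmt"
	"path/filepath"
	"strings"
)

// supportedExts mirrors the formats listed in the Features section.
var supportedExts = map[string]bool{
	".mp3": true, ".mp4": true, ".mpeg": true, ".mpga": true,
	".m4a": true, ".wav": true, ".webm": true,
}

// isSupported reports whether name has one of the listed extensions.
// Matching is case-insensitive (an assumption, not confirmed by the project).
func isSupported(name string) bool {
	return supportedExts[strings.ToLower(filepath.Ext(name))]
}

// needsConversion reports whether the file is an .m4a file, which the
// tool converts to .mp3 before processing.
func needsConversion(name string) bool {
	return strings.ToLower(filepath.Ext(name)) == ".m4a"
}

func main() {
	for _, f := range []string{"talk.m4a", "notes.WAV", "slides.pdf"} {
		fmt.Printf("%s supported=%t convert=%t\n", f, isSupported(f), needsConversion(f))
	}
}
```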
Project Structure
.
├── cmd
│   ├── api
│   │   └── main.go
│   └── app
│       └── main.go
├── deployment
│   ├── Dockerfile
│   ├── Dockerfile.api
│   ├── docker-compose.yml
│   ├── docker-compose.api.yml
│   └── terraform
│       ├── main.tf
│       └── variables.tf
├── internal
│   ├── api
│   │   └── api.go
│   ├── app
│   │   ├── app.go
│   │   └── functions.go
│   ├── controller
│   │   └── transcription.go
│   ├── entity
│   │   └── transcription.go
│   ├── repository
│   │   └── transcription.go
│   └── usecase
│       └── transcription.go
├── pkg
│   └── openai
│       └── api.go
├── docs
│   ├── docs.go
│   ├── swagger.json
│   └── swagger.yaml
├── go.mod
├── go.sum
├── LICENSE
├── Makefile
├── README.md
└── .env
Installation
- Clone the repository:
git clone https://github.com/umarquez/goai_transcriber.git
cd goai_transcriber
- Set up environment variables:
Create a .env file with your OpenAI token and the working directory:
APP_OPENAI_TOKEN=your_openai_token
APP_WORKING_PATH=./audios
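How the application loads these values is not shown here. As a minimal sketch, assuming the two variables have already been exported into the process environment (Go's `os.Getenv` does not read a .env file by itself), they could be read like this. The `config` type, the `loadConfigFrom` helper, and the `./audios` fallback are hypothetical.

```go
package main

import (
	"errors"
	"fmt"
	"os"
)

// config holds the two settings from the .env example.
type config struct {
	OpenAIToken string
	WorkingPath string
}

// loadConfigFrom builds a config using getenv to look up each variable.
// Passing the lookup function in keeps the logic easy to test.
func loadConfigFrom(getenv func(string) string) (config, error) {
	cfg := config{
		OpenAIToken: getenv("APP_OPENAI_TOKEN"),
		WorkingPath: getenv("APP_WORKING_PATH"),
	}
	if cfg.OpenAIToken == "" {
		return config{}, errors.New("APP_OPENAI_TOKEN is not set")
	}
	if cfg.WorkingPath == "" {
		cfg.WorkingPath = "./audios" // assumed default, taken from the example above
	}
	return cfg, nil
}

func main() {
	cfg, err := loadConfigFrom(os.Getenv)
	if err != nil {
		fmt.Println("configuration error:", err)
		return
	}
	fmt.Println("working path:", cfg.WorkingPath)
}
```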
Running the Application
Running the Application Locally
Standard Application
- Build and run the application:
go build -o bin/app cmd/app/main.go
./bin/app
- Transcribe audio files:
- The application reads the contents of the ./audios directory.
- Any .m4a files are first converted to .mp3, because the OpenAI API returns an error when processing .m4a files.
- Each transcription result is written to the same directory with a .txt extension.
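The steps above can be sketched as a small planning function that decides, for each file, what would be uploaded and where the transcript would go. This is an illustration under assumptions, not the project's implementation: the `job` type and `planJob` are hypothetical, the transcript name is assumed to replace the audio extension (`talk.m4a` becomes `talk.txt`), and the actual conversion and transcription calls are left out.

```go
package main

import (
	"fmt"
	"os"
	"path/filepath"
	"strings"
)

// job describes the planned handling of one audio file.
type job struct {
	Source     string // file found in the working directory
	Upload     string // file that would be sent to the API
	Transcript string // where the text result would be written
	Convert    bool   // true when an .m4a to .mp3 conversion is needed first
}

// planJob derives the upload and transcript paths for a file in dir.
func planJob(dir, name string) job {
	src := filepath.Join(dir, name)
	base := strings.TrimSuffix(src, filepath.Ext(src))
	j := job{Source: src, Upload: src, Transcript: base + ".txt"}
	if strings.EqualFold(filepath.Ext(name), ".m4a") {
		j.Convert = true
		j.Upload = base + ".mp3"
	}
	return j
}

func main() {
	dir := "./audios" // value of APP_WORKING_PATH in the example .env
	entries, err := os.ReadDir(dir)
	if err != nil {
		fmt.Println("cannot read working directory:", err)
		return
	}
	for _, e := range entries {
		// Skip subdirectories and transcripts written by earlier runs.
		if e.IsDir() || strings.EqualFold(filepath.Ext(e.Name()), ".txt") {
			continue
		}
		j := planJob(dir, e.Name())
		fmt.Printf("%s -> %s (convert: %t)\n", j.Source, j.Transcript, j.Convert)
	}
}
```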
API Version
- Generate Swagger documentation:
swag init --parseDependency --parseInternal -g cmd/api/main.go -o ./docs
- Build and run the API:
go build -o bin/api cmd/api/main.go
./bin/api
- Transcribe audio files via API:
Use a tool like curl or Postman to send a POST request to the /transcribe endpoint with your audio file.
Example curl command:
curl -X POST "http://localhost:8080/transcribe" -H "accept: application/json" -H "Content-Type: multipart/form-data" -F "file=@path/to/your/audiofile.m4a"
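The same request can be sent from Go. The sketch below mirrors the curl command: a multipart form with a single `file` field posted to `/transcribe` on `localhost:8080`. The helper names `buildUpload` and `transcribe` are hypothetical, and because the response schema is not described here, the body is returned as raw text rather than decoded.

```go
package main

import (
	"bytes"
	"fmt"
	"io"
	"mime/multipart"
	"net/http"
	"os"
	"path/filepath"
)

// buildUpload encodes content as a multipart form with one "file" field,
// matching the -F "file=@..." argument of the curl command.
func buildUpload(filename string, content []byte) (*bytes.Buffer, string, error) {
	body := &bytes.Buffer{}
	w := multipart.NewWriter(body)
	part, err := w.CreateFormFile("file", filename)
	if err != nil {
		return nil, "", err
	}
	if _, err := part.Write(content); err != nil {
		return nil, "", err
	}
	if err := w.Close(); err != nil {
		return nil, "", err
	}
	return body, w.FormDataContentType(), nil
}

// transcribe posts the file at path to baseURL + "/transcribe" and returns
// the raw response body.
func transcribe(baseURL, path string) (string, error) {
	content, err := os.ReadFile(path)
	if err != nil {
		return "", err
	}
	body, contentType, err := buildUpload(filepath.Base(path), content)
	if err != nil {
		return "", err
	}
	req, err := http.NewRequest(http.MethodPost, baseURL+"/transcribe", body)
	if err != nil {
		return "", err
	}
	req.Header.Set("Content-Type", contentType)
	req.Header.Set("Accept", "application/json")
	resp, err := http.DefaultClient.Do(req)
	if err != nil {
		return "", err
	}
	defer resp.Body.Close()
	out, err := io.ReadAll(resp.Body)
	if err != nil {
		return "", err
	}
	if resp.StatusCode < 200 || resp.StatusCode >= 300 {
		return "", fmt.Errorf("unexpected status %s: %s", resp.Status, out)
	}
	return string(out), nil
}

func main() {
	if len(os.Args) < 2 {
		fmt.Println("usage: upload <audio file>")
		return
	}
	text, err := transcribe("http://localhost:8080", os.Args[1])
	if err != nil {
		fmt.Println("transcription failed:", err)
		return
	}
	fmt.Println(text)
}
```

The multipart writer generates the boundary itself, so the Content-Type header is taken from `FormDataContentType` rather than set by hand.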
Running the Application Using Docker
Standard Application
- Build and run the application using Docker:
docker-compose -f deployment/docker-compose.yml up --build
- Transcribe audio files:
Place your audio files in the audios directory and the application will automatically process and transcribe them.
API Version
- Build and run the API using Docker:
docker-compose -f deployment/docker-compose.api.yml up --build
- Transcribe audio files via API:
Use a tool like curl or Postman to send a POST request to the /transcribe endpoint with your audio file.
Example curl command:
curl -X POST "http://localhost:8080/transcribe" -H "accept: application/json" -H "Content-Type: multipart/form-data" -F "file=@path/to/your/audiofile.m4a"
API Documentation
The API is documented with Swagger. Once the API is running, you can access the documentation at:
http://localhost:8080/swagger/index.html
Contributing
Contributions are welcome! Please open an issue or submit a pull request.
License
This project is licensed under the MIT License. See the LICENSE file for details.