Docker-Based PrivateGPT

I came across an interesting repo this past week.

https://github.com/imartinez/privateGPT

This is a project that claims to allow you to run your own self-hosted alternative to ChatGPT.

The internet has been on fire for months with LLM hype, so I decided to spend some time playing around with OpenAI’s GPT-4. After playing around a bit, I became interested in finding out what it would take to run something like that locally. As soon as I came across the privateGPT repo I thought, “Hmm, can this run in Docker?” Yes! Yes it can!

This is the Dockerfile I came up with:

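FROM python:3

# Copy the language model first so this large layer is cached across rebuilds
COPY ./ggml-gpt4all-j-v1.3-groovy.bin /chat_gpt/model/ggml-gpt4all-j-v1.3-groovy.bin

WORKDIR /chat_gpt

# Fetch the privateGPT source
RUN git clone https://github.com/imartinez/privateGPT

WORKDIR /chat_gpt/privateGPT

# Install the Python dependencies
RUN pip3 install -r requirements.txt

# Settings that privateGPT reads from the environment
ENV MODEL_TYPE=GPT4All
ENV PERSIST_DIRECTORY=/chat_gpt/data
ENV MODEL_PATH=/chat_gpt/model/ggml-gpt4all-j-v1.3-groovy.bin
ENV MODEL_N_CTX=120
ENV EMBEDDINGS_MODEL_NAME=all-mpnet-base-v2
ENV TARGET_SOURCE_CHUNKS=4

# Copy your documents into the folder that ingest.py scans
COPY ./source_documents /chat_gpt/privateGPT/source_documents

# Ingest the documents at build time so the container starts ready to answer
RUN python ingest.py

CMD [ "python", "privateGPT.py" ]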

This Dockerfile configures the privateGPT app to use the GPT4All model and ingests the documents in the ./source_documents folder at build time, so they are baked into the image.

You can use the following commands to build and run your privateGPT container.

docker build -t gpt:mine .
docker run --rm -it gpt:mine

This assumes the Dockerfile is in the same directory as your language model file (ggml-gpt4all-j-v1.3-groovy.bin) and that all the documents you want to query are in a subfolder named source_documents.
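
If you would rather catch a missing file before waiting on a build, a small preflight check along these lines can help. This is only a sketch: check_build_context is a made-up helper name, and the file and folder names are simply the ones the Dockerfile above expects.

```shell
#!/bin/sh
# Preflight check for the build directory layout described above.
# check_build_context is a hypothetical helper, not part of Docker or privateGPT.
check_build_context() {
    dir="${1:-.}"
    missing=0

    # The Dockerfile itself and the model file that its first COPY expects.
    for f in Dockerfile ggml-gpt4all-j-v1.3-groovy.bin; do
        if [ ! -f "$dir/$f" ]; then
            echo "missing file: $dir/$f" >&2
            missing=1
        fi
    done

    # The folder of documents that gets copied in before ingest.py runs.
    if [ ! -d "$dir/source_documents" ]; then
        echo "missing directory: $dir/source_documents" >&2
        missing=1
    fi

    # 0 when everything is in place, 1 otherwise.
    return "$missing"
}
```

With the function defined in your shell, something like `check_build_context . && docker build -t gpt:mine .` would only start the build when everything is in place.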

The image will take several minutes to build. When I was playing around, most of the build time came from copying the language model and installing the Python dependencies. Scanning the source documents was pretty quick. Subsequent builds should be faster, as the layers that copy the language model and run pip3 install should be cached.

Overall, I was pretty pleased with how easy it is to set up. I was much more impressed with how well the privateGPT repo was crafted than I was with its performance. I don’t have a very good graphics card, though, and I threw some pretty terrible documents at it.
