Kong API Gateway with Cudo Compute

How to add authentication and SSL/TLS to your AI API

Kong API Gateway is a scalable platform designed for managing, securing, and orchestrating APIs and microservices. Built on top of NGINX, it provides high performance and flexibility, handling API traffic with low latency. Kong offers a wide range of features including load balancing, rate limiting, authentication, logging, and monitoring, making it a comprehensive solution for API management.

In this guide we will use Kong API Gateway to wrap an existing AI API with an HTTPS connection and key-based authentication. If you run a web application on another cloud but wish to use Cudo Compute for deploying AI such as LLMs, this tutorial shows you how to create a secure connection between the clouds.


Set up a GPU VM on Cudo Compute:

  • Create a project and add an SSH key
  • Optionally, download the CLI tool
  • Choose a VM with an NVIDIA GPU and configure it
  • Use the Ubuntu 22.04 + NVIDIA drivers + Docker image (in the CLI tool, pass -image ubuntu-2204-nvidia-535-docker-v20240214)
  • Start a VM with one or more GPUs

Start AI API

We will create a Docker network and run a Docker container with Ollama to serve LLMs. Then we will run a second Docker container with Kong API Gateway that connects to Ollama. Kong runs here in DB-less mode, so it simply requires a YAML configuration file.

SSH onto your Cudo GPU VM and create a Docker network:

docker network create kong-net

Serve the Ollama API for LLMs. You can run whichever service you like; just make sure to run it on the kong-net network, and make a note of the container name and port. There is no need to publish the port to the host, because Kong reaches Ollama over the Docker network:

sudo docker run --gpus=all --network=kong-net -d --name ollama ollama/ollama

name: ollama, port: 11434
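With the container up, a model must be pulled before Ollama can answer prompts (for example docker exec ollama ollama pull llama3, where llama3 is an illustrative model name). As a sketch of what clients will later send through the gateway, this is the JSON body Ollama's /api/generate endpoint expects:

```python
import json

# Request body for Ollama's /api/generate endpoint.
# "llama3" is an illustrative model name; pull it first with `ollama pull`.
payload = {
    "model": "llama3",
    "prompt": "Why is the sky blue?",
    "stream": False,  # return one JSON object instead of a token stream
}
print(json.dumps(payload))
```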

Make SSL Keys

On the Cudo VM, create a self-signed SSL certificate, replacing CUDO-IP-ADDRESS with the Cudo VM's IP address:

mkdir kong
cd kong
openssl req -x509 -newkey rsa:4096 -keyout kong.key -out kong.crt -sha256 -days 3650 -nodes -subj '/CN=CUDO-IP-ADDRESS'
chmod 744 kong.key
chmod 744 kong.crt
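It is worth sanity-checking the certificate before wiring it into Kong: openssl x509 prints a certificate's subject and validity window. The sketch below generates a throwaway certificate in /tmp so it is self-contained (the CN 192.0.2.1 is a placeholder); on the VM you would point the second command at kong.crt instead:

```shell
# Generate a throwaway self-signed cert, then print its subject and validity.
openssl req -x509 -newkey rsa:2048 -keyout /tmp/demo.key -out /tmp/demo.crt \
  -sha256 -days 3650 -nodes -subj '/CN=192.0.2.1' 2>/dev/null
openssl x509 -in /tmp/demo.crt -noout -subject -dates
```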

Make a YAML file

This YAML file configures Kong to connect to the Ollama Docker container. If you are using another service, change the name and the url (http://ollama:11434) to match your container's name and port. The key-auth Kong plugin adds key-based authentication: swap my-key for your own secure key, and change the /ollama path to whatever path you want to expose. Save the file as kong.yaml inside the kong directory created above.


_format_version: '3.0'
_transform: true

services:
  - name: ollama
    url: http://ollama:11434
    routes:
      - name: ollama-route
        paths:
          - /ollama
    plugins:
      - name: key-auth

consumers:
  - username: kong-user
    keyauth_credentials:
      - key: my-key
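Kong plugins can be stacked on the same service. For example, the bundled rate-limiting plugin could sit next to key-auth in the plugins list; a sketch with illustrative limits (60 requests per minute per consumer, counted locally on this node):

```yaml
    plugins:
      - name: key-auth
      - name: rate-limiting
        config:
          minute: 60
          policy: local
```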

Run the Kong Docker container

From inside the kong directory (so the volume mount picks up kong.yaml and the certificates), run a detached Docker container with Kong. By default, Kong proxies HTTP on port 8000 and HTTPS on port 8443:

docker run -d --name kong-dbless \
 --network=kong-net \
 -v "$(pwd):/kong/" \
 -e "KONG_DATABASE=off" \
 -e "KONG_DECLARATIVE_CONFIG=/kong/kong.yaml" \
 -e "KONG_SSL=on" \
 -e "KONG_SSL_CERT=/kong/kong.crt" \
 -e "KONG_SSL_CERT_KEY=/kong/kong.key" \
 -e "KONG_PROXY_ACCESS_LOG=/dev/stdout" \
 -e "KONG_ADMIN_ACCESS_LOG=/dev/stdout" \
 -e "KONG_PROXY_ERROR_LOG=/dev/stderr" \
 -e "KONG_ADMIN_ERROR_LOG=/dev/stderr" \
 -p 8000:8000 \
 -p 8443:8443 \
 kong


Testing on VM

SSH onto the Cudo VM and run:

curl --header "apikey: my-key" -v http://localhost:8000/ollama

Swap /ollama for the path defined in the YAML file. You should see the expected output from your API; Ollama responds to its root path with "Ollama is running". A request without the apikey header should be rejected with 401 Unauthorized.

Testing remotely

To test that port 8443 is open and serving HTTPS, run the following from your local machine (--insecure skips certificate verification, which is needed for now because the certificate is self-signed):

curl --insecure --header "apikey: my-key" -v https://CUDO-IP-ADDRESS:8443/ollama

Testing with SSL and Python

As the certificate is self-signed, we need to copy it to our local machine and use it to verify the connection in our request.

scp root@CUDO-IP-ADDRESS:/root/kong/kong.crt .

Then, in Python:

import requests

r = requests.get('https://CUDO-IP-ADDRESS:8443/ollama', headers={'apikey': 'my-key'}, verify='kong.crt')
print(r, r.text)
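The same authenticated call can also be built with only the standard library. The sketch below constructs, but does not send, a POST to Ollama's /api/generate endpoint through Kong (CUDO-IP-ADDRESS and llama3 are placeholders; uncomment the last lines to actually send it with the copied certificate):

```python
import json
import urllib.request

# Authenticated POST through the Kong proxy; CUDO-IP-ADDRESS is a placeholder.
# Kong strips the /ollama route prefix, so this reaches Ollama's /api/generate.
url = "https://CUDO-IP-ADDRESS:8443/ollama/api/generate"
payload = {"model": "llama3", "prompt": "Hello", "stream": False}
req = urllib.request.Request(
    url,
    data=json.dumps(payload).encode(),
    headers={"apikey": "my-key", "Content-Type": "application/json"},
    method="POST",
)

# To send it, verify Kong's self-signed cert with the copy fetched via scp:
# import ssl
# ctx = ssl.create_default_context(cafile="kong.crt")
# resp = urllib.request.urlopen(req, context=ctx)

# urllib normalizes header names, so "apikey" is stored as "Apikey".
print(req.get_header("Apikey"))
```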