Running Llama.cpp Agents on AMD RX 9060 XT with Docker and gVisor

Hardware Setup

Software Stack

llama.cpp

Docker with gVisor

Docker Configuration

Base Image

FROM gcr.io/go-containerregistry/docker:27.3.1-dind

GPU Passthrough

--gpus all

Memory Limits

--memory="32g"
--memory-swap="32g"

Volume Mounts

-v /path/to/models:/models:ro
-v /path/to/agents:/agents

Model Loading

Supported Quantizations

Model Size Considerations

Agent Architecture

Container Structure

agents/
├── agent1/
│   ├── Dockerfile
│   └── main.py
├── agent2/
│   ├── Dockerfile
│   └── main.py
└── shared/
    └── models/

Communication

Performance Considerations

VRAM Management

Inference Speed

Security Notes

Troubleshooting

Common Issues

AMD GPU Specific

Future Improvements