A100 Agent Lab is an experimental platform for building persistent coding-agent systems on an NVIDIA A100 80GB PCIe server.
This project is not an LLM runtime. It is the application and orchestration layer that sits on top of existing runtimes such as:
- SGLang
- LMDeploy
- HuggingFace Transformers
The goal is to provide stable Python abstractions for runtime management, persistent conversations, generation metrics, structured logging, and future agent capabilities such as tools, filesystem access, and Git integration.
Current runtime roles:
- Primary runtime: SGLang
- Secondary runtime: LMDeploy
- Reference runtime: HuggingFace Transformers
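
The split between the orchestration layer and the runtimes beneath it can be pictured with a small adapter sketch. This is only an illustration under stated assumptions: `RuntimeAdapter`, `EchoAdapter`, and `Orchestrator` are hypothetical names, not the project's actual classes, and the stand-in backend just echoes words so the example runs without a GPU or a model.

```python
from dataclasses import dataclass
from typing import Protocol


class RuntimeAdapter(Protocol):
    """Hypothetical adapter interface; not the project's actual class."""

    name: str

    def generate(self, prompt: str, max_tokens: int) -> str: ...


@dataclass
class EchoAdapter:
    """Stand-in backend so the sketch runs without a GPU or a model."""

    name: str

    def generate(self, prompt: str, max_tokens: int) -> str:
        # A real adapter would call SGLang, LMDeploy, or Transformers here.
        words = prompt.split()[:max_tokens]
        return f"[{self.name}] " + " ".join(words)


class Orchestrator:
    """Hypothetical facade: callers never see which runtime is active."""

    def __init__(self, adapters: dict[str, RuntimeAdapter], role: str = "primary") -> None:
        if role not in adapters:
            raise KeyError(f"no adapter registered for role {role!r}")
        self._adapter = adapters[role]

    def generate(self, prompt: str, max_tokens: int = 64) -> str:
        # The caller's code is identical whichever role was selected.
        return self._adapter.generate(prompt, max_tokens)


# Map each role listed above to an adapter (all stand-ins here).
adapters = {
    "primary": EchoAdapter("sglang"),
    "secondary": EchoAdapter("lmdeploy"),
    "reference": EchoAdapter("transformers"),
}
orchestrator = Orchestrator(adapters, role="reference")
print(orchestrator.generate("Explain pointers in C.", max_tokens=2))
```

Switching `role` changes which backend answers without touching the calling code, which is the property a runtime-independent public API is meant to preserve.
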
The platform currently includes runtime adapters for all three backends while
keeping the public AgentLab API runtime-independent.

    from a100_agent_lab import AgentLab

    lab = AgentLab.from_config("config/transformers-a100.yaml")
    lab.start()
    session = lab.create_session(system_prompt="You are a concise coding agent.")
    result = lab.generate(session, "Explain pointers in C.", max_tokens=64)
    print(result.text)
    print(result.metrics)
    lab.shutdown()

The higher-level Agent abstraction owns one persistent session and exposes a simple conversational API:

    from a100_agent_lab import AgentLab

    lab = AgentLab.from_config("config/sglang-a100.yaml")
    lab.start()
    lab.warmup()
    agent = lab.create_agent(system_prompt="You are a concise coding assistant.")
    reply = agent.ask("Explain pointer arithmetic.")
    print(reply.text)
    print(agent.statistics())
    lab.shutdown()

Multiple independent sessions can be managed by id:

    first = lab.create_session(system_prompt="You are session one.")
    second = lab.create_session(system_prompt="You are session two.")
    lab.generate(first, "Explain malloc.")
    lab.generate(second, "Explain fork.")
    assert first.id != second.id
    assert first.turn_count == 1
    assert second.turn_count == 1
    same_first = lab.get_session(first.id)
    all_sessions = lab.list_sessions()
    lab.reset_session(first.id)
    lab.delete_session(second.id)

Run the Transformers smoke test:

    cd /home/federico.molara/a100-agent-lab
    /home/federico.molara/venv/a100-runtime/bin/python scripts/smoke_transformers.py

Generation events are written as JSONL under experiments/logs/.

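
Because JSONL stores one JSON value per line, the logs can be inspected with the standard library alone. The sketch below is a minimal reader, assuming the files use a `.jsonl` extension, sit directly in the log directory, and hold one JSON object per line. The function name `iter_generation_events` is hypothetical, and no particular field names are assumed because the event schema is not described here.

```python
import json
from pathlib import Path
from typing import Iterator


def iter_generation_events(log_dir: str = "experiments/logs") -> Iterator[dict]:
    """Yield one parsed record per non-empty line of every .jsonl file in log_dir."""
    root = Path(log_dir)
    if not root.is_dir():
        # Nothing has been logged yet, or the path is wrong: yield nothing.
        return
    for path in sorted(root.glob("*.jsonl")):  # assumption: .jsonl extension
        with path.open(encoding="utf-8") as handle:
            for line in handle:
                line = line.strip()
                if line:  # tolerate blank lines between records
                    yield json.loads(line)


# Print the keys of each event without assuming a schema.
for event in iter_generation_events():
    print(sorted(event))
```

Reading lazily with a generator keeps memory use flat even when a log file grows large.
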
Lightweight tests do not load the 27B model and do not require a GPU:

    cd /home/federico.molara/a100-agent-lab
    /home/federico.molara/venv/a100-runtime/bin/python -m pytest

Runtime integration tests are opt-in because they start real runtimes and may load the model:

    A100_AGENT_LAB_RUN_INTEGRATION=1 \
    /home/federico.molara/venv/a100-runtime/bin/python -m pytest -m integration
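
An opt-in gate of this kind is commonly implemented in a `conftest.py` collection hook. The following is a sketch of one way to do it, not necessarily how this repository does. It assumes the flag must equal `"1"` and that integration tests carry the `integration` marker. The helper name `integration_enabled` is hypothetical.

```python
import os
from typing import Mapping

ENV_FLAG = "A100_AGENT_LAB_RUN_INTEGRATION"


def integration_enabled(environ: Mapping[str, str] = os.environ) -> bool:
    """Return True only when the opt-in flag is set to "1" (assumed convention)."""
    return environ.get(ENV_FLAG) == "1"


def pytest_collection_modifyitems(config, items):
    """Skip tests marked `integration` unless the opt-in flag is set."""
    if integration_enabled():
        return
    import pytest  # imported here so the helper above has no pytest dependency

    skip = pytest.mark.skip(reason=f"set {ENV_FLAG}=1 to run integration tests")
    for item in items:
        if "integration" in item.keywords:
            item.add_marker(skip)
```

With a hook like this, `pytest -m integration` without the flag reports the tests as skipped instead of starting a runtime. Registering the `integration` marker in the pytest configuration avoids unknown-marker warnings.
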