Setting Up the Workloads
Setting Up the Docker Containers for Triton Server and Client
In this step, we will set up Docker containers for the Triton server and client, and configure the example model repository.
- Clone the Server Repository
To begin, clone the Triton server repository, which contains the example model repository:
git clone -b r24.08 https://github.com/triton-inference-server/server.git
cd server/docs/examples
- Fetch the Models
Run the script to fetch the models:
./fetch_models.sh
If the script fails, update the broken download link inside fetch_models.sh and run it again.
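After the script completes, the model repository should follow Triton's standard layout: one directory per model, each containing a config.pbtxt and numbered version subdirectories holding the model files. As a rough sketch (the exact set of example models may vary between releases):

model_repository/
├── densenet_onnx/
│   ├── config.pbtxt
│   ├── densenet_labels.txt
│   └── 1/
│       └── model.onnx
└── ...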
- Run Triton Server in Docker
Now, run the Triton server in a Docker container. The command below assumes you are still in server/docs/examples, so that the example model repository is mounted into the container at /models:
docker run -it --net=host --pid=host --name=triton-server -v ${PWD}/model_repository:/models nvcr.io/nvidia/tritonserver:24.08-py3 tritonserver --model-repository=/models
This will start the Triton server and load the models from the model_repository.
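Without GPU access, the server falls back to CPU execution where the model backends support it. If the host has NVIDIA GPUs and the NVIDIA Container Toolkit installed, you can expose them with Docker's standard --gpus flag, for example:

docker run -it --gpus=all --net=host --pid=host --name=triton-server -v ${PWD}/model_repository:/models nvcr.io/nvidia/tritonserver:24.08-py3 tritonserver --model-repository=/models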
- Verify Triton Server is Running
After starting the server, the terminal log should show a table listing each loaded model with status READY, followed by messages indicating that the HTTP, gRPC, and metrics services have started. The server is now running and ready to serve the models.
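You can also check readiness from the host using Triton's HTTP health endpoint (the server listens for HTTP requests on port 8000 by default); an HTTP 200 response indicates the server and its models are ready to receive inference requests:

curl -v localhost:8000/v2/health/ready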
- Run Triton Client in Docker
To interact with the Triton server, you need the Triton client tools. Pull the Triton SDK Docker image:
docker pull nvcr.io/nvidia/tritonserver:24.08-py3-sdk
Once the image is pulled, run the Triton client in Docker:
docker run -it --net=host --pid=host --name=triton-client nvcr.io/nvidia/tritonserver:24.08-py3-sdk
This opens an interactive shell inside the SDK container, which includes the Triton client libraries and example applications. The client does not need the model repository mounted; it sends requests to the server over the network, and because the container shares the host network it can reach the server at localhost.
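From inside the client container, you can send a test inference request to the server. As a sketch based on the Triton quickstart (the example binary and sample images ship with the SDK image, though their paths may move between releases):

/workspace/install/bin/image_client -m densenet_onnx -c 3 -s INCEPTION /workspace/images/mug.jpg

Here -m selects the model, -c 3 requests the top three classifications, and -s INCEPTION applies Inception-style preprocessing; the client should print the top classes predicted for the sample image.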