vLLM: A high-throughput and memory-efficient inference and serving engine for LLMs
Open deep learning compiler stack for CPUs, GPUs, and specialized accelerators
Next generation LAPACK implementation for ROCm platform
PygmalionAI's large-scale inference engine
HPC solver for nonlinear optimization problems
The PennyLane-Lightning plugin provides a fast state-vector simulator written in C++ for use with PennyLane
hipBLASLt is a library that provides general matrix-matrix operations with a flexible API and extends functionalities beyond a traditional BLAS library
AMD ROCm Performance Primitives (RPP) library is a comprehensive high-performance computer vision library for AMD processors with HIP/OpenCL/CPU back-ends.