# attention-mechanism

Here are 1,500 public repositories matching this topic...

swarms

This project uses PyTorch to classify bone fractures. In addition to fine-tuning well-known CNN architectures (VGG-19, MobileNetV3, RegNet, among others), we designed our own architecture and also applied Transformer architectures such as Vision Transformer and Swin Transformer. The dataset is Bone Fracture Multi-Region X-ray, available on Kaggle.

  • Updated Jun 12, 2024
  • Jupyter Notebook

RWKV is an RNN with transformer-level LLM performance. It can be trained directly like a GPT (parallelizable), combining the best of RNNs and transformers: great performance, fast inference, low VRAM usage, fast training, "infinite" ctx_len, and free sentence embeddings.

  • Updated Jun 11, 2024
  • Python
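The "RNN that trains like a GPT" claim rests on a recurrence that can also be written as an attention-like weighted sum. Below is a deliberately simplified, single-channel sketch of a WKV-style average with a scalar exponential decay `w`; real RWKV uses learned per-channel decays plus a bonus term for the current token, both omitted here. The point of the sketch is that the O(T²) direct sum and the O(T) recurrent state update compute the same thing.

```python
import math

def wkv_direct(w, k, v):
    # Direct O(T^2) form: each output is a decay-weighted softmax-style
    # average of all values up to time t.
    out = []
    for t in range(len(k)):
        num = sum(math.exp(-(t - i) * w + k[i]) * v[i] for i in range(t + 1))
        den = sum(math.exp(-(t - i) * w + k[i]) for i in range(t + 1))
        out.append(num / den)
    return out

def wkv_recurrent(w, k, v):
    # O(T) recurrence: carry a numerator/denominator state and decay
    # both by e^{-w} at every step — this is the RNN-style inference path.
    out, num, den = [], 0.0, 0.0
    for kt, vt in zip(k, v):
        num = num * math.exp(-w) + math.exp(kt) * vt
        den = den * math.exp(-w) + math.exp(kt)
        out.append(num / den)
    return out
```

Because the two forms agree, training can use the parallel-friendly direct form while inference uses the constant-memory recurrence.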

Researching causal relationships in time series data using Temporal Convolutional Networks (TCNs) combined with attention mechanisms. This approach aims to identify complex temporal interactions. Additionally, we're incorporating uncertainty quantification to enhance the reliability of our causal predictions.

  • Updated Jun 10, 2024
  • Jupyter Notebook
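The TCN side of this approach is built from causal (optionally dilated) 1-D convolutions, where the output at time t depends only on inputs at or before t — the property that lets learned interactions be read as temporally directed. A minimal dependency-free sketch with hand-set weights (the function name and scalar-series framing are illustrative):

```python
def causal_conv1d(x, weights, dilation=1):
    # y[t] mixes x[t], x[t - d], x[t - 2d], ... with implicit left
    # zero-padding, so no output ever depends on a future input.
    out = []
    for t in range(len(x)):
        s = 0.0
        for j, w in enumerate(weights):
            idx = t - j * dilation
            if idx >= 0:
                s += w * x[idx]
        out.append(s)
    return out
```

Stacking such layers with dilations 1, 2, 4, ... grows the receptive field exponentially while keeping every path strictly backward in time.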

QuillGPT is a PyTorch implementation of the GPT decoder block, following the architecture from the "Attention Is All You Need" paper by Vaswani et al. The repository also includes two pre-trained models (Shakespearean GPT and Harpoon GPT), a Streamlit playground, a containerized FastAPI microservice, and training and inference scripts and notebooks.

  • Updated Jun 7, 2024
  • Jupyter Notebook
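The core of the decoder block described here is scaled dot-product attention with a causal mask, per the Attention Is All You Need formulation: softmax over QKᵀ/√d, with each position attending only to itself and earlier positions. A dependency-free sketch over lists of vectors (names are illustrative; a real implementation would be batched and tensorized):

```python
import math

def causal_attention(q, k, v):
    # q, k, v: lists of equal-length vectors, one per position.
    d = len(q[0])
    out = []
    for t in range(len(q)):
        # Scores against positions 0..t only — the causal mask.
        scores = [sum(qi * ki for qi, ki in zip(q[t], k[s])) / math.sqrt(d)
                  for s in range(t + 1)]
        # Numerically stable softmax over the unmasked scores.
        m = max(scores)
        exps = [math.exp(s - m) for s in scores]
        z = sum(exps)
        weights = [e / z for e in exps]
        # Output is the attention-weighted mix of values.
        out.append([sum(w * v[s][j] for s, w in enumerate(weights))
                    for j in range(len(v[0]))])
    return out
```

At t = 0 the mask leaves a single weight of 1, so the first output is exactly v[0] — a quick sanity check that no future token leaks in.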
