Archive
Tags
Search
Consulting
Subscribe
Veritable Tech Blog
Technical Notes from a Data Geek.
[Notes] Gradient Checkpointing with BERT
A brief analysis of huggingface's implementation
[Paper] Adafactor: Adaptive Learning Rates with Sublinear Memory Cost
Essential for fine-tuning T5 v1.1 and mT5 models
Mistake I Made that Crippled My Streamlit App
Not properly caching slows down the app and increases memory consumption
A Case Study of fastcore @patch_to
Trying out SnapMix with minimal changes to the codebase
[Paper] Rethinking Cooperative Rationalization: Introspective Extraction and Complement Control
Building competitive self-explaining NLP models
←
Newer Posts
Older Posts
→