[Notes] Understanding ConvNeXt

credit Introduction Hierarchical Transformers (e.g., Swin Transformers[1]) has made Transformers highly competitive as a generic vision backbone and in a wide variety of vision tasks. A new paper from Facebook AI Research — “A ConvNet for the 2020s”[2] — gradually and systematically “modernizes” a standard ResNet[3] toward the design of a vision Transformer. The result is a family of pure ConvNet models dubbed ConvNeXt that compete favorably with Transformers in terms of accuracy and scalability. ...

January 28, 2022 · Ceshine Lee

Use MPIRE to Parallelize PostgreSQL Queries

Photo Credit Introduction Parallel programming is hard, and you probably should not use any low-level API to do it in most cases (I’d argue that Python’s built-in multiprocessing package is low-level). I’ve been using Joblib’s Parallel class for tasks that are embarrassingly parallel and it works wonderfully. However, sometimes the task at hand is not simple enough for the Parallel class (e.g., you need to share something from the main process that is not pickle-able, or you want to maintain states in each child process). I’ve recently found this library — MPIRE (MultiProcessing Is Really Easy) — that significantly mitigates this problem of not having enough flexibility, while still having a high-level and user-friendly API. ...

January 7, 2022 · Ceshine Lee

[Notes] Understanding XCiT - Part 2

Photo Credit In Part 1, we introduced the XCiT architecture and reviewed the implementation of the Cross-Covariance Attention(XCA) block. In this Part 2, we’ll review the implementation of the Local Patch Interaction(LPI) block and the Class Attention layer. from [1] Local Patch Interaction(LPI) Because there is no explicit communication between patches(tokens) in XCA, a layer consisting of two depth-wise 3×3 convolutional layers with Batch Normalization with GELU non-linearity is added to enable explicit communication. ...

July 25, 2021 · Ceshine Lee

[Notes] Understanding XCiT - Part 1

credit Overview XCiT: Cross-Covariance Image Transformers[1] is a paper from Facebook AI that proposes a “transposed” version of self-attention that operates across feature channels rather than tokens. This cross-covariance attention has linear complexity in the number of tokens (the original self-attention has quadratic complexity). When used on images as in vision transformers, this linear complexity allows the model to process images of higher resolutions and split the images into smaller patches, which are both shown to improve performance. ...

July 24, 2021 · Ceshine Lee

How to Create a Documentation Website for Your Python Package

Photo Credit Motivation Sphinx is a tool that helps you create intelligent and beautiful documentation. I use it to generate documentation for the pytorch-lightning-spells project and publish it on readthedocs.io for free (if the project is open-source). Documentation is tremendously helpful to users of your project (including yourselves). As long as you maintain the good habit of writing docstrings in your code, Sphinx will convert the docstrings into webpages for you, drastically reducing the manual labor required from you. ...

June 13, 2021 · Ceshine Lee