TensorFlow Profiler with Custom Training Loop

Introduction: The TensorFlow Profiler in the upcoming TensorFlow 2.2 release is a much-welcomed addition to the ecosystem. For image-related tasks, the bottleneck is often the input pipeline, but you also don’t want to spend time optimizing the input pipeline unless it is necessary. The TensorFlow Profiler makes pinpointing the bottleneck of the training process much easier, so you can decide where to focus the optimization effort. ...

April 24, 2020 · Ceshine Lee

Monitor Python Script Cron Jobs using Telegram

Motivation: Apache Airflow is great for managing scheduled workflows, but in many cases it is overkill and adds unnecessary complexity to the overall solution. Cron jobs are much easier to set up, have built-in support in most systems, and have a very gentle learning curve. However, the lack of monitoring features and the resulting silent failures can be the bane of system admins’ lives. We want a simple solution that helps admins monitor the health of cron jobs in simple scenarios that do not warrant Airflow. These simple scenarios have the following characteristics: ...

April 10, 2020 · Ceshine Lee

Clutter-free Interactive Charts in R using Plotly

This is a short post describing how to use Plotly to make text-heavy charts cleaner in R. Introduction: David Robinson presented a beautiful way to visualize the ratings of The Office episodes in this screencast. The chart (shown below) is sufficiently readable when zoomed in on a full HD monitor, but is quite messy when exported to a smaller frame. Moreover, some of the episode names are not displayed (to avoid overlapping). ...

March 31, 2020 · Ceshine Lee

TensorFlow 2.1 with TPU in Practice

Executive Summary: TensorFlow has become much easier to use: as an experienced PyTorch developer who only knew a bit of TensorFlow 1.x, I was able to pick up TensorFlow 2.x in my spare time in 60 days and do competitive machine learning. TPU has never been more accessible: the new interface to TPU in TensorFlow 2.1 works right out of the box in most cases and greatly reduces the development time required to make a model TPU-compatible. Using TPU drastically increases the iteration speed of experiments. We present a case study of solving a Q&A labeling problem by fine-tuning the RoBERTa-base model from the huggingface/transformers library. The case study includes the codebase, a Colab TPU training notebook, a Kaggle inference kernel, and TF-HelperBot, a high-level library that provides more flexibility than the Keras interface. (TensorFlow 2.1 and TPU are also a very good fit for CV applications. A case study of solving an image classification problem will be published in about a month.) Acknowledgment: I was granted free access to Cloud TPUs for 60 days via TensorFlow Research Cloud. The access was granted for the TensorFlow 2.0 Question Answering competition. I chose to do the simpler Google QUEST Q&A Labeling competition first but unfortunately couldn’t find enough time to go back and do the original one (sorry!). ...

February 13, 2020 · Ceshine Lee

Create a Customized Text Annotation Tool in Two Days - Part 2

Introduction: In Part 1 of this series, we discussed why building your own annotation tool can be a good idea and demonstrated a back-end API server based on FastAPI. In this Part 2, we’re going to build a front-end interface that interacts with the end user (the annotator). The front-end needs to do three main things: (1) fetch a batch of sentence/paragraph pairs to be annotated from the back-end server; (2) present the pairs to the annotator and provide a way for them to adjust the automatically generated labels; (3) send the annotated results to the back-end server. Disclaimer: I’m relatively inexperienced in front-end development. The code here may seem extremely amateurish to professionals. However, I hope this post can serve as a reference or starting point for those with similar requirements. ...

December 17, 2019 · Ceshine Lee