Change Sources of Ubuntu in a Docker image

The official Docker images of Ubuntu use archive.ubuntu.com as the default package source. Because my Internet connection is metered, I’d like to switch to the free mirror server my ISP provides. This command does the trick:

    sed -i 's/http:\/\/archive.ubuntu.com/http:\/\/mirror.internode.on.net\/pub\/ubuntu/g' /etc/apt/sources.list

Replace http://mirror.internode.on.net/pub/ubuntu with the URL of whatever mirror you prefer. To use the command in a Dockerfile, add RUN to the start of the line.
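Before baking the substitution into an image, it can help to preview what it does. The sketch below runs the same replacement against a throwaway file instead of /etc/apt/sources.list. The two sample lines (including the "trusty" release name) are assumptions for illustration, not copied from a real image, and `|` is used as the sed delimiter only to avoid escaping the slashes.

```shell
# Hypothetical sample of a sources.list; the real file in your image may differ.
tmpfile="$(mktemp)"
cat > "$tmpfile" <<'EOF'
deb http://archive.ubuntu.com/ubuntu/ trusty main restricted
deb-src http://archive.ubuntu.com/ubuntu/ trusty main restricted
EOF

# Same substitution as the command above, but without -i, so the result is
# printed instead of being written back to the file.
result="$(sed 's|http://archive.ubuntu.com|http://mirror.internode.on.net/pub/ubuntu|g' "$tmpfile")"

echo "$result"
rm -f "$tmpfile"
```

Note that only the host part is replaced, so the original /ubuntu/ path is kept after the mirror prefix. Check that the resulting URL matches the directory layout of the mirror you choose.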

July 7, 2015 · Ceshine Lee

Docker: Remove All Untagged Images

By courtesy of this post, its comment section and this thread:

Clean up old containers:

    docker ps -a | grep 'Exited' | awk '{print $1}' | xargs --no-run-if-empty docker rm

Remove All Untagged Images:

    docker rmi $(docker images -q --filter "dangling=true")
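To see what the container clean-up pipeline selects before anything is deleted, the grep and awk stages can be tried on canned text. The listing below is a made-up stand-in for `docker ps -a` output (the IDs, images and names are hypothetical), so the sketch runs without Docker installed.

```shell
# Hypothetical `docker ps -a` output; real output will have different rows.
listing="$(cat <<'EOF'
CONTAINER ID   IMAGE          COMMAND          CREATED       STATUS                     NAMES
3f2a1b4c5d6e   ubuntu:14.04   "/bin/bash"      2 hours ago   Exited (0) 2 hours ago     happy_turing
9a8b7c6d5e4f   ubuntu:14.04   "/bin/bash"      3 hours ago   Up 3 hours                 busy_hopper
1c2d3e4f5a6b   redis          "redis-server"   5 hours ago   Exited (137) 4 hours ago   sad_morse
EOF
)"

# Keep only rows whose text contains "Exited", then print the first column,
# which holds the container ID. These IDs are what xargs would pass to docker rm.
ids="$(printf '%s\n' "$listing" | grep 'Exited' | awk '{print $1}')"

echo "$ids"
```

On a real host, running the pipeline up to the awk stage (without the xargs part) gives the same kind of preview.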

July 5, 2015 · Ceshine Lee

Random Sampling at the Command Line

When you receive a large dataset to analyze, you’ll probably want to take a look at the data before fitting any models to it. But what if the dataset is too big to fit in memory? One way to deal with this is to take much smaller random samples from the dataset, which gives you a rough idea of what’s going on. (Of course, you cannot get the global maximum or minimum through this approach, but statistics of that kind can easily be obtained in linear time with minimal memory requirements.) ...
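One possible way to draw such a sample is sketched below. It is not necessarily the method the full post uses. It keeps each line independently with a fixed probability using awk, so only one line is held in memory at a time. The input file, the probability `p` and the `seed` value are all assumptions chosen for illustration.

```shell
# Hypothetical input: 1000 numbered lines standing in for a large dataset.
datafile="$(mktemp)"
seq 1 1000 > "$datafile"

# Keep each line with probability p. The file is streamed line by line,
# and the fixed seed makes the run repeatable on the same awk implementation.
sample="$(awk -v p=0.1 -v seed=42 'BEGIN { srand(seed) } rand() < p' "$datafile")"

echo "$sample"
rm -f "$datafile"
```

The sample size is itself random (about p times the number of lines on average), and a header line, if the file has one, is sampled like any other line, so it may need to be handled separately.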

January 22, 2015 · Ceshine Lee

Implement FTRL-Proximal Algorithm in Go - Part 2

I actually finished the concurrent version of the algorithm a while ago, right after the previous post. Unfortunately, my laptop broke and took almost a month to repair. Now I finally get to publish the result here. I know that the code is neither elegant nor properly documented, but it’s a start. You’ll need to set the core variable in the main function to the number of cores of your CPU. The program trains that many models simultaneously and predicts the average of the predictions from each model. ...

January 2, 2015 · Ceshine Lee

Implement FTRL-Proximal Algorithm in Go - Part 1

For the sake of practice, I’ve rewritten tinrtgu’s solution to the Avazu challenge on Kaggle in Go. I’ve made some changes to save more memory, but the underlying algorithm is basically the same. (See this paper, where the algorithm comes from, for more information.) The Go code is on GitHub Gist. Constructive comments are welcome on that gist page, as I haven’t added a comment section to this blog. (I haven’t even set up Google Analytics, so I have no idea how many people are reading this blog.) I’m also working on a concurrent version that uses Go’s built-in support for concurrency, so in theory it should run faster in a multi-core environment. ...

December 9, 2014 · Ceshine Lee