Quantile Regression — Part 2

We discussed what quantile regression is and how it works in Part 1. In Part 2 we’re going to explore how to train quantile regression models with deep learning models and gradient boosting trees. Source code: the code for this post is provided in this repository: ceshine/quantile-regression-tensorflow. It is a fork of strongio/quantile-regression-tensorflow, with the following modifications: it uses the example dataset from the scikit-learn example; the TensorFlow implementation is mostly the same as in strongio/quantile-regression-tensorflow. ...
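
The key ingredient in both cases is the quantile (pinball) loss. Below is a minimal sketch in TensorFlow, which may differ in detail from the code in the repository:

```python
import tensorflow as tf

def pinball_loss(y_true, y_pred, tau):
    # Quantile (pinball) loss for quantile level tau: under-predictions are
    # weighted by tau and over-predictions by (1 - tau), so minimizing it
    # pushes y_pred toward the tau-th conditional quantile.
    error = y_true - y_pred
    return tf.reduce_mean(tf.maximum(tau * error, (tau - 1.0) * error))

# Train one model (or one output head) per quantile, e.g. tau = 0.05, 0.5, 0.95.
# Gradient boosting trees support the same loss directly, e.g. scikit-learn's
# GradientBoostingRegressor(loss="quantile", alpha=0.95).
```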

July 16, 2018 · Ceshine Lee

Quantile Regression — Part 1

I’m starting to think a prediction interval[1] should be a required output of every real-world regression model. You need to know the uncertainty behind each point estimate; otherwise the predictions are often not actionable. For example, suppose the historical sales of an item under a certain circumstance are (10000, 10, 50, 100). The standard least squares method gives you an estimate of 2540 (the mean). If you restock based on that prediction, you’re likely going to significantly overstock 75% of the time. The prediction is almost useless. But if you estimate the quantiles of the data distribution, the estimated 5th, 50th, and 95th percentiles are 16, 75, and 8515, which are much more informative than the single estimate of 2540. This is exactly the idea behind quantile regression. ...
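
As a quick sanity check, the numbers in this example can be reproduced with NumPy (percentiles computed with the default linear interpolation):

```python
import numpy as np

sales = np.array([10000, 10, 50, 100])

# Least squares with only an intercept fits the mean.
print(sales.mean())                       # 2540.0
# The 5th, 50th, and 95th percentiles quoted above.
print(np.percentile(sales, [5, 50, 95]))  # [  16.   75. 8515.]
```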

July 12, 2018 · Ceshine Lee

Bayesian Logistic Regression using PyMC3

I’ve been reading the amazing (free) book Bayesian Methods for Hackers. I was halfway through in early 2015, but dropped it because of some nuisances. When I finally restarted reading it, I found it might have been a good thing that I stopped for a while: now I have more appreciation for Bayesian methods and enough mathematical understanding to fully grasp the ideas the book is trying to convey. (To be honest, I was quite confused by some concepts, like MAP, in the first round of reading.) ...
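
For context, a minimal Bayesian logistic regression in PyMC3 looks roughly like the sketch below; the toy data and priors are illustrative, not taken from the book:

```python
import numpy as np
import pymc3 as pm

# Illustrative toy data: one feature, binary outcome.
rng = np.random.default_rng(0)
x = rng.normal(size=100)
y = (rng.random(100) < 1 / (1 + np.exp(-(0.5 + 2.0 * x)))).astype(int)

with pm.Model():
    alpha = pm.Normal("alpha", mu=0.0, sigma=10.0)  # weakly informative priors
    beta = pm.Normal("beta", mu=0.0, sigma=10.0)
    p = pm.math.sigmoid(alpha + beta * x)
    pm.Bernoulli("obs", p=p, observed=y)

    map_estimate = pm.find_MAP()        # the MAP point estimate mentioned above
    trace = pm.sample(1000, tune=1000)  # full posterior samples via MCMC
```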

July 11, 2015 · Ceshine Lee

Implement FTRL-Proximal Algorithm in Go - Part 2

I actually finished the concurrent version of the algorithm a while ago, right after the previous post, but unfortunately my laptop broke and took almost a month to repair. Now I finally get to publish the result here. I know the code is neither elegant nor properly documented, but it’s a start. You’ll need to set the core variable in the main function to the number of cores of your CPU. The program will train that many models simultaneously and predict the average of the predictions from each model. ...
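
The scheme is essentially a bagging-style ensemble. A rough sketch of the same idea in Python (the actual implementation in the post is in Go; the training function here is a placeholder):

```python
from multiprocessing import Pool

import numpy as np

def train_and_predict(seed):
    # Placeholder: train one model with its own seed/shuffle of the data
    # and return its predictions on the test set.
    rng = np.random.default_rng(seed)
    return rng.random(5)  # stands in for per-example probabilities

if __name__ == "__main__":
    cores = 4  # set to the number of CPU cores, as the post describes
    with Pool(cores) as pool:
        preds = pool.map(train_and_predict, range(cores))
    # The final prediction is the average over the independently trained models.
    print(np.mean(preds, axis=0))
```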

January 2, 2015 · Ceshine Lee

Implement FTRL-Proximal Algorithm in Go - Part 1

For practice, I’ve rewritten tinrtgu’s solution to the Avazu challenge on Kaggle in Go. I’ve made some changes to save memory, but the underlying algorithm is basically the same. (See the paper the algorithm came from for more information.) The Go code has been put on a GitHub Gist. Constructive comments are welcome on that gist page, as I haven’t added a comment section to this blog. (I haven’t even set up Google Analytics, so I have no idea how many people are reading this blog.) I’m also working on a concurrent version that utilizes Go’s built-in support for concurrency, so theoretically it should run faster in a multi-core environment. ...
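
For reference, the per-coordinate update at the heart of the algorithm looks roughly like the Python sketch below (hyperparameter names follow the FTRL-Proximal paper; this is not the Go code itself):

```python
from math import exp, sqrt

class FTRLProximal:
    # alpha, beta: per-coordinate learning-rate parameters;
    # l1, l2: regularization strengths; dim: size of the hashed feature space.
    def __init__(self, alpha=0.1, beta=1.0, l1=1.0, l2=1.0, dim=2 ** 20):
        self.alpha, self.beta, self.l1, self.l2 = alpha, beta, l1, l2
        self.z = [0.0] * dim  # z_i: accumulated gradients, adjusted for weight moves
        self.n = [0.0] * dim  # n_i: accumulated squared gradients

    def _weight(self, i):
        z, n = self.z[i], self.n[i]
        if abs(z) <= self.l1:
            return 0.0  # the L1 term keeps most weights at exactly zero
        sign = 1.0 if z > 0 else -1.0
        return -(z - sign * self.l1) / ((self.beta + sqrt(n)) / self.alpha + self.l2)

    def predict(self, features):
        # features: hashed indices of binary features (implicit value 1).
        wtx = sum(self._weight(i) for i in features)
        return 1.0 / (1.0 + exp(-max(min(wtx, 35.0), -35.0)))

    def update(self, features, p, y):
        g = p - y  # gradient of the log loss w.r.t. the score, for binary features
        for i in features:
            sigma = (sqrt(self.n[i] + g * g) - sqrt(self.n[i])) / self.alpha
            self.z[i] += g - sigma * self._weight(i)
            self.n[i] += g * g
```

Training is a single online pass over the data: for each example, call p = model.predict(x) and then model.update(x, p, y).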

December 9, 2014 · Ceshine Lee