gamglm is C++ software for fitting a Gamma generalized linear model with a
huge number of binary features, on the order of thousands.
Because the Gamma generalized linear model is not convex in its parameters,
ordinary optimization (such as L-BFGS) tends to get stuck when the number of
features is huge. This software instead employs a simple MCMC algorithm
and an efficient data structure for inference.
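As a rough illustration of the kind of MCMC involved, here is a minimal random-walk Metropolis sketch in C++. It is not gamglm's actual code. It assumes log links (shape a = exp of the sum of wa over active features, scale b likewise with wb), a Gaussian N(0, sigma^2) prior on every weight, and Gaussian proposals with standard deviation eps. The names Example, gamma_logpdf, log_posterior and metropolis_sweep are hypothetical.

```cpp
#include <cassert>
#include <cmath>
#include <random>
#include <vector>

// One training example: target y > 0 and the IDs of its active binary features.
struct Example {
    double y;
    std::vector<int> features;
};

// Log density of Gam(a, b) with shape a and scale b, evaluated at y.
double gamma_logpdf(double y, double a, double b) {
    assert(y > 0 && a > 0 && b > 0);
    return (a - 1.0) * std::log(y) - y / b - a * std::log(b) - std::lgamma(a);
}

// Unnormalized log posterior: Gamma likelihood plus Gaussian prior on weights.
// Assumption: log links, so a and b are exp() of the summed active weights.
double log_posterior(const std::vector<Example>& data,
                     const std::vector<double>& wa,
                     const std::vector<double>& wb, double sigma) {
    double lp = 0.0;
    for (const Example& ex : data) {
        double sa = 0.0, sb = 0.0;
        for (int j : ex.features) { sa += wa[j]; sb += wb[j]; }
        lp += gamma_logpdf(ex.y, std::exp(sa), std::exp(sb));
    }
    for (double w : wa) lp -= w * w / (2.0 * sigma * sigma);
    for (double w : wb) lp -= w * w / (2.0 * sigma * sigma);
    return lp;
}

// One sweep: propose a Gaussian step for each weight in turn and accept it
// with probability min(1, posterior ratio). Returns the number of acceptances.
int metropolis_sweep(const std::vector<Example>& data,
                     std::vector<double>& wa, std::vector<double>& wb,
                     double eps, double sigma, std::mt19937& rng) {
    std::normal_distribution<double> step(0.0, eps);
    std::uniform_real_distribution<double> unif(0.0, 1.0);
    int accepted = 0;
    double current = log_posterior(data, wa, wb, sigma);
    for (std::vector<double>* w : {&wa, &wb}) {
        for (double& wj : *w) {
            const double old = wj;
            wj = old + step(rng);  // random-walk proposal
            const double proposed = log_posterior(data, wa, wb, sigma);
            if (std::log(unif(rng)) < proposed - current) {
                current = proposed;  // accept
                ++accepted;
            } else {
                wj = old;            // reject: restore the old weight
            }
        }
    }
    return accepted;
}
```

This sketch recomputes the full log posterior for every proposal, which is far too slow for thousands of features. A practical implementation would only revisit the examples that contain the feature being updated.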
To build, edit the Makefile as needed and type make.
% gamglm -h
gamglm: Bayesian Gamma generalized linear model.
$Id: gamglm.cpp,v 1.4 2014/10/27 12:01:07 daichi Exp $
usage: gamglm [-I iter] [-e eps] [-s sigma] TRAIN MODEL
Options are:
- -I iter: number of MCMC iterations. (default 1)
- -e eps: standard deviation of Gaussian random walk. (default 0.2)
- -s sigma: standard deviation of L2 regularization of weights. (default 0.1)
When the iterations are finished, there will be model files below:
- model.dic: Dictionary of features. Internally each feature is assigned an integer corresponding to its line number.
- model.a: Regression weights wa of Gamma regression for the shape parameter.
- model.b: Regression weights wb of Gamma regression for the scale parameter.
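The training data has one example per line, in the format:
y feature_1 feature_2 feature_3 .. feature_n
where y is the target variable and the remaining fields are the features of that example. For an example, see test.dat included in the package.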
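To make the line format concrete, the following C++ sketch reads one such line and maps each feature to an integer ID. It is only an illustration, not gamglm's own parser. It assumes fields are whitespace-separated and that IDs are handed out in order of first appearance starting from 0. Dictionary, ParsedLine and parse_line are hypothetical names.

```cpp
#include <cassert>
#include <sstream>
#include <string>
#include <unordered_map>
#include <vector>

// Hypothetical feature dictionary: feature name <-> integer ID.
struct Dictionary {
    std::unordered_map<std::string, int> id_of;
    std::vector<std::string> name_of;  // name_of[id] is the feature name

    // Return the ID of a feature, registering it if it is new.
    int lookup(const std::string& name) {
        assert(!name.empty());
        auto it = id_of.find(name);
        if (it != id_of.end()) return it->second;
        const int id = static_cast<int>(name_of.size());
        id_of.emplace(name, id);
        name_of.push_back(name);
        return id;
    }
};

// One parsed data line: the target y and the IDs of the listed features.
struct ParsedLine {
    double y;
    std::vector<int> features;
};

// Parse "y feature_1 feature_2 .. feature_n".
// Returns false if the line does not start with a number.
bool parse_line(const std::string& line, Dictionary& dic, ParsedLine& out) {
    std::istringstream in(line);
    if (!(in >> out.y)) return false;
    out.features.clear();
    std::string name;
    while (in >> name) out.features.push_back(dic.lookup(name));
    return true;
}
```

Calling parse_line on every line of a file with a shared Dictionary yields a sparse representation in which each example stores only the IDs of the features it contains.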
% gamglm-predict
usage: gamglm-predict TEST MODEL
$Id: gamglm-predict.cpp,v 1.1 2014/10/28 08:39:59 daichi Exp $
TEST is a data file whose format is the same as the training data, but the target variable y is not used and can be any number (such as -1). It will output the prediction of a and b to stdout:
% gamglm-predict test.dat model
-1       0.998879        1.026384
-1       1.108263        0.733772
-1       1.229099        0.723988
-1       1.187355        0.708005
-1       1.290324        0.675810
-1       1.310694        0.556131
         ^-- parameter a ^-- parameter b
You can then use the predicted a and b above as the parameters of the Gam(a,b) distribution.
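As one way to use the predictions, the sketch below computes the mean and variance of Gam(a,b) and draws a sample with the C++ standard library. It treats a as the shape and b as the scale, following the description of model.a and model.b, so the mean is a*b and the variance is a*b^2. GammaSummary, summarize and sample are hypothetical helper names, not part of gamglm.

```cpp
#include <cassert>
#include <random>

// Mean and variance of a Gamma distribution with shape a and scale b.
struct GammaSummary {
    double mean;
    double variance;
};

GammaSummary summarize(double a, double b) {
    assert(a > 0 && b > 0);
    return {a * b, a * b * b};
}

// Draw one value from Gam(a, b).
// std::gamma_distribution takes (shape, scale) in that order.
double sample(double a, double b, std::mt19937& rng) {
    assert(a > 0 && b > 0);
    std::gamma_distribution<double> gam(a, b);
    return gam(rng);
}
```

For the first output row above (a = 0.998879, b = 1.026384), summarize gives a mean of about 1.025.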