NTT Communication Science Laboratories


bsets is a very simple (almost trivial) implementation of Bayesian Sets (Ghahramani and Heller, 2005) in MATLAB.

- bsets-0.1.tar.gz [2.7KB] (2006/2/14)

- s = bsets(X,q,alpha,beta)

computes the score of each entry in X given the query vector q and the hyperparameters alpha and beta.

- s : row vector of scores of the entries
- X : sparse binary matrix of (features * entries)
- q : row vector of indices of the entries (columns of X) that form the query, e.g. [64 318]
- alpha, beta : row vectors of Beta hyperparameters over the features, which can be computed by bsparam().
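As an illustration of what such a score looks like, the following is a minimal sketch of the log score for binary data as given in the paper (Ghahramani and Heller, 2005). It is not the code in bsets.m, and whether bsets() returns exactly this quantity is an assumption. The toy matrix and the names sumx, atilde, btilde, cst and w are hypothetical.

```matlab
% Toy data: 4 features x 5 entries (hypothetical, for illustration only).
X = sparse([1 1 1 0 0;
            1 1 0 0 0;
            0 0 1 1 0;
            0 0 0 1 1]);
q = [1 2];                          % query: entries (columns) 1 and 2

% Hyperparameters as in the paper: alpha = c*m, beta = c*(1-m), with c = 2.
m     = full(sum(X, 2))' / size(X, 2);   % mean of each feature, 1 x features
alpha = 2 * m;
beta  = 2 * (1 - m);

% Posterior Beta parameters after observing the N query entries.
N      = numel(q);
sumx   = full(sum(X(:, q), 2))';    % per-feature counts within the query
atilde = alpha + sumx;
btilde = beta + N - sumx;

% Log score of entry x is cst + w * x: a constant plus a weighted feature sum.
cst = sum(log(alpha + beta) - log(alpha + beta + N) ...
          + log(btilde) - log(beta));
w   = log(atilde) - log(alpha) - log(btilde) + log(beta);

s = full(cst + w * X);              % 1 x entries row vector of log scores
```

Because the score is linear in the features, all entries are scored with a single sparse matrix product, which is what makes the method fast on large sparse data.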

- [alpha,beta] = bsparam(X,c)

computes hyperparameters for Bayesian Sets algorithm as described in the paper.

- X : sparse binary matrix of (features * entries)
- c : concentration parameter (in the paper, c = 2)
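The paper sets alpha = c * m and beta = c * (1 - m), where m is the mean of each feature over all entries. A minimal sketch of that rule follows; it is not the code in bsparam.m, and the toy matrix is hypothetical.

```matlab
% Toy data: 3 features x 4 entries (hypothetical, for illustration only).
X = sparse([1 0 1 0;
            1 1 0 0;
            0 0 0 1]);
c = 2;                                  % concentration parameter, as in the paper

m     = full(sum(X, 2))' / size(X, 2);  % mean of each feature, 1 x features
alpha = c * m;                          % row vector over the features
beta  = c * (1 - m);                    % row vector over the features
```

With this choice alpha + beta equals c for every feature, so c controls how strongly the prior is weighted against the query.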

- bsshow(s,file,[n])

shows top n retrieval results according to the score vector s.

- s : row vector of scores of entries, from bsets().
- file : name of a text file whose k-th line describes the k-th entry of s (for example, a movie title).
- n : number of results to show (optional: default 20)
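The ranking step itself amounts to sorting the scores in descending order and printing the first n. The sketch below assumes that behaviour and keeps the labels in a cell array rather than reading them from a file; the scores and labels are hypothetical, and this is not the code in bsshow.m.

```matlab
s      = [0.3 2.1 -0.5 1.7];                             % hypothetical scores
labels = {'Movie A', 'Movie B', 'Movie C', 'Movie D'};   % hypothetical labels
n      = 2;                                              % number of results

[sorted, idx] = sort(s, 'descend');    % best score first
n = min(n, numel(s));                  % never ask for more than there are
for k = 1:n
  % entry index, score, description
  fprintf('%d\t%.3f\t%s\n', idx(k), sorted(k), labels{idx(k)});
end
```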

- Prepare a sparse binary matrix of the data.

Element (i,j) of this matrix is 1 if entry j has feature i; otherwise it is 0.

Columns of the matrix correspond to the entries to be retrieved; in collaborative filtering for movies, each column represents a movie. Rows of the matrix correspond to (sparse) features; in the movie case, the features of a movie are the users who rated that movie highly.
- The sparse matrix can be prepared as a text file and loaded into MATLAB with the load() and spconvert() functions. See help spconvert and the example below.
- The hyperparameters [alpha,beta] can be computed with the provided bsparam() function. This procedure simply follows the paper, but you can use any prior as long as alpha and beta each have one element per feature.
- bsshow() shows the retrieval result associated with the scores s. This function can be combined with bsets() directly as bsshow(bsets(X,q,alpha,beta),'file',n).
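For the data format, spconvert() expects one row per nonzero element, holding its row index, column index and value. The sketch below builds such a triplet array in memory. The values are hypothetical, and a text file with the same three columns, read with load(), is assumed to give the same array.

```matlab
% Each row is [feature entry 1]: "entry j has feature i".
% Hypothetical triplets; a text file with these three columns, read with
% load(), would yield the same array.
T = [1 1 1;
     2 1 1;
     1 2 1;
     3 3 1];

X = spconvert(T);    % sparse (features x entries) matrix, here 3 x 3
```

The matrix size is taken from the largest indices that appear, so a feature or entry with no nonzero element at the end of the index range is not represented unless it is accounted for separately.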

    % matlab
    >> X = spconvert(load('movielens.dat'));
    >> [alpha,beta] = bsparam(X,2);
    >> s = bsets(X,[64 318],alpha,beta);
    >> bsshow(s,'movie.txt',10);
    318   708.519   Schindler's List (1993)
    64    669.252   Shawshank Redemption, The (1994)
    98    510.812   Silence of the Lambs, The (1991)
    174   455.121   Raiders of the Lost Ark (1981)
    50    451.214   Star Wars (1977)
    357   408.829   One Flew Over the Cuckoo's Nest (1975)
    56    405.752   Pulp Fiction (1994)
    69    398.551   Forrest Gump (1994)
    79    394.869   Fugitive, The (1993)
    172   392.896   Empire Strikes Back, The (1980)
    >>

- Zoubin Ghahramani and Katherine A. Heller, "Bayesian Sets".
*Advances in Neural Information Processing Systems 18* (NIPS 2005), 2005.

- If there are unused features (rows with no 1s), MATLAB may issue a division-by-zero warning. This has no effect on the retrieval results, and the warning can be turned off by following the instructions in the warning message.
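One way to avoid the warning altogether is to drop the unused rows before computing the hyperparameters. The sketch below is a suggestion rather than part of bsets, and it assumes nothing else refers to the original feature (row) numbering. The toy matrix and the name used are hypothetical.

```matlab
% Toy data: 3 features x 3 entries; row 2 is an unused feature (hypothetical).
X = sparse([1 0 1;
            0 0 0;
            0 1 1]);

used = full(sum(X, 2)) > 0;   % true for features that occur in at least one entry
X    = X(used, :);            % keep only those rows; the entries (columns) are unchanged
```

Since only rows are removed, the entry indices used in the query and in the output stay the same.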

daichi <at> cslab.kecl.ntt.co.jp
Last modified: Thu Feb 25 19:08:50 2010