Paper-Reading

Articulated Pose Estimation by a Graphical Model with Image Dependent Pairwise Relations

tags: Deep Learning, Computer Vision, Pose Estimation, NIPS 2014

project homepage: http://www.stat.ucla.edu/~xianjie.chen/projects/pose_estimation/pose_estimation.html

Paper

Graphical Model

####Score Function####


Implementation


demo.m

conf is a structure holding the global configuration. conf.pa gives the index of the parent of each joint, and p_no is the number of parts (joints). The main part of demo.m is shown below.

% read data
[pos_train, pos_val, pos_test, neg_train, neg_val, tsize] = LSP_data();
% train dcnn
train_dcnn(pos_train, pos_val, neg_train, tsize, caffe_solver_file);
% train graphical model
model = train_model(note, pos_val, neg_val, tsize);
% testing
boxes = test_model([note,'_LSP'], model, pos_test);
% ... (the elided code derives ests from boxes)
% evaluation
show_eval(pos_test, ests, conf, eval_method);
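
The parent index conf.pa mentioned above encodes the tree of the graphical model: every joint except the root points at its parent, so the p_no joints are linked by p_no - 1 edges. The following Python sketch (not the repository's MATLAB code) shows how the edges and the neighbor sets can be read off such an array. The 5-joint tree and the helper names `edges` and `neighbors` are hypothetical and chosen only for illustration; they are not the actual LSP configuration.

```python
# Hypothetical 5-joint tree, NOT the actual LSP configuration.
# Joints are numbered from 1 as in MATLAB; 0 marks the root's missing parent.
pa = [0, 1, 2, 2, 1]   # pa[i - 1] is the parent of joint i
p_no = len(pa)         # number of parts (joints)


def edges(pa):
    """Return the (parent, child) pairs of the tree encoded by pa."""
    return [(p, child) for child, p in enumerate(pa, start=1) if p != 0]


def neighbors(pa):
    """Map each joint to the joints it shares an edge with."""
    nbrs = {i: [] for i in range(1, len(pa) + 1)}
    for p, c in edges(pa):
        nbrs[p].append(c)
        nbrs[c].append(p)
    return nbrs


tree_edges = edges(pa)
tree_nbrs = neighbors(pa)
```

The neighbor sets are what the pairwise terms of the model range over, which is why a single parent array is enough to describe the whole graph.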

Read data : LSP_data.m

Some variables and constants:

trainval_frs_pos = 1:1000;      % training and validation frames for positive
test_frs_pos = 1001:2000;       % testing frames for positive
trainval_frs_neg = 615:1832;    % training and validation frames for negative (of size 1218)
frs_pos = cat(2, trainval_frs_pos, test_frs_pos); % all frames for positive
% all_pos : num(pos)*1 struct array for positive
%           struct fields: im, joints, r_degree, isflip
% neg     : num(neg)*1 struct array for negative
pos_trainval = all_pos(1 : numel(trainval_frs_pos));  % training and validation image struct for positive
pos_test = all_pos(numel(trainval_frs_pos)+1 : end);  % testing image struct for positive
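
As a quick check of the index arithmetic above, here is a small Python sketch (not the repository's MATLAB code) of the same split. It assumes that all_pos holds exactly one record per positive frame, in frame order; the dictionary records are stand-ins for the real structs.

```python
# Frame ranges copied from the listing (MATLAB ranges are inclusive).
trainval_frs_pos = list(range(1, 1001))     # 1:1000
test_frs_pos = list(range(1001, 2001))      # 1001:2000
trainval_frs_neg = list(range(615, 1833))   # 615:1832

frs_pos = trainval_frs_pos + test_frs_pos   # cat(2, ...)

# Stand-in for all_pos (assumption: one record per positive frame; the real
# struct also holds im, joints, r_degree and isflip).
all_pos = [{"frame": f} for f in frs_pos]

n_trainval = len(trainval_frs_pos)
pos_trainval = all_pos[:n_trainval]   # all_pos(1 : numel(trainval_frs_pos))
pos_test = all_pos[n_trainval:]       # all_pos(numel(trainval_frs_pos)+1 : end)
```

The split relies only on the order of all_pos, so it is correct only as long as the records are stored in the same order as frs_pos.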

Data preparation:


Train DCNN : train_dcnn.m

Some variables and constants:

mean_pixel = [128, 128, 128];           % mean pixel value, one entry per color channel
K = conf.K;                             % K = T_{ij}, the number of K-means clusters (types)

Prepare patches : prepare_patches.m

Prepare the image patches and derive their labels for training the DCNN.

K-means : cluster the relative positions to get \(r_{ij}\), \(t_{ij}\) and the label set \(\cup_{c = 0}^{K}\{c\}\times (\times_{j \in \mathbb{N}(i)} \{1, 2, \ldots, T_{ij}\})\)

% generate the labels
clusters = learn_clusters(pos_train, pos_val, tsize);
label_train = derive_labels('train', clusters, pos_train, tsize);
label_val = derive_labels('val', clusters, pos_val, tsize);

% labels for negative (dummy, with empty fields)
dummy_label = struct('mix_id', cell(numel(neg_train), 1), ...
    'global_id', cell(numel(neg_train), 1));

% all the training data (positive followed by negative)
train_imdata = cat(1, num2cell(pos_train), num2cell(neg_train));
train_labels = cat(1, num2cell(label_train), num2cell(dummy_label));

% randomly permute images and labels with the same index, then store them in LMDB format
perm_idx = randperm(numel(train_imdata));
train_imdata = train_imdata(perm_idx);
train_labels = train_labels(perm_idx);
if ~exist([cachedir, 'LMDB_train'], 'dir')
    store_patch(train_imdata, train_labels, psize, [cachedir, 'LMDB_train']);
end

% validation data for positive
val_imdata = num2cell(pos_val);
val_labels = num2cell(label_val);
if ~exist([cachedir, 'LMDB_val'], 'dir')
    store_patch(val_imdata, val_labels, psize, [cachedir, 'LMDB_val']);
end

Learn clusters : learn_clusters (calls cluster_rp to cluster the relative positions)
Derive labels
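
The clustering step can be sketched as follows. This is a minimal Python illustration (not the repository's learn_clusters or cluster_rp), reading \(r_{ij}\) as the cluster centers of the relative positions between neighboring joints and \(t_{ij}\) as the index of the cluster a sample falls into. The function name `kmeans`, the toy offsets and the choice of the first k points as initial centers are assumptions made for the example.

```python
def kmeans(points, k, iters=50):
    """Plain Lloyd's algorithm on 2-D points; returns (centers, assignments).

    Simplification for this sketch: the first k points serve as the initial
    centers, so the result is deterministic.
    """
    centers = list(points[:k])
    assign = [0] * len(points)
    for _ in range(iters):
        # Assignment step: nearest center by squared Euclidean distance.
        assign = [
            min(range(k), key=lambda c: (p[0] - centers[c][0]) ** 2
                                        + (p[1] - centers[c][1]) ** 2)
            for p in points
        ]
        # Update step: move each center to the mean of its members.
        new_centers = []
        for c in range(k):
            members = [p for p, a in zip(points, assign) if a == c]
            if members:
                new_centers.append((sum(m[0] for m in members) / len(members),
                                    sum(m[1] for m in members) / len(members)))
            else:
                new_centers.append(centers[c])  # keep the center of an empty cluster
        if new_centers == centers:
            break  # converged: assignments match the current centers
        centers = new_centers
    return centers, assign


# Toy data: (dx, dy) offsets of a neighbor joint j relative to joint i.
rel_pos = [(10.0, 1.0), (11.0, -1.0), (9.0, 0.0),
           (-10.0, 2.0), (-9.0, -1.0), (-11.0, 0.0)]
centers, types = kmeans(rel_pos, k=2)
```

Here `centers` plays the role of \(r_{ij}\) (one mean offset per type) and `types` gives a \(t_{ij}\) for every training sample, which is the information the labels are derived from.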

Train dcnn

Train the DCNN by calling Caffe through a system call:

system([caffe_root, '/build/tools/caffe train ', sprintf('-gpu %d -solver %s', ...
    conf.device_id, caffe_solver_file)]);
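
For comparison, the same command can be assembled as an argument list, which avoids quoting problems when a path contains spaces. This is a Python sketch rather than part of the repository; the helper name `caffe_train_command` and the example path and solver file are hypothetical, and only the `train`, `-gpu` and `-solver` arguments are taken from the call above.

```python
import shlex


def caffe_train_command(caffe_root, device_id, solver_file):
    """Build the argument list for `caffe train` with the flags used in the notes."""
    return [
        caffe_root + "/build/tools/caffe", "train",
        "-gpu", str(device_id),
        "-solver", solver_file,
    ]


# Hypothetical values, for illustration only.
cmd = caffe_train_command("/path/to/caffe", 0, "solver.prototxt")
printable = shlex.join(cmd)  # the command as a single shell-style string
# To launch the training run: subprocess.run(cmd, check=True)
```

Passing `check=True` to `subprocess.run` makes a failed training run raise an error instead of being silently ignored.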

Get fully-convolutional net : net_surgery.m