tags: Deep Learning, Computer Vision, Pose Estimation, NIPS 2014

project homepage: http://www.stat.ucla.edu/~xianjie.chen/projects/pose_estimation/pose_estimation.html
\(K = |\mathcal{V}|\), so the model is simply regarded as a \(K\)-node tree.
#### Score Function
IDPR (Image-Dependent Pairwise Relations) term: \(R(l_i, l_j, t_{ij}, t_{ji} | I)\)
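For reference, the full score function has the following form (reconstructed here from the NIPS 2014 paper, so double-check the exact notation against the paper itself):

```latex
% Score over joint locations l = {l_i} and pairwise type variables t = {t_ij}
F(l, t \mid I) = \sum_{i \in \mathcal{V}} U(l_i \mid I)
  + \sum_{(i,j) \in \mathcal{E}} R(l_i, l_j, t_{ij}, t_{ji} \mid I)

% Unary (appearance) term: DCNN evidence for part i at location l_i
U(l_i \mid I) = w_i \, \phi(i \mid l_i; I)

% IDPR term: type-gated deformation cost plus image-dependent type evidence
R(l_i, l_j, t_{ij}, t_{ji} \mid I)
  = \langle w_{ij}^{t_{ij}}, \psi(l_j - l_i - r_{ij}^{t_{ij}}) \rangle
  + \bar{w}_{ij} \, \phi(t_{ij} \mid l_i; I)
  + \langle w_{ji}^{t_{ji}}, \psi(l_i - l_j - r_{ji}^{t_{ji}}) \rangle
  + \bar{w}_{ji} \, \phi(t_{ji} \mid l_j; I)

% with quadratic deformation features
\psi(\Delta) = \left[\Delta_x \;\; \Delta_x^2 \;\; \Delta_y \;\; \Delta_y^2\right]^{\top}
```

Here \(r_{ij}^{t_{ij}}\) is the mean relative position of mixture type \(t_{ij}\) (the cluster centers learned below), and both \(\phi\) terms come from the shared DCNN.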
conf is a struct holding the global configuration: conf.pa gives the index of each joint's parent, and p_no is the number of parts (joints).
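This parent-array encoding of the kinematic tree is easy to illustrate; a Python sketch of how the neighbor lists used later (cf. get_IDs) fall out of it (the 5-joint tree and the function name here are made up for illustration, not the LSP skeleton):

```python
def neighbors_from_parents(pa):
    """pa[i] is the 1-indexed parent of joint i+1; 0 marks the root.
    Returns, for each joint, the list of its neighbors (parent + children)."""
    p_no = len(pa)
    nbh = [[] for _ in range(p_no)]
    for i, p in enumerate(pa):
        if p > 0:
            nbh[i].append(p)          # the parent is a neighbor of joint i+1
            nbh[p - 1].append(i + 1)  # joint i+1 is a child (neighbor) of its parent
    return nbh

# toy tree: joint 1 is the root, 2 and 3 hang off 1, 4 off 2, 5 off 3
pa = [0, 1, 1, 2, 3]
print(neighbors_from_parents(pa))  # [[2, 3], [1, 4], [1, 5], [2], [3]]
```

Leaf joints have a single neighbor, which is why the number of pairwise mixture variables per joint varies in the model.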
The main part of this function is shown below.
% read data
[pos_train, pos_val, pos_test, neg_train, neg_val, tsize] = LSP_data();
% train dcnn
train_dcnn(pos_train, pos_val, neg_train, tsize, caffe_solver_file);
% train graphical model
model = train_model(note, pos_val, neg_val, tsize);
% testing
boxes = test_model([note,'_LSP'], model, pos_test);
% ...
% evaluation
show_eval(pos_test, ests, conf, eval_method);
#### LSP_data.m

Some variables and constants:
trainval_frs_pos = 1:1000;    % training frames for positive
test_frs_pos = 1001:2000;     % testing frames for positive
trainval_frs_neg = 615:1832;  % training frames for negative (of size 1218)
frs_pos = cat(2, trainval_frs_pos, test_frs_pos);  % all frames for positive
all_pos  % num(pos)*1 struct array for positive
         % fields: im, joints, r_degree, isflip
neg      % num(neg)*1 struct array for negative
pos_trainval = all_pos(1 : numel(trainval_frs_pos));  % training and validation image structs for positive
pos_test = all_pos(numel(trainval_frs_pos)+1 : end);  % testing image structs for positive
Data preparation:
- `lsp_pc2oc` (`function joints = lsp_pc2oc(joints)`) : convert the person-centric (pc) joint annotations to observer-centric (oc)
- `pos_trainval(ii).joints = Trans * pos_trainval(ii).joints;` : create the ground-truth joints for model training; the original 14 joint positions are augmented with midpoints between joints, defining 26 joints in total
- `add_flip` : flip the trainval images horizontally (#pos_trainval *= 2)
- `init_scale` : initialize dataset-specific parameters
- `add_rotate` : rotate the trainval images in $9^{\circ}$ steps (#pos_trainval *= 40)
- `val_id = randperm(numel(pos_trainval), 2000);` : split the positive data into training and validation sets (randomly choose 2000 images from pos_trainval as the validation set, so #training = #pos_trainval - 2000 = 78000)
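The bookkeeping above is easy to verify; a few lines of Python with the LSP numbers from this section:

```python
# Positive-data augmentation arithmetic for the LSP trainval split.
n_pos_trainval = 1000   # trainval_frs_pos = 1:1000
n_pos_trainval *= 2     # add_flip: one horizontal flip per image
n_pos_trainval *= 40    # add_rotate: one copy per 9-degree rotation (360 / 9 = 40)

n_val = 2000            # randperm(numel(pos_trainval), 2000)
n_train = n_pos_trainval - n_val

print(n_pos_trainval, n_train)  # 80000 78000
```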
- `val_id = randperm(numel(neg), 500);` : split the negative data into training and validation sets (randomly choose 500 images from neg as the validation set, so #neg_train = #neg - #neg_val = 1218 - 500 = 718)
- `add_flip` : flip the negative data (#neg_val *= 2; #neg_train *= 2)

#### train_dcnn.m

Some variables and constants:
mean_pixel = [128, 128, 128];  % per-channel mean pixel value, subtracted from inputs
K = conf.K;                    % K = T_{ij}, the number of pairwise mixture types
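A Python/NumPy sketch of what the mean-pixel subtraction does to an input patch (the patch size here is arbitrary, and in the actual pipeline the subtraction happens inside Caffe's data layer rather than in user code):

```python
import numpy as np

MEAN_PIXEL = np.array([128.0, 128.0, 128.0], dtype=np.float32)  # as in the code above

def preprocess(patch):
    """Center an HxWx3 uint8 patch around zero by subtracting the mean pixel."""
    return patch.astype(np.float32) - MEAN_PIXEL

# a flat gray patch maps to all zeros after centering
patch = np.full((36, 36, 3), 128, dtype=np.uint8)
out = preprocess(patch)
print(out.min(), out.max())  # 0.0 0.0
```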
#### prepare_patches.m

Prepare the patches and derive their labels to train the dcnn:
% generate the labels
clusters = learn_clusters(pos_train, pos_val, tsize);
label_train = derive_labels('train', clusters, pos_train, tsize);
label_val = derive_labels('val', clusters, pos_val, tsize);
% labels for negative (dummy)
dummy_label = struct('mix_id', cell(numel(neg_train), 1), ...
    'global_id', cell(numel(neg_train), 1));
% all the training data
train_imdata = cat(1, num2cell(pos_train), num2cell(neg_train));
train_labels = cat(1, num2cell(label_train), num2cell(dummy_label));
% randomly permute the data and store it in LMDB format
perm_idx = randperm(numel(train_imdata));
train_imdata = train_imdata(perm_idx);
train_labels = train_labels(perm_idx);
if ~exist([cachedir, 'LMDB_train'], 'dir')
    store_patch(train_imdata, train_labels, psize, [cachedir, 'LMDB_train']);
end
% validation data for positive
val_imdata = num2cell(pos_val);
val_labels = num2cell(label_val);
if ~exist([cachedir, 'LMDB_val'], 'dir')
    store_patch(val_imdata, val_labels, psize, [cachedir, 'LMDB_val']);
end
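Note that the same `perm_idx` indexes both `train_imdata` and `train_labels`, so each patch stays paired with its label after shuffling. A Python sketch of this idea (toy strings stand in for the image and label structs):

```python
import random

def joint_shuffle(imdata, labels, seed=0):
    """Shuffle two parallel lists with one shared permutation so each
    item keeps its own label (mirrors the shared randperm indexing)."""
    assert len(imdata) == len(labels)
    perm = list(range(len(imdata)))
    random.Random(seed).shuffle(perm)
    return [imdata[i] for i in perm], [labels[i] for i in perm]

ims = ['im0', 'im1', 'im2', 'im3']
labs = ['lab0', 'lab1', 'lab2', 'lab3']
ims_s, labs_s = joint_shuffle(ims, labs)
# pairs stay aligned after shuffling
assert all(i[-1] == l[-1] for i, l in zip(ims_s, labs_s))
```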
learn_clusters (calls cluster_rp to cluster the relative positions):

- `nbh_IDs = get_IDs(pa, K);` : get the neighbors of each part (joint)
- `clusters{ii}` : cell holding the mean relative positions of the ii-th part
- `X(ii,:) = norm_rp(imdata(ii), cur, nbh, tsize);` : relative position for the ii-th data item
- `mean_X = mean(X(valid_idx,:),1);` and `normX = bsxfun(@minus, X(valid_idx,:), mean_X);` : center (normalize) the relative positions
- run R trials of the k-means algorithm and choose the one with the smallest total distance:
  `[gInd{trial}, cen{trial}, sumdist(trial)] = k_means(normX, K);`
- compute the imgid (all the images belonging to cluster k) of `clusters{cur}{n}(k)`, i.e. the k-th cluster of the n-th neighbor of the cur-th joint

Finally, caffe is invoked via a system call to train the dcnn:
system([caffe_root, '/build/tools/caffe train ', sprintf('-gpu %d -solver %s', ...
conf.device_id, caffe_solver_file)]);
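The best-of-R k-means strategy described above (run R trials, keep the one with the smallest total distance) can be sketched in Python; this is a minimal Lloyd's-algorithm implementation on toy 2-D data, not the repo's MATLAB `k_means` routine:

```python
import numpy as np

def kmeans_once(X, K, rng, iters=50):
    """One k-means run: random initial centers drawn from X, Lloyd iterations."""
    centers = X[rng.choice(len(X), K, replace=False)]
    for _ in range(iters):
        d = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(-1)  # squared distances
        labels = d.argmin(1)
        for k in range(K):
            if (labels == k).any():
                centers[k] = X[labels == k].mean(0)
    sumdist = ((X - centers[labels]) ** 2).sum()
    return labels, centers, sumdist

def kmeans_best_of(X, K, R=5, seed=0):
    """R restarts; keep the run with the smallest total within-cluster distance,
    as in the cluster_rp trial loop."""
    rng = np.random.default_rng(seed)
    runs = [kmeans_once(X, K, rng) for _ in range(R)]
    return min(runs, key=lambda r: r[2])

# two well-separated blobs of "relative positions"
rng = np.random.default_rng(1)
X = np.vstack([rng.normal(0, 0.1, (20, 2)), rng.normal(5, 0.1, (20, 2))])
labels, centers, sumdist = kmeans_best_of(X, K=2)
print(sorted(np.round(centers[:, 0]).astype(int).tolist()))  # [0, 5]
```

Restarts matter because a single k-means run can converge to a poor local optimum depending on the initial centers; keeping the minimum-distance run is a cheap way to hedge against that.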
#### net_surgery.m