Sunday, July 17, 2016

Merging Kinect 3D Point Clouds With 2D/3D Data Fusion


How to merge 3D point clouds is a problem we face every day when dealing with 3D data. One common approach is to use the ICP algorithm to merge multiple point clouds. However, in our own practice, ICP does not give the best results, either because of its inherent limitations (it does not guarantee global optimality) or simply because we are not technically savvy enough to bring out its best side. Instead of ICP, we pursue an alternative solution. As the first step, we identify 2D features in the images. Then, using an optical flow method such as Lucas-Kanade, we align the images. By mapping the 2D features to their correspondences in the 3D point clouds, the alignment information acquired from the image analysis can be used to merge the 3D point clouds. The source code and test data can be downloaded from https://drive.google.com/drive/folders/0B05MVyfGr8lmOXV1dXZ6STZDalk

The first step is to use the OpenCV library to identify feature points in an image. By passing the image through a differential filter in both the horizontal and vertical directions and accumulating products of the resulting derivatives over a small window around each point, a 2x2 second-moment (structure) matrix can be obtained for each point of the image:
\[H(p)= \begin{bmatrix}
\left(\frac{\partial I}{\partial x}\right)^{2} & \frac{\partial I}{\partial x}\frac{\partial I}{\partial y}\\
\frac{\partial I}{\partial x}\frac{\partial I}{\partial y} & \left(\frac{\partial I}{\partial y}\right)^{2} \\
\end{bmatrix}\]
\(\partial I/\partial x\) and \(\partial I/\partial y\) come from passing the image through the differential filter in the horizontal and vertical directions. Various ways exist to identify feature points based on this matrix. Harris' method thresholds the corner response \(\det H - k\,(\operatorname{tr}\,H)^{2}\), where \(k\) is a small constant (typically 0.04). Shi and Tomasi's method compares the smaller of the two eigenvalues of \(H(p)\) to a threshold. The cvGoodFeaturesToTrack() function in OpenCV uses Shi and Tomasi's method; the input "imgA" is the source image. In addition, the cvFindCornerSubPix() function performs sub-pixel interpolation to improve the accuracy of the feature point locations.


    // To find features to track

    IplImage* eig_image = cvCreateImage( img_sz, IPL_DEPTH_32F, 1 );
    IplImage* tmp_image = cvCreateImage( img_sz, IPL_DEPTH_32F, 1 );
    int corner_count = MAX_CORNERS;
    CvPoint2D32f* cornersA = new CvPoint2D32f[ MAX_CORNERS ];
    cvGoodFeaturesToTrack(
        imgA,          // input image
        eig_image,     // scratch buffer for eigenvalues
        tmp_image,     // scratch buffer
        cornersA,      // output corner locations
        &corner_count, // in: maximum corners, out: corners actually found
        0.01,          // quality level
        5.0,           // minimum distance between corners (pixels)
        0,             // mask (none)
        3,             // block size
        0,             // use_harris = 0, i.e. use the Shi-Tomasi criterion
        0.04           // k (only used by the Harris variant)
    );
    // Refine the corner locations to sub-pixel accuracy
    cvFindCornerSubPix(
        imgA,
        cornersA,
        corner_count,
        cvSize(win_size,win_size),
        cvSize(-1,-1),
        cvTermCriteria(CV_TERMCRIT_ITER|CV_TERMCRIT_EPS,20,0.03)
    );


After the feature points are found, pyramidal Lucas-Kanade is used to track them. The goal is to find, for each feature in image A, the corresponding point in image B.


    // Call the Lucas-Kanade algorithm
    //
    char features_found[ MAX_CORNERS ];
    float feature_errors[ MAX_CORNERS ];
    int iCorner = 0, iCornerTemp = 0;
    // Buffer large enough to hold the image pyramids
    CvSize pyr_sz = cvSize( imgA->width+8, imgB->height/3 );
    IplImage* pyrA = cvCreateImage( pyr_sz, IPL_DEPTH_32F, 1 );
    IplImage* pyrB = cvCreateImage( pyr_sz, IPL_DEPTH_32F, 1 );
    CvPoint2D32f* cornersB = new CvPoint2D32f[ MAX_CORNERS ];
    cvCalcOpticalFlowPyrLK(
        imgA,                        // first frame
        imgB,                        // second frame
        pyrA,                        // pyramid buffer for imgA
        pyrB,                        // pyramid buffer for imgB
        cornersA,                    // features to track
        cornersB,                    // output: tracked positions in imgB
        corner_count,
        cvSize( win_size,win_size ), // search window size
        5,                           // number of pyramid levels
        features_found,              // output: 1 if the feature was found
        feature_errors,              // output: tracking error per feature
        cvTermCriteria( CV_TERMCRIT_ITER | CV_TERMCRIT_EPS, 20, .3 ),
        0                            // flags
    );
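
Not every feature survives tracking, so it is worth discarding the bad matches before they are mapped into 3D. Below is a minimal sketch of such a filter, reusing the iCorner and iCornerTemp counters declared above; the error threshold value is an assumption and should be tuned for the data.

    // Keep only the features that were tracked successfully (sketch; the
    // error threshold is an assumption and should be tuned for the data)
    const float maxTrackError = 550.0f;
    for (iCorner = 0, iCornerTemp = 0; iCorner < corner_count; iCorner++)
    {
        if (features_found[iCorner] == 0 || feature_errors[iCorner] > maxTrackError)
            continue; // feature was lost or tracked unreliably, drop it
        cornersA[iCornerTemp] = cornersA[iCorner];
        cornersB[iCornerTemp] = cornersB[iCorner];
        iCornerTemp++;
    }
    corner_count = iCornerTemp;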

Thereafter, the features in images A and B are mapped to their corresponding points in each point cloud. This mapping is carried out by the findDepth() function. pointA in this function is the 2D input and cloudPointA is the 3D output. The constants in the function, widthCoef and heightCoef, are intrinsic parameters of the camera; the camera used here is an Asus Xtion Pro Live. These two parameters help derive the actual horizontal and vertical locations from the depth information.

The value of depth is obtained through fractional interpolation. We first find the four neighboring points with integer x and y values. Interpolation is then performed in the horizontal direction, followed by the vertical direction, with the output being depthFinal. There is another possible way to do the fractional interpolation: interpolation in 2D can be written as z = a1 + a2*x + a3*y + a4*x*y, which is the bilinear interpolation formula. With the four neighboring points, we can solve the equations for a1, a2, a3 and a4, and then obtain depthFinal from the bilinear formula.
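
As an illustration of this second approach, here is a minimal sketch (not part of the original source) that computes the coefficients directly when the four neighbors are treated as the corners of a unit square and (x, y) are the fractional offsets inside it:

// Bilinear interpolation in coefficient form: z = a1 + a2*x + a3*y + a4*x*y,
// where (x, y) are fractional offsets inside a unit square whose corner depths
// are z00 = z(0,0), z10 = z(1,0), z01 = z(0,1) and z11 = z(1,1).
// Sketch of the alternative mentioned above, not part of the original code.
float bilinearDepth(float z00, float z10, float z01, float z11, float x, float y)
{
    float a1 = z00;
    float a2 = z10 - z00;
    float a3 = z01 - z00;
    float a4 = z00 - z10 - z01 + z11;
    return a1 + a2 * x + a3 * y + a4 * x * y;
}

Both formulations give identical results; the separable version used in findDepth() below simply performs the same computation as two one-dimensional interpolation steps.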




// Find depth of a feature point found in 2d image

void findDepth(const float (*pointA)[2], float (*cloudPointA)[3], const int totalPoints, const pcl::PointCloud<pcl::PointXYZ> &cloudA)
{
    // widthCoef and heightCoef are parameters coming from camera characterization
    const float widthCoef = 1.21905;
    const float heightCoef = 0.914286;
    int width = cloudA.width;
    int height = cloudA.height;
    int xIndex = 0, yIndex = 0, i;
    float xOffset, yOffset, depth00, depth01, depth10, depth11, depthX0, depthX1, depthFinal;
    bool printOn = false;
    for (i = 0; i < totalPoints; i++)
    {
        // Reference corner of the 2x2 neighborhood: floor of x, ceil of y
        xIndex = (int) floor(pointA[i][0]);
        yIndex = (int) ceil(pointA[i][1]);
        xOffset = pointA[i][0] - floor(pointA[i][0]);
        yOffset = ceil(pointA[i][1]) - pointA[i][1];
        if (printOn)
        {
            std::cout << "xIndex,yIndex = " << xIndex << "," << yIndex <<std::endl;
            std::cout << "xOffset,yOffset = " << xOffset << "," << yOffset <<std::endl;
        }
        if ((cloudA.points[yIndex*width+xIndex].z > 0) && (cloudA.points[yIndex*width+xIndex].z < 10)) // Filter out invalid depths (z = NaN or out of range)
            depth00 = cloudA.points[yIndex*width+xIndex].z;
        else
            depth00 = 0; // Setting it to 0 may not be the best way to handle a missing depth
        if ((cloudA.points[yIndex*width+xIndex+1].z > 0) && (cloudA.points[yIndex*width+xIndex+1].z < 10))
            depth01 = cloudA.points[yIndex*width+xIndex+1].z;
        else
            depth01 = 0;
        if ((cloudA.points[(yIndex-1)*width+xIndex].z > 0) && (cloudA.points[(yIndex-1)*width+xIndex].z < 10))
            depth10 = cloudA.points[(yIndex-1)*width+xIndex].z;
        else
            depth10 = 0;
        if ((cloudA.points[(yIndex-1)*width+xIndex+1].z > 0) && (cloudA.points[(yIndex-1)*width+xIndex+1].z < 10))
            depth11 = cloudA.points[(yIndex-1)*width+xIndex+1].z;
        else
            depth11 = 0;
        // Bilinear interpolation: first along x in both rows, then along y
        depthX0 = (1-xOffset)*depth00 + xOffset*depth01;
        depthX1 = (1-xOffset)*depth10 + xOffset*depth11;
        depthFinal = (1-yOffset)*depthX0 + yOffset*depthX1;
        cloudPointA[i][2] = depthFinal;
        // Calculate x and y based on depth
        cloudPointA[i][0] = depthFinal*(pointA[i][0]-width/2)/width*widthCoef;
        cloudPointA[i][1] = depthFinal*(pointA[i][1]-height/2)/height*heightCoef;
        if (printOn)
        {
            std::cout << "point[" << i << "] " << pointA[i][0] << "," << pointA[i][1] << std::endl;
            std::cout << "cloudPoint[" << i << "] " << cloudPointA[i][0] << "," << cloudPointA[i][1] << "," << cloudPointA[i][2] << std::endl;
        }
    }
}
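
To connect the two stages, the tracked corner positions can be copied into plain float arrays and passed to findDepth() for each cloud. The following is only a usage sketch; cloudA is assumed to be the organized pcl::PointCloud<pcl::PointXYZ> that was captured together with imgA.

// Usage sketch (assumes cloudA is the organized point cloud captured with imgA)
float pointA[MAX_CORNERS][2];
float cloudPointA[MAX_CORNERS][3];
for (int i = 0; i < corner_count; i++)
{
    pointA[i][0] = cornersA[i].x; // 2D pixel coordinates from the tracker
    pointA[i][1] = cornersA[i].y;
}
findDepth(pointA, cloudPointA, corner_count, cloudA);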


With 3D feature points from the two clouds, we can find the rigid transform between them. This is implemented in the findTransform() function; the core is an SVD computation. R_eigen is the rotation between the two clouds and t_vec is the translation offset between them.


            // svd
            Eigen::JacobiSVD<Eigen::Matrix3f> svd (m, Eigen::ComputeFullU | Eigen::ComputeFullV);
            Eigen::Matrix3f u = svd.matrixU ();
            Eigen::Matrix3f v = svd.matrixV ();
            Eigen::Matrix3f R_eigen = v * u.transpose ();
            for (int j = 0; j < 3; j++)
            {
                for (int k = 0; k < 3; k++)
                {
                    R[j][k] = R_eigen(j,k);
                    RtMat[3*i+j][k] = R[j][k]; // store the rotation for this cloud pair
                }
                // Translation that maps the source mean onto the target mean
                t_vec[j] = tgtMean[j] - R[j][0]*srcMean[0] - R[j][1]*srcMean[1] - R[j][2]*srcMean[2];
                RtMat[3*i+j][3] = t_vec[j];
            }
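
For completeness, here is a self-contained sketch of the same SVD-based rigid alignment (the Kabsch/Umeyama approach). The fragment above comes from findTransform(); in this sketch, m is built explicitly as the 3x3 cross-covariance of the mean-centered correspondences, and a determinant check is added to guard against a reflection. Variable and type choices are assumptions, not the original code.

#include <Eigen/Dense>
#include <vector>

// Sketch: estimate rotation R and translation t such that tgt ~ R * src + t,
// using the SVD of the cross-covariance of the mean-centered correspondences.
void estimateRigidTransform(const std::vector<Eigen::Vector3f> &src,
                            const std::vector<Eigen::Vector3f> &tgt,
                            Eigen::Matrix3f &R, Eigen::Vector3f &t)
{
    Eigen::Vector3f srcMean = Eigen::Vector3f::Zero();
    Eigen::Vector3f tgtMean = Eigen::Vector3f::Zero();
    for (size_t i = 0; i < src.size(); i++)
    {
        srcMean += src[i];
        tgtMean += tgt[i];
    }
    srcMean /= (float) src.size();
    tgtMean /= (float) tgt.size();

    // Cross-covariance of the centered point sets
    Eigen::Matrix3f m = Eigen::Matrix3f::Zero();
    for (size_t i = 0; i < src.size(); i++)
        m += (src[i] - srcMean) * (tgt[i] - tgtMean).transpose();

    Eigen::JacobiSVD<Eigen::Matrix3f> svd(m, Eigen::ComputeFullU | Eigen::ComputeFullV);
    R = svd.matrixV() * svd.matrixU().transpose();

    // Guard against a reflection (det(R) = -1)
    if (R.determinant() < 0)
    {
        Eigen::Matrix3f v = svd.matrixV();
        v.col(2) *= -1;
        R = v * svd.matrixU().transpose();
    }
    t = tgtMean - R * srcMean;
}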

Multiple point clouds can be combined pair by pair in this way. Finally, we use functions in the PCL library to convert the combined point cloud into a mesh stored in a .ply file. This conversion is implemented in the plygen() function.
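
The original plygen() function is not listed here; the following is a minimal sketch of how such a conversion could look with standard PCL components (normal estimation, greedy projection triangulation, then a PLY export). The function name and parameter values are assumptions and would need tuning for Kinect/Xtion data.

#include <string>
#include <pcl/point_types.h>
#include <pcl/common/io.h>
#include <pcl/features/normal_3d.h>
#include <pcl/search/kdtree.h>
#include <pcl/surface/gp3.h>
#include <pcl/io/ply_io.h>

// Sketch: convert a merged cloud into a triangle mesh and write it to a .ply file
void cloudToPly(const pcl::PointCloud<pcl::PointXYZ>::Ptr &cloud, const std::string &fileName)
{
    // Estimate normals, which the triangulation step requires
    pcl::NormalEstimation<pcl::PointXYZ, pcl::Normal> ne;
    pcl::search::KdTree<pcl::PointXYZ>::Ptr tree(new pcl::search::KdTree<pcl::PointXYZ>);
    pcl::PointCloud<pcl::Normal>::Ptr normals(new pcl::PointCloud<pcl::Normal>);
    ne.setInputCloud(cloud);
    ne.setSearchMethod(tree);
    ne.setKSearch(20);
    ne.compute(*normals);

    // Combine XYZ and normal fields into one cloud
    pcl::PointCloud<pcl::PointNormal>::Ptr cloudWithNormals(new pcl::PointCloud<pcl::PointNormal>);
    pcl::concatenateFields(*cloud, *normals, *cloudWithNormals);

    // Greedy projection triangulation (parameter values are assumptions)
    pcl::search::KdTree<pcl::PointNormal>::Ptr tree2(new pcl::search::KdTree<pcl::PointNormal>);
    tree2->setInputCloud(cloudWithNormals);
    pcl::GreedyProjectionTriangulation<pcl::PointNormal> gp3;
    pcl::PolygonMesh mesh;
    gp3.setSearchRadius(0.05);           // maximum edge length (meters)
    gp3.setMu(2.5);
    gp3.setMaximumNearestNeighbors(100);
    gp3.setInputCloud(cloudWithNormals);
    gp3.setSearchMethod(tree2);
    gp3.reconstruct(mesh);

    pcl::io::savePLYFile(fileName, mesh);
}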



