Image Alignment (Feature Based) in OpenCV.js ( JavaScript )

This article extends Satya Mallick’s article about feature-based image alignment using OpenCV in both C++ and Python.  In this article, we will show you how to align an image to a baseline image using OpenCV’s JavaScript library (opencv.js).  There is a hosted version of this example here: iCollect (to view the console log in Chrome, press the F12 key and show ALL messages).

Here are the 2 images:

Before you begin, I suggest you read the opencv.js tutorials, which cover installation and include some very good examples. However, if you want to dive right in, you can find the (very large) JavaScript library here (version 4.5.1).  As a quick reminder, you will need to understand the following image concepts: Shape, Channel, Type, Size, Level, and Depth.

For example, this 4×5 pixel image has:

  • 3 channels (Red, Green, and Blue)
  • Its Shape = 4, 5, 3 (row, column, channels)
  • Its Size = 4, 5  (as compared to the shape of VGA which is 480×640)
  • It contains Type = CV_8UC3 (8-bit unsigned 3 channel) data
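To make the relationship between shape, channels, and the underlying data buffer concrete, here is a minimal sketch in plain JavaScript (no opencv.js required). The helper names `bufferLength` and `pixelOffset` are hypothetical, and the sketch assumes continuous, row-major storage with interleaved channels.

```javascript
// Hypothetical helpers, not part of opencv.js.
// Assumes continuous row-major storage with interleaved channels.

// Number of values in the data buffer of a rows x cols image.
function bufferLength(rows, cols, channels) {
  return rows * cols * channels;
}

// Offset of one channel value for the pixel at (row, col).
function pixelOffset(row, col, channel, cols, channels) {
  return (row * cols + col) * channels + channel;
}

// The 4x5 RGB example from the list above: 4 * 5 * 3 = 60 values.
const smallImage = new Uint8Array(bufferLength(4, 5, 3));
// Set the green channel (index 1) of the pixel at row 2, column 3.
smallImage[pixelOffset(2, 3, 1, 5, 3)] = 255;
```

The same arithmetic applies to larger images: a 600×600 image with 4 channels holds 600 * 600 * 4 = 1,440,000 values.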

You will also need to understand the concept of a “Mat”:

Mat is basically a class with two data parts: the matrix header (containing information such as the size of the matrix, the method used for storing, at which address is the matrix stored, and so on) and a pointer to the matrix containing the pixel values (taking any dimensionality depending on the method chosen for storing). The matrix header size is constant, however, the size of the matrix itself may vary from image to image and usually is larger by orders of magnitude.

Warning: There is no documentation for opencv.js beyond the tutorials.  The JavaScript library was built with Emscripten and is essentially a port.  Also, not all OpenCV functions are supported in JavaScript.  You can find the supported functions here. To understand the functions beyond what is in the tutorials, I suggest you use the C++/Python documentation and code as a reference.  If you like this very powerful library but are turned off by the lack of documentation, please let the project know here. There is also very little on Stack Overflow or the OpenCV legacy and current forums.

OK, let’s begin! (Console log is in green, code is in blue, highlights of code that are very important are in red)

There are 9 steps that follow, and each step (section of JavaScript) has Satya’s C++ code referenced in the comments so you can follow along.

For your benefit, we are using the console.log functions (F12 on Chrome) to show the steps and important information that we will reference in this article.  We also created a helper function (getMatStats) that gets a snapshot of the Mat information.  You will find that when using opencv.js the number one issue that can cause you many headaches is misinterpreting data such as Mat “type”.  This function will accept the Mat and a description of the Mat as arguments and produce the following. We will use this to test the Mats at each step.

MatStats:channels=4 type:24 cols:600 rows:600 depth:0 colorspace:RGBA or BGRA type:CV_8UC4 
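The getMatStats helper itself is not listed in this article. As a rough illustration of the part that turns the numeric type into a readable name, here is a minimal sketch in plain JavaScript. The function name `decodeMatType` is hypothetical; it relies on OpenCV packing a Mat type as depth + ((channels - 1) << 3).

```javascript
// Hypothetical helper, not part of opencv.js and not the getMatStats function itself.
// Decodes the integer returned by mat.type(): type = depth + ((channels - 1) << 3).
const DEPTH_NAMES = ["CV_8U", "CV_8S", "CV_16U", "CV_16S", "CV_32S", "CV_32F", "CV_64F", "CV_16F"];

function decodeMatType(type) {
  const depth = type & 7;            // lowest 3 bits hold the depth code
  const channels = (type >> 3) + 1;  // remaining bits hold (channels - 1)
  return { depth, channels, name: DEPTH_NAMES[depth] + "C" + channels };
}
```

With this decoding, the type:24 shown above gives depth 0 and 4 channels, which is CV_8UC4.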

Step 1. Read in the images.

To initialize the function you will need to get 2 items from your HTML (the baseline image and the image to align).  We call the baseline image “im2” and the image to align “im1”.

The only gotcha here: Canvas only supports 8-bit RGBA images with continuous storage, so the cv.Mat returned by cv.imread has the type cv.CV_8UC4. This differs from native OpenCV, where images returned by imread and shown by imshow have their channels stored in BGR order.

//im2 is the original reference image we are trying to align to
let im2 = cv.imread(image_A_element);

getMatStats(im2, "original reference image");
//im1 is the image we are trying to line up correctly
let im1 = cv.imread(image_B_element);
getMatStats(im1, "original image to line up");

In the console log you will see the following:

MatName :(original reference image)  Mat {$$: {…}}
MatStats:channels=4 type:24 cols:600 rows:600 depth:0 colorspace:RGBA or BGRA type:CV_8UC4
MatName :(original image to line up)  Mat {$$: {…}}
MatStats:channels=4 type:24 cols:600 rows:600 depth:0 colorspace:RGBA or BGRA type:CV_8UC4

Note: In the console log you can also reference the Mat and navigate through the shape, data, and supported class functions as follows:

Mat {$$: {…}}
  cols: 600
  data: Uint8Array(1440000)
  data8S: Int8Array(1440000)
  data16S: Int16Array(720000)
  data16U: Uint16Array(720000)
  data32F: Float32Array(360000)
  data32S: Int32Array(360000)
  data64F: Float64Array(180000)
  matSize: Array(2)
  rows: 600
  step: Array(2)
$$: {ptrType: RegisteredPointer, ptr: 6570904, count: {…}}
__proto__: ClassHandle
  channels: ƒ Mat$channels()
  charAt: ƒ ()
  charPtr: ƒ ()
  clone: ƒ ()
  col: ƒ Mat$col(arg0)
  colRange: ƒ ()
...cut off the rest due to number of methods...

Step 2. Convert the images to Grayscale

We will work with the Grayscale version of the image so it needs to be converted.  You can see below in the comments the C++ code we are referencing from Satya’s article.

//17    Convert images to grayscale
//18    Mat im1Gray, im2Gray;
//19    cvtColor(im1, im1Gray, CV_BGR2GRAY);
//20    cvtColor(im2, im2Gray, CV_BGR2GRAY);
let im1Gray = new cv.Mat();
let im2Gray = new cv.Mat();
// Canvas data is RGBA (see Step 1), so use the RGBA conversion code.
cv.cvtColor(im1, im1Gray, cv.COLOR_RGBA2GRAY);
getMatStats(im1Gray, "image to line up converted to grayscale");
cv.cvtColor(im2, im2Gray, cv.COLOR_RGBA2GRAY);
getMatStats(im2Gray, "reference image converted to grayscale");

In the console log you will see the following (note, this is only showing the im1Gray stats)

MatStats:channels=1 type:0 cols:600 rows:600 depth:0 colorspace:GRAY type:CV_8UC1
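To show what the conversion does to the data, here is a minimal sketch in plain JavaScript of a grayscale conversion using the standard luma weights (0.299 R + 0.587 G + 0.114 B). The function name `rgbaToGray` is hypothetical, and OpenCV’s own implementation may round slightly differently.

```javascript
// Hypothetical helper, not part of opencv.js.
// Converts interleaved RGBA bytes to one gray byte per pixel.
function rgbaToGray(rgba) {
  const gray = new Uint8Array(rgba.length / 4);
  for (let i = 0; i < gray.length; i++) {
    const r = rgba[4 * i];
    const g = rgba[4 * i + 1];
    const b = rgba[4 * i + 2]; // the alpha value at 4 * i + 3 is ignored
    gray[i] = Math.round(0.299 * r + 0.587 * g + 0.114 * b);
  }
  return gray;
}

// One pure-red pixel followed by one pure-blue pixel.
const twoPixels = new Uint8Array([255, 0, 0, 255, 0, 0, 255, 255]);
const grayPixels = rgbaToGray(twoPixels);
```

Four channels become one, which is why the Mat type changes from CV_8UC4 to CV_8UC1. Red and blue carry different weights, so the channel order assumed by the conversion affects the gray values.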

Step 3. Detect Features & Compute Descriptors

We are going to use the AKAZE keypoint detector and descriptor extractor.  There are a few others that you can use as well. For a full description of what a keypoint and its associated descriptors are, see this article.

Here are some definitions you need to understand for this step (more here):

  • A keypoint (or interest point) is defined by some particular image intensities “around” it, such as a corner.
  • A descriptor is a finite vector which summarizes properties for the keypoint. A descriptor can be used for classifying the keypoint. Descriptors assign a numerical description to the area of the image the keypoint refers to.
  • Once you calculate descriptors for all the keypoints, you have a way to compare those keypoints. This is useful for image feature matching (when you know the images are of the same object, and would like to identify the parts in different images that depict the same part of the scene, or would like to identify the perspective change between two images), and allows you to compare every keypoint descriptor of one image to every keypoint descriptor of the other image. The keypoints whose descriptors have the smallest distance between them are “matches” between the 2 different images.
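
The idea of “distance between descriptors” can be sketched in plain JavaScript. The helper names below are hypothetical and descriptors are represented as plain arrays of numbers; this illustrates the concept and is not how opencv.js implements its matchers.

```javascript
// Hypothetical helpers, not part of opencv.js.

// Euclidean (L2) distance between two descriptors of equal length.
function l2Distance(a, b) {
  let sum = 0;
  for (let i = 0; i < a.length; i++) {
    const d = a[i] - b[i];
    sum += d * d;
  }
  return Math.sqrt(sum);
}

// Hamming distance between two binary descriptors stored as bytes:
// the number of bits that differ.
function hammingDistance(a, b) {
  let bits = 0;
  for (let i = 0; i < a.length; i++) {
    let x = a[i] ^ b[i];
    while (x) {
      bits += x & 1;
      x >>= 1;
    }
  }
  return bits;
}

// Index of the descriptor in `train` that is closest to `query`.
function nearestDescriptor(query, train, distance) {
  let best = 0;
  for (let j = 1; j < train.length; j++) {
    if (distance(query, train[j]) < distance(query, train[best])) best = j;
  }
  return best;
}
```

Comparing every query descriptor against every train descriptor in this way is what a brute-force matcher does.
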
//22    Variables to store keypoints and descriptors
//23    std::vector<KeyPoint> keypoints1, keypoints2;
//24    Mat descriptors1, descriptors2;
let keypoints1 = new cv.KeyPointVector();
let keypoints2 = new cv.KeyPointVector();
let descriptors1 = new cv.Mat();
let descriptors2 = new cv.Mat();

//26    Detect ORB features and compute descriptors.
//27    Ptr<Feature2D> orb = ORB::create(MAX_FEATURES);
//28    orb->detectAndCompute(im1Gray, Mat(), keypoints1, descriptors1);
//29    orb->detectAndCompute(im2Gray, Mat(), keypoints2, descriptors2);

// The C++ reference above uses ORB; here we use AKAZE instead.
let akaze = new cv.AKAZE();

akaze.detectAndCompute(im1Gray, new cv.Mat(), keypoints1, descriptors1);
akaze.detectAndCompute(im2Gray, new cv.Mat(), keypoints2, descriptors2);

If you want to take a peek at what the keypoints look like, you can simply add some console.log output like the following:

console.log("Total of ", keypoints1.size(), " keypoints1 (img to align) and ", keypoints2.size(), " keypoints2 (reference)");
console.log("here are the first 5 keypoints for keypoints1:");
for (let i = 0; i < Math.min(5, keypoints1.size()); i++) {
  console.log("keypoints1: [",i,"]", keypoints1.get(i).pt.x, keypoints1.get(i).pt.y);
}

This will allow you to see the type of data you are working with for keypoints. It will look something like this, and each x, y pair is one of the dots drawn in the image below on the image we are trying to align.

keypoints1: [ 0 ] 307.58734130859375 28.975919723510742
keypoints1: [ 1 ] 187.1143035888672 30.317649841308594
keypoints1: [ 2 ] 201.44839477539062 29.738161087036133

Step 4. Match Features

In this step we will find the matches between the 2 images using the keypoints and descriptors.  We will be using the knnMatch function for feature matching.  There are others you can read about here.  Satya’s example used BruteForce-Hamming; however, we found the following approach more accurate.

In the console log we will pull out the first 5 matches and then also the first 5 good matches so you can see what they look like.

Note that you need to determine what you want the variable “knnDistance_option” to be in your code. It’s usually something like 0.7.

//31    Match features.
//32    std::vector<DMatch> matches;
//33    Ptr<DescriptorMatcher> matcher = DescriptorMatcher::create("BruteForce-Hamming");
//34    matcher->match(descriptors1, descriptors2, matches, Mat());

    let good_matches = new cv.DMatchVector();

//36    Sort matches by score
//37    std::sort(matches.begin(), matches.end());

//39    Remove not so good matches
//40    const int numGoodMatches = matches.size() * GOOD_MATCH_PERCENT;
//41    matches.erase(matches.begin()+numGoodMatches, matches.end());

let bf = new cv.BFMatcher();
let matches = new cv.DMatchVectorVector();

bf.knnMatch(descriptors1, descriptors2, matches, 2);

let counter = 0;
for (let i = 0; i < matches.size(); ++i) {
    let match = matches.get(i);
    let dMatch1 = match.get(0);
    let dMatch2 = match.get(1);
    //console.log("[", i, "] ", "dMatch1: ", dMatch1, "dMatch2: ", dMatch2);
    if (dMatch1.distance <= dMatch2.distance * parseFloat(knnDistance_option)) {
        //console.log("***Good Match***", "dMatch1.distance: ", dMatch1.distance, "was less than or = to: ", "dMatch2.distance * parseFloat(knnDistance_option)", dMatch2.distance * parseFloat(knnDistance_option), "dMatch2.distance: ", dMatch2.distance, "knnDistance", knnDistance_option);
        // Keep the best match only when it passes the ratio test.
        good_matches.push_back(dMatch1);
        counter++;
    }
}

console.log("keeping ", counter, " points in good_matches vector out of ", matches.size(), " contained in this match vector:", matches);
console.log("here are first 5 matches");
for (let t = 0; t < Math.min(5, matches.size()); ++t) {
    console.log("[" + t + "]", "matches: ", matches.get(t));
}
console.log("here are first 5 good_matches");
for (let r = 0; r < Math.min(5, good_matches.size()); ++r) {
    console.log("[" + r + "]", "good_matches: ", good_matches.get(r));
}

In the console log you will see the following:

keeping  196  points in good_matches vector out of  303  contained in this match vector: DMatchVectorVector {$$: {…}}
here are first 5 matches
    matches:  DMatchVector {$$: {…}}

I’ve expanded the first DMatchVector below so you can see that it contains both header information and a pointer to the data:

[0] matches: DMatchVector {$$: {…}}
    count: {value: 1}
    ptr: 6621896
    ptrType: RegisteredPointer
    destructorFunction: null
    isConst: false
    isReference: false
    isSmartPointer: false
    name: "DMatchVector*"
[1] matches:  DMatchVector {$$: {…}}
[2] matches:  DMatchVector {$$: {…}}
[3] matches:  DMatchVector {$$: {…}}
[4] matches:  DMatchVector {$$: {…}}

You can also see in the console log that good_matches exposes the queryIdx, trainIdx, and distance.  We will use each of these below, so it is important to know what they refer to.

  • distance: This attribute gives us the distance between the descriptors. A lower distance indicates a better match.
  • trainIdx: This attribute gives us the index of the descriptor in the list of train descriptors (in our case, the list of descriptors in im2, the reference image).
  • queryIdx: This attribute gives us the index of the descriptor in the list of query descriptors (in our case, the list of descriptors in im1, the image that we are aligning).

here are first 5 good_matches
good_matches:  {queryIdx: 0, trainIdx: 15, imgIdx: 0, distance: 174.21824645996094}
good_matches:  {queryIdx: 1, trainIdx: 13, imgIdx: 0, distance: 219.84767150878906}
good_matches:  {queryIdx: 3, trainIdx: 16, imgIdx: 0, distance: 161.46206665039062}
good_matches:  {queryIdx: 4, trainIdx: 6, imgIdx: 0, distance: 146.70037841796875}
good_matches:  {queryIdx: 7, trainIdx: 10, imgIdx: 0, distance: 337.3707275390625}

If you uncomment the console.log lines in the code above (the ones starting with “//”), you can see in the console log how the matching works. The code applies Lowe’s ratio test. You will see that every item in ‘matches[]’ (an array of pointers to keypoint descriptors) has the two closest matches, and we save the best one into ‘good_matches’ only when it is much closer than the second best. Once you add the console.log code back in, you will see something like the following in the console log:

Matches[ 0 ]dMatch1:  {queryIdx: 0, trainIdx: 860, imgIdx: 0, distance: 623.0064086914062} 
	    dMatch2:  {queryIdx: 0, trainIdx: 527, imgIdx: 0, distance: 636.6152954101562}

Matches[ 1 ]dMatch1:  {queryIdx: 1, trainIdx: 568, imgIdx: 0, distance: 623.645751953125} 
	    dMatch2:  {queryIdx: 1, trainIdx: 438, imgIdx: 0, distance: 640.3553466796875}

***Good Match*** dMatch1.distance:  387.9703674316406 was less than or = to:  493.1697387695312 which is dMatch2.distance * parseFloat(knnDistance_option)
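
The ratio test can be tried without opencv.js. Here is a minimal sketch in plain JavaScript; the function name `ratioTest` is hypothetical, the match objects are plain objects shaped like DMatch, and the second sample pair uses made-up distances for illustration.

```javascript
// Hypothetical helper, not part of opencv.js.
// `pairs` is an array of [best, secondBest] objects shaped like DMatch
// ({ queryIdx, trainIdx, distance }); `ratio` plays the role of knnDistance_option.
function ratioTest(pairs, ratio) {
  const good = [];
  for (const pair of pairs) {
    if (pair.length < 2) continue; // two neighbours are needed for the comparison
    const [best, secondBest] = pair;
    if (best.distance <= secondBest.distance * ratio) good.push(best);
  }
  return good;
}

const samplePairs = [
  // Ambiguous: both candidates are almost equally far away, so the match is dropped.
  [{ queryIdx: 0, trainIdx: 860, distance: 623.0 }, { queryIdx: 0, trainIdx: 527, distance: 636.6 }],
  // Distinctive (made-up values): the best candidate is clearly closer, so it is kept.
  [{ queryIdx: 1, trainIdx: 13, distance: 100.0 }, { queryIdx: 1, trainIdx: 40, distance: 400.0 }],
];
const keptMatches = ratioTest(samplePairs, 0.7);
```

A lower ratio keeps fewer, more distinctive matches; a higher ratio keeps more matches but lets more ambiguous ones through.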


Step 5. Draw the Top Matches Between Both Images and Show the Resulting Image to the User

In this step we simply show the user the result of the match.

//44    Draw top matches
//45    Mat imMatches;
//46    drawMatches(im1, keypoints1, im2, keypoints2, matches, imMatches);
//47    imwrite("matches.jpg", imMatches);
let imMatches = new cv.Mat();
let color = new cv.Scalar(0,255,0, 255);
cv.drawMatches(im1, keypoints1, im2, keypoints2, 
                  good_matches, imMatches, color);
cv.imshow('imageCompareMatches', imMatches);

getMatStats(imMatches, "imMatches");

Here is the resulting image:

Note: When you are debugging sometimes it’s helpful to show each image with just the keypoints. You can do that with the following code:

let keypoints1_img = new cv.Mat();
let keypoints2_img = new cv.Mat();
let keypointcolor = new cv.Scalar(0,255,0, 255);
cv.drawKeypoints(im1Gray, keypoints1, keypoints1_img, keypointcolor);
cv.drawKeypoints(im2Gray, keypoints2, keypoints2_img, keypointcolor);

cv.imshow('keypoints1', keypoints1_img);
cv.imshow('keypoints2', keypoints2_img);

Step 6. Extract the Location of Good Matches and Build Point1 and Point2 Arrays to Prepare for the Homography

opencv.js does not offer an easy way to create a Point2fVector object, so you will have to build the point data manually, but it’s not difficult.  There is an extra step here because of the differences between the C++ and JavaScript libraries.  In order to build the Mats for the matches between image 1 and image 2 to send to findHomography(), you need to first create the points1 and points2 arrays (Step 6) and then generate the Mats from those arrays (Step 7).

//50    Extract location of good matches
//51    std::vector<Point2f> points1, points2;
//53    for( size_t i = 0; i < matches.size(); i++ )
//54    {
//55        points1.push_back( keypoints1[ matches[i].queryIdx ].pt );
//56        points2.push_back( keypoints2[ matches[i].trainIdx ].pt );
//57    }
let points1 = [];
let points2 = [];
for (let i = 0; i < good_matches.size(); i++) {
    points1.push(keypoints1.get(good_matches.get(i).queryIdx ).pt.x );
    points1.push(keypoints1.get(good_matches.get(i).queryIdx ).pt.y );
    points2.push(keypoints2.get(good_matches.get(i).trainIdx ).pt.x );
    points2.push(keypoints2.get(good_matches.get(i).trainIdx ).pt.y );
}

console.log("points1:", points1,"points2:", points2);

In the console log you will see the points arrays. Here is an example of a few data points in the points1 array:

0: 371.2216796875
1: 258.982177734375
2: 348.1711730957031
3: 262.8340759277344
4: 376.22442626953125
5: 266.4711608886719
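
The index lookup in this step can be followed on plain data. Here is a minimal sketch in plain JavaScript; the function name `flattenMatchedPoints` is hypothetical, and plain arrays of objects stand in for KeyPointVector and DMatchVector.

```javascript
// Hypothetical helper, not part of opencv.js.
// `keypointsA` / `keypointsB` are plain arrays of { pt: { x, y } } objects and
// `matches` is a plain array of { queryIdx, trainIdx } objects.
function flattenMatchedPoints(keypointsA, keypointsB, matches) {
  const pointsA = [];
  const pointsB = [];
  for (const m of matches) {
    const a = keypointsA[m.queryIdx].pt; // point in the image to align
    const b = keypointsB[m.trainIdx].pt; // matching point in the reference image
    pointsA.push(a.x, a.y);
    pointsB.push(b.x, b.y);
  }
  return { pointsA, pointsB };
}
```

Each match contributes one x, y pair to each array, so even indices hold x values and odd indices hold y values, and entry pair k of both arrays belongs to match k.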

Step 7. Create Mat1 and Mat2 from Points1 and Points2 Arrays to prepare for the Homography

As mentioned in Step 6 this is the second part of the operation where we create the mat1 and mat2 objects to send to findHomography().  Note that you can use cv.matFromArray() for this by calling:  let mat1 = cv.matFromArray(points1.length, 1, cv.CV_32FC2, points1);  or you can do this manually as we have done below.  I find that doing it manually reduces errors.  Below, you can see we are creating a single-column two-channel matrix of 32-bit float data.

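// Note: each row of a CV_32FC2 Mat holds one (x, y) pair, so points1.length rows
// is twice as many as the data needs (points1.length / 2 rows would be exact).
var mat1 = new cv.Mat(points1.length,1,cv.CV_32FC2);
mat1.data32F.set(points1);  // copy the flat x, y values into the Mat
var mat2 = new cv.Mat(points2.length,1,cv.CV_32FC2);
mat2.data32F.set(points2);

getMatStats(mat1, "mat1 prior to homography");
getMatStats(mat2, "mat2 prior to homography");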

In the console log you will see the following:

MatName :(mat1 prior to homography)  Mat {$$: {…}}
  cols: 1
  data: Uint8Array(3136)
  data8S: Int8Array(3136)
  data16S: Int16Array(1568)
  data16U: Uint16Array(1568)

I expanded a part of the data32F portion of the mat so you can verify the data

data32F: Float32Array(784)
[0 … 99]
   0: 371.2216796875
   1: 258.982177734375
   2: 348.1711730957031
   3: 262.8340759277344
   4: 376.22442626953125
   5: 266.4711608886719
data32S: Int32Array(784)
data64F: Float64Array(392)
matSize: Array(2)
rows: 392
step: Array(2)
$$: {ptrType: RegisteredPointer, ptr: 6558456, count: {…}}
__proto__: ClassHandle

MatStats:channels=2 type:13 cols:1 rows:392 depth:5 colorspace:unknown type:CV_32FC2
MatName :(mat2 prior to homography)  Mat {$$: {…}}
MatStats:channels=2 type:13 cols:1 rows:392 depth:5 colorspace:unknown type:CV_32FC2
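
The sizes in this log follow from how two-channel float data is laid out. Here is a minimal sketch using plain typed arrays (no opencv.js calls); the names `flatPoints`, `packed`, and `pointAt` are hypothetical and the coordinates are rounded sample values.

```javascript
// Plain typed-array sketch; no opencv.js calls are made here.
// A flat [x0, y0, x1, y1, ...] array holds two floats per point.
const flatPoints = [371.22, 258.98, 348.17, 262.83, 376.22, 266.47];
const pointCount = flatPoints.length / 2;      // 3 points
const packed = Float32Array.from(flatPoints);  // the layout a data32F view exposes
const byteCount = packed.byteLength;           // 4 bytes per 32-bit float

// Read point i back out of the packed buffer.
function pointAt(buffer, i) {
  return { x: buffer[2 * i], y: buffer[2 * i + 1] };
}
```

The same arithmetic explains the log above: 392 rows with 2 channels give 784 floats, and 784 floats at 4 bytes each give 3136 bytes.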

Step 8. Calculate the Homography using the Mats we created in Steps 6 and 7

OpenCV has a good explanation of homography in this example.  The 3×3 homography matrix is what warpPerspective() uses to transform the image we need to align with the baseline image. The findHomography() API reference can be found here.  Note that the reference says that arguments 1 and 2 need to be a matrix of the type CV_32FC2 or vector<Point2f>.

//59    Find homography
//60    h = findHomography( points1, points2, RANSAC );
let h = cv.findHomography(mat1, mat2, cv.RANSAC);

if (h.empty()) {
    alert("homography matrix empty!");
} else {
    console.log("h:", h);
    console.log("[", h.data64F[0],",", h.data64F[1], ",", h.data64F[2]);
    console.log("", h.data64F[3],",", h.data64F[4], ",", h.data64F[5]);
    console.log("", h.data64F[6],",", h.data64F[7], ",", h.data64F[8], "]");
}

In the console log you will see the following:

h: Mat {$$: {…}}
Mat {$$: {…}}
  cols: 3

  data: Uint8Array(72)
  data8S: Int8Array(72)
  data16S: Int16Array(36)
  data16U: Uint16Array(36)
  data32F: Float32Array(18)
  data32S: Int32Array(18)
  data64F: Float64Array(9)
  matSize: (...)
  rows: 3

  step: Array(2)
    0: 24
    1: 8
  length: 2

And here is the homography (h) matrix that we will use in warpPerspective(). Note the 1 in the bottom right and the large positive number in the top right (the horizontal translation).

[ 0.9438660794550876 , -0.32547362358619275 , 114.1118327960684
  0.32513282333220195 , 0.9440804705693663 , -81.13758041676562
 -0.000003681855968072313 , 2.0278787737990302e-7 , 1 ]
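
To see what this matrix does to a single point, here is a minimal sketch in plain JavaScript. The function name `applyHomography` is hypothetical; the matrix is stored as a row-major array of 9 numbers, the same order as h.data64F.

```javascript
// Hypothetical helper, not part of opencv.js.
// `h` is a row-major array of 9 numbers, the same layout as h.data64F.
function applyHomography(h, x, y) {
  const w = h[6] * x + h[7] * y + h[8]; // projective scale factor
  return {
    x: (h[0] * x + h[1] * y + h[2]) / w,
    y: (h[3] * x + h[4] * y + h[5]) / w,
  };
}

// The matrix printed above.
const hExample = [
  0.9438660794550876, -0.32547362358619275, 114.1118327960684,
  0.32513282333220195, 0.9440804705693663, -81.13758041676562,
  -0.000003681855968072313, 2.0278787737990302e-7, 1,
];
// Where the top-left corner (0, 0) of the image to align ends up.
const mappedOrigin = applyHomography(hExample, 0, 0);
```

For the point (0, 0) the scale factor is exactly 1, so the result is simply the last column of the first two rows: about (114.1, -81.1). The upper-left 2×2 block is close to a rotation of roughly 19 degrees, and the near-zero values in the bottom row mean there is very little perspective distortion.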

Step 9. Warp the Image to Align with the Reference Image

We are using the warpPerspective() API that is referenced here.

//62  Use homography to warp image
//63  warpPerspective(im1, im1Reg, h, im2.size());

let image_B_final_result = new cv.Mat();
cv.warpPerspective(im1, image_B_final_result, h, im2.size());

cv.imshow('imageAligned', image_B_final_result);
getMatStats(image_B_final_result, "finalMat");

In the console log you will see the following:

MatStats:channels=4 type:24 cols:500 rows:647 depth:0 colorspace:RGBA or BGRA type:CV_8UC4
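
Conceptually, the warp fills each pixel of the output by mapping it back through the inverse of the homography and reading the source image there. Here is a minimal sketch of that idea in plain JavaScript for a single-channel image, using nearest-neighbour lookup. The helper names are hypothetical, and OpenCV’s own implementation interpolates between pixels by default rather than rounding.

```javascript
// Hypothetical helpers, not part of opencv.js. Matrices are row-major arrays of 9 numbers.

// Inverse of a 3x3 matrix via the adjugate (assumes a non-zero determinant).
function invert3x3(m) {
  const [a, b, c, d, e, f, g, h, i] = m;
  const det = a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g);
  return [
    (e * i - f * h) / det, (c * h - b * i) / det, (b * f - c * e) / det,
    (f * g - d * i) / det, (a * i - c * g) / det, (c * d - a * f) / det,
    (d * h - e * g) / det, (b * g - a * h) / det, (a * e - b * d) / det,
  ];
}

// Warp a single-channel image: each destination pixel looks up its source pixel.
function warpNearest(src, srcW, srcH, hMat, dstW, dstH) {
  const inv = invert3x3(hMat);
  const dst = new Uint8Array(dstW * dstH); // pixels with no source stay 0
  for (let y = 0; y < dstH; y++) {
    for (let x = 0; x < dstW; x++) {
      const w = inv[6] * x + inv[7] * y + inv[8];
      const sx = Math.round((inv[0] * x + inv[1] * y + inv[2]) / w);
      const sy = Math.round((inv[3] * x + inv[4] * y + inv[5]) / w);
      if (sx >= 0 && sx < srcW && sy >= 0 && sy < srcH) {
        dst[y * dstW + x] = src[sy * srcW + sx];
      }
    }
  }
  return dst;
}

// A 3x2 image shifted one pixel to the right by a pure-translation homography.
const tinySrc = new Uint8Array([1, 2, 3, 4, 5, 6]);
const shiftRight = [1, 0, 1, 0, 1, 0, 0, 0, 1];
const tinyDst = warpNearest(tinySrc, 3, 2, shiftRight, 3, 2);
```

Looping over destination pixels (rather than pushing source pixels forward) guarantees that every output pixel gets a value and leaves no holes. The output size is chosen by the caller, which is why the code above passes im2.size().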

Here is the final image:

Extra Credit 1: Follow Your Data

To truly understand how the process works try to follow what is happening at each step with your own data. Here is a graphic that may help:

Extra Credit 2: Use the RANSAC mask created with findHomography and display a graphic of only the good “Inlier” matches

Here is a graphic that will outline the process:

Here is the code to support:

let findHomographyMask = new cv.Mat();
let h = cv.findHomography(mat1, mat2, cv.RANSAC, 3, findHomographyMask);
if (h.empty()) {
    alert("homography matrix empty!");
} else {
    let good_inlier_matches = new cv.DMatchVector();
    // Row k of the mask belongs to row k of mat1/mat2, which holds the points
    // of good_matches.get(k). A non-zero entry marks a RANSAC inlier.
    for (let k = 0; k < good_matches.size(); ++k) {
        if (findHomographyMask.data[k] !== 0) {
            good_inlier_matches.push_back(good_matches.get(k));
        }
    }
    var inlierMatches = new cv.Mat();
    cv.drawMatches(im1, keypoints1, im2, keypoints2, good_inlier_matches, inlierMatches, color);
    cv.imshow('inlierMatches', inlierMatches);
}
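
To illustrate what the mask means, here is a minimal sketch in plain JavaScript that rebuilds a comparable 0/1 mask from a known homography by checking the reprojection error of each point pair. The helper names are hypothetical, and this shows only the inlier check, not the RANSAC search that findHomography performs to find the matrix.

```javascript
// Hypothetical helpers, not part of opencv.js.

// Map (x, y) through a row-major 3x3 homography.
function projectPoint(h, x, y) {
  const w = h[6] * x + h[7] * y + h[8];
  return [(h[0] * x + h[1] * y + h[2]) / w, (h[3] * x + h[4] * y + h[5]) / w];
}

// Build a 0/1 mask: 1 when the projected source point lands within
// `threshold` pixels of its matched destination point.
// srcPoints and dstPoints are flat [x0, y0, x1, y1, ...] arrays.
function inlierMask(h, srcPoints, dstPoints, threshold) {
  const mask = [];
  for (let k = 0; k < srcPoints.length / 2; k++) {
    const [px, py] = projectPoint(h, srcPoints[2 * k], srcPoints[2 * k + 1]);
    const err = Math.hypot(px - dstPoints[2 * k], py - dstPoints[2 * k + 1]);
    mask.push(err <= threshold ? 1 : 0);
  }
  return mask;
}

// Keep only the matches whose mask entry is non-zero.
function filterByMask(matches, mask) {
  return matches.filter((_, k) => mask[k] !== 0);
}
```

The threshold here plays the same role as the 3 passed to findHomography above: a point pair counts as an inlier when its reprojection error is at most that many pixels.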

4 comments on “Image Alignment (Feature Based) in OpenCV.js ( JavaScript )”

  1. kiyoki says:

    Excuse me, i download the opencv.js from and did some exercises, but when i tried to change the keypoint’s coordinate, i found that i could not change them. The code such as :
    fast.detect(mat, keyPoints, mask);
    keyPoints.get(0).pt.x += 1;

    But keyPoints(0).pt.x hadn’t been changed.
    I have to build an image pyramid so it is necessary to change the coordinate of the keypoint, could you please tell me how to change it. Thank you all the same.

    1. scottsuhy says:

      I’d ask the question on the forum:

      but isn’t there a “set”–> keyPoint.set ? I see it in the console.log:

      KeyPointVector {$$: {…}}
      $$: {ptrType: RegisteredPointer, ptr: undefined, count: {…}, smartPtr: undefined}
      __proto__: ClassHandle
      get: ƒ KeyPointVector$get(arg0)
      argCount: 1
      arguments: null
      caller: null
      length: 1
      name: “KeyPointVector$get”
      prototype: {constructor: ƒ}
      __proto__: ƒ ()
      [[FunctionLocation]]: VM4682:3
      [[Scopes]]: Scopes[3]
      push_back: ƒ KeyPointVector$push_back(arg0)
      resize: ƒ KeyPointVector$resize(arg0, arg1)
      set: ƒ KeyPointVector$set(arg0, arg1)
      argCount: 2
      arguments: null
      caller: null
      length: 2
      name: “KeyPointVector$set”
      prototype: {constructor: ƒ}
      __proto__: ƒ ()
      [[FunctionLocation]]: VM4062:3
      [[Scopes]]: Scopes[3]
      size: ƒ KeyPointVector$size()
      constructor: ƒ KeyPointVector()
      __proto__: Object

      1. kiyoki says:

        Thank you for your reply.
        I think that “set” is used for cv.KeyPointVector(not KeyPoint in the Vector):
        keyPoints.set(2, keyPoints.get(0))
        and this can make KeyPoints.get(2) equal KeyPoints.get(0).
        And “set” can not make more detailed change such as coordinate or response.

  2. scottsuhy says:

    ah, yes. I’d ask it in the opencv forum. I did a quick search in stackoverflow but did not see anything related.
