## Why does the visualized result of stereo calibration not match the T value of PoseCamera2?

I want to estimate the positions and orientations of two cameras, so I captured images of a checkerboard, created 27 pairs, and performed stereo calibration. The distance between the cameras is approximately 4.5 meters, and the checkerboard was photographed from various angles at distances ranging from 1 to 6 meters from the cameras. The overall average reprojection error during this process was 0.28 pixels.

The results of the stereo calibration are shown in the attached image. To make the issue more understandable, the image is displayed in the x-z plane. Below, I will list the ‘data from this image’ and ‘PoseCamera2’s T’ obtained from the stereo calibration.

Visual data: approximately [-2700, -500, 3500]…①

PoseCamera2’s T: [3540, -734, 2748]…②

When comparing the results of ① and ②, it seems that the x and z components are reversed. I would like to know why the results of ① and ②, which are supposed to be the same, do not match. Additionally, the actual positional relationship between Camera 1 and Camera 2 nearly matches ①, so I would like to understand the reason for this as well.
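As a quick sanity check on ① and ②, the lengths of the two vectors can be compared. This is only a sketch in plain Python (not MATLAB), and it assumes both vectors are in millimeters, which the post does not state explicitly. If the two lengths agree with each other and with the roughly 4.5 m camera spacing, then ① and ② are more likely the same displacement expressed in two different frames than two unrelated results.

```python
import math

# Values copied from the question; the unit (millimeters) is an assumption.
visual = [-2700.0, -500.0, 3500.0]   # (1) read off the plot, approximate
T = [3540.0, -734.0, 2748.0]         # (2) PoseCamera2's T

def norm(v):
    """Euclidean length of a 3-vector."""
    return math.sqrt(sum(c * c for c in v))

visual_len = norm(visual)
T_len = norm(T)

# Both lengths should be close to the ~4500 mm camera spacing.
print(round(visual_len), round(T_len))
```

Both lengths come out near 4.5 m, so the baseline length itself appears consistent; only the direction differs.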

Regarding the cause of the discrepancy between ① and ②, I have considered the following points, but I have not resolved the issue:

Differences due to coordinate systems: I considered that ① and ② might be expressed in different coordinate systems. However, since the PoseCamera2 output by stereo calibration uses the optical center of Camera 1 as the origin, the origin of the visual data should be the same. Therefore, I believe that the coordinate systems of ① and ② are the same.

Misunderstanding in handling R and T: PoseCamera2 represents the relative pose that transforms Camera 2’s pose into Camera 1’s pose. If we represent the pose of Camera 1 as [R1, t1; 0 1], the pose of Camera 2 as [R2, t2; 0 1], and PoseCamera2 as [R, T; 0 1], the relationship [R1, t1; 0 1] = [R, T; 0 1] * [R2, t2; 0 1] holds, leading to:

R1 = R * R2…③

t1 = R * t2 + T…④

I was previously informed of this relationship by a STAFF member. Here, even when considering only the position and ignoring the orientation R, substituting ② into ④ does not align the pose of Camera 2 with that of Camera 1. Therefore, I suspect there might be a misunderstanding in my handling of R and T.

As a supplementary note, the values of R are [0.051, 0.085, -0.99; -0.31, 0.95, 0.065; 0.95, 0.31, 0.075].
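One way to test the "handling of R and T" hypothesis numerically is sketched below in plain Python. It rests on an assumption that is not confirmed anywhere in this post: that [R, T] maps point coordinates from the Camera 1 frame into the Camera 2 frame as x2 = R * x1 + T (column vectors). Under that assumption, the optical center of Camera 2 expressed in Camera 1 coordinates is -R' * T rather than T itself. The helper name `camera_center` is hypothetical.

```python
# R and T copied from the question (rows of R separated by ';' in the post).
R = [[0.051, 0.085, -0.99],
     [-0.31, 0.95, 0.065],
     [0.95, 0.31, 0.075]]
T = [3540.0, -734.0, 2748.0]

def camera_center(R, T):
    """Return -R^T * T.

    Assumption: x_target = R * x_source + T. Setting x_target = 0 gives
    the target camera's optical center in the source frame.
    """
    return [-sum(R[k][i] * T[k] for k in range(3)) for i in range(3)]

center = camera_center(R, T)

# Compare against (1), approximately [-2700, -500, 3500].
print([round(c) for c in center])
```

With these numbers the result is roughly [-3019, -455, 3346], which has the same signs and similar magnitudes as ①. If the assumption holds, the plot would be showing the position of Camera 2 in the Camera 1 frame, while T is the translation of the inverse mapping.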
