6. ROS+OpenCV Course

6.1 Color Threshold Adjustment

Various light sources can affect colors differently, leading to discrepancies in recognition. To address this issue, LAB TOOL is used to adjust the color threshold so that color recognition remains accurate and consistent.

6.1.1 Launching and Closing LAB TOOL

Note

The input command should be case sensitive, and the keywords can be complemented by the Tab key.

Please strictly follow the steps below to open the LAB TOOL. Otherwise, the tool cannot be opened.
Power on the robot and connect it to the remote control software NoMachine. For instructions on setting up the remote desktop connection, refer to the section 1.7.2 AP Mode Connection Steps in the manual.
Click the terminal icon in the system desktop to open a command-line window.
Enter the following command and press Enter to stop the app auto-start service.

sudo systemctl stop start_app_node.service

Enter the command to start the camera node:

ros2 launch peripherals depth_camera.launch.py

Then open a new terminal, run the LAB Tool by entering the following command, and press Enter:

python3 /home/ubuntu/software/lab_tool/main.py

See the next section for an introduction to the interface and instructions for the buttons. Click the Quit button at the bottom right to close it.

After closing the LAB Tool, enter the command to restart the app auto-start service. Once startup is complete, the robotic arm returns to its initial position.

sudo systemctl restart start_app_node.service

Note

If the app auto-start service is not running, some app functions may not work properly. If the auto-start command has not been executed, restarting the robot will also automatically restart the app auto-start service.

6.1.2 LAB TOOL Interface Introduction

LABTOOL is divided into two parts, including the image display area and the recognition adjustment area.

Image display area: The processed camera feed is shown on the left, and the raw feed is shown on the right.

Note

If the video feed fails to display, indicating a camera connection error, check the camera cable for a secure connection or reconnect it.

Adjustment Panel: Allows color-threshold tuning. The functions of each button are listed below.

Icon	Function
	Sliders L, A, and B are respectively used to adjust L, A, and B components of the camera returned image. The left sliders set the min values for each channel, and the right sliders set the max values.
	Select the target color for threshold tuning.
	Delete the selected color.
	Add a new detectable color.
	Save the adjusted value.
	Switch the camera between the depth camera and the monocular camera.
	Exit from the LAB TOOL.

6.1.3 Adjust Color Threshold

Open LABTOOL, and select the color in the drop-down menu. Take adjusting the red color, for example.

Modify all min values of L, A, and B to 0, and max values to 255.

Put the red block within the camera frame. According to the LAB color space, adjust L, A, and B components to approach the zone of the target color.

If the red area appears close to +a, increase the A component. Keep the A component’s max value unchanged and raise its min value until the target object in the left display turns white while all other areas turn black.

Based on the environment, modify the value of L and B. Increase the L component’s min value if the red tone is too light, or reduce its max value if it is too dark. Increase the B component’s min value if the red tone is too warm, or reduce its max value if it is too cool.

LAB Threshold Adjustment Parameter

Color Component	Range	Corresponding Color Zone
L	0~255	Black-White（-L ~ +L）
A	0~255	Green-Red（-a ~ +a）
B	0~255	Blue-Yellow（-b ~ +b）

Click Save to save the adjusted color threshold parameter.

6.1.4 Add New Color for Detection

In addition to the built-in detectable colors, additional colors can be added. Yellow is used as an example below. The steps are as follows:

Open the LAB TOOL and click the Add button.

Then fill in yellow, and click OK.

Select the newly added color from the color list.

Face the camera at the yellow block. And drag the sliders of L, A, and B to adjust the color threshold till the object on the left turns white and the other area turns black.

Note

For instructions on adjusting color thresholds, refer to section 6.1.3 Adjust Color Threshold.

Click Save to save the adjusted color threshold parameter.

Once the adjustment is completed, press Ctrl + C to disable the camera service, then click Quit to exit the interface.

6.2 Color Recognition

This section demonstrates the detection of red, green, and blue objects using OpenCV, with the results displayed on the video feed. Before starting the feature, please prepare three objects in these colors: red, green, and blue.

6.2.1 Recognition Process

Firstly, the program will acquire the RGB image of the camera, scale the image, and perform Gaussian filtering on the image to convert the image color space from RGB to LAB.

Based on this, color thresholds are used to identify the color of the object within the circle. A mask is then applied to the image, which involves selecting parts of the image, graphics, or objects to globally or locally block out areas in the image for processing.

After masking, morphological operations, including opening and closing, are performed on the object image to refine the results.

Opening operation: Involves erosion followed by dilation. Effect: Removes small objects, smooths object boundaries, and does not affect object area. It can remove small particle noise and separate objects that are stuck together.

Erosion: Removes boundary pixels of an object, causing the edges to shrink inward. This operation can eliminate objects smaller than the structuring element.

Dilation: Expands the boundary pixels of an object by merging surrounding background pixels that are in contact with the object, causing the edges to grow outward.

Finally, the recognition results are overlaid on the return image.

6.2.2 Operation Steps

Note

Commands must be entered with correct capitalization. The Tab key can be used to auto-complete keywords.

Power on the robot and connect it via the NoMachine remote control software. For detailed information on connecting to a remote desktop, please refer to the section 1.7.2 AP Mode Connection Steps in the manual.
Click the terminal icon in the system desktop to open a command-line window.
Enter the command to disable the app auto-start service.

sudo systemctl stop start_app_node.service

Enter the command to start the camera node:

ros2 launch peripherals depth_camera.launch.py

Open a new terminal, navigate to the program directory, and start the color recognition feature by entering the following command:

cd ~/ros2_ws/src/example/example/color_detect && python3 color_detect_demo.py

The program will launch the camera image interface.
To exit the feature, press Ctrl + C in the terminal. If the program does not close successfully, try pressing Ctrl + C again.

6.2.3 Project Outcome

Note

After the feature starts, please ensure there is no other object containing the recognition color except the target object within the camera view. Otherwise, the recognition result will be affected.

After the feature starts, put the objects within the camera’s view. When the target object is detected, the program will highlight it with a circle of the corresponding color and display the color name in the lower-left corner of the window. The program can detect objects in red, blue, and green.

6.2.4 Program Analysis

The feature’s source code is located at: /home/ubuntu/ros2_ws/src/example/example/color_detect/color_detect_demo.py

# Color detection

range_rgb = {
    'red': (0, 0, 255),
    'blue': (255, 0, 0),
    'green': (0, 255, 0),
    'black': (0, 0, 0),
    'white': (255, 255, 255),
}

def get_yaml_data(yaml_file):
    yaml_file = open(yaml_file, 'r', encoding='utf-8')
    file_data = yaml_file.read()
    yaml_file.close()
    
    data = yaml.load(file_data, Loader=yaml.FullLoader)
    
    return data

lab_data = get_yaml_data("/home/ubuntu/software/lab_tool/lab_config.yaml")

Main Function

Initializes the ROS node and creates the color detection node named color_detect_node, subscribes to the image topic /depth_cam/rgb/image_raw with image_callback as its callback function, runs the color detection main function using multi-threading, and waits for the node to shut down.

if __name__ == '__main__':
    rclpy.init()
    node = rclpy.create_node('color_detect_node')
    node.create_subscription(Image, '/depth_cam/rgb/image_raw', image_callback, 1)
    threading.Thread(target=main, daemon=True).start()
    rclpy.spin(node)

Functions

The camera callback function is primarily used to read the video stream from the topic and push it into a queue, ensuring real-time display.

def image_callback(ros_image):
    cv_image = bridge.imgmsg_to_cv2(ros_image, "bgr8")
    bgr_image = np.array(cv_image, dtype=np.uint8)
    if image_queue.full():
        # If the queue is full, discard the oldest image
        image_queue.get()
        # Put the image into the queue
    image_queue.put(bgr_image)

def main()

The images in the queue are read and passed into the run function to obtain the color-recognized frames, which are then displayed using cv2.

def main():
    running = True
    while running:
        try:
            image = image_queue.get(block=True, timeout=1)
        except queue.Empty:
            if not running:
                break
            else:
                continue
        image = run(image)
        cv2.imshow('image', image)
        key = cv2.waitKey(1)
        if key == ord('q') or key == 27:  # Press Q or Esc to exit
            break

YAML File Reading Function

def get_yaml_data(yaml_file):
    yaml_file = open(yaml_file, 'r', encoding='utf-8')
    file_data = yaml_file.read()
    yaml_file.close()
    
    data = yaml.load(file_data, Loader=yaml.FullLoader)
    
    return data

Maximum Contour Function

The function takes the contours detected by OpenCV as input, identifies the largest contour based on area, and returns both the contour and its area.

# Find the contour with the largest area
# The parameter is a list of contours to be compared
def getAreaMaxContour(contours):
    contour_area_temp = 0
    contour_area_max = 0
    area_max_contour = None

    for c in contours:  # Iterate through all contours
        contour_area_temp = math.fabs(cv2.contourArea(c))  # Calculate the area of the contour
        if contour_area_temp > contour_area_max:
            contour_area_max = contour_area_temp
            if contour_area_temp > 50:  # Only when the area is greater than 50, the contour with the largest area is valid. This allows to filter out interference)
                area_max_contour = c

    return area_max_contour, contour_area_max  # Return the contour with the largest area)

Color Recognition Function

The function takes an image as input and resizes it to the specified size (320, 240) to facilitate detection. The image is then processed with a Gaussian filter and converted to the LAB color space.

def run(img):
    global draw_color
    global color_list
    global detect_color
    
    img_copy = img.copy()
    img_h, img_w = img.shape[:2]

    frame_resize = cv2.resize(img_copy, size, interpolation=cv2.INTER_NEAREST)
    frame_gb = cv2.GaussianBlur(frame_resize, (3, 3), 3)      
    frame_lab = cv2.cvtColor(frame_gb, cv2.COLOR_BGR2LAB)  # Convert image to LAB space

    max_area = 0
    color_area_max = None    
    areaMaxContour_max = 0
    for i in ['red', 'green', 'blue']:
        frame_mask = cv2.inRange(frame_lab, tuple(lab_data['lab']['Stereo'][i]['min']), tuple(lab_data['lab']['Stereo'][i]['max']))  #Perform bitwise operation on the original image and the mask
        eroded = cv2.erode(frame_mask, cv2.getStructuringElement(cv2.MORPH_RECT, (3, 3)))  #Erode)
        dilated = cv2.dilate(eroded, cv2.getStructuringElement(cv2.MORPH_RECT, (3, 3))) #Dilate
        contours = cv2.findContours(dilated, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_NONE)[-2]  #Find contours
        areaMaxContour, area_max = getAreaMaxContour(contours)  #Find the largest contour
        if areaMaxContour is not None:
            if area_max > max_area:#Find the maximum area
                max_area = area_max
                color_area_max = i
                areaMaxContour_max = areaMaxContour

The function loops through the predefined color list to perform color recognition and outputs the color corresponding to the largest detected contour.

    for i in ['red', 'green', 'blue']:
        frame_mask = cv2.inRange(frame_lab, tuple(lab_data['lab']['Stereo'][i]['min']), tuple(lab_data['lab']['Stereo'][i]['max']))  #Perform bitwise operation on the original image and the mask
        eroded = cv2.erode(frame_mask, cv2.getStructuringElement(cv2.MORPH_RECT, (3, 3)))  #Erode
        dilated = cv2.dilate(eroded, cv2.getStructuringElement(cv2.MORPH_RECT, (3, 3))) #Dilate
        contours = cv2.findContours(dilated, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_NONE)[-2]  #Find contours
        areaMaxContour, area_max = getAreaMaxContour(contours)  #Find the largest contour
        if areaMaxContour is not None:
            if area_max > max_area:#Find the maximum area
                max_area = area_max
                color_area_max = i
                areaMaxContour_max = areaMaxContour

Only contours with an area greater than 200 are considered, smaller contours are filtered out. For contours larger than 200, the function calculates the contour coordinates and highlights the target color’s position in the frame based on the largest contour’s color.

    if max_area > 200:  # The maximum area is found
        ((centerX, centerY), radius) = cv2.minEnclosingCircle(areaMaxContour_max)  # Obtain the minimum circumscribed circle
        centerX = int(common.val_map(centerX, 0, size[0], 0, img_w))
        centerY = int(common.val_map(centerY, 0, size[1], 0, img_h))
        radius = int(common.val_map(radius, 0, size[0], 0, img_w))            
        cv2.circle(img, (centerX, centerY), radius, range_rgb[color_area_max], 2)# Draw circle

        if color_area_max == 'red':  # Red has the maximum area
            color = 1
        elif color_area_max == 'green':  # Green has the maximum area
            color = 2
        elif color_area_max == 'blue':  # Blue has the maximum area
            color = 3
        else:
            color = 0
        color_list.append(color)

        if len(color_list) == 3:  # Determine multiple times
            # Averaging
            color = int(round(np.mean(np.array(color_list))))
            color_list = []
            if color == 1:
                detect_color = 'red'
                draw_color = range_rgb["red"]
            elif color == 2:
                detect_color = 'green'
                draw_color = range_rgb["green"]
            elif color == 3:
                detect_color = 'blue'
                draw_color = range_rgb["blue"]
            else:
                detect_color = 'None'
                draw_color = range_rgb["black"]               
    else:
        detect_color = 'None'
        draw_color = range_rgb["black"]
            
    cv2.putText(img, "Color: " + detect_color, (10, img.shape[0] - 10), cv2.FONT_HERSHEY_SIMPLEX, 0.65, draw_color, 2)
    
    return img

6.3 QR Code Creation and Recognition

This section is divided into two parts. The first part explains how to create a QR code, and the second part focuses on recognizing the created QR code and decoding its information via the terminal.

6.3.1 QR Code Generation

Generation Process

First, create an instance of the QR code tool and set its detailed parameters.

Next, obtain the input data and populate it into the QR code.

Finally, generate a QR code image based on the data, display it in a window, and save it to the specified path.

Creation Steps

Note

Commands must be entered with correct capitalization. The Tab key can be used to auto-complete keywords.

Power on the robot and connect it via the NoMachine remote control software. For detailed information on connecting to a remote desktop, please refer to the section 1.7.2 AP Mode Connection Steps in the manual.
Click the terminal icon in the system desktop to open a command-line window.
Enter the command to disable the app auto-start service.
```
sudo systemctl stop start_app_node.service
```
Enter the following command to switch to the program directory and start the QR code creation program:
```
cd ~/ros2_ws/src/example/example/qrcode && python3 qrcode_creater.py
```

After launching the program, enter characters in the terminal to generate a QR code. For example, type ubuntu.

Press Enter, and a QR code containing the input data will be displayed.

To exit the QR code window, press the ESC key or the Q key.
Returning to the command-line terminal, the message displayed below confirms that the QR code has been successfully generated and saved.

Next, click the taskbar on the left side of the system interface, then click to open the file manager. Navigate to the directory highlighted in the red box below to find that the generated QR code has been exported to the host system.

Drag the image to the PC desktop using NoMachine by dragging with the mouse. The image can then be printed or transferred to a photo album.

Program Analysis

The source code for the program is located at: /home/ubuntu/ros2_ws/src/example/example/qrcode/qrcode_creater.py

# encoding: utf-8
import os
import cv2
import qrcode
import numpy as np

def create_qrcode(data, file_name):
    '''
    version：An integer value between 1 and 40 that controls the size of the QR code. The minimum value is 1, which indicates a 12x12 matrix.
            To let the program determine the size automatically, set the value to None and use the fit parameter.
    error_correction：It controls the error correction capability of the QR code, and can take one of the following four constants.
    　　ERROR_CORRECT_L：Can correct about 7% or fewer errors.
    　　ERROR_CORRECT_M（default）：Can correct about 15% or fewer errors.
    　　ROR_CORRECT_H：Can correct about 30% or fewer errors.
    box_size：Controls the number of pixels in each small square in the QR code
    border：Controls the number of squares in the border, the distance between the QR code and the image boundary. Default is 4, which is the minimum value specified by the relevant standard.
    '''
    qr = qrcode.QRCode(
        version=1,
        error_correction=qrcode.constants.ERROR_CORRECT_H,
        box_size=5,
        border=4)
    # Add data
    qr.add_data(data)
    # Fill data
    qr.make(fit=True)
    # Generate image
    img = qr.make_image(fill_color=(0, 0, 0), back_color=(255, 255, 255))

Creating a QR Code Tool Object

Use the qrcode module to create the required object and set the parameters for the QR code.

    qr = qrcode.QRCode(
        version=1,
        error_correction=qrcode.constants.ERROR_CORRECT_H,
        box_size=5,
        border=4)
    # Add data
    qr.add_data(data)

The parameters of the function shown above are as follows:

The first parameter, version, is an integer from 1 to 40 that controls the size of the QR code. To let the program determine the size automatically, set this parameter to None and use the fit parameter.

The second parameter, error_correction, controls the error correction level of the QR code and can take the following values:

ERROR_CORRECT_L allows approximately 7% or less of errors to be corrected.
ERROR_CORRECT_M is the default value and allows approximately 15% or less of errors to be corrected.
ROR_CORRECT_H allows approximately 30% or less of errors to be corrected.

The third parameter, box_size, controls the number of pixels contained in each module of the QR code.

The fourth parameter, border, controls the number of modules included in the border and defines the distance between the QR code and the image edge. The default value is 4, which is the minimum required by the relevant standard.

Generating a QR Code

Data is added using the add_data and make functions, and the QR code image is generated using the make_image function.

    # Add data
    qr.add_data(data)
    # Fill data
    qr.make(fit=True)
    # Generate image
    img = qr.make_image(fill_color=(0, 0, 0), back_color=(255, 255, 255))
    opencv_img = cv2.cvtColor(np.asarray(img), cv2.COLOR_RGB2BGR)

The parameters of the make_image function are as follows:

fill_color=(0, 0, 0) sets the fill color of the QR code, which is black in this case.

back_color=(255, 255, 255) sets the background color of the QR code, which is white.

Displaying the Image

Convert the image’s color space using the cvtColor function, then display it in a window using the imshow function.

   while True:
        cv2.imshow('img', opencv_img)
        k = cv2.waitKey(1)
        if k != -1:
            break
    cv2.imwrite(file_name, opencv_img)
    print('save', data, file_name)

Saving the Image

Use the imwrite function to save the generated QR code image and print the relevant information.

    cv2.imwrite(file_name, opencv_img)
    print('save', data, file_name)

The parameters of the imwrite function are:

file_name specifies the storage path of the image.

opencv_img is the image to be saved.

6.3.2 QR Code Recognition

In the previous section, a QR code is created, which will be used to demonstrate recognizing the content of a QR code.

Recognition Process

First, create an instance of the QR code detector and load the required network structure and model weight files for detection.

Next, capture the video stream from the camera and perform detection on each frame.

Finally, when a QR code is detected, it will be highlighted with a bounding box, and the content of the QR code will be printed.

QR Code Recognition Steps

Note

Commands must be entered with correct capitalization. The Tab key can be used to auto-complete keywords.

Power on the robot and connect it via the NoMachine remote control software. For detailed information on connecting to remote desktop, please refer to section 1.7.2 AP Mode Connection Steps in the manual.
Click the terminal icon in the system desktop to open a command-line window.
Enter the command to disable the app auto-start service.

sudo systemctl stop start_app_node.service

Enter the command to start the camera node:

ros2 launch peripherals depth_camera.launch.py

Open a new ROS2 terminal, navigate to the program directory, and start the QR code recognition program by entering the following command:

cd ~/ros2_ws/src/example/example/qrcode && python3 qrcode_detecter.py

To exit the feature, press Ctrl + C in the terminal. If the program does not close successfully, try pressing Ctrl + C again.

Project Outcome

After starting the feature, the QR code appearing in the returned video frame is detected, highlighted with a red box, and its content is printed.

Program Analysis

The source code for the program is located at: /home/ubuntu/ros2_ws/src/example/example/qrcode/qrcode_detecter.py

#!/usr/bin/env python3
# encoding: utf-8
import os
import cv2  
import queue
import rclpy
import threading
import numpy as np
from rclpy.node import Node
from cv_bridge import CvBridge
from sensor_msgs.msg import Image

class QRCodeDetectNode(Node):
    def __init__(self, name):
        rclpy.init()
        super().__init__(name)
        self.running = True
        self.bridge = CvBridge()
        self.model_path = os.path.join(os.path.abspath(os.path.split(os.path.realpath(__file__))[0]), 'model/detector.tflite')
        self.image_queue = queue.Queue(maxsize=2)
        self.image_sub = self.create_subscription(Image, '/depth_cam/rgb0/image_raw', self.image_callback, 1)
        self.qcd = cv2.QRCodeDetector()
        threading.Thread(target=self.main, daemon=True).start()  

    def image_callback(self, ros_image):
        cv_image = self.bridge.imgmsg_to_cv2(ros_image, "bgr8")
        bgr_image = np.array(cv_image, dtype=np.uint8)

Main Function

Initialize the QRCodeDetectNode and set the node name to qrcode_detect.

Wait for the ROS node to shut down.

If the node shuts down, unregister the current node.

def main():
    node = QRCodeDetectNode('qrcode_detect')
    try:
        rclpy.spin(node)
    except KeyboardInterrupt:
        node.destroy_node()
        rclpy.shutdown()
        print('shutdown')
    finally:
        print('shutdown finish')

if __name__ == "__main__":
    main()

QRCodeDetectNode Class

The class initializes the node, defines parameters, and starts the self.main function.

Parameters:

self.running determines whether detection is enabled.

self.model_path is the path to the detection model.

self.image_queue is the queue for storing images.

self.image_sub subscribes to the camera video stream.

self.qcd initializes the QR code detection model.

        super().__init__(name)
        self.running = True
        self.bridge = CvBridge()
        self.model_path = os.path.join(os.path.abspath(os.path.split(os.path.realpath(__file__))[0]), 'model/detector.tflite')
        self.image_queue = queue.Queue(maxsize=2)
        self.image_sub = self.create_subscription(Image, '/depth_cam/rgb0/image_raw', self.image_callback, 1)
        self.qcd = cv2.QRCodeDetector()
        threading.Thread(target=self.main, daemon=True).start()  

    def image_callback(self, ros_image):
        cv_image = self.bridge.imgmsg_to_cv2(ros_image, "bgr8")
        bgr_image = np.array(cv_image, dtype=np.uint8)
        if self.image_queue.full():
            # If the queue is full, discard the oldest image
            self.image_queue.get()
            # Put the image into the queue
        self.image_queue.put(bgr_image)

    def main(self):
        while self.running:
            try:
                image = self.image_queue.get(block=True, timeout=1)
            except queue.Empty:
                if not self.running:
                    break
                else:
                    continue
            ret_qr, decoded_info, points, _ = self.qcd.detectAndDecodeMulti(image)
            if ret_qr:
                for s, p in zip(decoded_info, points):
                    if s:
                        print(s)
                        color = (0, 255, 0)
                    else:
                        color = (0, 0, 255)
                    image = cv2.polylines(image, [p.astype(int)], True, color, 8)
            cv2.imshow('image', image)
            key = cv2.waitKey(1)
            if key == ord('q') or key == 27:  # Press q or esc to exit
                break

Functions:

The camera callback function reads the camera video stream and places the frames into the queue self.image_queue, automatically updating and discarding outdated frames.

    def image_callback(self, ros_image):
        cv_image = self.bridge.imgmsg_to_cv2(ros_image, "bgr8")
        bgr_image = np.array(cv_image, dtype=np.uint8)
        if self.image_queue.full():
            # If the queue is full, discard the oldest image
            self.image_queue.get()
            # Put the image into the queue
        self.image_queue.put(bgr_image)

The main function handles QR code detection. It checks the self.running parameter to determine whether to start detection.

If True, it reads images from the self.image_queue, inputs them into the initialized detection model self.qcd, and finally prints the recognized content and draws the detection boxes based on the output.

    def main(self):
        while self.running:
            try:
                image = self.image_queue.get(block=True, timeout=1)
            except queue.Empty:
                if not self.running:
                    break
                else:
                    continue
            ret_qr, decoded_info, points, _ = self.qcd.detectAndDecodeMulti(image)
            if ret_qr:
                for s, p in zip(decoded_info, points):
                    if s:
                        print(s)
                        color = (0, 255, 0)
                    else:
                        color = (0, 0, 255)
                    image = cv2.polylines(image, [p.astype(int)], True, color, 8)
            cv2.imshow('image', image)
            key = cv2.waitKey(1)
            if key == ord('q') or key == 27:  # Press q or esc to exit
                break

6.4 Autonomous Line Following

6.4.1 Recognition Process

Firstly, the program will acquire the RGB image of the camera, scale the image, and perform Gaussian filtering on the image to convert the image color space from RGB to LAB.

Based on this, color thresholds are used to identify the color of the object within the circle. A mask is then applied to the image, which involves selecting parts of the image, graphics, or objects to globally or locally block out areas in the image for processing.

After masking, morphological operations, including opening and closing, are performed on the object image to refine the results.

Opening operation: Involves erosion followed by dilation. Effect: Removes small objects, smooths object boundaries, and does not affect object area. It can remove small particle noise and separate objects that are stuck together.

Erosion: Removes boundary pixels of an object, causing the edges to shrink inward. This operation can eliminate objects smaller than the structuring element.

Dilation: Expands the boundary pixels of an object by merging surrounding background pixels that are in contact with the object, causing the edges to grow outward.

Finally, the recognition results are overlaid on the return image.

6.4.2 Operation Steps

Note

Commands must be entered with correct capitalization. The Tab key can be used to auto-complete keywords.

Power on the robot and connect it via the NoMachine remote control software. For detailed information on connecting to a remote desktop, please refer to section 1.7.2 AP Mode Connection Steps in the manual.
Click the terminal icon in the system desktop to open a command-line window.
Enter the command to disable the app auto-start service.

sudo systemctl stop start_app_node.service

Enter the command to start the line-following node:

ros2 launch app line_following_node.launch.py debug:=true

Open a new terminal, navigate to the program directory, and start the autonomous line following feature by entering the following command:

ros2 service call /line_following/enter std_srvs/srv/Trigger {}

Next, in the current terminal, switch to the program directory and start the autonomous line following feature by entering the following command:

ros2 service call /line_following/set_running std_srvs/srv/SetBool "{data: True}"

Click on the line in the interface, select the corresponding color of the track, and the robot will start following the line.

To stop this feature, return to the terminal and enter the following command:

ros2 service call /line_following/enter std_srvs/srv/Trigger {}

To exit the feature, press Ctrl + C in the terminal. If the program does not close successfully, try pressing Ctrl + C again.

6.4.3 Program Analysis

Launch Analysis

The program file corresponding to this section is located at /home/ubuntu/ros2_ws/src/app/launch/line_following_node.launch.py

Reading Paths

    compiled = os.environ['need_compile']
    debug = LaunchConfiguration('debug', default='false')
    debug_arg = DeclareLaunchArgument('debug', default_value=debug)
    if compiled == 'True':
        controller_package_path = get_package_share_directory('controller')

The package path is obtained using get_package_share_directory.

Launch

peripherals_package_path = '/home/ubuntu/ros2_ws/src/peripherals'
    line_following_node = GroupAction([
        IncludeLaunchDescription(
            PythonLaunchDescriptionSource(
                os.path.join(peripherals_package_path, 'launch/lidar.launch.py')),
            condition=IfCondition(debug),
            ),

        IncludeLaunchDescription(
            PythonLaunchDescriptionSource(
                os.path.join(peripherals_package_path, 'launch/depth_camera.launch.py')),
            condition=IfCondition(debug),
            ),

        IncludeLaunchDescription(
            PythonLaunchDescriptionSource(
                os.path.join(controller_package_path, 'launch/controller.launch.py')),
            condition=IfCondition(debug),
            ),

        Node(
            package='app',
            executable='line_following',
            output='screen',
            parameters=[{'debug': debug}],
            ),

controller_launch starts the motion control launch file.

depth_camera_launch starts the depth camera launch file.

lidar_launch starts the LiDAR launch file.

None

line_following_node starts the line following node.

            package='app',
            executable='line_following',
            output='screen',

Source Code Analysis

The source code for this demo is located at /home/ubuntu/ros2_ws/src/app/app/line_following.py

Note

Before modifying the program, backup the original factory code. Do not modify the source code file directly to avoid robot malfunction due to incorrect parameter changes.

Subscribe to the topic published by the camera node to obtain RGB images. Detect and select the line to be followed in the image, then sample its color to obtain the corresponding threshold range. Based on the line color information, extract the line features, calculate the offset of the robot relative to the line in the field of view, and drive the robot forward along the line while continuously correcting its position to keep the line centered in the image. LiDAR-based obstacle detection is also used to avoid obstacles.

The initialization function first defines several preset parameters, such as the color threshold ratio, stop threshold, and angle swing parameters. These parameters can affect the final recognition result. A color picker object is then defined for the color sampling step in this application, followed by the configuration of the additional LiDAR obstacle avoidance function. Next is the implementation of the LineFollower class, which mainly includes the image preprocessing function and the line-following logic.

Functions

Main:

def main():
    node = LineFollowingNode('line_following')
    rclpy.spin(node)
    node.destroy_node()
    rclpy.shutdown()

Starts the line-following node.

Class:

LineFollower:

class LineFollower:
    def __init__(self, color, node):
        self.node = node

Init:

    def __init__(self, color, node):
        self.node = node
        self.target_lab, self.target_rgb = color
        if 'Mecanum' in os.environ['MACHINE_TYPE'] or 'Acker' in os.environ['MACHINE_TYPE']:
            self.rois = ((0.90, 0.92, 0, 1, 0.7), (0.78, 0.80, 0, 1, 0.2), (0.66, 0.68, 0, 1, 0.1))
        else:
            self.rois = ((0.81, 0.83, 0, 1, 0.7), (0.69, 0.71, 0, 1, 0.2), (0.57, 0.59, 0, 1, 0.1))
        self.weight_sum = 1.0

Sets the line-following color and ROI list.

get_area_max_contour:

    def get_area_max_contour(contours, threshold=100):
        '''
        Get the contour of the largest area
        :param contours:
        :param threshold:
        :return:
        '''
        contour_area = zip(contours, tuple(map(lambda c: math.fabs(cv2.contourArea(c)), contours)))
        contour_area = tuple(filter(lambda c_a: c_a[1] > threshold, contour_area))
        if len(contour_area) > 0:
            max_c_a = max(contour_area, key=lambda c_a: c_a[1])
            return max_c_a
        return None

Gets the contour with the largest area.

Call:

    def __call__(self, image, result_image, threshold, color=None, use_color_picker=True):
        centroid_sum = 0
        h, w = image.shape[:2]
        
        if use_color_picker:
            min_color = [int(self.target_lab[0] - 50 * threshold * 2),
                         int(self.target_lab[1] - 50 * threshold),
                         int(self.target_lab[2] - 50 * threshold)]
            max_color = [int(self.target_lab[0] + 50 * threshold * 2),
                         int(self.target_lab[1] + 50 * threshold),
                         int(self.target_lab[2] + 50 * threshold)]
            target_color = self.target_lab, min_color, max_color
            lowerb = tuple(target_color[1])
            upperb = tuple(target_color[2])
        else:
            lowerb = tuple(color['min'])
            upperb = tuple(color['max'])
        for roi in self.rois:
            blob = image[int(roi[0]*h):int(roi[1]*h), int(roi[2]*w):int(roi[3]*w)]  # Intercept roi
            img_lab = cv2.cvtColor(blob, cv2.COLOR_RGB2LAB)  # Convert rgb into lab
            img_blur = cv2.GaussianBlur(img_lab, (3, 3), 3)  # Perform Gaussian filtering to reduce noise
            mask = cv2.inRange(img_blur, lowerb, upperb)  # Image binarization
            eroded = cv2.erode(mask, cv2.getStructuringElement(cv2.MORPH_RECT, (3, 3)))  # Corrode
            dilated = cv2.dilate(eroded, cv2.getStructuringElement(cv2.MORPH_RECT, (3, 3)))  # Dilate
            # cv2.imshow('section:{}:{}'.format(roi[0], roi[1]), cv2.cvtColor(dilated, cv2.COLOR_GRAY2BGR))
            contours = cv2.findContours(dilated, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_TC89_L1)[-2]  # Find the contour
            max_contour_area = self.get_area_max_contour(contours, 30)  # Get the contour corresponding to the largest contour
            if max_contour_area is not None:
                rect = cv2.minAreaRect(max_contour_area[0])  # Minimum circumscribed rectangle
                box = np.intp(cv2.boxPoints(rect))  # Four corners
                for j in range(4):
                    box[j, 1] = box[j, 1] + int(roi[0]*h)
                cv2.drawContours(result_image, [box], -1, (0, 255, 255), 2)  # Draw the rectangle composed of four points

Performs color recognition based on the set color and provides feedback on the recognized image and angle.

LineFollowingNode:

class LineFollowingNode(Node):
    def __init__(self, name):
        rclpy.init()
        super().__init__(name, allow_undeclared_parameters=True, automatically_declare_parameters_from_overrides=True)
        
        self.name = name
        self.color = ''
        self.set_above = False
        self.set_callback = False
        self.is_running = False

Init:

    def __init__(self, name):
        rclpy.init()
        super().__init__(name, allow_undeclared_parameters=True, automatically_declare_parameters_from_overrides=True)
        
        self.name = name
        self.color = ''
        self.set_above = False
        self.set_callback = False
        self.is_running = False
        self.color_picker = None
        self.follower = None
        self.scan_angle = math.radians(45)
        self.pid = pid.PID(0.01, 0.0, 0.0)
        self.empty = 0
        self.count = 0
        self.stop = False
        self.threshold = 0.5

Initializes parameters required for the program, calls the chassis and camera nodes, and starts services such as enter, exit, start, set color, get color, and set threshold.

get_node_state:

    def get_node_state(self, request, response):
        response.success = True
        return response

Sets the current node state.

Main:

    def main(self):
        while True:
            try:
                image = self.image_queue.get(block=True, timeout=1)
            except queue.Empty:
                continue

            result = cv2.cvtColor(image, cv2.COLOR_RGB2BGR)
            # cv2.imshow("image", result)
            cv2.imshow("image", cv2.resize(result, (display_size[0], display_size[1])))
            if self.debug and not self.set_callback:
                self.set_callback = True
                # Set callback function for mouse clicking event
                cv2.setMouseCallback("image", self.mouse_callback)
            k = cv2.waitKey(1)
            if k != -1:
                break

Reads the image and uses the mouse for color selection.

mouse_callback:

    def mouse_callback(self, event, x, y, flags, param):
        if event == cv2.EVENT_LBUTTONDOWN:
            self.get_logger().info("x:{} y{}".format(x, y))
            msg = SetPoint.Request()
            if self.image_height is not None and self.image_width is not None:
                msg.data.x = x / self.image_width
                msg.data.y = y / self.image_height
                self.set_target_color_srv_callback(msg, SetPoint.Response())

Mouse color selection callback function to obtain the pixel coordinates of the current mouse position.

enter_srv_callback:

    def enter_srv_callback(self, request, response):
        self.get_logger().info('\033[1;32m%s\033[0m' % "line following enter")
        with self.lock:
            self.color = ''
            self.stop = False
            self.is_running = False
            self.color_picker = None
            self.pid = pid.PID(1.1, 0.0, 0.0)
            self.follower = None
            self.threshold = 0.5
            self.empty = 0
            if self.image_sub is None:
                if 'ROSOrin Pro' in self.machine_type:
                    self.camera_type = 'Stereo'
                    self.image_sub = self.create_subscription(Image, '/depth_cam/rgb0/image_raw', self.image_callback, 1)  # Subscribe to the camera
                else:
                    self.camera_type = 'Mono'
                    self.image_sub = self.create_subscription(Image, '/usb_cam/image_raw', self.image_callback, 1)  # Subscribe to the camera
            if self.lidar_sub is None:
                qos = QoSProfile(depth=1, reliability=QoSReliabilityPolicy.BEST_EFFORT)
                self.lidar_sub = self.create_subscription(LaserScan, '/scan_raw', self.lidar_callback, qos)  # Subscribe to LiDAR data

Starts autonomous line following, initializes robot motion parameters, and subscribes to camera and LiDAR topics.

exit_srv_callback:

    def exit_srv_callback(self, request, response):
        self.get_logger().info('\033[1;32m%s\033[0m' % "line following exit")
        try:
            if self.image_sub is not None:
                self.destroy_subscription(self.image_sub)
                self.image_sub = None
            if self.lidar_sub is not None:
                self.destroy_subscription(self.lidar_sub)
                self.lidar_sub = None
        except Exception as e:
            self.get_logger().error(str(e))
        with self.lock:
            self.is_running = False
            self.color_picker = None
            self.pid = pid.PID(0.01, 0.0, 0.0)
            self.follower = None
            self.threshold = 0.5
            self.mecanum_pub.publish(Twist())
        response.success = True
        response.message = "exit"
        return response

Exits the autonomous line following service, shuts down all active subscriptions, resets the PID controller, and stops line following.

set_target_color_srv_callback:

    def set_target_color_srv_callback(self, request, response):
        self.get_logger().info('\033[1;32m%s\033[0m' % "set_target_color")
        with self.lock:
            self.use_color_picker = True
            x, y = request.data.x, request.data.y
            self.get_logger().info(f'{request.data.x},{request.data.y}')
            # request.data.x  = request.data.x 
            # request.data.y = 
            self.follower = None
            if x == -1 and y == -1:
                self.color_picker = None
            else:
                self.color_picker = ColorPicker(request.data, 5)
                self.mecanum_pub.publish(Twist())
        response.success = True
        response.message = "set_target_color"
        return response

Service to set the target line-following color.

get_target_color_srv_callback:

    def get_target_color_srv_callback(self, request, response):
        self.get_logger().info('\033[1;32m%s\033[0m' % "get_target_color")
        response.success = False
        response.message = "get_target_color"
        with self.lock:
            if self.follower is not None:
                response.success = True
                rgb = self.follower.target_rgb
                response.message = "{},{},{}".format(int(rgb[0]), int(rgb[1]), int(rgb[2]))
        return response

Service to get the target line-following color.

set_running_srv_callback:

    def set_running_srv_callback(self, request, response):
        self.get_logger().info('\033[1;32m%s\033[0m' % "set_running")
        with self.lock:
            self.is_running = request.data
            self.empty = 0
            if not self.is_running:
                self.mecanum_pub.publish(Twist())
        response.success = True
        response.message = "set_running"
        return response

Service to start the autonomous line following feature.

set_threshold_srv_callback:

    def set_threshold_srv_callback(self, request, response):
        self.get_logger().info('\033[1;32m%s\033[0m' % "set threshold")
        with self.lock:
            self.threshold = request.data
            response.success = True
            response.message = "set_threshold"
            return response

Service to set the color threshold.

lidar_callback:

    def lidar_callback(self, lidar_data):
        # Data size= scanning angle/ the increased angle per scan
        if self.lidar_type != 'G4':
            min_index = int(math.radians(MAX_SCAN_ANGLE / 2.0) / lidar_data.angle_increment)
            max_index = int(math.radians(MAX_SCAN_ANGLE / 2.0) / lidar_data.angle_increment)
            left_ranges = lidar_data.ranges[:max_index]  # Left data
            right_ranges = lidar_data.ranges[::-1][:max_index]  # Right data
        elif self.lidar_type == 'G4':
            min_index = int(math.radians((360 - MAX_SCAN_ANGLE) / 2.0) / lidar_data.angle_increment)
            max_index = int(math.radians(180) / lidar_data.angle_increment)
            left_ranges = lidar_data.ranges[min_index:max_index][::-1]  # The left data
            right_ranges = lidar_data.ranges[::-1][min_index:max_index][::-1]  # The right data

        # Get data according to settings
        angle = self.scan_angle / 2
        angle_index = int(angle / lidar_data.angle_increment + 0.50)
        left_range, right_range = np.array(left_ranges[:angle_index]), np.array(right_ranges[:angle_index])

        left_nonzero = left_range.nonzero()
        right_nonzero = right_range.nonzero()
        left_nonan = np.isfinite(left_range[left_nonzero])
        right_nonan = np.isfinite(right_range[right_nonzero])
        # Take the nearest distance left and right
        min_dist_left_ = left_range[left_nonzero][left_nonan]
        min_dist_right_ = right_range[right_nonzero][right_nonan]
        if len(min_dist_left_) > 0 and len(min_dist_right_) > 0:
            min_dist_left = min_dist_left_.min()
            min_dist_right = min_dist_right_.min()
            if min_dist_left < self.stop_threshold or min_dist_right < self.stop_threshold:
                self.stop = True
            else:
                self.count += 1
                if self.count > 5:
                    self.count = 0
                    self.stop = False

The LiDAR callback function processes the received data and determines whether to stop moving.

image_callback:

    def image_callback(self, ros_image):
        cv_image = self.bridge.imgmsg_to_cv2(ros_image, "rgb8")
        rgb_image = np.array(cv_image, dtype=np.uint8)
        rgb_image = cv2.resize(rgb_image, (640,480))
        self.image_height, self.image_width = rgb_image.shape[:2]
        result_image = np.copy(rgb_image)  # The image used to display the result
        with self.lock:
            if self.use_color_picker:
                # Color picker and line recognition are exclusive. If there is color picker, start picking
                if self.color_picker is not None:  # Color picker exists
                    try:
                        target_color, result_image = self.color_picker(rgb_image, result_image)
                        if target_color is not None:
                            self.color_picker = None
                            self.follower = LineFollower(target_color, self)
                            self.get_logger().info("target color: {}".format(target_color))
                    except Exception as e:
                        self.get_logger().error(str(e))
                else:
                    twist = Twist()
                    twist.linear.x = 0.15
                    if self.follower is not None:
                        try:
                            result_image, deflection_angle = self.follower(rgb_image, result_image, self.threshold)
                            if deflection_angle is not None and self.is_running and not self.stop:
                                self.pid.update(deflection_angle)
                                if 'Acker' in self.machine_type:
                                    steering_angle = common.set_range(-self.pid.output, -math.radians(322/2000*180), math.radians(322/2000*180))
                                    if steering_angle != 0:
                                        R = 0.17706/math.tan(steering_angle)
                                        twist.angular.z = twist.linear.x/R
                                else:
                                    twist.angular.z = common.set_range(-self.pid.output, -1.0, 1.0)
                                self.mecanum_pub.publish(twist)
                            elif self.stop:
                                self.mecanum_pub.publish(Twist())
                            else:
                                self.pid.clear()
                        except Exception as e:
                            self.get_logger().error(str(e))
            else:
                twist = Twist()
                if self.color in common.range_rgb:
                    twist.linear.x = 0.15
                    self.follower = LineFollower([None, common.range_rgb[self.color]], self)
                    result_image, deflection_angle = self.follower(rgb_image, result_image, self.threshold, self.lab_data['lab'][self.camera_type][self.color], False)
                    if deflection_angle is not None and self.is_running and not self.stop:
                        self.pid.update(deflection_angle)
                        if 'Acker' in self.machine_type:
                            steering_angle = common.set_range(-self.pid.output, -math.radians(322/2000*180), math.radians(322/2000*180))
                            if steering_angle != 0:
                                R = 0.17706/math.tan(steering_angle)
                                twist.angular.z = twist.linear.x/R
                        else:
                            twist.angular.z = common.set_range(-self.pid.output, -1.0, 1.0)
                        self.mecanum_pub.publish(twist)
                    elif self.stop:
                        self.mecanum_pub.publish(Twist())
                    else:
                        self.pid.clear()
                else:
                    self.mecanum_pub.publish(twist)
        if self.debug:
            if self.image_queue.full():
                # If the queue is full, discard the oldest image
                self.image_queue.get()

The camera callback function uses the received data to invoke the color picker and controls the robot’s movement along the recognized line using PID.

6.5 AR Vision

6.5.1 Overview

First, topic messages published by the camera node are subscribed to obtain RGB images, which are then converted to grayscale and resized.

Next, a service call is used to set the 3D model to be displayed.

Finally, when the camera detects an AprilTag marker, the corresponding 3D model is overlaid at the marker position.

6.5.2 Operation Steps

Note

Commands must be entered with correct capitalization. The Tab key can be used to auto-complete keywords.

Power on the robot and connect it to the remote control software NoMachine. For instructions on connecting to the remote desktop, refer to 1.7.2 AP Mode Connection Steps.
Click the terminal icon on the desktop to open a terminal.
Run the following command to stop the auto-start service.

sudo systemctl stop start_app_node.service

Run the following command to start the AR vision feature.

ros2 launch example ar_detect.launch.py

Run the following command to set the AR model.

ros2 service call /ar_detect/set_model interfaces/srv/SetString "{data: 'bicycle'}"

Available models:

bicycle: bicycle
fox: fox
chair: chair
cow: cow
wolf: wolf
rectangle: simple cube frame

To stop the feature, press Ctrl + C in the terminal. If the program does not close successfully, try pressing Ctrl + C again.

6.5.3 Project Outcome

After the feature starts, place the tag card within the camera field of view. When the tag is detected, coordinate axes are drawn at the tag position, the four corners of the tag are marked with blue dots, and the tag ID is displayed.

6.5.4 Program Analysis

Launch File Analysis

The source code is located at:

/home/ubuntu/ros2_ws/src/example/example/ar_detect/ar_detect.launch.py

It defines the content to be launched, gets the paths of the peripherals and example packages, stand arts the depth_camera_launch file. Creates the ROS2 node ar_detect, defines the executable file, and finally returns the launch item list.

def launch_setup(context):
    compiled = os.environ['need_compile']
    debug = LaunchConfiguration('debug', default='true')
    debug_arg = DeclareLaunchArgument('debug', default_value=debug)
    if compiled == 'True':
        peripherals_package_path = get_package_share_directory('peripherals')
        example_package_path = get_package_share_directory('example')
    else:
        peripherals_package_path = '/home/ubuntu/ros2_ws/src/peripherals'
        example_package_path = '/home/ubuntu/ros2_ws/src/example'

    depth_camera_launch = IncludeLaunchDescription(
        PythonLaunchDescriptionSource(
            os.path.join(peripherals_package_path, 'launch/depth_camera.launch.py')),
    )


    ar_detect_node = Node(
        package='example',
        executable='ar_detect',
        output='screen',
        parameters=[{'debug': debug}],
    )

    return [
            debug_arg,
            depth_camera_launch,
            ar_detect_node,
            ]

This is the entry function of the ROS2 Launch file. It defines the content to be launched.

def generate_launch_description():
    return LaunchDescription([
        OpaqueFunction(function = launch_setup)
    ])

It creates LaunchService and passes the launch content to it for execution.

if __name__ == '__main__':
    # Create a LaunchDescription object
    ld = generate_launch_description()

    ls = LaunchService()
    ls.include_launch_description(ld)
    ls.run()

Python Source Code Analysis

The source code is located at:

/home/ubuntu/ros2_ws/src/example/example/ar_detect/ar_detect.py

It gets the default storage path of the model.

MODEL_PATH = os.path.join(os.path.abspath(os.path.join(os.path.split(os.path.realpath(__file__))[0])), 'ar_models')

The draw_rectangle function is used to draw a cube. The parameter img is the image used to draw the cube. The parameter imgpts is the corner point set of the cube. The function returns the image containing the cube.

def draw_rectangle(img, imgpts):
    '''
    Draw the cube
    :param img: The image to draw the cube
    :param imgpts: Angular point of the cube
    :return: The image to draw the cube
    '''
    imgpts = np.int32(imgpts).reshape(-1, 2)
    cv2.drawContours(img, [imgpts[:4]], -1, (0, 255, 0), -3)  # Draw contour points, filled.
    for i, j in zip(range(4), range(4, 8)):
        cv2.line(img, tuple(imgpts[i]), tuple(imgpts[j]), (255), 3)  # Draw points connected by lines.
    cv2.drawContours(img, [imgpts[4:]], -1, (0, 0, 255), 3)  # Draw contour points, unfilled.
    
    return img

OBJP is a set of world-coordinate points solved with NumPy. It contains the four corner points of the AprilTag marker, left-bottom, right-bottom, left-top, and right-top, plus the center point, and is used to calculate the camera extrinsic parameters.

AXIS defines the coordinates of the eight vertices of the cube. It is used to draw the cube frame when the marker is detected, with four points on the bottom face, four points on the top face, and a height of 2.

MODELS_SCALE is a dictionary of scaling ratios for each 3D model. It is used to resize the OBJP model to a suitable display size, such as 50 for the bicycle, 400 for the chair, and 0.4 for the cow.

OBJP = np.array([[-1, -1,  0],
                 [ 1, -1,  0],
                 [-1,  1,  0],
                 [ 1,  1,  0],
                 [ 0,  0,  0]], dtype=np.float32)

# Draw the coordinate of the cube
AXIS = np.float32([[-1, -1, 0], 
                   [-1,  1, 0], 
                   [ 1,  1, 0], 
                   [ 1, -1, 0],
                   [-1, -1, 2],
                   [-1,  1, 2],
                   [ 1,  1, 2],
                   [ 1, -1, 2]])

# Model scaling
MODELS_SCALE = {
                'bicycle': 50, 
                'fox': 4, 
                'chair': 400, 
                'cow': 0.4,
                'wolf': 0.6,
                }

def __init__(self, name) Method

It initializes the ROS2 node and allows undeclared parameters and automatically declared override parameters.

        super().__init__(name, allow_undeclared_parameters=True, automatically_declare_parameters_from_overrides=True)
        self.name = name

It initializes the camera intrinsic matrix, including focal length and principal point coordinates, and sets the distortion coefficient array used for image correction.

        self.camera_intrinsic = np.matrix([[619.063979, 0,          302.560920],
                                           [0,          613.745352, 237.714934],
                                           [0,          0,          1]])
        self.dist_coeffs = np.array([0.103085, -0.175586, -0.001190, -0.007046, 0.000000])

It initializes variables such as the model object, subscribers, and target model, creates CvBridge for ROS image and OpenCV format conversion, and gets the machine type and camera type from environment variables. Instantiates the AprilTag detector using the tag36h11 family, and creates a thread lock to ensure thread safety.

        self.obj = None
        self.image_sub = None
        self.target_model = None
        self.camera_info_sub = None
        self.bridge = CvBridge()
        self.machine = os.environ['MACHINE_TYPE']
        self.camera = os.environ['DEPTH_CAMERA_TYPE']
        self.tag_detector = apriltag("tag36h11")  # Instantiate apriltag
        self.lock = threading.RLock()  # Thread lock

Create an image publisher to publish the processed result image to the ~/image_result topic. Create the following services: ~/enter to enter AR detection mode, ~/exit to exit AR detection mode, ~/set_model to set the 3D model to display, and ~/init_finish to get the node status. Create a heartbeat mechanism with a 5-second interval for automatic exit on timeout. Read the value of the debug parameter.

        self.result_publisher = self.create_publisher(Image, '~/image_result', 1)  # Publish the final image
        self.create_service(Trigger, '~/enter', self.enter_srv_callback)  # Enter the feature service
        self.create_service(Trigger, '~/exit', self.exit_srv_callback)  # Exit the feature service
        Heart(self, self.name + '/heartbeat', 5, lambda _: self.exit_srv_callback(request=Trigger.Request(), response=Trigger.Response()))  # Heartbeat package
        self.create_service(SetString, '~/set_model', self.set_model_srv_callback)  # Set the model service
        self.debug = self.get_parameter('debug').value
        self.create_service(Trigger, '~/init_finish', self.get_node_state)

If debug mode is enabled, the enter service is automatically called to start AR detection.

        if self.debug:
            self.enter_srv_callback(Trigger.Request(), Trigger.Response())

def get_node_state(self, request, response)

This is the callback method of the ~/init_finish service. It returns success = True, indicating that the node has been initialized and can be used by external systems to confirm whether the ar_detect node is ready.

    def get_node_state(self, request, response):
        response.success = True
        return response

def enter_srv_callback(self, request, response)

It responds to the ~/enter service call, starts the AR vision feature, uses a thread lock to ensure thread safety, and resets the model object and target model to None.

    def enter_srv_callback(self, request, response):
        # Enter the service
        self.get_logger().info('\033[1;32m%s\033[0m' % "ar enter")
        # If there is a node when entering the service, cancel subscription and subscribe again.
         
        with self.lock:
            self.obj = None
            self.target_model = None
            if self.image_sub is None:
                self.image_sub = self.create_subscription(Image, '/depth_cam/rgb0/image_raw', self.image_callback, 1)  # Subscribe to the camera

            if self.camera_info_sub is None:
                self.camera_info_sub = self.create_subscription(CameraInfo, '/depth_cam/rgb0/camera_info', self.camera_info_callback, 1) # Subscribe to the camera information
        
        response.success = True
        response.message = "enter"
        return response

def exit_srv_callback(self, request, response)

It responds to the ~/exit service call, stops the AR vision feature, unsubscribes to save resources, and destroys the image subscription self.image_sub to stop receiving RGB images. Destroys the camera information subscription self.camera_info_sub to stop receiving camera intrinsics, sets the subscription objects to None, and uses exception handling to ensure safe unsubscription. Logs an error if an exception occurs, and returns a successful response with the message exit.

	def exit_srv_callback(self, request, response):
        # Exit the service
        self.get_logger().info('\033[1;32m%s\033[0m' % "ar exit")
        # Cancel the subscribtion when exiting the service to save the expenditure.
        try:
            if self.image_sub is not None:
                self.destroy_subscription(self.image_sub)
                self.image_sub = None
            if self.camera_info_sub is not None:
                self.destroy_subscription(self.camera_info_sub)
                self.camera_info_sub = None
        except Exception as e:
            self.get_logger().error(str(e))
        response.success = True
        response.message = "exit"
        return response

def set_model_srv_callback(self, request, response)

It responds to the ~/set_model service call and sets the 3D model to be displayed.

With the thread lock enabled to ensure thread safety, if the request data is an empty string, the target model is cleared. Otherwise, the target model name is set.

        with self.lock:
            self.get_logger().info('\033[1;32m%s\033[0m' % "set model {}".format(request.data))
            if request.data == "":
                self.target_model = None
            else:
                self.target_model = request.data

Model loading and preprocessing:

If the model is not rectangle, load the corresponding OBJ file from the ar_models directory and reverse the face order to ensure correct rendering. Traverse all faces of the model to extract the vertex coordinates of each face from the first three values and the color information from the last three values, except for the cow and wolf models. Scale the vertices according to the MODELS_SCALE dictionary. Apply position and rotation adjustments based on the model type. For bicycle, apply a translation and rotate 180 degrees around the Z-axis. For fox and chair, rotate -90 degrees around the Z-axis. All other models remain unchanged. Convert the color values to the 0 to 255 range. For the cow and wolf models, the color value is None and will be assigned manually later. Store all processed face data in self.obj.

                if self.target_model != 'rectangle':  # If the cube is not being drawn.
                    # Load the model
                    obj = obj_load(os.path.join(MODEL_PATH, self.target_model + '.obj'), swapyz=True)
                    obj.faces = obj.faces[::-1]
                    new_faces = []
                    # Analyze the model and get the point coordinates.
                    for face in obj.faces:
                        face_vertices = face[0]
                        points = []
                        colors = []
                        for vertex in face_vertices:
                            data = obj.vertices[vertex - 1]
                            points.append(data[:3])
                            if self.target_model != 'cow' and self.target_model != 'wolf':
                                colors.append(data[3:])
                        scale_matrix = np.array([[1, 0, 0], [0, 1, 0], [0, 0, 1]]) * MODELS_SCALE[self.target_model]  # Scale
                        points = np.dot(np.array(points), scale_matrix)
                        if self.target_model == 'bicycle':
                            points = np.array([[p[0] - 670, p[1] - 350, p[2]] for p in points])
                            points = R.from_euler('xyz', (0, 0, 180), degrees=True).apply(points)
                        elif self.target_model == 'fox':
                            points = np.array([[p[0], p[1], p[2]] for p in points])
                            points = R.from_euler('xyz', (0, 0, -90), degrees=True).apply(points)
                        elif self.target_model == 'chair':
                            points = np.array([[p[0], p[1], p[2]] for p in points])
                            points = R.from_euler('xyz', (0, 0, -90), degrees=True).apply(points)
                        else:
                            points = np.array([[p[0], p[1], p[2]] for p in points])
                        if len(colors) > 0:
                            color = tuple(255 * np.array(colors[0]))
                        else:
                            color = None
                        new_faces.append((points, color))
                    self.obj = new_faces

It returns a successful response with the message set_model.

        response.success = True
        response.message = "set_model"
        return response

def camera_info_callback(self, msg)

This is the callback method for the /depth_cam/rgb0/camera_info topic. It uses a thread lock to ensure thread safety, updates the camera intrinsic parameters, converts the intrinsic matrix msg.k into a 3×3 array, and stores it in self.camera_intrinsic. It converts the distortion coefficient array msg.d and stores it in self.dist_coeffs.

    def camera_info_callback(self, msg):
        # Camera internal parameter callback
        with self.lock:
            self.camera_intrinsic = np.array(msg.k).reshape(3, -1)
            self.dist_coeffs = np.array(msg.d)

def image_callback(self, ros_image)

This is the callback method for the /depth_cam/rgb0/image_raw topic.

Image format conversion:

        cv_image = self.bridge.imgmsg_to_cv2(ros_image, "rgb8")
        rgb_image = np.array(cv_image, dtype=np.uint8)
        result_image = np.copy(rgb_image)

Image processing: A thread lock is used to ensure thread safety. The image_proc function is called for AR vision processing, including AprilTag detection, pose estimation, and model overlay. Exception handling is used to log errors without affecting node operation.

        with self.lock:
            try:
                # Process image
                result_image = self.image_proc(rgb_image, result_image)
            except Exception as e:
                self.get_logger().info(str(e))

def image_proc(self, rgb_image, result_image)

This function converts the RGB image to grayscale, uses AprilTag for detection and marking, extracts the coordinates of the four corner points, left-bottom, right-bottom, right-top, and left-top, along with the center point, draws yellow dots on the image, and marks the positions of the four corners.

        if self.target_model is not None: 
            gray = cv2.cvtColor(rgb_image, cv2.COLOR_RGB2GRAY)  # Convert into gray image.
            detections = self.tag_detector.detect(gray)  # Aprilatg recognition
            if detections != ():
                for detection in detections:  # Traverse
                    # Acquire four angular points and center point.
                    tag_center = detection['center']
                    tag_corners = detection['lb-rb-rt-lt']
                    lb = tag_corners[0]
                    rb = tag_corners[1]
                    rt = tag_corners[2]
                    lt = tag_corners[3]
                    # Draw four angular points
                    cv2.circle(result_image, (int(lb[0]), int(lb[1])), 2, (0, 255, 255), -1)
                    cv2.circle(result_image, (int(lt[0]), int(lt[1])), 2, (0, 255, 255), -1)
                    cv2.circle(result_image, (int(rb[0]), int(rb[1])), 2, (0, 255, 255), -1)
                    cv2.circle(result_image, (int(rt[0]), int(rt[1])), 2, (0, 255, 255), -1)
                    # cv2.circle(result_image, (int(tag_center[0]), int(tag_center[1])), 3, (255, 0, 0), -1)

Pose estimation: cv2.solvePnP is used to solve the camera extrinsic parameters, namely the rotation vector rvecs and translation vector tvecs.

                     corners = np.array([lb, rb, lt, rt, tag_center]).reshape(5, -1)
                    # Use the world coordinate system k point coordinates (OBJP), the k point coordinates (corners) corresponding to the 2D image coordinate system, and the camera internal parameters camera_intrinsic and dist_coeffs to reverse the external parameters r, t of the picture.
                    ret, rvecs, tvecs = cv2.solvePnP(OBJP, corners, self.camera_intrinsic, self.dist_coeffs)

Model rendering: If the model is rectangle, the cube vertices are projected to the image plane, and draw_rectangle is called to draw the cube frame. For other models, the code iterates through all faces, projects the 3D points to the image plane, and fills the polygons with cv2.fillConvexPoly. The cow model is filled in cyan, the wolf model is filled in yellow, and other models use their built-in colors. Finally, the result image with the AR model overlay is returned.

                    if self.target_model == 'rectangle':  # If the cube needs to be displayed, process independently
                        # Backprojection converts world coordinate system points to image points
                        imgpts, jac = cv2.projectPoints(AXIS, rvecs, tvecs, self.camera_intrinsic, self.dist_coeffs)
                        result_image = draw_rectangle(result_image, imgpts)
                    else:
                        for points, color in self.obj:
                             dst, jac = cv2.projectPoints(points.reshape(-1, 1, 3)/100.0, rvecs, tvecs, self.camera_intrinsic, self.dist_coeffs)
                             imgpts = dst.astype(int)
                             # Manual coloring
                             if self.target_model == 'cow':
                                 cv2.fillConvexPoly(result_image, imgpts, (0, 255, 255))
                             elif self.target_model == 'wolf':
                                 cv2.fillConvexPoly(result_image, imgpts, (255, 255, 0))
                             else:
                                 cv2.fillConvexPoly(result_image, imgpts, color)
        return result_image