This project is part of Udacity's Self-Driving Car Nanodegree. The project resources and build instructions can be found here.
Computer vision based lane detection
The main goal of this project is the development of a software pipeline to detect the lane boundaries in the provided videos. The pipeline can be subdivided into the following steps:
- One-time determination of the camera calibration matrix and its distortion coefficients.
- Undistortion of each video frame using the calibration results.
- Creation of a binary feature mask to detect the lane lines.
- Transformation of the binary mask into bird's-eye perspective.
- Detection of the left and right lane lines and fitting of second-order polynomials.
- Determination of the average radius of curvature and the vehicle offset with respect to the lane center.
- Visualization of the lane boundaries and additional information on each video frame.
Camera calibration
The required code for the camera calibration can be found in camera_calibration.py.
There is also a nice OpenCV tutorial on this topic.
Chessboard sample images were taken from different angles with the embedded camera and can be found in the camera_cal folder. On most of them, a 9x6 corner pattern can be seen. To determine the camera calibration matrix, two sets of points must be compared:
- the object points of the chessboard corners in the 3D world (X, Y, Z), and
- the corresponding 2D chessboard corner image points (X, Y)
For simplicity, it is assumed that the 3D object points lie on the XY-plane and the Z-coordinate is zero.
# Creation of the array of evenly spaced 9x6 object points (x, y, z)
cols = 9 # x-axis
rows = 6 # y-axis
object_corners = np.zeros((rows * cols, 3)).astype(np.float32)
# Fill the array with the x and y coordinates. The z-coordinate stays 0.
object_corners[:, :2] = np.mgrid[0: cols, 0: rows].T.reshape(-1, 2)
# >>> object_corners
# array([[0., 0., 0.],
# [1., 0., 0.],
# ...
# [7., 5., 0.],
# [8., 5., 0.]])
The corresponding 2D image points are determined by using the cv2.findChessboardCorners function after the chessboard image has been converted to grayscale.
object_corner_list = []  # corner 3-d coordinates in the world space
image_corner_list = []   # corner 2-d coordinates in the image
for file in Path("camera_cal/").rglob("*.jpg"):
    # Preprocess images
    img = cv2.imread(str(file))
    gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
    ret, image_corners = cv2.findChessboardCorners(gray, (cols, rows), None)
    # Collect corner coordinates if found
    if ret is True:
        object_corner_list.append(object_corners)
        image_corner_list.append(image_corners)
The camera calibration can now be performed with cv2.calibrateCamera using the obtained image and object points. The same object points are used for the comparison with the detected image points on all chessboard images.
# Camera calibration
ret, mtx, dist, rvecs, tvecs = cv2.calibrateCamera(
    object_corner_list, image_corner_list, gray.shape[::-1], None, None)
# Store the camera matrix and the distortion coefficients for later use
pickle.dump({"mtx": mtx, "dist": dist}, open("calibration_results.p", "wb"))
From now on, images taken with this camera can be undistorted using cv2.undistort with the camera matrix mtx and the distortion coefficients dist.
# Undistort one of the calibration images
img = mpimg.imread("camera_cal/calibration1.jpg")
undist_img = cv2.undistort(img, mtx, dist, dst=None, newCameraMatrix=mtx)
mpimg.imsave(f"output_images/calibration1_undistorted.jpg", undist_img)
plot_undistort_example(img, undist_img, save="./output_images/camera_calibration_example")
Undistortion of current video frame
Each new video frame must be undistorted before it can be further processed. This can be done exactly as in the camera calibration example above.
# Undistort one of the test images
img = mpimg.imread("./test_images/test1.jpg")
undist_img = cv2.undistort(img, mtx, dist, dst=None, newCameraMatrix=mtx)
mpimg.imsave(f"output_images/test1_undistorted.jpg", undist_img)
plot_undistort_example(img, undist_img, save="./output_images/undistort_example")
Binary feature mask for lane detection
This is one of the major steps of the project. A binary feature mask needs to be created for the subsequent lane line detection in each undistorted video frame. Three features were chosen to accomplish this task:
- The white_feature, whose major purpose is to detect the white lane lines
- The yellow_feature, whose major purpose is to detect the yellow lane lines
- The sobel_feature, which provides additional edge information about both lane lines
The complete feature code, including all helper functions, can be found in features.py.
The white feature
The white feature uses the concept of contrast limited adaptive histogram equalization (CLAHE). After performing the histogram equalization for the RGB frame with the rgb2clahe helper function, the resulting image is thresholded to obtain the white feature mask.
def white_feature(rgb_image, thresh_min=210):
    """ Create a mask for white lane lines detection.
    A contrast limited adaptive histogram equalization is performed on the
    given frame and the resulting gray image is subsequently thresholded by
    thresh_min.
    Args:
        rgb_image: current undistorted rgb frame
        thresh_min: lower threshold for the gray clahe image
    Returns:
        a binary mask with 255 values where white lines are assumed.
    """
    clahe = rgb2clahe(rgb_image)
    mask = np.zeros_like(clahe)
    mask[(clahe > thresh_min)] = 255
    return mask
def rgb2clahe(rgb_img, clip_limit=2.0, tile_grid_size=(8, 8)):
    """
    Perform a contrast limited adaptive histogram equalization on given image.
    The clahe operation is performed on the grayscale version of the given rgb frame.
    Args:
        rgb_img: current undistorted rgb frame
        clip_limit: threshold for contrast limiting
        tile_grid_size: size of the grid for the histogram equalization.
            The image will be divided into equally sized rectangular tiles.
            tile_grid_size defines the number of tiles in row and column.
    Returns:
        a gray image as result of the application of clahe
    """
    gray = cv2.cvtColor(rgb_img, cv2.COLOR_RGB2GRAY)
    clahe = cv2.createCLAHE(clipLimit=clip_limit, tileGridSize=tile_grid_size)
    gray_clahe = clahe.apply(gray)
    return gray_clahe
The lower threshold thresh_min=210 was chosen as the default after some tuning iterations.
The yellow feature
The yellow feature uses color thresholding in the HSV color space to detect yellow color in the given rgb frame.
def yellow_feature(rgb_image, lbound=(17, 20, 140), ubound=(32, 255, 255)):
    """ Create a mask for yellow lane lines detection.
    The yellow lines are detected by converting the rgb_image into hsv space
    and thresholding the yellow color.
    Args:
        rgb_image: current undistorted rgb frame
        lbound: lower hsv bounds
        ubound: upper hsv bounds
    Returns:
        a binary mask with 255 values where yellow lines are assumed.
    """
    hsv = cv2.cvtColor(rgb_image, cv2.COLOR_RGB2HSV)
    ylw_mask = cv2.inRange(hsv, lbound, ubound)
    return ylw_mask
The thresholds of each channel were tuned to preserve only the yellow color.
The sobel feature
The sobel feature creates a binary mask by determining and thresholding the sobel magnitude on the saturation and lightness channels of the HLS color space. A morphological closing operation is then performed on the binary mask.
def sobel_feature(rgb_image, thresh_min=20, thresh_max=255, ksize=9, iterations=5):
    """ Create a mask by thresholding the sobel magnitude in the HLS color space.
    The sobel magnitude is determined on the saturation and lightness channels.
    A binary mask is created by thresholding the magnitudes in between
    thresh_min and thresh_max. The composition of both binary masks finally
    undergoes a morphological closing operation by the given iterations.
    Args:
        rgb_image: current undistorted rgb frame
        thresh_min: lower threshold for the sobel magnitude
        thresh_max: upper threshold for the sobel magnitude
        ksize: kernel size for cv2.Sobel
        iterations: iterations for morphological closing
    Returns:
        binary mask with 255 values at active pixels
    """
    s_img = rgb2hls(rgb_image, channel=2)
    l_img = rgb2hls(rgb_image, channel=1)
    s_mask = sobel(s_img, thresh_min, thresh_max, ksize)
    l_mask = sobel(l_img, thresh_min, thresh_max, ksize)
    sobel_mask = cv2.bitwise_or(s_mask, l_mask)
    mask = cv2.morphologyEx(sobel_mask, cv2.MORPH_CLOSE, kernel=np.ones((3, 3)),
                            iterations=iterations)
    return mask
def sobel(img_channel, thresh_min=20, thresh_max=255, ksize=9):
    """ Determine and threshold the sobel magnitudes on a single channel image.
    Args:
        img_channel: single channel image
        thresh_min: lower threshold for the sobel magnitude
        thresh_max: upper threshold for the sobel magnitude
        ksize: kernel size for cv2.Sobel
    Returns:
        a binary mask with 255 values where the threshold condition is fulfilled
    """
    # Sobel gradients
    sobel_x = cv2.Sobel(img_channel, cv2.CV_64F, 1, 0, ksize=ksize)
    sobel_y = cv2.Sobel(img_channel, cv2.CV_64F, 0, 1, ksize=ksize)
    # Magnitude
    sobel_mag = np.sqrt(sobel_x ** 2 + sobel_y ** 2)
    # Scale to 8-bit (0 - 255) then convert to type = np.uint8
    mag_scaled = np.uint8(255 * sobel_mag / np.max(sobel_mag))
    # Mask of 255's where the scaled gradient magnitude
    # is > thresh_min and < thresh_max
    mask = np.zeros_like(mag_scaled)
    mask[(mag_scaled > thresh_min) & (mag_scaled < thresh_max)] = 255
    return mask
def rgb2hls(rgb_img, channel=None):
    """ Convert an image from rgb to hls.
    Args:
        rgb_img: rgb image
        channel: the hls channel that shall be returned, defaults to all
    """
    hls = cv2.cvtColor(rgb_img, cv2.COLOR_RGB2HLS)
    return hls if channel is None else hls[:, :, channel]
The lightness and saturation channels were chosen because they work well together and are both “activated” on the white and yellow lane lines.
Feature composition
The final feature mask is created by combining all three subfeatures. It only keeps pixels active if at least two of the three subfeatures are active.
def feature_mask(rgb_image):
    """ Main lane line detection feature.
    The combination of white_feature, yellow_feature and sobel_feature.
    Args:
        rgb_image: current undistorted rgb frame
    Returns:
        A mask for the lane line detection
    """
    white_mask = white_feature(rgb_image).astype(float)
    yellow_mask = yellow_feature(rgb_image).astype(float)
    sobel_mask = sobel_feature(rgb_image).astype(float)
    # Equal weighted mask
    mask = (white_mask + yellow_mask + sobel_mask) / 255 >= 2
    return np.where(mask, 255, 0).astype(np.uint8)
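For illustration, the combined mask can be applied to one of the undistorted test images. The snippet below is only a usage sketch; the file names and the reuse of mtx and dist from the calibration step are assumptions based on the earlier examples.
# Illustrative usage sketch (file names assumed): create and save the
# combined feature mask for one of the test images
img = mpimg.imread("./test_images/test1.jpg")
undist_img = cv2.undistort(img, mtx, dist, dst=None, newCameraMatrix=mtx)
binary = feature_mask(undist_img)
mpimg.imsave("output_images/test1_feature_mask.jpg", binary, cmap="gray")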
Perspective transformation
To perform the perspective transformation, four points are required to map the source locations to the destination locations. In this approach, the points are defined as ratios of the image size. The source points additionally serve for the definition of the region of interest in front of the vehicle.
# Region of interest, used as perspective transform source
# in ratios of the image width (x) and height (y)
ROI_SRC = {"x1": 0.03, "y1": 0.98,  # lower left corner
           "x2": 1.00, "y2": 0.98,  # lower right corner
           "x3": 0.59, "y3": 0.65,  # upper right corner
           "x4": 0.42, "y4": 0.65}  # upper left corner
# Perspective transform destination
# in ratios of the image width (x) and height (y)
ROI_DST = {"x1": 0.0, "y1": 1.0,  # left lower corner
           "x2": 1.0, "y2": 1.0,  # right lower corner
           "x3": 1.0, "y3": 0.0,  # right upper corner
           "x4": 0.0, "y4": 0.0}  # left upper corner
The warp function in process.py performs the perspective transformation on the given undistorted image. It returns the transformed image and additionally the transformation matrix and its inverse for the back transformation, which is performed in a later step.
def warp(img, roi_src, roi_dst):
    """ Perform a perspective transform on the given undistorted image.
    Args:
        img: undistorted image of current frame, e.g. the binary feature mask
        roi_src: perspective transform source as ratios for image dimensions x and y
        roi_dst: perspective transform destination in the same format as src
    Returns:
        (warped_img, trsf_mtx, trsf_mtx_inv)
    """
    # Image dimensions
    x, y = img.shape[1], img.shape[0]
    src = np.float32([[(roi_src["x1"] * x, roi_src["y1"] * y),
                       (roi_src["x2"] * x, roi_src["y2"] * y),
                       (roi_src["x3"] * x, roi_src["y3"] * y),
                       (roi_src["x4"] * x, roi_src["y4"] * y)]])
    dst = np.float32([[(roi_dst["x1"] * x, roi_dst["y1"] * y),
                       (roi_dst["x2"] * x, roi_dst["y2"] * y),
                       (roi_dst["x3"] * x, roi_dst["y3"] * y),
                       (roi_dst["x4"] * x, roi_dst["y4"] * y)]])
    # Transformation matrix for the perspective transform and its inverse
    trsf_mtx = cv2.getPerspectiveTransform(src, dst)
    trsf_mtx_inv = cv2.getPerspectiveTransform(dst, src)
    # Warp the image
    warped_img = cv2.warpPerspective(img, trsf_mtx, (x, y))
    return warped_img, trsf_mtx, trsf_mtx_inv
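A minimal usage sketch, reusing the variable names from the examples above: the binary feature mask of an undistorted frame is warped into bird's-eye view, and the inverse matrix is kept for the later back transformation.
# Illustrative usage sketch: warp the feature mask of an undistorted frame
features_binary = feature_mask(undist_img)
warped_features, trsf_mtx, trsf_mtx_inv = warp(features_binary, ROI_SRC, ROI_DST)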
In the main pipeline, the binary feature mask is transformed into bird’s-eye view. The following image is only for illustration of the warping process.
Lane lines detection
The part of the main pipeline dealing with lane detection can be described by the following steps:
1. Undistortion of the current RGB frame
2. Determination of the binary feature mask on the undistorted image
3. Warping of the binary feature mask into bird's-eye view
4. Detection of the left and right lane lines and fitting to a second-order polynomial
   4.1. by a sliding window approach (if previously undetected or in lost status), or
   4.2. by using the information of recently detected lane lines
Both functions fit_lines_by_windows and fit_lines_by_recent_lines can be found in process.py.
def fit_lines_by_windows(warped_feature_binary,
                         left_line: Line, right_line: Line,
                         nwindows=9, margin=100, minpix=100):
    """ Detection of the lane lines by the sliding windows approach.
    The starting centers of the first windows of each lane line are determined
    by finding the peaks in a histogram computed for the bottom half of the
    image. In each window the active pixels are counted. If they surpass the
    minpix value the window will be recentered. After iterating over all windows
    up to the top of the image, a second order polynomial is fitted through
    the found pixels for each lane line. Both lane Line instances are updated
    and returned together with an information image of the found windows and
    polynomials.
    Args:
        warped_feature_binary: warped feature binary of the current frame
        left_line: the Line instance of the left lane line
        right_line: the Line instance of the right lane line
        nwindows: number of sliding windows per line distributed over the
            height of the image
        margin: left and right extent of each window in number of pixels
        minpix: number of pixels at which the window will be recentered
    Returns:
        (left_line, right_line, window_image)
    """
def fit_lines_by_recent_lines(warped_feature_binary,
                              left_line: Line, right_line: Line, margin=100):
    """
    Detection of the lane lines by using information of recently found lines.
    This approach uses the polynomials of previously found lines to determine
    the area in which to search for the current lane lines. The search area
    boundaries are determined by shifting the previous polynomials to the left
    and right by the number of pixels given by margin. New polynomials are then
    fitted through the active pixels found inside the left and right lane line
    search windows. Both lane Line instances are updated and returned together
    with an information image of the search areas and polynomials.
    Args:
        warped_feature_binary: warped feature binary of the current frame
        left_line: the Line instance of the left lane line
        right_line: the Line instance of the right lane line
        margin: left and right extent of each lane line polynomial window in
            number of pixels
    Returns:
        (left_line, right_line, window_image)
    """
The Line class is used to collect the required information for each lane line. It is updated with the active pixels determined by fit_lines_by_windows or fit_lines_by_recent_lines on each new incoming video frame. Among others, the following steps are performed:
- Sanity check of new polynomial fit (coefficient deviations too high?)
- Depending on the sanity check, setting to lost or detected status
- Update of the current radius of curvature
- Update of the current offset of the line from the vehicle location
The complete Line class code can be found in process.py.
class Line:
    """ Gather and update lane lines information. """

    def __init__(self, cache=20):
        # Memory size
        self.cache = cache
        # Current and recent line detection status
        self.detected = False
        self.recent_detections = deque(maxlen=cache * 2)
        # Current active lane line pixel coordinates
        self.x = None
        self.y = None
        # Current and recent coefficients of the fitted lane line polynomial
        self.coeff = None
        self.recent_coeff = deque(maxlen=cache)
        # Current radius of line curvature in meters
        self.radius_of_curvature = None
        # Current line offset from vehicle location
        self.offset_from_vehicle = None

    def update(self, x, y, warped_binary_frame):
        """ Update the lane line with the newly determined pixel coordinates.
        - Sanity check of new polynomial fit (coefficient deviations too high?)
        - Depending on the sanity check, setting to lost or detected status
        - Update the current radius of curvature
        - Update the current offset of the line from the vehicle location
        Args:
            x: x-coordinates of the active pixels in the warped frame
            y: y-coordinates of the active pixels in the warped frame
            warped_binary_frame: the current warped binary frame
        """
        if x.any() and y.any():
            # If pixels were found, check that the new polynomial coefficients
            # do not differ too much from the recent ones.
            # If the deviations are too high, set the line to lost status.
            if len(self.recent_detections) >= self.cache:
                # Only perform the check if the line has not been in lost
                # status for too long
                if np.sum(np.array(self.recent_detections)[-self.cache:]) != 0:
                    self.coeff = np.polyfit(y, x, deg=2)
                    deviation = np.abs(1 - self.coeff / self.avg_coeff)
                    if deviation[2] > 0.3 or deviation[0] > 40:
                        # If the deviation is too high, set to lost status
                        self.set_lost_status()
                        return
                else:
                    # Reset the memory if the line is lost for too long
                    self.recent_coeff = deque(maxlen=self.cache)
            # If still here, update the line with the current data
            self.detected = True
            self.recent_detections.append(self.detected)
            # Update the current lane line pixel coordinates
            self.x = x
            self.y = y
            # Update the current coeff by the new lane line pixels
            self.coeff = np.polyfit(y, x, deg=2)
            # Append current polynomial coeffs to the queue of recents
            self.recent_coeff.append(self.coeff)
            # Update the offset of the lane line from the vehicle location
            self.update_offset_from_vehicle(warped_binary_frame)
            # Update curvature radius at the vehicle location
            self.update_radius_of_curvature_m(warped_binary_frame)
        else:
            # If no pixels were found set lost status
            self.set_lost_status()

    def set_lost_status(self):
        """ Set the line to lost status.
        In lost status the recent coefficients will be used to keep estimating
        the lane line.
        """
        self.detected = False
        self.recent_detections.append(self.detected)
        self.x = None
        self.y = None
        self.coeff = self.recent_coeff[-1]
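The update method relies on two small helpers that are part of the Line class in process.py but are not shown above: the average of the cached coefficients and the evaluation of the resulting polynomial. Conceptually they look roughly as follows (a sketch, not necessarily the exact implementation):
    # Conceptual sketch of the helpers referenced above
    @property
    def avg_coeff(self):
        # Average of the recently cached polynomial coefficients
        return np.mean(np.array(self.recent_coeff), axis=0)

    def evaluate_average_polynomial(self, y):
        # Evaluate the averaged second-order polynomial at the given y
        a, b, c = self.avg_coeff
        return a * y ** 2 + b * y + c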
Radius of curvature and vehicle offset
Both values are computed by taking the average of the values of each individual lane line.
This is a step in the main processing function process in advanced_lane_finding.py.
# Compute the vehicle offset and the average curvature radius
avg_vehicle_offset = 0.5 * abs(left_line.offset_from_vehicle + right_line.offset_from_vehicle)
avg_curv_rad = 0.5 * (left_line.radius_of_curvature + right_line.radius_of_curvature)
The actual computation of the individual lane line values is done by the methods Line.update_offset_from_vehicle and Line.update_radius_of_curvature_m. It is assumed that the vehicle location is at the bottom center of the image. The vehicle's offset to the lane center is computed by taking the average horizontal distance of both lane line polynomials to the vehicle location. The formula for the radius of curvature can be found here.
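For reference, for a second-order polynomial x(y) = Ay² + By + C the radius of curvature at a point y is

$$R = \frac{\left(1 + (2Ay + B)^2\right)^{3/2}}{\left|2A\right|}$$

which is what update_radius_of_curvature_m evaluates at the vehicle location after converting the coefficients from pixel to meter units.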
class Line:
    # Conversion of image to real world dimensions
    ym_per_pix = 30 / 720   # meters per pixel in y dimension
    xm_per_pix = 3.7 / 700  # meters per pixel in x dimension

    def update_offset_from_vehicle(self, warped_frame_binary):
        """
        Update the line offset from the vehicle based on the last N=cache fits.
        Args:
            warped_frame_binary: the current warped binary frame
        """
        if self.detected:
            image_height, image_width = warped_frame_binary.shape
            # The vehicle location is estimated to be at the image bottom center
            x_vehicle = image_width // 2
            y_vehicle = image_height - 1
            # Evaluate the lane line polynomial at the vehicle y-location
            x_line_at_y_vehicle = self.evaluate_average_polynomial(y_vehicle)
            # Compute the offset and convert to meters
            offset_pixels = x_line_at_y_vehicle - x_vehicle
            offset_meters = offset_pixels * Line.xm_per_pix
        else:
            # If the line is in lost status keep the previous offset
            offset_meters = self.offset_from_vehicle
        # Update the line offset
        self.offset_from_vehicle = offset_meters

    def update_radius_of_curvature_m(self, warped_frame_binary):
        """ Update the curvature radius [m] based on the last N=cache fits.
        Args:
            warped_frame_binary: the current warped binary frame
        """
        # The vehicle location is estimated to be at the bottom of the image
        y_vehicle = warped_frame_binary.shape[0] - 1
        # Coefficients of the recent polynomials in pixels
        A, B, C = self.avg_coeff
        # Convert coefficients for use in meters
        A_m = A * self.xm_per_pix / self.ym_per_pix ** 2
        B_m = B * self.xm_per_pix / self.ym_per_pix
        # Compute the radius of curvature for the lane line at the vehicle
        # location (y converted from pixels to meters)
        y_vehicle_m = y_vehicle * self.ym_per_pix
        curv = (1 + (2 * A_m * y_vehicle_m + B_m) ** 2) ** (3 / 2) / abs(2 * A_m)
        # Update the radius of curvature
        self.radius_of_curvature = curv
Visualization
The complete processing pipeline is performed on each incoming video frame by passing it to the process function in advanced_lane_finding.py.
def process(frame):
    """ The main processing function.
    Each frame undergoes the complete pipeline.
    Args:
        frame: the current raw video frame
    Returns:
        the processed frame
    """
    global left_line, right_line
    # Undistort the current frame
    undistorted_frame = undistort(frame)
    # Create the binary feature mask for the undistorted frame
    features_binary = feature_mask(undistorted_frame)
    # Perspective transformation (warp) of the features binary
    warped_features, trsf_mtx, trsf_mtx_inv = warp(features_binary, ROI_SRC, ROI_DST)
    # Detect and refit the lane lines on the warped features binary
    if not left_line.detected or not right_line.detected:
        # If the lines were not detected before or are in lost status relocate
        # by the sliding windows approach
        left_line, right_line, window_img = fit_lines_by_windows(
            warped_features, left_line, right_line)
    else:
        # If the lines are still detected, the recent lines information will be
        # used for the current detection
        left_line, right_line, window_img = fit_lines_by_recent_lines(
            warped_features, left_line, right_line)
    # Draw the detected lane lines and lane area on the unwarped frame
    unwarped_lane = draw_lane_on_unwarped_frame(
        undistorted_frame, left_line, right_line, trsf_mtx_inv)
    # Compute the vehicle offset and the average curvature radius
    avg_vehicle_offset = 0.5 * abs(left_line.offset_from_vehicle
                                   + right_line.offset_from_vehicle)
    avg_curv_rad = 0.5 * (left_line.radius_of_curvature
                          + right_line.radius_of_curvature)
    # Draw intermediate results on top of the main frame for information
    processed_frame = post_process(undistorted_frame, features_binary,
                                   warped_features, window_img, unwarped_lane,
                                   avg_curv_rad, avg_vehicle_offset,
                                   left_line, right_line)
    return processed_frame
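A minimal sketch of how the pipeline might be applied to a whole video, assuming moviepy is used as the video backend (this is not necessarily the exact driver code of advanced_lane_finding.py):
# Sketch, assuming moviepy; process is applied to every frame of the clip
from moviepy.editor import VideoFileClip

left_line, right_line = Line(), Line()
clip = VideoFileClip("project_video.mp4")
processed_clip = clip.fl_image(process)
processed_clip.write_videofile("output_videos/project_video.mp4", audio=False)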
The processed project_video.mp4 and challenge_video.mp4 can be found in the output_videos folder.
Processed project video
Processed challenge video
Discussion
The approach presented above works well for the project_video.mp4 and the challenge_video.mp4. Of course, this is also due to the fact that it was developed on the basis of test images taken from these two videos. There is certainly room for improvement, especially in cases where the lane lines are lost over several frames.
Shorter periods can be bridged in this approach by giving the Line instances a memory of 20 frames. By averaging the lane line polynomials, extreme jumps in individual frames are reduced. Additionally, the limitation of possible deviations of the polynomial coefficients A and C serves this purpose. If, due to a lack of information in individual frames, these parameters are about to change strongly in comparison to the previous history, the line status is set to lost and navigation is continued based on the recent history. However, if the lost status persists for too many frames, the history is deleted and an attempt is made to get back on track with a complete reset.
This method certainly has a lot of optimization potential:
- How big should the memory cache be?
- How many lost-status frames before a reset?
- How can the reset be made smoother?
- Which parameters should be used to measure the lane line quality?
Other critical situations certainly include:
- Sharp curves with lanes running out of the visible image
- Shadows and glare
- Dirty roads
- Concealed lane lines
- Overtaking cars crossing into the ego vehicle's lane