
Real-time edge detection using FPGA

Introduction

Our project implements a real-time edge detection system that captures image frames from an OV7670 camera and streams them to a VGA monitor after applying a grayscale filter and the Sobel operator. Our design is built on a Cyclone IV FPGA board, which lets us exploit low-level hardware features and parallel computation, both of which are important for meeting the requirements of a real-time system.

We used Quartus Prime Lite Edition as the development environment and Verilog HDL as the programming language. Our board is the ZEOWAA FPGA development board, which is based on the Cyclone IV (EP4CE6E22C8N). In addition, we used the built-in VGA interface to drive the VGA monitor, and the GPIO (general-purpose input/output) pins to connect the external hardware to the board.

ZEOWAA FPGA development board

Architecture

Our design is divided into 3 main parts:

  1. Reading the data pixels from the camera.
  2. Implementing our edge detection algorithm (grayscale converter and Sobel operator).
  3. Displaying the final image by interfacing with a VGA monitor.

There is an intermediate memory storage between reading/writing the data and operating on it. For this purpose, we implemented two buffers that act as temporary storage for pixels before they are used.

The implemented architecture

Note that after we took a pixel from the camera, we did not store it directly in the intermediate memory buffer. Instead, we converted it to grayscale and then stored it in the buffer. This is because storing 8-bit grayscale pixels takes less memory than storing the colored pixels, which are 16 bits. We also have another buffer which stores the data after applying the Sobel operator, making it ready to be displayed on the monitor.

Here are the details about the implementation of our architecture:

Camera

We used the OV7670 camera, which is one of the cheapest camera modules that we found. It only requires the SCCB interface, which is similar to the I2C interface, to set the configuration of the camera in terms of color format (RGB565, RGB555, YUV, YCbCr 4:2:2), resolution (VGA, QVGA, QQVGA, CIF, QCIF) and many other settings. Also, this camera can work on 3.3V and does not need difficult communication protocols like I2C or SPI to extract the image data.

OV7670 camera module

The video consists of frames which are changed at a specific rate. One frame is an image consisting of rows and columns of pixels, where each pixel is represented by color values. In this project, we used the default configuration of the camera: the frame size is the VGA resolution 640 x 480 (0.3 megapixels), the pixel color format is RGB565 (5 bits for red, 6 bits for green, 5 bits for blue), and the frame rate is 30 fps.

Below are the connections of the camera to the FPGA through the GPIO pins available on the development board:

Pin on the camera | Pin on the FPGA | Description
3.3V | 3.3V | Power supply (+)
GND | GND | Ground (-)
SDIOC | GND | SCCB clock
SDIOD | GND | SCCB data
VSYNC | P31 | Vertical synchronization
HREF | P55 | Horizontal synchronization
PCLK | P23 | Pixel clock
XCLK | P54 | Input system clock (25 MHz)
D7 | P46 | 8th bit of data
D6 | P44 | 7th bit of data
D5 | P43 | 6th bit of data
D4 | P42 | 5th bit of data
D3 | P39 | 4th bit of data
D2 | P38 | 3rd bit of data
D1 | P34 | 2nd bit of data
D0 | P33 | 1st bit of data
RESET (active low) | 3.3V | Reset pin
PWDN | GND | Power-down pin

Note that we did not use the SCCB interface for configuration, so we tied its wires (SDIOC and SDIOD) to ground to prevent any floating signals that could affect the data.

To provide the 25 MHz clock for the camera, we used a phase-locked loop (PLL), which is a closed-loop frequency-control system, to derive the needed clock from the 50 MHz clock provided by the board. To implement the PLL, we used the IP catalog tool inside the Quartus software.
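For illustration, instantiating the generated clock module looks roughly like this (a sketch; the module and port names are assumptions that depend on how the IP core is configured):

// Sketch of instantiating a PLL generated by the Quartus IP catalog.
// Module and port names (pll, inclk0, c0) are illustrative assumptions.
pll pll_inst (
    .inclk0 (clk_50),   // 50 MHz clock from the board
    .c0     (clk_25)    // 25 MHz clock driving the camera's XCLK
);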

This camera uses only 8 data lines (D0-D7) to transfer the bits which represent the pixel's color values: the camera divides the 16-bit RGB pixel value into two 8-bit parts and sends each one separately. It uses the vertical synchronization (VSYNC) signal to control the sending of a whole frame and the horizontal synchronization (HREF) signal to control the sending of each row of the frame.
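As an illustration of this handshake, a capture module can latch the two bytes on successive PCLK edges while HREF is high (a minimal sketch with illustrative signal names, not the exact module from our repository):

// Minimal sketch: assembling one RGB565 pixel from two bytes on D[7:0].
// The high byte arrives first while HREF is high; VSYNC marks a new frame.
module ov7670_capture (
    input  wire        pclk,
    input  wire        vsync,
    input  wire        href,
    input  wire [7:0]  d,
    output reg  [15:0] pixel,       // assembled RGB565 value
    output reg         pixel_valid  // pulses for one PCLK per complete pixel
);
    reg second_byte = 1'b0;

    always @(posedge pclk) begin
        pixel_valid <= 1'b0;
        if (vsync)                    // frame boundary: resynchronize byte phase
            second_byte <= 1'b0;
        else if (href) begin
            if (!second_byte)
                pixel[15:8] <= d;     // first byte: R[4:0] and G[5:3]
            else begin
                pixel[7:0]  <= d;     // second byte: G[2:0] and B[4:0]
                pixel_valid <= 1'b1;
            end
            second_byte <= ~second_byte;
        end
    end
endmodule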

The figures below, taken from the datasheet of the OV7670 camera module, illustrate the vertical and horizontal synchronization signals.

VGA Frame Timing

Horizontal Timing

RGB565 Output Timing Diagram

Grayscale converter

Several linear and non-linear techniques are used for converting a color image to grayscale. To produce a grayscale image from its original colored image, many factors should be taken into consideration, because the image may lose contrast, sharpness, shadow, and structure. Moreover, the image should preserve the relative luminance of the color space. Accordingly, to achieve our objective, we used the colorimetric (perceptual luminance-preserving) conversion to grayscale, which in its standard form is:

Y = 0.2126·R + 0.7152·G + 0.0722·B

To enhance the performance in terms of computations, it is faster to use shift operators instead of multiplications. Hence, the equation above can be reduced to a sum of bit shifts by approximating each weight with powers of two.

As a result, after capturing an RGB565 pixel value from the camera, it can be immediately converted into an 8-bit grayscale pixel value by applying the conversion formula. The grayscale image is easier to store in memory, and the conversion is fast enough for our real-time system, as it costs only a few shifts and additions per pixel, and the FPGA can make it even faster by accessing the memory in parallel. After that, the stored image is ready for the edge detection algorithm.
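A minimal Verilog sketch of such a converter, assuming the weights are approximated with powers of two as Y ≈ R/4 + G/2 + B/4 (the exact shift decomposition here is an assumption for illustration):

// Sketch of an RGB565-to-grayscale converter using only shifts and adds.
// Assumption: the luminance weights are approximated as 0.25, 0.5, 0.25.
module rgb565_to_gray (
    input  wire [15:0] rgb565,  // {R[4:0], G[5:0], B[4:0]}
    output wire [7:0]  gray
);
    // Expand each channel to 8 bits by padding the low bits with zeros.
    wire [7:0] r = {rgb565[15:11], 3'b000};
    wire [7:0] g = {rgb565[10:5],  2'b00};
    wire [7:0] b = {rgb565[4:0],   3'b000};

    // Weighted sum without multipliers: Y ~= R/4 + G/2 + B/4.
    assign gray = (r >> 2) + (g >> 1) + (b >> 2);
endmodule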

Intermediate memory (The buffer)

We have two buffers: the first is used to store the pixels after converting them to grayscale, and its size is 8 bits x 150 x 150; the second is used to store the pixels after applying the Sobel operator and thresholding the output value, and its size is 1 bit x 150 x 150. Unfortunately, 150 x 150 buffers do not store the whole image from the camera, but only part of it.

We have chosen 150 x 150 as our buffer size because of the memory limitation of the Cyclone IV, which has only 276.48 Kbit of embedded memory, while our two buffers take 202.5 Kbit (150 x 150 x 9 bits), which is equivalent to 73.24% of the Cyclone IV memory; the rest of the memory is used for storing the algorithm and the architecture. Furthermore, we tried 170 x 170 as a buffer size, but it takes 94.07% of the memory, which does not leave enough space for implementing the algorithm.

Our buffers are true dual-port RAM, which can read and write in different clock cycles simultaneously. Here, we created our own implementation instead of using the IP catalog tool inside the Quartus software to have more flexibility in the implementation. Also, we integrated both buffers into a single module instead of having separate modules.
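A simplified sketch of such a buffer is shown below (sizes match the 8-bit grayscale buffer described above; names are illustrative, and Quartus infers block RAM from this pattern):

// Simplified sketch of a true dual-port RAM buffer (8 bits x 150 x 150).
// Each port can read or write independently on its own clock.
module frame_buffer #(
    parameter DATA_WIDTH = 8,
    parameter DEPTH      = 22500,  // 150 x 150 pixels
    parameter ADDR_WIDTH = 15      // 2^15 = 32768 >= 22500
)(
    input  wire                  clk_a, clk_b,
    input  wire                  we_a, we_b,
    input  wire [ADDR_WIDTH-1:0] addr_a, addr_b,
    input  wire [DATA_WIDTH-1:0] din_a, din_b,
    output reg  [DATA_WIDTH-1:0] dout_a, dout_b
);
    reg [DATA_WIDTH-1:0] mem [0:DEPTH-1];

    always @(posedge clk_a) begin
        if (we_a) mem[addr_a] <= din_a;
        dout_a <= mem[addr_a];
    end

    always @(posedge clk_b) begin
        if (we_b) mem[addr_b] <= din_b;
        dout_b <= mem[addr_b];
    end
endmodule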

Sobel operator

We used a first-derivative edge detection operator, which is a matrix area gradient operator that determines the change of luminance between different pixels. To be more precise, we used the Sobel gradient operator, as it is a straightforward and efficient method in terms of memory usage and time complexity; it uses a 3x3 kernel centered on a chosen pixel to represent the strength of the edge. The Sobel operator is the magnitude of the gradient, computed by:

G = √(Gx² + Gy²)

Where Gx and Gy can be represented using convolution masks:

Gx = [ -1  0  +1 ]        Gy = [ +1  +2  +1 ]
     [ -2  0  +2 ]             [  0   0   0 ]
     [ -1  0  +1 ]             [ -1  -2  -1 ]

Also, Gx and Gy can be calculated as follows; note that the pixels that are closer to the center of the mask are given more weight:

Gx = (p3 + 2·p6 + p9) − (p1 + 2·p4 + p7)
Gy = (p1 + 2·p2 + p3) − (p7 + 2·p8 + p9)

where pi is the corresponding pixel in the following array, and the value of each pi is an 8-bit grayscale value:

[ p1  p2  p3 ]
[ p4  p5  p6 ]
[ p7  p8  p9 ]

It is a common practice to approximate the gradient magnitude of the Sobel operator by absolute values:

G ≈ |Gx| + |Gy|

This approximation is easier to implement and faster to calculate, which again serves our goals in terms of time and memory.

Here is the block diagram of the Sobel operator, which takes nine 8-bit pixels as input and produces an 8-bit pixel value:

Sobel core

And here is the detailed block diagram of the Sobel operator implementation.

Detailed Sobel core
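For reference, here is a compact combinational sketch of such a core using the |Gx| + |Gy| approximation above (our actual implementation may pipeline these steps differently):

// Combinational sketch of the Sobel core: nine 8-bit grayscale pixels in,
// one 8-bit gradient magnitude out, using the |Gx| + |Gy| approximation.
module sobel_core (
    input  wire [7:0] p1, p2, p3,  // top row of the 3x3 window
    input  wire [7:0] p4, p5, p6,  // middle row (p5 is the center pixel)
    input  wire [7:0] p7, p8, p9,  // bottom row
    output wire [7:0] out
);
    // Positive and negative halves of Gx and Gy (11 bits hold up to 1020).
    wire [10:0] gx_p = p3 + (p6 << 1) + p9;
    wire [10:0] gx_n = p1 + (p4 << 1) + p7;
    wire [10:0] gy_p = p1 + (p2 << 1) + p3;
    wire [10:0] gy_n = p7 + (p8 << 1) + p9;

    // Absolute differences avoid signed arithmetic altogether.
    wire [10:0] abs_gx = (gx_p > gx_n) ? (gx_p - gx_n) : (gx_n - gx_p);
    wire [10:0] abs_gy = (gy_p > gy_n) ? (gy_p - gy_n) : (gy_n - gy_p);

    wire [11:0] sum = abs_gx + abs_gy;

    // Saturate the result to the 8-bit output range.
    assign out = (sum > 12'd255) ? 8'd255 : sum[7:0];
endmodule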

VGA monitor

Our development board has a built-in VGA interface which is capable of displaying only 8 colors on the VGA monitor, as it has only 3 bits to control the colors: one bit for red, one for green and one for blue. This made our debugging harder, as it prevented us from displaying the image from the camera directly on the monitor. So, we used a threshold to convert each pixel into a 1-bit value so that the image can be displayed.
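As a sketch, with a threshold of 100 (the same value used in our Python test below; signal names are illustrative):

// Thresholding the 8-bit Sobel output into a single displayable bit.
// Values below the threshold become 1 (background) and others 0 (edge),
// matching the inverted display used in our Python test.
assign pixel_1bit = (sobel_out < 8'd100) ? 1'b1 : 1'b0;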

The VGA interface works like the camera, as it operates pixel by pixel from the upper-left corner to the lower-right corner. Using the vertical and horizontal synchronization signals, we can synchronize the signals that control the flow of pixels.

The vertical synchronization signal is used to represent the index of the row, while the horizontal synchronization signal is used to represent the index of the column. Also, both signals use a front porch, a sync pulse and a back porch as synchronization intervals to separate the old row from the new row in the horizontal signal, and the old frame from the new frame in the vertical signal.

VGA Signal Timing diagram

We used the standard VGA signal interface (640 x 480 @ 60 Hz). All the standard specifications of the signal are described here.
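A minimal sketch of a sync generator for this mode, using the standard 800 x 525 total timing and a 25 MHz pixel clock (a close approximation of the nominal 25.175 MHz; module and signal names are illustrative):

// Minimal sketch of a 640x480@60Hz VGA sync generator.
// Horizontal: 640 visible + 16 front porch + 96 sync + 48 back porch = 800.
// Vertical:   480 visible + 10 front porch +  2 sync + 33 back porch = 525.
module vga_sync (
    input  wire      clk25,   // 25 MHz pixel clock
    output wire      hsync,
    output wire      vsync,
    output wire      active,  // high inside the visible 640x480 area
    output reg [9:0] x,       // current column
    output reg [9:0] y        // current row
);
    initial begin
        x = 10'd0;
        y = 10'd0;
    end

    always @(posedge clk25) begin
        if (x == 10'd799) begin
            x <= 10'd0;
            y <= (y == 10'd524) ? 10'd0 : y + 10'd1;
        end else begin
            x <= x + 10'd1;
        end
    end

    // Sync pulses are active-low for this VGA mode.
    assign hsync  = ~((x >= 10'd656) && (x < 10'd752));
    assign vsync  = ~((y >= 10'd490) && (y < 10'd492));
    assign active = (x < 10'd640) && (y < 10'd480);
endmodule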

Testing

We first had to test each part separately before putting everything together and testing the real-time system. At first, we checked the values and signals that come from the camera by displaying certain pixel values. Moreover, we tested our buffers and VGA driver by displaying several static images on the VGA monitor after applying the Sobel operator and thresholding. Then, with the help of OpenCV and the Python programming language, we applied the Sobel filter to several images to compare the results with our algorithm and check the correctness of our logic. Furthermore, changing the value of the threshold affects the accuracy of the resulting image.

The Python code we used:

# This code is made to test the accuracy of our algorithm on FPGA
import cv2  # import the OpenCV library

f = open("sample.txt", 'w')  # open a file for the static-image initialization lines
img = cv2.imread('us.jpg')   # read the 150x150 image which has our faces
gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)  # convert to grayscale
sobelx = cv2.Sobel(gray, cv2.CV_64F, 1, 0, ksize=3)  # x-axis Sobel operator
sobely = cv2.Sobel(gray, cv2.CV_64F, 0, 1, ksize=3)  # y-axis Sobel operator
abs_grad_x = cv2.convertScaleAbs(sobelx)
abs_grad_y = cv2.convertScaleAbs(sobely)
grad = cv2.add(abs_grad_x, abs_grad_y)  # saturating |Gx| + |Gy|

for i in range(0, 150):
    for x in range(0, 150):
        # read the pixels of the grayscaled image and store them into the file
        # with a specific format to initialize the buffer in the FPGA code
        f.write("data_a[{:d}]<=8'd{:d};\n".format(i * 150 + x, gray[i][x]))
        # apply the threshold to be exactly like the code on the FPGA
        if grad[i][x] < 100:
            grad[i][x] = 255
        else:
            grad[i][x] = 0
f.close()

cv2.imshow("rgb", img)     # show the original image
cv2.imshow("gray", gray)   # show the grayscale image
cv2.imshow("sobel", grad)  # show the resulting image
cv2.waitKey(0)             # keep the windows open to inspect the results

Results

As a result of our implementation, we got a real-time edge detection system that produces a 150x150 image after applying the grayscale filter and the Sobel operator. The implemented system provides 30 fps; the camera runs on a 25 MHz clock, and the system, in general, meets real-time deadlines without noticeable lag. Moreover, the threshold value affects the amount of detail and the noise in the final image.

Here is a comparison between the Sobel operator on the FPGA and the OpenCV Sobel operator:

Comparison

Below is an illustrative video of the results:

Video of the project

Here is the link to the repository on GitHub, which has all the source code.

Future improvements

Since we are using the Cyclone IV FPGA, we are limited by its memory capacity and its number of logic gates. Hence, as a future improvement, we can use an external memory source, or implement our work on another board, so that we can display all the pixels of the image received from the camera.

Furthermore, although the Sobel operator is fast and simple to implement, it is noticeably sensitive to noise. To eliminate the produced noise, we could use a noise filter such as the non-linear median filter, which would work well with our system if we had enough memory to implement a third buffer. This would produce a smoother image with the sharp noise removed.

Additionally, we used the built-in VGA interface of the FPGA board, which can only produce a 3-bit image; thus, we couldn't display the grayscale image, as it needs 8 bits to be displayed. As a result, implementing another interface or using a more powerful board would enhance the flexibility of displaying the image.

Conclusion

We were able to use our knowledge and understanding of crucial concepts in embedded systems, such as state machines, parallel computation, and hardware-software interfacing, to create an efficient edge detection application that meets our objectives.

Acknowledgment

This project was built by a team of two students, Hussein Youness and Hany Hamed, first-year bachelor students in Computer Science at Innopolis University, Russia.

This project is part of the Computer Architecture course (Fall 2018) at Innopolis University.
