ECE445: Senior Design Laboratory
Haining, Zhejiang
SPRING 2025
Member of a Four-Person Team
- Goal: Grasp any object with robotic arms by following language instructions.
- Automatic Speech Recognition: Transcribe spoken input into a string and identify the name of the target object. We use wav2vec 2.0 as the ASR model, followed by an NLP model that extracts the object label.
- Computer Vision: Locate the target object in the RGB image from the depth camera as a detection box in pixel coordinates. We use YOLOE-V8L.
- 6D-Pose Generator: Use both the RGB image and the depth image from the depth camera to determine the 6D pose of the object in the detection box: its position [x, y, z] and three Euler angles. We use Open3D.
- Path Translator: Use the 6D pose of the target object to choose the optimal grasping position, and map that position to a set of target angles for the motors.
- PCB Display: Display the name of the target object on a 0.96-inch OLED screen.
- Code and report are provided here.
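As an illustration of the geometry behind the 6D-pose step, here is a minimal sketch of how a detection box and a depth reading can be turned into a camera-frame position with the pinhole camera model. It is not the project's code and does not use Open3D; the intrinsics (fx, fy, cx, cy), the box, and the depth value are hypothetical, and orientation estimation is omitted.

```python
def box_center(box):
    """Return the center pixel (u, v) of a box given as (x_min, y_min, x_max, y_max)."""
    x_min, y_min, x_max, y_max = box
    return ((x_min + x_max) / 2.0, (y_min + y_max) / 2.0)


def deproject_pixel(u, v, depth_m, fx, fy, cx, cy):
    """Back-project pixel (u, v) with depth in meters to camera-frame [x, y, z]."""
    x = (u - cx) * depth_m / fx
    y = (v - cy) * depth_m / fy
    return [x, y, depth_m]


# Hypothetical camera intrinsics and detection result, for illustration only.
fx, fy = 600.0, 600.0          # focal lengths in pixels
cx, cy = 320.0, 240.0          # principal point in pixels
box = (300, 200, 400, 340)     # detection box in pixel coordinates
depth_at_center = 0.5          # depth reading at the box center, in meters

u, v = box_center(box)
point = deproject_pixel(u, v, depth_at_center, fx, fy, cx, cy)
print(point)
```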
ECE408: CUDA Optimization for LeNet
Urbana, IL
Fall 2024
Individual Project
- Streams: Overlap data transfer with kernel execution. I divide large vectors into segments so that a kernel runs on one segment while another segment is copied between host and device memory.
- Kernel Fusion: I first implement convolution as matrix multiplication with three kernels: an unrolling kernel, a shared-memory matrix multiplication kernel, and a permute kernel. I then fuse the three kernels into a single kernel.
- High-Level Libraries: I use Tensor Cores via warp matrix functions and the CUDA Basic Linear Algebra Subprograms (cuBLAS) library.
- Other Optimizations: I also apply further optimizations, such as constant memory for the weight matrix, the "__restrict__" keyword, and loop unrolling.
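The convolution-as-matrix-multiplication idea from the Kernel Fusion item can be sketched on the CPU in plain Python. This is an illustrative sketch, not the CUDA implementation: it assumes a single input image, stride 1, and no padding; all function names are hypothetical; and the final reshape stands in for the permute step.

```python
def unroll(x, K):
    """Unroll x[C][H][W] into a (C*K*K) x (H_out*W_out) matrix of input patches."""
    C, H, W = len(x), len(x[0]), len(x[0][0])
    H_out, W_out = H - K + 1, W - K + 1
    rows = []
    for c in range(C):
        for p in range(K):
            for q in range(K):
                rows.append([x[c][h + p][w + q]
                             for h in range(H_out) for w in range(W_out)])
    return rows, H_out, W_out


def matmul(a, b):
    """Plain matrix product of two lists of lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def conv_as_matmul(x, weights, K):
    """weights[M][C][K][K] applied to x[C][H][W]; returns y[M][H_out][W_out]."""
    x_unrolled, H_out, W_out = unroll(x, K)
    # Flatten each filter in the same (c, p, q) order used by unroll().
    w_flat = [[f[c][p][q] for c in range(len(f))
               for p in range(K) for q in range(K)] for f in weights]
    y_flat = matmul(w_flat, x_unrolled)
    # Reshape each output row back into an H_out x W_out feature map.
    return [[row[h * W_out:(h + 1) * W_out] for h in range(H_out)]
            for row in y_flat]


def conv_direct(x, weights, K):
    """Reference: direct sliding-window convolution (no filter flip)."""
    C, H, W = len(x), len(x[0]), len(x[0][0])
    H_out, W_out = H - K + 1, W - K + 1
    return [[[sum(x[c][h + p][w + q] * f[c][p][q]
                  for c in range(C) for p in range(K) for q in range(K))
              for w in range(W_out)] for h in range(H_out)] for f in weights]


x = [[[1, 2, 3], [4, 5, 6], [7, 8, 9]]]   # one channel, 3x3 input
weights = [[[[1, 0], [0, 1]]]]            # one 2x2 filter
y = conv_as_matmul(x, weights, 2)
print(y)
```

A fused kernel would compute the unrolled elements on the fly inside the multiplication instead of materializing the unrolled matrix, which avoids the extra memory traffic between the three steps.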
ECE391: Computer Systems Engineering Implementation
Urbana, IL
Fall 2023
Member of a Four-Person Team
- Built a Linux-like operating system kernel in C with core features including paged virtual memory, a fully functional IDT and GDT, and i8259-based interrupt handling.
- Built a file system and device drivers for the Real Time Clock, keyboard, Programmable Interval Timer, and ATA disk.
- Used x86 assembly to establish the system call linkage between user-level programs and the kernel, passing all test cases provided by the course. Also implemented single-CPU task scheduling and switching among multiple terminals.
- Earned full points across all five project checkpoints.
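To illustrate the paged virtual memory mentioned above, here is a minimal sketch of how a 32-bit x86 linear address splits into a page directory index, a page table index, and a page offset. It assumes standard two-level paging with 4 KiB pages; the function name and sample addresses are hypothetical and are not taken from the project.

```python
def split_linear_address(addr):
    """Split a 32-bit linear address for two-level paging with 4 KiB pages."""
    pd_index = (addr >> 22) & 0x3FF   # top 10 bits: page directory entry
    pt_index = (addr >> 12) & 0x3FF   # middle 10 bits: page table entry
    offset = addr & 0xFFF             # low 12 bits: byte offset within the page
    return pd_index, pt_index, offset


# Sample addresses chosen for illustration only.
for addr in (0x000B8000, 0x08048000):
    print(hex(addr), split_linear_address(addr))
```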