Thursday, August 10, 2023

Quick Note: Training with Low-rank Matrices

When training a large matrix M of size WxH is too expensive, M can instead be factored into the product of 2 smaller matrices. For example, let matrix A be of size (Wx3) and matrix B be of size (3xH), so that A * B = M gives back a matrix of WxH dimensions. Since W * 3 + 3 * H < W * H once W and H are reasonably large, far fewer parameters need to be trained. The trade-off is that M is then limited to rank 3 at most.
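To make the parameter savings concrete, here is a minimal sketch in NumPy. The sizes W = 512 and H = 256, the helper name low_rank_param_counts, and the random initialization are all assumptions chosen for illustration; the note itself only fixes the inner dimension at 3. No training happens here, it only builds the factors and counts parameters.

```python
import numpy as np


def low_rank_param_counts(w, h, rank):
    """Return (full, factored) parameter counts for a w x h matrix.

    Hypothetical helper for illustration only.
    """
    full = w * h                    # parameters in M itself
    factored = w * rank + rank * h  # parameters in A (w x rank) plus B (rank x h)
    return full, factored


# Illustrative sizes (assumed, not taken from the note); rank 3 matches the example.
W, H, R = 512, 256, 3

rng = np.random.default_rng(0)
A = rng.normal(size=(W, R))  # trainable factor of shape (W, 3)
B = rng.normal(size=(R, H))  # trainable factor of shape (3, H)

M = A @ B  # reconstructed matrix of shape (W, H), rank at most 3

full, factored = low_rank_param_counts(W, H, R)
print(M.shape)         # (512, 256)
print(full, factored)  # 131072 2304
```

With these assumed sizes the factored form has roughly 57 times fewer parameters than the full matrix. In a real setup A and B would be the weights updated by the optimizer, and M would only ever be formed as their product.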

This technique is mentioned in both of the following videos:

 https://www.coursera.org/learn/generative-ai-with-llms/lecture/NZOVw/peft-techniques-1-lora

https://youtu.be/exVPXVFPMDk?t=205
