made by
https://cneuralnets.netlify.app/
You built a model. Really nice, but how do you make it better? How do you make it work seamlessly, so that it solves the issues it faces in the first place: exploding gradients, an unstable learning rate? We will try to tackle each of them.
This first part of the blog series covers 4 topics.
Paper Link
- https://arxiv.org/abs/2106.09685
Let’s consider a neural network that we want to build to classify cats and dogs.
Now, building an intricate model like the one above and training it from scratch would need a lot of resources. So instead, we load the weights of a pretrained network like the one above, and construct a new layer that maps what the earlier, detailed model learned onto our two classes, cats and dogs. We freeze the weights of the pretrained model to retain its learning, and we train just the last layer, H3.
By this method, we reuse earlier knowledge and save a lot of training time. This kind of finetuning is called Classification Fine-tuning. There are other types too, but we won’t go into detail, as we intend to just learn LoRA for now.
A major problem with finetuning is that even if we train just the last layer (or any other single layer), the model is still huge and the forward pass still runs through all the earlier layers, which is computationally very expensive. Additionally, if we are using many finetuned models, loading a full copy of the weights for each of them individually is very time-inefficient and expensive!
To fix this, we introduce LoRA!!!
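Before the parameter math, here is a minimal sketch of the core LoRA idea from the paper linked above: keep the pretrained weight W frozen, and learn a low-rank update BA (with B initialized to zero so the adapter starts as a no-op). The class name `LoRALinear` and the sizes are my own illustrative choices, not from the paper:

```python
import torch
import torch.nn as nn

class LoRALinear(nn.Module):
    """Frozen pretrained linear layer plus a trainable low-rank update B @ A."""
    def __init__(self, in_features, out_features, r=8, alpha=8.0):
        super().__init__()
        # Stand-in for the pretrained weight; frozen, never updated.
        self.base = nn.Linear(in_features, out_features)
        self.base.weight.requires_grad = False
        self.base.bias.requires_grad = False
        # Low-rank factors: A is (r x in), B is (out x r).
        # B starts at zero, so at initialization the layer behaves
        # exactly like the pretrained one.
        self.A = nn.Parameter(torch.randn(r, in_features) * 0.01)
        self.B = nn.Parameter(torch.zeros(out_features, r))
        self.scale = alpha / r

    def forward(self, x):
        # Equivalent to (W + scale * B @ A) x, without materializing B @ A.
        return self.base(x) + self.scale * (x @ self.A.T @ self.B.T)

layer = LoRALinear(512, 512, r=8)
x = torch.randn(1, 512)
y = layer(x)
```

Only A and B are trainable: for a 512x512 layer with r=8, that is 2 x 8 x 512 = 8,192 parameters instead of the 262,656 in the full layer, which is exactly the saving we quantify next.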
Usually, we would use the full pre-trained weight matrices in the neural network. Let’s analyse how many parameters that would take.