June 8, 2025
5 main ways to do machine learning with less data – Dan Rose AI

1. Transfer learning

Transfer learning is widely used in machine learning now because the benefits are great. The general idea is simple. You train a large neural network for general purposes with a lot of data and a lot of training. Then, when you have a specific problem, you “cut off the end” of the large network and train a few new layers with your own data. The big network already understands many general patterns, so with transfer learning you do not have to teach the network those again.

A good example is training a network to recognize images of different dog breeds. Without transfer learning you need a lot of data, perhaps 100,000 images of different dog breeds, since the network has to learn everything from scratch. If you train a new model with transfer learning, you may only need 50 images of each breed.
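The idea of keeping the big network frozen and training only a few new layers can be sketched in plain Python. This is a minimal toy under stated assumptions, not a real image model: `BASE_WEIGHTS` is a hypothetical stand-in for a pretrained network, and the four two-number samples stand in for a small labeled dataset.

```python
import math

# Hypothetical "pretrained" base: fixed weights standing in for a large
# network trained elsewhere. They are frozen and never updated below.
BASE_WEIGHTS = [[1.0, -1.0], [0.5, 0.5], [-1.0, 1.0]]

def base_features(x):
    """Frozen feature extractor: one linear layer followed by ReLU."""
    return [max(0.0, sum(w * xi for w, xi in zip(row, x))) for row in BASE_WEIGHTS]

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def train_head(samples, labels, epochs=200, lr=0.5):
    """Train only the new head (logistic regression) on the frozen features."""
    feats = [base_features(x) for x in samples]
    w = [0.0] * len(BASE_WEIGHTS)
    b = 0.0
    for _ in range(epochs):
        for f, y in zip(feats, labels):
            p = sigmoid(sum(wi * fi for wi, fi in zip(w, f)) + b)
            err = p - y  # gradient of the log loss with respect to the logit
            w = [wi - lr * err * fi for wi, fi in zip(w, f)]
            b -= lr * err
    return w, b

def predict(x, w, b):
    """Probability of class 1 for a raw input x."""
    f = base_features(x)
    return sigmoid(sum(wi * fi for wi, fi in zip(w, f)) + b)

# Tiny labeled set: class 1 when the first coordinate dominates.
samples = [[2.0, 0.0], [3.0, 1.0], [0.0, 2.0], [1.0, 3.0]]
labels = [1, 1, 0, 0]
w, b = train_head(samples, labels)
```

Only `w` and `b` change during training. In a real project the frozen part would be a pretrained image network from a deep learning library, and the head would be the new final layers.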

You can read more about transfer learning here.

2. Active learning

Active learning is a data collection strategy that lets you pick the data your model would benefit most from when training. Let's stick with the dog breed example. You have trained a model that can distinguish between different breeds, but for some reason the model always has trouble identifying German shepherds. With an active learning strategy, you would automatically, or at least with an established process, pick out these images and send them for labeling.
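One common way to pick those images automatically is uncertainty sampling: send the images the model is least sure about to labeling first. The sketch below assumes this particular selection rule, and the image ids and probabilities are made up for illustration.

```python
def least_confident(predictions, k):
    """Return the ids of the k items whose top class probability is lowest."""
    ranked = sorted(predictions, key=lambda item: max(item[1].values()))
    return [image_id for image_id, _ in ranked[:k]]

# Hypothetical model outputs: (image id, class probabilities).
predictions = [
    ("img_001", {"poodle": 0.97, "german_shepherd": 0.02, "beagle": 0.01}),
    ("img_002", {"poodle": 0.40, "german_shepherd": 0.35, "beagle": 0.25}),
    ("img_003", {"poodle": 0.05, "german_shepherd": 0.90, "beagle": 0.05}),
    ("img_004", {"poodle": 0.30, "german_shepherd": 0.45, "beagle": 0.25}),
]

# The two images the model is least sure about go to the labeling queue.
to_label = least_confident(predictions, k=2)
print(to_label)  # ['img_002', 'img_004']
```

After the selected images are labeled, the model is retrained and the selection is repeated on the remaining unlabeled images.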

I wrote a longer post about how active learning works here.

3. Better data

I have included a strategy here that may seem obvious but is sometimes overlooked. With better-quality data, you often need less of it, since the model does not have to train through the same amount of noise and wrong signals. In the media, AI is often described as “with a lot of data you can do anything”. But in many cases, making an extra effort to get rid of bad data and making sure that only correctly labeled data is used for training makes more sense than going for more data.

4. GANs

GANs, or generative adversarial networks, are a way to build neural networks that sounds almost futuristic in its design. In essence, this kind of neural network is built by having two networks compete against each other in a game, where one network creates new fake examples based on the dataset and the other tries to guess what is fake and what is real data. The network building fake data is called the generator, and the network trying to guess what is fake and what is real is called the discriminator. This is a deep learning approach, and both networks keep improving during the game. When the generator is so good at generating fake data that the discriminator consistently has trouble separating fake from real, we have a finished model.

For GANs you still need a lot of data, but you do not need labeled data, and since labeling is usually the costly part, you can save time and money with this approach.
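The alternating game between generator and discriminator can be shown with a deliberately tiny sketch. Everything here is an assumption made for illustration: the "real" data is one number drawn around 4.0, the generator is a single linear function of noise, and the discriminator is a single logistic unit. A model this small can roughly only tell real from fake by where the values are centered, so it shows the training loop rather than a usable generator.

```python
import math
import random

def sigmoid(s):
    """Numerically stable logistic function."""
    if s >= 0:
        return 1.0 / (1.0 + math.exp(-s))
    e = math.exp(s)
    return e / (1.0 + e)

def train_toy_gan(steps=2000, lr=0.05, seed=0):
    rng = random.Random(seed)
    a, b = 1.0, 0.0  # generator: fake = a * z + b
    w, c = 0.0, 0.0  # discriminator: D(x) = sigmoid(w * x + c)
    for _ in range(steps):
        real = rng.gauss(4.0, 1.0)  # one unlabeled "real" sample
        z = rng.gauss(0.0, 1.0)     # noise fed to the generator
        fake = a * z + b

        # Discriminator step: push D(real) toward 1 and D(fake) toward 0.
        d_real = sigmoid(w * real + c)
        d_fake = sigmoid(w * fake + c)
        w -= lr * ((d_real - 1.0) * real + d_fake * fake)
        c -= lr * ((d_real - 1.0) + d_fake)

        # Generator step: push D(fake) toward 1 (non-saturating loss).
        d_fake = sigmoid(w * fake + c)
        a -= lr * (d_fake - 1.0) * w * z
        b -= lr * (d_fake - 1.0) * w
    return {"a": a, "b": b, "w": w, "c": c}

params = train_toy_gan()
```

Note that no labels appear anywhere in the loop. The only supervision is whether a sample came from the data or from the generator, which is why unlabeled data is enough.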

5. Probabilistic programming

One of my favorite technologies. Probabilistic programming has many benefits, and one of them is that you can often get away with using less data. The reason is simply that you build “priors” into your models. This means you can encode your domain knowledge in the model and let the data take it from there. In many other machine learning approaches, everything has to be learned by the model from scratch, no matter how obvious it is.

A good example here is document data capture models. In many cases, the data we are looking for is obvious from the keyword to its left. “ID number: #number#” is a common format. With probabilistic programming, you can tell the model before training that you expect the data to be to the right of the keyword. Many neural networks have to learn this from scratch, which requires more data.
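The effect of a prior can be illustrated without a full probabilistic programming framework. The sketch below uses a simple Beta prior on the probability that the value sits to the right of its keyword. The prior strengths and the three observed documents are assumptions chosen for illustration.

```python
def posterior_mean(prior_right, prior_left, seen_right, seen_left):
    """Posterior mean of P(value is right of the keyword) under a Beta prior."""
    alpha = prior_right + seen_right
    beta = prior_left + seen_left
    return alpha / (alpha + beta)

# Domain knowledge encoded as a Beta(9, 1) prior: we expect "right of the
# keyword" about 90% of the time before seeing any documents.
informed = posterior_mean(9, 1, seen_right=2, seen_left=1)

# No domain knowledge: a flat Beta(1, 1) prior.
flat = posterior_mean(1, 1, seen_right=2, seen_left=1)

print(round(informed, 3), round(flat, 3))  # 0.846 0.6
```

With only three documents, the model with the informed prior already sits close to the expected layout, while the model with the flat prior needs many more examples to reach the same belief.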

You can also read more about probabilistic programming here: https://www.danrose.ai/blog/63qy8s3vwq8p9MogsbbbldldLti4yojon
