How can I increase the sample size for training a neural network?

Classification with a neural network.
There are 1,700 objects: 130 of class A and 1,570 of class B. Each object has 130 features. After screening for multicollinearity (Kendall's tau above 0.7) and running genetic-algorithm feature selection with probabilistic neural networks (STATISTICA 6.1), 50 significant features were selected. I want to continue in the same package and train an MLP to classify these objects, but I can only feed it 260 objects (130 per class), because otherwise the network will a priori assign every object to class B. However, I have read that the number of parameters (weights?) in a network should be about 10 times smaller than the sample size. Obviously, if I follow this rule, the hidden layer will have only a couple of neurons, which in theory is not enough.

So I need to increase the number of class A objects beyond these 130. My thinking is to replicate them by adding random noise to each feature, but I am not sure this is correct. Alternatively, perhaps I should keep only the features that are normally distributed and add noise to those, or somehow generate new values from the empirical distributions of the features.
I don't know any programming languages, so please suggest a software product that implements sample augmentation, or other ways to solve this problem, preferably ones that are also already implemented in software :)
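To make the rule of thumb quoted in the question concrete, here is a rough arithmetic sketch. It assumes a single hidden layer, one output neuron, and that biases count as parameters; none of this is stated in the question, and the function names are made up for illustration.

```python
def mlp_param_count(n_inputs, n_hidden, n_outputs=1):
    """Weights plus biases of an MLP with one hidden layer (an assumed architecture)."""
    hidden = (n_inputs + 1) * n_hidden   # +1 for each hidden neuron's bias
    output = (n_hidden + 1) * n_outputs  # +1 for each output neuron's bias
    return hidden + output


def max_hidden_neurons(n_samples, n_inputs, n_outputs=1, ratio=10):
    """Largest hidden layer whose parameter count stays within n_samples / ratio."""
    budget = n_samples // ratio
    h = 0
    while mlp_param_count(n_inputs, h + 1, n_outputs) <= budget:
        h += 1
    return h


# 50 selected features; 260 balanced objects versus all 1,700 objects.
for n in (260, 1700):
    print(n, "objects ->", max_hidden_neurons(n, n_inputs=50), "hidden neurons")
```

Under these assumptions the balanced 260-object sample leaves no room for even one hidden neuron (a single one already costs 53 parameters against a budget of 26), while the full 1,700-object sample allows three.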

3 Answers

The usual approach is different: engineer every feature you can think of, then augment the dataset by any available method, and then sample it in batches so that one epoch passes over the entire dataset. Train for 100 epochs or more while watching the loss curve on the validation set.
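The batching and epoch loop described in this answer might look roughly like the sketch below. To keep it dependency-free it swaps in a plain logistic regression for the MLP and uses synthetic two-feature data; the learning rate, batch size, and all names are illustrative assumptions, not the answerer's setup.

```python
import math
import random


def sigmoid(z):
    z = max(-30.0, min(30.0, z))  # clamp so math.exp cannot overflow
    return 1.0 / (1.0 + math.exp(-z))


def predict(w, b, x):
    return sigmoid(sum(wj * xj for wj, xj in zip(w, x)) + b)


def log_loss(w, b, X, y):
    eps = 1e-12
    total = 0.0
    for xi, yi in zip(X, y):
        p = predict(w, b, xi)
        total -= yi * math.log(p + eps) + (1 - yi) * math.log(1 - p + eps)
    return total / len(X)


def train(X_tr, y_tr, X_val, y_val, epochs=100, batch_size=16, lr=0.1, seed=0):
    rng = random.Random(seed)
    w, b = [0.0] * len(X_tr[0]), 0.0
    history, best = [], (float("inf"), 0)
    idx = list(range(len(X_tr)))
    for epoch in range(1, epochs + 1):
        rng.shuffle(idx)  # new batch composition every epoch
        for start in range(0, len(idx), batch_size):  # one epoch = one full pass
            batch = idx[start:start + batch_size]
            gw, gb = [0.0] * len(w), 0.0
            for i in batch:
                err = predict(w, b, X_tr[i]) - y_tr[i]
                for j, xj in enumerate(X_tr[i]):
                    gw[j] += err * xj
                gb += err
            w = [wj - lr * gj / len(batch) for wj, gj in zip(w, gw)]
            b -= lr * gb / len(batch)
        val = log_loss(w, b, X_val, y_val)  # the curve to watch
        history.append(val)
        if val < best[0]:
            best = (val, epoch)
    return w, b, history, best


# Synthetic stand-in data: two Gaussian classes with two features each.
data_rng = random.Random(1)
X = [[data_rng.gauss(c, 1.0), data_rng.gauss(c, 1.0)] for c in [0.0] * 100 + [2.0] * 100]
y = [0] * 100 + [1] * 100
order = list(range(len(X)))
data_rng.shuffle(order)
tr, va = order[:150], order[150:]
w, b, history, best = train([X[i] for i in tr], [y[i] for i in tr],
                            [X[i] for i in va], [y[i] for i in va])
print(f"best validation loss {best[0]:.3f} at epoch {best[1]}")
```

If the validation loss stops falling or starts rising while training continues, that is the usual sign to stop, so keeping the epoch with the lowest validation loss is a reasonable default.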

On a dataset of this size it is better not to use neural networks at all; augmentation will not help.

Well, first, you can still try training on the data you already have (even though it is skewed toward class B). If you want, you can simply duplicate the class A objects.
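Duplicating class A, with or without the random noise proposed in the question, can be sketched as follows. The function name, the noise scale of 5% of each feature's standard deviation, and the toy data are all assumptions made for illustration. With `noise_scale=0.0` the function produces exact copies.

```python
import random
import statistics


def oversample_minority(rows, target_count, noise_scale=0.0, seed=0):
    """Grow `rows` to `target_count` by resampling with optional Gaussian jitter.

    The jitter for each feature has standard deviation
    noise_scale * (that feature's standard deviation in `rows`).
    """
    rng = random.Random(seed)
    n_features = len(rows[0])
    stdevs = [statistics.pstdev([r[j] for r in rows]) for j in range(n_features)]
    out = [list(r) for r in rows]  # keep every original object unchanged
    while len(out) < target_count:
        base = rng.choice(rows)    # pick a random original object to copy
        out.append([v + rng.gauss(0.0, noise_scale * s) for v, s in zip(base, stdevs)])
    return out


# Toy stand-in for the 130 class A objects (3 features instead of 50).
data_rng = random.Random(42)
class_a = [[data_rng.gauss(5.0, 2.0) for _ in range(3)] for _ in range(130)]

jittered = oversample_minority(class_a, target_count=1570, noise_scale=0.05)
copies = oversample_minority(class_a, target_count=1570, noise_scale=0.0)
print(len(jittered), len(copies))
```

Oversampling should be applied only to the training part after the validation objects have been set aside; otherwise copies of the same object end up on both sides of the split and the validation score looks better than it really is.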

Second, you can turn to other classifiers, for example a decision tree (or a random forest).
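As one possible illustration of the forest option, the sketch below uses scikit-learn's `RandomForestClassifier` with `class_weight="balanced"` so that the rare class is not ignored. The data are synthetic (same class sizes and feature count as in the question, but five artificially shifted features), and the number of trees and the split ratio are arbitrary assumptions.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import recall_score
from sklearn.model_selection import train_test_split

# Synthetic stand-in: 130 class A and 1,570 class B objects with 50 features.
rng = np.random.default_rng(0)
X_a = rng.normal(0.0, 1.0, size=(130, 50))
X_a[:, :5] += 3.0  # assumption: only five features actually separate the classes
X_b = rng.normal(0.0, 1.0, size=(1570, 50))
X = np.vstack([X_a, X_b])
y = np.array([1] * 130 + [0] * 1570)  # 1 = class A, 0 = class B

# A stratified split keeps the 130:1570 ratio in both parts.
X_tr, X_te, y_tr, y_te = train_test_split(
    X, y, test_size=0.3, stratify=y, random_state=0
)

# "balanced" reweights classes inversely to their frequency in the training data.
forest = RandomForestClassifier(
    n_estimators=200, class_weight="balanced", random_state=0
)
forest.fit(X_tr, y_tr)

pred = forest.predict(X_te)
recall_a = recall_score(y_te, pred, pos_label=1)
print(f"recall on class A: {recall_a:.2f}")
```

With a class imbalance like this, overall accuracy is misleading (predicting B for everything already scores about 92%), so the recall on class A is the more informative number to check.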

If you have already solved the problem, could you share how you did it?
