Quote (SurgeonAnastasia @ 9.03.2017 - 14:34)

Dear p2004r, you have done a great deal of work, thank you very much. I am still working through it. As I understand it, my calculations are not relevant at all.
Here is one more result: eight runs of significant-predictor selection, one for each variable in the sample taken in turn as the response.
p=0.01
Code
> Boruta(к.т~., data=data, maxRuns = 2600)
Boruta performed 2453 iterations in 51.44179 secs.
No attributes deemed important.
7 attributes confirmed unimportant: иг, лк, м.т, уд, ш and 2 more.
> Boruta(м.т~., data=data, maxRuns =

0)
Boruta performed 23 iterations in 0.4761589 secs.
1 attributes confirmed important: уд.
6 attributes confirmed unimportant: иг, к.т, лк, ш, э.н and 1 more.
> Boruta(уд~., data=data, maxRuns = 1800)
Boruta performed 70 iterations in 1.545961 secs.
1 attributes confirmed important: м.т.
6 attributes confirmed unimportant: иг, к.т, лк, ш, э.н and 1 more.
> Boruta(иг~., data=data, maxRuns = 1800)
Boruta performed 82 iterations in 2.246107 secs.
No attributes deemed important.
7 attributes confirmed unimportant: к.т, лк, м.т, уд, ш and 2 more.
> Boruta(э.н~., data=data, maxRuns = 1800)
Boruta performed 14 iterations in 0.370903 secs.
No attributes deemed important.
7 attributes confirmed unimportant: иг, к.т, лк, м.т, уд and 2 more.
> Boruta(э.у~., data=data, maxRuns = 1800)
Boruta performed 75 iterations in 1.928164 secs.
1 attributes confirmed important: лк.
6 attributes confirmed unimportant: иг, к.т, м.т, уд, ш and 1 more.
> Boruta(ш~., data=data, maxRuns = 1800)
Boruta performed 26 iterations in 0.591598 secs.
No attributes deemed important.
7 attributes confirmed unimportant: иг, к.т, лк, м.т, уд and 2 more
> Boruta(лк~., data=data, maxRuns = 1800)
Boruta performed 470 iterations in 11.84426 secs.
2 attributes confirmed important: к.т, э.у.
5 attributes confirmed unimportant: иг, м.т, уд, ш, э.н.
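The eight calls above differ only in the response variable, so they could also be generated in a loop. A minimal sketch, assuming the Boruta package is installed: `dat`, `run_all`, `p_value` and `max_runs` are hypothetical names, and the data are synthetic stand-ins, not the sample from this thread.

```r
# Sketch only: synthetic stand-in data, NOT the dataset discussed in the thread.
library(Boruta)

set.seed(1)
n <- 46
a <- rbinom(n, 1, 0.5)
b <- ifelse(runif(n) < 0.85, a, 1 - a)   # b mostly copies a
dat <- data.frame(
  a = factor(a),
  b = factor(b),
  c = factor(rbinom(n, 1, 0.5)),         # pure noise
  d = factor(rbinom(n, 1, 0.5))          # pure noise
)

# Run Boruta once per column, treating that column as the response
# and all remaining columns as candidate predictors.
run_all <- function(df, p_value = 0.01, max_runs = 200) {
  sapply(names(df), function(target) {
    fit <- Boruta(reformulate(".", response = target), data = df,
                  pValue = p_value, maxRuns = max_runs)
    # Confirmed attributes only; tentative ones are left out.
    paste(getSelectedAttributes(fit, withTentative = FALSE), collapse = ", ")
  })
}

selected <- run_all(dat)
print(selected)
```

Calling `run_all(dat, p_value = 0.05)` would repeat the same pass with a looser threshold, which is what the next block does by hand.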
Code
> Boruta(к.т~., data=data, maxRuns = 6000, pValue = 0.05)
Boruta performed 1314 iterations in 26.98678 secs.
No attributes deemed important.
7 attributes confirmed unimportant: иг, лк, м.т, уд, ш and 2 more.
> Boruta(м.т~., data=data, maxRuns = 6000, pValue = 0.05)
Boruta performed 53 iterations in 1.084951 secs.
1 attributes confirmed important: уд.
6 attributes confirmed unimportant: иг, к.т, лк, ш, э.н and 1 more.
> Boruta(уд~., data=data, maxRuns = 6000, pValue = 0.05)
Boruta performed 41 iterations in 0.8648179 secs.
1 attributes confirmed important: м.т.
6 attributes confirmed unimportant: иг, к.т, лк, ш, э.н and 1 more.
> Boruta(иг~., data=data, maxRuns = 6000, pValue = 0.05)
Boruta performed 851 iterations in 23.46029 secs.
No attributes deemed important.
7 attributes confirmed unimportant: к.т, лк, м.т, уд, ш and 2 more.
> Boruta(э.н~., data=data, maxRuns = 6000, pValue = 0.05)
Boruta performed 53 iterations in 1.339764 secs.
No attributes deemed important.
7 attributes confirmed unimportant: иг, к.т, лк, м.т, уд and 2 more.
> Boruta(э.у~., data=data, maxRuns = 6000, pValue = 0.05)
Boruta performed 51 iterations in 1.270474 secs.
1 attributes confirmed important: лк.
6 attributes confirmed unimportant: иг, к.т, м.т, уд, ш and 1 more.
> Boruta(ш~., data=data, maxRuns = 6000, pValue = 0.05)
Boruta performed 11 iterations in 0.3183038 secs.
No attributes deemed important.
7 attributes confirmed unimportant: иг, к.т, лк, м.т, уд and 2 more.
> Boruta(лк~., data=data, maxRuns = 6000, pValue = 0.05)
Boruta performed 241 iterations in 5.959469 secs.
2 attributes confirmed important: к.т, э.у.
5 attributes confirmed unimportant: иг, м.т, уд, ш, э.н.
As you can see, there is one relationship between the sets that the sample is able to confirm: "лк--к.т". Knowing the state of э.у and к.т, one can predict лк more reliably.
Code
> ranger(лк~., data=data, num.trees = 15500)
Ranger result
Call:
ranger(лк ~ ., data = data, num.trees = 15500)
Type: Classification
Number of trees: 15500
Sample size: 46
Number of independent variables: 7
Mtry: 2
Target node size: 1
Variable importance mode: none
OOB prediction error: 43.48 %
> ranger(лк~., data=data[c("лк", "э.у", "к.т")], num.trees = 15500)
Ranger result
Call:
ranger(лк ~ ., data = data[c("лк", "э.у", "к.т")], num.trees = 15500)
Type: Classification
Number of trees: 15500
Sample size: 46
Number of independent variables: 2
Mtry: 1
Target node size: 1
Variable importance mode: none
OOB prediction error: 26.09 %
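The two ranger fits above can be wrapped in a small helper that returns the OOB error for any subset of predictors. This is only a sketch under assumptions: `toy`, `oob_error` and the column names are hypothetical, the data are synthetic, and the ranger package is assumed to be installed.

```r
# Sketch only: synthetic data with two informative predictors and five noise columns.
library(ranger)

set.seed(42)
n  <- 200
x1 <- rbinom(n, 1, 0.5)
x2 <- rbinom(n, 1, 0.5)
signal <- as.integer(x1 | x2)
# Response follows the signal 90% of the time, otherwise it is flipped.
y <- factor(ifelse(runif(n) < 0.9, signal, 1 - signal))

noise <- as.data.frame(matrix(rbinom(n * 5, 1, 0.5), ncol = 5))
names(noise) <- paste0("noise", 1:5)
toy <- cbind(data.frame(y = y, x1 = factor(x1), x2 = factor(x2)), noise)

# OOB misclassification rate of a forest fitted on the given predictors.
oob_error <- function(df, predictors, target = "y", trees = 500) {
  fit <- ranger(reformulate(predictors, response = target),
                data = df, num.trees = trees, seed = 1)
  fit$prediction.error
}

errs <- c(
  all_predictors = oob_error(toy, setdiff(names(toy), "y")),
  x1_and_x2      = oob_error(toy, c("x1", "x2")),
  x1_only        = oob_error(toy, "x1")
)
print(round(100 * errs, 2))
```

The helper only reads the `prediction.error` field of the fitted object, which for classification is the same "OOB prediction error" that ranger prints.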
Code
> randomForest(лк~., data=data[c("лк", "э.у", "к.т")], ntree = 15500)
Call:
randomForest(formula = лк ~ ., data = data[c("лк", "э.у", "к.т")], ntree = 15500)
Type of random forest: classification
Number of trees: 15500
No. of variables tried at each split: 1
OOB estimate of error rate: 26.09%
Confusion matrix:
   0  1 class.error
0 23  5   0.1785714
1  7 11   0.3888889
> randomForest(лк~., data=data, ntree = 15500)
Call:
randomForest(formula = лк ~ ., data = data, ntree = 15500)
Type of random forest: classification
Number of trees: 15500
No. of variables tried at each split: 2
OOB estimate of error rate: 43.48%
Confusion matrix:
   0  1 class.error
0 19  9   0.3214286
1 11  7   0.6111111
>
Taken individually, they perform worse:
Code
> randomForest(лк~., data=data[c("лк", "э.у")], ntree = 15500)
Call:
randomForest(formula = лк ~ ., data = data[c("лк", "э.у")], ntree = 15500)
Type of random forest: classification
Number of trees: 15500
No. of variables tried at each split: 1
OOB estimate of error rate: 30.43%
Confusion matrix:
   0  1 class.error
0 21  7   0.2500000
1  7 11   0.3888889
> randomForest(лк~., data=data[c("лк", "к.т")], ntree = 15500)
Call:
randomForest(formula = лк ~ ., data = data[c("лк", "к.т")], ntree = 15500)
Type of random forest: classification
Number of trees: 15500
No. of variables tried at each split: 1
OOB estimate of error rate: 34.78%
Confusion matrix:
   0  1 class.error
0 28  0   0.0000000
1 16  2   0.8888889
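As we can see, the prediction error drops sharply when only the significant predictors selected by Boruta are used: the OOB error falls from 43.48% with all seven predictors to 26.09% with э.у and к.т.
PS: к.т adds another 6-7% of accuracy.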
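For anyone who wants to check the error rates by hand, they follow directly from the confusion matrices printed above. A minimal base-R sketch: the counts are copied from the randomForest output in this post, while the labels `eu` and `kt` are hypothetical transliterations of э.у and к.т used only to keep the names ASCII.

```r
# Confusion matrices copied from the randomForest output above
# (rows = true class, columns = predicted class; matrix() fills by column).
cm <- list(
  all_predictors = matrix(c(19, 11, 9, 7),  nrow = 2),
  eu_and_kt      = matrix(c(23, 7, 5, 11),  nrow = 2),
  eu_only        = matrix(c(21, 7, 7, 11),  nrow = 2),
  kt_only        = matrix(c(28, 16, 0, 2),  nrow = 2)
)

# Overall OOB error: share of off-diagonal (misclassified) cases.
oob_error <- function(m) 1 - sum(diag(m)) / sum(m)

# Per-class error: misclassified share within each true class (row).
class_error <- function(m) 1 - diag(m) / rowSums(m)

errs <- sapply(cm, oob_error)
print(round(100 * errs, 2))
# all_predictors      eu_and_kt        eu_only        kt_only
#          43.48          26.09          30.43          34.78

print(round(class_error(cm$eu_and_kt), 7))
# 0.1785714 0.3888889

# Change in OOB error when kt is added to eu, in percentage points,
# for the single runs shown in this post.
gain <- 100 * (errs[["eu_only"]] - errs[["eu_and_kt"]])
print(round(gain, 2))
# 4.35
```

With only 46 observations each misclassified case is worth about 2.2 percentage points, so differences of this size correspond to just a couple of observations.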