Toolkit to train a net without gensfen or selfplay

deeds · Post by **deeds** » 16/08/2021, 6:51

2000m_d14_nn2 : 2B D14 as training data with 1M D18 as validation data

training command :

stockfish.bmi2.halfkp_256x2-32-32.nnue-learn.2020-07-19.exe
uci
setoption name SkipLoadingEval value true
setoption name Threads value 36
isready
learn targetdir training loop 100 batchsize 1000000 eta 1 lambda 1 eval_limit 32000 nn_batch_size 1000 newbob_decay 0.5 eval_save_interval 500000000 loss_output_interval 125000000 mirror_percentage 50 validation_set_file_name validation\1m_d18.bin
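
The intervals in the learn command are given in positions (sfens), so they can be related to batches, saved nets and passes over the data. The following is only a back-of-the-envelope sketch using figures quoted in this thread. It assumes the learner writes one net per `eval_save_interval` positions, which is my reading of the option name rather than something checked against the learner source. The `schedule` helper is hypothetical.

```python
# Values copied from the learn command and from figures quoted in this thread.
BATCHSIZE = 1_000_000
EVAL_SAVE_INTERVAL = 500_000_000
LOSS_OUTPUT_INTERVAL = 125_000_000
SFENS_USED = 3_500_000_000          # rounded "3.5B used sfens" from the post
TRAINING_POSITIONS = 2_088_285_268  # EPD count reported for 2000m_d14_plain.txt


def schedule(batchsize, save_interval, loss_interval, sfens_used, dataset_size):
    """Relate the position-based intervals to batches, saves and data passes."""
    return {
        "batches_per_save": save_interval // batchsize,
        "loss_reports_per_save": save_interval // loss_interval,
        # Assumption: one net is written every save_interval positions.
        "saves": sfens_used // save_interval,
        "passes_over_data": sfens_used / dataset_size,
    }


plan = schedule(BATCHSIZE, EVAL_SAVE_INTERVAL, LOSS_OUTPUT_INTERVAL,
                SFENS_USED, TRAINING_POSITIONS)
for key, value in plan.items():
    print(key, value)
```

Under that assumption the run yields seven saves, which matches the seven nets (evalsave0 to evalsave6) in the ranking, and the 3.5B sfens correspond to fewer than two passes over the 2B training set.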

NN2.BIN (evalsave6_rejected)
sfens              : 3 375 000 000
test_cross_entropy :      0.252433
move accuracy      :      33.1046%
loss               :     0.0466118

ordo ranking :

   # PLAYER                              :  RATING  ERROR  POINTS  PLAYED   (%)     W     D     L  D(%)  OppAvg  OppN
   1 2000m_d14_nn2_evalsave6_rejected    :      61     11  1168.5    2000  58.4   601  1135   264  56.8       0     1
   2 2000m_d14_nn2_evalsave3_rejected    :      46     11  1127.0    2000  56.4   532  1190   278  59.5       0     1
   3 2000m_d14_nn2_evalsave5_rejected    :      43     12  1118.5    2000  55.9   574  1089   337  54.5       0     1
   4 2000m_d14_nn2_evalsave2             :      35     11  1098.5    2000  54.9   526  1145   329  57.3       0     1
   5 2000m_d14_nn2_evalsave4             :      35     11  1097.5    2000  54.9   485  1225   290  61.3       0     1
   6 2000m_d14_nn2_evalsave1             :      26     11  1072.0    2000  53.6   512  1120   368  56.0       0     1
   7 2000m_d14_nn2_evalsave0             :      12     11  1033.5    2000  51.7   484  1099   417  55.0       0     1
   8 stockfish 190720 no-nnue            :       0   ----  6284.5   14000  44.9  2283  8003  3714  57.2      37     7

White advantage = 55.13 +/- 2.18
Draw rate (equal opponents) = 50.00 % +/- 0.00

This training run was longer: it used 3.5B sfens drawn from the 2B-position training data, and it produced a net rated +61 Elo above the base engine's own evaluation (stockfish 190720 no-nnue).
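
As a sanity check on the rating, a score percentage can be converted to an Elo difference with the usual logistic formula. This is a minimal sketch, not ordo's actual method: ordo fits all players jointly, so its figures differ by a few points. The function name `elo_from_score` is my own.

```python
import math


def elo_from_score(score):
    """Elo difference implied by a score fraction in (0, 1), logistic model."""
    if not 0.0 < score < 1.0:
        raise ValueError("score must be strictly between 0 and 1")
    return 400.0 * math.log10(score / (1.0 - score))


# evalsave6 scored 1168.5 points in 2000 games against the base engine.
print(round(elo_from_score(1168.5 / 2000), 1))
```

For evalsave6 this gives roughly +59, close to the +61 that ordo reports.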

evalsave elo curves :

When I saw that even the first evalsave was stronger than the base engine, it looked promising...

evalsave loss curves :

It's odd that nearly identical loss values can produce nets of such different strength. But the trend is there: when the loss values decrease, it is a good sign.

At the moment, I'm testing whether the training can use more than 3.5B sfens with 1m d16 or 1m d19 as validation data:

deeds · Post by **deeds** » 16/08/2021, 15:05

deeds wrote: ↑09/08/2021, 6:17
2000m_d14_plain.txt (207 601 371 KB) :
Games = 19 182 119
EPD = 2 088 285 268
epd/game = min. 1 (1), avg. 109 (124), max. 398 (398)

EPDStringLength = min. 32 (32), max. 83 (83)
PlainTextBlocSize = min. 66 (66), max. 119 (120)

score = min. -319.99, max. +319.99
Plies = min. 1 (1), max. 398 (400)
max50 = 99 moves (99)

PieceCount = min. 2 (2), avg. 17 (22), max. 32 (32)
MaterialImbalance = min. -44 (-44), max. 62 (62)

1000m_d10_plain.txt (103 450 112 KB) :

Games = 14 557 924
EPD = 1 000 000 000
epd/game = min. 1 (1), avg. 69 (124), max. 379 (398)

EPDStringLength = min. 34 (32), max. 83 (83)
PlainTextBlocSize = min. 68 (66), max. 120 (120)

score = min. -165.60, max. +160.80
Plies = min. 1 (1), max. 397 (400)
max50 = 63 moves (99)

PieceCount = min. 3 (2), avg. 20 (22), max. 32 (32)
MaterialImbalance = min. -24 (-44), max. 24 (62)

The main difference between these two sets of training data is the material imbalance!
Clearly, the gensfen command with default settings produced positions with little variety.
With only 69 plies per game on average, not all the moves of a game were used, which may explain why the first nets were bad in the endgame...

This is the distribution of piece counts across the positions :

My 2000m_d14 training data uses more plies per game, with slightly fewer positions in the opening but more positions in the endgame.
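
For anyone who wants to reproduce the PieceCount and MaterialImbalance figures on their own data, both can be derived from the board field of a FEN/EPD string. This is a minimal sketch and not the tool used for the statistics above. The piece values (1/3/3/5/9) are an assumption, since the values behind the reported imbalance numbers are not stated in this thread.

```python
# Assumed conventional piece values; the values behind the thread's
# MaterialImbalance statistics are not given.
PIECE_VALUES = {"p": 1, "n": 3, "b": 3, "r": 5, "q": 9, "k": 0}


def piece_count(fen):
    """Number of pieces (kings included) in the board field of a FEN string."""
    board = fen.split()[0]
    return sum(1 for ch in board if ch.isalpha())


def material_imbalance(fen):
    """White material minus black material, using PIECE_VALUES."""
    board = fen.split()[0]
    total = 0
    for ch in board:
        if ch.isalpha():
            value = PIECE_VALUES[ch.lower()]
            # Uppercase letters are white pieces, lowercase are black.
            total += value if ch.isupper() else -value
    return total


START = "rnbqkbnr/pppppppp/8/8/8/8/PPPPPPPP/RNBQKBNR w KQkq - 0 1"
print(piece_count(START), material_imbalance(START))
```

Running these two functions over every position and collecting min, average and max would give statistics comparable in form to the ones quoted above.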

deeds · Post by **deeds** » 19/08/2021, 11:54

I tested my latest net with several engines :

   # PLAYER                       :  RATING  ERROR  POINTS  PLAYED   (%)    W    D    L  D(%)  OppAvg  OppN
   1 brainlearn 12.1 2000m_d14    :     109     26   260.0     400  65.0  135  250   15  62.5       0     1
   2 brainlearn 12.1 classical    :       0   ----   140.0     400  35.0   15  250  135  62.5     109     1

White advantage = -32.86 +/- 12.81
Draw rate (equal opponents) = 50.00 % +/- 0.00

   # PLAYER                  :  RATING  ERROR  POINTS  PLAYED   (%)    W    D    L  D(%)  OppAvg  OppN
   1 dragon 1.0 regular      :       0   ----   201.0     400  50.3  107  188  105  47.0      -2     1
   2 dragon 1.0 2000m_d14    :      -2     24   199.0     400  49.8  105  188  107  47.0       0     1

White advantage = -31.63 +/- 12.12
Draw rate (equal opponents) = 50.00 % +/- 0.00

   # PLAYER                       :  RATING  ERROR  POINTS  PLAYED   (%)    W    D    L  D(%)  OppAvg  OppN
   1 igel 2.9.0 2000m_d14         :      15     25   208.5     400  52.1   99  219   82  54.8       0     1
   2 igel 2.9.0 ign-0-9b1937cc    :       0   ----   191.5     400  47.9   82  219   99  54.8      15     1

White advantage = -14.93 +/- 11.91
Draw rate (equal opponents) = 50.00 % +/- 0.00

   # PLAYER                  :  RATING  ERROR  POINTS  PLAYED   (%)    W    D    L  D(%)  OppAvg  OppN
   1 minic 2.53 2000m_d14    :     329     36   345.0     400  86.3  298   94    8  23.5       0     1
   2 minic 2.53 classical    :       0   ----    55.0     400  13.8    8   94  298  23.5     329     1

White advantage = -60.66 +/- 19.21
Draw rate (equal opponents) = 50.00 % +/- 0.00

   # PLAYER                        :  RATING  ERROR  POINTS  PLAYED   (%)    W    D    L  D(%)  OppAvg  OppN
   1 stockfish 190720 2000m_d14    :      55     25   231.0     400  57.8   86  290   24  72.5       0     1
   2 stockfish 190720 nonnue       :       0   ----   169.0     400  42.3   24  290   86  72.5      55     1

White advantage = -32.43 +/- 12.42
Draw rate (equal opponents) = 50.00 % +/- 0.00

   # PLAYER                        :  RATING  ERROR  POINTS  PLAYED   (%)    W    D    L  D(%)  OppAvg  OppN
   1 stockfish 210920 2000m_d14    :      90     25   250.0     400  62.5  115  270   15  67.5       0     1
   2 stockfish 210920 classical    :       0   ----   150.0     400  37.5   15  270  115  67.5      90     1

White advantage = -24.35 +/- 12.47
Draw rate (equal opponents) = 50.00 % +/- 0.00
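
With only 400 games per match, the ERROR column matters as much as the rating itself. The sketch below estimates a rough 95% interval from the W/D/L counts using a normal approximation on the score. It is not ordo's procedure, so the margins will not match the tables exactly. The helper names are my own.

```python
import math


def elo_from_score(score):
    """Elo difference implied by a score fraction in (0, 1), logistic model."""
    return 400.0 * math.log10(score / (1.0 - score))


def elo_interval(wins, draws, losses, z=1.96):
    """Return (low, mid, high) Elo from W/D/L via a normal approximation.

    Breaks down if the interval reaches a score of 0 or 1 (very lopsided
    or very short matches).
    """
    n = wins + draws + losses
    score = (wins + 0.5 * draws) / n
    # Per-game variance of the result (1, 0.5 or 0) around the mean score.
    var = (wins * (1.0 - score) ** 2
           + draws * (0.5 - score) ** 2
           + losses * score ** 2) / n
    half_width = z * math.sqrt(var / n)
    return (elo_from_score(score - half_width),
            elo_from_score(score),
            elo_from_score(score + half_width))


# dragon 1.0 2000m_d14 vs dragon 1.0 regular: W=105, D=188, L=107
low, mid, high = elo_interval(105, 188, 107)
print(round(low, 1), round(mid, 1), round(high, 1))
```

For the dragon match the interval straddles zero, in line with the -2 +/- 24 that ordo reports: that result does not show a measurable difference either way.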

Outskirts CheSS ForuM