Toolkit to train a net without gensfen nor selfplay


Post by moonstonelight »

deeds wrote:
moonstonelight wrote: Well, I can always give you a text file full of King & Pawn endgames analysed at depth 20 by Eman. (I chose Eman because experience makes the engine's evals more stable, and I thought that might be useful.)
Thank you, but I already have several databases with endgame data:
7men
8men
and thanks to the cutechess tip, I get an average of 150 positions/game.

Thank you for your tools. I made my third Kings & Pawns NNUE today. In my first NNUE I made the mistake of changing every score between -8 and 8 to 0, and that was a total failure: the net didn't understand the subtlety of endgames. The second one (made with the same data but without any zeroing) was competitive but not as strong as the default. The third one seems very strong. I'm starting my tests now.

Post by deeds »

In February I plan to evaluate my 5th dataset (2m games from XGFS, a Discord user).

At first sight, the games contain about 150 plies each, so I expect about 300m PLAIN TEXT positions, but after duplicate EPD detection I should get only 230m to 240m genuinely new positions to add to my current 480m dataset.

The 480m net is +20 Elo versus the classical nodchip release; I hope the next one will be stronger still.
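
The size estimate above is simple arithmetic, and a small sketch makes it easy to redo for other datasets. The function names are made up for illustration; the figures are the ones quoted in the post.

```python
def expected_positions(games, plies_per_game):
    """Raw number of positions before duplicate removal."""
    return games * plies_per_game


def duplicate_share(raw_positions, unique_positions):
    """Fraction of the raw positions that turn out to be duplicates."""
    return 1.0 - unique_positions / raw_positions


raw = expected_positions(2_000_000, 150)          # 2m games at ~150 plies/game
share_low = duplicate_share(raw, 240_000_000)     # optimistic case
share_high = duplicate_share(raw, 230_000_000)    # pessimistic case
current_dataset = 480_000_000

print(f"raw positions: {raw / 1e6:.0f}m")
print(f"duplicates: {share_low:.0%} to {share_high:.0%}")
print(f"new total: {(current_dataset + 230_000_000) / 1e6:.0f}m "
      f"to {(current_dataset + 240_000_000) / 1e6:.0f}m")
```

In other words, the expectation is that roughly a fifth to a quarter of the new positions are already in the existing dataset.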

Post by deeds »

Another tip for the nnue_eval / nnue_extract tools

For those who use these tools on a large PGN file containing several thousand games, processing might take several hours or even days, even with several threads configured.

If you don't want to tie up your machines for that long, you can split the large PGN file with pgn-extract.

Example with a large PGN file containing 3000 games:
dir /b /o *.pgn > liste.lst
pgn-extract.exe -D -bl20 --fixresulttags -#300 -fliste.lst
pause
At the end you get 10 small PGN files of 300 games each (fewer if -D or -bl20 discards some games), and you can process them simultaneously.
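
If pgn-extract is not at hand, the splitting step alone can be sketched in plain Python. This is only an illustration, not a replacement for pgn-extract: it assumes every game starts with an [Event "..."] tag, and it does no duplicate removal, no minimum-length filtering and no result-tag fixing. The function names and the part_N.pgn file names are hypothetical.

```python
def split_pgn_games(pgn_text):
    """Return a list of games; a new game is assumed to start at each '[Event ' tag."""
    games, current = [], []

    def flush():
        body = "\n".join(current).strip()
        if body:
            games.append(body + "\n")
        current.clear()

    for line in pgn_text.splitlines():
        if line.startswith("[Event "):
            flush()
        current.append(line)
    flush()
    return games


def chunk_games(games, games_per_file):
    """Group the games into consecutive chunks of at most games_per_file games."""
    return [games[i:i + games_per_file] for i in range(0, len(games), games_per_file)]


def write_parts(chunks, prefix="part"):
    """Write each chunk to <prefix>_<n>.pgn (hypothetical naming) and return the paths."""
    paths = []
    for index, chunk in enumerate(chunks, start=1):
        path = f"{prefix}_{index}.pgn"
        with open(path, "w", encoding="utf-8") as handle:
            handle.write("\n".join(chunk))
        paths.append(path)
    return paths


# Tiny in-memory demo: 7 dummy games split into parts of 3 games.
sample = "".join(
    f'[Event "Game {n}"]\n[Result "1/2-1/2"]\n\n1. e4 e5 1/2-1/2\n\n' for n in range(1, 8)
)
chunks = chunk_games(split_pgn_games(sample), 3)
print([len(chunk) for chunk in chunks])
```

Each resulting part can then be given to its own nnue_eval or nnue_extract instance, as with the pgn-extract output.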

Post by IbaiBuR »

Hello, thanks for all your tips and tools. I have some questions that you may be able to answer.

I would like to make a net from all the CCRL games, the 40/4 ones if I am not mistaken. Is that suitable for nnue_extract? I mean the commented ones.

How long should that take on my 2-core machine (4 threads, 1.7 GHz)?

Will splitting the whole CCRL PGN make the process faster?

Is it better to extract the games directly with nnue_extract or to rescore them?

Thanks in advance.

Post by deeds »

IbaiBuR wrote: Hello, thanks for all your tips and tools. I have some questions that you may be able to answer.
I would like to make a net from all the CCRL games, the 40/4 ones if I am not mistaken. Is that suitable for nnue_extract? I mean the commented ones.
Yes, the nnue_extract tool handles the commented PGN format from CCRL 40/15.
IbaiBuR wrote: How long should that take on my 2-core machine (4 threads, 1.7 GHz)?
Will splitting the whole CCRL PGN make the process faster?
Hard to say offhand. We can test nnue_extract with a small sample, 2021-01.commented.[9559].pgn.7z (January 2021, 9559 games, 5.49 MB).

Here, with 1 thread @ 2.9 GHz, it takes 2 min:
253 032 games/h (9 559), 7 803 epd/s @ avg. D25 (1 061 220), 111 plies/game, 14,15% rejected (150 140 with 16% duplicated) :
finished (02 min) @ 2021-01.commented.[9559].pgn : 9 559 games, 1 061 220 epd, 23 572 duplicated / 150 140 rejected
So the 1 211 492 commented games, at a rate of 253 032 games/hour, should take less than 5 hours with only one thread @ 2.9 GHz.
If you decompress and then split the commented PGN file from CCRL 40/15 (700 MB) into 2 or 4 parts, you can process several parts at the same time and save a lot of time.
IbaiBuR wrote: Is it better to extract the games directly with nnue_extract or to rescore them?
Thanks in advance.
nnue_extract only extracts the data, so we get PLAIN TEXT data from engines of various strengths, with scores at various depths, etc.
nnue_eval rescores at a fixed depth, so we get PLAIN TEXT data from the same engine strength, with scores at the same depth, etc.
So far, my "fixed depth" nets are stronger than my "various depth" ones, but that depends on the dataset too.
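
The time estimate for the full CCRL file can be redone for other machines with a few lines. This is a rough sketch: it assumes the rate measured on the small sample holds for the whole file, that parts processed in parallel scale linearly, and (a crude assumption not taken from the measurements) that throughput is proportional to clock speed.

```python
def hours_needed(total_games, games_per_hour, parallel_parts=1):
    """Estimated wall-clock hours, assuming linear scaling across parallel parts."""
    return total_games / games_per_hour / parallel_parts


TOTAL_GAMES = 1_211_492    # commented CCRL games quoted above
MEASURED_RATE = 253_032    # games/hour, 1 thread @ 2.9 GHz

single_thread = hours_needed(TOTAL_GAMES, MEASURED_RATE)
four_parts = hours_needed(TOTAL_GAMES, MEASURED_RATE, parallel_parts=4)

# Crude assumption: the rate scales with clock speed (2.9 GHz -> 1.7 GHz).
slower_rate = MEASURED_RATE * 1.7 / 2.9
slower_single = hours_needed(TOTAL_GAMES, slower_rate)

print(f"1 thread @ 2.9 GHz: {single_thread:.1f} h")
print(f"4 parts  @ 2.9 GHz: {four_parts:.1f} h")
print(f"1 thread @ 1.7 GHz (rough): {slower_single:.1f} h")
```

Real timings will also depend on disk speed and on how many physical cores are available, so treat the output as an order of magnitude only.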

Post by deeds »

New release of the nnue_eval tool:
- some speedup / cleanup in the code
- new statistics ("XXeme" shows the average index of the first evaluated position of a game, i.e. the first non-duplicated one)
- bigger write buffer (5 MB) to reduce disk stress

New release of the nnue_clean tool:
- less system stress (idle priority by default)
- slower display refresh to suit faster systems
- bigger write buffer (2 MB) to reduce disk stress

ENJOY!

Post by deeds »

New release of the nnue_extract tool:
- less system stress (idle priority by default)
- slower display refresh to suit faster systems
- new statistics ("XXeme" shows the average index of the first extracted position of a game, i.e. the first one that is neither a book move nor a duplicate)
- better compatibility with nnue_eval's INI file

ENJOY!

PS: nnue_eval, nnue_extract, nnue_clean, plain_stats

Post by deeds »

For use with nnue_eval, you can find more than 250m uncommented games at:
https://mega.nz/folder/C4QDRKwZ#cFLrIihXlFymGtl1dPrNfg

With an average of about 100 plies/game, and after removing duplicated EPDs, you could train nets from 10b to 15b PLAIN TEXT positions.

In my experience, a net trained on 500m PLAIN TEXT positions evaluated by nnue_eval at d14 is already stronger than the classical evaluation.

Post by deeds »

As my previous MEGA account is already full, you can find the other data here:
https://mega.nz/folder/lQEDFQDY#6smKdBKT1szgiNJaaCHVTQ

Post by deeds »

As a precaution, before my 2nd MEGA account fills up, here is my Google Drive folder:
https://drive.google.com/drive/folders/1Yn2Nz7FAESP3b-E8kuFdGDSKEHx8iaqF?usp=sharing

Post by deeds »

2020.10.30: I used nnue_eval to evaluate about 90m EPD from various games (private tourneys) and from FCP's 537k games.

Then I used:

Code: Select all

stockfish.bmi2.halfkp_256x2-32-32.nnue-learn.2020-07-19.exe
uci
setoption name SkipLoadingEval value true
setoption name Threads value 12
isready
learn targetdir training loop 100 batchsize 1000000 eta 1 lambda 1 eval_limit 32000 nn_batch_size 1000 newbob_decay 0.5 eval_save_interval 30000000 loss_output_interval 10000000 mirror_percentage 50 validation_set_file_name validation\1m_d18.bin
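
Since only eval_save_interval and loss_output_interval change between the training runs in this thread, the learn line can be generated from one set of defaults. The sketch below only builds the text of the command from the option names and values shown above; the helper name build_learn_command is hypothetical, and nothing here talks to the engine.

```python
# Defaults copied from the learn command above (2020.10.30 run).
DEFAULT_LEARN_PARAMS = {
    "targetdir": "training",
    "loop": 100,
    "batchsize": 1000000,
    "eta": 1,
    "lambda": 1,
    "eval_limit": 32000,
    "nn_batch_size": 1000,
    "newbob_decay": 0.5,
    "eval_save_interval": 30000000,
    "loss_output_interval": 10000000,
    "mirror_percentage": 50,
    "validation_set_file_name": r"validation\1m_d18.bin",
}


def build_learn_command(overrides=None):
    """Return the 'learn ...' line, with any overrides applied to the defaults."""
    params = dict(DEFAULT_LEARN_PARAMS)
    params.update(overrides or {})
    return "learn " + " ".join(f"{name} {value}" for name, value in params.items())


# Example: the intervals reported for the 480m run.
command_480m = build_learn_command(
    {"eval_save_interval": 120000000, "loss_output_interval": 40000000}
)
print(command_480m)
```

The printed line can be pasted into the engine console after the setoption and isready lines.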
To find the best net, I used:

Code: Select all

cutechess-cli -engine conf="stockfish" -engine conf="evalsave0" -engine conf="evalsave1" -engine conf="evalsave3" -tournament gauntlet -each option.Hash=128 tc=60+1 -games 2000 -openings file="SALC_V3_10moves.pgn" start=1 -pgnout "%computername% - 90m_d14_nn2_256x2_evalsave %date:~6,4%-%date:~3,2%-%date:~0,2%.pgn" fi -repeat -recover -concurrency 39 -maxmoves 200 -draw movenumber=40 movecount=5 score=10 -tb "C:\Syzygy" -tbpieces 6 -event %computername% -site "dual xeon e5-2660v3" -ratinginterval 10
And finally I got this:

Code: Select all

   # PLAYER                            :  RATING  ERROR   POINTS  PLAYED   (%)     W     D     L  D(%)  OppAvg  OppN
   1 stockfish 190720 no-nnue          :       0   ----  10944.0   16000  68.4  6971  7946  1083  49.7    -139     8
   2 90m_d14_nn2_evalsave6             :    -104     12    715.0    2000  35.8   187  1056   757  52.8       0     1
   3 90m_d14_nn2_evalsave7_rejected    :    -105     12    713.5    2000  35.7   182  1063   755  53.1       0     1
   4 90m_d14_nn2_evalsave5             :    -112     11    696.5    2000  34.8   154  1085   761  54.3       0     1
   5 90m_d14_nn2_evalsave4             :    -123     12    668.0    2000  33.4   140  1056   804  52.8       0     1
   6 90m_d14_nn2_evalsave3             :    -127     12    658.0    2000  32.9   142  1032   826  51.6       0     1
   7 90m_d14_nn2_evalsave2             :    -142     12    620.5    2000  31.0   137   967   896  48.4       0     1
   8 90m_d14_nn2_evalsave1             :    -171     13    553.0    2000  27.6    96   914   990  45.7       0     1
   9 90m_d14_nn2_evalsave0             :    -230     13    431.5    2000  21.6    45   773  1182  38.6       0     1

White advantage = 45.64 +/- 2.15
Draw rate (equal opponents) = 50.00 % +/- 0.00
Some stats:
- between 169 and 206 sec/game
- about 136 plies/game
- average depth between 27 and 28 plies
- between 1500 and 1800 ms/ply
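
As a sanity check on the RATING and ERROR columns, the Elo difference and a rough 95% margin can be recomputed from the W/D/L counts of a single row. This is a minimal sketch using the usual logistic Elo formula and a normal approximation; the rating tool that produced the table fits all games jointly, so its figures differ by a few Elo.

```python
import math


def elo_from_score(score):
    """Elo difference implied by an average score between 0 and 1 (exclusive)."""
    return -400.0 * math.log10(1.0 / score - 1.0)


def elo_with_margin(wins, draws, losses, z=1.96):
    """Return (elo, margin): the Elo difference and half the width of a ~95% interval."""
    games = wins + draws + losses
    score = (wins + 0.5 * draws) / games
    # Per-game variance of the result (1, 0.5 or 0) around the mean score.
    variance = (
        wins * (1.0 - score) ** 2 + draws * (0.5 - score) ** 2 + losses * score ** 2
    ) / games
    std_error = math.sqrt(variance / games)
    low = elo_from_score(score - z * std_error)
    high = elo_from_score(score + z * std_error)
    return elo_from_score(score), (high - low) / 2.0


# Row "90m_d14_nn2_evalsave6": W=187, D=1056, L=757.
elo, margin = elo_with_margin(187, 1056, 757)
print(f"{elo:+.0f} Elo, +/- {margin:.0f}")
```

For this row the sketch lands within a few Elo of the table's -104 with an error of 12.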

Post by deeds »

2020.11.08: Almost the same conditions, but this time I evaluated about 220m EPD from the same previous games plus the 1m games of CCRL 40/40.

Same training command as before but with eval_save_interval 80000000 loss_output_interval 20000000.

Same cutechess command as before.

With only +130m (from 90m to 220m), I got +53 Elo (-104 Elo to -51 Elo):

Code: Select all

   # PLAYER                         :  RATING  ERROR  POINTS  PLAYED   (%)     W     D     L  D(%)  OppAvg  OppN
   1 stockfish 190720 no-nnue       :       0   ----  6182.5   10000  61.8  3401  5563  1036  55.6     -86     5
   2 220m_d14_evalsave4_rejected    :     -58     11   837.5    2000  41.9   262  1151   587  57.5       0     1
   3 220m_d14_evalsave3_rejected    :     -63     11   825.0    2000  41.3   236  1178   586  58.9       0     1
   4 220m_d14_evalsave2             :     -81     11   775.5    2000  38.8   215  1121   664  56.0       0     1
   5 220m_d14_evalsave1             :     -98     11   730.5    2000  36.5   181  1099   720  55.0       0     1
   6 220m_d14_evalsave0             :    -130     12   649.0    2000  32.5   142  1014   844  50.7       0     1

White advantage = 42.06 +/- 2.65
Draw rate (equal opponents) = 50.00 % +/- 0.00
Some stats:
- between 163 and 204 sec/game
- about 134 plies/game
- average depth at 27 plies
- between 1500 and 1800 ms/ply

Post by deeds »

2020.12.07: I evaluated about 350m EPD from the same previous games plus KINGBASE's 1.8m games.

Same training command with eval_save_interval 90m loss_output_interval 30m.

Same cutechess command.

Again with only +130m (from 220m to 350m), I got +59 Elo (-51 Elo to +8 Elo):

Code: Select all

   # PLAYER                         :  RATING  ERROR  POINTS  PLAYED   (%)     W     D     L  D(%)  OppAvg  OppN
   1 350m_d14_evalsave6_rejected    :       8     11  1021.5    2000  51.1   376  1291   333  64.5       0     1
   2 350m_d14_evalsave5             :       4     11  1012.0    2000  50.6   375  1274   351  63.7       0     1
   3 350m_d14_evalsave4_rejected    :       4     11  1010.0    2000  50.5   411  1198   391  59.9       0     1
   4 stockfish 190720 no-nnue       :       0   ----  7209.0   14000  51.5  2876  8666  2458  61.9     -11     7
   5 350m_d14_evalsave3             :       0     12   999.5    2000  50.0   376  1247   377  62.4       0     1
   6 350m_d14_evalsave2_rejected    :     -14     11   960.5    2000  48.0   348  1225   427  61.3       0     1
   7 350m_d14_evalsave1             :     -24     12   933.5    2000  46.7   307  1253   440  62.6       0     1
   8 350m_d14_evalsave0             :     -52     11   854.0    2000  42.7   265  1178   557  58.9       0     1

White advantage = 38.41 +/- 2.15
Draw rate (equal opponents) = 50.00 % +/- 0.00
Some stats:
- 158 sec/game
- about 129 plies/game
- average depth between 27 and 28 plies
- 1500 ms/ply

Post by deeds »

2021.01.16: I evaluated about 480m EPD from the same previous games plus 813k selfplay games.

To produce selfplay games without Stockfish's gensfen tool, I use cutechess:

Code: Select all

cutechess-cli -engine conf="stockfish" -engine conf="stockfish" -each option.Hash=128 depth=14 tc=inf -games 813039 -openings file="813039_court.pgn" start=1 -pgnout "%computername% - selfplay_d14 %date:~6,4%-%date:~3,2%-%date:~0,2%.pgn" fi -recover -concurrency 40 -maxmoves 200 -draw movenumber=40 movecount=5 score=10 -tb "C:\Syzygy" -tbpieces 6 -event %computername% -site "dual xeon e5-2660v3" -ratinginterval 100
813039_court.pgn: all opening sequences of 5 plies, without duplicate final positions (pgn-extract.exe -s -D --fuzzydepth 0...) and without positions that allow a mate in fewer than 15 plies.

Same training command with eval_save_interval 120m loss_output_interval 40m.

Same cutechess command.

With +130m (from 350m to 480m), I got only +12 Elo (+8 Elo to +20 Elo):

Code: Select all

   # PLAYER                             :  RATING  ERROR  POINTS  PLAYED   (%)     W     D     L  D(%)  OppAvg  OppN
   1 480m_d14_nn1_evalsave5_rejected    :      20     11  1056.5    2000  52.8   432  1249   319  62.5       0     1
   2 480m_d14_nn1_evalsave4_rejected    :      10     11  1029.0    2000  51.5   395  1268   337  63.4       0     1
   3 480m_d14_nn1_evalsave3             :       5     11  1014.5    2000  50.7   387  1255   358  62.8       0     1
   4 480m_d14_nn1_evalsave2             :       4     11  1010.0    2000  50.5   383  1254   363  62.7       0     1
   5 stockfish 190720 no-nnue           :       0   ----  6032.0   12000  50.3  2300  7464  2236  62.2      -2     6
   6 480m_d14_nn1_evalsave1             :     -10     11   971.5    2000  48.6   360  1223   417  61.1       0     1
   7 480m_d14_nn1_evalsave0             :     -40     11   886.5    2000  44.3   279  1215   506  60.8       0     1

White advantage = 39.19 +/- 2.23
Draw rate (equal opponents) = 50.00 % +/- 0.00
Some stats :
- 159 sec/game
- about 129 plies/game
- average depth at 27 plies
- 1500 ms/ply
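
Putting the four runs side by side shows how the gain per added position changes. The sketch below uses only the dataset sizes and the Elo figures quoted in the posts' own summary lines (-104, -51, +8, +20); the per-100m normalisation is just one convenient way to compare the steps.

```python
# (dataset size in millions of EPD, best Elo vs. classical as quoted in the posts)
RUNS = [(90, -104), (220, -51), (350, 8), (480, 20)]


def gains(runs):
    """Return (added_millions, elo_gain, elo_per_100m) for each consecutive pair of runs."""
    steps = []
    for (size_before, elo_before), (size_after, elo_after) in zip(runs, runs[1:]):
        added = size_after - size_before
        gain = elo_after - elo_before
        steps.append((added, gain, gain / added * 100))
    return steps


steps = gains(RUNS)
for added, gain, per_100m in steps:
    print(f"+{added}m EPD: {gain:+d} Elo ({per_100m:+.1f} Elo per 100m)")
```

The first two steps bring a similar gain per position, while the last one brings much less, which is what prompts the question of how much data to add next.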

Post by deeds »

It seemed that adding only +130m EPD brought too little Elo relative to the error margin.

So I thought I had to add more than +250m EPD for the next training, and to test the evalsave files with more than 2000 games to get an error margin below the Elo gap.
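
How many games "more than 2000" means can be estimated from the fact that the error margin shrinks roughly with the square root of the number of games. A minimal sketch, assuming the draw rate and score stay about the same as in the 2000-game gauntlets (where the margin was about 11 Elo); the function name is made up for illustration.

```python
import math


def games_needed(current_games, current_margin, target_margin):
    """Games required to reach target_margin, assuming margin ~ 1 / sqrt(games)."""
    return math.ceil(current_games * (current_margin / target_margin) ** 2)


# Halving the margin seen in the tables above (11 Elo with 2000 games):
needed = games_needed(2000, 11, 5.5)
print(needed)
```

Halving the margin therefore costs about four times as many games per evalsave file.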