moonstonelight wrote:
Well, I can always give you a text file full of King & Pawn endgames analysed at depth 20 by Eman. (I selected Eman because its experience feature makes the engine's evals more stable, and I thought that might be useful.)

deeds wrote:
Thank you, but I already have several databases with endgame data:
7men
8men
And thanks to the cutechess tip, I get an average rate of 150 positions/game.

Thank you for your tools. I made my third Kings & Pawns NNUE today. In my first NNUE I made the mistake of changing every score between -8 and 8 to 0, and that was a total failure: because of that zeroing, the net didn't understand the subtlety of endgames. The second one (made with the same data but without any zeroing) was competitive but not as strong as the default. The third one seems very strong. I'm starting my tests now.
Toolkit to train a net without gensfen nor selfplay
In February I plan to evaluate my 5th dataset (2m games from XGFS, a Discord user).
At first sight they contain about 150 plies/game, so I expect about 300m PLAIN TEXT positions, but after duplicate EPD detection I should end up with only 230m to 240m genuinely new positions to add to my current 480m dataset.
The 480m net is +20 elo over the classical nodchip release; I hope the next one will be stronger still.
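The estimate above is plain arithmetic. Here is a minimal sketch of it, using only the figures quoted in the post; the variable and function names are just illustrative:

```python
# Back-of-the-envelope estimate of the 5th dataset, using the figures from the post.
GAMES = 2_000_000              # games in the new dataset
PLIES_PER_GAME = 150           # rough average seen at first sight
CURRENT_DATASET = 480_000_000  # positions already collected

raw_positions = GAMES * PLIES_PER_GAME  # before duplicate EPD detection


def summarize(new_positions: int) -> tuple[float, int]:
    """Return (implied duplicate rate, dataset size after merging)."""
    duplicate_rate = 1 - new_positions / raw_positions
    return duplicate_rate, CURRENT_DATASET + new_positions


for expected_new in (230_000_000, 240_000_000):
    rate, total = summarize(expected_new)
    print(f"{expected_new:,} new -> {rate:.1%} duplicates, {total:,} total")
```

In other words, the 230m to 240m figure corresponds to expecting roughly 20% to 23% duplicates, and the merged dataset would land around 710m to 720m positions.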
Another tip for the nnue_eval / nnue_extract tools
For those who use these tools with a large PGN file containing several thousand games, processing might take several hours or even days, even if you have configured several threads.
If you don't want to tie up your machines for that long, you can split your large PGN file with pgn-extract.
Example with a large PGN file which contains 3000 games:
dir /b /o *.pgn > liste.lst
pgn-extract.exe -D -bl20 --fixresulttags -#300 -fliste.lst
pause
At the end you get 10 small PGN files containing only 300 games each, and you can process them simultaneously.
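If pgn-extract is not at hand, the splitting step alone can be sketched in a few lines of Python. This is only an illustration, not a replacement for the command above: it assumes every game starts with an [Event ...] tag at the beginning of a line, and it does not remove duplicates or short games the way -D and -bl20 do. The function name split_pgn is made up for this example.

```python
def split_pgn(text: str, games_per_file: int) -> list[str]:
    """Split PGN text into chunks holding at most games_per_file games each.

    Assumption: each game begins with an [Event ...] tag at the start of a line.
    """
    games: list[str] = []
    current: list[str] = []
    for line in text.splitlines(keepends=True):
        # A new [Event tag closes the previous game (if any).
        if line.startswith("[Event ") and current:
            games.append("".join(current))
            current = []
        current.append(line)
    if current:
        games.append("".join(current))
    return [
        "".join(games[i:i + games_per_file])
        for i in range(0, len(games), games_per_file)
    ]


# Tiny in-memory demo: 7 dummy games split into chunks of 3.
sample = "".join(
    f'[Event "game {i}"]\n[Result "*"]\n\n1. e4 e5 *\n\n' for i in range(7)
)
chunks = split_pgn(sample, 3)
print([chunk.count("[Event ") for chunk in chunks])
```

Each chunk can then be written to its own file and fed to a separate instance of the tool.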
Hello, thanks for all your tips and tools. I have some questions that you may be able to answer.
I would like to make a net with all the CCRL games, the 40/4 ones if I am not mistaken. Are they suitable for nnue_extract, I mean the commented ones?
How long should that take on my 2-core machine (4 threads, 1.7 GHz)?
Will dividing the whole CCRL PGN make the process faster?
Is it better to extract the games directly with nnue_extract or to rescore them?
Thanks in advance.
IbaiBuR wrote:
Hello, thanks for all your tips and tools. I have some questions that you may be able to answer.
I would like to make a net with all the CCRL games, the 40/4 ones if I am not mistaken. Are they suitable for nnue_extract, I mean the commented ones?

Yes, the nnue_extract tool handles the commented PGN format from CCRL 40/15.

IbaiBuR wrote:
How long should that take on my 2-core machine (4 threads, 1.7 GHz)?
Will dividing the whole CCRL PGN make the process faster?

Hard to say offhand. We can test nnue_extract with a small sample, 2021-01.commented.[9559].pgn.7z (January 2021, 9559 games, 5.49 MB).
Here, with 1 thread @ 2.9 GHz, it takes 2 min:
253 032 games/h (9 559), 7 803 epd/s @ avg. D25 (1 061 220), 111 plies/game, 14,15% rejected (150 140 with 16% duplicated) :
finished (02 min) @ 2021-01.commented.[9559].pgn : 9 559 games, 1 061 220 epd, 23 572 duplicated / 150 140 rejected
So here, the 1 211 492 commented games divided by a rate of 253 032 games/hour should take less than 5 hours with only one thread @ 2.9 GHz.
If you decompress and then split the commented PGN file from CCRL 40/15 (700 MB) into 2 or 4 parts, you can process several parts at the same time and save a lot of time.

IbaiBuR wrote:
Is it better to extract the games directly with nnue_extract or to rescore them?
Thanks in advance.

nnue_extract only extracts the data, so we get PLAIN TEXT data from engines of various strengths, with scores at various depths, etc.
nnue_eval rescores at fixed depth, so we get PLAIN TEXT data from a single engine strength, with scores at the same depth, etc.
So far, my "fixed depth" nets are stronger than my "various depth" ones, but that also depends on the dataset.
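For anyone who wants to redo the time estimate with their own numbers, here is a minimal sketch of the calculation (1 211 492 games at the measured 253 032 games/hour). The function name and the workers parameter are made up for this example, and dividing by the number of parts assumes ideal scaling, which a 2-core machine at 1.7 GHz will not fully reach.

```python
def hours_needed(total_games: int, games_per_hour: float, workers: int = 1) -> float:
    """Estimated wall-clock hours, assuming each worker runs at the measured rate."""
    if workers < 1:
        raise ValueError("workers must be at least 1")
    return total_games / games_per_hour / workers


TOTAL_GAMES = 1_211_492    # commented CCRL games quoted in the post
MEASURED_RATE = 253_032    # games/hour measured on 1 thread @ 2.9 GHz

for parts in (1, 2, 4):
    print(f"{parts} part(s): {hours_needed(TOTAL_GAMES, MEASURED_RATE, parts):.2f} h")
```

On a slower CPU the measured rate has to be replaced by one taken from a short test run on that machine, as with the January 2021 sample.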
New release of the nnue_eval tool:
- some speedup / cleanup in the code
- new statistic ("XXeme" shows, on average, which position of a game is the first one evaluated, i.e. the first non-duplicated one)
- bigger write buffer (5 MB) to reduce disk stress
New release of the nnue_clean tool:
- less system stress (idle priority by default)
- slower display refresh rate to suit faster systems
- bigger write buffer (2 MB) to reduce disk stress
ENJOY!
New release of the nnue_extract tool:
- less system stress (idle priority by default)
- slower display refresh rate to suit faster systems
- new statistic ("XXeme" shows, on average, which position of a game is the first one extracted, i.e. the first one that is neither a book move nor a duplicate)
- better compatibility with nnue_eval's INI file
ENJOY!
PS: nnue_eval, nnue_extract, nnue_clean, plain_stats
For use with nnue_eval, you can find more than 250m uncommented games at:
https://mega.nz/folder/C4QDRKwZ#cFLrIihXlFymGtl1dPrNfg
With an average of about 100 plies/game, and once duplicated EPDs are removed, you could train some nets on 10b to 15b PLAIN TEXT positions.
From my experience, once nnue_eval has evaluated 500m PLAIN TEXT positions at d14, the resulting net is already stronger than the classical eval.
As my previous MEGA account is already full, you can find the other data here:
https://mega.nz/folder/lQEDFQDY#6smKdBKT1szgiNJaaCHVTQ
As a precaution, before my 2nd MEGA account fills up, here is my GDRIVE folder:
https://drive.google.com/drive/folders/1Yn2Nz7FAESP3b-E8kuFdGDSKEHx8iaqF?usp=sharing
2020.10.30: I used nnue_eval to evaluate about 90m EPD from various games (private tourneys) and from FCP's 537k games.
Then I used:
Code: Select all
stockfish.bmi2.halfkp_256x2-32-32.nnue-learn.2020-07-19.exe
uci
setoption name SkipLoadingEval value true
setoption name Threads value 12
isready
learn targetdir training loop 100 batchsize 1000000 eta 1 lambda 1 eval_limit 32000 nn_batch_size 1000 newbob_decay 0.5 eval_save_interval 30000000 loss_output_interval 10000000 mirror_percentage 50 validation_set_file_name validation\1m_d18.bin
In order to find the best net, I used:
Code: Select all
cutechess-cli -engine conf="stockfish" -engine conf="evalsave0" -engine conf="evalsave1" -engine conf="evalsave3" -tournament gauntlet -each option.Hash=128 tc=60+1 -games 2000 -openings file="SALC_V3_10moves.pgn" start=1 -pgnout "%computername% - 90m_d14_nn2_256x2_evalsave %date:~6,4%-%date:~3,2%-%date:~0,2%.pgn" fi -repeat -recover -concurrency 39 -maxmoves 200 -draw movenumber=40 movecount=5 score=10 -tb "C:\Syzygy" -tbpieces 6 -event %computername% -site "dual xeon e5-2660v3" -ratinginterval 10
And finally I got this:
Code: Select all
 # PLAYER                          : RATING ERROR  POINTS PLAYED   (%)     W     D     L  D(%) OppAvg OppN
 1 stockfish 190720 no-nnue        :      0  ---- 10944.0  16000  68.4  6971  7946  1083  49.7   -139    8
 2 90m_d14_nn2_evalsave6           :   -104    12   715.0   2000  35.8   187  1056   757  52.8      0    1
 3 90m_d14_nn2_evalsave7_rejected  :   -105    12   713.5   2000  35.7   182  1063   755  53.1      0    1
 4 90m_d14_nn2_evalsave5           :   -112    11   696.5   2000  34.8   154  1085   761  54.3      0    1
 5 90m_d14_nn2_evalsave4           :   -123    12   668.0   2000  33.4   140  1056   804  52.8      0    1
 6 90m_d14_nn2_evalsave3           :   -127    12   658.0   2000  32.9   142  1032   826  51.6      0    1
 7 90m_d14_nn2_evalsave2           :   -142    12   620.5   2000  31.0   137   967   896  48.4      0    1
 8 90m_d14_nn2_evalsave1           :   -171    13   553.0   2000  27.6    96   914   990  45.7      0    1
 9 90m_d14_nn2_evalsave0           :   -230    13   431.5   2000  21.6    45   773  1182  38.6      0    1
White advantage = 45.64 +/- 2.15
Draw rate (equal opponents) = 50.00 % +/- 0.00
Some stats:
- between 169 and 206 sec/game
- about 136 plies/game
- average depth between 27 and 28 plies
- between 1500 and 1800 ms/ply
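As a rough cross-check of the RATING column in the table above, the score percentage can be converted to an Elo difference with the usual logistic formula. This is a minimal sketch under that assumption; it is not how the rating tool itself computes its numbers (the tool fits all players together), so small differences from the table are expected. The function name is made up for this example.

```python
import math


def elo_from_score(points: float, games: int) -> float:
    """Elo difference implied by a score fraction, using the logistic model."""
    score = points / games
    if not 0.0 < score < 1.0:
        raise ValueError("score must be strictly between 0% and 100%")
    return -400.0 * math.log10(1.0 / score - 1.0)


# Points out of 2000 games, taken from the 90m table: best and worst evalsave.
for name, points in (("evalsave6", 715.0), ("evalsave0", 431.5)):
    print(name, round(elo_from_score(points, 2000)))
```

This gives roughly -102 and -224, against -104 and -230 in the table, so the simple formula is close enough for a quick sanity check.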
2020.11.08: Almost the same conditions, but this time I evaluated about 220m EPD from the same previous games plus the CCRL 40-40's 1m games.
Same training command as before but with eval_save_interval 80 000 000 loss_output_interval 20 000 000.
Same cutechess command as before.
With only +130m (from 90m to 220m), I got +53 elo (-104 elo to -51 elo):
Code: Select all
 # PLAYER                          : RATING ERROR  POINTS PLAYED   (%)     W     D     L  D(%) OppAvg OppN
 1 stockfish 190720 no-nnue        :      0  ----  6182.5  10000  61.8  3401  5563  1036  55.6    -86    5
 2 220m_d14_evalsave4_rejected     :    -58    11   837.5   2000  41.9   262  1151   587  57.5      0    1
 3 220m_d14_evalsave3_rejected     :    -63    11   825.0   2000  41.3   236  1178   586  58.9      0    1
 4 220m_d14_evalsave2              :    -81    11   775.5   2000  38.8   215  1121   664  56.0      0    1
 5 220m_d14_evalsave1              :    -98    11   730.5   2000  36.5   181  1099   720  55.0      0    1
 6 220m_d14_evalsave0              :   -130    12   649.0   2000  32.5   142  1014   844  50.7      0    1
White advantage = 42.06 +/- 2.65
Draw rate (equal opponents) = 50.00 % +/- 0.00
Some stats:
- between 163 and 204 sec/game
- about 134 plies/game
- average depth of 27 plies
- between 1500 and 1800 ms/ply
2020.12.07: I evaluated about 350m EPD from the same previous games plus KINGBASE's 1.8m games.
Same training command with eval_save_interval 90m loss_output_interval 30m.
Same cutechess command.
Again with only +130m (from 220m to 350m), I got +59 elo (-51 elo to +8 elo):
Code: Select all
 # PLAYER                          : RATING ERROR  POINTS PLAYED   (%)     W     D     L  D(%) OppAvg OppN
 1 350m_d14_evalsave6_rejected     :      8    11  1021.5   2000  51.1   376  1291   333  64.5      0    1
 2 350m_d14_evalsave5              :      4    11  1012.0   2000  50.6   375  1274   351  63.7      0    1
 3 350m_d14_evalsave4_rejected     :      4    11  1010.0   2000  50.5   411  1198   391  59.9      0    1
 4 stockfish 190720 no-nnue        :      0  ----  7209.0  14000  51.5  2876  8666  2458  61.9    -11    7
 5 350m_d14_evalsave3              :      0    12   999.5   2000  50.0   376  1247   377  62.4      0    1
 6 350m_d14_evalsave2_rejected     :    -14    11   960.5   2000  48.0   348  1225   427  61.3      0    1
 7 350m_d14_evalsave1              :    -24    12   933.5   2000  46.7   307  1253   440  62.6      0    1
 8 350m_d14_evalsave0              :    -52    11   854.0   2000  42.7   265  1178   557  58.9      0    1
White advantage = 38.41 +/- 2.15
Draw rate (equal opponents) = 50.00 % +/- 0.00
Some stats:
- 158 sec/game
- about 129 plies/game
- average depth between 27 and 28 plies
- 1500 ms/ply
2021.01.16: I evaluated about 480m EPD from the same previous games plus 813k selfplay games.
To produce selfplay games without stockfish's gensfen tool, I use cutechess:
Code: Select all
cutechess-cli -engine conf="stockfish" -engine conf="stockfish" -each option.Hash=128 depth=14 tc=inf -games 813039 -openings file="813039_court.pgn" start=1 -pgnout "%computername% - selfplay_d14 %date:~6,4%-%date:~3,2%-%date:~0,2%.pgn" fi -recover -concurrency 40 -maxmoves 200 -draw movenumber=40 movecount=5 score=10 -tb "C:\Syzygy" -tbpieces 6 -event %computername% -site "dual xeon e5-2660v3" -ratinginterval 100
813039_court.pgn: all the move sequences of the first 5 plies, without the duplicated ending positions (pgn-extract.exe -s -D --fuzzydepth 0...) and without the positions which can be mated in less than 15 plies.
Same training command with eval_save_interval 120m loss_output_interval 40m.
Same cutechess command.
With +130m (from 350m to 480m), I got only +12 elo (+8 elo to +20 elo):
Code: Select all
 # PLAYER                          : RATING ERROR  POINTS PLAYED   (%)     W     D     L  D(%) OppAvg OppN
 1 480m_d14_nn1_evalsave5_rejected :     20    11  1056.5   2000  52.8   432  1249   319  62.5      0    1
 2 480m_d14_nn1_evalsave4_rejected :     10    11  1029.0   2000  51.5   395  1268   337  63.4      0    1
 3 480m_d14_nn1_evalsave3          :      5    11  1014.5   2000  50.7   387  1255   358  62.8      0    1
 4 480m_d14_nn1_evalsave2          :      4    11  1010.0   2000  50.5   383  1254   363  62.7      0    1
 5 stockfish 190720 no-nnue        :      0  ----  6032.0  12000  50.3  2300  7464  2236  62.2     -2    6
 6 480m_d14_nn1_evalsave1          :    -10    11   971.5   2000  48.6   360  1223   417  61.1      0    1
 7 480m_d14_nn1_evalsave0          :    -40    11   886.5   2000  44.3   279  1215   506  60.8      0    1
White advantage = 39.19 +/- 2.23
Draw rate (equal opponents) = 50.00 % +/- 0.00
Some stats:
- 159 sec/game
- about 129 plies/game
- average depth of 27 plies
- 1500 ms/ply
It seemed that adding only +130m EPD brought too little elo compared with the error margin.
So I thought I had to add more than +250m EPD for the next training, and to test the evalsave files with more than 2000 games to get an error margin smaller than the elo gap.
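To get a feel for how many games are needed before the error margin drops below a given elo gap, here is a minimal sketch. It uses a normal approximation on the W/D/L counts and the logistic Elo curve, which is an assumption on my side and not the rating tool's own method, so its margins come out a little different from the ERROR column. Function names are made up for this example.

```python
import math


def elo_margin(wins: int, draws: int, losses: int, z: float = 1.96) -> float:
    """Approximate 95% error margin in Elo for a single match result."""
    n = wins + draws + losses
    score = (wins + 0.5 * draws) / n
    # Per-game variance of the result (1, 0.5 or 0) around the mean score.
    var = (
        wins * (1.0 - score) ** 2
        + draws * (0.5 - score) ** 2
        + losses * score ** 2
    ) / n
    stderr = math.sqrt(var / n)
    # Slope of the logistic Elo curve at this score (Elo per unit of score).
    slope = 400.0 / (math.log(10) * score * (1.0 - score))
    return z * stderr * slope


def games_needed(wins: int, draws: int, losses: int, target_margin: float) -> int:
    """Games needed for the margin to shrink to target_margin (same W/D/L ratios)."""
    n = wins + draws + losses
    current = elo_margin(wins, draws, losses)
    # The margin shrinks with the square root of the number of games.
    return math.ceil(n * (current / target_margin) ** 2)


# W/D/L of 480m_d14_nn1_evalsave5_rejected over 2000 games.
print(round(elo_margin(432, 1249, 319), 1))
print(games_needed(432, 1249, 319, target_margin=5.0))
```

With these assumptions, halving the margin takes about four times as many games, which is why 2000 games per evalsave stops being enough once the nets are only 5 to 10 elo apart.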