Android engines tests. - Page 55 - Outskirts CheSS ForuM

Android engines tests.

Moderators: Elijah, Igbo, timetraveller


Joachim26: I've been banned!; Points: 6 000,00; Posts: 95; Joined: 01/02/2020, 23:58; Status: Offline (Active 1 Year, 2 Weeks, 2 Days, 23 Hours, 54 Minutes ago); Topics: 0; Reputation: 189; Has thanked: 29 times; Been thanked: 214 times

Post by Joachim26 » 26/02/2023, 15:25

SkyNet wrote: ↑26/02/2023, 12:51 That was Leo's compilation, next Joakim's.
Tablet - Samsung Galaxy A10.1 (32GB) - Android 9 / 64bit - 2GB RAM - 8x Cortex-A53 - 1,59 GHz
GUI=C4a - Hash 128 mb - 2min+1sec - Book=topGM 4moves - 6 cores per engine - ponder=NO
syzygy=NO, Adjudication Rule=Resign - Move 25, Move Count 3, Score (in cp) 480
Draw=Move Number 30, Move count 4, Score (in cp) 0
Code: Select all
   # PLAYER                  :  RATING  ERROR  PLAYED   (%)    W    D    L  D(%)  CFS(%)
   1 Spider 1.1dT 64 NEON    :    3213      4     622  53.5   72  522   28  83.9     100
   2 Cfish Sopel 64 NEON     :    3187      4     622  46.5   28  522   72  83.9     ---
Games https://pixeldrain.com/u/9uNZAhBc

I was very curious, so since last night I have run three 100-game tournaments on CfA with the above two engines and a CFishNN (with AspirationLine64_2). The TC was level 1 ("Stufe 1", fast) and the built-in book was used.
My 3rd tournament gave a result in accordance with the above table: +1 point, +3 +/-37 Elo, LOS: 54.8% in favor of Spider.
These 100 games alone are obviously nearly meaningless statistically.
Why? Because both engines are roughly equal in strength.
Since I already know the outcome of the other two matches, my curiosity is gone. So let's keep up the tension and wait for Alex to complete these tournaments.
I'll give only one hint: the CFishNNs are all tuned with a lot of FatTitz patches (and AspLines) and usually do not play in the same ballpark as their parents. But if they do so anyway, then they want to have some fun lol
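
For anyone who wants to reproduce an LOS (likelihood of superiority) figure, here is a minimal sketch. It assumes the common normal approximation, in which only decisive games count; the function name `los` is my own choice. The win/loss split of the 100-game match is not given in the post, so the example uses the W/L counts from the quoted Spider vs Cfish Sopel table instead.

```python
import math

def los(wins: int, losses: int) -> float:
    """Likelihood of superiority under the usual normal approximation.

    Draws carry no information about which engine is stronger, so only
    the decisive games enter the formula.
    """
    decisive = wins + losses
    if decisive == 0:
        return 0.5  # no decisive games: no evidence either way
    return 0.5 * (1.0 + math.erf((wins - losses) / math.sqrt(2.0 * decisive)))

# W/L counts from the quoted table (Spider: 72 wins, 28 losses)
print(f"LOS = {los(72, 28):.4%}")
```

With equal wins and losses the function returns exactly 0.5, which is why a short match between near-equal engines says so little.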

SkyNet: Forum Contributions; Points: 33 205,00; Posts: 325; Joined: 11/11/2022, 1:55; Status: Offline (Active 22 Hours, 57 Minutes ago); Medals: 3; Topics: 6; Reputation: 3125; Location: 3th dimension.; Has thanked: 5177 times; Been thanked: 2350 times

Post by SkyNet » 02/03/2023, 7:01

Tablet - Samsung Galaxy A10.1 (32GB) - Android 9 / 64bit - 2GB RAM - 8x Cortex-A53 - 1,59 GHz
GUI=C4a - Hash 128 mb - 2min+1sec - Book=topGM 4moves - 6 cores per engine - ponder=NO
syzygy=NO, Adjudication Rule=Resign - Move 25, Move Count 3, Score (in cp) 480
Draw=Move Number 30, Move count 4, Score (in cp) 0

Code: Select all

   # PLAYER                  :  RATING  ERROR  PLAYED   (%)    W    D    L  D(%)  CFS(%)
   1 Cfish 200223 64 NEON    :    3210      4     945  53.0   90  821   34  86.9     100
   2 Spider 1.1dT 64 NEON    :    3190      4     945  47.0   34  821   90  86.9     ---

Games https://pixeldrain.com/u/EodF1vCH

Post by Joachim26 » 02/03/2023, 8:15

SkyNet wrote: ↑02/03/2023, 7:01 Tablet - Samsung Galaxy A10.1 (32GB) - Android 9 / 64bit - 2GB RAM - 8x Cortex-A53 - 1,59 GHz

My CFishNN_dev, aka 'Cfish 200223 64 NEON', won by +20 ELO. I expected more.

But SkyNet's result is nearly inside the error bars of my two tournaments, so +20 ELO is a nice round value, easy to remember.

I learned that unique and simple names for future 'releases' will make things easier. SkyNet already suggested this in an earlier post. Thanks. Thus, I'll make a release later today with the name
CFishNN_230302
(Naming it CFish 14 or CFishNN 14.1 would be misleading, since there is no directly corresponding SF version as in the past.
I should also change the internal name to the above scheme, but not today.)
I hope that we will soon also get an OEX version from Archimedes, named CFishNN_230302.apk, or one of a later release.

Post by Joachim26 » 02/03/2023, 11:27

Alex, which app did you use?

I used CfA 6.2.1, which is AFAIK the last Chess for Android version that supports non-OEX engines.
The big drawback of this version is that at the end of a tournament it shows only two values, the points of both opponents, but not the number of drawn games! Thus the error cannot be calculated exactly. I had to estimate the draw rate and could then calculate the approximate error.

BTW: If I calculate ELO and error from your above results (W=90, D=821, L=34), I get

+21 +/-8 ELO

which is in full and better agreement with my two tournaments than your calculation. To be honest, my expectation was something like 30 to 35 ELO for TCs in the range of LTC (fishtest). However, your TC was even longer, so +21 ELO makes sense.
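
The calculation can be sketched as follows. This is only one common way to do it (logistic Elo from the score fraction, with a 95% interval from the per-game variance) and not necessarily the exact method used here or in Ordo; the function names are my own.

```python
import math

def elo_from_score(score: float) -> float:
    """Convert a score fraction (0 < score < 1) into an Elo difference."""
    return -400.0 * math.log10(1.0 / score - 1.0)

def elo_with_error(wins: int, draws: int, losses: int, z: float = 1.96):
    """Return (elo, half_width) for a match, seen from the first engine's side."""
    games = wins + draws + losses
    score = (wins + 0.5 * draws) / games
    # Per-game variance of the result (1, 0.5 or 0) around the mean score
    variance = (
        wins * (1.0 - score) ** 2
        + draws * (0.5 - score) ** 2
        + losses * (0.0 - score) ** 2
    ) / games
    margin = z * math.sqrt(variance / games)  # interval on the score fraction
    lower = elo_from_score(score - margin)
    upper = elo_from_score(score + margin)
    return elo_from_score(score), (upper - lower) / 2.0

elo, err = elo_with_error(90, 821, 34)
print(f"{elo:+.0f} +/-{err:.0f} ELO")
```

For W=90, D=821, L=34 this gives about +21 +/-8. Note that the draw count enters the variance directly, which is why a GUI that reports only the points forces you to estimate the draw rate first.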

Thanks for this very detailed test, Alex.

Will a test of the remaining two opponents follow?
I don't need it; too many commits lie between these two, so they are in two different ballparks.

Archimedes: Forum Contributions; Points: 42 582,00; Posts: 2059; Joined: 04/11/2019, 21:13; Status: Offline (Active 4 Hours, 54 Minutes ago); Medals: 2; Topics: 158; Reputation: 7111; Been thanked: 6477 times

Post by Archimedes » 02/03/2023, 16:43

CfishNN 230302 vs Stockfish 230227

Tournament was played on Motorola Moto G7 Power (Android 10, Snapdragon 632) with Termux and the CETSA script.
Offset for rating of one of the engines was not set in Bayeselo.

c-chess-cli.conf:

Code: Select all

../bin/c-chess-cli/$CETSA_ABI/c-chess-cli \
  -each tc=10+0.1 option.Hash=16 option.Threads=1 \
  -engine cmd=../uci/CfishNN \
  -engine cmd=../uci/Stockfish \
  -games 500 \
  -concurrency 2 \
  -openings file=../epd/IM_4mvs.epd order=random -repeat \
  -resign number=150 count=5 score=900 \
  -draw number=150 count=5 score=5 \
  -pgn games.pgn

c-chess-cli.txt:

Code: Select all

3-fold repetition: 169
50 moves rule: 52
Adjudication: 10
Checkmate: 194
Insufficient material: 63
Rules infraction: 0
Stalemate: 12
Time forfeit: 0
Unterminated: 0

Bayeselo.txt:

Code: Select all

Rank Name              Rating   Δ     +    -     #     Σ    Σ%     W    L    D   W%    =%   OppR 
---------------------------------------------------------------------------------------------------------
   1 CfishNN 230302     3100   0.0   10   10   500  251.0  50.2   98   96  306  19.6  61.2  3100 
   2 Stockfish 230227   3100   0.8   10   10   500  249.0  49.8   96   98  306  19.2  61.2  3100 
---------------------------------------------------------------------------------------------------------
  Δ = delta from the next higher rated opponent
  # = number of games played
  Σ = total score, 1 point for win, 1/2 point for draw

Games:
https://app.box.com/s/o47ud1j5dx6dvekdem0v5b524bc7o6c4

Post by Joachim26 » 02/03/2023, 16:56

Thanks for this match, Archimedes.

Wow!

If it weren't my engine, I wouldn't believe it.

Edit: But CFishNN is nearly twice as fast as SF. At longer TCs SF will win, I'm quite sure.

Post by SkyNet » 02/03/2023, 23:17

Joachim26 wrote: ↑02/03/2023, 8:15 My CFishNN_dev, aka 'Cfish 200223 64 NEON', won by +20 ELO. I expected more.

The number of games is small; in addition, the number of cores was 6 (instead of the usual 1) and the TC was much longer, so the Elo difference could be bigger after, e.g., 3000 games. Unfortunately, it is impossible to run such tests at TC 2+1 with only one game at a time: the above test took me 3 days.

Joachim26 wrote: ↑02/03/2023, 11:27 Alex, which app did you use?
-----------------------------------------
Will a test of the remaining two opponents follow?

I used CfA 6.2.1; the results of the test were calculated by Ordo (on my PC).
--------------------------------
Leo's compilation vs yours? This is not necessary, because we already have an approximate result, and I also do not want to ruin my tablet.

Post by Joachim26 » 03/03/2023, 9:16

Archimedes wrote: ↑02/03/2023, 16:43 CfishNN 230302 vs Stockfish 230227


Thanks again for this tournament. Nice, but now it's time to think about what could be the reason for the surprising result.
Pure luck for CFishNN? The error bar is +/-20 ELO (right?). Thus luck may explain it, but it is not a good explanation. Another possibility is that something was wrong with SF230227.
I'm sure you checked the speed of your compile, but did you also check the bench of your build? To be honest, when I just compile SFnps, I very rarely (to put it positively) do this check, because if it compiles and plays a game vs itself on DroidFish, why should anything be wrong...
Your bench is probably
https://github.com/Joachim26/StockfishNPS/commit/876906965b8d552866486c0e6eda1184fdb1d636

i.e. bench: 4814343
But there were other updates shortly before, so I'm not sure which version you compiled.
If your bench is OK, then my standard formulation will be:
More tests are needed...
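
A small sketch of how such a bench check could be automated: run the engine's `bench` command and compare the reported "Nodes searched" value with the expected signature. The function names and the engine path in the usage note are hypothetical; only the bench value 4814343 comes from the post.

```python
import re
import subprocess

EXPECTED_BENCH = 4814343  # bench value quoted in the post above

def parse_bench_nodes(output: str) -> int:
    """Extract the node count from the 'Nodes searched' line of bench output."""
    match = re.search(r"Nodes searched\s*:\s*(\d+)", output)
    if match is None:
        raise ValueError("no 'Nodes searched' line found")
    return int(match.group(1))

def bench_signature(engine_path: str) -> int:
    """Run '<engine> bench' and return the reported node count."""
    result = subprocess.run(
        [engine_path, "bench"], capture_output=True, text=True, check=True
    )
    # Stockfish writes the bench summary to stderr; search both streams to be safe
    return parse_bench_nodes(result.stderr + result.stdout)
```

Usage would look like `bench_signature("./stockfish") == EXPECTED_BENCH`, where `./stockfish` stands for wherever your build lives. A mismatch suggests a different commit or a miscompile, while a match only confirms the search behaves as expected, not that the build is fast.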

Post by SkyNet » 03/03/2023, 9:31

Joachim26 wrote: ↑03/03/2023, 9:16
Pure luck of CFishNN? The error bar is +/-20 ELO (right?).

I suppose it is mainly the speed advantage of CF, and there is a potential ~40 Elo difference in the air, so a lot more games are required.
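
To put a rough number on "a lot more games", here is a sketch under simplifying assumptions: both engines are treated as equally strong, the draw rate is fixed, and the Elo curve is linearised around a 50% score. The function names are my own; the 61.2% draw rate is taken from Archimedes' 500-game table.

```python
import math

# Slope of the Elo curve at a 50% score: d(Elo)/d(score) = 400 / (ln(10) * 0.25)
ELO_PER_SCORE_AT_50 = 400.0 / (math.log(10) * 0.25)

def elo_margin(games: int, draw_rate: float, z: float = 1.96) -> float:
    """Approximate 95% half-width (in Elo) for two roughly equal engines."""
    # With equal strength, every decisive game deviates by 0.5 from the mean score
    variance = (1.0 - draw_rate) * 0.25
    return z * ELO_PER_SCORE_AT_50 * math.sqrt(variance / games)

def games_needed(target_margin: float, draw_rate: float, z: float = 1.96) -> int:
    """Number of games needed to shrink the half-width to target_margin Elo."""
    variance = (1.0 - draw_rate) * 0.25
    return math.ceil(variance * (z * ELO_PER_SCORE_AT_50 / target_margin) ** 2)

print(round(elo_margin(500, 0.612)))  # 500 games at a 61.2% draw rate
print(games_needed(10, 0.612))        # games for a +/-10 Elo interval
```

Under these assumptions 500 games give about +/-19 Elo, in line with the +/-20 guessed above, and halving that to +/-10 takes roughly 1800 games, since the margin shrinks only with the square root of the game count.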

Post by Archimedes » 03/03/2023, 9:54

The tested version of Stockfish (230227) shows me 4814343.

Post by Joachim26 » 03/03/2023, 9:58

Archimedes wrote: ↑03/03/2023, 9:54 The tested version of Stockfish (230227) shows me 4814343.

Fine! So my standard answer:
More tests are needed...

On Android! That's the BIG problem!

Post by Joachim26 » 03/03/2023, 10:06

On Windows these tests are much easier to do, and I have already run some, with the not very surprising result that Stockfish is roughly +50 ELO above CFishNN. For example, this short match:
http://outskirts.altervista.org/forum/viewtopic.php?p=53248&view=single_post#p53248

Post by Archimedes » 04/03/2023, 11:57

Cfish 210626 vs CfishNN 230302

Tournament was played on Samsung Galaxy Tab A7 (Android 12, Snapdragon 662) with Termux and the CETSA script.
Offset for rating of one of the engines was not set in Bayeselo.

c-chess-cli.conf:

Code: Select all

../bin/c-chess-cli/$CETSA_ABI/c-chess-cli \
  -each tc=10+0.1 option.Hash=16 option.Threads=1 \
  -engine cmd=../uci/Cfish \
  -engine cmd=../uci/CfishNN \
  -games 500 \
  -concurrency 2 \
  -openings file=../epd/IM_4mvs.epd order=random -repeat \
  -resign number=150 count=5 score=900 \
  -draw number=150 count=5 score=5 \
  -pgn games.pgn

c-chess-cli.txt:

Code: Select all

3-fold repetition: 184
50 moves rule: 24
Adjudication: 9
Checkmate: 215
Insufficient material: 62
Rules infraction: 0
Stalemate: 6
Time forfeit: 0
Unterminated: 0

Bayeselo.txt:

Code: Select all

Rank Name            Rating   Δ     +    -     #     Σ    Σ%     W    L    D   W%    =%   OppR 
---------------------------------------------------------------------------------------------------------
   1 CfishNN 230302   3130   0.0   10   10   500  293.0  58.6  151   65  284  30.2  56.8  3070 
   2 Cfish 210626     3070  59.1   10   10   500  207.0  41.4   65  151  284  13.0  56.8  3130 
---------------------------------------------------------------------------------------------------------
  Δ = delta from the next higher rated opponent
  # = number of games played
  Σ = total score, 1 point for win, 1/2 point for draw

Games:
https://app.box.com/s/aknhjcq0944020hc6judrtwle3pe15vs

Post by Archimedes » 10/03/2023, 21:12

CfishNN 230309 vs Stockfish 15.1

Tournament was played on Motorola Moto G7 Power (Android 10, Snapdragon 632) with Termux and the CETSA script.
Offset for rating of one of the engines was not set in Bayeselo.

c-chess-cli.conf:

Code: Select all

../bin/c-chess-cli/$CETSA_ABI/c-chess-cli \
  -each tc=10+0.1 option.Hash=16 option.Threads=1 \
  -engine cmd=../uci/CfishNN \
  -engine cmd=../uci/Stockfish \
  -games 750 \
  -concurrency 2 \
  -openings file=../epd/IM_4mvs.epd order=random -repeat \
  -resign number=150 count=5 score=900 \
  -draw number=150 count=5 score=5 \
  -pgn games.pgn

c-chess-cli.txt:

Code: Select all

3-fold repetition: 293
50 moves rule: 50
Adjudication: 11
Checkmate: 294
Insufficient material: 89
Rules infraction: 0
Stalemate: 13
Time forfeit: 0
Unterminated: 0

Bayeselo.txt:

Code: Select all

Rank Name            Rating   Δ     +    -     #     Σ    Σ%     W    L    D   W%    =%   OppR 
---------------------------------------------------------------------------------------------------------
   1 CfishNN 230309   3111   0.0    8    8   750  399.0  53.2  172  124  454  22.9  60.5  3089 
   2 Stockfish 15.1   3089  21.8    8    8   750  351.0  46.8  124  172  454  16.5  60.5  3111 
---------------------------------------------------------------------------------------------------------
  Δ = delta from the next higher rated opponent
  # = number of games played
  Σ = total score, 1 point for win, 1/2 point for draw

Games:
https://app.box.com/s/njtbwp32wy9znf6d4voskyqcwphgkwnj

fuddur: Forum Contributions; Points: 8 879,00; Posts: 28; Joined: 28/01/2020, 13:45; Status: Offline (Active 1 Week, 3 Days, 18 Hours, 23 Minutes ago); Topics: 0; Reputation: 37; Has thanked: 5 times; Been thanked: 38 times

Post by fuddur » 16/03/2023, 4:11

Caissa 1.5 vs Frozenight 6.0

Tournament was played on Realme 5 Pro (Android 11, Snapdragon 712) with Termux and the CETSA script.

c-chess-cli.conf

Code: Select all

../bin/c-chess-cli/$CETSA_ABI/c-chess-cli \
  -each tc=10+0.1 option.Hash=16 option.Threads=1 \
  -engine cmd=../uci/Caissa \
  -engine cmd=../uci/Frozenight \
  -games 1000 \
  -concurrency 2 \
  -openings file=../epd/IM_4mvs.epd order=random \
  -repeat \
  -resign number=200 count=5 score=900 \
  -draw number=200 count=5 score=5 \
  -pgn games.pgn 1

c-chess-cli.txt

Code: Select all

 3-fold repetition: 276
50 moves rule: 89
Adjudication: 13
Checkmate: 560
Insufficient material: 56
Rules infraction: 0
Stalemate: 2
Time forfeit: 4
Unterminated: 0

Bayeselo.txt

Code: Select all

 Rank Name              Rating   Δ     +    -     #     Σ    Σ%     W    L    D   W%    =%   OppR 
---------------------------------------------------------------------------------------------------------
   1 Frozenight 6.0.0   3117   0.0    9    9  1000  548.5  54.9  331  234  435  33.1  43.5  3083 
   2 Caissa 1.5         3083  33.2    9    9  1000  451.5  45.1  234  331  435  23.4  43.5  3117 
---------------------------------------------------------------------------------------------------------
  Δ = delta from the next higher rated opponent
  # = number of games played
  Σ = total score, 1 point for win, 1/2 point for draw

Games: https://pixeldrain.com/u/E1XTNRLa
