GameKnot related: engine analysis
stalhandske
07-May-18, 22:37

engine analysis
For quite some time now I have consistently noticed weaknesses in the engine that GK provides for analysis of GK games. I don't have the opportunity to compare it with commercially available chess engines, but its consistently poor performance (over at least 2-3 years) prompted this query. What is the formal "strength" of the GK engine? Would it be too much to ask to have it amended/upgraded? In my opinion the availability of computer analysis of completed games at GK is a very important and strong feature as such, but the quality should be better than it is now.
Gameknot.com
08-May-18, 09:04

The commercially available chess engines (Houdini, Deep Rybka, etc.) cost $80-$100 and require a powerful multi-core computer with a lot of RAM which will cost another $2,000 or more. Our own chess engine might not be as powerful, but it is included with your premium membership starting at $6.99 per month.

Also, depending on how many post-game analyses are pending at the time, we have to reduce the per-move thinking time for the engine in order to accommodate everyone in a reasonable time. There is a minimum depth limit to maintain quality -- i.e. the minimum number of moves that will be analyzed ahead, but if there is time, the engine will continue analyzing. If there are only a few pending requests, your analysis will be allowed to run much longer. Which generally means that the strength of the analysis might vary somewhat from game to game.
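As a rough illustration of the "minimum depth, then keep going while there is time" idea described above, here is a minimal sketch. It is not GK's actual code: `analyse_move`, `search_to_depth`, `min_depth`, `time_budget_s` and `max_depth` are hypothetical names, and the search itself is left as a callable supplied by the caller.

```python
import time
from typing import Callable, Tuple


def analyse_move(
    search_to_depth: Callable[[int], float],
    min_depth: int,
    time_budget_s: float,
    max_depth: int = 64,
    clock: Callable[[], float] = time.monotonic,
) -> Tuple[int, float]:
    """Deepen one ply at a time; return (depth reached, last score).

    Assumption: the search always reaches min_depth, even if that
    overruns the time budget. Beyond min_depth it keeps deepening
    only while time remains.
    """
    start = clock()
    depth = 0
    score = 0.0
    while depth < max_depth:
        # Stop only once the quality floor is met AND the budget is spent.
        if depth >= min_depth and clock() - start >= time_budget_s:
            break
        depth += 1
        score = search_to_depth(depth)  # hypothetical engine call
    return depth, score
```

With a short budget the result stops at the minimum depth; with a longer budget (few pending requests) the same position is searched deeper, which is why the strength can vary from game to game.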

We haven't officially tested the strength of our engine vs commercial chess engines head-to-head, but in game analysis tests, it was returning practically identical results to Stockfish on several hundred random games we used in our test.
stalhandske
08-May-18, 09:29

Many thanks for the reply. I suspect that it is a relatively short per-move-thinking-time that has resulted in the performance I have experienced. Especially since you in your tests seem to have obtained comparable results to commercial engines - I guess your tests were performed without "competition" with others and therefore had longer thinking times?  
penelope80
19-May-18, 11:02

Considering that (if I understood correctly) the accuracy of the computer analysis depends on how many requests are pending at the time, it would be better, as far as I'm concerned, to have fewer analyses a month, but more accurate ones.
tugger
19-May-18, 14:39

I estimate the engine strength to be around 2000.

I say this because I find myself at an interesting stage of my chess here at gameknot... I'm starting to "beat" the engine.

I have had my doubts for a while, but this game is what confirmed it for me...
game

13. Bxf7+!
- engine flags it as a mistake, my score goes from -0.66 to -2.61
13... Kxf7 (forced, Kh8 mates in six)
14. Qh5+ Kg8 (forced)
- now my score is =0

This is a winning attack for white, I'm near certain that black cannot refute this.

Of course, the engine found checkmates that I didn't, one of which I really should be seeing (34. Qd6), but I was on auto-pilot by this point, I knew Bf4 was winning too.

But the key tactical move that won me this game, the GK engine failed to recognise it as a winning move, so sadly I think it's time for me to find an alternative for post-game analysis.
tugger
19-May-18, 14:46

Another engine fail in that game is 25. Qxg7+, which is also flagged as a mistake, since I have a mate in 12 instead. However, after black makes his forced move Ke8, I have 26. Nxc6 and now I have mate in ten... better than what the engine found before!

So yeah, I'm definitely hitting its limit.
stalhandske
20-May-18, 04:09

Tugger's examples match very well with the numerous ones I have from my own games, and which caused me to write the "complaint" in the first place.

I am not at all sure of this, but I spontaneously feel that if tugger is right and the rating of the GK computer is around 2000 in practice (which I think is quite a reasonable estimate), it might as well be removed altogether, because then it is doing more harm than good.
tugger
20-May-18, 04:31

I wouldn't say that. It's certainly stronger than I am when it comes to the endgame, but that's to be expected because there are fewer lines of analysis, meaning brute force will cover a greater percentage of the potential outcomes.

I've been happy with the engine here up to my current rating, and only now am I finding that I need an alternative engine for post-game analysis. I consider it a good thing... it shows I'm still improving.

I think it's a fine tool for sub-2000 players, but it shouldn't be considered as 100% reliable.

For the 2000+, perhaps investing in something more accurate is necessary for post-game analysis.
myrydin
20-May-18, 07:27

That’s a shame, as I’ve been relying on it to give accurate feedback on whether sacrifices were dubious or not.

Along similar lines to Penelope, I’d be happy to wait a few more days for a better analysis. I guess GK has this sort of tricky dilemma all the time: how quickly, how strong, etc.
tugger
20-May-18, 08:32

I'd be inclined to think that the vast majority of those who regularly seek post-game engine analysis would prioritise accuracy over time. If it really is as simple as allowing it more analysis time, then just make us wait longer. If it takes a day or more, so be it.
Gameknot.com
20-May-18, 09:51

Our engine strength is most definitely above 2,000. Again, I'm unable to commit to an exact rating as we have never done official head-to-head tests, but it should be on par with the top engines out there. The limiting factor here is time control, as we have hundreds of game analysis requests coming in every hour, and we cannot allocate an hour of analysis time for each game, as it would stretch out the waiting time to months, if not years.

It's not as simple as allowing more analysis time, "and if it takes a day or more longer, so be it". We optimize the analysis time per game to make sure we can process the daily queue within a day. That means the analysis queue is not growing. But if you increase the analysis time by just 10%, that means you can only process 90% of the daily queue, and the next day you need to process 110%. And the next day it's 121%, and the next day it's 133% (the percentage compounds each day). Obviously, it's not sustainable.
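The queue argument above can be sketched with a toy simulation. This is only a model under stated assumptions, not GK's scheduler: it assumes a fixed daily compute budget that exactly clears one day's requests at the default per-game time, and the names `backlog_after`, `time_factor` and `daily_requests` are hypothetical. The exact day-by-day figures depend on the model assumed, but the conclusion is the same: once per-game time goes up, the backlog never shrinks.

```python
from typing import List


def backlog_after(days: int, time_factor: float, daily_requests: float = 1.0) -> List[float]:
    """Return the end-of-day backlog (in units of daily_requests) for each day.

    Assumption: capacity is fixed, so spending time_factor times longer
    per game means only 1 / time_factor of a day's requests get processed.
    """
    capacity = daily_requests / time_factor
    backlog = 0.0
    history = []
    for _ in range(days):
        backlog += daily_requests          # new requests arrive
        backlog -= min(backlog, capacity)  # process as many as capacity allows
        history.append(backlog)
    return history


if __name__ == "__main__":
    # Default time: the queue is cleared every day.
    print(backlog_after(5, 1.0))
    # 10% more time per game: the backlog grows every single day.
    print([round(b, 3) for b in backlog_after(5, 1.1)])
```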

And due to the way the move tree grows, you need to double your analysis time to look just another move ahead once you get deep enough. Which means after a certain rating, to improve your analysis by a modest amount, you need to double your analysis time, which means to halve the number of games processed. There is a very fine balance between processing all games within a day, and never processing any new games for years.
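To put rough numbers on the doubling described above, here is a small sketch. The doubling factor comes from the explanation itself; everything else (`base_depth`, `base_time_s`, `moves_per_game`, `budget_s`, and the function names) is a hypothetical placeholder chosen only for illustration.

```python
def time_for_depth(depth: int, base_depth: int, base_time_s: float, growth: float = 2.0) -> float:
    """Per-move time needed to reach `depth`, assuming each extra move
    of lookahead multiplies the time by `growth` (2.0 = doubling)."""
    return base_time_s * growth ** (depth - base_depth)


def games_per_day(
    depth: int,
    base_depth: int,
    base_time_s: float,
    moves_per_game: int = 40,    # assumed average game length
    budget_s: float = 86400.0,   # assumed: one engine running for 24 hours
) -> float:
    """How many games fit in the daily budget at a given search depth."""
    per_game_s = time_for_depth(depth, base_depth, base_time_s) * moves_per_game
    return budget_s / per_game_s


if __name__ == "__main__":
    for extra in range(4):
        d = 12 + extra  # 12 is an arbitrary illustrative base depth
        print(d, time_for_depth(d, 12, 1.0), games_per_day(d, 12, 1.0))
```

Each extra move of depth halves the number of games that can be processed, so three extra moves of lookahead would cut throughput to one eighth under these assumptions.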

If you require the absolute top-notch analysis, by all means, do acquire a multi-core computer with gobs of RAM, and purchase the best commercial chess engine, and let it run for a couple of hours to analyse each game. Regretfully, we cannot compete with that, as we have to accommodate thousands of players simultaneously.
stalhandske
20-May-18, 20:56

I now understand the problem; thanks GK for the explanation! Even though the nominal GK engine strength is much higher than about 2000, it is generally not higher than that in practice because time restrictions (due to high demand) prevent the analysis from going very deep. This explains very well the example by tugger, where a move was flagged as a mistake by the engine analysis, only for the evaluation to be corrected in his favour after two more (forced) moves. I have had numerous examples precisely like that in the past, which made me make the enquiry.
tugger
21-May-18, 01:22

Thanks GK for taking the time to explain in detail, also thanks for the re-analysis of the game I posted in this thread. It's still not perfect but it's a much better analysis than before.

I understand the problem you guys have with this. I'm certainly not complaining, just sharing my experience. I'll continue to use the GK engine, just with a sense of awareness that it's not perfect.
myrydin
21-May-18, 01:57

I’ll second that, it’s obviously a major undertaking with considerable logistics. I still think the analysis is pretty useful as a guide.
Gameknot.com
21-May-18, 09:33

We have now implemented a small change to the post-game analysis that adds extra per-move analysis time based on the average rating of players in the game. If both players are rated 1400 or below, the default time is used (which should be plenty for this rating level), and for players rated higher, extra time is added on a sliding scale, up to double the time for ratings 2400 and above. The game referenced by tugger was re-analysed using the new time settings as a test. Unfortunately we do not have the server resources to re-analyse the thousands of past games, so this change only applies for new post-game analysis requests going forward.
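The sliding scale described above might look something like the following sketch. Only the endpoints come from the announcement (default time at 1400 or below, double time at 2400 and above, based on the players' average rating); the linear interpolation in between and the names `time_multiplier`, `low`, `high` and `max_multiplier` are assumptions, not GK's actual formula.

```python
def time_multiplier(
    white_rating: int,
    black_rating: int,
    low: int = 1400,
    high: int = 2400,
    max_multiplier: float = 2.0,
) -> float:
    """Per-move analysis time multiplier from the players' average rating.

    Assumption: the scale between `low` and `high` is linear.
    """
    avg = (white_rating + black_rating) / 2
    if avg <= low:
        return 1.0              # default time
    if avg >= high:
        return max_multiplier   # capped at double time
    fraction = (avg - low) / (high - low)
    return 1.0 + fraction * (max_multiplier - 1.0)


if __name__ == "__main__":
    print(time_multiplier(1400, 1400))  # default time
    print(time_multiplier(1900, 1900))  # halfway up the scale
    print(time_multiplier(2400, 2600))  # capped at double
```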
penelope80
21-May-18, 09:37

That's very good news! Thanks Gameknot!
tugger
21-May-18, 10:10

Thank you for listening to us and seeking solutions.
stalhandske
21-May-18, 20:34

Great news - thanks very much GK!
myrydin
22-May-18, 00:00

This improvement has made all the difference for me, so I’d like to add my thanks to GK too.
ludus_puerorum
14-Jun-18, 05:49

How do I interpret an engine analysis? We are told, "Your game will be analyzed on the server, and a score will be assigned to each move," so I expect to see numbers indicating scores. Instead I see tiny green horizontal bars and pink horizontal bars. What do they mean? Why are some of the moves highlighted in yellow? I made a move in miguelpereira (1519) vs. i_play_slowly (1478) that I felt compelled me to resign, but the blunder check did not suggest an alternative move. Does blunder have a precise definition, and, if so, what is it? Thanks in advance for all responses.
penelope80
14-Jun-18, 05:56

If you click on "computer analysis" (below the chessboard)
you will read:
-mistake/and the sequence of moves
-best/that says the best move with the continuation
I hope it helps
ludus_puerorum
14-Jun-18, 06:08

Thank you, penelope80!
archduke_piccolo
15-Jun-18, 18:49

I like the amendment...
... but really the limitations of the GK engine were due, I think, to the 'horizon effect' of the ply depth of analysis.

It was really in the matter of middlegame tactics that the GK engine was at its best, but also where its horizon effect weakness was most apparent. Nevertheless, I found it generally reliable tactically.

I still use it to check out my games, and found with some recent games, my performance wasn't as strong as I thought at the time!
lord_shiva
10-Jul-18, 20:28

<<If tugger is right and the rating of the GK computer is around 2000 in practice (which I think is quite a reasonable estimate), it might as well be removed altogether, because then it is doing more harm than good.>>

If the analysis is useful for someone rated at 1400, should it be eliminated because a 2400 player doesn't get that much out of it? I think there are lots more 1400 rated players on GK than 2400 rated players.

But I like GK's rating/depth solution--it is a clever idea that should work well enough.
stalhandske
10-Jul-18, 20:41

<should it be eliminated because a 2400 player doesn't get that much out of it? >

My wording was imprecise. But a 1400 rated player will equally be served much better by a better engine. My aim was to point out some truly embarrassing positional judgements that the older version made, and which it made - of course - equally irrespective of the player's rating.

It was great that GK was able to improve the engine usage in a difficult situation. I had not realised that the issue limiting its use was the large number of users.
haratta
26-Jul-18, 07:41

If people have problems with the analysis strength of the GK engine, why can't they just export the finished game from the GK website and let it be analysed, either with a strong engine on their own computer or on a website that offers analysis with the strongest possible engines?
lord_shiva
26-Jul-18, 10:16

Strong Engine Analysis
What are some web sites that offer analysis?
yon_cassius
26-Jul-18, 10:50

I can think of a couple (that I visit), don't know if mentioning them in the forum would constitute "advertising" though.

Nick  
utopianfragments
06-Aug-18, 00:45

thanks
For me, as a player, the upgrade is not so important. I am way beyond understanding it. But I find it wonderful that you have taken the steps to deal with it.

tugger
16-Dec-18, 04:19

game

Here's another sub-2000 performance from the engine.

Black has just played 51... Kf6, and I duly resigned.
GK's engine suggests 51... Kf6 is a mistake, scoring just +0.56. My calculation (and that of my opponent) was that black queens and I'm lost.
GK engine recommends 51... Bxd5, which I was hoping black plays, and which I calculated as drawn. GK scores it +1.68.

So I took a look at Stockfish, and the results are that bookie and I are correct in our calculations, and that the GK engine is wrong. SF scores the following...
51... Kf6 +9.66
51... Kf7 +7.40
51... Bxd5 +0.41

How is the GK engine able to fail at endgame so badly? I can understand engines struggling for accuracy in middlegame, but I shouldn't be able to touch an engine when it comes to endgame, and for sure I can't touch Stockfish.

For clarity, it was bookie who requested the analysis, he is a premium subscriber. I currently am not. Does that affect bookie's quality of analysis?