Schachcomputer.info Community - Single post view - Test: Some thoughts about Strength Tests

03.06.2024, 10:12

CC 7

Re: Some thoughts about Strength Tests

Quote from spacious_mind

Although a player might perform significantly better or worse from one game to the next, Elo assumed that the mean value of the performances of any given player changes only slowly over time.

The situation for chess computers is even better: their strength and performance never change in the long run, which is optimal for Elo calculations.
I like the Elo system a lot. As far as I can judge, the ratings and rankings it produces are pretty fair and very useful for making predictions.
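
As a rough illustration of such predictions, here is a small Python sketch of the standard Elo expected-score formula. The function name and the sample ratings are assumptions chosen for illustration and are not taken from any rating list.

```python
def expected_score(rating_a: float, rating_b: float) -> float:
    """Expected score of A against B under the standard logistic Elo curve."""
    return 1.0 / (1.0 + 10.0 ** ((rating_b - rating_a) / 400.0))


# Hypothetical ratings: a 2000-rated machine against an 1800-rated one.
print(round(expected_score(2000, 1800), 2))  # 0.76
```

A 200-point gap therefore predicts roughly 76% of the points for the stronger side over a long match.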

Quote from spacious_mind

Everything written in the above summary is in my opinion correct, except for one key assumption which is nowadays completely outdated:

"A further assumption is necessary because chess performance in the above sense is still not measurable. One cannot look at a sequence of moves and derive a number to represent that player's skill."

To move or not to move - that's the question!

Why not - let's give it a try!
Thanks to your work (and Eric's) it is possible to see the results, make the comparison and reach a final judgement.

I do believe that it is also possible to use a large position test (with at least 200 positions and lots of chess computers) to get quite reliable Elo ratings
with Elo-Stat by Frank Schubert.
But that's another story.

Quote from spacious_mind

Some day in the future (maybe not in our lifetime), I fully expect that you will be able to take the complete game database of for example Wilhelm Steinitz or Garry Kasparov, plug it into the
master computer program and get the strength results, overall and year by year for each and every player, and accurately compare the evolving improvements of chess knowledge throughout history for players and computer programs.

Elo and imagination - I do like your fantasy... "Nick Verne"

Some day in the future, would it be possible to run our Schachcomputer.info tournament list through it...?

Quote from spacious_mind

Another is that the requirements of a rating system for updating current players' ratings on a day-to-day basis are different from those of a rating system for players in some historical epoch.

That's not clear to me.
You need a starting point to initialize a "historical epoch Elo",
and you needed a starting point just as well in the modern Elo calculations ("initializing 1966-69").
Where is the difference at the start, apart from the smaller database?

Quote from spacious_mind

While we can safely bet that Elo did a careful job of rating historical players, inevitably many choices have to be made in such an attempt, and other approaches could be taken. Indeed, Clarke
made an attempt at rating players in history before Elo. Several other approaches have recently appeared, including the Chessmetrics system by Jeff Sonas, the Glicko system(s) of Mark Glickman,
a version of which has been adopted by the US Chess Federation, and an unnamed rating method applied to results from 1836-1863 and published online on the Avler Chess Forum by Jeremy Spinrad.
By and large these others have applied sequential updating methods, though the new (2005) incarnation of the Chessmetrics system is an interesting exception (see below) and Spinrad achieved
a kind of simultaneous rating (at least more symmetric in time) by running an update algorithm alternately forwards and backwards in time.

There are pros and cons to all of these."

I have to confess that I don't know most of these systems at all, or only in a very rudimentary way.
Many years ago there was an article in CSS about Jeff Sonas - his contribution did not convince me.
Sonas Rating Formula - better than Elo? No - IMHO.
AFAIK John Nunn criticized a change concerning the prediction of game results made by Jeff Sonas (Nunn on the K-factor: show me the proof, 2009).

Mark Glickman - AFAIK his Glicko systems (unlike the Elo system) aren't totally stable - they can produce flawed results when the data is manipulated intentionally - of course that's not the case here.
https://www.reddit.com/r/TheSilphRoa...jor_flaw_in_a/

Is there a Glicko program that can use PGN data to calculate ratings, or are only Excel sheets provided?

Quote from spacious_mind

... as there is a difference between performance and strength. For example, De Labourdannais may have performed at a level of 2600 against his opponents.
But the strength of play in the early 1800s may also certainly have been lower than what it is today. This is the difference between strength and performance.

Absolutely, there is a difference between performance and strength!
The Elo system only measures performance - nothing else.

So IMHO that's the drawback of your system based on the moves of a game - you are looking not at the performance but only at the strength of a player,
which is not the aim of the Elo system.

To put it more drastically: the strength of a person or computer is totally irrelevant for calculating Elo - only performance matters!
Nothing but 1-0, 1/2-1/2, 0-1, simple as that!
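
A minimal sketch of that point, assuming the textbook incremental Elo update with a hypothetical K-factor of 16 and made-up ratings (this is not how Elo-Stat or any particular list computes its numbers): the only inputs are the two ratings and the game result, and the moves never appear.

```python
def expected_score(rating_a: float, rating_b: float) -> float:
    """Expected score of A against B under the standard logistic Elo curve."""
    return 1.0 / (1.0 + 10.0 ** ((rating_b - rating_a) / 400.0))


def update_rating(rating: float, opponent_rating: float, score: float,
                  k: float = 16.0) -> float:
    """Return the new rating after one game.

    score is 1.0 for a win, 0.5 for a draw, 0.0 for a loss.
    k is an assumed K-factor; real rating lists choose their own value.
    """
    return rating + k * (score - expected_score(rating, opponent_rating))


# Hypothetical example: three games against equally rated opponents.
rating = 1900.0
for opponent, score in [(1900.0, 1.0), (1900.0, 0.5), (1900.0, 0.0)]:
    rating = update_rating(rating, opponent, score)
    print(round(rating, 1))
```

However brilliant or poor the moves were, two games with the same result against the same opponent change the rating by exactly the same amount.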

Quote from spacious_mind

In Summary similar to EDOChess who rates his results with EDO, as my strength tests are not performance based, I have changed all references to ELO to STR which is short for STRENGTH or
SPACIOUSMIND TESTS RATING. Whichever you prefer.
...In my Tests I have changed ELO Sneak to STR Sneak, which you will see on future updates that I will provide for download.

That's the logical consequence!
You shouldn't call it Elo... STR... or maybe StrElo.

Quote from spacious_mind

My STR ratings formula under Renaissance was of course created to approximate ELO, in order to have a fun comparison. The exact same calculations will be used for all future tests and the test
results may likely differ through the history of the games; however, the constant will always remain the same, which is the distance between the Master and its subjects, game by game and average.

Keep going with your epoch tests!

The main thing, after all - it's fun!

Regards
Hans-Jürgen
