My Chess Improvement Process: Step 1 - Data Mining

This post is part of a series describing the process I am using to attempt improvement in chess. The main post gives an overview of the process while this article explains one of the steps in detail.

The first step on our chess improvement journey is to get a factual understanding of the strengths and weaknesses in our game. I am speaking from lifelong experience when I say that this is difficult to intuit. It is easy to remember recent blunders and draw conclusions that do not match your performance over a longer period of time. The cognitive term for this is recency bias.

So, the question we are faced with is: How do you measure your chess ability?

There are a great many approaches for this, and my choice is not necessarily the best fit for everyone.

My approach is to identify all mistakes from my competitive chess games with a classical time control. In essence, I take the idea of the "List of Mistakes" that Axel Smith describes in the excellent book Pump Up Your Rating, and turn it up a few notches by adding quantifiable metadata for each mistake.

I selected my rated OTB games (tournaments and league games) as well as online games that were part of a tournament. The important thing about this selection is that the games should have been played with serious focus and deep concentration, so that the quality of the data can be relied upon. If the quality of the data that goes into the analysis is low, the analysis will be useless to you. Garbage in, garbage out.

Now that we have a list of games to extract mistakes from, we need to define what constitutes a mistake. My approach is to import each game into lichess, have the server analyze it, and let it present a list of inaccuracies, mistakes and blunders.

lichess analysis output

For each of the identified sub-optimal moves, I manually add metadata, such as:

  • the severity of the error
  • the type of mistake (tactical, positional)
  • the sub-type of mistake (for example candidate move, shallow calculation, exchange, etc.)
  • the phase of the game (opening, middlegame, endgame)
  • a free-text comment

This exercise is very useful in itself, in that it forces you to analyze your errors systematically. I think there are lessons to be learned just by noting down the errors in a list. This is the reason I scrapped my attempts at automating this process with code. By deliberately keeping this a labor-intensive process, I pick up insights just by getting the source data for the analysis in place.

Each game is also added to a separate list, with information such as color, result, tournament, date, opponent Elo, and the evaluation of the position at the end of the opening phase and of the middlegame phase, as defined by the lichess analysis.
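To make the two lists concrete, here is a minimal sketch of how one row of each could be laid out. It only illustrates the record structure; the entries are still filled in by hand. All class names, field names and sample values are hypothetical, and treating the severity as a loss of expected score is my assumption for this sketch.

```python
from dataclasses import dataclass
from datetime import date
from enum import Enum


class MistakeType(Enum):
    TACTICAL = "tactical"
    POSITIONAL = "positional"


class Phase(Enum):
    OPENING = "opening"
    MIDDLEGAME = "middlegame"
    ENDGAME = "endgame"


@dataclass
class Game:
    """One row in the list of games (field names are hypothetical)."""
    game_id: str
    color: str                    # "white" or "black"
    result: str                   # "1-0", "0-1" or "1/2-1/2"
    tournament: str
    played_on: date
    opponent_elo: int
    eval_after_opening: float     # engine evaluation in pawns
    eval_after_middlegame: float  # engine evaluation in pawns


@dataclass
class Mistake:
    """One row in the list of mistakes (field names are hypothetical)."""
    game_id: str                  # links the mistake to its game
    move_number: int
    severity: float               # assumed: loss of expected score, e.g. 0.14
    mistake_type: MistakeType
    sub_type: str                 # e.g. "candidate move", "shallow calculation"
    phase: Phase
    comment: str = ""


# Made-up sample values, only to show how the two records link up.
example_game = Game("example-1", "white", "0-1", "Example Open",
                    date(2020, 1, 1), 1900, 0.3, -1.2)
example_mistake = Mistake(example_game.game_id, 23, 0.14,
                          MistakeType.TACTICAL, "shallow calculation",
                          Phase.MIDDLEGAME, "Missed the reply on move 24.")
```

Keeping the game identifier on every mistake means the two lists can later be joined, for instance to compare mistakes by color or by opponent strength.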

One important detail to consider is the unit in which to quantify an error, as well as the evaluation of a position. In the old days we used ECO symbols, and today we are used to engine evaluations, but neither approach is ideal for quantifying an error. For example, let's say you play a move that reduces the engine evaluation by 1.0. The loss of winning chances differs hugely depending on whether the initial evaluation is 0.5 or 4.5. In the former case (the evaluation goes from 0.5 to -0.5) the expected score is reduced by 14%, while in the latter case (the evaluation goes from 4.5 to 3.5) the loss is only 5%. This conversion between engine evaluation and winning chances assumes that the players are of equal strength, and the formula could be improved for uneven matchups.

For this reason, I quantify each mistake as the loss in my expected score, so a 14% error means that on average I am expected to score 0.14 points less after playing the erroneous move.
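The conversion from engine evaluation to expected score can be sketched in a few lines. The formula itself is not spelled out here, so the one below is an assumption: a logistic mapping often quoted in chess programming, W = 1 / (1 + 10^(-p/4)) with p the evaluation in pawns, which reproduces both numbers from the example above. The function names are hypothetical.

```python
def expected_score(eval_pawns: float) -> float:
    """Map an engine evaluation (in pawns, from my point of view) to an
    expected score between 0 and 1, assuming equally strong players.

    Assumed formula: W = 1 / (1 + 10 ** (-p / 4)).
    """
    return 1.0 / (1.0 + 10.0 ** (-eval_pawns / 4.0))


def expected_score_loss(eval_before: float, eval_after: float) -> float:
    """Loss of expected score caused by a move that changes the
    evaluation from eval_before to eval_after."""
    return expected_score(eval_before) - expected_score(eval_after)


# The two cases from the text: the same 1.0 drop in evaluation
# costs very different amounts of expected score.
print(round(expected_score_loss(0.5, -0.5), 2))  # 0.14
print(round(expected_score_loss(4.5, 3.5), 2))   # 0.05
```

Because the curve flattens as the evaluation grows, the same drop in evaluation matters far less in an already winning position than in a balanced one.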

In the next post I will dive into the exciting world of exploring this data to find insights and actionable information!
