Rybka - for the serious chess player.

Rybka 3 Persistent Hash

Introduction

Rybka's persistent hash capability allows Rybka to benefit from her previous work when she is asked to analyze positions for a second time or when she is asked to analyze positions whose proper assessment depends on previously-analyzed positions. Persistent hash makes Rybka more efficient in these scenarios.

Most of the persistent hash capability operates seamlessly, without requiring any special user control. There are however scenarios where user input is needed to maximize the performance of the persistent hash. In addition, users can share persistent hash files, so that user A can benefit from the analysis done by user B. Persistent hash files can also be merged, so that user A can benefit from analysis done separately by users B and C. These various scenarios are the topic of this document.

Basic Steps

For normal users, the following basic steps are appropriate and more or less sufficient for reasonable performance:

1) Set the (Persistent Hash File) engine parameter to where you want the persistent hash file to be placed. It should point directly to the file itself (i.e. not just the directory). By convention, persistent hash files have an extension of .rph.

2) Check the (Persistent Hash Enabled) box.

3) Set the (Saved Hash File) engine parameter to where you want hash contents to be saved. The path should point directly to the file itself (i.e. not just the directory). By convention, saved hash files have an extension of .rsh.

4) Whenever you have performed a deep analysis on some specific position and expect to want Rybka to access this work later, invoke the (Save Hash) button from the engine parameter list.

5) When you want to restore the analysis saved in step #4, make sure that the (Preserve Analysis) box is checked and invoke the (Load Hash) button from the engine parameter list.

Note that some new interfaces which are tailored to Rybka (i.e. interfaces sold by Convekta & ChessBase together with Rybka 3) will provide special GUI support for these steps.
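
For interfaces without such special support, the basic steps can be pictured as a short sequence of UCI commands. The following is only a sketch: it assumes that the engine parameters are exposed as UCI options under exactly the names used in this document, that check boxes take true/false values, and that buttons are invoked with a value-less setoption. The helper names and the file paths are hypothetical.

```python
def setoption(name, value=None):
    """Format a UCI 'setoption' command. Buttons are sent without a value."""
    if value is None:
        return f"setoption name {name}"
    if isinstance(value, bool):
        value = "true" if value else "false"
    return f"setoption name {name} value {value}"


def basic_setup(persistent_path, saved_path):
    """Steps 1-3: point the engine at both files and enable persistent hash."""
    return [
        setoption("Persistent Hash File", persistent_path),
        setoption("Persistent Hash Enabled", True),
        setoption("Saved Hash File", saved_path),
    ]


def save_analysis():
    """Step 4: press the (Save Hash) button."""
    return [setoption("Save Hash")]


def restore_analysis():
    """Step 5: check (Preserve Analysis), then press (Load Hash)."""
    return [setoption("Preserve Analysis", True), setoption("Load Hash")]


# Hypothetical paths, following the .rph / .rsh naming convention.
for command in basic_setup("C:/chess/main.rph", "C:/chess/sicilian.rsh"):
    print(command)
```

Whether an interface sends these commands literally depends on the interface; most expose them through an engine parameter dialog instead.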

Introduction (Advanced)

The rest of this document is for power users.

Hash Table

Every top modern chess program has what is typically called a hash table. This term is actually not very descriptive, as hashing is a general-purpose technique with wide application. Anyway, a chess engine's hash table stores conclusions about positions visited during the search. The engine takes advantage of these conclusions when the positions are revisited later in the search, either in subsequent iterations or due to a transposition. This information is extremely valuable and contributes greatly to the efficiency of a search.

The problem with the above is that the hash table has a limited size - it must fit in the computer's memory. Hash table entries constantly have to be discarded in favor of higher-priority entries. In this prioritization process, there is a strong preference for newer entries, and old entries usually don't last very long.
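
To make the size limitation concrete, here is a minimal sketch of a fixed-size hash table in which the newest entry always wins its slot. It is not Rybka's actual data structure: the Entry fields, the HashTable class and the "key modulo size" slot rule are all illustrative assumptions.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Entry:
    key: int    # hypothetical hash key identifying the position
    depth: int  # search depth behind the conclusion
    score: int  # evaluation, e.g. in centipawns
    age: int    # counter value when the entry was written


class HashTable:
    def __init__(self, size: int):
        # The number of slots is fixed: it has to fit in memory.
        self.slots: List[Optional[Entry]] = [None] * size

    def store(self, entry: Entry) -> None:
        # Simplest possible policy: the newest entry overwrites the slot.
        self.slots[entry.key % len(self.slots)] = entry

    def probe(self, key: int) -> Optional[Entry]:
        entry = self.slots[key % len(self.slots)]
        return entry if entry is not None and entry.key == key else None


table = HashTable(size=4)
table.store(Entry(key=10, depth=20, score=35, age=1))
table.store(Entry(key=14, depth=3, score=-10, age=2))  # same slot as key 10
print(table.probe(10))  # None: the deep entry has been discarded
print(table.probe(14))
```

Here twenty plies of work on the first position are lost to a three-ply entry simply because the latter arrived later.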

In many cases, however, users would like to be able to have the engine take advantage of work which was done some time ago. In other words, they would like older hash entries to be better preserved. There are three ways in which this is supported in Rybka.

Preserve Analysis

One Rybka user option for helping to deal with this issue is the (Preserve Analysis) engine parameter. When this is checked, Rybka puts a higher priority on high-quality (i.e. deep) entries and less priority on recent entries. This prioritization scheme is less efficient from the point of view of game play, where positions arise once and are quickly left behind forever. During interactive analysis, where the user will return again and again to the same few positions, it is more efficient.
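
The difference between the two prioritization schemes can be sketched as a single replacement decision. This is a deliberate simplification and not Rybka's actual rule: it assumes that without (Preserve Analysis) only recency counts and that with it only depth counts, whereas the real engine presumably weighs both. The Entry tuple and should_replace function are hypothetical names.

```python
from collections import namedtuple

# Hypothetical entry: search depth and a counter recording when it was written.
Entry = namedtuple("Entry", "depth age")


def should_replace(old, new, preserve_analysis):
    """Decide whether 'new' may overwrite 'old' in a full slot."""
    if old is None:
        return True
    if preserve_analysis:
        # Favor high-quality (deep) entries.
        return new.depth >= old.depth
    # Default: favor recent entries.
    return new.age >= old.age


deep_old = Entry(depth=24, age=1)
shallow_new = Entry(depth=6, age=9)
print(should_replace(deep_old, shallow_new, preserve_analysis=False))  # True
print(should_replace(deep_old, shallow_new, preserve_analysis=True))   # False
```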

Preserve analysis is however not a universal solution.

First, it has a tendency after some time to fill the hash table with old, high-quality entries, interfering with efficient analysis of new positions. For this reason, after finishing an analysis with (Preserve Analysis) selected, it is best to clear the hash table before moving to a completely new position.

Second, preserve analysis does not allow analysis to be saved from one engine session to the next. It also does not allow users to share, merge and catalogue their analysis. That is what persistent hash is for.

Downward & Upward Propagation

Before we get to the persistent hash itself, let's consider the question of when we most want Rybka to remember her old analysis. There are two scenarios.

The first is what I call downward propagation. In this scenario, Rybka has analyzed some root position in great detail, and we want her to remember the little details which she put together in the course of that analysis. For example, we had Rybka analyze some root position for three hours, and now we want to essentially browse her analysis under that root position, expecting her to quickly remember everything rather than having to recreate it. The term downward propagation refers to the fact that the analysis should be remembered at a point which is 'downstream' (i.e. lower in the variation tree) from the point of the original analysis.

The second scenario is what I call upward propagation. In this scenario, Rybka has come to a conclusion about some position, and we want her to remember this conclusion when we later ask her to analyze an earlier position. The term upward propagation refers to the fact that the analysis should be remembered at a point which is 'upstream' (i.e. higher in the variation tree) from the point of the original analysis.

As it turns out, the mechanism needed to deal with downward propagation is completely different from the mechanism needed to deal with upward propagation.
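
Upward propagation can be illustrated with a toy search over a two-move tree. Everything here is hypothetical (the tree, the scores and the function name); the point is only that one remembered conclusion about a later position changes the assessment of the earlier one.

```python
# Toy variation tree: from "root" the side to move can reach "a" or "b".
TREE = {"root": ["a", "b"], "a": [], "b": []}

# Shallow evaluations, each from the point of view of the side to move there.
STATIC = {"a": 10, "b": -5}


def search(node, remembered):
    """Negamax that trusts remembered conclusions over fresh shallow ones."""
    if node in remembered:
        return remembered[node]
    children = TREE[node]
    if not children:
        return STATIC[node]
    return max(-search(child, remembered) for child in children)


print(search("root", {}))         # 5: moving to "b" looks best
# Earlier deep analysis concluded that "b" is in fact worth +40 for the
# side to move there, i.e. bad for the side to move at the root.
print(search("root", {"b": 40}))  # -10: the conclusion propagates upward
```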

Usage & Operation

Hash Save & Load

Downward propagation has the property that the conclusions which must be remembered have a huge volume. The search tree is shaped ... well, like a tree, and its size grows geometrically with depth. It is not possible to store all downstream analysis ever done on a single machine, much less to merge this information between multiple users. It is only feasible to store this analysis for a single position or set of related positions.
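
A quick calculation shows how fast a geometric tree grows. The branching factor of 2 below is purely an illustrative assumption, not a measurement of Rybka's search.

```python
def tree_size(branching, depth):
    """Number of positions in a uniform tree, root included."""
    return sum(branching ** d for d in range(depth + 1))


for depth in (10, 20, 30):
    print(depth, tree_size(2, depth))
# 10 2047
# 20 2097151
# 30 2147483647
```

Even with this modest branching factor, every ten additional plies multiply the number of positions by roughly a thousand.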

In Rybka, this functionality is supported via a simple hash save & load mechanism. When the user invokes the (Save Hash) engine parameter, Rybka saves her current hash contents to the path set in (Saved Hash File). When the user invokes the (Load Hash) engine parameter, Rybka loads the hash contents from that path.

The general usage sequence is:

1) Rybka runs for a long time on some position of interest or positions of interest which are close in the variation tree
or
1a) With preserve analysis selected, the user interactively analyzes with Rybka some set of positions which are close in the variation tree

2) User invokes (Save Hash) to save the contents of the hash table

3) At a later time, user invokes (Load Hash) to restore the contents of the hash table and resumes work

Advanced users may be expected to create entire libraries of saved hash files for their favorite areas of investigation. They may also be expected to share the files with each other.

Currently, this functionality does not support resizing or merging. When the hash file is loaded, the current hash size is changed to match the size of the file. Merging is not a critical operation here, as different files should be maintained for different positions.
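
The "no resizing" behavior can be pictured as follows. This is only a sketch: the real .rsh file format is not described in this document, so JSON is used as a stand-in, and the function names and entry layout are hypothetical.

```python
import json
import os
import tempfile


def save_hash(slots, path):
    """Write the whole table, slot by slot, to a file."""
    with open(path, "w") as f:
        json.dump(slots, f)


def load_hash(path):
    """Read a table back; the result has the file's size, whatever it is."""
    with open(path) as f:
        return json.load(f)


path = os.path.join(tempfile.mkdtemp(), "example.rsh")

# A table with 4 slots is saved ...
saved = [None, {"key": 5, "depth": 18}, None, None]
save_hash(saved, path)

# ... and later loaded while the current table has 8 slots.
current = [None] * 8
current = load_hash(path)  # the table now has the file's 4 slots, not 8
print(len(current))
```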

Persistent Hash

Upward propagation has the very pleasant property that the conclusions which must be stored have a low volume. In fact, for the purposes of upward propagation, two hours of analysis of a single root position could in theory be represented entirely with a single entry, for the root position itself. This allows us to be more ambitious: for upward propagation, we want to save everything, automatically and forever, and we want to be able to merge it with everything that other users have done. That is what persistent hash is for.

The operations of the persistent hash are simple:

- The user should set in the (Persistent Hash File) engine parameter the path to the file where the persistent hash is kept.

- The user should make sure that the (Persistent Hash Enabled) engine parameter is checked.

- The persistent hash file is always kept on disk, and accessed there directly. It can be as big as the hard drive. Hard drive performance is not an issue, as the number of accesses is trivial compared to the size of the search tree.

- When a persistent hash file exists at the specified path when the engine is loaded, it is used. When it doesn't exist, an empty persistent hash file is automatically created there.

- To leave persistent hash disabled, leave the (Persistent Hash Enabled) parameter unchecked (as is the default).

- The (Persistent Hash Size) engine parameter is used in two scenarios: 1) when the engine is loaded without a persistent hash file existing at the specified path, the created file is of the specified size; 2) when the (Persistent Hash Resize) button is invoked, the hash file is resized to the specified size.

- The (Persistent Hash Reset) engine parameter clears the contents of the persistent hash file.

- To merge two persistent hash files, set the (Persistent Hash Merge File) parameter to the second persistent hash file and invoke (Persistent Hash Do Merge). The two files do not need to have the same size. The merged information will be contained in the main persistent hash file, whose size will be unchanged.

- The (Persistent Hash Write Depth) parameter controls the frequency of writes to the persistent hash file during the search. Higher write depths lead to lower write frequencies.

- The (Persistent Hash Play Depth) parameter instructs the engine to automatically play a move without further thinking during game play if the position already exists in the persistent hash file with at least the specified depth. This setting should probably be tailored to the time control for best performance.
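
The write depth, merge and play depth operations above can be sketched together. This is a toy model built on assumptions: a Python dict stands in for the fixed-size file on disk, write depth is interpreted as "only conclusions at least this deep are written", and on merge the deeper of two entries for the same position is kept. None of these details are confirmed by this document, and all names are hypothetical.

```python
def write_if_deep(store, key, entry, write_depth):
    """Write a conclusion only if it reaches the write depth."""
    if entry["depth"] < write_depth:
        return False  # too shallow: skipped, so fewer writes overall
    old = store.get(key)
    if old is not None and old["depth"] > entry["depth"]:
        return False  # never overwrite deeper work
    store[key] = entry
    return True


def merge(main, other):
    """Fold 'other' into 'main', keeping the deeper entry per position."""
    for key, entry in other.items():
        old = main.get(key)
        if old is None or entry["depth"] > old["depth"]:
            main[key] = entry


def instant_move(store, key, play_depth):
    """Return a stored move if it is backed by at least play_depth."""
    entry = store.get(key)
    if entry is not None and entry["depth"] >= play_depth:
        return entry["move"]
    return None


mine = {}
write_if_deep(mine, 101, {"depth": 8, "move": "e2e4"}, write_depth=12)
write_if_deep(mine, 102, {"depth": 20, "move": "g1f3"}, write_depth=12)

theirs = {
    102: {"depth": 26, "move": "d2d4"},
    103: {"depth": 15, "move": "c2c4"},
}
merge(mine, theirs)

print(sorted(mine))                             # [102, 103]
print(instant_move(mine, 102, play_depth=24))   # d2d4
print(instant_move(mine, 103, play_depth=24))   # None
```

In this model, raising write_depth shrinks the set of conclusions that get written, and raising play_depth makes the engine more reluctant to play stored moves without thinking.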

Persistent Hash Maintenance

The simplest way for users to deal with the persistent hash is to set it up once the first time they use Rybka, merge it periodically with other persistent hashes as they get the chance, and otherwise just let everything run without any interference.

There is however an argument to be made for manual persistent hash maintenance. Every few weeks, the user can archive his current persistent hash (this can be done with a simple file copy) and then merge it with his previous archive. In certain scenarios, this will improve performance.

(to do - explain why)

Additionally, there is an argument to be made for keeping different persistent hashes for different positions. Unlike the case of downward propagation (i.e. hash save & load), where this is essentially mandatory, for persistent hash this type of maintenance will carry a much smaller benefit, possibly not worth the extra effort. This topic needs more investigation.

(to do - explain more)