Jump to content

Bitboard

From Wikipedia, the free encyclopedia

A bitboard is a specialized bit array data structure commonly used in computer systems that play board games, where each bit corresponds to a game board space or piece.

Bitboards are applicable to any board game whose game state is represented by the presence of pieces on discrete spaces of a gameboard, including chess, checkers, othello and word games. The scheme was first employed in checkers programs in the 1950s, and since the mid-1970s has been the de facto standard for game board representation in computer automatons.

Bitboards are a more space-efficient board representation than the traditional mailbox representation, where each piece or space on the board is an array element.

Bitboards are often also more time-efficient. When the associated bits of related states on the bitboard fit into a single word or double word of the CPU architecture, single bitwise operators like AND and OR can be used to operate on the bitboard. These parallel bitwise operations can be much faster at setting and querying game states, and determining moves and plays in the game, than performing iterative array operations on a mailbox representation would be.

Modern chess engines predominantly utilize magic bitboards, which employ perfect hashing to map a piece's occupancy mask directly to a pre-computed array of attack patterns in a single lookup operation.

Description

[edit]

A bitboard is a specialized bit field: a format that represents the state of a board game by packing multiple Boolean variables into the same machine word. Each bit represents a space; when the bit is positive, a property of that space is true.

Bitboards allow the computer to answer questions about game state using very few bitwise operations. For example, if a chess program wants to know if the white player has any pawns in the center of the board (center four squares), it can just compare a bitboard for the player's pawns with one for the center of the board using a bitwise AND operation. If there are no center pawns, the result will be all zero bits (i.e. equal to zero). Multiple bitboards may represent different properties of spaces over the board, and special or temporary bitboards (like temporary variables) may represent local properties or hold intermediate collated results.

The efficacy of bitboards is augmented by two other properties of the implementation. First, bitboards are fast to incrementally update; for example flipping the bits at the source and destination positions in a bitboard for piece location when a piece is moved. Second, bitmaps representing static properties like all spaces attacked by each piece type for every position on a chessboard can be pre-collated and stored in a table. This allows a question like "what are the legal moves of a knight on space e4?" to be answered by a single memory fetch.

Bitboard implementations take advantage of the presence of fullword (32-bit or 64-bit) bitwise logical operations like AND, OR, NOT and others on modern CPU architectures in order for operations to be fast. Bitboards may not be as effective on earlier 8- and 16-bit minicomputer and microprocessor architectures.

Implementation

[edit]

Because the implementation of bitboards requires correct compression and encoding of massive tables and complex game states, bitboard programs can be tedious for software developers to write and debug.

Processor use

[edit]

Advantages

[edit]

Bitboard representations use parallel bitwise operations available on nearly all CPUs that complete in one cycle, and are typically among the fastest instructions (being fully pipelined and cached, as well as other CPU-specific optimizations). Nearly all CPUs have the AND, OR, NOR, and XOR operations. Furthermore, modern CPUs have instruction pipelines that queue instructions for execution. A processor with multiple execution units can perform more than one instruction per cycle if more than one instruction is available in the pipeline. Normal instruction sequences with branches may cause the pipeline to empty if a branch is mispredicted. Many bitboard operations require fewer conditionals, increasing pipelining and making effective use of multiple execution units on many CPUs.

CPUs have a bit width, the number of completable bitwise operations in one cycle. So, on a 64-bit or more CPU, 64-bit operations can occur in one instruction. There may be support for higher or lower width instructions. Many 32-bit CPUs may have some 64-bit instructions and those may take more than one cycle or otherwise be handicapped compared to their 32-bit instructions.

If the bitboard is larger than the width of the instruction set, multiple instructions will be required to perform a full-width operation on it. So a program using 64-bit bitboards would run faster on a 64-bit processor than on a 32-bit processor.

Disadvantages

[edit]

Bitboard representations have much longer code, both source and object code. Long bit-twiddling sequences are technically tricky to write and debug. This issue may increase cache misses or cause cache thrashing.

If the processor does not have hardware instructions for 'first one' (or 'count leading zeros') and 'count ones' (or 'count zeros'), the implementation will be significantly handicapped, as these operations are extremely inefficient to code as loops or other high-level constructs.

Cache and memory use

[edit]

Advantages

[edit]

Bitboards require more memory than piece-list board data structures, but are more execution efficient because many loop-and-compare operations are reduced to a single (or small number of) bitwise operation(s). For example, in mailbox, determining whether piece attacks space requires generating and looping through legal moves of piece and comparing the final space with space. With bitboards, the legal moves of piece are stored in a bitmap, and that map is ANDed with the bitmap for space. A non-zero result means that piece attacks space.

Disadvantages

[edit]

For some games, writing a bitboard engine requires a fair amount of source code, including data tables that will be longer than the compact mailbox/enumeration implementation. This can cause problems for mobile devices (such as cell phones) with a limited number of registers or processor instruction cache. For full-sized computers, it may cause cache misses between level-one and level-two cache. This is only a potential problem, not a major drawback, as most machines will have enough instruction cache for this not to be an issue.

Incremental update

[edit]

Some kinds of bitboards are derived from others by an elaborate process of cross-correlation, such as the attack maps in chess. Reforming all these maps at each change of game state (such as a move) can be prohibitively expensive, so derived bitmaps are incrementally updated, a process which requires intricate and precise code. This is much faster to execute, because only bitmaps associated with changed spaces, not all bitmaps over the board, need to change. Without incremental update, bitmapped representation may not be more efficient than the older mailbox representation where update is intrinsically local and incremental.

Precomputed bitmaps and table lookup

[edit]

Some kinds of bitmaps that don't depend on board configurations can be precomputed and retrieved by table lookup rather than collated after a move or state change of the board, such as spaces attacked by a knight or king located on each of 64 spaces of a chessboard that would otherwise require an enumeration.

In chess

[edit]

The obvious, and simplest representation of the configuration of pieces on a chessboard, is as a list (array) of pieces in a conveniently searchable order (such as smallest to largest in value) that maps each piece to its location on the board. Analogously, collating the spaces attacked by each piece requires a serial enumeration of such spaces for a piece. This scheme is called mailbox addressing. Separate lists are maintained for white and black pieces, and often for white and black pawns. The maps are updated each move, which requires a linear search (or two if a piece was captured) through the piece list. The advantage of mailbox is simple code; the disadvantage is linear lookups are slow. Bitboards are faster, but more elaborate.

Standard

[edit]
Image
Chessboard with algebraic notation

In bitboard representations, each bit of a 64 bit word (or double word on 32-bit architectures) is associated with a square of the chessboard. Any mapping of bits to squares can be used, but by broad convention, bits are associated with squares from left to right and bottom to top, so that bit 0 represents square a1, bit 7 is square h1, bit 56 is square a8 and bit 63 is square h8.

Many different configurations of the board are often represented by their own bitboards including the locations of the kings, all white pawns, all black pawns, as well as bitboards for each of the other piece types or combinations of pieces, like all white pieces. Two attack bitboards are also universal: one bitboard per square for all pieces attacking the square, and the inverse bitboard for all squares attacked by a piece for each square containing a piece. Bitboards can also be constants such as one representing the first rank, which would have one bits in positions 0 - 7. Other local or transitional bitboards such as "all spaces adjacent to the king attacked by opposing pieces" may be collated as necessary or convenient.[1]

An example of the use of the bitboards would be determining whether a piece is en prise: bitboards for "all friendly pieces guarding space" and "all opposing pieces attacking space" would allow matching the pieces to readily determine whether a target piece on space is en prise.

One of the drawbacks of standard bitboards is collating the attack vectors of the sliding pieces (rook, bishop, queen), because they have an indefinite number of attack spaces depending on other occupied spaces. This requires several lengthy sequences of masks, shifts and complements per piece.

Auxiliary bitboard representations

[edit]

Alternate bitboard data structures have been devised to collate the code size and computing complexity of generating bitboards for the attack vectors of sliding pieces. The bitboard representations of knights, kings, pawns and other board configurations are unaffected by the use of auxiliary bitboards for the sliding pieces.

Rotated bitboards

[edit]

Rotated bitboards are complementary bitboard data structures that enable tabularizing of sliding piece attack vectors, one for file attack vectors of rooks, and one each for the diagonal and anti-diagonal attack vectors of bishops (rank attacks of rooks can be indexed from standard bitboards). With these bitboards, a single table lookup replaces lengthy sequences of bitwise operations.

These bitboards rotate the board occupancy configuration by 90 degrees, 45 degrees, and/or 315 degrees. A standard bitboard has one byte per rank of the chess board. With this bitboard, it is easy to determine rook attacks across a rank, using a table indexed by the occupied square and the occupied positions in the rank (because rook attacks stop at the first occupied square). By rotating the bitboard 90 degrees, rook attacks up and down a file can be examined the same way. Bitboards rotated 45 degrees and 315 degrees (-45 degrees) have diagonals that are easy to examine, for determining bishop attacks. The queen can be examined by combining rook and bishop attacks. The rotation of a bitboard, however, is an inelegant transformation that can take dozens of instructions.[2][3]

Direct hashing

[edit]

The attack vectors of rooks and bishops can be separately masked and used as indices into a hash table of precomputed attack vectors depending on occupancy: 8 bits each for rooks and from 2 to 8 bits each for bishops. The full attack vector of a piece is obtained as the union of each of the two unidirectional vectors indexed from the hash table. The number of entries in the hash table is modest, on the order of bytes, or about 2 kilobytes. Two hash function computations and two lookups per piece are required for the hashing scheme.[4][5]

Magic bitboards

[edit]

Magic bitboards are an extrapolation of the time-space tradeoff of direct hashing lookup of attack vectors. These use a transmutation of the full attack vector as an index into the hash table. The term magic is a misnomer, and simply refers to the generation and use of a perfect hash function in conjunction with tricks to reduce the potential size of the hash table that would have to be stored in memory, which is bytes, or 144 exabytes.[nb 1]

The outer squares, or first and eighth ranks together with the 'a' and 'h' files, are irrelevant to the occupancy of the attack vector: the piece attacks those squares or not (depending on other blocking pieces) regardless of occupancy, so these can be eliminated from consideration, leaving just 6x6 or 36 squares (~bits in the corresponding hash function).

The attack index for a given square is computed using the formula: where is the 64-bit occupancy bitboard representing blocking pieces on the relevant sliders' rays, is a square-specific 64-bit "magic multiplier," and is the number of bits required to index the attack look-up table for that specific square. The resulting bitwise shift isolates the most significant bits of the multiplication, creating a collision-free index for rapid lookups.[6]

As with other schemes that require a perfect hashing function, an exhaustive process of enumeration, partly algorithmic and partly trial and error, is necessary to generate the hash function. Magic bitboards are also very active tables, and their size (fewer than a million entries in most cases) is huge relative to the lower level cache sizes of modern chip architectures, resulting in cache flooding. In many applications, magic bitboards provide no performance gain over more modest hashing schemes or rotated bitboards.[7][8]

BMI2 instruction set optimization

[edit]

On modern x86-64 processors supporting the BMI2 (Bit Manipulation Instruction Set 2) architecture, the calculation of chess sliding piece attacks can be accelerated directly via hardware, removing the need for magic multiplier hashing. This approach utilizes the parallel bit extract (_pext_u64) and parallel bit deposit (_pdep_u64) assembly instructions. The PEXT instruction uses a pre-computed blocker mask to extract the relevant occupancy bits directly into a contiguous, zero-extended index. This index is then used to perform an instantaneous array lookup for the attack map, optimizing CPU register usage and eliminating potential cache misses associated with larger magic lookup tables.[9]

History

[edit]

The bitboard method for representing a board game is credited to Arthur Samuel, who used it in the mid-1950s in his checkers program.[10] For the more complicated game of chess, the method was credited to both the Kaissa team in the Soviet Union in the late 1960s,[11] and to the authors of the U.S. Northwestern University program "Chess" in the early 1970s. The 64-bit word length of 1970s supercomputers like Amdahl and Cray machines facilitated the development of bitboard representations, which conveniently mapped the 64-squares of the chessboard to bits of a word.

Rotated bitboards for collating the moves of sliding pieces were invented by Professor Robert Hyatt, author of Cray Blitz and Crafty chess engines, sometime in the mid-1990s and shared with the Dark Thought programming team. They were later implemented in Crafty and Dark Thought, but the first published description wasn't until 1997.

A decade later, direct lookup methods using masked ranks, files and diagonals to index a table of attack vectors depending on occupancy states of bits under the mask, were introduced. One such scheme utilizing a perfect hash function to eliminate hash collisions, was termed "magic bitboards". Nonetheless, the large size and high access rates of such tables caused memory occupancy and cache contention issues, and weren't necessarily more efficacious than the rotated bitboard approach. As of 2026, game programs remain divided, with the best scheme being application dependent.

Other games

[edit]

Many other games besides chess benefit from bitboards.

  • In Connect Four, they allow for very efficient testing for four consecutive discs, by just two shift + AND operations per direction.
  • In Conway's Game of Life, they are a possible alternative to arrays.
  • They are also used in Reversi (also known as Othello).

See also

[edit]

Notes

[edit]
  1. Use of a perfect hash function is not required for implementation of this method, and provides only a vanishingly small benefit over standard hashing methods.

References

[edit]
  1. Atkin, Larry R.; Slate, David J. (1983) [1977]. "Chess 4.5: the Northwestern University Chess Program". In Frey, Peter W. (ed.). Chess Skill in Man and Machine (2 ed.). Springer Verlag. pp. 82–118. CiteSeerX 10.1.1.111.926. ISBN 0-387-90790-4.
  2. Heinz, Ernst A. (September 1997). "How Dark Thought Plays Chess". ICCA Journal. 20 (3): 166–176.
  3. Hyatt, Robert (1999). "Rotated Bitboards: New Twist on an Old Idea". Archived from the original on 2005-04-28.
  4. Tannous, Sam (2007-07-23) [2006]. "Avoiding Rotated Bitboards with Direct Lookup". ICGA Journal. 30 (2) (2 ed.). Durham, North Carolina, USA: 85–91. arXiv:0704.3773v2. CiteSeerX 10.1.1.561.3461. doi:10.3233/ICG-2007-30204.
  5. Knuth, Donald (1973). "Section 6.4. Algorithm D (Open addressing with double hashing)". The Art of Computer Programming. Vol. 3.
  6. "Magic Bitboards Optimization". Chess Programming Wiki. Chess Programming Wiki. 2026-05-14. Retrieved 2026-06-18.
  7. Sherwin, Michael; Isenberg, Gerd (2006-12-04). "Magic Bitboards Explained!". Winboard Forum. Call it Kindergarten Bitboards
  8. Hansen, Lasse (2006-06-14). "Fast(er) bitboard move generator". Winboard Forum..
  9. "BMI2 Optimization for Bitboards". Chess Programming Wiki. Chess Programming Wiki. 2026-03-03. Retrieved 2026-06-18.
  10. "Some Studies in Machine Learning Using the Game of Checkers". IBM Journal of Research and Development. 1959.
  11. Adel'Son-Vel'Skii, G. M.; Arlazarov, V. L.; Bitman, A. R.; Zhivotovskii, A. A.; Uskov, A. V. (1970). "Programming a computer to play chess". Russian Mathematical Surveys. 25 (2): 221. Bibcode:1970RuMaS..25..221A. doi:10.1070/RM1970v025n02ABEH003792.

Further reading

[edit]
[edit]

Calculators

[edit]

Checkers

[edit]

Chess

[edit]

Articles

[edit]

Code examples

[edit]
  • The author of the Frenzee engine had posted some source examples.
  • A 155 line java Connect-4 program demonstrating the use of bitboards.

Implementations

[edit]
Open source
[edit]
  • Beowulf Unix, Linux, Windows. Rotated bitboards.
  • Crafty See the Crafty article. Written in straight C. Rotated bitboards in the old versions, now uses magic bitboards.
  • GNU Chess See the GNU Chess Article.
  • Stockfish UCI chess engine ranking second in Elo as of 2010
  • Gray Matter C++, rotated bitboards.
  • KnightCap GPL. ELO of 2300.
  • Pepito C. Bitboard, by Carlos del Cacho. Windows and Linux binaries as well as source available.
  • Simontacci Rotated bitboards.
Closed source
[edit]

Othello

[edit]

Word games

[edit]