site-content/source/modules/ROOT/pages/blog/Apache-Cassandra-4.1-Denylisting-Partitions.adoc (61 lines of code) (raw):
= Apache Cassandra 4.1: Denylisting Partitions
:page-layout: single-post
:page-role: blog-post
:page-post-date: June 30, 2022
:page-post-author: Jordan West
:description: New Denylisting Partitions in Apache Cassandra 4.1
:keywords: denylisting partitions, 4.1, apache cassandra
:!figure-caption:
.Image credit: https://unsplash.com/@kylejglenn[Kyle Glenn on Unsplash^]
image::blog/apache-cassandra-4.1-denylisting-partitions-unsplash-kyle-glenn.jpg[Denylisting Partitions]
Like any database, Apache Cassandra can be impacted by data modeling choices gone awry or the unexpected use of even a well-designed data model. This occurs when partitions get too large or have too many rows or tombstones. These problematic partitions can impact the reading and writing of other partitions, which can lead to service-impacting incidents. While the Cassandra community is always actively working to reduce the impact they have, it can be necessary for operators to take immediate action when they are encountered. Apache Cassandra 4.1 introduces a new feature to give operators this control: the https://issues.apache.org/jira/browse/CASSANDRA-12106[Partition Denylist^].
The Partition Denylist allows operators to https://s3.amazonaws.com/systemsandpapers/papers/FOX_Brewer_99-Harvest_Yield_and_Scalable_Tolerant_Systems.pdf[make a choice^] between providing access to the entire data set with reduced performance or reducing the available data set (by preventing reads and writes to specific partitions) to ensure performance is not affected. This choice gives operators new control over how these problematic partitions affect other reads and writes serviced by Cassandra, converting operations that would be slow and eat resources to immediate failures before any resources are consumed. Operators have the choice to prevent reads and/or writes to the partition as well as range reads that include the partition.
== The Operators Perspective
For operators, JMX is the primary interface to interact with the Denylist. There are four operations available to operators: add a key to the denylist, check if a key is currently in the denylist, remove a key from the denylist, and force a refresh of a node’s local denylist cache. The examples below will use https://github.com/jiaqi/jmxterm[jmxterm^] but any JMX interface will work.
=== Denylisting a Key
To add a key to the denylist all that is needed is the keyspace, table, and partition key. In jmxterm denylisting the key `baz` in the keyspace `foo` and table `bar` looks like:
----
bean org.apache.cassandra.db:type=StorageProxy
run denylistKey foo bar baz
----
All operations are defined on the StorageProxy mbean.
=== Checking Denylist Status
To determine if a key (`baz`) for a particular keyspace and table (`foo.bar`) is in the denylist use the `isKeyDenylisted` operation:
----
bean org.apache.cassandra.db:type=StorageProxy
run isKeyDenylisted foo bar baz
----
=== Removing a Key from the Denylist
After mitigation or for other reasons, operators will eventually want to remove a key from the denylist. Like the other operations, all that is needed is the keyspace, table, and key. In this case, to remove the key `baz` in `foo.bar` from the denylist:
----
bean org.apache.cassandra.db:type=StorageProxy
run removeDenylistKey foo bar baz
----
=== Refreshing the Denylist Cache
For performance purposes, the denylist implementation maintains a local cache on every node. The cache is refreshed periodically (by default every 5 minutes) but it may be necessary to manually refresh it. This can be done with a quick JMX call on the nodes where its necessary (typically on the nodes that own a partition that was recently added):
----
bean org.apache.cassandra.db:type=StorageProxy
run loadPartitionDenylist
----
=== Other Denylist Configuration
Other aspects of the denylist are https://cassandra.apache.org/doc/trunk/cassandra/configuration/cass_yaml_file.html[controlled via properties in `cassandra.yaml`^]. Operators can control whether reads, range reads, and/or writes fail for partitions listed in the denylist. They can also control the frequency of local node cache refreshes and the consistency level used to load the cache (higher consistency gives more accuracy at the cost of potentially not being able to complete the query while lower consistency increases the chances of a successful load while also increasing the potential to have stale or missed records in the denylist).
== The Client Perspective
For clients of Cassandra, they have no control over what partitions are in the denylist. They will however get an informative exception telling them why the read or write failed. In the example below key 3 has been placed into the denylist while key 9 remains unaffected – both reads and writes are configured to be denied.
----
cqlsh> select * from system_distributed.partition_denylist;
ks_name | table_name | key
---------+------------+------------
foo | buz | 0x00000003
cqlsh> select * from foo.buz where key=3;
InvalidRequest: Error from server: code=2200 [Invalid query] message= \
"Unable to read denylisted partition [0xDecoratedKey(9010454139840013625, 00000003)] in foo/buz"
cqlsh> select * from foo.buz where key=9;
key | c1 | c2 | c3 | c4
-----+----+----+----+----
9 | 1 | 3 | 4 | 5
9 | 9 | 3 | 4 | 5
cqlsh> insert into foo.buz (key, c1, c2, c3, c4) VALUES (3, 1, 1, 1, 1);
InvalidRequest: Error from server: code=2200 [Invalid query] message= \
"Unable to write to denylisted partition [0xDecoratedKey(9010454139840013625, 00000003)] in foo/buz"
----
The Partition Denylist provides operators with a powerful tool when data models run off the tracks. The Cassandra community continues to https://issues.apache.org/jira/browse/CASSANDRA-11206[strengthen and widen those tracks^] with the hopes that the need to prevent reads and writes to specific partitions will diminish. However, users always find new ways to surprise database authors and administrators. For those cases, the Partition Denylist provides a quick solution with a significant impact.