Indexing in Cassandra with Storage Attached Indexes (SAI)

Have you ever wondered why Cassandra requires the ALLOW FILTERING keyword for some queries? It’s typically because you’ve tried to filter on a column that isn’t part of the partition key and has no index, so Cassandra may have to scan many rows to find the matches. ALLOW FILTERING is not recommended in most cases for performance reasons; Cassandra’s secondary indexes provide a better approach. The Storage Attached Index (SAI) is a new secondary index implementation now available in DataStax Astra and DataStax Enterprise. SAI provides a filtering capability that is easier to use, more efficient, and simpler to maintain than Cassandra’s existing indexing or add-on search solutions.

When to use SAI

Indexes allow you to query columns outside the Cassandra partition key without using the ALLOW FILTERING keyword or creating custom tables for each query pattern, as you would according to the classic best practices for Cassandra data modeling. You can create a table that is most natural for you, write to just that table, and query it any way you want. Your queries are not restricted by your primary key.
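To make the contrast concrete, here is a minimal sketch of the classic table-per-query-pattern approach. The keyspace `demo`, the tables `users` and `users_by_country`, and their columns are hypothetical names chosen for illustration and do not come from the article.

```cql
-- Hypothetical schema for illustration; assumes a keyspace named demo already exists.
CREATE TABLE IF NOT EXISTS demo.users (
    user_id uuid PRIMARY KEY,
    name    text,
    country text,
    age     int
);

-- Classic approach: a second table partitioned by country,
-- which the application must write to alongside demo.users.
CREATE TABLE IF NOT EXISTS demo.users_by_country (
    country text,
    user_id uuid,
    name    text,
    age     int,
    PRIMARY KEY ((country), user_id)
);

-- Without that second table or an index on country, Cassandra rejects
-- this query unless ALLOW FILTERING is appended.
SELECT * FROM demo.users WHERE country = 'FR';
```

With an index on `country`, the second table and the double writes it requires are no longer needed for this query pattern.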

Defining SAI indexes

After creating your database, a keyspace, and one or more tables, use CREATE CUSTOM INDEX ... USING 'StorageAttachedIndex' DDL commands to define one or more SAI indexes on the table that you wish to index.
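As a rough sketch of what such DDL can look like, the statements below index two columns of a hypothetical table `demo.users` with a `uuid` primary key `user_id` and regular columns `name` (text), `country` (text), and `age` (int). The table, column, and index names are assumptions made for this example. The `case_sensitive` option shown on the text column is optional.

```cql
-- Assumes the hypothetical table demo.users (user_id uuid PRIMARY KEY,
-- name text, country text, age int) already exists.

-- SAI index on a text column; case_sensitive = false makes matches ignore case.
CREATE CUSTOM INDEX IF NOT EXISTS users_country_sai_idx
    ON demo.users (country)
    USING 'StorageAttachedIndex'
    WITH OPTIONS = {'case_sensitive': 'false'};

-- SAI index on a numeric column, created with default options.
CREATE CUSTOM INDEX IF NOT EXISTS users_age_sai_idx
    ON demo.users (age)
    USING 'StorageAttachedIndex';
```

Each statement indexes a single column, so a table queried on several columns gets one index per column.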

Querying your table

Once the index has been created, it is simply a matter of querying the table and specifying the SAI-indexed columns.
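For illustration, assume a hypothetical table `demo.users` with SAI indexes on its `country` (text) and `age` (int) columns; these names are placeholders, not part of the article. Queries along these lines should then run without ALLOW FILTERING:

```cql
-- Assumes hypothetical table demo.users with SAI indexes on country and age.

-- Equality match on an indexed text column.
SELECT user_id, name FROM demo.users WHERE country = 'FR';

-- Range match on an indexed numeric column.
SELECT user_id, name FROM demo.users WHERE age >= 30;

-- Two indexed columns combined with AND.
SELECT user_id, name FROM demo.users WHERE country = 'FR' AND age >= 30;
```

Note that the WHERE clauses never mention the primary key: the indexed columns alone are enough to select rows.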

SAI is supported by DataStax Enterprise 6.8.3 and later (see the DSE release notes), and you can also give it a try on your free database in DataStax Astra in the skill building section below.

SAI is on the roadmap to be added to OSS Cassandra in the near future. See the Cassandra Enhancement Proposal (CEP-7) for more details.

More Resources

Hands-on learning, articles, and documentation for SAI

SAI Quick Start

Follow this short tutorial to get started quickly with using indexes on DSE or Astra.

What is SAI?

Storage-Attached Indexing is a highly-scalable, globally-distributed index for Apache Cassandra®.

Better Cassandra Indexes for a Better Data Model: Introducing Storage-Attached Indexing

The future of indexing in Apache Cassandra is here.
