Skip to content

Java driver prepared statements

opuneet edited this page Jan 11, 2014 · 2 revisions

Users are encouraged to use prepared statements for their queries when using either Java Driver or the Astyanax adaptor over Java Driver.

Prepared statements (by design) need to be managed by the application that uses them. Consider this example that illustrates this point.

No prepared statements - direct query
while (!perfTest.stopped()) {   
    String query = "select * from astyanaxperf.test1 where key=" +   random.nextInt();  
    ResultSet resultSet = session.executeQuery(query);
    parseResultSet(resultSet);
}
Using prepared statements for same query
String query = "select * from astyanaxperf.test1 where key=?";  
// Cache the prepared statement for later re-use
PreparedStatement pStatement = session.prepareQuery(query);   
while (!perfTest.stopped()) {     
    // Build exact query from prepared statement
    BoundStatement bStatement = new BoundStatement(pStatement);
    bStatement.bind(random.nextInt());  
    ResultSet resultSet = session.executeQuery(bStatement);
    parseResultSet(resultSet);
}

In the examples above, you can see that the PreparedStatement needs to be supplied back to driver for re-use (and hence better performance), which means that the caller has to manage this statement.

When client apps build sophisticated DAOs, they generally make use of several queries and hence need to maintain the mapping between their use cases (query patterns) and the corresponding prepared statements. This will complicate the DAO implementations.

This is where Astyanax can help. Astyanax uses a fluent yet structured query syntax. The structured API design makes it feasible to construct a query signature for each query. The query signature can be used by Astyanax to detect recurring queries with similar structure (signature) and hence re-use the corresponding prepared statement. Hence Astyanax users can get automatic prepared statement management for free!

Here are examples on how to do this.

Writes
MutationBatch m = keyspace.prepareMutationBatch(); 
m.withRow( myCF, rowKey)
 .useCaching()   // tell Astyanax to reuse prepared statements
 .addColumn( columnName, columnValue )
 .execute();
Reads
pStmt = keyspace.prepareQuery( myCF )
        .useCaching(true)
        .withRow( myRowKey )
        .execute();

What's going on under the hood

Astyanax understands your table schema and also knows what type of query you are running - also known as query signature. Here are some examples of queries with unique signatures that help Astyanax tell them apart from each other.

  • Row query with all columns
  • Row query with column slice specified using column collection
  • Row query with column range query
  • Row slice query with row keys .... etc

The unique query signature along with a known schema definition helps Astyanax guess the exact CQL query you want to perform using Java Driver. This helps Astyanax pre-construct a CQL query and then cache for subsequent re-use when you re-submit another query with the exact same signature.

Note here that the signature of your queries needs to be consistent. Read the section below where automatic prepared statement caching WILL NOT work in Astyanax.

The catch with prepared-statements management!

The important caveat to note here is that prepared statements work well when your queries are highly cacheable, which means that your queries have the same signature.

For writes this means that if you are adding, updating, deleting columns with different timestamps and TTLs then you can't leverage prepared statements since there is no cacheability in those queries. If you want the mutation batches to be highly cacheable, then you must use the same TTLs and timestamps when re-using prepared statements for subsequent batches.

Similarly for reads, you can't prepare a statement by doing a row query using a columns slice component and then reuse that statement for a subsequent row query that also has a column slice component to it but which is slightly different, since the underlying query structure is actually different.

Here is an example to illustrate my point

// Select a row with a column slice specification (column range query)
select * from test1 where key = ? and column1 > ?; 
// is very different from a similar column slice query
select * from test1 where key = ? and column1 > ? and column1 < ?;

Hence both these queries have different signatures from the Astyanax point of view i.e Row query with Column Slice. Hence, in this case you cannot re-use prepared statements.

Hence use the automatic statement management feature in Astyanax with caution. Inspect your table schema and your query patterns. When using queries with different patterns, turn caching OFF.

Clone this wiki locally