Class Indexer

java.lang.Object
io.trino.plugin.accumulo.index.Indexer
All Implemented Interfaces:
Closeable, AutoCloseable

@NotThreadSafe public class Indexer extends Object implements Closeable
This utility class assists the Trino connector, and external applications, in populating the index table and metrics table for Accumulo-backed Trino tables.

This class is not thread safe.

When creating a table that contains indexed columns, users must also create the index table and the index metrics table; the names of both can be retrieved using the static methods in this class. Additionally, users MUST add iterators to the index metrics table (also available via a static method). Adding the locality groups to the index table is not required, but it is recommended because it improves index lookup times.

Sample usage of an indexer:

 
 Indexer indexer = new Indexer(connector, userAuths, table, writerConf);
 for (Mutation m : mutationsToNormalTable) {
     indexer.index(m);
 }

 // optionally flush the indexer along with the regular BatchWriter
 indexer.flush();

 // finished adding new mutations, close the indexer
 indexer.close();
 
 
  • Field Details

    • METRICS_TABLE_ROW_ID

      public static final ByteBuffer METRICS_TABLE_ROW_ID
    • METRICS_TABLE_ROWS_CF

      public static final ByteBuffer METRICS_TABLE_ROWS_CF
    • METRICS_TABLE_ROW_COUNT

      public static final io.trino.plugin.accumulo.index.Indexer.MetricsKey METRICS_TABLE_ROW_COUNT
    • METRICS_TABLE_FIRST_ROW_CQ

      public static final ByteBuffer METRICS_TABLE_FIRST_ROW_CQ
    • METRICS_TABLE_LAST_ROW_CQ

      public static final ByteBuffer METRICS_TABLE_LAST_ROW_CQ
    • CARDINALITY_CQ

      public static final byte[] CARDINALITY_CQ
    • CARDINALITY_CQ_AS_TEXT

      public static final org.apache.hadoop.io.Text CARDINALITY_CQ_AS_TEXT
    • METRICS_TABLE_ROWS_CF_AS_TEXT

      public static final org.apache.hadoop.io.Text METRICS_TABLE_ROWS_CF_AS_TEXT
    • METRICS_TABLE_ROWID_AS_TEXT

      public static final org.apache.hadoop.io.Text METRICS_TABLE_ROWID_AS_TEXT
  • Constructor Details

    • Indexer

      public Indexer(org.apache.accumulo.core.client.Connector connector, org.apache.accumulo.core.security.Authorizations auths, AccumuloTable table, org.apache.accumulo.core.client.BatchWriterConfig writerConfig) throws org.apache.accumulo.core.client.TableNotFoundException
      Throws:
      org.apache.accumulo.core.client.TableNotFoundException
  • Method Details

    • index

      public void index(org.apache.accumulo.core.data.Mutation mutation)
      Indexes the given mutation, adding mutations to the index and metrics tables.

      Like typical use of a BatchWriter, this method does not flush mutations to the underlying index table. For higher throughput, the modifications to the metrics table are tracked in memory and added to the metrics table when the indexer is flushed or closed.

      Parameters:
      mutation - Mutation to index
    • index

      public void index(Iterable<org.apache.accumulo.core.data.Mutation> mutations)
    • flush

      public void flush()
      Flushes all mutations in the index writer and writes all pending metric mutations to the metrics table. Note that the metrics table is not updated until this method is called, either explicitly or implicitly via close().
    • close

      public void close()
      Flushes all remaining mutations via flush() and closes the index writer.
      Specified by:
      close in interface AutoCloseable
      Specified by:
      close in interface Closeable
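The buffering behaviour described for index, flush, and close can be illustrated with a minimal sketch. This is not the Indexer source: `BufferedMetricsSketch`, its `stored` helper, and the use of plain maps in place of Accumulo tables are all hypothetical stand-ins, chosen only to show that counts stay in memory until a flush or close.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for the buffering pattern; not the real Indexer.
final class BufferedMetricsSketch implements AutoCloseable {
    // Counts accumulated in memory since the last flush.
    private final Map<String, Long> pending = new HashMap<>();
    // Stands in for the metrics table.
    private final Map<String, Long> metricsTable = new HashMap<>();

    // Records one occurrence of an indexed value without touching the "table".
    void index(String indexedValue) {
        pending.merge(indexedValue, 1L, Long::sum);
    }

    // Moves every pending count into the "table" and clears the buffer.
    void flush() {
        pending.forEach((key, count) -> metricsTable.merge(key, count, Long::sum));
        pending.clear();
    }

    // Closing flushes whatever is still pending.
    @Override
    public void close() {
        flush();
    }

    // Returns the count currently visible in the "table".
    long stored(String indexedValue) {
        return metricsTable.getOrDefault(indexedValue, 0L);
    }
}
```

In this sketch a reader of the "table" sees no change between calls to index, which mirrors the note that the metrics table is only updated on flush or close.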
    • getMetricIterators

      public static Collection<org.apache.accumulo.core.client.IteratorSetting> getMetricIterators(AccumuloTable table)
      Gets a collection of iterator settings that should be added to the metric table for the given Accumulo table. Adding these iterators is required; see the class description.
      Parameters:
      table - Table for retrieving metrics iterators, see AccumuloClient#getTable
      Returns:
      Collection of iterator settings
    • getIndexColumnFamily

      public static ByteBuffer getIndexColumnFamily(byte[] columnFamily, byte[] columnQualifier)
      Gets the column family of the index table based on the given column family and qualifier.
      Parameters:
      columnFamily - Trino column family
      columnQualifier - Trino column qualifier
      Returns:
      ByteBuffer of the given index column family
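As an illustration of how a single index column family can be derived from a family and a qualifier, the sketch below joins the two byte arrays with an underscore. The separator and the class name `IndexColumnFamilySketch` are assumptions made for this example; the page above does not state the actual encoding, so real code should call getIndexColumnFamily rather than rebuild the value.

```java
import java.nio.ByteBuffer;

// Hypothetical sketch; the '_' separator is an assumption, not documented above.
final class IndexColumnFamilySketch {
    static ByteBuffer indexColumnFamily(byte[] columnFamily, byte[] columnQualifier) {
        // Room for the family, one separator byte, and the qualifier.
        ByteBuffer buffer = ByteBuffer.allocate(columnFamily.length + 1 + columnQualifier.length);
        buffer.put(columnFamily);
        buffer.put((byte) '_');
        buffer.put(columnQualifier);
        // Switch the buffer from writing to reading.
        buffer.flip();
        return buffer;
    }
}
```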
    • getLocalityGroups

      public static Map<String,Set<org.apache.hadoop.io.Text>> getLocalityGroups(AccumuloTable table)
      Gets a set of locality groups that should be added to the index table (not the metrics table).
      Parameters:
      table - Table for the locality groups, see AccumuloClient#getTable
      Returns:
      Mapping of locality group name to the column families in that group; each group contains exactly one column family
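The one-group-per-column-family shape of the returned map can be sketched as follows. This example uses `String` in place of `org.apache.hadoop.io.Text` so that it runs without Hadoop on the classpath, and it assumes that each group is named after its single column family; `LocalityGroupsSketch` and `oneGroupPerFamily` are hypothetical names.

```java
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical sketch of a 1:1 locality group mapping.
final class LocalityGroupsSketch {
    static Map<String, Set<String>> oneGroupPerFamily(List<String> indexColumnFamilies) {
        Map<String, Set<String>> groups = new LinkedHashMap<>();
        for (String family : indexColumnFamilies) {
            // Assumption: the group name equals the column family it contains.
            groups.put(family, Collections.singleton(family));
        }
        return groups;
    }
}
```

With the real API, the map returned by getLocalityGroups has the shape expected by Accumulo's `TableOperations.setLocalityGroups`.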
    • getIndexTableName

      public static String getIndexTableName(String schema, String table)
      Gets the fully-qualified index table name for the given table.
      Parameters:
      schema - Schema name
      table - Table name
      Returns:
      Qualified index table name
    • getIndexTableName

      public static String getIndexTableName(SchemaTableName tableName)
      Gets the fully-qualified index table name for the given table.
      Parameters:
      tableName - Schema table name
      Returns:
      Qualified index table name
    • getMetricsTableName

      public static String getMetricsTableName(String schema, String table)
      Gets the fully-qualified index metrics table name for the given table.
      Parameters:
      schema - Schema name
      table - Table name
      Returns:
      Qualified index metrics table name
    • getMetricsTableName

      public static String getMetricsTableName(SchemaTableName tableName)
      Gets the fully-qualified index metrics table name for the given table.
      Parameters:
      tableName - Schema table name
      Returns:
      Qualified index metrics table name
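To make "fully-qualified" concrete, here is a rough sketch of how such names could be derived. The `_idx` and `_idx_metrics` suffixes and the omission of a `default` schema prefix are assumptions for illustration and are not stated on this page; `IndexTableNamesSketch` is a hypothetical class. Prefer the static methods above over hard-coding names.

```java
// Hypothetical sketch; suffixes and "default" schema handling are assumptions.
final class IndexTableNamesSketch {
    // Qualifies the table with its schema unless the schema is "default".
    static String fullTableName(String schema, String table) {
        return schema.equals("default") ? table : schema + "." + table;
    }

    static String indexTableName(String schema, String table) {
        return fullTableName(schema, table) + "_idx";
    }

    static String metricsTableName(String schema, String table) {
        return fullTableName(schema, table) + "_idx_metrics";
    }
}
```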
    • getMinMaxRowIds

      public static Map.Entry<byte[],byte[]> getMinMaxRowIds(org.apache.accumulo.core.client.Connector connector, AccumuloTable table, org.apache.accumulo.core.security.Authorizations auths) throws org.apache.accumulo.core.client.TableNotFoundException
      Throws:
      org.apache.accumulo.core.client.TableNotFoundException