Class AvroUtils
- java.lang.Object
  - org.apache.pinot.plugin.inputformat.avro.AvroUtils
-
public class AvroUtils extends Object
Utility methods for handling Avro records.
-
Method Summary
Modifier and Type | Method | Description
static FieldSpec.DataType | extractFieldDataType(org.apache.avro.Schema.Field field) | Extract the data type stored in Pinot for the given Avro field.
static org.apache.avro.file.DataFileStream<org.apache.avro.generic.GenericRecord> | getAvroReader(File avroFile) | Get the Avro file reader for the given file.
static org.apache.avro.Schema | getAvroSchemaFromPinotSchema(Schema pinotSchema) | Helper method to build Avro schema from Pinot schema.
static Schema | getPinotSchemaFromAvroDataFile(File avroDataFile) | Given an Avro data file, count all columns as dimension and return the equivalent Pinot schema.
static Schema | getPinotSchemaFromAvroDataFile(File avroDataFile, Map<String,FieldSpec.FieldType> fieldTypeMap, TimeUnit timeUnit) | Given an Avro data file, map from column to field type and time unit, return the equivalent Pinot schema.
static Schema | getPinotSchemaFromAvroSchema(org.apache.avro.Schema avroSchema, Map<String,FieldSpec.FieldType> fieldTypeMap, TimeUnit timeUnit) | Given an Avro schema, map from column to field type and time unit, return the equivalent Pinot schema.
static Schema | getPinotSchemaFromAvroSchemaFile(File avroSchemaFile, Map<String,FieldSpec.FieldType> fieldTypeMap, TimeUnit timeUnit, boolean complexType, List<String> fieldsToUnnest, String delimiter, ComplexTypeConfig.CollectionNotUnnestedToJson collectionNotUnnestedToJson) | Given an Avro schema file, map from column to field type and time unit, return the equivalent Pinot schema.
static Schema | getPinotSchemaFromAvroSchemaWithComplexTypeHandling(org.apache.avro.Schema avroSchema, Map<String,FieldSpec.FieldType> fieldTypeMap, TimeUnit timeUnit, List<String> fieldsToUnnest, String delimiter, ComplexTypeConfig.CollectionNotUnnestedToJson collectionNotUnnestedToJson) | Given an Avro schema, flatten/unnest the complex types based on the config, and then map from column to field type and time unit, return the equivalent Pinot schema.
static boolean | isSingleValueField(org.apache.avro.Schema.Field field) | Return whether the Avro field is a single-value field.
-
Method Detail
-
getPinotSchemaFromAvroSchema
public static Schema getPinotSchemaFromAvroSchema(org.apache.avro.Schema avroSchema, @Nullable Map<String,FieldSpec.FieldType> fieldTypeMap, @Nullable TimeUnit timeUnit)
Given an Avro schema, a map from column to field type, and a time unit, return the equivalent Pinot schema.
- Parameters:
avroSchema - Avro schema
fieldTypeMap - Map from column to field type (nullable)
timeUnit - Time unit (nullable)
- Returns:
- Pinot schema
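The role of fieldTypeMap can be illustrated with a small, stdlib-only sketch. It is not Pinot's implementation: `FieldTypeMapSketch`, `SketchFieldType`, and `resolve` are hypothetical names, and the fallback to DIMENSION for a null map or an unlisted column is an assumption that this page does not confirm.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Stdlib-only sketch; none of these names are Pinot classes. */
final class FieldTypeMapSketch {
  /** Hypothetical stand-in for FieldSpec.FieldType. */
  enum SketchFieldType { DIMENSION, METRIC, TIME }

  /**
   * Assigns a field type to every column, preserving column order.
   * ASSUMPTION: a null map, or a column missing from the map, means DIMENSION.
   */
  static Map<String, SketchFieldType> resolve(List<String> columns,
      Map<String, SketchFieldType> fieldTypeMap) {
    Map<String, SketchFieldType> resolved = new LinkedHashMap<>();
    for (String column : columns) {
      SketchFieldType type = (fieldTypeMap == null)
          ? SketchFieldType.DIMENSION
          : fieldTypeMap.getOrDefault(column, SketchFieldType.DIMENSION);
      resolved.put(column, type);
    }
    return resolved;
  }

  public static void main(String[] args) {
    List<String> columns = List.of("playerName", "hits", "eventTime");
    Map<String, SketchFieldType> fieldTypeMap = Map.of(
        "hits", SketchFieldType.METRIC,
        "eventTime", SketchFieldType.TIME);
    // Columns listed in the map take the mapped type; the rest use the assumed default.
    System.out.println(resolve(columns, fieldTypeMap));
    // With no map at all, every column uses the assumed default.
    System.out.println(resolve(columns, null));
  }
}
```

In the real method, the map values are FieldSpec.FieldType constants and the column names come from the Avro schema's fields.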
-
getPinotSchemaFromAvroSchemaWithComplexTypeHandling
public static Schema getPinotSchemaFromAvroSchemaWithComplexTypeHandling(org.apache.avro.Schema avroSchema, @Nullable Map<String,FieldSpec.FieldType> fieldTypeMap, @Nullable TimeUnit timeUnit, List<String> fieldsToUnnest, String delimiter, ComplexTypeConfig.CollectionNotUnnestedToJson collectionNotUnnestedToJson)
Given an Avro schema, flatten/unnest the complex types based on the config, then apply the map from column to field type and the time unit to return the equivalent Pinot schema.
- Parameters:
avroSchema - Avro schema
fieldTypeMap - Map from column to field type (nullable)
timeUnit - Time unit (nullable)
fieldsToUnnest - the fields to unnest
delimiter - the delimiter separating components of a nested structure
collectionNotUnnestedToJson - the mode for converting collections to JSON
- Returns:
- Pinot schema
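To make the delimiter parameter concrete, here is a stdlib-only sketch of flattening nested field names. It is an illustration under assumptions, not Pinot's code: `FlattenSketch` and its methods are hypothetical, a nested record is modelled as a `Map`, and unnesting of collections and JSON conversion are left out.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Stdlib-only sketch; a nested record is modelled as a Map of name to value. */
final class FlattenSketch {
  /** Returns the flattened column names, joining nested levels with the delimiter. */
  static List<String> flatten(Map<String, Object> record, String delimiter) {
    List<String> columns = new ArrayList<>();
    collect("", record, delimiter, columns);
    return columns;
  }

  private static void collect(String prefix, Map<String, Object> node,
      String delimiter, List<String> columns) {
    for (Map.Entry<String, Object> entry : node.entrySet()) {
      String name = prefix.isEmpty() ? entry.getKey() : prefix + delimiter + entry.getKey();
      if (entry.getValue() instanceof Map) {
        // A nested record: recurse, carrying the joined name as the new prefix.
        @SuppressWarnings("unchecked")
        Map<String, Object> child = (Map<String, Object>) entry.getValue();
        collect(name, child, delimiter, columns);
      } else {
        // A leaf value becomes one flattened column.
        columns.add(name);
      }
    }
  }

  public static void main(String[] args) {
    Map<String, Object> address = new LinkedHashMap<>();
    address.put("city", "string");
    address.put("zip", "int");
    Map<String, Object> record = new LinkedHashMap<>();
    record.put("id", "long");
    record.put("address", address);
    System.out.println(flatten(record, "."));
  }
}
```

With "." as the delimiter, the nested `address` record in this sketch contributes the columns `address.city` and `address.zip`.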
-
getPinotSchemaFromAvroDataFile
public static Schema getPinotSchemaFromAvroDataFile(File avroDataFile, @Nullable Map<String,FieldSpec.FieldType> fieldTypeMap, @Nullable TimeUnit timeUnit) throws IOException
Given an Avro data file, a map from column to field type, and a time unit, return the equivalent Pinot schema.
- Parameters:
avroDataFile - Avro data file
fieldTypeMap - Map from column to field type (nullable)
timeUnit - Time unit (nullable)
- Returns:
- Pinot schema
- Throws:
IOException
-
getPinotSchemaFromAvroDataFile
public static Schema getPinotSchemaFromAvroDataFile(File avroDataFile) throws IOException
Given an Avro data file, treat all columns as dimensions and return the equivalent Pinot schema. Should be used for testing purposes only.
- Parameters:
avroDataFile - Avro data file
- Returns:
- Pinot schema
- Throws:
IOException
-
getPinotSchemaFromAvroSchemaFile
public static Schema getPinotSchemaFromAvroSchemaFile(File avroSchemaFile, @Nullable Map<String,FieldSpec.FieldType> fieldTypeMap, @Nullable TimeUnit timeUnit, boolean complexType, List<String> fieldsToUnnest, String delimiter, ComplexTypeConfig.CollectionNotUnnestedToJson collectionNotUnnestedToJson) throws IOException
Given an Avro schema file, a map from column to field type, and a time unit, return the equivalent Pinot schema.
- Parameters:
avroSchemaFile - Avro schema file
fieldTypeMap - Map from column to field type (nullable)
timeUnit - Time unit (nullable)
complexType - whether to enable complex-type handling
fieldsToUnnest - the fields to unnest
delimiter - the delimiter separating components of a nested structure
collectionNotUnnestedToJson - the mode for converting collections to JSON strings
- Returns:
- Pinot schema
- Throws:
IOException
-
getAvroSchemaFromPinotSchema
public static org.apache.avro.Schema getAvroSchemaFromPinotSchema(Schema pinotSchema)
Helper method to build Avro schema from Pinot schema.
- Parameters:
pinotSchema - Pinot schema.
- Returns:
- Avro schema.
-
getAvroReader
public static org.apache.avro.file.DataFileStream<org.apache.avro.generic.GenericRecord> getAvroReader(File avroFile) throws IOException
Get the Avro file reader for the given file.
- Throws:
IOException
-
isSingleValueField
public static boolean isSingleValueField(org.apache.avro.Schema.Field field)
Return whether the Avro field is a single-value field.
-
extractFieldDataType
public static FieldSpec.DataType extractFieldDataType(org.apache.avro.Schema.Field field)
Extract the data type stored in Pinot for the given Avro field.
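As a rough illustration of what a type extraction like this involves, the stdlib-only sketch below maps Avro primitive type names to Pinot data type names. The pairs are an assumed correspondence and are not taken from this page; `AvroTypeMappingSketch` and `toPinotTypeName` are hypothetical names, and unions, arrays, and logical types are not handled.

```java
import java.util.Map;

/** Stdlib-only sketch; type names are plain strings rather than Avro or Pinot enums. */
final class AvroTypeMappingSketch {
  // ASSUMPTION: this correspondence is illustrative, not copied from Pinot.
  private static final Map<String, String> AVRO_TO_PINOT = Map.of(
      "INT", "INT",
      "LONG", "LONG",
      "FLOAT", "FLOAT",
      "DOUBLE", "DOUBLE",
      "BOOLEAN", "BOOLEAN",
      "STRING", "STRING",
      "ENUM", "STRING",
      "BYTES", "BYTES");

  /** Returns the assumed Pinot type name, or throws for an unmapped Avro type. */
  static String toPinotTypeName(String avroTypeName) {
    String pinotType = AVRO_TO_PINOT.get(avroTypeName);
    if (pinotType == null) {
      throw new UnsupportedOperationException("No assumed mapping for Avro type: " + avroTypeName);
    }
    return pinotType;
  }

  public static void main(String[] args) {
    System.out.println(toPinotTypeName("ENUM"));
    System.out.println(toPinotTypeName("DOUBLE"));
  }
}
```

Check the Pinot source before relying on any particular pair, especially for complex or logical Avro types.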
-