Class ReplicaGroupInstanceSelector

  • All Implemented Interfaces:
    InstanceSelector
    Direct Known Subclasses:
    StrictReplicaGroupInstanceSelector

    public class ReplicaGroupInstanceSelector
    extends Object
    Instance selector for the replica-group routing strategy.

    The selection algorithm always distributes traffic evenly across all replicas of each segment, and selects the same index into the enabled instances for all segments with the same number of replicas. The algorithm is lightweight and makes a best effort to select the fewest servers for the request.

    The algorithm relies on the mirror segment assignment produced by the replica-group segment assignment strategy. With mirror segment assignment, every server in one replica-group has a corresponding server in each of the other replica-groups with the same segments assigned. For example, if S1 is a server in replica-group 1 with mirror servers S2 in replica-group 2 and S3 in replica-group 3, all segments assigned to S1 are also assigned to S2 and S3. In the stable scenario (external view matches ideal state), all segments assigned to S1 have the same enabled instances [S1, S2, S3], sorted in alphabetical order. If we always pick the same index of the enabled instances for all segments, only one of S1, S2, S3 is picked, so we are guaranteed to pick the fewest server instances for the request (there is no guarantee of choosing servers from the same replica-group, though). In the transitioning/error scenario (external view does not match ideal state), there is no guarantee of picking the fewest server instances, but traffic is still guaranteed to be evenly distributed across all available instances to avoid creating hotspot servers.
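
    The same-index idea can be illustrated with a minimal sketch. This is not the Pinot implementation: the class SameIndexSelectionSketch, its select helper, and the choice of deriving the index from the request ID are assumptions made purely for illustration.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Illustrative sketch only; not the actual Pinot implementation.
class SameIndexSelectionSketch {

    // Hypothetical helper: for each segment, picks the enabled instance at
    // index (requestId mod numEnabledInstances). Each instance list is assumed
    // to be sorted alphabetically. Deriving the index from the request ID is
    // an assumption of this sketch.
    static Map<String, String> select(Map<String, List<String>> segmentToEnabledInstances,
                                      long requestId) {
        Map<String, String> segmentToSelectedInstance = new LinkedHashMap<>();
        for (Map.Entry<String, List<String>> entry : segmentToEnabledInstances.entrySet()) {
            List<String> enabledInstances = entry.getValue();
            if (enabledInstances.isEmpty()) {
                continue; // no enabled instance: nothing to select for this segment
            }
            int index = (int) Math.floorMod(requestId, (long) enabledInstances.size());
            segmentToSelectedInstance.put(entry.getKey(), enabledInstances.get(index));
        }
        return segmentToSelectedInstance;
    }

    public static void main(String[] args) {
        // Stable scenario: mirrored servers, so every segment has the same sorted list.
        List<String> mirrors = Arrays.asList("S1", "S2", "S3");
        Map<String, List<String>> segmentToEnabledInstances = new LinkedHashMap<>();
        segmentToEnabledInstances.put("seg_0", mirrors);
        segmentToEnabledInstances.put("seg_1", mirrors);
        segmentToEnabledInstances.put("seg_2", mirrors);

        // All segments resolve to the same server for a given request.
        System.out.println(select(segmentToEnabledInstances, 4L));
    }
}
```

    In this sketch a single request touches only one of the mirrored servers, while consecutive request IDs rotate through S1, S2 and S3, which is how the load ends up spread evenly.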

    If the query option NUM_REPLICA_GROUPS_TO_QUERY is provided, servers are picked from different replica groups such that the segments are evenly distributed among NUM_REPLICA_GROUPS_TO_QUERY replica groups. For example, with [S1, S2, S3] and NUM_REPLICA_GROUPS_TO_QUERY = 2, the replica groups of S1 and S2 are selected such that half of the segments are served by S1 and the other half by S2. If the NUM_REPLICA_GROUPS_TO_QUERY value is much greater than the number of available servers, ReplicaGroupInstanceSelector behaves similarly to BalancedInstanceSelector.
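
    One way to picture the effect of NUM_REPLICA_GROUPS_TO_QUERY is an offset that cycles over the requested number of replica groups as segments are processed. The sketch below assumes exactly that; the class ReplicaGroupsToQuerySketch, the replicaOffset variable and the use of the request ID as the base index are hypothetical and may differ from what Pinot does internally.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Illustrative sketch only; not the actual Pinot implementation.
class ReplicaGroupsToQuerySketch {

    // Hypothetical helper: spreads segments over numReplicaGroupsToQuery
    // consecutive indices, starting from an index derived from the request ID.
    // numReplicaGroupsToQuery is assumed to be at least 1.
    static Map<String, String> select(List<String> segments,
                                      Map<String, List<String>> segmentToEnabledInstances,
                                      long requestId,
                                      int numReplicaGroupsToQuery) {
        Map<String, String> segmentToSelectedInstance = new LinkedHashMap<>();
        int replicaOffset = 0;
        for (String segment : segments) {
            List<String> enabledInstances = segmentToEnabledInstances.get(segment);
            if (enabledInstances == null || enabledInstances.isEmpty()) {
                continue; // nothing to select for this segment
            }
            int index = (int) Math.floorMod(requestId + replicaOffset,
                                            (long) enabledInstances.size());
            segmentToSelectedInstance.put(segment, enabledInstances.get(index));
            // Cycle through the requested number of replica groups.
            replicaOffset = (replicaOffset + 1) % numReplicaGroupsToQuery;
        }
        return segmentToSelectedInstance;
    }

    public static void main(String[] args) {
        List<String> mirrors = Arrays.asList("S1", "S2", "S3");
        List<String> segments = Arrays.asList("seg_0", "seg_1", "seg_2", "seg_3");
        Map<String, List<String>> segmentToEnabledInstances = new LinkedHashMap<>();
        for (String segment : segments) {
            segmentToEnabledInstances.put(segment, mirrors);
        }
        // With 2 replica groups, segments alternate between two servers.
        System.out.println(select(segments, segmentToEnabledInstances, 0L, 2));
    }
}
```

    With a value much larger than the number of servers, the offset in this sketch simply walks over every enabled instance in turn, which mirrors the balanced behaviour described above.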

    If AdaptiveServerSelection is enabled, a single snapshot of the server ranking is fetched and used to pick the best available server for each segment. Because every segment is resolved against the same snapshot, the algorithm ends up picking the minimum number of servers required to process the query. Currently, NUM_REPLICA_GROUPS_TO_QUERY is not supported if AdaptiveServerSelection is enabled.
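
    The role of the single ranking snapshot can be sketched as follows. The representation of the ranking as a list ordered from best to worst, and the class and method names, are assumptions for illustration only; they do not describe the AdaptiveServerSelector API.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Illustrative sketch only; not the actual Pinot implementation.
class AdaptiveSnapshotSelectionSketch {

    // Hypothetical helper: serverRankingSnapshot lists servers from best to
    // worst. Every segment is resolved against this one snapshot, so segments
    // with the same candidates always resolve to the same server.
    static Map<String, String> select(Map<String, List<String>> segmentToEnabledInstances,
                                      List<String> serverRankingSnapshot) {
        Map<String, Integer> rankByServer = new HashMap<>();
        for (int i = 0; i < serverRankingSnapshot.size(); i++) {
            rankByServer.put(serverRankingSnapshot.get(i), i);
        }
        Map<String, String> segmentToSelectedInstance = new LinkedHashMap<>();
        for (Map.Entry<String, List<String>> entry : segmentToEnabledInstances.entrySet()) {
            String best = null;
            int bestRank = Integer.MAX_VALUE;
            for (String candidate : entry.getValue()) {
                // Servers missing from the snapshot are treated as worst-ranked.
                int rank = rankByServer.getOrDefault(candidate, Integer.MAX_VALUE);
                if (best == null || rank < bestRank) {
                    best = candidate;
                    bestRank = rank;
                }
            }
            if (best != null) {
                segmentToSelectedInstance.put(entry.getKey(), best);
            }
        }
        return segmentToSelectedInstance;
    }

    public static void main(String[] args) {
        Map<String, List<String>> segmentToEnabledInstances = new LinkedHashMap<>();
        segmentToEnabledInstances.put("seg_0", Arrays.asList("S1", "S2", "S3"));
        segmentToEnabledInstances.put("seg_1", Arrays.asList("S1", "S2", "S3"));
        segmentToEnabledInstances.put("seg_2", Arrays.asList("S1", "S3"));
        System.out.println(select(segmentToEnabledInstances, Arrays.asList("S2", "S1", "S3")));
    }
}
```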

    • Constructor Detail

      • ReplicaGroupInstanceSelector

        public ReplicaGroupInstanceSelector(String tableNameWithType,
                                            org.apache.helix.store.zk.ZkHelixPropertyStore<org.apache.helix.zookeeper.datamodel.ZNRecord> propertyStore,
                                            org.apache.pinot.common.metrics.BrokerMetrics brokerMetrics,
                                            @Nullable
                                            AdaptiveServerSelector adaptiveServerSelector,
                                            Clock clock)
    • Method Detail

      • init

        public void init(Set<String> enabledInstances,
                         org.apache.helix.model.IdealState idealState,
                         org.apache.helix.model.ExternalView externalView,
                         Set<String> onlineSegments)
        Description copied from interface: InstanceSelector
        Initializes the instance selector with the enabled instances, ideal state, external view and online segments (segments with ONLINE/CONSUMING instances in the ideal state and pre-selected by the SegmentPreSelector). Should be called only once before calling other methods.
        Specified by:
        init in interface InstanceSelector
      • onInstancesChange

        public void onInstancesChange(Set<String> enabledInstances,
                                      List<String> changedInstances)
        Processes the instances change. Changed instances are pre-computed based on the current and previous enabled instances only once on the caller side and passed to all the instance selectors.

        Updates the cached enabled instances and re-calculates segmentToEnabledInstancesMap and unavailableSegments based on the cached states.

        Specified by:
        onInstancesChange in interface InstanceSelector
      • onAssignmentChange

        public void onAssignmentChange(org.apache.helix.model.IdealState idealState,
                                       org.apache.helix.model.ExternalView externalView,
                                       Set<String> onlineSegments)
        Processes the segment assignment (ideal state or external view) change based on the given online segments (segments with ONLINE/CONSUMING instances in the ideal state and pre-selected by the SegmentPreSelector).

        Updates the cached maps (segmentToOnlineInstancesMap, segmentToOfflineInstancesMap and instanceToSegmentsMap) and re-calculates segmentToEnabledInstancesMap and unavailableSegments based on the cached states.

        Specified by:
        onAssignmentChange in interface InstanceSelector
      • select

        public InstanceSelector.SelectionResult select(org.apache.pinot.common.request.BrokerRequest brokerRequest,
                                                       List<String> segments,
                                                       long requestId)
        Description copied from interface: InstanceSelector
        Selects the server instances for the given segments queried by the given broker request, returns a map from segment to selected server instance hosting the segment and a set of unavailable segments (no enabled instance or all enabled instances are in ERROR state).
        Specified by:
        select in interface InstanceSelector
        Parameters:
        brokerRequest - BrokerRequest for the query
        segments - segments for which an instance needs to be selected
        requestId - requestId generated by the Broker for a query
        Returns:
        instance of SelectionResult which describes the instance to pick for a given segment
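
        The shape of the returned value (a segment-to-instance map plus a set of unavailable segments) can be sketched as below. The Result class and its field names are hypothetical stand-ins and are not the accessors of InstanceSelector.SelectionResult; the index rule is the same request-ID assumption as in the class description, and only the "no enabled instance" case of unavailability is modelled.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Illustrative sketch only; not the actual Pinot implementation.
class SelectionResultSketch {

    // Hypothetical stand-in for the selection result described above.
    static final class Result {
        final Map<String, String> segmentToInstance = new LinkedHashMap<>();
        final Set<String> unavailableSegments = new LinkedHashSet<>();
    }

    // Hypothetical helper: segments without any enabled instance are reported
    // as unavailable instead of being mapped to a server.
    static Result select(List<String> segments,
                         Map<String, List<String>> segmentToEnabledInstances,
                         long requestId) {
        Result result = new Result();
        for (String segment : segments) {
            List<String> enabledInstances = segmentToEnabledInstances.getOrDefault(
                segment, Collections.<String>emptyList());
            if (enabledInstances.isEmpty()) {
                result.unavailableSegments.add(segment);
                continue;
            }
            int index = (int) Math.floorMod(requestId, (long) enabledInstances.size());
            result.segmentToInstance.put(segment, enabledInstances.get(index));
        }
        return result;
    }

    public static void main(String[] args) {
        Map<String, List<String>> segmentToEnabledInstances = new LinkedHashMap<>();
        segmentToEnabledInstances.put("seg_0", Arrays.asList("S1", "S2"));
        segmentToEnabledInstances.put("seg_1", Collections.<String>emptyList());
        Result result = select(Arrays.asList("seg_0", "seg_1"), segmentToEnabledInstances, 1L);
        System.out.println(result.segmentToInstance + " unavailable=" + result.unavailableSegments);
    }
}
```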