Class TextCleaner


  • public class TextCleaner
    extends Object
    This class is used to clean up plain text fields based on the selected set of options. It optionally escapes certain special characters, as well as replacing various HTML and Unicode entities with their plaintext equivalents.
    Author:
    Phil DeJarnett
    • Constructor Detail

      • TextCleaner

        public TextCleaner(Options options)
        Create a new TextCleaner based on the configured options.
        Parameters:
        options - Options that will affect what is cleaned.
    • Method Detail

      • clean

        public String clean(Object input)
        Clean the given input text based on the original configuration Options. Newlines are also replaced with a single space.
        Parameters:
        input - The text to be cleaned. Can be any object. JSoup nodes are handled specially.
        Returns:
        The cleaned text.
      • cleanCode

        public String cleanCode(Object input)
        Clean the given input text based on the original configuration Options. The text is treated as code, so it is not escaped, and newlines are preserved.
        Parameters:
        input - The text to be cleaned. Can be any object. JSoup nodes are handled specially.
        Returns:
        The cleaned text.
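        The difference between clean and cleanCode can be illustrated with a small standalone sketch. This is not the library's implementation: the class CleanModesSketch, its method names, the entity table, and the set of escaped characters are all assumptions chosen for illustration, and the special handling of JSoup nodes is omitted.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical stand-in for illustration only; not the library's TextCleaner. */
final class CleanModesSketch {
    // Assumed entity table: a few HTML entities and their plaintext equivalents.
    private static final Map<String, String> ENTITIES = new LinkedHashMap<>();
    static {
        ENTITIES.put("&lt;", "<");
        ENTITIES.put("&gt;", ">");
        ENTITIES.put("&quot;", "\"");
        ENTITIES.put("&nbsp;", " ");
        // Replaced last, so that "&amp;lt;" becomes "&lt;" rather than "<".
        ENTITIES.put("&amp;", "&");
    }

    private static String replaceEntities(String text) {
        String out = text;
        for (Map.Entry<String, String> e : ENTITIES.entrySet()) {
            out = out.replace(e.getKey(), e.getValue());
        }
        return out;
    }

    /** Prose mode: replace entities, escape special characters, turn newlines into a space. */
    static String cleanProse(Object input) {
        String text = replaceEntities(String.valueOf(input));
        // Assumed escape set: backslash, backtick, asterisk, underscore, square brackets.
        text = text.replaceAll("([\\\\`*_\\[\\]])", "\\\\$1");
        return text.replaceAll("(\\r?\\n)+", " ");
    }

    /** Code mode: replace entities only; no escaping, newlines preserved. */
    static String cleanAsCode(Object input) {
        return replaceEntities(String.valueOf(input));
    }
}
```

        Under these assumptions, the input "a &lt; b", a newline, then "*c*" becomes the single line "a < b \*c\*" in prose mode, while code mode returns "a < b", a newline, then "*c*" unchanged.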
      • cleanInlineCode

        public String cleanInlineCode(Object input)
        Cleans inline code and, if necessary, adds spaces so that internal, leading, or trailing '`' characters don't break the inline code. Newlines are replaced with spaces. This method also adds the leading and trailing '`' or '```' delimiters as necessary.
        Parameters:
        input - String to clean. Can be any object. JSoup nodes are handled specially.
        Returns:
        The cleaned text.
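        The backtick handling described for cleanInlineCode can be sketched as follows. The sketch uses the general Markdown rule of picking a delimiter one backtick longer than the longest run inside the text, which is a generalization: the documentation above only mentions '`' and '```'. The class and method names are hypothetical, and padding both sides when either edge is a backtick is an assumption based on how CommonMark strips one space from each side of a code span only when both are present.

```java
/** Hypothetical helper for illustration only; not the library's implementation. */
final class InlineCodeSketch {
    static String wrapInlineCode(String code) {
        // Inline code cannot span lines, so newlines become spaces.
        String text = code.replaceAll("(\\r?\\n)+", " ");

        // Find the longest run of consecutive backticks inside the text.
        int longest = 0;
        int run = 0;
        for (int i = 0; i < text.length(); i++) {
            run = (text.charAt(i) == '`') ? run + 1 : 0;
            longest = Math.max(longest, run);
        }

        // The delimiter must be one backtick longer than any run it encloses.
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i <= longest; i++) {
            sb.append('`');
        }
        String delim = sb.toString();

        // Pad with spaces when the text starts or ends with a backtick,
        // so the edge backtick is not absorbed into the delimiter.
        boolean edgeTick = !text.isEmpty()
                && (text.charAt(0) == '`' || text.charAt(text.length() - 1) == '`');
        String pad = edgeTick ? " " : "";

        return delim + pad + text + pad + delim;
    }
}
```

        With this sketch, "plain" is wrapped in single backticks, "a`b" is wrapped in double backticks, and "`x" is wrapped in double backticks with a space on each side.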
      • unescapeLeadingCharacters

        public String unescapeLeadingCharacters(String input)
        Removes the escaping on leading characters, for example when the text is going to be rendered inside another node, such as a table.
        Parameters:
        input - String to process
        Returns:
        Cleaned string.
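        As a rough illustration of what removing leading escapes can look like, the sketch below strips a backslash that precedes a block-level Markdown marker at the start of a line. The set of markers (-, #, >, +, *) and the class and method names are assumptions; the characters the library actually treats as leading characters are not listed in this documentation.

```java
/** Hypothetical helper for illustration only; not the library's implementation. */
final class LeadingEscapeSketch {
    static String unescapeLeading(String input) {
        // (?m) makes ^ match at the start of every line.
        // Group 1 keeps any indentation; group 2 is the marker that was escaped.
        // Escapes elsewhere in the line are left untouched.
        return input.replaceAll("(?m)^([ \\t]*)\\\\([-#>+*])", "$1$2");
    }
}
```

        Here the text "\# Title" becomes "# Title", while "a \# b" is returned unchanged because the escaped character is not at the start of the line.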
      • cleanUrl

        public String cleanUrl(String input)
        Handles escaping special characters in URLs to avoid issues when they are rendered out (e.g., spaces, parentheses).
        Parameters:
        input - URL to process
        Returns:
        Cleaned URL
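        Spaces and parentheses matter because a Markdown link has the form [text](url), where an unescaped ')' ends the destination early and a space can split it. The sketch below shows one way to neutralize them using percent-encoding. Whether the library percent-encodes or escapes in some other way is not stated here, so the encoding choice and the class and method names are assumptions.

```java
/** Hypothetical helper for illustration only; not the library's implementation. */
final class UrlEscapeSketch {
    static String escapeForMarkdownLink(String url) {
        StringBuilder out = new StringBuilder(url.length() + 8);
        for (int i = 0; i < url.length(); i++) {
            char c = url.charAt(i);
            switch (c) {
                case ' ':
                    out.append("%20"); // a space would split the link destination
                    break;
                case '(':
                    out.append("%28");
                    break;
                case ')':
                    out.append("%29"); // a ')' would close the link early
                    break;
                default:
                    out.append(c);
            }
        }
        return out.toString();
    }
}
```

        For example, "http://example.com/a b_(c).html" becomes "http://example.com/a%20b_%28c%29.html"; all other characters pass through unchanged.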