Class Remark


  • public class Remark
    extends Object
    The class that manages converting HTML to Markdown.

    If you plan to reuse this class, keep a single instance rather than creating a new one for each conversion; this gives better performance. The class is thread-safe, but an instance can only process one document at a time.
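
    The reuse advice above can be sketched as follows. This is a minimal sketch, not part of the Remark API: SharedConverterDemo and convertAll are hypothetical names, and the converter is passed in as a Function<String, String> so that the example compiles without the library on the classpath. In real use you would presumably pass a method reference such as remark::convertFragment taken from one long-lived Remark instance. Because an instance handles one document at a time, worker threads that share it will likely take turns inside the converter.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.function.Function;

// Hypothetical helper, not part of Remark: converts many HTML snippets
// through ONE shared converter instead of building a converter per document.
final class SharedConverterDemo {

    // 'convert' stands in for a method reference on a single reused Remark
    // instance, e.g. remark::convertFragment (an assumption of this sketch).
    static List<String> convertAll(List<String> htmlSnippets,
                                   Function<String, String> convert,
                                   int threads) {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            // Submit every snippet; all tasks share the same converter.
            List<Future<String>> pending = new ArrayList<>();
            for (String html : htmlSnippets) {
                pending.add(pool.submit(() -> convert.apply(html)));
            }
            // Collect the results in input order.
            List<String> results = new ArrayList<>();
            for (Future<String> f : pending) {
                results.add(f.get());
            }
            return results;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("Interrupted while converting", e);
        } catch (ExecutionException e) {
            throw new IllegalStateException("Conversion failed", e.getCause());
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        // Stand-in converter so the sketch runs without the library.
        Function<String, String> stub =
                html -> html.replace("<b>", "**").replace("</b>", "**");
        System.out.println(convertAll(List.of("<b>one</b>", "<b>two</b>"), stub, 2));
    }
}
```

    Passing the converter as a function keeps the sketch independent of the library; the point is only that every task receives the same instance.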

    Usage:

    Basic usage involves instantiating this class with a specific set of options, and calling one of the convert* methods on some form of input.

    Examples:

     // Create a generic remark that converts to pure-Markdown spec. 
     Remark remark = new Remark();
     String cleanedUp = remark.convertFragment(inputString);
     
     // Create a remark that converts to pegdown with all extensions enabled. 
     Remark pegdownAll = new Remark(Options.pegdownAllExtensions());
     cleanedUp = pegdownAll.convert(new URL("http://www.example.com"), 15000);
     
     // Stream the conversion to an OutputStream.
     pegdownAll.withOutputStream(System.out).convert(new URL("http://www.overzealous.com"), 15000);
     
    Author:
    Phil DeJarnett
    • Constructor Detail

      • Remark

        public Remark()
        Creates a default, pure Markdown-compatible Remark instance.
      • Remark

        public Remark​(Options options)
        Creates a Remark instance with the specified options.
        Parameters:
        options - Specified options to use on this instance. See the docs for the Options class for common options sets.
    • Method Detail

      • getConverter

        public DocumentConverter getConverter()
        Provides access to the DocumentConverter for customization.
        Returns:
        the configured DocumentConverter.
      • isCleanedHtmlEchoed

        public boolean isCleanedHtmlEchoed()
        Returns true if the cleaned HTML document is echoed to System.out.
        Returns:
        true if the cleaned HTML document is echoed
      • setCleanedHtmlEchoed

        public void setCleanedHtmlEchoed​(boolean cleanedHtmlEchoed)
        To see the cleaned and processed HTML document, set this to true. It will be rendered to System.out for debugging purposes.
        Parameters:
        cleanedHtmlEchoed - true to echo out the cleaned HTML document
      • withWriter

        public Remark withWriter​(Writer writer)
        Use this method in a chain to stream the output to a Writer. The returned instance can be saved and reused to write repeatedly to the same stream.

        Note: The convert methods on the returned instance always return null.

        Note: The caller is responsible for closing the writer!

        Example:

        new Remark(options).withWriter(myWriter).convert(htmlText);
        Parameters:
        writer - Writer to receive the converted output
        Returns:
        A Remark that writes to streams.
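
        The note about closing the writer can be illustrated with try-with-resources. This is a sketch under stated assumptions: WriterOwnershipDemo and writeMarkdown are hypothetical names, and the streaming conversion is passed in as a BiConsumer<Writer, String> so the example runs without the library. In real use the lambda would presumably be (w, html) -> remark.withWriter(w).convert(html), ignoring the null return value.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.io.Writer;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.function.BiConsumer;

// Hypothetical helper, not part of Remark: the caller opens the writer,
// lends it to a streaming conversion, and closes it afterwards.
final class WriterOwnershipDemo {

    // 'streamingConvert' stands in for a call such as
    // remark.withWriter(out).convert(html) (an assumption of this sketch).
    static void writeMarkdown(Path target, String html,
                              BiConsumer<Writer, String> streamingConvert) {
        // try-with-resources closes the writer even if the conversion throws.
        try (Writer out = Files.newBufferedWriter(target, StandardCharsets.UTF_8)) {
            streamingConvert.accept(out, html);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) throws IOException {
        Path target = Files.createTempFile("remark-demo", ".md");
        // Stand-in conversion that just strips tags, so the sketch runs
        // without the library.
        writeMarkdown(target, "<p>hello</p>", (out, html) -> {
            try {
                out.write(html.replaceAll("<[^>]+>", ""));
            } catch (IOException e) {
                throw new UncheckedIOException(e);
            }
        });
        System.out.println(new String(Files.readAllBytes(target), StandardCharsets.UTF_8));
        Files.delete(target);
    }
}
```

        Keeping the open and close in the same method makes the ownership explicit: the conversion only borrows the writer.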
      • withOutputStream

        public Remark withOutputStream​(OutputStream os)
        Use this method in a chain to stream the output to an OutputStream. The returned instance can be saved and reused to write repeatedly to the same stream.

        Note: The convert methods on the returned instance always return null.

        Note: The caller is responsible for closing the stream!

        Example:

        new Remark(options).withOutputStream(myOut).convert(htmlText);
        Parameters:
        os - OutputStream to receive the converted output
        Returns:
        A Remark that writes to streams.
      • convert

        public String convert​(URL url,
                              int timeoutMillis)
                       throws IOException
        Converts an HTML document retrieved from a URL to Markdown.
        Parameters:
        url - URL to connect to.
        timeoutMillis - Maximum time to wait, in milliseconds, before giving up on the connection.
        Returns:
        Markdown text.
        Throws:
        IOException - If an error occurs while retrieving the document.
        See Also:
        Jsoup.parse(URL, int)
      • convert

        public String convert​(File file)
                       throws IOException
        Converts an HTML file to Markdown.
        Parameters:
        file - The file to load.
        Returns:
        Markdown text.
        Throws:
        IOException - If an error occurs while loading the file.
        See Also:
        Jsoup.parse(File, String, String)
      • convert

        public String convert​(File file,
                              String charset)
                       throws IOException
        Converts an HTML file to Markdown.
        Parameters:
        file - The file to load.
        charset - The charset of the file. Set to null to detect it from the http-equiv meta tag, if present, falling back to UTF-8 (which is often safe to do).
        Returns:
        Markdown text.
        Throws:
        IOException - If an error occurs while loading the file.
        See Also:
        Jsoup.parse(File, String, String)
      • convert

        public String convert​(File file,
                              String charset,
                              String baseUri)
                       throws IOException
        Converts an HTML file to Markdown.
        Parameters:
        file - The file to load.
        charset - The charset of the file. Set to null to detect it from the http-equiv meta tag, if present, falling back to UTF-8 (which is often safe to do).
        baseUri - The base URI for resolving relative links.
        Returns:
        Markdown text.
        Throws:
        IOException - If an error occurs while loading the file.
        See Also:
        Jsoup.parse(File, String, String)
      • convert

        public String convert​(String html)
        Converts HTML in memory to Markdown.
        Parameters:
        html - The HTML string to convert.
        Returns:
        Markdown text.
        See Also:
        Jsoup.parse(String, String)
      • convert

        public String convert​(String html,
                              String baseUri)
        Converts HTML in memory to Markdown.
        Parameters:
        html - The HTML string to convert.
        baseUri - The base URI for resolving relative links.
        Returns:
        Markdown text.
        See Also:
        Jsoup.parse(String, String)
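
        To illustrate what resolving relative links against a base URI means, here is a small sketch using only java.net.URI from the JDK. It does not call Remark or jsoup, whose resolution rules may differ in edge cases; BaseUriDemo and resolve are hypothetical names, and the URLs are example values.

```java
import java.net.URI;

// Hypothetical helper, not part of Remark: shows how a link found in the
// HTML is combined with a base URI to produce an absolute link.
final class BaseUriDemo {

    // Resolves 'href' against 'baseUri' using the JDK's URI resolution.
    static String resolve(String baseUri, String href) {
        return URI.create(baseUri).resolve(href).toString();
    }

    public static void main(String[] args) {
        String base = "http://www.example.com/docs/index.html";
        // Relative path: resolved against the directory of the base document.
        System.out.println(resolve(base, "images/a.png"));
        // prints http://www.example.com/docs/images/a.png
        // Root-relative path: resolved against the host.
        System.out.println(resolve(base, "/top.html"));
        // prints http://www.example.com/top.html
        // Absolute link: returned unchanged.
        System.out.println(resolve(base, "http://other.example.org/x"));
        // prints http://other.example.org/x
    }
}
```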
      • convertFragment

        public String convertFragment​(String body)
        Converts an HTML body fragment to Markdown.
        Parameters:
        body - The HTML fragment string to convert.
        Returns:
        Markdown text.
        See Also:
        Jsoup.parseBodyFragment(String, String)
      • convertFragment

        public String convertFragment​(String body,
                                      String baseUri)
        Converts an HTML body fragment to Markdown.
        Parameters:
        body - The HTML fragment string to convert.
        baseUri - The base URI for resolving relative links.
        Returns:
        Markdown text.
        See Also:
        Jsoup.parseBodyFragment(String, String)
      • convert

        public String convert​(org.jsoup.nodes.Document doc)
        Converts an already-loaded JSoup Document to Markdown.
        Parameters:
        doc - Document to be processed
        Returns:
        Markdown text.