Class Splitter


  • @GwtCompatible(emulated=true)
    public final class Splitter
    extends java.lang.Object
    Extracts non-overlapping substrings from an input string, typically by recognizing appearances of a separator sequence. This separator can be specified as a single character, fixed string, regular expression or CharMatcher instance. Or, instead of using a separator at all, a splitter can extract adjacent substrings of a given fixed length.

    For example, this expression:

       Splitter.on(',').split("foo,bar,qux")

    ... produces an Iterable containing "foo", "bar" and "qux", in that order.

    By default, Splitter's behavior is simplistic and unassuming. The following expression:

       Splitter.on(',').split(" foo,,,  bar ,")

    ... yields the substrings [" foo", "", "", "  bar ", ""]. If this is not the desired behavior, use configuration methods to obtain a new splitter instance with modified behavior:

       private static final Splitter MY_SPLITTER = Splitter.on(',').trimResults().omitEmptyStrings();

    Now MY_SPLITTER.split("foo,,, bar ,") returns just ["foo", "bar"]. Note that the order in which these configuration methods are called is never significant.

    Warning: Splitter instances are immutable. Invoking a configuration method has no effect on the receiving instance; you must store and use the new splitter instance it returns instead.

       // Do NOT do this
       Splitter splitter = Splitter.on('/');
       splitter.trimResults(); // does nothing!
       return splitter.split("wrong / wrong / wrong");
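
    As a minimal sketch of the warning above, the following contrasts discarding the configured splitter with keeping it. The class and method names are illustrative only and are not part of Guava.

```java
import com.google.common.base.Splitter;
import com.google.common.collect.ImmutableList;
import java.util.List;

final class SplitterImmutabilityExample {
  // Correct: keep and use the splitter returned by trimResults().
  static List<String> splitCorrectly(String input) {
    Splitter splitter = Splitter.on('/').trimResults();
    return ImmutableList.copyOf(splitter.split(input));
  }

  // Incorrect: the configured splitter is discarded, so nothing is trimmed.
  static List<String> splitIncorrectly(String input) {
    Splitter splitter = Splitter.on('/');
    splitter.trimResults(); // return value ignored; 'splitter' is unchanged
    return ImmutableList.copyOf(splitter.split(input));
  }

  public static void main(String[] args) {
    System.out.println(splitCorrectly("a / b / c"));   // elements have no surrounding spaces
    System.out.println(splitIncorrectly("a / b / c")); // elements keep their surrounding spaces
  }
}
```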
     

    For separator-based splitters that do not use omitEmptyStrings, an input string containing n occurrences of the separator naturally yields an iterable of size n + 1. So if the separator does not occur anywhere in the input, a single substring is returned containing the entire input. Consequently, all splitters split the empty string to [""] (note: even fixed-length splitters).
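
    The counting rule above can be checked with a short sketch. The helper pieceCount is a hypothetical name introduced here for illustration; Iterables.size is Guava's own utility.

```java
import com.google.common.base.Splitter;
import com.google.common.collect.Iterables;

final class SplitterPieceCountExample {
  // Counts how many substrings a splitter produces for an input.
  static int pieceCount(Splitter splitter, String input) {
    return Iterables.size(splitter.split(input));
  }

  public static void main(String[] args) {
    Splitter comma = Splitter.on(',');
    System.out.println(pieceCount(comma, "a,b,c"));                // 2 separators -> 3
    System.out.println(pieceCount(comma, "abc"));                  // 0 separators -> 1
    System.out.println(pieceCount(comma, ""));                     // 1: the empty string itself
    System.out.println(pieceCount(Splitter.fixedLength(3), ""));   // 1: fixed-length splitters too
    System.out.println(pieceCount(comma.omitEmptyStrings(), ""));  // 0: the empty piece is omitted
  }
}
```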

    Splitter instances are thread-safe and immutable, and are therefore safe to store as static final constants.

    The Joiner class provides the inverse operation to splitting, but note that a round-trip between the two should be assumed to be lossy.
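
    One way to see why the round trip is lossy is sketched below, assuming a comma separator on both sides. The class and method names are illustrative only.

```java
import com.google.common.base.Joiner;
import com.google.common.base.Splitter;
import com.google.common.collect.Iterables;
import java.util.Arrays;
import java.util.List;

final class SplitJoinRoundTripExample {
  private static final Splitter SPLITTER = Splitter.on(',').trimResults().omitEmptyStrings();
  private static final Joiner JOINER = Joiner.on(',');

  // Split, then join: whitespace and empty pieces dropped by the splitter are not restored.
  static String splitThenJoin(String input) {
    return JOINER.join(SPLITTER.split(input));
  }

  // Join, then split: a part that itself contains the separator is split again.
  static int joinThenSplitCount(List<String> parts) {
    return Iterables.size(Splitter.on(',').split(JOINER.join(parts)));
  }

  public static void main(String[] args) {
    System.out.println(splitThenJoin(" a, b,,c "));                    // a,b,c
    System.out.println(joinThenSplitCount(Arrays.asList("a,b", "c"))); // 3, although 2 parts went in
  }
}
```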

    See the Guava User Guide article on Splitter.

    Since:
    1.0
    • Nested Class Summary

      Nested Classes 
      Modifier and Type Class Description
      static class  Splitter.MapSplitter
      An object that splits strings into maps as Splitter splits iterables and lists.
    • Method Summary

      Modifier and Type Method Description
      static Splitter fixedLength​(int length)
      Returns a splitter that divides strings into pieces of the given length.
      Splitter limit​(int limit)
      Returns a splitter that behaves equivalently to this splitter but stops splitting after it reaches the limit.
      Splitter omitEmptyStrings()
      Returns a splitter that behaves equivalently to this splitter, but automatically omits empty strings from the results.
      static Splitter on​(char separator)
      Returns a splitter that uses the given single-character separator.
      static Splitter on​(CharMatcher separatorMatcher)
      Returns a splitter that considers any single character matched by the given CharMatcher to be a separator.
      static Splitter on​(java.lang.String separator)
      Returns a splitter that uses the given fixed string as a separator.
      static Splitter on​(java.util.regex.Pattern separatorPattern)
      Returns a splitter that considers any subsequence matching the given pattern to be a separator.
      static Splitter onPattern​(java.lang.String separatorPattern)
      Returns a splitter that considers any subsequence matching a given pattern (regular expression) to be a separator.
      java.lang.Iterable<java.lang.String> split​(java.lang.CharSequence sequence)
      Splits sequence into string components and makes them available through an Iterator, which may be lazily evaluated.
      java.util.List<java.lang.String> splitToList​(java.lang.CharSequence sequence)
      Splits sequence into string components and returns them as an immutable list.
      Splitter trimResults()
      Returns a splitter that behaves equivalently to this splitter, but automatically removes leading and trailing whitespace from each returned substring; equivalent to trimResults(CharMatcher.WHITESPACE).
      Splitter trimResults​(CharMatcher trimmer)
      Returns a splitter that behaves equivalently to this splitter, but removes all leading or trailing characters matching the given CharMatcher from each returned substring.
      Splitter.MapSplitter withKeyValueSeparator​(char separator)
      Returns a MapSplitter which splits entries based on this splitter, and splits entries into keys and values using the specified separator.
      Splitter.MapSplitter withKeyValueSeparator​(Splitter keyValueSplitter)
      Returns a MapSplitter which splits entries based on this splitter, and splits entries into keys and values using the specified key-value splitter.
      Splitter.MapSplitter withKeyValueSeparator​(java.lang.String separator)
      Returns a MapSplitter which splits entries based on this splitter, and splits entries into keys and values using the specified separator.
      • Methods inherited from class java.lang.Object

        clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
    • Method Detail

      • on

        public static Splitter on​(char separator)
        Returns a splitter that uses the given single-character separator. For example, Splitter.on(',').split("foo,,bar") returns an iterable containing ["foo", "", "bar"].
        Parameters:
        separator - the character to recognize as a separator
        Returns:
        a splitter, with default settings, that recognizes that separator
      • on

        public static Splitter on​(CharMatcher separatorMatcher)
        Returns a splitter that considers any single character matched by the given CharMatcher to be a separator. For example, Splitter.on(CharMatcher.anyOf(";,")).split("foo,;bar,quux") returns an iterable containing ["foo", "", "bar", "quux"].
        Parameters:
        separatorMatcher - a CharMatcher that determines whether a character is a separator
        Returns:
        a splitter, with default settings, that uses this matcher
      • on

        public static Splitter on​(java.lang.String separator)
        Returns a splitter that uses the given fixed string as a separator. For example, Splitter.on(", ").split("foo, bar,baz") returns an iterable containing ["foo", "bar,baz"].
        Parameters:
        separator - the literal, nonempty string to recognize as a separator
        Returns:
        a splitter, with default settings, that recognizes that separator
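
        To compare the three literal-separator factories described so far, the sketch below applies each to the same input. The class and method names are hypothetical; only the Splitter and CharMatcher calls come from the library.

```java
import com.google.common.base.CharMatcher;
import com.google.common.base.Splitter;
import com.google.common.collect.ImmutableList;
import java.util.List;

final class SeparatorKindsExample {
  // Single character: only ',' separates.
  static List<String> byChar(String s) {
    return ImmutableList.copyOf(Splitter.on(',').split(s));
  }

  // CharMatcher: any one of ';' or ',' separates.
  static List<String> byMatcher(String s) {
    return ImmutableList.copyOf(Splitter.on(CharMatcher.anyOf(";,")).split(s));
  }

  // Fixed string: only the exact two-character sequence ", " separates.
  static List<String> byString(String s) {
    return ImmutableList.copyOf(Splitter.on(", ").split(s));
  }

  public static void main(String[] args) {
    String input = "foo, bar;baz";
    System.out.println(byChar(input).size());    // 2
    System.out.println(byMatcher(input).size()); // 3
    System.out.println(byString(input));         // [foo, bar;baz]
  }
}
```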
      • on

        @GwtIncompatible("java.util.regex")
        public static Splitter on​(java.util.regex.Pattern separatorPattern)
        Returns a splitter that considers any subsequence matching the given pattern to be a separator. For example, Splitter.on(Pattern.compile("\r?\n")).split(entireFile) splits a string into lines whether it uses DOS-style or UNIX-style line terminators.
        Parameters:
        separatorPattern - the pattern that determines whether a subsequence is a separator. This pattern may not match the empty string.
        Returns:
        a splitter, with default settings, that uses this pattern
        Throws:
        java.lang.IllegalArgumentException - if separatorPattern matches the empty string
      • onPattern

        @GwtIncompatible("java.util.regex")
        public static Splitter onPattern​(java.lang.String separatorPattern)
        Returns a splitter that considers any subsequence matching a given pattern (regular expression) to be a separator. For example, Splitter.onPattern("\r?\n").split(entireFile) splits a string into lines whether it uses DOS-style or UNIX-style line terminators. This is equivalent to Splitter.on(Pattern.compile(separatorPattern)).
        Parameters:
        separatorPattern - the pattern that determines whether a subsequence is a separator. This pattern may not match the empty string.
        Returns:
        a splitter, with default settings, that uses this pattern
        Throws:
        java.util.regex.PatternSyntaxException - if separatorPattern is a malformed expression
        java.lang.IllegalArgumentException - if separatorPattern matches the empty string
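
        A small sketch of the two pattern-based factories follows. The class name, method names, and sample inputs are assumptions made for illustration.

```java
import com.google.common.base.Splitter;
import com.google.common.collect.ImmutableList;
import java.util.List;
import java.util.regex.Pattern;

final class PatternSplitterExample {
  // Matches both DOS-style (\r\n) and UNIX-style (\n) line terminators.
  private static final Splitter LINES = Splitter.on(Pattern.compile("\r?\n"));

  static List<String> lines(String text) {
    return ImmutableList.copyOf(LINES.split(text));
  }

  // Returns true if onPattern refuses the expression. PatternSyntaxException is a
  // subclass of IllegalArgumentException, so malformed expressions are caught here too.
  static boolean isRejected(String regex) {
    try {
      Splitter.onPattern(regex);
      return false;
    } catch (IllegalArgumentException e) {
      return true;
    }
  }

  public static void main(String[] args) {
    System.out.println(lines("one\r\ntwo\nthree")); // [one, two, three]
    System.out.println(isRejected("x*"));           // true: "x*" matches the empty string
    System.out.println(isRejected(","));            // false
  }
}
```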
      • fixedLength

        public static Splitter fixedLength​(int length)
        Returns a splitter that divides strings into pieces of the given length. For example, Splitter.fixedLength(2).split("abcde") returns an iterable containing ["ab", "cd", "e"]. The last piece can be smaller than length but will never be empty.

        Exception: for consistency with separator-based splitters, split("") does not yield an empty iterable, but an iterable containing "". This is the only case in which Iterables.size(split(input)) does not equal IntMath.divide(input.length(), length, CEILING). To avoid this behavior, use omitEmptyStrings.

        Parameters:
        length - the desired length of pieces after splitting, a positive integer
        Returns:
        a splitter, with default settings, that can split into fixed sized pieces
        Throws:
        java.lang.IllegalArgumentException - if length is zero or negative
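
        The relationship between piece count and the ceiling division mentioned above can be sketched as follows. The class and its two helper methods are hypothetical names used only for this illustration.

```java
import com.google.common.base.Splitter;
import com.google.common.collect.ImmutableList;
import com.google.common.math.IntMath;
import java.math.RoundingMode;
import java.util.List;

final class FixedLengthExample {
  static List<String> chunks(String input, int length) {
    return ImmutableList.copyOf(Splitter.fixedLength(length).split(input));
  }

  // The count predicted by ceil(input.length() / length).
  static int predictedCount(String input, int length) {
    return IntMath.divide(input.length(), length, RoundingMode.CEILING);
  }

  public static void main(String[] args) {
    System.out.println(chunks("abcde", 2));         // [ab, cd, e]
    System.out.println(predictedCount("abcde", 2)); // 3, matches the list size
    System.out.println(chunks("", 2).size());       // 1: the documented exception
    System.out.println(predictedCount("", 2));      // 0
  }
}
```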
      • omitEmptyStrings

        @CheckReturnValue
        public Splitter omitEmptyStrings()
        Returns a splitter that behaves equivalently to this splitter, but automatically omits empty strings from the results. For example, Splitter.on(',').omitEmptyStrings().split(",a,,,b,c,,") returns an iterable containing only ["a", "b", "c"].

        If either trimResults option is also specified when creating a splitter, that splitter always trims results first before checking for emptiness. So, for example, Splitter.on(':').omitEmptyStrings().trimResults().split(": : : ") returns an empty iterable.

        Note that it is ordinarily not possible for split(CharSequence) to return an empty iterable, but when using this option, it can (if the input sequence consists of nothing but separators).

        Returns:
        a splitter with the desired configuration
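
        The interaction between trimming and omitting can be illustrated with a brief sketch. The class and method names are illustrative assumptions.

```java
import com.google.common.base.Splitter;
import com.google.common.collect.ImmutableList;
import java.util.List;

final class OmitEmptyStringsExample {
  // Omits empty pieces only; whitespace-only pieces are kept.
  static List<String> omitOnly(char separator, String input) {
    return ImmutableList.copyOf(Splitter.on(separator).omitEmptyStrings().split(input));
  }

  // Trims each piece first, then omits those that became empty.
  static List<String> trimAndOmit(char separator, String input) {
    return ImmutableList.copyOf(
        Splitter.on(separator).omitEmptyStrings().trimResults().split(input));
  }

  public static void main(String[] args) {
    System.out.println(omitOnly(',', ",a,,,b,c,,"));      // [a, b, c]
    System.out.println(omitOnly(':', ": : : ").size());    // 3: each piece is a single space
    System.out.println(trimAndOmit(':', ": : : ").size()); // 0: an empty iterable
  }
}
```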
      • limit

        @CheckReturnValue
        public Splitter limit​(int limit)
        Returns a splitter that behaves equivalently to this splitter but stops splitting after it reaches the limit. The limit defines the maximum number of items returned by the iterator.

        For example, Splitter.on(',').limit(3).split("a,b,c,d") returns an iterable containing ["a", "b", "c,d"]. When omitting empty strings, the omitted strings do not count. Hence, Splitter.on(',').limit(3).omitEmptyStrings().split("a,,,b,,,c,d") returns an iterable containing ["a", "b", "c,d"]. When trim is requested, all entries, including the last, are trimmed. Hence, Splitter.on(',').limit(3).trimResults().split(" a , b , c , d ") results in ["a", "b", "c , d"].

        Parameters:
        limit - the maximum number of items returned
        Returns:
        a splitter with the desired configuration
        Since:
        9.0
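
        The three limit cases described above can be reproduced with a sketch such as the following. The class and method names are hypothetical.

```java
import com.google.common.base.Splitter;
import com.google.common.collect.ImmutableList;
import java.util.List;

final class LimitExample {
  static List<String> plain(String input, int limit) {
    return ImmutableList.copyOf(Splitter.on(',').limit(limit).split(input));
  }

  static List<String> omittingEmpty(String input, int limit) {
    return ImmutableList.copyOf(
        Splitter.on(',').limit(limit).omitEmptyStrings().split(input));
  }

  static List<String> trimmed(String input, int limit) {
    return ImmutableList.copyOf(Splitter.on(',').limit(limit).trimResults().split(input));
  }

  public static void main(String[] args) {
    System.out.println(plain("a,b,c,d", 3).get(2));             // c,d
    System.out.println(omittingEmpty("a,,,b,,,c,d", 3).get(2)); // c,d
    System.out.println(trimmed(" a , b , c , d ", 3).get(2));   // c , d
    System.out.println(plain("a,b", 1));                        // [a,b]: one item, the whole input
  }
}
```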
      • trimResults

        @CheckReturnValue
        public Splitter trimResults()
        Returns a splitter that behaves equivalently to this splitter, but automatically removes leading and trailing whitespace from each returned substring; equivalent to trimResults(CharMatcher.WHITESPACE). For example, Splitter.on(',').trimResults().split(" a, b ,c ") returns an iterable containing ["a", "b", "c"].
        Returns:
        a splitter with the desired configuration
      • trimResults

        @CheckReturnValue
        public Splitter trimResults​(CharMatcher trimmer)
        Returns a splitter that behaves equivalently to this splitter, but removes all leading or trailing characters matching the given CharMatcher from each returned substring. For example, Splitter.on(',').trimResults(CharMatcher.is('_')).split("_a ,_b_ ,c__") returns an iterable containing ["a ", "b_ ", "c"].
        Parameters:
        trimmer - a CharMatcher that determines whether a character should be removed from the beginning/end of a subsequence
        Returns:
        a splitter with the desired configuration
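
        As a sketch of how the choice of trimmer changes the result, the example below trims the same input with two different matchers. The class and method names are assumptions for illustration.

```java
import com.google.common.base.CharMatcher;
import com.google.common.base.Splitter;
import com.google.common.collect.ImmutableList;
import java.util.List;

final class TrimResultsExample {
  static List<String> splitAndTrim(String input, CharMatcher trimmer) {
    return ImmutableList.copyOf(Splitter.on(',').trimResults(trimmer).split(input));
  }

  public static void main(String[] args) {
    String input = "_a ,_b_ ,c__";
    // Only underscores are trimmed; a trailing space stops the trimming from the right.
    System.out.println(splitAndTrim(input, CharMatcher.is('_')).size());
    // Underscores and spaces are both trimmed.
    System.out.println(splitAndTrim(input, CharMatcher.anyOf("_ "))); // [a, b, c]
  }
}
```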
      • split

        public java.lang.Iterable<java.lang.String> split​(java.lang.CharSequence sequence)
        Splits sequence into string components and makes them available through an Iterator, which may be lazily evaluated. If you want an eagerly computed List, use splitToList(CharSequence).
        Parameters:
        sequence - the sequence of characters to split
        Returns:
        an iteration over the segments split from the parameter.
      • splitToList

        @Beta
        public java.util.List<java.lang.String> splitToList​(java.lang.CharSequence sequence)
        Splits sequence into string components and returns them as an immutable list. If you want an Iterable which may be lazily evaluated, use split(CharSequence).
        Parameters:
        sequence - the sequence of characters to split
        Returns:
        an immutable list of the segments split from the parameter
        Since:
        15.0
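
        The difference between the two splitting methods might be illustrated as follows. This is a sketch only; the class and method names are hypothetical, and splitToList requires a release that includes it (15.0 or later, per the note above).

```java
import com.google.common.base.Splitter;
import java.util.Iterator;
import java.util.List;

final class SplitVersusSplitToListExample {
  private static final Splitter COMMA = Splitter.on(',');

  // Uses the Iterable from split(): only the first element is requested.
  static String firstField(String line) {
    Iterator<String> fields = COMMA.split(line).iterator();
    return fields.next();
  }

  // Uses splitToList(): all fields are available by index.
  static List<String> allFields(String line) {
    return COMMA.splitToList(line);
  }

  // Returns true if the list refuses modification, as an immutable list should.
  static boolean rejectsModification(List<String> list) {
    try {
      list.add("extra");
      return false;
    } catch (UnsupportedOperationException e) {
      return true;
    }
  }

  public static void main(String[] args) {
    System.out.println(firstField("id,name,email"));                     // id
    System.out.println(allFields("id,name,email").get(2));               // email
    System.out.println(rejectsModification(allFields("id,name,email"))); // true
  }
}
```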
      • withKeyValueSeparator

        @CheckReturnValue
        @Beta
        public Splitter.MapSplitter withKeyValueSeparator​(java.lang.String separator)
        Returns a MapSplitter which splits entries based on this splitter, and splits entries into keys and values using the specified separator.
        Since:
        10.0
      • withKeyValueSeparator

        @CheckReturnValue
        @Beta
        public Splitter.MapSplitter withKeyValueSeparator​(char separator)
        Returns a MapSplitter which splits entries based on this splitter, and splits entries into keys and values using the specified separator.
        Since:
        14.0
      • withKeyValueSeparator

        @CheckReturnValue
        @Beta
        public Splitter.MapSplitter withKeyValueSeparator​(Splitter keyValueSplitter)
        Returns a MapSplitter which splits entries based on this splitter, and splits entries into keys and values using the specified key-value splitter.
        Since:
        10.0
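
        A minimal sketch of map splitting follows, assuming entries separated by ';' and keys separated from values by "=". The class and method names are illustrative. The handling of a malformed entry reflects MapSplitter's documented behavior of throwing IllegalArgumentException, which is described on the MapSplitter page rather than here.

```java
import com.google.common.base.Splitter;
import java.util.Map;

final class KeyValueExample {
  // Entries are split on ';' and trimmed, then each entry is split on "=".
  private static final Splitter.MapSplitter PAIRS =
      Splitter.on(';').trimResults().withKeyValueSeparator("=");

  static Map<String, String> parse(String input) {
    return PAIRS.split(input);
  }

  // Returns true if the input cannot be split into valid key-value entries.
  static boolean isMalformed(String input) {
    try {
      PAIRS.split(input);
      return false;
    } catch (IllegalArgumentException e) {
      return true;
    }
  }

  public static void main(String[] args) {
    Map<String, String> settings = parse("host=example.org; port=8080");
    System.out.println(settings.get("host"));            // example.org
    System.out.println(settings.get("port"));            // 8080
    System.out.println(isMalformed("host=example.org; port")); // true: "port" has no value
  }
}
```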