java - How to split a string, including punctuation marks?
I need to split a string (in Java) with the punctuation marks being stored in the same array as the words:
String sentence = "in the preceding examples, classes derived from...";
String[] split = sentence.split(" ");
I need the split array to be:
split[0] - "in"
split[1] - "the"
split[2] - "preceding"
split[3] - "examples"
split[4] - ","
split[5] - "classes"
split[6] - "derived"
split[7] - "from"
split[8] - "..."
Is there an elegant solution?
You need lookarounds:
String[] split = sentence.split(" ?(?<!\\G)((?<=[^\\p{Punct}])(?=\\p{Punct})|\\b) ?");
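Lookarounds assert a condition at the current position but (importantly here) don't consume any input when matching, so the punctuation itself is never swallowed by the split.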
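To make the pieces of that pattern easier to read, here is a minimal sketch that assembles the same regex from named parts. The class name, the constant names, and the `tokenize` helper are made up for illustration; only the regex itself comes from the answer.

```java
import java.util.Arrays;

class LookaroundSplitDemo {
    // Zero-width: a non-punctuation character behind, a punctuation character ahead.
    static final String BEFORE_PUNCT = "(?<=[^\\p{Punct}])(?=\\p{Punct})";

    // Zero-width: a word boundary, e.g. between a space and a letter.
    static final String WORD_BOUNDARY = "\\b";

    // Zero-width: not at the position where the previous match ended.
    // This avoids an empty token right after a space that was just consumed.
    static final String NOT_AT_LAST_MATCH = "(?<!\\G)";

    // Only the optional spaces on either side are actually consumed.
    static final String SPLIT_REGEX =
            " ?" + NOT_AT_LAST_MATCH + "(" + BEFORE_PUNCT + "|" + WORD_BOUNDARY + ") ?";

    // Hypothetical helper: splits a sentence into words and punctuation runs.
    static String[] tokenize(String sentence) {
        return sentence.split(SPLIT_REGEX);
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(tokenize("foo bar, baz!")));
    }
}
```

Because every part except the two optional spaces is zero-width, the split positions fall between characters and the punctuation survives as its own token. A run such as "..." stays in one piece because the first alternative requires a non-punctuation character behind the split point.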
Some test code:
// needs: import java.util.Arrays;
String sentence = "foo bar, baz! who? me...";
String[] split = sentence.split(" ?(?<!\\G)((?<=[^\\p{Punct}])(?=\\p{Punct})|\\b) ?");
Arrays.stream(split).forEach(System.out::println);
Output:
foo
bar
,
baz
!
who
?
me
...