Tokenizing in Java

Tokenizing :

Tokenizing is the process of taking large pieces of source data, breaking them into smaller pieces, and storing those pieces in variables.

Example :
The most common tokenizing situation is reading a delimited file so that its contents can be moved into useful places such as objects, arrays and collections.
This post looks at two classes in the Java API that provide tokenizing mechanisms :

  • String
  • Scanner

Both classes provide many methods that are useful for tokenizing.

What are tokens and delimiters?

A token is an actual piece of data, and delimiters are the expressions that separate tokens from each other. A delimiter may be a comma, a backslash or a single whitespace character.

Example :


package com.pkjavacode.com;

public class TokenizingDemo {

    /**
     * Splits a sentence into tokens using a single space as the delimiter.
     *
     * @param args command-line arguments (unused)
     */
    public static void main(String[] args) {
        String str = "I am pradeep kumar yadav";
        // split() treats its argument as a regular expression
        String[] tokens = str.split(" ");
        System.out.println("Count tokens: " + tokens.length);
        for (String s : tokens) {
            System.out.println(s);
        }
    }
}

Output :

Count tokens: 5
I
am
pradeep
kumar
yadav
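
The delimiter does not have to be a single space. Because split() interprets its argument as a regular expression, a comma-delimited line can be tokenized while ignoring stray whitespace around each comma. The following is only a minimal sketch; the class name CommaSplitSketch, the helper tokenize and the sample data are made up for illustration.

```java
import java.util.Arrays;
import java.util.List;

class CommaSplitSketch {

    // Hypothetical helper: splits a line on commas and
    // drops any whitespace that surrounds each comma.
    static List<String> tokenize(String line) {
        // "\\s*,\\s*" matches a comma with optional whitespace on both sides
        return Arrays.asList(line.trim().split("\\s*,\\s*"));
    }

    public static void main(String[] args) {
        List<String> tokens = tokenize("red, green ,blue");
        // prints: 3 tokens: [red, green, blue]
        System.out.println(tokens.size() + " tokens: " + tokens);
    }
}
```

Returning a List instead of an array makes it easy to hand the tokens to a collection-based API, which is the "useful places" idea mentioned earlier.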

Tokenizing with the Scanner class :

When you need to do more serious tokenizing, use the Scanner class. It has the following features :

  • Scanners can be constructed using files, streams and Strings as a source.
  • Tokenizing is performed within a loop, so you can exit the process at any point.
  • Tokens can be converted to their appropriate primitive types automatically.

Example :


package com.pkjavacode.com;

import java.util.Scanner;

public class TokenizingScannerExample {

    /**
     * @author pradeep
     */
    public static void main(String[] args) {
        String str = "I am Pradeep Kumar Yadav";
        // Scanner splits on whitespace by default
        Scanner s = new Scanner(str);
        while (s.hasNext()) {
            String token = s.next();
            System.out.println(token);
        }
        s.close();
    }
}

Output :

I
am
Pradeep
Kumar
Yadav
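
The example above reads every token as a String. Two of the other Scanner features listed earlier, automatic conversion to primitive types and leaving the loop early, might be sketched as follows. This is an illustration under assumptions: the class name ScannerIntSketch, the helper leadingInts and the sample input are hypothetical, and a comma is chosen as the delimiter through useDelimiter().

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Scanner;

class ScannerIntSketch {

    // Hypothetical helper: collects the integers at the start of a
    // comma-separated text and stops at the first token that is not an int.
    static List<Integer> leadingInts(String text) {
        List<Integer> values = new ArrayList<>();
        try (Scanner scanner = new Scanner(text)) {
            // Use a comma (with optional surrounding whitespace) as the delimiter
            scanner.useDelimiter("\\s*,\\s*");
            // hasNextInt() is false for "done", so the loop exits there
            while (scanner.hasNextInt()) {
                values.add(scanner.nextInt()); // token converted to int
            }
        }
        return values;
    }

    public static void main(String[] args) {
        // prints: [10, 20, 30]
        System.out.println(leadingInts("10, 20, 30, done, 40"));
    }
}
```

Checking hasNextInt() before calling nextInt() avoids an InputMismatchException when a token cannot be converted.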