Find Unicode Category for a Given Character in Java



The Character class is a subclass of Object that wraps a value of the primitive type char. An object of type Character contains a single field whose type is char.

Unicode Category of a Character

In Java, the Unicode category of a character refers to the classification of characters based on their general type or usage, such as letters, digits, punctuation, symbols, etc. The java.lang.Character class provides methods to find the category of a character according to the Unicode standard.

Unicode Category of a Character using getType()

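One of the basic ways to find the Unicode category of a particular character is the getType() method. It is a static method of the Character class, and it returns an int that identifies the character's Unicode general category. Each returned value corresponds to a constant defined in the Character class, such as Character.UPPERCASE_LETTER (1) or Character.DECIMAL_DIGIT_NUMBER (9).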

Syntax

Following is the syntax of the getType() method:

Character.getType(char ch)

Here, ch is the character whose Unicode general category is to be found.
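
The Character class also declares an overload, getType(int codePoint), which accepts a full Unicode code point. This matters for characters outside the Basic Multilingual Plane, which a Java String stores as two char values (a surrogate pair). The following is a minimal sketch of the difference; the class name CodePointTypeDemo and the helper typeOfFirstCodePoint are hypothetical names chosen for illustration:

```java
public class CodePointTypeDemo {
    // Hypothetical helper: returns the category of the code point that starts at index 0.
    static int typeOfFirstCodePoint(String s) {
        return Character.getType(s.codePointAt(0));
    }

    public static void main(String[] args) {
        // U+10400 (DESERET CAPITAL LETTER LONG I) is stored as a surrogate pair.
        String s = "\uD801\uDC00";

        // Looking at the first char alone only sees a surrogate.
        System.out.println(Character.getType(s.charAt(0)) == Character.SURROGATE);

        // Looking at the whole code point gives the category of the real character.
        System.out.println(typeOfFirstCodePoint(s) == Character.UPPERCASE_LETTER);
    }
}
```

Both lines print true. For text that may contain such characters, passing code points instead of individual char values is usually the safer choice.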

Example 1

In the following example, we use the getType() method to find the Unicode category of several characters: 'T', '@', ')', '1', '-', '_', and 'a':

public class CharacterTypeTest {
   public static void main(String args[]) {
      System.out.println("T represents unicode category of: " + Character.getType('T'));
      System.out.println("@ represents unicode category of: " + Character.getType('@'));
      System.out.println(") represents unicode category of: " + Character.getType(')'));
      System.out.println("1 represents unicode category of: " + Character.getType('1'));
      System.out.println("- represents unicode category of: " + Character.getType('-'));
      System.out.println("_ represents unicode category of: " + Character.getType('_'));
      System.out.println("a represents unicode category of: " + Character.getType('a'));
   }
}

The above program produces the following output:

T represents unicode category of: 1
@ represents unicode category of: 24
) represents unicode category of: 22
1 represents unicode category of: 9
- represents unicode category of: 20
_ represents unicode category of: 23
a represents unicode category of: 2
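
The integers printed above are the values of constants declared in the Character class, so they can be translated into readable names. The sketch below assumes a hypothetical helper, nameOf, that covers only a handful of categories; it is not part of the Java API, and a complete version would need a case for every category constant:

```java
public class CategoryNames {
    // Hypothetical helper: maps a few getType() results to the names
    // of the matching Character constants.
    static String nameOf(int type) {
        switch (type) {
            case Character.UPPERCASE_LETTER:      return "UPPERCASE_LETTER";
            case Character.LOWERCASE_LETTER:      return "LOWERCASE_LETTER";
            case Character.DECIMAL_DIGIT_NUMBER:  return "DECIMAL_DIGIT_NUMBER";
            case Character.SPACE_SEPARATOR:       return "SPACE_SEPARATOR";
            case Character.DASH_PUNCTUATION:      return "DASH_PUNCTUATION";
            case Character.END_PUNCTUATION:       return "END_PUNCTUATION";
            case Character.CONNECTOR_PUNCTUATION: return "CONNECTOR_PUNCTUATION";
            case Character.OTHER_PUNCTUATION:     return "OTHER_PUNCTUATION";
            default:                              return "OTHER (" + type + ")";
        }
    }

    public static void main(String[] args) {
        char[] samples = {'T', '@', ')', '1', '-', '_', 'a'};
        for (char c : samples) {
            // Print each character next to the name of its category.
            System.out.println(c + " -> " + nameOf(Character.getType(c)));
        }
    }
}
```

With this helper, 'T' is reported as UPPERCASE_LETTER, '@' as OTHER_PUNCTUATION, and '1' as DECIMAL_DIGIT_NUMBER, which is easier to read than the raw values 1, 24, and 9.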

Example 2

Here is another example of finding Unicode categories. This time, we loop through a string and call the getType() method on each of its characters:

public class CharacterTypeTest {
   public static void main(String args[]) {
      String str = "Hello Tutorialspoint";
      // Print to standard output so this line appears in order with the rest of the output
      System.out.println("The string is: " + str);
      for (int i = 0; i < str.length(); i++) {
         char ch = str.charAt(i);
         System.out.println("The unicode category of character '" + ch + "' is: " + Character.getType(ch));
      }
   }
}

Following is the output of the above program:

The string is: Hello Tutorialspoint
The unicode category of character 'H' is: 1
The unicode category of character 'e' is: 2
The unicode category of character 'l' is: 2
The unicode category of character 'l' is: 2
The unicode category of character 'o' is: 2
The unicode category of character ' ' is: 12
The unicode category of character 'T' is: 1
The unicode category of character 'u' is: 2
The unicode category of character 't' is: 2
The unicode category of character 'o' is: 2
The unicode category of character 'r' is: 2
The unicode category of character 'i' is: 2
The unicode category of character 'a' is: 2
The unicode category of character 'l' is: 2
The unicode category of character 's' is: 2
The unicode category of character 'p' is: 2
The unicode category of character 'o' is: 2
The unicode category of character 'i' is: 2
The unicode category of character 'n' is: 2
The unicode category of character 't' is: 2
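
The values returned by getType() can also be compared directly against the Character constants instead of the literal numbers 1, 2, and 12. As a rough sketch, the hypothetical helper count below (an illustrative name, not a library method) tallies how many characters of a string fall into a given category:

```java
public class CategoryCount {
    // Hypothetical helper: counts the chars in s whose category equals the given type.
    static int count(String s, int type) {
        int n = 0;
        for (int i = 0; i < s.length(); i++) {
            if (Character.getType(s.charAt(i)) == type) {
                n++;
            }
        }
        return n;
    }

    public static void main(String[] args) {
        String str = "Hello Tutorialspoint";
        System.out.println("Uppercase letters: " + count(str, Character.UPPERCASE_LETTER));
        System.out.println("Lowercase letters: " + count(str, Character.LOWERCASE_LETTER));
        System.out.println("Space separators: " + count(str, Character.SPACE_SEPARATOR));
    }
}
```

For the string "Hello Tutorialspoint", this reports 2 uppercase letters, 17 lowercase letters, and 1 space separator, which matches the per-character listing above.
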
Updated on: 2025-05-14T14:12:22+05:30
