Regular Expressions in Java
"If we want to represent a group of strings according to a particular pattern, then we should use Regular Expressions."
For example -
1. We can write a regular expression to represent all mobile numbers.
2. We can write a regular expression to represent all email IDs.
Applications of Regular Expressions -
1. Validation of formats like - email, mobile number etc.
2. Pattern matching applications like - Ctrl+F in Windows and grep in UNIX.
3. Designing translators like - Compilers, interpreters etc.
4. Designing circuits.
5. For developing communication protocols like - TCP/IP.
Example -
In this example we search for the string "ab" in the given target string "ababbaba" and find each place where it occurs.
Search this - "ab"
Target String - "ababbaba"
Some important methods used in the code -
1. m.start() - The return type of this method is int. It returns the starting index of the current match in the target string.
2. m.end() - The return type of this method is int. It returns the index just after the last matched character in the target string, that is, end+1.
3. m.group() - The return type of this method is String. It returns the part of the target string that was matched by the pattern.
4. m.find() - The return type of this method is boolean. It looks for the next match of the pattern in the target string and returns true if one is found, otherwise false.
5. Pattern - It is a class in the java.util.regex package; compile() is its static factory method.
- A Pattern object is the compiled version of a regular expression.
- It is the equivalent Java object of a regular expression.
- It can be created by using the compile() method.
- Pattern p = Pattern.compile("Pattern String"); // Write the string which we have to search for in the target string.
- We can use a Matcher object to match the given pattern against the given target string.
- We can create a Matcher object by using the matcher() method of the Pattern class.
- Matcher m = p.matcher("Target String"); // Write the target string in which we have to search.
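The search described above can be sketched as follows. This is a minimal sketch, not the original program: the class name RegexSearchDemo and the helper findAll() are names assumed for illustration, and only Pattern.compile(), matcher(), find(), start(), end() and group() come from the Java API.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class RegexSearchDemo {

    // Hypothetical helper: returns one "start-end:text" entry per match of regex in target.
    static List<String> findAll(String regex, String target) {
        Pattern p = Pattern.compile(regex);   // compiled form of the regular expression
        Matcher m = p.matcher(target);        // matcher bound to the target string
        List<String> matches = new ArrayList<>();
        while (m.find()) {                    // true while another match exists
            matches.add(m.start() + "-" + m.end() + ":" + m.group());
        }
        return matches;
    }

    public static void main(String[] args) {
        List<String> matches = findAll("ab", "ababbaba");
        for (String match : matches) {
            System.out.println(match);
        }
        System.out.println("Number of occurrences: " + matches.size());
    }
}
```

For "ab" in "ababbaba" this prints 0-2:ab, 2-4:ab and 5-7:ab, followed by a count of 3. Note that each end index is one past the last matched character.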
Character Class in Java
A character class defines a whole set of characters, any one of which may match at a given position in the target string. For this we use character classes in regular expressions in Java.
The list of character classes is given -
1. [abc] - either a or b or c
2. [^abc] - any character except a, b and c
3. [a-z] - any lower case alphabet symbol
4. [A-Z] - any upper case alphabet symbol
5. [a-zA-Z] - any alphabet symbol
6. [0-9] - any digit from 0 to 9
7. [a-zA-Z0-9] - any alpha-numeric symbol
8. [^a-zA-Z0-9] - any symbol except alpha-numeric symbols (special characters)
Example -
In the example we are using character classes.
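A minimal sketch of how the character classes listed above behave. The target string "a3b#k@9z", the class name CharacterClassDemo and the helper matchIndexes() are assumptions chosen for illustration. The classes are written without spaces inside the brackets, because a space inside [...] would itself be matched as a character.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class CharacterClassDemo {

    // Hypothetical helper: returns the start index of every match of regex in target.
    static List<Integer> matchIndexes(String regex, String target) {
        Matcher m = Pattern.compile(regex).matcher(target);
        List<Integer> indexes = new ArrayList<>();
        while (m.find()) {
            indexes.add(m.start());
        }
        return indexes;
    }

    public static void main(String[] args) {
        String target = "a3b#k@9z"; // sample target assumed for this sketch
        String[] regexes = {
            "[abc]", "[^abc]", "[a-z]", "[0-9]", "[a-zA-Z0-9]", "[^a-zA-Z0-9]"
        };
        for (String regex : regexes) {
            // Prints the indexes at which a single character matched the class
            System.out.println(regex + " matches at " + matchIndexes(regex, target));
        }
    }
}
```

With this target, [abc] matches at indexes 0 and 2, while [^a-zA-Z0-9] matches only the special characters at indexes 3 and 5.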
Pre-defined Character classes in Regular Expression
Here we are going to discuss some pre-defined character classes in regular expressions in Java.
1. \s (small s) - whitespace character (space, tab, newline etc.)
2. \S (big S) - any character except whitespace
3. \d (small d) - any digit from 0 to 9
4. \D (big D) - any character except a digit
5. \w (small w) - any word character (any alpha-numeric character or underscore)
6. . (single dot sign) - any character, including special characters (by default, line terminators are excluded)
Example -
You can remove the comments and run the program for the output.
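A minimal sketch for trying the pre-defined character classes. The target string "a7b k@9z", the class name PredefinedClassDemo and the helper findAll() are assumptions for illustration. One pattern is active in main() and the alternatives are left commented out so that they can be enabled one at a time. Inside a Java string literal the backslash has to be doubled, so \s is written as "\\s".

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class PredefinedClassDemo {

    // Hypothetical helper: returns one "index:character" entry per match.
    static List<String> findAll(String regex, String target) {
        Matcher m = Pattern.compile(regex).matcher(target);
        List<String> matches = new ArrayList<>();
        while (m.find()) {
            matches.add(m.start() + ":" + m.group());
        }
        return matches;
    }

    public static void main(String[] args) {
        String target = "a7b k@9z"; // sample target assumed for this sketch

        String regex = "\\s";       // whitespace character
        // String regex = "\\S";    // any character except whitespace
        // String regex = "\\d";    // any digit
        // String regex = "\\D";    // any character except a digit
        // String regex = "\\w";    // any word character
        // String regex = ".";      // any character

        for (String match : findAll(regex, target)) {
            System.out.println(match);
        }
    }
}
```

With "\\d" enabled, for instance, the matches are the digit 7 at index 1 and the digit 9 at index 6.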
Quantifiers in regular expressions
We can use quantifiers to specify the number of occurrences to match in the given target string. You can think of a quantifier as a 'quantity'.
a - Exactly one a
a+ - At least one a (a run of consecutive a's is matched as a single unit)
a* - Any number of a's, including zero
a? - At most one a (zero or one)
Example -
You can check the output of this example by running it in your IDE, removing the comments in the program to try each quantifier in turn.
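A minimal sketch for experimenting with the quantifiers. The target string "abaabaaab", the class name QuantifierDemo and the helper findAll() are assumptions for illustration. One pattern is active and the alternatives are left commented out in main().

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class QuantifierDemo {

    // Hypothetical helper: returns one "start:matchedText" entry per match.
    static List<String> findAll(String regex, String target) {
        Matcher m = Pattern.compile(regex).matcher(target);
        List<String> matches = new ArrayList<>();
        while (m.find()) {
            matches.add(m.start() + ":" + m.group());
        }
        return matches;
    }

    public static void main(String[] args) {
        String target = "abaabaaab"; // sample target assumed for this sketch

        String regex = "a+";      // at least one a
        // String regex = "a";    // exactly one a
        // String regex = "a*";   // any number of a's, including zero
        // String regex = "a?";   // at most one a

        for (String match : findAll(regex, target)) {
            System.out.println(match);
        }
    }
}
```

With a+ the matches are 0:a, 2:aa and 5:aaa. With a* and a?, the matcher also reports empty matches at the positions where no a is present (for example at each b and at the end of the string), which is why those patterns produce more results than a+.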
split() method in Pattern class
We can use the split() method of the Pattern class to split the given target string around matches of the given pattern.
Example -
Here the pattern is a whitespace character (\\s), so every part of the target string between whitespace characters is a token, and the program prints all the tokens.
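A minimal sketch of Pattern.split(). The target sentence, the class name PatternSplitDemo and the helper splitOnWhitespace() are assumptions for illustration. The pattern is compiled first, and the target string is then passed to split().

```java
import java.util.regex.Pattern;

class PatternSplitDemo {

    // Hypothetical helper: splits target around single whitespace characters.
    static String[] splitOnWhitespace(String target) {
        Pattern p = Pattern.compile("\\s"); // the pattern is the separator
        return p.split(target);             // the target string is the argument
    }

    public static void main(String[] args) {
        String[] tokens = splitOnWhitespace("Regular expressions in Java");
        for (String token : tokens) {
            System.out.println(token);
        }
    }
}
```

This prints Regular, expressions, in and Java on separate lines.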
split() method in String class
The String class also contains a split() method to split the given target string according to a particular pattern.
The output of the given program is as follows -
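A minimal sketch of String.split(), assuming a made-up host name as the target; the class name StringSplitDemo and the helper splitOnDot() are hypothetical. Here split() is called on the target string and the regular expression is passed as the argument. The dot is escaped as "\\." because an unescaped dot would match any character.

```java
class StringSplitDemo {

    // Hypothetical helper: splits target around literal dots.
    static String[] splitOnDot(String target) {
        // split() is called on the target; the regular expression is the argument
        return target.split("\\.");
    }

    public static void main(String[] args) {
        for (String token : splitOnDot("www.example.com")) {
            System.out.println(token);
        }
    }
}
```

This prints www, example and com on separate lines.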
NOTE - The split() method of the Pattern class takes the target string as its argument, whereas the split() method of the String class takes the regular expression as its argument.
StringTokenizer
- Specially designed class for tokenization activity.
- Present in java.util package.
- The default delimiters of StringTokenizer are the whitespace characters (space, tab, newline, carriage return and form feed).
- We can provide our own delimiters through the 'delim' argument. Each character of this string is treated as a delimiter; it is not interpreted as a regular expression.
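A minimal sketch of both StringTokenizer constructors. The sample inputs, the class name TokenizerDemo and the tokens() helpers are assumptions for illustration; hasMoreTokens() and nextToken() are the java.util.StringTokenizer methods.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.StringTokenizer;

class TokenizerDemo {

    // Hypothetical helper: tokenizes on the default whitespace delimiters.
    static List<String> tokens(String text) {
        return collect(new StringTokenizer(text));
    }

    // Hypothetical helper: tokenizes on the characters given in delim.
    static List<String> tokens(String text, String delim) {
        return collect(new StringTokenizer(text, delim));
    }

    private static List<String> collect(StringTokenizer st) {
        List<String> result = new ArrayList<>();
        while (st.hasMoreTokens()) {
            result.add(st.nextToken());
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println(tokens("Regular expressions in Java")); // default delimiters
        System.out.println(tokens("10-09-2024", "-"));             // user-defined delimiter
    }
}
```

The first call splits the sentence into four words; the second splits the date-like string into 10, 09 and 2024.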
Some applications of regular expression
1. To represent 10-digit Mobile number -
- Each number should contain exactly 10 digits
- The first digit should be 7 or 8 or 9
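The two rules above can be sketched as the regular expression [7-9][0-9]{9}: one digit from 7 to 9, followed by exactly nine more digits. The {9} form is an exact-count quantifier. The class name MobileNumberValidator and the method isValidMobile() are assumptions for illustration, and matches() is used so that the whole input has to fit the pattern.

```java
import java.util.regex.Pattern;

class MobileNumberValidator {

    // First digit 7, 8 or 9, followed by exactly nine more digits (10 in total).
    private static final Pattern MOBILE = Pattern.compile("[7-9][0-9]{9}");

    // Hypothetical helper: true only if the entire input is a valid number.
    static boolean isValidMobile(String input) {
        return MOBILE.matcher(input).matches();
    }

    public static void main(String[] args) {
        // Sample inputs assumed for this sketch
        String[] samples = {"9876543210", "6876543210", "987654321", "98765432101"};
        for (String s : samples) {
            System.out.println(s + " -> " + (isValidMobile(s) ? "valid" : "invalid"));
        }
    }
}
```

Only the first sample is valid: the second starts with 6, the third has 9 digits and the fourth has 11.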