Regular Expressions (RegEx) in Java programming language | Fully Explained with handwritten notes - Enggfact

Wednesday, April 14, 2021

Regular Expressions (RegEx) in Java programming language | Fully Explained with handwritten notes

 Regular Expressions in Java 

"If we want to represent a group of strings according to a particular pattern, then we should use Regular Expressions."

For example

1. We can write a regular expression to represent all mobile numbers.

2. We can write a regular expression to represent all mail IDs.

Applications of Regular Expressions -

1. Validation of formats like - email, mobile number etc.

2. Pattern matching applications like - Ctrl+F in Windows and grep in UNIX.

3. Designing translators like - Compilers, interpreters etc.

4. Designing circuits.

5. For developing communication protocols like - TCP/IP.

Example -

In this example we search for the string "ab" in the given target string "ababbaba". The following code finds every occurrence of the given string in the target string.

Search this - "ab"

Target String - "ababbaba"
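
A minimal sketch of such a search program is given here. The class name SearchDemo and the helper method findAll() are assumptions made for illustration and may differ from the original listing.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class SearchDemo {
    // Hypothetical helper: returns one "start...end...match" line per occurrence.
    static List<String> findAll(String regex, String target) {
        Pattern p = Pattern.compile(regex);   // compile the pattern once
        Matcher m = p.matcher(target);        // bind the pattern to the target string
        List<String> lines = new ArrayList<>();
        while (m.find()) {                    // move to the next match, if there is one
            lines.add(m.start() + "..." + m.end() + "..." + m.group());
        }
        return lines;
    }

    public static void main(String[] args) {
        List<String> lines = findAll("ab", "ababbaba");
        for (String line : lines) {
            System.out.println(line);
        }
        System.out.println("Number of occurrences: " + lines.size());
        // Prints:
        // 0...2...ab
        // 2...4...ab
        // 5...7...ab
        // Number of occurrences: 3
    }
}
```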




For this pattern and target string, "ab" is found three times, starting at indexes 0, 2 and 5 (the corresponding end indexes are 2, 4 and 7).



Some important methods used in the code -

1. m.start() - The return type of this method is int. It returns the index in the target string at which the current match starts.

2. m.end() - The return type of this method is int. It returns the index just after the last matched character, i.e. the end index + 1.

3. m.group() - The return type of this method is String. It returns the part of the target string that was matched by the pattern.

4. m.find() - The return type of this method is boolean. It tries to find the next match of the pattern in the target string and returns true if one is found, otherwise false.

5. Pattern - It is a class in the java.util.regex package.

  • A Pattern object is the compiled version of a regular expression.
  • It is the equivalent Java object of a regular expression.
  • It can be created by the compile() method, which is a static factory method of the Pattern class.
  • Pattern p = Pattern.compile("Pattern String"); // Write the string which we have to search for in the target string.
6. Matcher

  • We can use a Matcher object to match the given pattern against the given target string.
  • We can create a Matcher object by using the matcher() method of the Pattern class.
  • Matcher m = p.matcher("Target String"); // Write the target string in which we have to search

Character Class in Java 

We can define a whole class of characters to match against the target string. For this we use character classes in regular expressions.

The list of character classes is given below -

1. [abc] - either a or b or c

2. [^abc] - any character except a, b and c

3. [a-z] - any lower case alphabet symbol

4. [A-Z] - any upper case alphabet symbol

5. [a-zA-Z] - any alphabet symbol

6. [0-9] - any digit from 0 to 9

7. [a-zA-Z0-9] - any alpha-numeric symbol

8. [^a-zA-Z0-9] - any character except an alpha-numeric symbol (special characters)


Example

In this example we use character classes.
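
A small sketch of how character classes behave is given here. The target string "a3b#k@9z", the class name CharClassDemo and the helper matches() are assumptions chosen for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class CharClassDemo {
    // Hypothetical helper: lists "index...match" for every match of regex in target.
    static List<String> matches(String regex, String target) {
        Matcher m = Pattern.compile(regex).matcher(target);
        List<String> found = new ArrayList<>();
        while (m.find()) {
            found.add(m.start() + "..." + m.group());
        }
        return found;
    }

    public static void main(String[] args) {
        String target = "a3b#k@9z";  // sample input chosen for illustration
        System.out.println(matches("[abc]", target));        // [0...a, 2...b]
        System.out.println(matches("[a-z]", target));        // [0...a, 2...b, 4...k, 7...z]
        System.out.println(matches("[0-9]", target));        // [1...3, 6...9]
        System.out.println(matches("[^a-zA-Z0-9]", target)); // [3...#, 5...@]
    }
}
```

Note that a space written inside the square brackets is itself treated as a member of the class, so the classes are written without spaces.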



The output of the given program is as follows -



Pre-defined Character classes in Regular Expression

Here we are going to discuss some pre-defined character classes in regular expressions in Java. Inside a Java string literal the backslash itself has to be escaped, so \s is written as "\\s".

1. \s (small s) - any whitespace character (space, tab, newline etc.)

2. \S (big S) - any character except whitespace

3. \d (small d) - any digit from 0 to 9

4. \D (big D) - any character except digit

5. \w (small w) - any word character (any alpha-numeric character or underscore)

6. . (single dot sign) - any character, including special characters (by default, line terminators are not matched)

Example -
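
A minimal sketch of such a program is given here, with one pattern active and the others commented out. The target string "a7 b_@", the class name PredefinedClassDemo and the helper matches() are assumptions chosen for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class PredefinedClassDemo {
    // Hypothetical helper: lists "index...match" for every match of regex in target.
    static List<String> matches(String regex, String target) {
        Matcher m = Pattern.compile(regex).matcher(target);
        List<String> found = new ArrayList<>();
        while (m.find()) {
            found.add(m.start() + "..." + m.group());
        }
        return found;
    }

    public static void main(String[] args) {
        String target = "a7 b_@";   // sample input chosen for illustration
        String regex = "\\d";       // digits only: prints [1...7]
        // String regex = "\\s";    // whitespace characters
        // String regex = "\\S";    // everything except whitespace
        // String regex = "\\D";    // everything except digits
        // String regex = "\\w";    // word characters: letters, digits and underscore
        // String regex = ".";      // any character
        System.out.println(matches(regex, target));
    }
}
```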

You can remove the comments in the program and run it to see the output.


Quantifiers in regular expressions

We can use quantifiers to specify the number of occurrences to match in the given target string. You can think of a quantifier as a 'quantity'.

a    -     Exactly one a

a+    -     At least one a (a run of consecutive a's is treated as a single match)

a*    -    Any number of a's, including zero

a?    -    At most one a (zero or one)

Example
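
A minimal sketch of the quantifiers in action is given here. The target string "abaabaaab", the class name QuantifierDemo and the helper matches() are assumptions chosen for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class QuantifierDemo {
    // Hypothetical helper: lists "index...match" for every match of regex in target.
    static List<String> matches(String regex, String target) {
        Matcher m = Pattern.compile(regex).matcher(target);
        List<String> found = new ArrayList<>();
        while (m.find()) {
            found.add(m.start() + "..." + m.group());
        }
        return found;
    }

    public static void main(String[] args) {
        String target = "abaabaaab";  // sample input chosen for illustration
        String regex = "a+";          // prints [0...a, 2...aa, 5...aaa]
        // String regex = "a";        // every single a is a separate match
        // String regex = "a*";       // also reports empty matches where there is no a
        // String regex = "a?";       // single a's plus empty matches
        System.out.println(matches(regex, target));
    }
}
```

With a* and a?, the matcher also reports an empty match at every position where no a is present (including the end of the string), because zero occurrences are allowed.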


The output of the given example is as follows. You can also check this example by removing the comments in the program and running it in your IDE.



split() method in Pattern class

We can use the split() method of the Pattern class to split the given target string according to the given pattern.

Example
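
A minimal sketch of Pattern.split() is given here. The sample sentence, the class name PatternSplitDemo and the helper tokens() are assumptions chosen for illustration.

```java
import java.util.regex.Pattern;

class PatternSplitDemo {
    // Hypothetical helper: the pattern is compiled first, then the target is passed to split().
    static String[] tokens(String regex, String target) {
        Pattern p = Pattern.compile(regex);
        return p.split(target);
    }

    public static void main(String[] args) {
        // "\\s" is the delimiter: a single whitespace character
        for (String token : tokens("\\s", "Regular Expressions in Java")) {
            System.out.println(token);
        }
        // Prints:
        // Regular
        // Expressions
        // in
        // Java
    }
}
```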


The output of the given program is given below -


Here the pattern \\s (a whitespace character) acts as the separator, and everything between separators is a token. So, the program prints all the tokens.


split() method in String class

The String class also contains a split() method to split the given target string according to a particular pattern.
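
A minimal sketch of String.split() is given here. The sample string "www.example.com", the class name StringSplitDemo and the helper tokens() are assumptions chosen for illustration.

```java
class StringSplitDemo {
    // Hypothetical helper: here the target string is the receiver and the regex is the argument.
    static String[] tokens(String target, String regex) {
        return target.split(regex);
    }

    public static void main(String[] args) {
        // The dot is escaped because an unescaped "." means "any character" in a regex.
        for (String token : tokens("www.example.com", "\\.")) {
            System.out.println(token);
        }
        // Prints:
        // www
        // example
        // com
    }
}
```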


The output of the given program is as follows -


NOTE - The split() method of the Pattern class takes the target string as an argument, whereas the split() method of the String class takes the regular expression as an argument.


StringTokenizer

  • Specially designed class for tokenization activity.
  • Present in java.util package.
Example


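  • The default delimiters of StringTokenizer are the whitespace characters (space, tab, newline, carriage return and form feed). The delimiter is a set of characters, not a regular expression.


  • We can provide our own delimiter characters as the second constructor argument, which is known as 'delim'. Each character in it is treated as a separate delimiter.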
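
A minimal sketch of both forms of StringTokenizer is given here. The sample strings, the class name TokenizerDemo and the helper tokens() are assumptions chosen for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.StringTokenizer;

class TokenizerDemo {
    // Hypothetical helper: collects all remaining tokens of a tokenizer into a list.
    static List<String> tokens(StringTokenizer st) {
        List<String> out = new ArrayList<>();
        while (st.hasMoreTokens()) {
            out.add(st.nextToken());
        }
        return out;
    }

    public static void main(String[] args) {
        // Default delimiters (whitespace)
        System.out.println(tokens(new StringTokenizer("Regular Expressions in Java")));
        // prints [Regular, Expressions, in, Java]

        // User defined delimiter "-"
        System.out.println(tokens(new StringTokenizer("14-04-2021", "-")));
        // prints [14, 04, 2021]
    }
}
```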

Some applications of regular expression

1. To represent 10-digit Mobile number -

           Rules -
  • Each number should contain exactly 10 digits
  • The first digit should be 7 or 8 or 9
       So, the required regular expression for 10-digit mobile number is :
       [7-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9]
        
       OR

       [7-9][0-9]{9}

       OR

       [789][0-9]{9}


NOTE - 

1. If the mobile number can have 10 or 11 digits, then for an 11-digit number the first digit must be 0 (zero). The required regular expression is : 0?[7-9][0-9]{9}

2. If the mobile number can have 10, 11 or 12 digits, then an 11-digit number must start with 0 and a 12-digit number must start with 91. The required regular expression is : (0|91)?[7-9][0-9]{9}

The code for checking the validity of a mobile number is given below -
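
A minimal sketch of such a validity check is given here. The class name MobileNumberCheck, the helper isValid() and the use of Matcher.matches() are assumptions made for illustration; the original listing may have been written differently.

```java
import java.util.regex.Pattern;

class MobileNumberCheck {
    // Optional prefix 0 or 91, then a 10-digit number starting with 7, 8 or 9.
    private static final Pattern MOBILE = Pattern.compile("(0|91)?[7-9][0-9]{9}");

    // matches() succeeds only if the whole input matches the pattern.
    static boolean isValid(String number) {
        return MOBILE.matcher(number).matches();
    }

    public static void main(String[] args) {
        if (args.length == 0) {
            System.out.println("Usage: java MobileNumberCheck <mobile number>");
            return;
        }
        System.out.println(isValid(args[0]) ? "Valid mobile number" : "Invalid mobile number");
    }
}
```

For example, 9876543210, 09876543210 and 919876543210 are accepted, while 6876543210 (wrong first digit) and 98765 (too short) are rejected.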


The output of the given program is as follows -





You can run this program by passing the mobile number as a command-line argument.


2. To represent all mail IDs -

 To represent all mail IDs, we can derive a regular expression as follows :

[a-zA-Z0-9][a-zA-Z0-9_.]*@[a-zA-Z0-9]+([.][a-zA-Z]+)+

Regular expression for representing only Gmail IDs :

[a-zA-Z0-9][a-zA-Z0-9_.]*@gmail[.]com
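
A minimal sketch of a mail ID check is given here. The class name MailIdCheck and the helpers isMailId() and isGmailId() are assumptions made for illustration. The expressions are written without spaces inside the brackets (a space inside a character class would match a literal space) and with an @ before gmail.

```java
import java.util.regex.Pattern;

class MailIdCheck {
    // Any mail ID: alphanumeric first character, then letters, digits, '_' or '.',
    // then '@', a domain name, and one or more ".xyz" parts.
    private static final Pattern MAIL =
            Pattern.compile("[a-zA-Z0-9][a-zA-Z0-9_.]*@[a-zA-Z0-9]+([.][a-zA-Z]+)+");

    // Gmail IDs only.
    private static final Pattern GMAIL =
            Pattern.compile("[a-zA-Z0-9][a-zA-Z0-9_.]*@gmail[.]com");

    static boolean isMailId(String s) {
        return MAIL.matcher(s).matches();
    }

    static boolean isGmailId(String s) {
        return GMAIL.matcher(s).matches();
    }

    public static void main(String[] args) {
        System.out.println(isMailId("user_1@example.co.in"));  // true
        System.out.println(isMailId("_user@example.com"));     // false: must start with a letter or digit
        System.out.println(isGmailId("user.name@gmail.com"));  // true
        System.out.println(isGmailId("user@example.com"));     // false
    }
}
```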



This is the end of regular expressions in Java. We have discussed a variety of formats and methods used in regular expressions. If you have any queries, leave them in the comment section. Thank you for reading this article. If you liked this post, please share it with your friends and classmates.

Handwritten notes link - https://github.com/adv11/RegularExpressionsInJava

