[/{REGEX} + /{PYTHON}]

Regex, or regular expressions, can be used to identify a particular pattern in a large dump of data. This comes in handy when you need to extract or identify a particular pattern from a heap of data.

USAGE:

  • Log files contain a large amount of data that is not very eye-friendly. If we know exactly what we are searching for, regex can help.
  • Post-enumeration: picking out any emails or phone numbers present in a large file.

I will take a small example of how to identify and extract a phone number from a piece of data. I will keep the example straightforward so that it is easy for beginners to digest.

QUESTION 1: What are we looking for?

  • Phone numbers (landline)

QUESTION 2: What is the pattern?

  • 011-99999999 (3-digit country code, a dash, then an 8-digit landline number)

QUESTION 3: What should the regex look like?

  • ‘\d{3}-\d{8}’ -> 3 digits, followed by a dash, and then 8 digits
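
To see what the pattern accepts and rejects, here is a minimal sketch. The helper name is_phone and the test strings are my own, made up for illustration. It uses re.fullmatch, so the whole string has to fit the pattern, not just a part of it.

```python
import re

# Pattern from the text: 3 digits, a dash, then 8 digits.
PHONE_PATTERN = r"\d{3}-\d{8}"


def is_phone(candidate):
    """Return True only if the whole string fits the pattern (hypothetical helper)."""
    return re.fullmatch(PHONE_PATTERN, candidate) is not None


print(is_phone("011-99999999"))   # True: 3 digits, dash, 8 digits
print(is_phone("011-999"))        # False: only 3 digits after the dash
print(is_phone("01199999999"))    # False: the dash is missing
```

The raw string prefix (r"...") keeps Python from treating the backslashes as string escapes.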

Let’s bake the code:

C0de fl0w :

  1. Import the re module for regex.
  2. Define the first data string. (The string starts with what we are searching for.)
  3. Define the second data string. (The data to be searched for sits in the middle of the string.)
  4. Compile the regex with the re.compile function.
  5. Use the match function to get the data. (match only succeeds when the pattern is at the start of the string.)
  6. Use the search function to get the data. (search returns only the first occurrence of the pattern, anywhere in the string.)
  7. Use the findall function to get the data. (findall returns all non-overlapping matches in the string.)
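
The steps above can be sketched roughly like this. The two sample strings and the variable names are made up for illustration; only the re calls and the pattern come from the steps.

```python
import re

# Hypothetical sample data: the first string starts with a phone number,
# the second has phone numbers in the middle.
data_start = "011-99999999 is the office landline"
data_middle = "Office: 011-12345678, helpdesk: 011-87654321"

# Compile the pattern once so it can be reused.
phone_re = re.compile(r"\d{3}-\d{8}")

# match() only looks at the beginning of the string.
at_start = phone_re.match(data_start)
print(at_start.group())             # 011-99999999

# The same call on the second string finds nothing at position 0.
print(phone_re.match(data_middle))  # None

# search() scans the string and stops at the first occurrence.
first = phone_re.search(data_middle)
print(first.group())                # 011-12345678

# findall() returns every non-overlapping match as a list of strings.
all_numbers = phone_re.findall(data_middle)
print(all_numbers)                  # ['011-12345678', '011-87654321']
```

Note that match and search return None when nothing is found, so check the result before calling group() on real data. Also, this pattern has no boundaries, so it would also match inside a longer run of digits; something like \b on both sides may be needed depending on the data.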

[Image: py regex]
