# Regular Expressions

First of all, this will be a bit painful, but as with `vim`, once you overcome the initial learning curve you start to see the potential regular expressions bring to the table.
To make matters even worse, there are multiple *flavors* of regexes.
An overview and comparison between different flavors can be found on [Wikipedia](https://en.wikipedia.org/wiki/Comparison_of_regular-expression_engines).
Don't see this as a reason *not* to learn some basic expressions though; a little experience goes a long way.

## What are they?

> A regular expression (shortened as regex or regexp; also referred to as rational expression) is a sequence of characters that specifies a search pattern. Usually such patterns are used by string-searching algorithms for "find" or "find and replace" operations on strings, or for input validation. It is a technique developed in theoretical computer science and formal language theory.

[Wikipedia](https://en.wikipedia.org/wiki/Regular_expression)

You can see regular expressions as find (and replace) on steroids.
As a practical example, I used *a lot* of regular expressions to clean up the multiple choice LPI questionnaires.
This was done in `vim`, so I used the vim regex flavor, but it's not very different from the main one you should know, that of `grep`.

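
To give an idea of what "find and replace on steroids" can look like, here is a minimal sketch using `sed -E`. The messy answer line is made up for illustration (the real questionnaire format is not shown in this document), and the variable names `messy` and `clean` are just placeholders.

```shell
# A made-up, messy answer line (hypothetical, not from the real questionnaires).
messy='   b)    Use grep -E for extended expressions   '

# First substitution: leading whitespace, an answer letter a-d, a "." or ")",
# and the whitespace after it are replaced by the letter plus ". ".
# Second substitution: trailing whitespace is removed.
clean=$(printf '%s\n' "$messy" \
  | sed -E 's/^[[:space:]]*([a-d])[.)][[:space:]]+/\1. /; s/[[:space:]]+$//')

echo "$clean"
# prints: b. Use grep -E for extended expressions
```
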
From a practical system administrator's point of view you'll probably use regexes in this order:

1. with `grep`
2. with `sed` (when copy-pasting commands found online)
3. with `vim`
4. with a scripting language such as `python3`

## How to learn them?

Some tips and pointers before we head into the actual syntax.

### Vim

There is a setting in `vim` that is disabled by default but highly recommended when learning vim regexes.
If you put `set incsearch` in your `~/.vimrc`, or enter it in the **expert** command line, vim will highlight whatever matches the pattern as you type it.
This can be a tremendous help when building complex patterns.

### Grep

By default `grep` only interprets basic regular expressions.
If you want, or more likely *need*, to use [extended](https://www.gnu.org/software/grep/manual/html_node/Basic-vs-Extended.html) expressions you should use `grep -E` (or the older `egrep`) instead.
For completeness's sake I should mention there is a third *version* of `grep`, invoked with `grep -P`, that interprets the patterns as [Perl regexes](https://perldoc.perl.org/perlre).
One of the advantages of Perl regexes is reverse matching, i.e. lookaround assertions such as lookbehind, where a match depends on the text next to it.

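
The difference between the three modes can be sketched on a few made-up lines. The sample text and variable names below are assumptions for illustration only, and `grep -P` assumes a GNU `grep` built with PCRE support, which is not available everywhere.

```shell
# Three made-up lines, not taken from the exercise files.
sample='user=alice status=ok
user=bob status=failed
guest=carol status=ok'

# Basic regex: an unescaped | is a literal pipe character, so nothing matches.
# grep exits non-zero when nothing matches, hence the "|| true".
bre_count=$(printf '%s\n' "$sample" | grep -c 'ok|failed' || true)

# Extended regex: | now means "or", so all three lines match.
ere_count=$(printf '%s\n' "$sample" | grep -E -c 'ok|failed')

# Perl regex: (?<=user=) is a lookbehind, it must precede the match
# but is not part of it. -o prints only the matched text.
users=$(printf '%s\n' "$sample" | grep -P -o '(?<=user=)\w+')

echo "basic: $bre_count, extended: $ere_count"
echo "$users"
```

This should print `basic: 0, extended: 3` followed by `alice` and `bob` on separate lines; `carol` is skipped because her name is preceded by `guest=` rather than `user=`.
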
## The basics

TODO

## Exercises

Below are some practical exercises and files to go with them.
Use them to test out your grepping skills and as inspiration for personal challenges.

* configuration [file](./assets/sysctl.conf)
    * print only lines with actual configuration settings (ignore comments)
* css [file](./assets/teddit.css)
    * extract all the hex color codes
* html [file](./assets/teddit.html)
    * extract pictures
        * just jpg
        * jpg and png at the same time
* log [file](./assets/auth.log)
    * extract all IP addresses
        * then only the unique ones
    * extract all failed logins for known users
    * extract all unknown users (this is tricky and requires backwards searching using `grep -P`)
    * extract all the dates and times for successful logins (might require multiple greps in a pipe)
* mail dump [file](./assets/dump.mail)
    * extract all unique email addresses
    * extract all web links
        * only the base link (`https://www.example.co.uk`)
        * both http and https links

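
Several exercises ask for only the matching part of a line, only the unique matches, or several greps in a pipe. The general shape of such a pipeline can be sketched as follows, on made-up lines that are deliberately unrelated to the exercise files so nothing is spoiled; the pattern and variable names are illustrative assumptions.

```shell
# Made-up lines, unrelated to the exercise files.
lines='2024-01-05 backup finished
2024-01-05 backup verified
2024-01-06 backup failed
2024-01-07 restore finished'

# The first grep keeps only the lines mentioning "backup",
# the second grep -o prints only the date part of each remaining line,
# and sort -u removes the duplicates.
dates=$(printf '%s\n' "$lines" \
  | grep 'backup' \
  | grep -oE '[0-9]{4}-[0-9]{2}-[0-9]{2}' \
  | sort -u)

echo "$dates"
# prints 2024-01-05 and 2024-01-06, one per line
```
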
There are some very good regex exercises online as well.
[This](http://regextutorials.com/) is a good starting point.