= 15.1 (conspect) Regexp =

== Patterns ==

There are two ways to work with patterns:
 1. Matching is testing whether the whole string can be described by the pattern.
 1. Searching is finding a substring that matches the pattern.

A pattern cannot describe "an integer of any length". So you can't ask to show files whose names consist only of some number; you have to narrow it down (for example, fix its length at 3).

== Regular expressions ==

Why do we need regular expressions? To filter out of a set of strings those that match a query. We are not looking for an exact whole string, but for strings that contain the desired substring.

Regular expressions do not allow describing arbitrary strings. They differ from patterns in that the class and the characters in it are separated. A regular expression can describe almost any pattern that is not bound to context and has no dependence between its internal parts (e.g. «a if preceded by b» or «an integer, then that number of characters» cannot be described).

'''grep: search for a substring in a text file.'''

 * `grep August ycall`: finds the lines containing "August" in the file `ycall`.
 * `grep '[7-9] ' ycall`: the matched digits from 7 to 9 are highlighted.
 * `grep 'u.u' ycall`: looks for a substring of the form [u, <something>, u]; this returns the August line.
 * `grep 'er*o' ycall`: here we look for e, then r zero or more times, then o.
 * Line start marker: `grep '^[0-9][0-9]*' ycall`
 * End of line marker: `grep '[0-9][0-9]*$' ycall`

== Search and replace ==

`sed` is a stream editor.

 * replace once
 {{{#!highlight console
$ cal | sed 's/[12][23]/@@/'
     March 2020
Su Mo Tu We Th Fr Sa
 1  2  3  4  5  6  7
 8  9 10 11 @@ 13 14
15 16 17 18 19 20 21
@@ 23 24 25 26 27 28
29 30 31
}}}
 * replace all ('''g'''lobally)
 {{{#!highlight console
$ cal | sed 's/[12][23]/@@/g'
     March 2020
Su Mo Tu We Th Fr Sa
 1  2  3  4  5  6  7
 8  9 10 11 @@ @@ 14
15 16 17 18 19 20 21
@@ @@ 24 25 26 27 28
29 30 31
}}}

== Extended regexp and dialects ==

Traditional regexps have several disadvantages. It is not easy to:
 * Search for one regexp ''or'' another.
 * Use a «one-or-more» repeater (OK, it's easy, but ''boring''); also, the "`*`" repeater is dangerous.
 * Use a ''character class'' such as letters or spaces (also boring).
 * Use those "`\`"-s every time (most boring, in fact).
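
The points above can be tried out by running the same search in both dialects. This is only a minimal sketch: the sample lines and the variable names are made up for illustration, and it assumes a POSIX `grep` where `-E` selects extended regexps. The remark on why "`*`" is dangerous (it also matches zero occurrences) is one likely reading of the note above, not something the notes spell out.

```shell
# Sample input: six lines invented for this illustration.
sample=$(printf '%s\n' 'cat' 'dog' 'caat' 'ct' 'x  y' 'bird')

# Alternation: traditional regexp has no portable "or",
# so each alternative gets its own -e; extended regexp has "|".
bre_or=$(printf '%s\n' "$sample" | grep -e 'cat' -e 'dog')
ere_or=$(printf '%s\n' "$sample" | grep -E 'cat|dog')

# One or more: traditional regexp repeats the atom by hand ("aa*"),
# extended regexp has "+".
bre_plus=$(printf '%s\n' "$sample" | grep 'caa*t')
ere_plus=$(printf '%s\n' "$sample" | grep -E 'ca+t')

# "*" alone also matches zero occurrences, so 'z*' matches every line,
# even though no line contains a "z".
star_all=$(printf '%s\n' "$sample" | grep -c 'z*')

# Character class: POSIX bracket classes save listing the characters.
spaces=$(printf '%s\n' "$sample" | grep -E '[[:space:]]+')

# Grouping and counted repetition: backslashes everywhere in the
# traditional dialect, none in the extended one.
bre_group=$(printf '%s\n' "$sample" | grep '\(a\)\{2\}')
ere_group=$(printf '%s\n' "$sample" | grep -E '(a){2}')

printf 'or: %s\nplus: %s\nstar: %s\n' "$ere_or" "$ere_plus" "$star_all"
```

Each pair of variables holds the same lines, so the two dialects differ here only in how much typing they need.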