Interests: Regular Expressions, Linux CLI one-liners, Scripting Languages and Vim

GitHub: https://github.com/learnbyexample

Books: https://leanpub.com/u/learnbyexample

  • 52 Posts
  • 22 Comments
Joined 2 years ago
Cake day: June 20th, 2023

  • Well, if you are comfortable with Python scripts, there’s not much reason to switch to awk. Unless, perhaps, you are equating awk with Python as a scripting language rather than looking at CLI usage (like grep, sed, cut, etc.), which is what my ebook focuses on. For example, if you have space-separated columns of data, awk '{print $2}' will give you just the second column (no need to write a script when a simple one-liner will do). This also lets you integrate with shell features (like globs).

    As a practical example, I use awk to filter and process particular entries from financial data (which is in CSV format). It’s just a case of easily arriving at a solution in a single line of code (which I then save for future use).
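A rough sketch of both kinds of one-liner mentioned above. The sample data, column layout and variable names here are made up for illustration (the actual financial data isn't shown), and splitting with -F, assumes no field contains a quoted comma:

```shell
# Second column of space-separated data (hypothetical sample input)
second=$(printf 'a b c\nd e f\n' | awk '{print $2}')
printf '%s\n' "$second"

# Hypothetical CSV: date,category,amount
csv='date,category,amount
2023-06-01,food,120
2023-06-02,rent,900
2023-06-03,food,80'

# -F, splits fields on commas; add up the third column
# for rows whose second column is exactly "food"
total=$(printf '%s\n' "$csv" | awk -F, '$2 == "food" {sum += $3} END {print sum}')
printf '%s\n' "$total"
```

With this sample, the first command prints b and e on separate lines and the second prints 200.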

  • Here’s a solution with perl (assuming the links to leave unchanged are those where http/https comes right after the opening (, rather than at the start of a line):

    perl -pe 's/\[[^]]+\]\(\K(?!https?)[^)]+(?=\))/lc $&=~s|%20|-|gr/ge' ip.txt
    
    • The e flag allows you to use Perl code in the substitution portion.
    • \[[^]]+\]\(\K matches the square-bracketed text and the opening (, then uses \K to mark the start of the matching portion (text before that won’t be part of $&)
    • (?!https?) is a negative lookahead: don’t match if http or https is found at this position
    • [^)]+(?=\)) matches non-) characters and asserts that ) is present after those characters
    • $&=~s|%20|-|gr changes %20 to - for the matching portion found; the r flag is used to return the modified string instead of changing $& itself
    • lc is a function to change text to lowercase