Sunday, November 2, 2014

Ack-grep Tutorial

Recently, ack has replaced grep as my tool for locating the spots I need to edit in my code bases. It beats grep in several ways:
  • More expressive pattern matching with Perl regular expressions
  • Smarter ways to limit a search to particular directories or file types
  • Much prettier display of results
  • Config files that make customizations permanent

Installation on Ubuntu

$ sudo apt-get install ack-grep
$ sudo ln -s /usr/bin/ack-grep /usr/bin/ack

What Ack Pays Attention To

Before we get into the actual usage of ack, let's discuss for a moment how it differs from grep and what files are within the realm of ack.

The ack tool was created specifically for finding text within the source code of programs. Because of this, the tool has been optimized to search certain files and ignore others.

For instance, if you are searching your project's directory structure, you will almost never want to search the version control system's repository hierarchy. This contains information about older versions of files, and would likely result in many duplicates. Ack realizes that this is not where you want to search, so it ignores these directories. This leads to more focused results, as well as fewer false positives.

In a similar vein, it will ignore common backup files created by certain text editors. It will also not attempt to search non-coding files commonly found in source directories, such as "minified" versions of web files, image and PDF files, etc. All of these things lead to better results for almost all searches. You can always override these settings during execution.
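
The filtering described above can be pictured with a rough Python sketch. It is not ack's actual implementation: the function name should_search and both ignore lists are assumptions made up for illustration, and ack's real lists are longer.

```python
# Illustrative subsets only; ack's real ignore lists are longer.
IGNORED_DIRS = {".git", ".svn", ".hg", "CVS"}
IGNORED_SUFFIXES = ("~", ".bak", ".min.js", ".png", ".pdf")


def should_search(path):
    """Return True if a path survives this simplified ignore filter."""
    parts = path.split("/")
    # Skip anything that lives under a version-control directory.
    if any(part in IGNORED_DIRS for part in parts[:-1]):
        return False
    # Skip editor backups, minified files, images and PDFs.
    return not parts[-1].endswith(IGNORED_SUFFIXES)


# Hypothetical project layout.
paths = [
    "src/main.c",
    ".git/objects/ab/cdef",
    "src/main.c~",
    "web/app.min.js",
    "docs/manual.pdf",
]
kept = [p for p in paths if should_search(p)]
print(kept)
```

Only src/main.c survives the filter here, which is the kind of focused result the paragraphs above describe.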

Another feature of ack is that it knows about the source files of different languages. You can ask it to find all Python files in the directory structure. It will return all files that end with .py, but it will also return any file whose first line looks like this:

#!/path/to/python

In other words, a file is matched either by its extension or by a first line (the common "magic number" shebang line) that tells the system which interpreter to run. The general form of that line is:

#!/path/to/interpreter/to/run

This creates a powerful way to categorize very different kinds of files as being related. You can also add or modify the groupings to your liking.
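
As a minimal sketch of that idea, assuming a simplified rule of "extension .py, or a shebang line that mentions python", the check could look like this in Python. The helper is_python_file and the sample file names other than clint.py are hypothetical, and ack's own rules are more detailed.

```python
import re

# Assumed rule: a shebang line that names a python interpreter.
SHEBANG_PYTHON = re.compile(r"^#!.*\bpython")


def is_python_file(filename, first_line):
    """Classify a file as Python by extension or by its first line."""
    if filename.endswith(".py"):
        return True
    return bool(SHEBANG_PYTHON.search(first_line))


print(is_python_file("clint.py", "import sys"))
print(is_python_file("tools/lint", "#!/usr/bin/env python"))
print(is_python_file("run.sh", "#!/bin/sh"))
```

The second call shows why the shebang check matters: a script with no extension is still grouped with the Python files.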

Search with Modern Regexp Pattern 

Search for lines containing a match for 'diff'. This prints every line that contains the string 'diff' in any file under the current working directory.
ack diff

Search for 'diff' as a whole word only.
ack -w diff

Use more advanced regular expression syntax, for example to find 'diff' used as a function call such as 'diff(o, n)'.
ack 'diff\(.+\)'
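
To see how the three patterns differ, here is a small Python sketch. Python's re module is not Perl, but it handles these particular constructs the same way. The sample lines are invented, and treating -w as wrapping the pattern in \b is an approximation.

```python
import re

# Invented sample lines.
lines = [
    "result = diff(o, n)",
    "# show the differences",
    "git diff --stat",
]

# Like "ack diff": any occurrence of the substring.
plain = [l for l in lines if re.search(r"diff", l)]

# Roughly like "ack -w diff": the match must be a whole word.
word = [l for l in lines if re.search(r"\bdiff\b", l)]

# Like "ack 'diff\(.+\)'": 'diff' followed by a parenthesized argument list.
call = [l for l in lines if re.search(r"diff\(.+\)", l)]

print(len(plain), len(word), len(call))
```

The plain search matches all three lines, the whole-word search drops "differences", and the function-call pattern keeps only the first line.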

Limit Where the Search Happens

You might have already noticed that in the examples above, ack automatically searches under your current working directory.

Only search for matches in this file
ack href js/modal.js

Search for matches in this folder
ack href js

Turn off recursion so that ack does not descend into subdirectories.
ack -n href js

Search for 'href' in every directory except 'docs'.
ack href --ignore-dir=docs

Besides reading the files named as arguments, ack can also read from STDIN. This makes ack a good fit for Unix pipelines: we can chain several ack commands together to zero in on the text we really care about.

First find matches for 'postError' in the 'js' directory, then search within those results for 'message'.
ack postError js | ack message
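
Conceptually, each stage of such a pipeline is a filter over the lines produced by the previous stage. A rough Python sketch of that chaining follows. The grep helper and the sample lines are hypothetical, and real ack output piped between stages also carries file names and line numbers.

```python
import re


def grep(pattern, lines):
    """Yield only the lines matching the pattern, like one pipeline stage."""
    regex = re.compile(pattern)
    return (line for line in lines if regex.search(line))


# Invented sample input.
source_lines = [
    "postError(xhr, message);",
    "postError(xhr);",
    "showMessage(message);",
]

# Equivalent in spirit to: ack postError js | ack message
result = list(grep("message", grep("postError", source_lines)))
print(result)
```

Only the line that satisfies both patterns reaches the end of the chain.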

To see only the results found in Python files, add the --python option:
ack -w --python restrict

Analyzing our Search Focus

Sometimes we want to ask, "How many matches were returned?" This is very easy with ack.

Suppress normal output; instead print a count of matching lines for each input file.
ack -c restrict
Doxyfile:3
Makefile:0
uncrustify.cfg:0
.travis.yml:0
neovim.rb:0
vim-license.txt:5
...

Without '-l', some of the line counts may be zero. Adding the '-l' option lists only the files that have matching lines.
ack -cl restrict
Doxyfile:3
vim-license.txt:5
clint.py:1
test/unit/formatc.lua:1
src/nvim/main.c:4
src/nvim/ex_cmds.c:5
src/nvim/misc1.c:1
...

A more complex example combines several options to narrow down the result. With '-h' the file names are dropped, so ack prints a single total count.
ack -ch -w --python restrict
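
The relationship between the three counting modes can be sketched in Python as follows. This is only an illustration of the bookkeeping, not how ack is implemented: count_matches is a hypothetical helper and the file contents are invented.

```python
import re


def count_matches(pattern, files):
    """Map each file name to its number of matching lines."""
    regex = re.compile(pattern)
    return {
        name: sum(1 for line in text.splitlines() if regex.search(line))
        for name, text in files.items()
    }


# Invented file contents keyed by file name.
files = {
    "Doxyfile": "restrict a\nrestrict b\nother\nrestrict c\n",
    "Makefile": "all:\n\tcc main.c\n",
    "clint.py": "# restrict errors\nprint('ok')\n",
}

counts = count_matches(r"\brestrict\b", files)     # like -c
nonzero = {k: v for k, v in counts.items() if v}   # like -cl
total = sum(counts.values())                       # like -ch

print(counts, nonzero, total)
```

The per-file dictionary corresponds to the first listing, the filtered dictionary to the second, and the sum to the single number printed when file names are suppressed.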

Modifying the Search Output

If you want to see the column at which a match starts within a line, you can tell ack to print that information as well with the --column option:
ack -w --column --python restrict
clint.py
107:31:      Specify a number 0-5 to restrict errors to certain verbosity levels.
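
The column is simply the 1-based offset of the match within the line. A short Python check on the line from the output above (assuming its six leading spaces are preserved exactly) arrives at the same number:

```python
import re

# The matching line from clint.py as shown above, including its indentation.
line = "      Specify a number 0-5 to restrict errors to certain verbosity levels."

match = re.search(r"\brestrict\b", line)
column = match.start() + 1  # match.start() is 0-based; columns count from 1
print(column)  # 31
```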

You can specify a general purpose context specification that will print a number of lines above and below the matches with the -C flag. For instance, to get 3 lines of context in either direction, type:
ack -w --python -C 3 restrict
104-      compatible output (vs7) may also be used.  Other formats are unsupported.
105-
106-    verbose=#
107:      Specify a number 0-5 to restrict errors to certain verbosity levels.
108-
109-    filter=-x,+y,...
110-      Specify a comma-separated list of category-filters to apply: only
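
The output format above marks matching lines with ':' after the line number and context lines with '-'. A simplified Python sketch of that behavior is shown below. The helper with_context and the sample text are hypothetical, and the sketch does not print the separators ack puts between non-adjacent groups.

```python
import re


def with_context(lines, pattern, context):
    """Return matching lines plus `context` lines on each side."""
    regex = re.compile(pattern)
    hits = {i for i, line in enumerate(lines) if regex.search(line)}
    wanted = set()
    for i in hits:
        start = max(0, i - context)
        stop = min(len(lines), i + context + 1)
        wanted.update(range(start, stop))
    out = []
    for i in sorted(wanted):
        sep = ":" if i in hits else "-"  # ':' marks a match, '-' marks context
        out.append(f"{i + 1}{sep}{lines[i]}")
    return out


# Invented sample text.
sample = ["alpha", "beta", "restrict me", "gamma", "delta", "epsilon"]
shown = with_context(sample, r"\brestrict\b", 1)
print("\n".join(shown))
```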

Working with File Types

This feature is similar to the 'find' tool. It helps us find files whose type or path matches what we ask for, without searching their contents.

We can tell ack to show us only the Python files by typing:
ack -f --python
clint.py
contrib/YouCompleteMe/ycm_extra_conf.py

We can search for all of the C language files that have the pattern "log" somewhere in their path by typing:
ack -g log --cc
src/nvim/log.h
src/nvim/log.c
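
Both listings amount to filtering a list of paths, first by file type and then optionally by a pattern applied to the path. A rough Python sketch follows, using path names taken from the outputs in this post. The helper list_files and the TYPES table are assumptions for illustration, and the extension list for cc is a simplified guess rather than ack's full definition.

```python
import re

# Assumed, simplified type table.
TYPES = {"python": (".py",), "cc": (".c", ".h")}


def list_files(paths, filetype=None, path_pattern=None):
    """Select paths by file type and, optionally, by a regex on the path."""
    regex = re.compile(path_pattern) if path_pattern else None
    selected = []
    for path in paths:
        if filetype and not path.endswith(TYPES[filetype]):
            continue
        if regex and not regex.search(path):
            continue
        selected.append(path)
    return selected


paths = [
    "clint.py",
    "src/nvim/log.h",
    "src/nvim/log.c",
    "src/nvim/main.c",
    "test/unit/formatc.lua",
]

python_files = list_files(paths, filetype="python")               # like -f --python
log_c_files = list_files(paths, filetype="cc", path_pattern="log")  # like -g log --cc
print(python_files, log_c_files)
```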

You can see all of the languages that ack knows about, and which extensions and file properties it associates with each category by typing:
ack --help-types
Usage: ack-grep [OPTION]... PATTERN [FILES OR DIRECTORIES]

The following is the list of filetypes supported by ack-grep.  You can
specify a file type with the --type=TYPE format, or the --TYPE
format.  For example, both --type=perl and --perl work.

Note that some extensions may appear in multiple types.  For example,
.pod files are both Perl and Parrot.

    --[no]actionscript .as .mxml
    --[no]ada          .ada .adb .ads
    --[no]asm          .asm .s
    --[no]asp          .asp
. . .

Configuring ack with ~/.ackrc

In most cases we want to limit our searches to certain file types, and it would be tedious to type --type-set or --type-add every time we want to search beyond the built-in file types. This is where ~/.ackrc comes into play. It is the configuration file that ack loads, and any of the options introduced above can be written to it to make them permanent.

Take my configuration as an example:
# ~/.ackrc ack configuration file
# Direct output through program
--pager=less -RFX

# Sort files by default
--sort-files

# Use smart-case by default
--smart-case

# Extended File Types
--type-add=css=.less,.scss,.sass
--type-add=ruby=.haml
--type-set=coffee=.coffee
--type-set=markdown=.md,.markdown
--type-set=json=.json

One thing to note: in the config file, write --type-add=less=.less with = rather than putting whitespace between --type-add and less=.less. Also, lines beginning with # are ignored.

After adding our own file types (for example, a type named less), we can use --type=less or --less to limit searches to that type, and --type=noless or --noless to exclude it.
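
The file format itself is simple: one option per line, with blank lines and # comments skipped. A minimal Python sketch of reading such a file is given below. It follows the syntax of the configuration shown in this post; the helpers parse_ackrc and parse_type_option are hypothetical and do not claim to mirror ack's own parser.

```python
def parse_ackrc(text):
    """Return the options in an ackrc, one per non-comment, non-blank line."""
    options = []
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue
        options.append(line)
    return options


def parse_type_option(option):
    """Split '--type-set=NAME=.ext1,.ext2' into (action, name, extensions)."""
    flag, name, exts = option.split("=", 2)
    return flag.lstrip("-"), name, exts.split(",")


sample = """\
# Direct output through program
--pager=less -RFX

# Extended File Types
--type-add=css=.less,.scss,.sass
--type-set=coffee=.coffee
"""

options = parse_ackrc(sample)
css_rule = parse_type_option(options[1])
print(options)
print(css_rule)
```

Note that "--pager=less -RFX" stays a single option even though it contains a space, which is why = is used instead of whitespace after the option name.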

Conclusion

As you can see, ack is a very flexible tool for working with programming source code. Even if you are just using it to find files in your Linux environment, the extra power of ack will be useful most of the time.
