Searching in PDF files with a grep-like tool: pdfgrep

Just as we use grep to search for patterns in a text file, we can use pdfgrep to search for strings in a PDF file. On Debian-based systems the pdfgrep package can be installed from the graphical package manager or from the terminal with apt-get.
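
For the terminal route, a minimal sketch is shown here. It assumes a Debian or Ubuntu system where the package is named pdfgrep, and it only checks for the command and prints the install line rather than running it, since installing needs root privileges and network access.

```shell
# Package and command share the same name on Debian/Ubuntu.
pkg="pdfgrep"

if command -v "$pkg" >/dev/null 2>&1; then
    echo "$pkg is already installed"
else
    # Run the printed command by hand; it needs root privileges.
    echo "Install it with: sudo apt-get install $pkg"
fi
```

Typing sudo apt-get install pdfgrep directly at the prompt has the same effect as following the printed hint.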



Once installed, pdfgrep can be run from the terminal to search for strings in PDF files. The general form of the command follows grep: pdfgrep [options] pattern file.pdf



Suppose we are searching for the string "ioctl" in a PDF file named ch03.pdf (the third chapter of the book Linux Device Drivers).
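
A basic search along these lines might look as follows. The guard around the call is an addition for safety, so the snippet does nothing harmful when pdfgrep or ch03.pdf is missing; it is not part of pdfgrep itself.

```shell
pattern="ioctl"
pdf="ch03.pdf"   # file name from the text; substitute any PDF

# Search only when both the tool and the file are available.
if command -v pdfgrep >/dev/null 2>&1 && [ -f "$pdf" ]; then
    pdfgrep "$pattern" "$pdf"
else
    echo "pdfgrep or $pdf not found" >&2
fi
```

At an interactive prompt this reduces to: pdfgrep ioctl ch03.pdf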



The command lists every line that contains the string "ioctl". To make the output easier to read, we can prefix each line with the page number on which it occurs by using the option "-n".
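
As a sketch, the same search with page numbers could be written like this. The file name is the one used in the text, and the existence check is an added precaution.

```shell
pattern="ioctl"
pdf="ch03.pdf"   # file name from the text; substitute any PDF

if command -v pdfgrep >/dev/null 2>&1 && [ -f "$pdf" ]; then
    # -n (long form --page-number) prefixes each match with its page number.
    pdfgrep -n "$pattern" "$pdf"
else
    echo "pdfgrep or $pdf not found" >&2
fi
```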



If we want to count the number of occurrences instead of viewing the lines on which they appear, we can use the option "-c".
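
One way to use the count in a script is to capture it in a variable, as sketched below. The variable name count is chosen for illustration only.

```shell
pattern="ioctl"
pdf="ch03.pdf"   # file name from the text; substitute any PDF
count=""         # illustrative variable; stays empty if nothing was searched

if command -v pdfgrep >/dev/null 2>&1 && [ -f "$pdf" ]; then
    # -c (long form --count) suppresses the matching lines and prints a count.
    # "|| true" keeps going when pdfgrep exits non-zero because nothing matched.
    count=$(pdfgrep -c "$pattern" "$pdf") || true
    echo "matches for $pattern in $pdf: $count"
else
    echo "pdfgrep or $pdf not found" >&2
fi
```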



We can also prefix each line of output with the name of the file in which it appears by using the option "-H". This is useful when searching in multiple files.
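
A sketch of the same search with file names shown, again using the single file from the text so that the effect of the option itself is visible:

```shell
pattern="ioctl"
pdf="ch03.pdf"   # file name from the text; substitute any PDF

if command -v pdfgrep >/dev/null 2>&1 && [ -f "$pdf" ]; then
    # -H (long form --with-filename) prefixes each match with the file name,
    # even though only one file is being searched.
    pdfgrep -H "$pattern" "$pdf"
else
    echo "pdfgrep or $pdf not found" >&2
fi
```

Further PDF files can be listed after the first one on the same command line.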



With this option the file name is prefixed to each line of the output. Note that when searching in multiple files, pdfgrep prefixes the file name to each line by default.
