Creating a cyclic redundancy check number for a file in Linux

CRC stands for Cyclic Redundancy Check.

According to Wikipedia:

A cyclic redundancy check (CRC) is an error-detecting code commonly used in digital networks and storage devices to detect accidental changes to raw data. Blocks of data entering these systems get a short check value attached, based on the remainder of a polynomial division of their contents. On retrieval the calculation is repeated, and corrective action can be taken against presumed data corruption if the check values do not match.

ref: http://en.wikipedia.org/wiki/Cyclic_redundancy_check

So a CRC is used when we want to check that data has been transmitted or stored without accidental errors. There are a number of methods for calculating a CRC, and several different types of CRC, which are listed in the Wikipedia article linked above.
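
The remainder calculation described in the quoted definition can be sketched in a few lines of POSIX shell. This is only an illustration, assuming the CRC-32 variant used by zip and gzip (reflected polynomial 0xEDB88320). The function name crc32_of and the file sample.txt are made up for the example, and the loop is far too slow for large files.

```shell
# Bitwise CRC-32 sketch (zip/gzip variant, assumed). crc32_of is a hypothetical name.
crc32_of() {
    crc=$((0xFFFFFFFF))                      # initial register value
    # od prints every byte of the file as an unsigned decimal number
    for byte in $(od -An -v -tu1 "$1"); do
        crc=$(( crc ^ byte ))
        bit=0
        while [ "$bit" -lt 8 ]; do
            if [ $(( crc & 1 )) -eq 1 ]; then
                # low bit set: shift and subtract (XOR) the polynomial
                crc=$(( (crc >> 1) ^ 0xEDB88320 ))
            else
                crc=$(( crc >> 1 ))
            fi
            bit=$(( bit + 1 ))
        done
    done
    printf '%08x\n' $(( crc ^ 0xFFFFFFFF ))  # final inversion
}

# Hypothetical sample file holding the conventional test string.
printf '123456789' > sample.txt
crc32_of sample.txt    # prints cbf43926, the standard check value for this input
```

In practice there is no need to write this by hand, which is where the command below comes in.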

Linux provides a simple command to find the CRC without having to bother about the mathematical details of the calculation.

The command to calculate the CRC of a file is crc32.

The syntax is simple: the command name followed by the name of the file.

Let us say we have a file called hello containing a short line of text, and we run crc32 on it.

Please note that no matter how many times you run the command, the CRC32 value will remain the same as long as the file does not change. Even adding a single space to the file changes the CRC32 value.
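
This behaviour is easy to check from the shell. The sketch below assumes a file named hello with made-up contents. Because crc32 is not installed on every system, it falls back to the POSIX cksum utility, which computes a different CRC variant but shows the same property. The helper name checksum is hypothetical.

```shell
# Hypothetical file name and contents, used only for illustration.
printf 'hello world\n' > hello

# Prefer crc32; fall back to POSIX cksum (a different CRC variant) if it is missing.
if command -v crc32 >/dev/null 2>&1; then
    checksum() { crc32 "$1"; }
else
    checksum() { cksum "$1" | cut -d' ' -f1; }
fi

# Two runs on the unchanged file give the same value.
first=$(checksum hello)
second=$(checksum hello)
echo "first run:  $first"
echo "second run: $second"

# Append a single space and compute the value again.
printf ' ' >> hello
changed=$(checksum hello)
echo "after edit: $changed"
```

The first two values should match, while the value printed after the edit should differ.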

colrm to remove columns from a text file

Organizing data in columns is very common, and we often need to manipulate those columns. colrm is a command that lets us easily remove a set of columns from a text file.

colrm takes two optional arguments:

start: The first column that has to be removed.
stop: The last column that has to be removed.

The command treats every character in a row as a column, and columns are numbered starting from 1. If we specify only the start column, all the characters from the 'start' column to the end of the line are removed.
For example, suppose we have a file named data.

The file has data in three columns: one of numbers, one of blank characters and one of letters. To remove the letters, which form the third and last column, we only need to specify the column number at which to begin, because the letters are the last column.
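
As a minimal sketch, assume the file holds one digit, one space and one letter per line; these contents are invented for the example. colrm reads standard input and writes to standard output, so the file is redirected into it. Where colrm is not installed, cut gives the same result for this case.

```shell
# Hypothetical three-column file: digit, blank, letter.
printf '1 a\n2 b\n3 c\n' > data

# Remove everything from column 3 to the end of each line.
if command -v colrm >/dev/null 2>&1; then
    trimmed=$(colrm 3 < data)
else
    # Equivalent with POSIX cut, for systems where colrm is not installed.
    trimmed=$(cut -c1-2 < data)
fi
printf '%s\n' "$trimmed"
```

Each output line keeps the digit and the blank that follows it, and the letter is gone.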



If we want to remove a column that is not at the end, such as the column of numbers, which is the first column, we can pass both the starting and the ending column numbers.
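
A sketch of this case, again assuming an invented file with one digit, one space and one letter per line. Passing 1 and 2 removes the digit and the blank after it. The cut fallback is only there for systems without colrm.

```shell
# Hypothetical three-column file: digit, blank, letter.
printf '1 a\n2 b\n3 c\n' > data

# Remove columns 1 through 2 of each line.
if command -v colrm >/dev/null 2>&1; then
    remaining=$(colrm 1 2 < data)
else
    # Equivalent with POSIX cut, for systems where colrm is not installed.
    remaining=$(cut -c3- < data)
fi
printf '%s\n' "$remaining"
```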



We can see that only the first and second columns are removed and the other columns are displayed.