Thursday, January 26, 2012
You must hope that all the bad sectors are grouped close together. You can recover the portion before the bad sectors, and you may be able to recover some data after the last bad sector. You can't recover data between the bad sectors, unless they are very far apart from each other.
To recover the portion before the bad sectors, just do:
gunzip < damaged.gz > part1
gunzip will stop when it sees the bad data. All data in the file "part1" is guaranteed to be correct, but of course the rest will be missing. If the damaged file is actually a compressed tar archive (damaged.tar.gz), you can recover some files with:
gunzip < damaged.tar.gz | tar xvf -
gunzip and tar will complain at some point, but tar may have recovered some files already.
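The same prefix recovery can be sketched in Python with the standard zlib module. This is only an illustration, not what gunzip does internally: salvage_prefix and chunk_size are names made up here, and feeding the input in small chunks is an assumption chosen so that output decoded before the error is not thrown away.

```python
import zlib

def salvage_prefix(damaged: bytes, chunk_size: int = 4096) -> bytes:
    """Return whatever decompresses cleanly before the first format error."""
    # wbits=31 tells zlib to expect a gzip header and trailer.
    inflater = zlib.decompressobj(wbits=31)
    recovered = bytearray()
    for start in range(0, len(damaged), chunk_size):
        try:
            recovered += inflater.decompress(damaged[start:start + chunk_size])
        except zlib.error:
            # Bad data reached: keep what we have, like gunzip stopping early.
            break
    return bytes(recovered)
```

Output decoded from the chunk that raises the error is lost, so a smaller chunk_size keeps slightly more data at the cost of speed.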
Now let's try to recover something after the bad sectors. You first have to find the boundary of the first undamaged compression block after the damaged portion. The boundary is bit aligned, not necessarily byte aligned. To find the damaged portion, add
fprintf(stderr, "bytes_in %ld\n", bytes_in);
just before the call
error("invalid compressed data--format violated");
in unzip.c. Then round bytes_in up to the next disk block boundary and create a new .gz file by concatenating a valid .gz header and the data believed to be undamaged. Then run "gzip -t" repeatedly on the new .gz file, removing from 1 bit up to 8*64K bits from the start of the compressed data portion, until you get a crc error instead of a "format violated" error. At this point, decompress it (if you saved the new file under a different name, use that name instead of damaged.gz):
gunzip < damaged.gz > damaged
The gzip CRC will always fail because you will miss some 'history', but after some time, the history effect will be reduced and you might be able to recover part of the data. You will have no guarantee that the data will be correct except by manual inspection.
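The bit-by-bit search can be sketched in Python instead of rerunning "gzip -t" by hand. This is a rough illustration under several assumptions, not the procedure gzip itself uses: all function names are invented here, the 512-byte disk block size is a guess you should replace with your own, and the placeholder dictionary of '?' bytes is a workaround so that zlib tolerates the missing history. Reaching the end of the deflate stream without a format error plays the role of "crc error instead of format violated".

```python
import zlib

WINDOW = 32 * 1024  # deflate back-references reach back at most 32 KiB

def round_up(bytes_in: int, block_size: int = 512) -> int:
    """Round an offset up to the next disk block boundary (512 is assumed)."""
    return -(-bytes_in // block_size) * block_size

def shift_bits(data: bytes, bit_offset: int) -> bytes:
    """Drop the first bit_offset bits; deflate packs bits LSB first."""
    skip_bytes, skip_bits = divmod(bit_offset, 8)
    data = data[skip_bytes:]
    if skip_bits == 0:
        return data
    shifted = int.from_bytes(data, "little") >> skip_bits
    return shifted.to_bytes(len(data), "little")

def try_inflate(data: bytes, bit_offset: int):
    """Inflate from bit_offset; return the output, or None on failure."""
    # Placeholder history stands in for the lost 32 KiB window, so that
    # back-references into it yield '?' bytes instead of an error.
    inflater = zlib.decompressobj(wbits=-15, zdict=b"?" * WINDOW)
    try:
        output = inflater.decompress(shift_bits(data, bit_offset))
    except zlib.error:
        return None
    # Only accept offsets that decode cleanly up to the final block.
    return output if inflater.eof else None

def find_block_start(data: bytes, max_bits: int = 8 * 64 * 1024):
    """Return (bit_offset, output) for the first offset that inflates."""
    for bit_offset in range(min(max_bits, 8 * len(data))):
        output = try_inflate(data, bit_offset)
        if output is not None:
            return bit_offset, output
    return None
```

Pass in the raw compressed data starting at round_up(bytes_in), without any .gz header. Each trial shifts the whole buffer, so this is slow on large files, and any output it returns still needs manual inspection.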
To get a valid .gz header, look at the file algorithm.doc in the gzip distribution, or just copy the header from any valid .gz file. The header ends at the zero-terminated file name. To speed up the search for a block header, note that the first 3 bits of the block should be 0,0,1 (starting from the least significant bit), so that when the block is aligned on a byte boundary you get (first_byte & 7) == 4. So you only have to test about 1/8 of all possible bit alignments. Of course, if your block was not byte aligned, you have to bit-shift the entire file.
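Both hints can be sketched in Python. The header layout below follows the gzip file format in RFC 1952 (fixed 10 bytes, then optional extra field, file name, comment and header CRC). The function names are invented for this example, and the 0,0,1 test assumes the block you are looking for is a non-final block using dynamic Huffman codes, which is what that bit pattern means in deflate.

```python
FHCRC, FEXTRA, FNAME, FCOMMENT = 0x02, 0x04, 0x08, 0x10

def gzip_header_length(data: bytes) -> int:
    """Return the number of bytes occupied by the gzip header."""
    if data[:3] != b"\x1f\x8b\x08":
        raise ValueError("not a gzip file using deflate")
    flags = data[3]
    pos = 10  # magic (2), method, flags, mtime (4), extra flags, OS
    if flags & FEXTRA:
        pos += 2 + int.from_bytes(data[pos:pos + 2], "little")
    if flags & FNAME:
        pos = data.index(b"\x00", pos) + 1  # zero-terminated file name
    if flags & FCOMMENT:
        pos = data.index(b"\x00", pos) + 1  # zero-terminated comment
    if flags & FHCRC:
        pos += 2
    return pos

def candidate_offsets(data: bytes):
    """Byte offsets whose low three bits are 0,0,1 (LSB first)."""
    return [i for i, byte in enumerate(data) if (byte & 7) == 4]
```

For example, good[:gzip_header_length(good)] gives a header you can prepend to the salvaged data, where good is the content of any valid .gz file.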
As you can see, this is not a trivial task, so you should attempt it only if your data is very valuable. gzip 2.0 will have a new blocksize option, allowing you to easily recover all undamaged blocks after the damaged portion.
06/16/2014 - Note: I just tried using fixgz.exe to recover a DOCX (zip format) file that looks like it was opened in Notepad and saved, but my effort was not successful.