carrier at sleuthkit dot org
May 15, 2004
In this 14th issue of The Sleuth Kit Informer, I have an article on FAT file recovery in The Sleuth Kit. This new functionality will be included in the 1.70 release of TSK, which will be released in late May. The purpose of this article is to describe how it is being implemented and what its limitations are.
Version 2.00 of Autopsy was released on March 19, 2004. This version has a new internal design and allows you to more easily analyze a live system. The tool will create a directory that can be burned to a CD so that a response team can verify if a system has been compromised. Some parts of the interface were also changed.
Version 1.69 of The Sleuth Kit was released on April 20, 2004. This release contains a couple of bug fixes, one of which fixed an issue where the last sector of a FAT file system could have been missed.
As part of another project, I have been going through the code in TSK and removing unused things, adding new features, and cleaning up other things. These fixes will be released throughout the summer and I am no longer going to wait for a big version 2.0 release. I completed most of the FAT review and added file recovery, which is described in this issue.
The Sleuth Kit Informer is looking for articles on open source tools and techniques for digital investigations (computer / digital forensics) and incident response. Articles that discuss The Sleuth Kit and Autopsy are appreciated, but not required. Example topics include (but are not limited to):
The ability to recover deleted files from a FAT file system is not new. MS-DOS came with the 'undelete' command that would search for deleted files that could be recovered. The Sleuth Kit (TSK) has not had the ability to do FAT file recovery because it fell outside the original design. The original design and purpose of TSK was to represent the on-disk data in a format that a user could easily read. When representing only the data that is on disk, TSK can have a 0% error rate if there are no software bugs.
The next version of TSK includes the ability to recover deleted FAT files. The 'icat' tool has a '-r' flag that will try to "guess" where the file could have existed in the file system and therefore this process will not have a 0% error rate (even if there are no software bugs). Because this process is not defined by any "official" specification and there seems to be no "generally accepted" and documented procedure for FAT file recovery, this article will document how I implemented it in TSK and what its limitations are. I will also document the results of how it performs with the FAT file recovery test image that I released back in February.
Before we start to discuss the details of file recovery, let's review how a file is deleted in a FAT file system. An allocated FAT file has a directory entry structure, which contains the size and starting cluster address of the file. The File Allocation Table (FAT) is used to find the remaining clusters in the file. To find the next cluster in the file, the current cluster is used as an index into the table, and the table entry contains either the address of the next cluster or an end-of-file marker to show that it is the last cluster in the file. Clusters that are not allocated to files have a 0 in their table entry.
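The table lookup described above can be sketched in a few lines of Python. This is only an illustration and is not TSK code: the table is modeled as a dictionary, and the names `cluster_chain`, `FREE`, and `EOF` are made up for this example. The real end-of-file marker depends on the FAT variant, so a stand-in value is used here.

```python
FREE = 0    # table entry of an unallocated cluster (as described above)
EOF = -1    # hypothetical stand-in for the end-of-file marker

def cluster_chain(fat, start):
    """Follow the table from `start` and return the file's clusters in order."""
    chain = []
    cluster = start
    while True:
        if cluster in chain:
            raise ValueError("loop in cluster chain")
        entry = fat[cluster]          # current cluster is the index into the table
        if entry == FREE:
            raise ValueError("chain runs into an unallocated cluster")
        chain.append(cluster)
        if entry == EOF:              # last cluster in the file
            return chain
        cluster = entry               # entry holds the address of the next cluster

# Toy table: one file occupying clusters 2, 3, and 7; clusters 4-6 are free.
fat = {2: 3, 3: 7, 4: FREE, 5: FREE, 6: FREE, 7: EOF}
print(cluster_chain(fat, 2))  # prints [2, 3, 7]
```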
File deletion is OS-specific and may differ between implementations. Most implementations of FAT file systems will delete a file by setting the directory entry to an unallocated state (which overwrites the first letter of the name) and setting the table entries for the file's clusters to 0. In general, the starting cluster and the size of the file are not wiped from the directory entry. Again, this is OS-specific and an OS may choose to wipe those fields when a file is deleted.
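To make clear what survives a typical deletion, here is a small Python sketch of the common behavior described above. It is a simplified model and not how any particular OS is implemented: the directory entry is a dictionary, `delete_file` and `EOF` are hypothetical names, and 0xE5 is the byte that conventionally marks an unallocated FAT directory entry.

```python
DELETED_MARK = 0xE5  # first name byte of an unallocated FAT directory entry
FREE = 0             # table entry of an unallocated cluster
EOF = -1             # hypothetical stand-in for the end-of-file marker

def delete_file(entry, fat):
    """Model a typical deletion: zero the chain, then mark the entry."""
    cluster = entry["start"]
    while cluster != EOF:
        next_cluster = fat[cluster]   # remember the link before wiping it
        fat[cluster] = FREE
        cluster = next_cluster
    # Only the first letter of the name is overwritten; the starting
    # cluster and size are left alone in this model.
    entry["name"] = bytes([DELETED_MARK]) + entry["name"][1:]

entry = {"name": b"FRAG1   DAT", "start": 2, "size": 1500}
fat = {2: 3, 3: 7, 4: FREE, 5: FREE, 6: FREE, 7: EOF}
delete_file(entry, fat)
print(entry)  # start and size are still present
print(fat)    # every entry is now 0, so the chain is gone
```

After the call, the links from cluster 2 to 3 to 7 no longer exist anywhere on disk, which is why recovery has to guess.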
TSK tries to recover a file by using the starting cluster and size value. If the starting cluster has an allocated state, then TSK will not recover any data. This scenario occurs when the cluster has been reallocated to a new file or if the old directory entry was unallocated because the file was moved within the same partition and the same clusters are used for the new file. In the former case, the recovery will not return accurate data and in the latter case there exists another directory entry for the same file that will have a more accurate file size. Therefore, TSK will not recover the data when the starting cluster is allocated.
If the starting cluster is not allocated, then TSK will start with it and advance by consecutive clusters. If the cluster is unallocated, then TSK considers it part of the file being recovered and subtracts the cluster size from the total file size. If the cluster is allocated, then TSK will skip over it and examine the next cluster. If the end of the file system is reached and TSK has not found enough clusters for the file size, then TSK will not recover any data. If enough clusters have been found for the file, then the data from the unallocated clusters are returned to the user.
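The procedure in the last two paragraphs can be written as a short Python sketch. This is an illustration of the logic as described in this article and not the actual TSK source; the function name `recover_clusters` and its parameters are invented for the example, and allocation status is modeled as a set of allocated cluster addresses.

```python
def recover_clusters(allocated, start, size, cluster_size, last_cluster):
    """Return the clusters to read for a deleted file, or None to give up."""
    if start in allocated:
        return None                   # starting cluster was reallocated or reused
    needed = (size + cluster_size - 1) // cluster_size  # round up to whole clusters
    found = []
    cluster = start
    while len(found) < needed:
        if cluster > last_cluster:
            return None               # hit the end of the file system first
        if cluster not in allocated:
            found.append(cluster)     # unallocated: assume it belongs to the file
        cluster += 1                  # allocated clusters are skipped
    return found

# A 1636-byte file with 512-byte clusters needs 4 clusters.
# Clusters 4 and 5 now belong to another file, so they are skipped.
print(recover_clusters({4, 5}, 2, 1636, 512, 9))  # prints [2, 3, 6, 7]
print(recover_clusters({2}, 2, 1636, 512, 9))     # prints None
```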
Some tools use a different strategy: they recover the total number of clusters needed for the file and ignore each cluster's allocation status. This difference was found by Eoghan Casey while testing two tools for a paper in the next issue of the Journal of Digital Investigation. By checking the allocation status of each cluster, TSK and other tools that use this strategy can recover some fragmented files when the clusters in between the fragments are still allocated. If the file was fragmented and the clusters in between the fragments are not allocated, then both strategies will fail.
I will now describe the limitations of this process. Any tool that tries to recover files that were not designed to be recovered will have limitations. If any of the clusters that were allocated to the file are currently allocated, then TSK will not return the original file content because it will choose different clusters. If any of the original clusters were reallocated and had new data written to them, then TSK will not return the original file.
If the file was not fragmented and none of the clusters were overwritten, then TSK should return the original file. If the file was fragmented, none of the clusters were overwritten, and the clusters in between the fragments are still allocated, then TSK should return the original file. If the file was fragmented, none of the clusters were overwritten, and the clusters in between the fragments are no longer allocated, then TSK will not return the original file.
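The three cases above, and the difference between the two strategies, can be checked with a toy model. The Python below is a sketch under simplifying assumptions: no cluster contents were overwritten, allocation status is a set of cluster addresses, and the function names `status_aware` and `contiguous` are made up for this example.

```python
def status_aware(allocated, start, needed, last):
    """Strategy described for TSK: skip clusters that are allocated."""
    if start in allocated:
        return None
    found, cluster = [], start
    while len(found) < needed:
        if cluster > last:
            return None
        if cluster not in allocated:
            found.append(cluster)
        cluster += 1
    return found

def contiguous(start, needed, last):
    """Other strategy: take consecutive clusters, ignoring allocation status."""
    if start + needed - 1 > last:
        return None
    return list(range(start, start + needed))

# (description, clusters the file really used, clusters allocated now)
cases = [
    ("not fragmented",             [2, 3, 4], set()),
    ("fragmented, gap allocated",  [2, 3, 6], {4, 5}),
    ("fragmented, gap now free",   [2, 3, 6], set()),
]
for name, original, allocated in cases:
    a = status_aware(allocated, original[0], len(original), 9)
    b = contiguous(original[0], len(original), 9)
    print(name, "| status-aware ok:", a == original, "| contiguous ok:", b == original)
```

In this model both strategies succeed in the first case, only the status-aware strategy succeeds in the second, and both return clusters 2, 3, and 4 (the wrong data) in the third.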
In February, I released a test image of a FAT file system that had several deleted files in it. This image was released to the CFTT e-mail list on YahooGroups. At the time, TSK did not have support for file recovery and was therefore not tested. This section will outline the results using the reporting form provided on the website.
1. Can you see the frag1.dat, frag2.dat, sing.dat, mult1.dat, and dir1 file and directory names in the root directory?
Yes. The contents are:
% fls -f fat 6-fat-undel.dd
r/r 3: FAT_REC_1 (Volume Label Entry)
r/r * 4: _rag1.dat
r/r * 5: _rag2.dat
r/r * 6: _ing.dat
r/r * 7: _ult1.dat
d/d * 8: _ir1
d/d * 11: System Volume Information (_YSTEM~1)
2. Can you see the dir2 and mult2.dat names in the dir1 directory?
Yes. The contents are:
% fls -f fat 6-fat-undel.dd 8
d/d * 869: dir2
r/r * 870: mult2.dat
3. Can you see the frag3.dat name in the dir1\dir2 directory?
Yes. The contents are:
% fls -f fat 6-fat-undel.dd 869
r/r * 965: frag3.dat
4. Can you recover the sing.dat file? Does it have the correct MD5?
% icat -f fat -r 6-fat-undel.dd 6 | md5
5. Can you recover the mult1.dat file? Does it have the correct MD5?
% icat -f fat -r 6-fat-undel.dd 7 | md5
6. Can you recover the dir1\mult2.dat file? Does it have the correct MD5?
% icat -f fat -r 6-fat-undel.dd 870 | md5
7. Can you recover the frag1.dat file? Does it have the correct MD5?
% icat -f fat -r 6-fat-undel.dd 4 | md5
8. Can you recover the frag2.dat file? Does it have the correct MD5?
% icat -f fat -r 6-fat-undel.dd 5 | md5
9. Can you recover the dir1\dir2\frag3.dat file? Does it have the correct MD5?
% icat -f fat -r 6-fat-undel.dd 965 | md5
The final three cases represent fragmented files where the clusters in between the fragments are also unallocated. There is not a test case in this test image of a fragmented file with allocated clusters in between the fragments. In hindsight, the test image should have included that scenario. For reference, none of the analysis tools were able to recover the final three files.
As I mentioned before, this functionality was added by using a '-r' flag with 'icat'. This functionality was also added to the 'fls' tool when the recursive flag (also '-r') is given. 'fls' will now recurse into deleted directories and display the contents.
For those who look at the source code, a new flag (FS_FLAG_FILE_RECOVER) was created and it is passed to the file_walk functions. If the recovery flag is not given for a deleted FAT file, then only the first cluster will be returned from the 'icat' tool. The first cluster is given in the on-disk image and therefore falls into the original TSK design.
This article has described how FAT file recovery was implemented in the 1.70 version of TSK (which has not been released yet). This functionality differs from existing file system tool functionality because it makes guesses about where data should be. The results of a test image were given and these results were consistent with what other tools provide.
Brian Carrier, FAT Undelete Test #1, February 2004,
Brian Carrier, The Sleuth Kit,