The Sleuth Kit Informer
                                      
                    http://www.sleuthkit.org/informer 
                 http://sleuthkit.sourceforge.net/informer 
                                      
                               Brian Carrier
                        carrier at sleuthkit dot org
                                      
                                 Issue #19
                               March 15, 2005
     _________________________________________________________________
                                      
Contents

     * Introduction
     * What's New?
     * Call For Papers
     * New Image File Support
     * Hooking IO Calls for Multi-Format Image Support (By: Michael
       Cohen)
     _________________________________________________________________
   
Introduction

   This issue of the Informer is unique because it has articles that
   describe two different approaches to the same problem. The first main
   article describes the new image file features in The Sleuth Kit
   version 2, which supports disk and split image files. The second
   article is by Michael Cohen and it describes how PyFlag added support
   for different image file formats before TSK v2 existed. PyFlag uses
   TSK to analyze file system images.
   
   In the last issue of the Informer, I mentioned that I was no longer
   going to make the text version because it took a lot of time to
   manually convert between the two. Alexander Ehlert e-mailed me to tell
   me that lynx can be used to dump an HTML page to text, so I will be
   using that for this and future issues (until I find a better document
   management system).
     _________________________________________________________________
   
What's New?

   New versions of TSK and Autopsy are being released soon. TSK v2 has
   many new features including disk and split image support (as discussed
   later in this issue), file system type autodetection, and a new
   internal design. There were also several new features added to
   existing tools. The new disk_sreset tool was added to remove an HPA
   from an ATA disk and the diskstat tool was renamed to disk_stat in an
   attempt to make the tool names clearer. Autopsy has been updated to
   version 2.04 and it supports the new disk and split images.
   
   The 5th Annual Digital Forensic Research Workshop (DFRWS) announced
   its Call for Papers in January. One of the areas that we are
   interested in is general tool design or testing, so it could be a good
   place to publish papers on tools based on TSK/Autopsy or other open
   source tools.
   
   www.dfrws.org
     _________________________________________________________________
   
Call For Papers

   The Sleuth Kit Informer is looking for articles on open source tools
   and techniques for digital investigations (computer / digital
   forensics) and incident response. Articles that discuss The Sleuth Kit
   and Autopsy are appreciated, but not required. Example topics include
   (but are not limited to):
   
     * Tutorials on open source tools
     * User experiences and reviews of open source tools
     * New investigation techniques using open source tools
     * Open source tool testing results
       
   http://www.sleuthkit.org/informer/cfp.html
     _________________________________________________________________
   
New Image File Support

   Brian Carrier
   
  Overview
  
   Version 2 of The Sleuth Kit (TSK) has (finally) added support for
   image files other than raw partition images. TSK now supports raw disk
   and split images, and future versions will support non-raw and
   compressed formats. This article describes how to use the new features
   and gives a high-level description of how they were implemented.
   
  Usage
  
   There are two new things that you must consider with TSK. One is the
   image file format and the second is the offset location of a specific
   partition or file system in a disk image. Accordingly, there are also
   two new command line flags. The -i flag is optional and is used to
   specify the image file format. If it is not given, then the tool will
   try to detect the format. The second flag is -o and it is used to
   specify the offset where a specific partition starts.
   
   When specified, the image type argument is a list of one or more
   format types separated by commas. Currently, the argument needs only
   one type, but future versions may require multiple types. The
   currently supported types are raw and split and a basic image would
   use the arguments -i raw or -i split. In the future, the tools may
   support ACME Company's image file format with embedded data and if the
   image file is split among several files you would use -i acme,split.
   
   The offset argument is, by default, in sectors. For example, to
   specify that the partition starts at sector 63 you would use -o 63. If
   you want to specify the offset using a different block size, then the
   block size can be given with the format of offset@blocksize. For
   example, to specify that the partition starts at block 1000 and each
   block is 2,048 bytes then you would use -o 1000@2048.
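
   The offset syntax described above can be illustrated with a short
   sketch. The helper name offset_to_bytes is hypothetical and is not
   part of TSK, and the 512-byte sector size is an assumption; the
   sketch only follows the two rules stated above.

```python
SECTOR_SIZE = 512  # assumption: the default unit is a 512-byte sector


def offset_to_bytes(arg):
    """Convert an '-o' style argument ('63' or '1000@2048') to a byte offset."""
    if "@" in arg:
        # offset@blocksize: the unit is given explicitly after the '@'
        count, block_size = arg.split("@", 1)
        return int(count) * int(block_size)
    # plain number: the unit is the default sector size
    return int(arg) * SECTOR_SIZE


print(offset_to_bytes("63"))         # 63 sectors of 512 bytes -> 32256
print(offset_to_bytes("1000@2048"))  # 1000 blocks of 2,048 bytes -> 2048000
```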
   
   The position of the image file names on each command line has not
   changed. If split images are used, then the names must be given in
   sorted order. For example: fls image.dd.01 image.dd.02 image.dd.03
   image.dd.04 .... You can use the * wildcard to specify a large number
   of files: fls image.dd.*.
   
   Here is an example with all possible arguments:
    # fls -i split -o 63 -f ntfs disk1.dd.*

   Here is an example using the new autodetect features:
    # fls -o 63 disk1.dd.*

   If you have a raw partition image, then you can skip the -o argument
   (and take advantage of the new file system type autodetect feature):
    # fls part1.dd

  New Tool
  
   There is a new tool to help with the image file formats. The img_stat
   tool displays details about the image file, such as the sector range
   of each split image file. For future file formats it will also show
   other embedded data. The -t flag can be used to determine the file
   format type.
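
   To make the idea of per-file sector ranges concrete, here is a small
   sketch of how such ranges could be derived from the sizes of the
   split files. It is not the img_stat implementation or its output
   format; the function name is hypothetical, and it assumes 512-byte
   sectors and that every split file holds a whole number of sectors.

```python
SECTOR_SIZE = 512  # assumption: 512-byte sectors


def segment_sector_ranges(sizes_in_bytes):
    """Return (first_sector, last_sector) for each split file, in order."""
    ranges = []
    start = 0
    for size in sizes_in_bytes:
        sectors = size // SECTOR_SIZE
        ranges.append((start, start + sectors - 1))
        start += sectors  # the next file continues where this one ends
    return ranges


# Three split files of 2, 1 and 4 sectors respectively.
print(segment_sector_ranges([1024, 512, 2048]))  # [(0, 1), (2, 2), (3, 6)]
```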
   
  Implementation Overview
  
   For those interested in code-level information about the new image
   support, this section will fill you in. The new features were added by
   creating a new imgtools library. This library is used to read the data
   from the image. The file system code never knows which image format is
   used.
   
   Before the file system is processed, the image file is opened using
   the img_open function. This function determines the format type and
   initializes an IMG_INFO data structure. That data structure is passed
   to the file system and media management code and is used to read all
   data. The imgtools library contains all of the code to read the image
   files.
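
   The design can be sketched as follows. TSK itself is written in C
   and this is not its code; the class name SplitImage and its methods
   are hypothetical. The point is only that the caller asks for bytes
   at a logical offset and never learns how many files back the image.
   A raw image is simply the case of a single file.

```python
import os


class SplitImage:
    """Presents an ordered list of segment files as one contiguous image."""

    def __init__(self, paths):
        self.segments = []  # (start_offset, size, path) for each file
        start = 0
        for path in paths:
            size = os.path.getsize(path)
            self.segments.append((start, size, path))
            start += size
        self.size = start

    def read(self, offset, length):
        """Read up to `length` bytes starting at logical byte `offset`."""
        chunks = []
        for start, size, path in self.segments:
            if length <= 0:
                break
            if offset >= start + size:
                continue  # the requested data begins in a later segment
            with open(path, "rb") as handle:
                handle.seek(offset - start)
                data = handle.read(min(length, start + size - offset))
            chunks.append(data)
            offset += len(data)
            length -= len(data)
        return b"".join(chunks)
```

   Code that uses such an object, for example to read a boot sector
   with image.read(63 * 512, 512), works unchanged whether the image
   is stored in one file or in many.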
   
  Conclusion
  
   Version 2 of TSK has finally introduced disk and split image support,
   which will make setting up a case much easier. This article has
   described the basics of using the new features and tools.
     _________________________________________________________________
   
Hooking IO Calls for Multi-Format Image Support

   Michael Cohen <scudette at users dot sourceforge dot net>
   
  Overview
  
   Often when analysing hard disk images, the image may be provided in a
   slightly different format to the expected partition dd image. This may
   happen because the image was split into multiple files, or it might be
   that the image was acquired using Encase (TM) which uses its own
   proprietary image file format.
   
   Many forensic tools require the image to be in a specific format. For
   example, previous versions of the Sleuthkit required the image to be
   an uncompressed partition image, such as one obtained using the dd
   command line::
   
  dd if=/dev/hda1 of=image.dd

   If the raw disk was used, i.e. /dev/hda, the investigator was forced
   to use dd to "slice" the original image into partitions depending on
   the partition table (Note that the 63 sector skip is normally found
   from the partition table, using sfdisk, mmls or a similar tool)::
   
  dd if=disk_image.dd of=partition_image.dd bs=512 skip=63

   If the original disk was very large to start with, this was a time
   consuming operation. It would be nice to have an abstraction layer
   which converts between the different formats of images (a partition
   image vs. a disk image) on the fly without having to copy the image
   again.
   
   This functionality becomes even more desirable when considering the
   analysis of images which have been stored using compression. For
   example, the popular forensic package Encase(tm) stores images in a
   proprietary format called `The Expert Witness Compression Format`[1].
   This format provides compression as well as splitting large images
   into manageable parts. By providing a transparent abstraction layer it
   is possible to enable any tool to automatically support the image
   format.
   
  Hooking IO for fun and profit
  
   The PyFlag[2] forensic package used to have an IO Subsystem patch for
   the Sleuthkit which enabled it to operate on a number of different
   file formats. Although the Sleuthkit is an excellent tool, it soon
   became obvious that the same functionality was also required of other
   tools, like strings, sfdisk etc.
   
   Modifying the source code of an application increased the amount of
   code maintenance, because the IO subsystem patch had to be retrofitted
   each time a new version of the Sleuthkit was released. The developers
   of PyFlag had to find a better way. Ideally the solution would require
   no source code modification, and allow arbitrary programs to handle
   the supported file formats transparently.
   
   The obvious solution to this problem was an abstraction layer based on
   library hooking techniques.
   
   When a program wishes to perform an IO operation on a file (for
   example open, read or write the file), it is very rare that the
   program issues the kernel system call directly. In fact, most programs
   will call the C library's open(), read() and write() functions as
   required. Since most programs are dynamically linked rather than
   statically linked, the linking of the C library code is done at
   run time, by the dynamic linker.
   
   Most dynamic linker implementations (and in particular the GNU libc
   dynamic loader) allow a library to be loaded first, before loading
   other system libraries. Also, if a library provides a required symbol,
   the linker will stop searching for that symbol in other libraries.
   This property allows a library to "hook" a library function by simply
   masking the library function with a locally defined function.
   
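   An example serves to illustrate the technique. Assume we have the
   following program, written in C (error checking omitted for
   brevity)::
   
 #include <fcntl.h>
 #include <unistd.h>
 #define SIZE 512                 /* arbitrary buffer size */
 int main(void) {
   char buffer[SIZE];
   int fd = open("somefile", O_RDONLY);  /* libc wrapper for the system call */
   read(fd, buffer, SIZE);
   close(fd);
   return 0;
 }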

   When this program is executed, it calls the C library's open function
   (which actually does the system call). The program then reads some
   data from the filehandle, by calling the C library's read function,
   and finally calls the library's close function to close the
   filehandle.
   
   In the glibc implementation of the dynamic loader (the one used in
   most Linux systems), the environment variable LD_PRELOAD specifies to
   the linker that the named library should be loaded before any other
   libraries. If the desired symbol is present within the named library
   it will mask other functions with the same name present in other
   libraries.
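
   The masking rule can be shown with a toy model. This is not how the
   dynamic linker is implemented (the real one also deals with symbol
   versioning, scopes and much else); it only models the search order.
   The function name resolve and the symbol sets are illustrative
   assumptions.

```python
def resolve(symbol, libraries):
    """Return the name of the first library in search order defining symbol."""
    for name, symbols in libraries:
        if symbol in symbols:
            return name  # the search stops at the first match
    raise LookupError(symbol)


# Illustrative symbol tables, not the real contents of either library.
libc = ("libc.so.6", {"open", "read", "close", "printf"})
hooker = ("libio_hooker.so", {"open", "read", "close"})

normal_order = [libc]           # no preloading
preload_order = [hooker, libc]  # preloaded library is searched first

print(resolve("open", normal_order))     # libc.so.6
print(resolve("open", preload_order))    # libio_hooker.so
print(resolve("printf", preload_order))  # libc.so.6 (not masked)
```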
   
   In our case, we wish to hook the open(), read() and close() functions,
   hence we need to create a shared object (a library - we shall call it
   the hooker object) with these functions defined. After setting
   LD_PRELOAD to the location of the hooker object we have created, our
   library will trap all calls to the specified functions::
   
  External program ---> Hooker object ---> real libc functions

   The result of this is that as far as the external program is
   concerned, it is operating on a simple partition image as would have
   been obtained using dd. In practice however, the hooker object is able
   to read more complex images, emulating a simple partition image to the
   external program.
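
   The emulation idea can be imitated inside a single process. The
   sketch below is an analogy only: the real hooker object is a C
   shared library that intercepts libc calls, whereas this hypothetical
   PartitionView class wraps a Python file object. The caller reads
   from offset 0 and never sees the data that precedes the partition.

```python
import io


class PartitionView:
    """Makes the bytes after `offset` in a disk image look like a whole file."""

    def __init__(self, raw, offset):
        self._raw = raw
        self._offset = offset
        self.seek(0)

    def seek(self, pos):
        # Translate the caller's position into a position in the disk image.
        self._raw.seek(self._offset + pos)
        return pos

    def tell(self):
        return self._raw.tell() - self._offset

    def read(self, size=-1):
        return self._raw.read(size)


# A stand-in "disk image": 4 bytes of leading data, then the partition.
disk = io.BytesIO(b"MBR-" + b"partition data")
view = PartitionView(disk, 4)
print(view.read(9))  # b'partition'
```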
   
  Implementation
  
   The PyFlag iohooker tool implements this technique. Not only does it
   hook open, read, write etc., it also hooks the stream functions fopen,
   fread, fwrite etc. It currently supports many different external
   programs, such as dd, sfdisk, all Sleuthkit executables, strings and
   many more.
   
   IOHooker is distributed in two components. The main component is a
   shared object called libio_hooker.so. In order to control this object,
   environment variables are set by a wrapper program: iowrapper.
   
   For the purposes of demonstration we download the `binary version of
   PyFlag`[3]. We untar the distribution in our home directory, and
   change directory into it.
   
   The first step, prior to being able to use the iowrapper, is to set
   the LD_LIBRARY_PATH environment variable. This is required to allow
   the dynamic linker to find libio_hooker.so. If we fail to set this
   properly, the linker cannot run the iowrapper::
   
  ~/pyflag$ ./bin/iowrapper -h
  ./bin/iowrapper: error while loading shared libraries:
  libio_hooker.so: cannot open shared object file: No such
  file or directory

   After setting the LD_LIBRARY_PATH environment variable, we are able to
   run the iowrapper normally::
   
  ~/pyflag$ export LD_LIBRARY_PATH=`pwd`/libs/
  ~/pyflag$ ./bin/iowrapper

  This program wraps library calls to enable binaries to operate
  on images with various formats. NOTE: Ensure that libio_hooker.so
  is in your LD_LIBRARY_PATH before running this wrapper.

  Usage: ./bin/iowrapper -i subsys -o option prog arg1 arg2 arg3...
        -i subsys: The name of a subsystem to use (help for a list)
        -o optionstr: The option string for the subsystem (help for an example)
        -f wrapped filename: All wrapped filenames will start
  with this string. This is useful for programs that need to
  open other files as well as the target file (for example
  /usr/bin/file needs to open magic files as well).
  Loading library now for hooking

   The final message "Loading library now for hooking" confirms that the
   hooker object is properly initialised and ready. Let us first check to
   see what IO Subsystems are supported by the iowrapper::
   
  ~/pyflag$ ./bin/iowrapper -i help
  Loading library now for hooking
  Available Subsystems:

        standard - Standard Sleuthkit IO Subsystem
        advanced - Advanced Sleuthkit IO Subsystem
        sgzip - Seekable Gzip format
        ewf - Expert Witness Compression format
        raid - Raid 5 implementation
  Unhandled Exception(IO Error): No such IO subsystem: help

   Each subsystem requires specific options that make sense for it. The
   advanced subsystem allows users to specify arbitrary offsets, as well
   as multiple split image sets. We can get a more detailed explanation
   of these options::
   
  ~/pyflag$ ./bin/iowrapper -i advanced -o help
  Loading library now for hooking
  Advanced io subsystem options

        offset=bytes            Number of bytes to seek to in
  the image file. Useful if there is some extra data at the start
  of the dd image (e.g. partition table/other partitions)
        file=filename           Filename to use for split files.
  If your dd image is split across many files, specify this parameter
  in the order required as many times as needed for seamless
  integration
        A single word without an = sign represents a filename to use

   For our first example, we use the Sleuthkit's fls tool to list the
   files present in partition 6 of a hard disk image. The fls tool does
   not provide the option of selecting an offset into the image for the
   start of the filesystem, hence we need to wrap it. First we calculate
   the offset where the partition starts::
   
  /pyflag# sfdisk -uS -l /tmp/test.dd
  Disk /tmp/test.dd: cannot get geometry

  Disk /tmp/test.dd: 0 cylinders, 0 heads, 0 sectors/track
  read: Inappropriate ioctl for device

  Warning: The partition table looks like it was made
    for C/H/S=*/255/63 (instead of 0/0/0).
  For this listing I'll assume that geometry.
  Units = sectors of 512 bytes, counting from 0

     Device Boot    Start       End   #sectors  Id  System
  /tmp/test.dd1            63     96389      96327  de  Dell Utility
  /tmp/test.dd2   *     96390  19647494   19551105   7  HPFS/NTFS
  /tmp/test.dd3      19647495  58733639   39086145   c  W95 FAT32 (LBA)
  /tmp/test.dd4      58733640 117210239   58476600   5  Extended
  /tmp/test.dd5      58733703  59328044     594342  82  Linux swap
  /tmp/test.dd6      59328108 117210239   57882132  83  Linux

   Partition 6 starts at sector 59328108, which is 59328108 * 512 =
   30375991296 bytes into the image. We can therefore use the wrapper to
   force fls to read the file system located at that offset::
   
  ~/pyflag$ ./bin/iowrapper -i advanced -o offset=30375991296,filename=/tmp/test.dd fls \
  -f linux-ext3 foobar
  Set file to read from as /tmp/test.dd
  d/d 11: lost+found
  d/d 32769:      etc
  l/l 12: cdrom
  d/d 131073:     var
  ...
  d/d 3211272:    opt
  d/d 3555336:    initrd
  l/l 16: vmlinuz

   Note that as far as fls is concerned it is opening and reading the
   file foobar. It does not realise that foobar does not exist, since the
   wrapper provides it with valid data.
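
   The offset calculation used above can be automated. The following
   sketch assumes the sfdisk listing format shown earlier (sector
   units, with an optional * boot flag in the second column); the
   function name is hypothetical.

```python
def start_offset_bytes(sfdisk_line, sector_size=512):
    """Return the byte offset of the partition described by one listing line."""
    fields = sfdisk_line.split()
    if fields[1] == "*":  # bootable partitions carry a '*' flag column
        start_sector = int(fields[2])
    else:
        start_sector = int(fields[1])
    return start_sector * sector_size


line = "/tmp/test.dd6      59328108 117210239   57882132  83  Linux"
print(start_offset_bytes(line))  # 30375991296
```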
   
   For the next example, we used Encase(tm) to create an evidence file of
   a floppy disk. The file command is unable to determine what is stored
   inside the image, due to it being encoded in the proprietary EWF
   format::
   
  ~/pyflag$ file test.e01
  test.e01: data
  ~/pyflag$ hexdump -C test.e01 | head
  00000000  45 56 46 09 0d 0a ff 00  01 01 00 00 00 68 65 61  |EVF..........hea|
  00000010  64 65 72 00 00 00 00 00  00 00 00 00 00 b2 00 00  |der.............|
  00000020  00 00 00 00 00 a5 00 00  00 00 00 00 00 80 00 10  |................|
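
   The first eight bytes of this dump can serve as a quick test for the
   format. The sketch below treats those bytes as the file signature;
   the function name is hypothetical, and a match only suggests an EWF
   file, it does not validate one.

```python
# The first 8 bytes of the dump above: "EVF" followed by 09 0d 0a ff 00.
EVF_SIGNATURE = bytes.fromhex("45 56 46 09 0d 0a ff 00")


def looks_like_ewf(header):
    """Return True if the data starts with the signature seen above."""
    return header.startswith(EVF_SIGNATURE)


first_line = bytes.fromhex("45 56 46 09 0d 0a ff 00 01 01 00 00 00 68 65 61")
print(looks_like_ewf(first_line))  # True
```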

   Let's wrap the hexdump program to show the contents of the raw image::
   
  ~/pyflag$ ./bin/iowrapper -i ewf -o filename=test.e01 hexdump -C test.e01 | head
  00000000  eb 3c 90 4d 53 44 4f 53  35 2e 30 00 02 01 01 00  |.<.MSDOS5.0.....|
  00000010  02 e0 00 40 0b f0 09 00  12 00 02 00 00 00 00 00  |...@............|
  00000020  00 00 00 00 00 00 29 fc  02 29 08 4e 4f 20 4e 41  |......)..).NO NA|
  00000030  4d 45 20 20 20 20 46 41  54 31 32 20 20 20 33 c9  |ME    FAT12   3.|

   From this hexdump it looks like the image is that of a FAT12 floppy
   disk. To confirm we can run the file command over the image. Since
   file opens files other than the image (it needs to open the magic
   file), we need to prevent the hooker from hooking those other files
   (otherwise when the file program tries to open its magic file, it
   will get the image instead). To this end we can use the -f flag to
   restrict hooking only to files of a given name::
   
  ~/pyflag$ ./bin/iowrapper -i ewf -f test.e01 -o filename=test.e01 file test.e01
  test.e01: x86 boot sector, code offset 0x3c, OEM-ID "MSDOS5.0", root entries 224,
  sectors 2880 (volumes <=32 MB) , sectors/FAT 9, serial number 0x82902fc,
  unlabeled, FAT (12 bit)
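
   The values reported by file can be cross-checked against the bytes
   in the earlier hexdump. The sketch below decodes a few fields using
   the standard FAT12/16 boot sector layout; the function name is
   hypothetical and only the fields shown in the file output are
   decoded.

```python
import struct

# The 64 bytes shown in the wrapped hexdump output above.
BOOT_SECTOR_START = bytes.fromhex(
    "eb 3c 90 4d 53 44 4f 53 35 2e 30 00 02 01 01 00 "
    "02 e0 00 40 0b f0 09 00 12 00 02 00 00 00 00 00 "
    "00 00 00 00 00 00 29 fc 02 29 08 4e 4f 20 4e 41 "
    "4d 45 20 20 20 20 46 41 54 31 32 20 20 20 33 c9"
)


def parse_fat_boot_sector(data):
    """Decode selected little-endian fields of a FAT12/16 boot sector."""
    root_entries, total_sectors = struct.unpack_from("<HH", data, 17)
    (sectors_per_fat,) = struct.unpack_from("<H", data, 22)
    (serial,) = struct.unpack_from("<I", data, 39)
    return {
        "oem_id": data[3:11].decode("ascii"),
        "root_entries": root_entries,
        "total_sectors": total_sectors,
        "sectors_per_fat": sectors_per_fat,
        "serial": hex(serial),
        "fs_type": data[54:62].decode("ascii").rstrip(),
    }


for name, value in parse_fat_boot_sector(BOOT_SECTOR_START).items():
    print(name, value)
```

   The decoded values (224 root entries, 2880 sectors, 9 sectors per
   FAT, serial number 0x82902fc) agree with the file output above.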

   Sleuthkit's fls can be used on this Encase image::
   
  ~/pyflag# ./bin/iowrapper -i ewf -f test.e01 -o filename=test.e01 ./bin/fls -f fat12 test.e01
  r/r 9:  gunzip.exe
  r/r 11: Hiew.exe
  r/r 12: tar.exe
  r/r 22: cygwin1.dll
  ..

   Finally we wish to extract the Encase image into a standard dd image.
   We wrap dd and redirect the output to a file::
   
  ~/pyflag$ ./bin/iowrapper -i ewf -f test.e01 -o filename=test.e01 dd if=test.e01 > /tmp/test.dd

  Remote Access to live systems
  
   Sometimes we wish to analyse a live Unix system remotely. This may be
   so we can quickly see if the system is compromised, without having to
   acquire the entire image first. We can use our forensic tools to
   examine the remote raw device by using the remote IO subsystem.
   
   .. note:: This type of analysis is quite fragile because the system is
   still live, and using its file system. The forensic tools are
   accessing the raw device while it is being modified which makes it
   susceptible to race conditions. For example, if a file is removed just
   as the forensic utility is accessing its directory inode, inconsistent
   data may be obtained.
   
   The ramification of this is that forensic tools may crash, or provide
   inconsistent results. It is impossible, however, for the IO subsystem
   to alter the live system in any way (since the raw device is opened as
   read only).
   
   One of the common problems with accessing a remote system is
   authentication and encryption. Access to the raw device over the
   network could easily lead to a root compromise by disclosing sensitive
   system information (e.g. the shadow file). The problem of
   authentication and encryption is best left to dedicated programs, such
   as Secure Shell (ssh). This is the approach taken by the remote access
   IO subsystem. The only requirements on the live system are an ssh
   server, and the remote_server program (which may be compiled
   statically).
   
   These are the steps required to access remote raw devices over the
   network:
    1. Have a static version of remote_server (the remote server
       component) installed on the remote system.
    2. Have an ssh server available with root logons allowed.
    3. Use the local system to access the remote raw device by wrapping
       library calls through the wrapper.
       
   The following is an example of a session run against a remote
   target machine::
   
  ~/pyflag$ ./bin/iowrapper -i remote -o host=target,\
    server_path=/path/to/remote_server,device=/dev/hda \
    mmls -t dos foo

  DOS Partition Table
  Units are in 512-byte sectors

       Slot    Start        End          Length       Description
  00:  -----   0000000000   0000000000   0000000001   Primary Table (#0)
  01:  -----   0000000001   0000000062   0000000062   Unallocated
  02:  00:00   0000000063   0000096389   0000096327   Dell Utilities FAT (0xde)
  03:  00:01   0000096390   0019647494   0019551105   NTFS (0x07)
  04:  00:02   0019647495   0058733639   0039086145   Win95 FAT32 (0x0C)
  05:  00:03   0058733640   0117210239   0058476600   DOS Extended (0x05)
  06:  -----   0058733640   0058733640   0000000001   Extended Table (#1)
  07:  -----   0058733641   0058733702   0000000062   Unallocated
  08:  01:00   0058733703   0059328044   0000594342   Linux Swap / Solaris x86 (0x82)
  09:  01:01   0059328045   0117210239   0057882195   DOS Extended (0x05)
  10:  -----   0059328045   0059328045   0000000001   Extended Table (#2)
  11:  -----   0059328046   0059328107   0000000062   Unallocated
  12:  02:00   0059328108   0117210239   0057882132   Linux (0x83)


   We can now list the contents of the Windows partition::
   
  ~/pyflag$ ./bin/iowrapper -i remote -o host=target,\
    server_path=/path/to/remote_server,device=/dev/hda,\
    offset=0000096390s fls -f ntfs foo

  d/d 12763-144-4:        Documents and Settings
  d/d 6672-144-3: DRIVERS
  d/d 6941-144-6: I386
  r/r 6915-128-3: IO.SYS
  d/d 62628-144-5:        LDIR
  r/r 6916-128-3: MSDOS.SYS
  d/d 16844-144-1:        My Music
  r/r 6671-128-3: NTDETECT.COM
  r/r 6670-128-3: NTLDR
  d/d 13231-144-4:        Program Files
  ...

   In the above analysis we use the following parameters:
   
host
 The host we should try to log on to.

server_path
 The path to the remote_server program. This program must reside on the
 remote machine.

device
 The raw device to export.

offset
 An offset to use on the remote device. This can be specified in sectors
 (s), kilobytes (k) or megabytes (m) depending on the suffix.
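
   The suffix rule for the offset parameter can be sketched as follows.
   This is not the PyFlag parser. The function name is hypothetical,
   and the unit sizes (512-byte sectors, 1024-byte kilobytes) and the
   treatment of an unsuffixed value as bytes are assumptions.

```python
# Assumed unit sizes for the three suffixes.
UNIT_BYTES = {"s": 512, "k": 1024, "m": 1024 * 1024}


def remote_offset_to_bytes(value):
    """Convert an offset such as '0000096390s' to a byte count."""
    suffix = value[-1].lower()
    if suffix in UNIT_BYTES:
        return int(value[:-1]) * UNIT_BYTES[suffix]
    return int(value)  # assumption: no suffix means the value is in bytes


print(remote_offset_to_bytes("0000096390s"))  # 96390 * 512 = 49351680
```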

   .. note:: This analysis would easily reveal to us if there are hidden
   files or directories, even in cases where kernel level rootkits are
   installed. This is because most kernel level rootkits trap system
   calls accessing files on the filesystem, but do not filter access to
   raw devices. Since fls is reading the filesystem structures on the raw
   device, it is independent of the kernel's filesystem driver or
   filesystem related system calls.
   
   Although it is conceivable that rootkits can filter the raw device to
   hide files, this will dramatically increase the complexity of the
   rootkit.
   
  Conclusions
  
   Library hooking is a powerful technique which enables a wrapper to be
   inserted between an arbitrary executable and the image. PyFlag has
   developed an image abstraction layer which allows arbitrary programs
   to automatically support a variety of forensic image formats
   transparently.
   
   The remote IO subsystem allows for the remote access and analysis of
   raw devices by forensic tools, making it possible to detect some
   kernel level rootkits remotely.
   
   [1] The Expert Witness Compression Format:
   http://www.asrdata.com/SMART/whitepaper.html
   [2] PyFlag: http://pyflag.sourceforge.net/
   [3] binary version of PyFlag:
   http://pyflag.sourceforge.net/Downloads/index.html
     _________________________________________________________________
   
Copyright (c) 2005 by Brian Carrier. All Rights Reserved.

This article is licensed under the Creative Commons
Attribution-NonCommercial-ShareAlike License.
http://creativecommons.org/licenses/by-nc-sa/2.0/
Creative Commons, 559 Nathan Abbott Way, Stanford, California 94305