Windows Server can deduplicate file data. Because of the way Microsoft implemented this feature, it can cause problems during IT forensics. Fortunately, deduplication is neither a standard feature nor enabled by default: it has to be installed explicitly as a feature and enabled per volume. It is NOT implemented in the filesystem itself but as a bolt-on solution using a file system filter.
First learning: deduplication works well; it saved around 37% of the space in the case at hand. But if you want to do filesystem forensics, you have to know about it and understand how it works: https://docs.microsoft.com/en-us/windows-server/storage/data-deduplication/understand#how-does-dedup-work
To cut a long story short:
- Files are written normally at first; deduplication kicks in later and optimizes the files once they are on disk.
- Deduplication works at file level but divides the files into chunks of varying size during optimization.
- All chunks are stored in one big data store in C:\System Volume Information\Dedup; the information about which file consists of which chunks, and in what order, is also stored under C:\System Volume Information.
- The original data stream is replaced by a reparse point. Reading a deduplicated file is therefore redirected to the “Data Deduplication file system filter” (Dedup.sys), which takes care of reassembling the file.
- Which extensions are subject to deduplication can be configured using a policy
- To see the status of the deduplication: “Get-DedupStatus | Format-List”
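The reparse point mentioned above is what makes a deduplicated file recognizable during triage. The following is a minimal Python sketch, not a definitive tool: it assumes you already have the Windows file attributes and the reparse tag of a file (for example from Python's os.lstat() on Windows, which exposes st_file_attributes and st_reparse_tag in recent versions, or from your forensic tool's export). The function name looks_deduplicated is made up for this illustration; the two constants are the values Microsoft documents for the reparse point attribute and for IO_REPARSE_TAG_DEDUP.

```python
from types import SimpleNamespace

# Windows file attribute flag marking a file as a reparse point.
FILE_ATTRIBUTE_REPARSE_POINT = 0x400
# Reparse tag documented by Microsoft for Data Deduplication (IO_REPARSE_TAG_DEDUP).
IO_REPARSE_TAG_DEDUP = 0x80000013


def looks_deduplicated(st):
    """Return True if a stat-like object describes a dedup reparse point.

    `st` only needs the attributes st_file_attributes and st_reparse_tag;
    missing attributes (e.g. on non-Windows systems) are treated as 0.
    """
    attrs = getattr(st, "st_file_attributes", 0)
    tag = getattr(st, "st_reparse_tag", 0)
    is_reparse_point = bool(attrs & FILE_ATTRIBUTE_REPARSE_POINT)
    return is_reparse_point and tag == IO_REPARSE_TAG_DEDUP


# Hypothetical records standing in for real stat results:
optimized_file = SimpleNamespace(st_file_attributes=0x420, st_reparse_tag=0x80000013)
ordinary_file = SimpleNamespace(st_file_attributes=0x20, st_reparse_tag=0)
```

A file flagged this way holds no usable data stream of its own, so its content can only be read through a system that has the dedup file system filter installed.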
This means that digital forensics on the file system has to take special care of this functionality. The dedup information seems to be system specific, or at least tied to a certain disk layout, meaning that a file-level copy will not work. Obviously, mounting the filesystem under Linux, or on another server that does not have the dedup file system filter installed, will NOT give access to the file data (although you will see the file names).
Learnings for ROBOCOPY:
- If you use robocopy to copy files from the original drive to a new drive:
  - Robocopy accesses the files through the file system filter, so it reads the reassembled data. The copies on the new drive are not deduplicated as long as deduplication is not enabled for the new volume.
  - Take care to exclude the System Volume Information directory (/XD "System Volume Information"). Otherwise your copy will take a lot more space than you expected, as you are also copying the dedup chunk files.
- If you use robocopy in “/MIR” mode to copy onto a drive with deduplication enabled:
  - Take care to exclude the System Volume Information directory (/XD "System Volume Information"). Otherwise robocopy will delete your dedup chunk files and your files will become unusable.
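To make the exclusion hard to forget, the robocopy call can be wrapped in a small helper. The following Python sketch only builds the argument list; the function names build_robocopy_args and robocopy_succeeded are invented for this example, and the choice of /E for the non-mirror case is an assumption about what you want to copy. /E, /MIR and /XD are standard robocopy switches.

```python
def build_robocopy_args(source, destination, mirror=False):
    """Build a robocopy argument list that always skips the dedup chunk store."""
    args = ["robocopy", source, destination]
    # /MIR mirrors the tree (and deletes extra files at the destination);
    # /E copies subdirectories, including empty ones, without deleting anything.
    args.append("/MIR" if mirror else "/E")
    # Never copy or purge "System Volume Information", where the chunks live.
    args += ["/XD", "System Volume Information"]
    return args


def robocopy_succeeded(exit_code):
    """Robocopy exit codes below 8 mean no failure occurred; 8 and above mean errors."""
    return 0 <= exit_code < 8
```

On a Windows machine, the list can be handed to subprocess.run() and the resulting return code checked with robocopy_succeeded(); passing the arguments as a list avoids quoting problems with the space-containing directory name.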
- Further information