A Different View of Forensic Artefact Typologies

There are many ways to categorise forensic artefacts. Probably the best known (and best presented) is the SANS Windows Forensic Analysis poster, which lists artefacts under headings of the form ‘Evidence of…’. It’s a great reference and training tool. But could we look at forensic artefacts differently? And why should we?

By categorising data differently, we can look at problems differently. Solving complex problems is a significant part of our work, and a different categorisation also helps when we have to explain complex topics (verbally or in writing) to non-technical people.

So what is another way to categorise forensic artefacts?

User Data:

The most identifiable forensic artefacts are those created by users themselves. They are transparent to everyone, high quality and high fidelity. They are usually controlled by the user, who has the opportunity to delete or modify the data. A user can always deny creating an artefact, and user data is vulnerable to anti-forensic tools and methods. Automated tools can also create user data, so an analyst has to be aware of this and consider the appropriate context and assumptions.

Examples: user files (documents, spreadsheets), browser bookmarks, photos/videos, SMS, call logs etc.

User Experience:

‘User Experience’ artefacts are created by the OS or an application to enable user functionality and productivity, but they also provide an additional benefit to a DFIR analyst. They are not directly created by the user but are a by-product of user-created artefacts, and generally the two can’t exist without each other. Anti-forensic tools often target this data as well.

Examples: OS and software customisations, link files, recycle bin, shellbags, account data, wallpaper, search history, prefetch, recent file lists, thumbnails, command-line history, cookies etc.

Operating System, Application, or Network Data:

Artefacts in this category are created by the operating system, software or network to enable system optimisation and functionality. They are usually opaque to a regular user, and may be created with or without the user engaging in activity on the device. General anti-forensic tools are less effective against these artefacts.

Examples: NTFS metadata, previously existing data, registry hive data, configuration files, switch/router data, non-security logs.

Purposely Collected Data:

Purposely collected artefacts are specifically captured to increase security. This may be automatically instigated, configured, and controlled by a user or network administrator. In general, this doesn’t provide any additional functionality to the end-user, and can be either transparent or opaque depending on the artefact and its configuration.

Examples: firewall logs, AV logs, Netflow, passive DNS, memory captures, syslog, event logs.

Protected Artefacts:

Protected artefacts are user, OS and application data which is deliberately encrypted or obfuscated.

Examples: PINs, passwords, private key data, biometric data, encrypted data, SSL/TLS traffic.

Undisclosed/Undocumented Artefacts:

This category is difficult to define. Usually, these artefacts are obtained to enable access to other (often protected) artefacts. They include sensitive application debug information, legacy/deprecated features that should have been removed in development, or even exploits enabling access to protected artefacts. Obtaining these artefacts can potentially destabilise the system.

Examples of these artefacts can fall into the Donald Rumsfeld zone of ‘known unknowns’ and ‘unknown unknowns’. For example, it is public knowledge that two companies can bypass certain Apple iPhone lock screens; however, how this is achieved is not public. Conversely, there may be other types of artefact, available through methods held by governments and companies, whose existence is not publicly known. This category should not be confused with technical intelligence gathering such as signals intelligence, which is out of scope of this discussion. Lastly, there are certain artefacts that simply haven’t been identified yet. The recent (and rapid) changes to Windows 10 are an example of that. For example, check out this Twitter thread for how a new forensic artefact is discovered and reverse engineered.

Examples: undocumented APIs, programming errors (e.g. encryption routines), hardware/software vulnerabilities.

(UPDATE 22 June 2018: See this CrowdStrike blog post in relation to a previous ‘known unknown’ on accessing additional Office 365 metadata using an undocumented API).

It is this last category that is potentially the most controversial. The term ‘forensic’ stems from the Latin ‘forensis’ (‘of the forum’), the forum being the open, public place where politics was discussed and business conducted. It is now understood to mean (in part) ‘relating to the courts of law’, which (in liberal democracies) should mean a transparent process that is open to scrutiny. Where does an undocumented artefact fit in this mix? What is the best way to verify such an artefact? For an internal IR investigation where a company needs to identify the who/what/why/how of an incident, an undisclosed artefact may suit the purpose. But what about for court? The context and end-use are critical.

However forensic artefacts are categorised, thinking differently about what we are analysing can help explain the context to ourselves and our target audience. If we are all on the same page, then our job is half complete.
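For anyone who wants to apply the typology in tooling or report templates, here is a minimal sketch of the six categories as a Python lookup table. The enum names, the lower-case artefact keys and the `group_by_category` helper are my own illustrative choices rather than any standard vocabulary, and the mapping only covers a handful of the examples listed above.

```python
from enum import Enum


class Category(Enum):
    """The six categories described in this post."""
    USER_DATA = "User Data"
    USER_EXPERIENCE = "User Experience"
    SYSTEM_DATA = "Operating System, Application, or Network Data"
    PURPOSELY_COLLECTED = "Purposely Collected Data"
    PROTECTED = "Protected Artefacts"
    UNDISCLOSED = "Undisclosed/Undocumented Artefacts"


# A few examples taken from the lists in this post. The lower-case keys
# are an illustrative naming choice, not a standard vocabulary.
ARTEFACT_CATEGORY = {
    "browser bookmarks": Category.USER_DATA,
    "call logs": Category.USER_DATA,
    "shellbags": Category.USER_EXPERIENCE,
    "prefetch": Category.USER_EXPERIENCE,
    "ntfs metadata": Category.SYSTEM_DATA,
    "registry hive data": Category.SYSTEM_DATA,
    "firewall logs": Category.PURPOSELY_COLLECTED,
    "event logs": Category.PURPOSELY_COLLECTED,
    "private key data": Category.PROTECTED,
    "undocumented apis": Category.UNDISCLOSED,
}


def group_by_category(artefacts):
    """Group artefact names by category; unrecognised names go under None."""
    grouped = {}
    for name in artefacts:
        category = ARTEFACT_CATEGORY.get(name.strip().lower())
        grouped.setdefault(category, []).append(name)
    return grouped


if __name__ == "__main__":
    findings = ["Prefetch", "Shellbags", "Event logs", "mystery.bin"]
    for category, names in group_by_category(findings).items():
        label = category.value if category else "Uncategorised"
        print(f"{label}: {', '.join(names)}")
```

Grouping findings this way makes it easy to structure a report by category, and anything that lands under "Uncategorised" is a prompt to decide where it belongs.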

If you have other ideas or typologies for forensic artefacts, please let me know at [email protected] or on Twitter at @mattnotmax.