Touch Screen Lexicon Forensics (TextHarvester/WaitList.dat)

By Barnaby Skeggs

Preamble

Since the release of Windows 8 and the ‘Metro’ interface, touch screen input has been implemented in a rapidly rising number of Windows devices, including Microsoft Surface Pro/Book, 2-in-1s, convertible laptops and tablets. Microsoft has catered for this trend by implementing conversion of touch/pen handwriting to computer text in software such as OneNote. In this paper I will detail my research into the forensic artefact ‘WaitList.dat’, which I believe to be associated with this functionality.
I identified the ‘WaitList.dat’ artefact while investigating a Windows 8.1 PC for the presence of a known email. I was provided with a copy of this email, and part of the investigation involved identifying whether or not this email ever existed on the custodian’s computer. After processing the .PST and .OST mailbox archives on the PC, I did not identify the existence of the email. I then processed shadow copies, carved and processed for various mailbox stores and email files, and still did not identify the email. As a final attempt, I ran a string search for the email subject line across the whole forensic image. I received one hit, within ‘WaitList.dat’. Investigation of this 140 MB file identified metadata and full body text of over 36,000 emails and documents, spanning back three years.

Acknowledgements

Shaun Bettridge – Peer review, contribution to data structure analysis and being a sounding board for ideas throughout this analysis.
Carl - Peer review.

WaitList.dat

‘WaitList.dat’ (WaitList) is a data file which has been found to contain stripped text from email, contact and document files. The population of data within WaitList is associated with the ‘Microsoft Windows Search Indexer’ process. This process locks the WaitList file on a live system.
WaitList is located in the following directory on Windows 8.1 and 10 systems (may exist on other OS versions, however I do not have systems to test this):
C:\Users\%User%\AppData\Local\Microsoft\InputPersonalization\TextHarvester\WaitList.dat
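On a mounted image or an extracted file system, the path above can be checked for every user profile in one pass. The following is a minimal sketch only: `find_waitlists`, `WAITLIST_SUBPATH` and the `root` argument are illustrative names and are not part of WLrip. It also assumes the directory names match the case shown above, which matters on case-sensitive mounts.

```python
from pathlib import Path

# Relative location of WaitList.dat inside a user profile (taken from the path above).
WAITLIST_SUBPATH = Path(
    "AppData", "Local", "Microsoft", "InputPersonalization",
    "TextHarvester", "WaitList.dat",
)


def find_waitlists(root):
    """Return every WaitList.dat found under <root>/Users/<profile>/.

    `root` is the top of a mounted image or extracted file system
    (a hypothetical mount point such as "E:/").
    """
    users_dir = Path(root) / "Users"
    if not users_dir.is_dir():
        return []
    hits = []
    for profile in sorted(users_dir.iterdir()):
        candidate = profile / WAITLIST_SUBPATH
        if candidate.is_file():
            hits.append(candidate)
    return hits
```

An empty result does not prove the feature was never used; it only shows the file is not present at the expected location in the data examined.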
I have only identified WaitList on PCs which have utilized touch screen handwriting recognition features. My own touch screen laptop did not contain this file, as I had not used the feature. In order to test its creation, I set up and began using the handwriting recognition in OneNote, and WaitList was soon automatically created. The following morning, full text extracts of all emails I had received overnight were populated within WaitList.
Registry comparison before and after my test showed the following registry key modifications:
Key: HKEY_CURRENT_USER\SOFTWARE\Microsoft\InputPersonalization


Lexicon definition: the vocabulary of a person, language, or branch of knowledge

The "App Lexicon Timestamp" key is a Windows 64-bit FILETIME (Big Endian) timestamp which matches (within ~10 seconds) the installation time of the COM Class Object 'UserLexiconManager'. This is not the date when this key was created, or the date from when WaitList population commenced. On my PC, this date was prior to my purchase of the laptop, likely associated with the initial Windows 10 installation.
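For readers who want to decode such a value by hand, a FILETIME is a count of 100-nanosecond intervals since 1 January 1601 (UTC). The sketch below is a small helper of my own naming (`filetime_to_datetime`), not code taken from WLrip. The `byteorder` parameter is exposed because the registry value is described above as big endian, while the timestamps inside WaitList records are little endian.

```python
from datetime import datetime, timedelta, timezone

# FILETIME counts 100-nanosecond intervals since 1601-01-01 00:00:00 UTC.
FILETIME_EPOCH = datetime(1601, 1, 1, tzinfo=timezone.utc)


def filetime_to_datetime(raw, byteorder="little"):
    """Convert 8 raw FILETIME bytes to a timezone-aware UTC datetime."""
    if len(raw) != 8:
        raise ValueError("a FILETIME value is exactly 8 bytes")
    ticks = int.from_bytes(raw, byteorder)
    # Ten ticks make one microsecond; sub-microsecond precision is dropped.
    return FILETIME_EPOCH + timedelta(microseconds=ticks // 10)
```

For example, `filetime_to_datetime(bytes.fromhex("40597B4458F8D101"))` decodes the little-endian record timestamp shown in the Data Structure section, and passing `byteorder="big"` handles a value stored the other way round.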

Alternatively, these registry key values can be created by enabling 'Personalised Handwriting Recognition' for a supported language in the Control Panel.

Control Panel\Clock, Language and Region\Language\Language options

Theory and Further Research

As of Windows Vista, Custom Dictionaries have been used to improve handwriting recognition results. This has worked by the Input Personalisation System (IPS) collecting user data, which a 'Text Trainer' 'tunes' and stores in 'lexicon blobs'.

The text trainer stores Application Lexicon Blobs, and User Lexicon blobs. Both blobs can be used by the Handwriting Recogniser, and both blobs are updated when new data is received by the IPS, thereby continually improving handwriting recognition accuracy.

For more information on this process, please read:
https://msdn.microsoft.com/en-us/library/bb265252.aspx


Representation of the relationship between Ink Applications and the IPS


The following files exist alongside WaitList.dat, within the same InputPersonalization directory:
%User%\AppData\Local\Microsoft\InputPersonalization\TextHarvester\TextHarvester.dat
%User%\AppData\Local\Microsoft\InputPersonalization\TrainedDataStore\en-AU\*


The 'DocID' (Offset 0x1C detailed in Data Structure below) appears to match entries within WaitList to values within TextHarvester.dat. This link will be investigated and detailed in a future blog post.

Whilst further research is definitely required, it is possible that the 'Microsoft Windows Search Indexer' collects and stores user data in WaitList, following which TextHarvester acts as the 'Text Trainer', tuning the user data into TrainedDataStores (User Lexicon Blobs) for use by the 'Handwriting Recogniser'.

If you know more about these files please contact me at b2dfir@REMOVE.gmail.com.

File Contents

The following data has been identified within WaitList.dat records.

Microsoft Outlook Email:
·        Date/Time
·        Email subject
·        Sent flag
·        Type (Email/Document/Contact)
·        Recipients (Does not distinguish between ‘To’, ‘CC’ and ‘BCC’)
Note: Does not store the ‘From’ value; however, this can often be identified in email signatures.
·        Meeting Location (only when email is a calendar invite)
·        Body of file
Contact:
·        Address
·        City
·        State
·        Country
·        Full Name
·        Title
·        Contact Details (email/phone/url)
Note: Contacts added from Skype/Lync may be recorded as a ‘sent’ email item, due to the way Outlook imports/stores the contact.
Documents (.pdf, .xlsx, .txt, .doc and .docx files have been tested):
·        Date/Time
·        DocumentID (use to compare document indexes over time) – format unknown
·        Body of file
·        Company
It is likely that other values are stored in additional data types, however this is the extent of data I have identified in my testing procedures.

Forensic Application

WaitList provides an additional source of evidence for email and document discovery. In addition to the existence and content of a document, WaitList will store multiple indexes for a single document over time. This provides a forensic examiner the ability to view historical iterations of a file, even when shadow copy is not enabled, or when the file has been deleted/wiped from the hard drive.
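One way to review those historical iterations is to group parsed records by DocID and order each group by timestamp. The sketch below assumes the records have already been parsed into dictionaries. The keys `doc_id`, `modified` and `body` are hypothetical names chosen for illustration and are not WLrip's report column names.

```python
from collections import defaultdict


def document_histories(records):
    """Group document records by DocID, oldest index first.

    Records with a DocID of zero are skipped, because the data structure
    analysis found that emails always carry an all-zero DocID.
    """
    histories = defaultdict(list)
    for rec in records:
        if rec["doc_id"] != 0:
            histories[rec["doc_id"]].append(rec)
    for versions in histories.values():
        versions.sort(key=lambda rec: rec["modified"])
    return dict(histories)
```

Any DocID that maps to more than one record is a candidate for comparing body text between index snapshots.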
The population of data within WaitList.dat is associated with the ‘Microsoft Windows Search Indexer’ process. This process locks the WaitList file on a live system. Existence of an index record within WaitList only indicates the existence of the file on the computer. User interaction with the file can only be inferred when the metadata stored within the record (e.g. ‘Sent flag’, ‘Recipient’) indicates a user action.
An email or document can be recorded in WaitList without being read or opened by the user.
Limitations of the Microsoft Windows Search Indexer apply to all records within WaitList. For example, files within an archive (.zip, .rar etc.) or encrypted documents cannot be indexed with default 'Microsoft Windows Search Indexer' settings, and therefore will not be stored within WaitList. Scanned (non-text searchable) PDFs may appear as records, however the body text will be empty.
For more information on the 'Windows Search Indexer' visit:
https://msdn.microsoft.com/en-us/library/ee805985(v=vs.85).aspx
The ‘Microsoft Windows Search Indexer’ will index emails and their attachments at a similar time. As a result, these files will occur in close proximity to each other when they are written to ‘WaitList.dat’. Whilst there does not appear to be a direct parent to child relationship value, the attachment files will contain ‘Recipient’ values matching those of their parent email. ‘Date/Time’ and ‘Recipient’ values can be used to associate emails with their likely attachments.
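The association described above can be sketched as a simple filter over parsed records. This is an illustration under several assumptions, not a confirmed method: the dictionary keys (`is_email`, `recipients`, `modified`) are hypothetical, attachments are assumed to be recorded with the 'Not Email' type, and the five minute window is an arbitrary starting point that should be tuned against known data.

```python
from datetime import timedelta


def likely_attachments(email, records, window=timedelta(minutes=5)):
    """Return non-email records that share the email's recipients and whose
    timestamp falls within `window` of the email's timestamp."""
    recipients = set(email["recipients"])
    matches = []
    for rec in records:
        # Skip the email itself and any other email records.
        if rec is email or rec["is_email"]:
            continue
        same_recipients = set(rec["recipients"]) == recipients
        close_in_time = abs(rec["modified"] - email["modified"]) <= window
        if same_recipients and close_in_time:
            matches.append(rec)
    return matches
```

A match here is only a lead. Because no parent to child value has been identified, each pairing still needs to be confirmed against the mailbox or other evidence.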

Parsing WaitList.dat

WLrip.py (WLrip) is a Python program I have written to parse the contents of WaitList.dat, based on my understanding of the data structure specified below. WLrip will extract the metadata and body text of each record to a new .txt file, and produce a metadata report in .csv format.
Running WLrip with the ‘-x’ option will produce a .xlsx report with hyperlinks to each .txt file created. This is the recommended method to run WLrip, however it requires the Python ‘XLSXWriter’ module (https://github.com/jmcnamara/XlsxWriter).
Recommended execution of WLrip.py is as follows:
Wlrip.py -c -x -f <filename> -o <output directory>
Arguments:
Argument | Description
-------- | -----------
-c | Removes various null characters, in an attempt to clean up the text output.
-x | Produces a .xlsx report, as well as the default .csv report.
-k | Kills the ‘Microsoft Windows Search Indexer’ process, which will lock the WaitList.dat file on a live system. Requires administrator privileges.
-f | Specifies the WaitList.dat file location for processing.
-o | Specifies an output directory. If not included, the report will be generated within a new folder in the current directory.

I have done my best to write this program in a way that allows it to capture new values (which I have not yet encountered) in the ‘other’ field. Values captured in the ‘other’ field will be appended with a [type], to indicate the field value stored in the data structure. Please send unknown values to me and I can implement them in future releases.

https://github.com/B2dfir/wlrip

I have also compiled WLrip into a portable Windows executable using pyinstaller.

https://github.com/B2dfir/wlripEXE

Data Structure Analysis

I have performed analysis on the data structure of WaitList in order to understand how text and metadata are stored within each record. All values are in little endian.
Data Structure (Hex)
Data Structure (Detail)
Offset | Hex | Decimal | Length | Field Name and Description
------ | --- | ------- | ------ | --------------------------
0x00 | 64 00 00 00 00 | 100 | 5 bytes | WaitList.dat File Signature
0x05 | 03 0B 00 00 | 2819 | 4 bytes | Index Record Length (bytes)
0x09 | 03 0B 00 00 | 2819 | 4 bytes | Index Record Length (bytes) - repeated
0x0D | 40 59 7B 44 58 F8 D1 01 | - | 8 bytes | Win 64-bit FILETIME: indexed file’s last modification time/date
0x15 | 0F 00 00 00 | 15 | 4 bytes | Record ID (incremental integer). Note: Not included in WLrip output.
0x19 | 00 | 0 | 1 byte | Sent Flag. 00 = sent email; 01 = everything else. Note: Local files and email attachments will not contain the MailItem.Sent property*, and will default to 00.
0x1A | 00 | 0 | 1 byte | Unknown. Always 00 in currently tested files. Possibly a part of 'Sent Flag' (if 'Sent Flag' is a 2 byte int). Note: Included in the report as Unkn for community examination.
0x1B | 01 | 1 | 1 byte | Type. 00 = Not Email; 01 = Email
0x1C | 00 00 00 00 00 00 00 00 | 0 | 8 bytes | DocID (format unknown). Filter on this value to view multiple indexes of a document over time. Known information: value for emails is always 0s; all documents contain a value here; not a timestamp I could identify; is not similar between documents with similar timestamps; is the same for duplicate documents; is the same for multiple index records pertaining to the same document (e.g. when more text has been saved to a report).
0x24 | 00 | 0 | 1 byte | More Metadata Flag. 00 = More metadata stored in this record (e.g. Blue data / Orange data structures); 01 = No more metadata stored in this record prior to the body text. Note: Not included in WLrip output.
0x25 | 07 00 00 00 | 7 | 4 bytes | Index Record Metadata Type Flag. 04 00 00 00 = Recipient Email Address; 06 00 00 00 = Subject; 07 00 00 00 = Recipient Name; 10 00 00 00 = Full Name (contact); 11 00 00 00 = Title (contact); 12 00 00 00 = Last Name; 21 00 00 00 = State; 0B 00 00 00 = Address (contact); 0C 00 00 00 = City (contact); 0D 00 00 00 = Country (contact); 0E 00 00 00 = Contact details (contact); 0F 00 00 00 = First Name (contact); 13 00 00 00 = Middle Name (contact); 1B 00 00 00 = Location (meetings)
0x29 | 00 00 00 00 | 0 | 4 bytes | Grammar Proofing Type. Note: See the following registry key for available proofing types: HKEY_CURRENT_USER\SOFTWARE\Microsoft\Shared Tools\Proofing Tools\Grammar\MSGrammar. Note: Not included in WLrip output.
0x2D | 0C 00 00 00 | 12 | 4 bytes | Metadata Length (characters). Multiply integer value by 2 for byte length.
0x31 | - | - | 24 bytes | Metadata Text (Recipient name in this example)
0x49 | 00 | 0 | 1 byte | Another Metadata Value Flag. 00 = Get another value; 01 = No more values
0x4A | 04 00 00 00 | 4 | 4 bytes | Same as blue offset 0x25
0x4E | 00 00 00 00 | 0 | 4 bytes | Same as blue offset 0x29
0x52 | 13 00 00 00 | 19 | 4 bytes | Same as blue offset 0x2D
0x56 | - | - | 38 bytes | Metadata Text (Recipient email address in this example)
0x7C | 01 | 1 | 1 byte | Same as blue offset 0x49
0x7D | 00 00 00 00 | 0 | 4 bytes | Current Body Text Offset. Increases with each length of indexed body text.
0x81 | 05 00 00 00 | 5 | 4 bytes | Body Type Flag. 05 00 00 00 = Email Body; 17 00 00 00 = Contact; 1D 00 00 00 = Document Body
0x85 | 09 0C 00 00 | 3081 | 4 bytes | Grammar Proofing Type. Refer to ‘HKCU\SOFTWARE\Microsoft\Shared Tools\Proofing Tools\Grammar’ for types available on your system. Note: Not included in WLrip output.
0x89 | 81 02 00 00 | 641 | 4 bytes | Length of First Section of Body. Multiply integer value by 2 for byte length.
0x8D | - | - | 1282 bytes | First Length of Body Text
0x58F | 01 | 1 | 1 byte | More Body Flag. 01 = there is another section of body; 00 = body is complete
0x590 | 80 02 00 00 | 640 | 4 bytes | Same as red offset 0x7D
0x594 | 05 00 00 00 | 5 | 4 bytes | Same as red offset 0x81
0x598 | 09 04 00 00 | 1033 | 4 bytes | Same as red offset 0x85
0x59C | 1C 00 00 00 | 28 | 4 bytes | Same as red offset 0x89
0x5A0 | - | - | 56 bytes | Second Length of Body Text
0xAC9 | 00 | 0 | 1 byte | Same as red offset 0x58F
0xACA | 06 00 00 00 | 6 | 4 bytes | Same as blue offset 0x25
0xACE | 09 04 00 00 | 1033 | 4 bytes | Same as blue offset 0x29
0xAD2 | 1B 00 00 00 | 27 | 4 bytes | Same as blue offset 0x2D
0xAD6 | - | - | 54 bytes | Metadata Text (Email subject in this example)
0xB0C | 00 | 0 | 1 byte | Same as blue offset 0x49. Doesn’t necessarily terminate on a 1 at the end of the record. WLrip has record length checks to mitigate parsing errors.
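To make the layout above more concrete, the fixed-length fields at the start of a record and the metadata entries that follow can be unpacked with Python's `struct` module. This is a minimal sketch based only on the offsets in the table, not an excerpt from WLrip. The function name, dictionary keys and the abbreviated `METADATA_TYPES` map are my own illustrative choices. It also assumes the metadata text is UTF-16LE (two bytes per character, consistent with the 'multiply by 2' notes) and that the first record begins directly after the 5 byte signature.

```python
import struct

SIGNATURE = b"\x64\x00\x00\x00\x00"

# Fields from offset 0x05 to 0x24: two uint32 record lengths, uint64 FILETIME,
# uint32 record ID, three 1-byte flags, uint64 DocID, 1-byte more-metadata flag.
HEADER = struct.Struct("<IIQIBBBQB")

# Abbreviated map of the metadata type flags listed in the table.
METADATA_TYPES = {0x04: "Recipient Email Address", 0x06: "Subject",
                  0x07: "Recipient Name", 0x1B: "Location (meetings)"}


def parse_record_start(data, offset=5):
    """Parse the fixed fields and leading metadata entries of one record.

    Returns (record, position), where position is the offset of the first
    byte after the last metadata flag, i.e. where the body section starts.
    """
    (length, _length_repeat, filetime, record_id, sent_flag, unknown,
     type_flag, doc_id, more_flag) = HEADER.unpack_from(data, offset)
    pos = offset + HEADER.size
    metadata = []
    # A flag byte of 0x00 means another metadata entry follows.
    while more_flag == 0 and pos + 12 <= len(data):
        meta_type, _proofing, n_chars = struct.unpack_from("<III", data, pos)
        pos += 12
        raw = data[pos:pos + n_chars * 2]  # length is stored in characters
        pos += n_chars * 2
        text = raw.decode("utf-16-le", errors="replace")
        metadata.append((METADATA_TYPES.get(meta_type, hex(meta_type)), text))
        if pos >= len(data):
            break  # truncated record: stop rather than read past the end
        more_flag = data[pos]
        pos += 1
    record = {"length": length, "filetime": filetime, "record_id": record_id,
              "sent_flag": sent_flag, "unknown": unknown, "is_email": type_flag == 1,
              "doc_id": doc_id, "metadata": metadata}
    return record, pos
```

The sketch stops where the body sections begin and makes no attempt to handle metadata that appears after the body (such as the subject in the example). The bounds checks reflect the note that a record does not necessarily terminate on a 1.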

* For more information on MailItem objects, see:
https://msdn.microsoft.com/en-us/library/office/ff861332.aspx

Conclusion

WaitList is an additional source of email, contact and document evidence to add to our arsenal of forensic examination and e-discovery tools. Should you have any questions, recommendations or corrections for any of the detail in this blog, please post in the comments section below.
---------------------------------------------------------------------------------------------------------------------
Disclaimer: The information detailed within this report is based on my limited testing and analysis of the ‘WaitList.dat’ file. I do not currently claim to have a complete understanding of the structure or function of this file. Confirmation and testing of my findings by the broader forensic community is required before this information should be relied upon.

Comments

  1. Thanks for putting this out there. Very interesting and helpful.

    Replies
    1. No worries at all. I am glad you found it useful!

  2. Well done, this yielded some horrific results on my desktop I use to work from home, NB: Not a touch device. Something that bothered me was some of the data seems as though it could have only have come from a "clipboard" type vector whilst I was using a remote desktop / remote workspace connection. I CANNOT Confirm that, it is only speculation!!! I am investigating further at the moment...
