Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Windows: Adds scheduled tasks plugin #1307

Open
wants to merge 1 commit into
base: develop
Choose a base branch
from

Conversation

dgmcdona
Copy link
Contributor

@dgmcdona dgmcdona commented Oct 9, 2024

This adds a plugin 'ScheduledTasks' that can decode binary-encoded scheduled tasks from the Windows registry's SOFTWARE hive using a custom reader that extends the io.BytesIO class. Decoding operations are intended to be as fault tolerant as possible, swallowing exceptions and returning None to account for smear or missing data.

Because each task can have multiple triggers and multiple actions, a single entry is generated for each trigger + action pair. In the event that the either the actions or triggers could not be parsed due to missing or smeared data, an entry will still be generated using the available information from the other registry value, since trigger and action data is stored separately.

Much more information is decoded than is rendered, this was done intentionally to avoid overpopulating the TreeGrid with less pertinent data and to avoid an explosion of trigger/action-specific fields that may not apply to most other entries, while ensuring that the reader is at the correct offset.

Also documents the raised exceptions in the RegistryHive.get_key() docstring.

Example task:

{
  "Action": "C:\\Program Files\\Windows Defender\\MpCmdRun.exe",
  "Action Arguments": "Scan -ScheduleJob -ScanTrigger 55 -IdleScheduledJob",
  "Action Context": "LocalSystem",
  "Action Type": "Exe",
  "Creation Time": null,
  "Display Name": null,
  "Enabled": true,
  "Key Name": "{07846B99-B7DC-4599-AEB1-D421B479570F}",
  "Last Run Time": "2023-10-31T22:13:45+00:00",
  "Last Successful Run Time": null,
  "Principal ID": "LocalSystem",
  "Task Name": "Windows Defender Scheduled Scan",
  "Trigger Description": "Run at 2000-01-01T02:36:41+00:00 and repeat every 1 days",
  "Trigger Type": "Time",
  "Working Directory": null,
  "__children": []
}

@dgmcdona dgmcdona force-pushed the dgmcdona/windows-scheduled-tasks branch 3 times, most recently from 0ef2825 to 0cf298a Compare October 9, 2024 22:07
@dgmcdona dgmcdona force-pushed the dgmcdona/windows-scheduled-tasks branch 4 times, most recently from d8804db to 69e824c Compare October 9, 2024 22:17
This adds a plugin 'ScheduledTasks' that can decode binary-encoded
scheduled tasks from the Windows registry's SOFTWARE hive using a custom
reader that extends the `io.BytesIO` class. Decoding operations are
intended to be as fault tolerant as possible, swallowing exceptions and
returning `None` to account for smear or missing data.

Because each task can have mulitple triggers and multiple actions, a
single entry is generated for each trigger + action pair. In the event
that the either the actions could not be parsed or the triggers could
not be parsed due to missing or smeared data, an entry will still be
generated using the available information from the other registry value,
since trigger and action data is stored separately.

Much more information is decoded than is rendered, this was done
intentionally to avoid overpopulating the TreeGrid with less pertinent
data and to avoid an explosion of trigger and action-specific fields that
may not apply to most other entries.
Copy link
Member

@ikelos ikelos left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Only half reviewed so far, this is a big one, but there's a few comments for you to be getting on with and I'll take another crack at it tomorrow hopefully...

return is_localized, filetime

def read_filetime(self) -> Optional[datetime.datetime]:
return datetime.datetime.now()
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This... seems like the wrong thing? Was this intentional or just a placeholder?

format(v, "x")
for v in struct.unpack(
">HQ",
buf[8:10] + b"\x00\x00" + buf[10:16],
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Wow... just... wow. Microsoft do have some interesting people working for them don't they... 5:P

)


class TestActionsDecoding(unittest.TestCase):
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These should be in a separate file. I'm all for test cases, but not that add huge chunks of data to the end of an already long plugin. I believe we do have a test directory. I'd be much happier if all the tests lived in there. Ideally we'd have a whole pytest setup that unit-tested each of the individual plugins and components of the framework, but I'm not going to put that on you now (and testing what is a giant parsing framework against all possible test cases isn't really feasible). However, if you could move this code somewhere appropriate with that goal in mind, I'd be very grateful...

mode_index = reader.read_u4()
try:
if mode_index is not None:
mode = list(TimeMode)[mode_index]
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This feels like we're using the enum the wrong way? It should contain the value, and then a lookup dictionary should contain the text we want to display? Converting it to a list makes me worry that the order won't stay consisent, whereas is the enum were:

TimeMode.Once = 1
TimeMode.Dailed = 2
...

Then it should be possible to just go TimeMode(mode_index) and then later look that up in a dictionary if we want the text for it?

data2 = reader.read_u2()
data3 = reader.read_u2()

reader.seek(2, io.SEEK_CUR) # pad
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do you need the seek_relative method, if you're always explicitly doing traditional seeks?

filetime = self.decode_filetime()
if filetime is None:
return None
return is_localized, filetime
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we know whether it's localized, could we not just return a python datetime object that includes the localization, rather than a filetime? The user might want the value changed to a timezone of their choosing, so keeping everything in one format we understand seems better than a separate flag (which appears to be ignored later on when this is called)?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants