Draft: implemented text parser plugin for Apple ps.txt files. #4861

Open
wants to merge 4 commits into
base: main

Conversation

rick-slin
Contributor

@rick-slin rick-slin commented Apr 4, 2024

Description:

DRAFT: Implemented a text parser plugin for Apple ps.txt files found in sysdiagnose dumps. It uses the DatelessLogHelper.

Related issue (if applicable): fixes #4697

Notes:

All contributions to Plaso undergo code review.
This makes sure that the code has appropriate test coverage and conforms to the
Plaso style guide.

One of the maintainers will examine your code, and may request changes. Check off the items below in
order, and then a maintainer will review your code.

Checklist:

  • Automated checks (GitHub Actions, AppVeyor) pass
  • No new dependencies are required or l2tdevtools has been updated
  • Reviewer assigned

self.CheckEventData(event_data, expected_event_values)

expected_event_values = {
'command': '/System/Library/PrivateFrameworks/'
Member

style nit: use (...) to split the string across lines
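A minimal sketch of what this nit asks for, relying on Python's implicit concatenation of adjacent string literals inside parentheses. The second path segment below is a hypothetical placeholder, since the real value is cut off in the diff context.

```python
# Adjacent string literals inside (...) are joined at compile time, so a long
# value can span several lines without '+' or backslash continuations.
# NOTE: 'Example.framework/example_daemon' is a made-up placeholder, not the
# value used in the actual test.
expected_event_values = {
    'command': (
        '/System/Library/PrivateFrameworks/'
        'Example.framework/example_daemon')}
```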

' TIME COMMAND') + pyparsing.LineEnd())

_COMMAND = pyparsing.OneOrMore(
pyparsing.Word(pyparsing.printables),
Member

style nit: use 4-space continuation indentation

ParseError: if the structure cannot be parsed.
"""
# Retrieve the data from the file's metadata
self._SetEstimatedDate(parser_mediator)
Member

@joachimmetz joachimmetz Apr 6, 2024

Move this into CheckRequiredFormat, e.g. https://github.com/log2timeline/plaso/blob/main/plaso/parsers/text_plugins/syslog.py#L761, so this check only happens once.
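A rough, self-contained sketch of the suggested restructuring. Every name here is a simplified stand-in rather than Plaso's real classes: the parser mediator is modelled as a plain dict, the text reader as a list of lines, and the header and process rows are invented for illustration. The point is only that the date estimate runs once during format detection instead of once per parsed record.

```python
class PsTextPluginSketch:
  """Hypothetical stand-in for a text parser plugin; not Plaso code."""

  def __init__(self):
    self._date = None
    self.date_lookups = 0  # Counts how often the date is estimated.

  def _SetEstimatedDate(self, parser_mediator):
    # Assumption: the mediator exposes an estimated (year, month, day).
    self.date_lookups += 1
    self._date = parser_mediator['estimated_date']

  def CheckRequiredFormat(self, parser_mediator, text_reader):
    # Reject input whose first line does not look like a ps header.
    if not text_reader or not text_reader[0].rstrip().endswith('COMMAND'):
      return False
    # Estimate the date here, so it happens once per file.
    self._SetEstimatedDate(parser_mediator)
    return True

  def _ParseRecord(self, parser_mediator, line):
    # Reuses the date cached by CheckRequiredFormat; no repeated lookup.
    year, month, day = self._date
    return year, month, day, line.strip()


mediator = {'estimated_date': (2022, 4, 13)}
lines = ['USER PID TIME COMMAND', 'root 1 0:01 launchd', 'root 2 0:02 logd']

plugin = PsTextPluginSketch()
records = []
if plugin.CheckRequiredFormat(mediator, lines):
  records = [plugin._ParseRecord(mediator, line) for line in lines[1:]]
```

With this layout the lookup counter stays at one no matter how many records are parsed.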

hours = 0

# Retrieve year, month, day from dateless helper
year = self._date[0]
Member

@joachimmetz joachimmetz Apr 6, 2024

Based on the format, I assume the date here should be the most recent date. However, the date-less log helper will do the opposite: it will return the earliest date as its base. The idea of the date-less log helper is to store successive dates as relative values, so that they can be changed after extraction if needed.

Contributor Author

I disagree on that point. Since the file is created once by the ps command and should not be modified, the earliest timestamp should be the correct one. If the file is moved or accessed, then the most recent date would not be correct. For instance, the test file on my system has an earliest date of April 13, 2022 (modified), whereas the latest date is April 12, 2024 (accessed today).
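To make the two positions concrete, here is a small sketch that picks a base date from a file's available timestamps. The function name and its `use_earliest` flag are hypothetical and not part of Plaso's date-less log helper. The sample values are the two dates quoted above.

```python
import datetime


def SelectBaseDate(timestamps, use_earliest=True):
  """Returns (year, month, day) of the earliest or latest known timestamp.

  Hypothetical helper for illustration only; timestamps that are unknown
  are passed as None and ignored.
  """
  candidates = [timestamp for timestamp in timestamps if timestamp is not None]
  if not candidates:
    return None
  chosen = min(candidates) if use_earliest else max(candidates)
  return chosen.year, chosen.month, chosen.day


modified = datetime.date(2022, 4, 13)
accessed = datetime.date(2024, 4, 12)

earliest_base = SelectBaseDate([modified, accessed, None])
latest_base = SelectBaseDate([modified, accessed, None], use_earliest=False)
```

Here `earliest_base` corresponds to the modification date and `latest_base` to the access date, which is the difference under discussion.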


except (TypeError, ValueError) as exception:
raise errors.ParseError(
f'Unable to parse time elements with error: {exception}')
Member

style nit: to keep this consistent with the rest of the code base, please use {exception!s}
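A short sketch of the requested form. `ParseError` below is a local stand-in for Plaso's `errors.ParseError`, and `ParseTimeElements` is a hypothetical function. The `!s` conversion makes the `str()` call on the exception explicit inside the f-string.

```python
class ParseError(Exception):
  """Local stand-in for plaso.lib.errors.ParseError."""


def ParseTimeElements(hours, minutes):
  """Hypothetical helper that converts string time elements to integers."""
  try:
    return int(hours, 10), int(minutes, 10)
  except (TypeError, ValueError) as exception:
    # {exception!s} formats the exception via str(), matching the style
    # requested in the review.
    raise ParseError(
        f'Unable to parse time elements with error: {exception!s}')


try:
  ParseTimeElements('ab', '10')
  error_message = None
except ParseError as exception:
  error_message = str(exception)
```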

@joachimmetz
Member

@rick-slin I will give this format some more thought. It does not fully match the use case of the date-less log helper, unlike syslog, where the log entries are (mostly) chronological.

Some hints about the format (not the same format), based on the Linux ps man page:

       bsdstart    START     time the command started.  If the process was started less than 24 hours ago, the output format is " HH:MM", else it is " Mmm:SS" (where Mmm is the three letters of the month).  See also lstart, start, start_time, and stime.

       bsdtime     TIME      accumulated cpu time, user + system.  The display format is usually "MMM:SS", but can be shifted to the right if the process used more than 999 minutes of cpu time.

It would be good to have an example of a process that has more than 999 minutes of CPU time.
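As a sketch of the edge case in question, the following parses a TIME value in the "MMM:SS" form described by the Linux man page excerpt, allowing the minutes field to grow past three digits. This assumes the Linux format only; the Apple ps.txt TIME column may differ, and the function name is hypothetical.

```python
import re

# Minutes may be any number of digits, so values above 999 minutes that
# "shift to the right" are still accepted.
_BSDTIME_RE = re.compile(r'^\s*(?P<minutes>\d+):(?P<seconds>\d{2})\s*$')


def ParseBsdTime(value):
  """Returns the accumulated CPU time in seconds for an "MMM:SS" string."""
  match = _BSDTIME_RE.match(value)
  if not match:
    raise ValueError(f'Unsupported TIME value: {value!r}')

  seconds = int(match.group('seconds'), 10)
  if seconds > 59:
    raise ValueError(f'Seconds out of range in TIME value: {value!r}')

  return int(match.group('minutes'), 10) * 60 + seconds
```

For example, `ParseBsdTime('1000:05')` treats the four-digit minutes field the same way as a shorter one.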

@rick-slin
Contributor Author

I don't understand the significance of the entries not being chronological. I can see the usefulness of moving the handling of the three cases from the plugin to the helper.

I can try to set up an experiment for a long-lived process, but I don't see how that field would impact the start time column, as they appear to be independent.

@joachimmetz
Member

> I can try to set up an experiment for a long-lived process, but I don't see how that field would impact the start time column, as they appear to be independent.

This would be more to see whether there is an edge case in the format of the TIME value.

> I don't understand the significance of the entries not being chronological.

This is related to the inner workings of the date-less log helper.

…dled a compressed text file, fixed the plugin name in the presets.