feat: add support for JFrog Artifactory and witness provenances produced on GitLab CI #349

Merged
merged 42 commits into from
Aug 22, 2023
42 commits
9475994
feat: add JFrog Maven package registry
nathanwn Jul 27, 2023
61c90aa
chore: add package registry entry to the analyze context of a softwar…
nathanwn Jul 27, 2023
7530c2d
chore: take git service into account when detect ci services
nathanwn Jul 27, 2023
248a3c4
chore: improve docstrings of GitHub Actions CI service
nathanwn Jul 27, 2023
63a6a27
chore: move provenance download to provenance_available_check and imp…
nathanwn Jul 27, 2023
cc528da
chore: add expectation verification for provenances downloaded from p…
nathanwn Jul 27, 2023
3c411d1
chore: add check for witness provenances
nathanwn Jul 27, 2023
4b4ef52
chore: update integration test expected output files
nathanwn Jul 27, 2023
fc4f0c8
chore: add max valid provenance size check before downloading
nathanwn Aug 1, 2023
ed304b5
chore: fix docstring for JFrogMavenAsset
nathanwn Aug 1, 2023
3c6bd35
chore: fix docstring for the JFrogMavenAsset url property
nathanwn Aug 1, 2023
e00dbdf
chore: remove redundant timeout attribute in package_registry.jfrog.m…
nathanwn Aug 1, 2023
6db186b
chore: fix type annotations of provenance payload
nathanwn Aug 1, 2023
5709a7f
chore: add provenances discovered from package registries to the html…
nathanwn Aug 1, 2023
b6275cd
chore: fix docstring of the JFrogMavenRegistry::construct_maven_repos…
nathanwn Aug 6, 2023
8a9f82b
chore: re-implement the logic to get group ids of a Gradle repo based…
nathanwn Aug 7, 2023
02f055b
chore: clarify some docstrings in Gradle class
nathanwn Aug 7, 2023
78a4c3b
chore: rename the Asset Protocol to IsAsset
nathanwn Aug 7, 2023
3d57c7b
chore: add download method to the IsAsset Protocol
nathanwn Aug 7, 2023
4de00af
chore: add missing docstring for raised exception in the find_provena…
nathanwn Aug 7, 2023
74c860c
chore: adjust log messages in case a provenance exceeds max valid fil…
nathanwn Aug 7, 2023
d2dda6a
chore: move witness-related logic to a separate module
nathanwn Aug 7, 2023
93beee3
chore: adjust how provenances are stored in the PackageRegistryData c…
nathanwn Aug 7, 2023
efaef7f
chore: reimplement logic to determine which provenances are produced …
nathanwn Aug 7, 2023
4e6bd58
chore: rename the PackageRegistryData class to PackageRegistryInfo
nathanwn Aug 7, 2023
6fbea90
chore: add docstring for attributes of the JFrogMavenRegistry class
nathanwn Aug 7, 2023
558ccbc
chore: rename the package_registry_info module to package_registry_spec
nathanwn Aug 7, 2023
2f79de9
chore: rename asset and provenance interfaces and remove type cast on…
nathanwn Aug 10, 2023
5511ed3
chore: fix typo
nathanwn Aug 15, 2023
7b292b1
chore: adjust docstring for the asset module
nathanwn Aug 15, 2023
a0fbeaa
chore: fix typo
nathanwn Aug 15, 2023
e519b9a
chore: refactor provenance loading & validation, and the extract_repo…
nathanwn Aug 20, 2023
d234c27
chore: fix the result of witness_provenance_l1_check in case no witne…
nathanwn Aug 20, 2023
07ad8f5
chore: rename 'domain' to 'hostname' in witness ini config
nathanwn Aug 20, 2023
7ceb8b6
chore: improve docstrings for in-toto payload
nathanwn Aug 22, 2023
72854d6
chore: add note for frozen dataclass
nathanwn Aug 22, 2023
d2b1f22
chore: add TODO comment about potentially using the in-toto-attestati…
nathanwn Aug 22, 2023
3a6ea89
chore: adjust docstring of provenance module
nathanwn Aug 22, 2023
d3e3ace
chore: rename variable to use trailing underscore
nathanwn Aug 22, 2023
65a8347
chore: adjust comment on the check status of witness_provenance_l1_check
nathanwn Aug 22, 2023
1a6d6ca
chore: adjust docstring of the `validate_intoto_statement` function
nathanwn Aug 22, 2023
842f756
chore: bug fix for witness check
nathanwn Aug 22, 2023
3 changes: 3 additions & 0 deletions src/macaron/__main__.py
Original file line number Diff line number Diff line change
Expand Up @@ -22,6 +22,7 @@
from macaron.policy_engine.policy_engine import run_policy_engine, show_prelude
from macaron.slsa_analyzer.analyzer import Analyzer
from macaron.slsa_analyzer.git_service import GIT_SERVICES
from macaron.slsa_analyzer.package_registry import PACKAGE_REGISTRIES

logger: logging.Logger = logging.getLogger(__name__)

Expand Down Expand Up @@ -142,6 +143,8 @@ def perform_action(action_args: argparse.Namespace) -> None:
try:
for git_service in GIT_SERVICES:
git_service.load_defaults()
for package_registry in PACKAGE_REGISTRIES:
package_registry.load_defaults()
except ConfigurationError as error:
logger.error(error)
sys.exit(os.EX_USAGE)
17 changes: 17 additions & 0 deletions src/macaron/config/defaults.ini
Expand Up @@ -337,3 +337,20 @@ provenance_extensions =
max_download_size = 70000000
# This is the timeout (in seconds) to run the SLSA verifier.
timeout = 120

# Witness provenance. See: https://github.com/testifysec/witness.
[provenance.witness]
# The allowed values of the `predicateType` field in the provenance (data type: list).
# For more details, see:
# https://github.com/in-toto/attestation/tree/main/spec/v0.1.0#statement
predicate_types =
https://witness.testifysec.com/attestation-collection/v0.1
artifact_extensions =
jar

# Package registries.
# [package_registry.jfrog.maven]
# In this example, the Maven repo can be accessed at `https://internal.registry.org/repo-name`.
# hostname = internal.registry.org
# repo = repo-name
# download_timeout = 120
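A minimal sketch of how ini sections like these could be read with the standard-library `configparser`. The `read_list` helper and the uncommented `[package_registry.jfrog.maven]` section are assumptions made for illustration; Macaron's own `defaults` object may load and validate these values differently.

```python
import configparser

# Sample config mirroring the sections above. The JFrog section is
# uncommented here only so the example has something to read.
SAMPLE_INI = """
[provenance.witness]
predicate_types =
    https://witness.testifysec.com/attestation-collection/v0.1
artifact_extensions =
    jar

[package_registry.jfrog.maven]
hostname = internal.registry.org
repo = repo-name
download_timeout = 120
"""


def read_list(parser: configparser.ConfigParser, section: str, option: str) -> list:
    """Split a multi-line ini value into a list of non-empty items (hypothetical helper)."""
    raw = parser.get(section, option, fallback="")
    return [item.strip() for item in raw.splitlines() if item.strip()]


parser = configparser.ConfigParser()
parser.read_string(SAMPLE_INI)

predicate_types = read_list(parser, "provenance.witness", "predicate_types")
artifact_extensions = read_list(parser, "provenance.witness", "artifact_extensions")

jfrog = parser["package_registry.jfrog.maven"]
# Per the comment in the config, the repo is reachable at https://<hostname>/<repo>.
repo_url = f"https://{jfrog['hostname']}/{jfrog['repo']}"
download_timeout = jfrog.getint("download_timeout", fallback=120)
```

Multi-line values rely on indented continuation lines, which is why each list item sits on its own indented line in the config.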
13 changes: 12 additions & 1 deletion src/macaron/slsa_analyzer/analyze_context.py
Expand Up @@ -19,6 +19,7 @@
from macaron.slsa_analyzer.slsa_req import ReqName, SLSAReq, get_requirements_dict
from macaron.slsa_analyzer.specs.build_spec import BuildSpec
from macaron.slsa_analyzer.specs.ci_spec import CIInfo
from macaron.slsa_analyzer.specs.package_registry_spec import PackageRegistryInfo

logger: logging.Logger = logging.getLogger(__name__)

Expand All @@ -38,6 +39,8 @@ class ChecksOutputs(TypedDict):
# class uses inlined functions, which is not supported by Protocol.
expectation: Expectation | None
"""The expectation to verify the provenance for this repository."""
package_registries: list[PackageRegistryInfo]
"""The package registries for this repository."""


class AnalyzeContext:
Expand Down Expand Up @@ -82,6 +85,7 @@ def __init__(
git_service=NoneGitService(),
build_spec=BuildSpec(tools=[]),
ci_services=[],
package_registries=[],
is_inferred_prov=True,
expectation=None,
)
Expand All @@ -93,12 +97,19 @@ def provenances(self) -> dict:
Returns
-------
dict
A dictionary in which each key is a CI service's name and each value is
the corresponding provenance payload.
"""
try:
ci_services = self.dynamic_data["ci_services"]
result = {}
for ci_info in ci_services:
- result[ci_info["service"].name] = ci_info["provenances"]
+ result[ci_info["service"].name] = [payload.statement for payload in ci_info["provenances"]]
package_registry_entries = self.dynamic_data["package_registries"]
for package_registry_entry in package_registry_entries:
result[package_registry_entry.package_registry.name] = [
provenance.payload.statement for provenance in package_registry_entry.provenances
]
return result
except KeyError:
return {}
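The merging logic of the `provenances` property can be sketched in isolation as follows. The `Payload`, `RegistryProvenance`, `Named`, and `RegistryEntry` classes are simplified stand-ins invented for this example, not Macaron's real types; only the attribute names (`statement`, `payload`, `package_registry`, `provenances`, `name`) come from the diff.

```python
from dataclasses import dataclass, field


@dataclass
class Payload:
    """Stand-in for a provenance payload wrapping an in-toto statement."""
    statement: dict


@dataclass
class RegistryProvenance:
    """Stand-in for a provenance found on a package registry."""
    payload: Payload


@dataclass
class Named:
    """Stand-in for any service or registry that exposes a name."""
    name: str


@dataclass
class RegistryEntry:
    """Stand-in for an entry pairing a registry with its provenances."""
    package_registry: Named
    provenances: list = field(default_factory=list)


def collect_provenances(dynamic_data: dict) -> dict:
    """Map each CI service or package registry name to its list of statements."""
    try:
        result = {}
        for ci_info in dynamic_data["ci_services"]:
            result[ci_info["service"].name] = [p.statement for p in ci_info["provenances"]]
        for entry in dynamic_data["package_registries"]:
            result[entry.package_registry.name] = [p.payload.statement for p in entry.provenances]
        return result
    except KeyError:
        # A missing key means the context was not fully populated.
        return {}


data = {
    "ci_services": [
        {"service": Named("ci"), "provenances": [Payload({"source": "ci"})]},
    ],
    "package_registries": [
        RegistryEntry(Named("registry"), [RegistryProvenance(Payload({"source": "registry"}))]),
    ],
}
merged = collect_provenances(data)
```

Note that a registry sharing a name with a CI service would overwrite that service's entry, since both write into the same dictionary.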
23 changes: 21 additions & 2 deletions src/macaron/slsa_analyzer/analyzer.py
Expand Up @@ -45,10 +45,13 @@
from macaron.slsa_analyzer.database_store import store_analyze_context_to_db
from macaron.slsa_analyzer.git_service import GIT_SERVICES, BaseGitService
from macaron.slsa_analyzer.git_service.base_git_service import NoneGitService
from macaron.slsa_analyzer.package_registry import PACKAGE_REGISTRIES
from macaron.slsa_analyzer.provenance.expectations.expectation_registry import ExpectationRegistry
from macaron.slsa_analyzer.provenance.intoto import InTotoV01Payload
from macaron.slsa_analyzer.registry import registry
from macaron.slsa_analyzer.specs.ci_spec import CIInfo
from macaron.slsa_analyzer.specs.inferred_provenance import Provenance
from macaron.slsa_analyzer.specs.package_registry_spec import PackageRegistryInfo

logger: logging.Logger = logging.getLogger(__name__)

Expand Down Expand Up @@ -808,7 +811,10 @@ def perform_checks(self, analyze_ctx: AnalyzeContext) -> dict[str, CheckResult]:
ci_service.load_defaults()
ci_service.set_api_client()

- if ci_service.is_detected(analyze_ctx.component.repository.fs_path):
+ if ci_service.is_detected(
+     repo_path=analyze_ctx.component.repository.fs_path,
+     git_service=analyze_ctx.dynamic_data["git_service"],
+ ):
logger.info("The repo uses %s CI service.", ci_service.name)

# Parse configuration files and generate IRs.
Expand All @@ -825,7 +831,20 @@ def perform_checks(self, analyze_ctx: AnalyzeContext) -> dict[str, CheckResult]:
callgraph=callgraph,
provenance_assets=[],
latest_release={},
- provenances=[Provenance().payload],
+ provenances=[InTotoV01Payload(statement=Provenance().payload)],
)
)

# Determine the package registries.
# We match the repo against package registries through build tools.
build_tools = analyze_ctx.dynamic_data["build_spec"]["tools"]
for package_registry in PACKAGE_REGISTRIES:
for build_tool in build_tools:
if package_registry.is_detected(build_tool):
analyze_ctx.dynamic_data["package_registries"].append(
PackageRegistryInfo(
build_tool=build_tool,
package_registry=package_registry,
)
)

40 changes: 40 additions & 0 deletions src/macaron/slsa_analyzer/asset/__init__.py
@@ -0,0 +1,40 @@
# Copyright (c) 2023 - 2023, Oracle and/or its affiliates. All rights reserved.
# Licensed under the Universal Permissive License v 1.0 as shown at https://oss.oracle.com/licenses/upl/.

"""This module defines classes and interfaces related to assets.

Assets are files published from some build.
"""

from typing import Protocol


class AssetLocator(Protocol):
"""Interface of an asset locator."""

@property
def name(self) -> str:
"""Get the name (file name) of the asset."""

@property
def url(self) -> str:
"""Get the url to the asset."""

@property
def size_in_bytes(self) -> int:
"""Get the size of the asset in bytes."""

def download(self, dest: str) -> bool:
"""Download the asset.

Parameters
----------
dest : str
The local destination where the asset is downloaded to.
Note that this must include the file name.

Returns
-------
bool
``True`` if the asset is downloaded successfully; ``False`` if not.
"""
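Because `AssetLocator` is a `typing.Protocol`, any class with matching members satisfies it without inheriting from it. The sketch below shows a hypothetical `LocalFileAsset` and a hypothetical `download_if_small_enough` helper that checks the size before downloading, in the spirit of the "max valid provenance size check" commit; neither name appears in the PR.

```python
import os
import shutil
import tempfile
from typing import Protocol


class AssetLocator(Protocol):
    """Interface of an asset locator, restated from the module above."""

    @property
    def name(self) -> str: ...

    @property
    def url(self) -> str: ...

    @property
    def size_in_bytes(self) -> int: ...

    def download(self, dest: str) -> bool: ...


class LocalFileAsset:
    """Hypothetical asset backed by a local file; satisfies AssetLocator structurally."""

    def __init__(self, path: str) -> None:
        self._path = path

    @property
    def name(self) -> str:
        return os.path.basename(self._path)

    @property
    def url(self) -> str:
        return "file://" + self._path

    @property
    def size_in_bytes(self) -> int:
        return os.path.getsize(self._path)

    def download(self, dest: str) -> bool:
        # dest must include the file name, as the interface requires.
        try:
            shutil.copyfile(self._path, dest)
        except OSError:
            return False
        return True


def download_if_small_enough(asset: AssetLocator, dest: str, max_size: int) -> bool:
    """Skip the download when the asset is larger than max_size bytes."""
    if asset.size_in_bytes > max_size:
        return False
    return asset.download(dest)


with tempfile.TemporaryDirectory() as tmp_dir:
    source = os.path.join(tmp_dir, "provenance.json")
    with open(source, "w", encoding="utf-8") as handle:
        handle.write("{}")
    asset = LocalFileAsset(source)
    asset_name = asset.name
    accepted = download_if_small_enough(asset, os.path.join(tmp_dir, "copy.json"), max_size=70_000_000)
    rejected = download_if_small_enough(asset, os.path.join(tmp_dir, "other.json"), max_size=1)
```

Returning `bool` instead of raising keeps the caller's control flow simple, at the cost of hiding why a download failed.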
106 changes: 106 additions & 0 deletions src/macaron/slsa_analyzer/build_tool/gradle.py
Expand Up @@ -8,7 +8,9 @@

import logging
import os
import subprocess # nosec B404

import macaron
from macaron.config.defaults import defaults
from macaron.config.global_config import global_config
from macaron.dependency_analyzer import DependencyAnalyzer, DependencyAnalyzerError, DependencyTools
Expand Down Expand Up @@ -135,3 +137,107 @@ def get_dep_analyzer(self, repo_path: str) -> CycloneDxGradle:
)

raise DependencyAnalyzerError(f"Unsupported SBOM generator for Gradle: {tool_name}.")

def get_gradle_exec(self, repo_path: str) -> str:
"""Get the Gradle executable for the repo.

Parameters
----------
repo_path: str
The absolute path to a repository containing Gradle projects.

Returns
-------
str
The absolute path to the Gradle executable.
"""
# We try to use the gradlew that comes with the repository first.
repo_gradlew = os.path.join(repo_path, "gradlew")
if os.path.isfile(repo_gradlew) and os.access(repo_gradlew, os.X_OK):
return repo_gradlew

# We use Macaron's built-in gradlew as a fallback option.
return os.path.join(os.path.join(macaron.MACARON_PATH, "resources"), "gradlew")

def get_group_ids(self, repo_path: str) -> set[str]:
"""Get the group ids of all Gradle projects in a repository.

A Gradle project is a directory containing a ``build.gradle`` file.
According to Gradle's documentation, there is a one-to-one mapping between
a "project" and a ``build.gradle`` file.
See: https://docs.gradle.org/current/javadoc/org/gradle/api/Project.html.

Note: This method assumes that projects nested in a parent project
directory have the same group id as the parent. This is consistent with
the behavior of the ``get_build_dirs`` method.

Parameters
----------
repo_path: str
The absolute path to a repository containing Gradle projects.

Returns
-------
set[str]
The set of group ids of all Gradle projects in the repository.
"""
gradle_exec = self.get_gradle_exec(repo_path)
group_ids = set()

for gradle_project_relpath in self.get_build_dirs(repo_path):
gradle_project_path = os.path.join(repo_path, gradle_project_relpath)
group_id = self.get_group_id(
gradle_exec=gradle_exec,
project_path=gradle_project_path,
)
if group_id:
group_ids.add(group_id)

return group_ids

def get_group_id(self, gradle_exec: str, project_path: str) -> str | None:
"""Get the group id of a Gradle project.

A Gradle project is a directory containing a ``build.gradle`` file.
According to Gradle's documentation, there is a one-to-one mapping between
a "project" and a ``build.gradle`` file.
See: https://docs.gradle.org/current/javadoc/org/gradle/api/Project.html.

Parameters
----------
gradle_exec: str
The absolute path to the Gradle executable.

project_path : str
The absolute path to the Gradle project.

Returns
-------
str | None
The group id of the project, if it exists.
"""
try:
result = subprocess.run( # nosec B603
[gradle_exec, "properties"],
capture_output=True,
cwd=project_path,
check=False,
)
except (subprocess.CalledProcessError, OSError) as error:
logger.debug("Could not capture the group id of the Gradle project at %s", project_path)
logger.debug("Error: %s", error)
return None

if result.returncode == 0:
lines = result.stdout.decode().split("\n")
for line in lines:
if line.startswith("group: "):
group = line.replace("group: ", "")
# The value of group here can be an empty string.
if group:
return group
break

logger.debug("Could not capture the group id of the repo at %s", project_path)
logger.debug("Stderr:\n%s", result.stderr)
return None
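The parsing half of `get_group_id` can be exercised without running Gradle. The sketch below extracts it into a hypothetical pure function, `parse_group_id`, and feeds it made-up sample output; it slices off the `group: ` prefix rather than calling `str.replace` as the diff does, which is a small deviation chosen for this example.

```python
from typing import Optional

GROUP_PREFIX = "group: "


def parse_group_id(properties_output: str) -> Optional[str]:
    """Return the group id from `gradle properties` output, or None if absent or empty."""
    for line in properties_output.split("\n"):
        if line.startswith(GROUP_PREFIX):
            group = line[len(GROUP_PREFIX):].strip()
            # Gradle can report an empty group, which is treated as "no group id".
            return group or None
    return None


# Invented sample output; real `gradle properties` output has many more lines.
SAMPLE_OUTPUT = "\n".join(
    [
        "------------------------------------------------------------",
        "Root project 'demo'",
        "------------------------------------------------------------",
        "",
        "buildDir: /tmp/demo/build",
        "group: com.example.demo",
        "name: demo",
    ]
)

group_id = parse_group_id(SAMPLE_OUTPUT)
```

Keeping the parser separate from the `subprocess.run` call makes the empty-group and missing-group cases easy to unit test.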
27 changes: 21 additions & 6 deletions src/macaron/slsa_analyzer/checks/build_as_code_check.py
Expand Up @@ -5,6 +5,7 @@

import logging
import os
from typing import Any

from sqlalchemy import ForeignKey
from sqlalchemy.orm import Mapped, mapped_column
Expand All @@ -22,6 +23,7 @@
from macaron.slsa_analyzer.ci_service.gitlab_ci import GitLabCI
from macaron.slsa_analyzer.ci_service.jenkins import Jenkins
from macaron.slsa_analyzer.ci_service.travis import Travis
from macaron.slsa_analyzer.provenance.intoto import InTotoV01Payload
from macaron.slsa_analyzer.registry import registry
from macaron.slsa_analyzer.slsa_req import ReqName
from macaron.slsa_analyzer.specs.ci_spec import CIInfo
Expand Down Expand Up @@ -202,8 +204,12 @@ def _check_build_tool(
else "However, could not find a passing workflow run.",
]
check_result["justification"].extend(justification)
- if ctx.dynamic_data["is_inferred_prov"] and ci_info["provenances"]:
- predicate = ci_info["provenances"][0]["predicate"]
+ if (
+     ctx.dynamic_data["is_inferred_prov"]
+     and ci_info["provenances"]
+     and isinstance(ci_info["provenances"][0], InTotoV01Payload)
+ ):
+ predicate: Any = ci_info["provenances"][0].statement["predicate"]
predicate["buildType"] = f"Custom {ci_service.name}"
predicate["builder"]["id"] = deploy_action_source_link
predicate["invocation"]["configSource"]["uri"] = (
Expand Down Expand Up @@ -261,8 +267,12 @@ def _check_build_tool(
else "However, could not find a passing workflow run.",
]
check_result["justification"].extend(justification_cmd)
- if ctx.dynamic_data["is_inferred_prov"] and ci_info["provenances"]:
- predicate = ci_info["provenances"][0]["predicate"]
+ if (
+     ctx.dynamic_data["is_inferred_prov"]
+     and ci_info["provenances"]
+     and isinstance(ci_info["provenances"][0], InTotoV01Payload)
+ ):
+ predicate = ci_info["provenances"][0].statement["predicate"]
predicate["buildType"] = f"Custom {ci_service.name}"
predicate["builder"]["id"] = bash_source_link
predicate["invocation"]["configSource"]["uri"] = (
Expand Down Expand Up @@ -300,8 +310,13 @@ def _check_build_tool(
f"The target repository uses build tool {build_tool.name}"
+ f" in {ci_service.name} using {deploy_kw} to deploy."
)
- if ctx.dynamic_data["is_inferred_prov"] and ci_info["provenances"]:
- predicate = ci_info["provenances"][0]["predicate"]
+
+ if (
+     ctx.dynamic_data["is_inferred_prov"]
+     and ci_info["provenances"]
+     and isinstance(ci_info["provenances"][0], InTotoV01Payload)
+ ):
+ predicate = ci_info["provenances"][0].statement["predicate"]
predicate["buildType"] = f"Custom {ci_service.name}"
predicate["builder"]["id"] = config_name
predicate["invocation"]["configSource"]["uri"] = (
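The guard introduced in these hunks only touches the first provenance when it is an `InTotoV01Payload`, then edits the predicate through its `statement` attribute. A minimal sketch of that pattern, assuming a simplified frozen dataclass in place of Macaron's real `InTotoV01Payload` and a hypothetical `annotate_inferred` helper:

```python
from dataclasses import dataclass
from typing import Any


@dataclass(frozen=True)
class InTotoV01Payload:
    """Simplified stand-in; the real class lives in macaron.slsa_analyzer.provenance.intoto."""
    statement: dict


def annotate_inferred(provenances: list, is_inferred: bool, ci_name: str, builder_id: str) -> bool:
    """Fill in predicate fields on the first provenance if it is an in-toto v0.1 payload."""
    if is_inferred and provenances and isinstance(provenances[0], InTotoV01Payload):
        predicate: Any = provenances[0].statement["predicate"]
        predicate["buildType"] = f"Custom {ci_name}"
        predicate["builder"]["id"] = builder_id
        return True
    return False


payload = InTotoV01Payload(statement={"predicate": {"buildType": None, "builder": {"id": None}}})
updated = annotate_inferred([payload], True, "github_actions", "https://example.org/workflow.yml")
skipped = annotate_inferred([{"predicate": {}}], True, "github_actions", "https://example.org/workflow.yml")
```

Freezing the dataclass prevents rebinding `statement`, but the dictionary it holds stays mutable, which is what lets the check update the predicate in place.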
18 changes: 14 additions & 4 deletions src/macaron/slsa_analyzer/checks/build_service_check.py
Expand Up @@ -5,6 +5,7 @@

import logging
import os
from typing import Any

from sqlalchemy import ForeignKey
from sqlalchemy.orm import Mapped, mapped_column
Expand All @@ -20,6 +21,7 @@
from macaron.slsa_analyzer.ci_service.gitlab_ci import GitLabCI
from macaron.slsa_analyzer.ci_service.jenkins import Jenkins
from macaron.slsa_analyzer.ci_service.travis import Travis
from macaron.slsa_analyzer.provenance.intoto import InTotoV01Payload
from macaron.slsa_analyzer.registry import registry
from macaron.slsa_analyzer.slsa_req import ReqName
from macaron.slsa_analyzer.specs.ci_spec import CIInfo
Expand Down Expand Up @@ -183,8 +185,12 @@ def _check_build_tool(
)
]

- if ctx.dynamic_data["is_inferred_prov"] and ci_info["provenances"]:
- predicate = ci_info["provenances"][0]["predicate"]
+ if (
+     ctx.dynamic_data["is_inferred_prov"]
+     and ci_info["provenances"]
+     and isinstance(ci_info["provenances"][0], InTotoV01Payload)
+ ):
+ predicate: Any = ci_info["provenances"][0].statement["predicate"]
predicate["buildType"] = f"Custom {ci_service.name}"
predicate["builder"]["id"] = bash_source_link
predicate["invocation"]["configSource"]["uri"] = (
Expand Down Expand Up @@ -219,8 +225,12 @@ def _check_build_tool(
)
]

- if ctx.dynamic_data["is_inferred_prov"] and ci_info["provenances"]:
- predicate = ci_info["provenances"][0]["predicate"]
+ if (
+     ctx.dynamic_data["is_inferred_prov"]
+     and ci_info["provenances"]
+     and isinstance(ci_info["provenances"][0], InTotoV01Payload)
+ ):
+ predicate = ci_info["provenances"][0].statement["predicate"]
predicate["buildType"] = f"Custom {ci_service.name}"
predicate["builder"]["id"] = config_name
predicate["invocation"]["configSource"]["uri"] = (