
[New Rules] Azure OpenAI #3701

Draft
Mikaayenson wants to merge 4 commits into main

Conversation

Mikaayenson (Contributor)

Issues

Related to

Summary

Here is round 2 of our detection engineering within the LLM and AI ecosystem, this time featuring Azure OpenAI. elastic/integrations#9706 just merged, so we can start developing detection rules. Note: this experimental integration does not yet include the Advanced Logging feature, which adds fields like the prompt and completion, nor does it yet include our proposed gen_ai.* fields. These are expected to be added later.

Details

Here are three ESQL rules highlighting the available dataset.

  • Potential Denial of Azure OpenAI ML Service Detection
  • Potential Azure OpenAI Model Theft Detection
  • Azure OpenAI Insecure Output Handling Detection

@Mikaayenson requested review from brokensound77 and a team on May 22, 2024 21:56
interval = "10m"
language = "esql"
license = "Elastic License v2"
name = "Azure OpenAI Insecure Output Handling Detection"
Contributor

Suggested change
name = "Azure OpenAI Insecure Output Handling Detection"
name = "Azure OpenAI Insecure Output Handling"

interval = "10m"
language = "esql"
license = "Elastic License v2"
name = "Potential Azure OpenAI Model Theft Detection"
Contributor

Suggested change
name = "Potential Azure OpenAI Model Theft Detection"
name = "Potential Azure OpenAI Model Theft"

from logs-azure_openai.logs-*
| where azure.open_ai.operation_name == "ChatCompletions_Create"
| stats count = count(), avg_request_size = avg(azure.open_ai.properties.request_length) by azure.resource.id
| where count > 1000 OR avg_request_size > 5000
Contributor

I prefer lowercase, but it's purely stylistic at this point; at a minimum, we should be consistent where possible.

Suggested change
| where count > 1000 OR avg_request_size > 5000
| where count > 1000 or avg_request_size > 5000

Contributor

Suggestion: add a comment explaining whether the size is in KB or bytes, and use >= so the rule also triggers at exactly 1000 and 5000.
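
For illustration, one way that suggestion could look on this query; the unit comment assumes azure.open_ai.properties.request_length is reported in bytes, which should be confirmed against the integration docs:

from logs-azure_openai.logs-*
| where azure.open_ai.operation_name == "ChatCompletions_Create"
| stats count = count(), avg_request_size = avg(azure.open_ai.properties.request_length) by azure.resource.id
// request_length is assumed to be in bytes; thresholds are illustrative
| where count >= 1000 or avg_request_size >= 5000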

from logs-azure_openai.logs-*
| where azure.open_ai.operation_name == "ListKey" and azure.open_ai.category == "Audit"
| stats count = count(), max_data_transferred = max(azure.open_ai.properties.response_length) by azure.open_ai.properties.model_name
| where count > 100 OR max_data_transferred > 1000000
Contributor

Suggested change
| where count > 100 OR max_data_transferred > 1000000
| where count > 100 or max_data_transferred > 1000000

Contributor

If 100 and 1000000 are the intended maximum thresholds, then maybe use >= instead of >; also add a comment explaining the data size unit.
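
A similarly hedged sketch for this query, assuming azure.open_ai.properties.response_length is measured in bytes (worth confirming in the integration docs):

from logs-azure_openai.logs-*
| where azure.open_ai.operation_name == "ListKey" and azure.open_ai.category == "Audit"
| stats count = count(), max_data_transferred = max(azure.open_ai.properties.response_length) by azure.open_ai.properties.model_name
// response_length is assumed to be in bytes; thresholds are illustrative
| where count >= 100 or max_data_transferred >= 1000000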


query = '''
from logs-azure_openai.logs-*
| where azure.open_ai.operation_name == "ChatCompletions_Create"
| stats count = count(), avg_request_size = avg(azure.open_ai.properties.request_length) by azure.resource.id
Contributor

Is azure.resource.id specific/unique to the user/source of the API calls? It would be ideal to aggregate by a field that can be used for attribution and further investigation during triage.

Contributor

++ It also helps with FPs from collisions/RC.

1001 users making 1 request each in an hour may be normal.
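
If the dataset exposes a caller identity field, the aggregation could include it to support attribution; a rough sketch, where azure.open_ai.properties.user_id is a hypothetical placeholder for whatever identity field the integration actually ships:

from logs-azure_openai.logs-*
| where azure.open_ai.operation_name == "ChatCompletions_Create"
| stats count = count(), avg_request_size = avg(azure.open_ai.properties.request_length)
  by azure.resource.id, azure.open_ai.properties.user_id // hypothetical attribution field
| where count > 1000 or avg_request_size > 5000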

query = '''
from logs-azure_openai.logs-*
| where azure.open_ai.properties.response_length == 0 and azure.open_ai.result_signature == "200" and azure.open_ai.operation_name == "ChatCompletions_Create"
| stats count = count() by azure.resource.id, azure.open_ai.operation_name
Contributor

Is there a field that can be used in the by aggregation to attribute this to a specific user.id or equivalent?

Contributor

Also, are there ECS-compatible alternatives for any of these fields? Can you share some sample data or docs for these events?

Contributor


https://learn.microsoft.com/en-us/azure/azure-monitor/essentials/stream-monitoring-data-event-hubs
"""
severity = "low"
Contributor

If this behavior is rare, then maybe bump up the severity.


query = '''
from logs-azure_openai.logs-*
| where azure.open_ai.operation_name == "ListKey" and azure.open_ai.category == "Audit"
| stats count = count(), max_data_transferred = max(azure.open_ai.properties.response_length) by azure.open_ai.properties.model_name
Contributor

The by clause should ideally include a field that the user can pivot on for further investigation (e.g., the unique user ID or source of the API calls) to triage false positives and add exclusions.
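
Along the same lines, a sketch of what that could look like here; azure.open_ai.properties.user_id is again a hypothetical stand-in for whichever caller or source field the dataset provides:

from logs-azure_openai.logs-*
| where azure.open_ai.operation_name == "ListKey" and azure.open_ai.category == "Audit"
| stats count = count(), max_data_transferred = max(azure.open_ai.properties.response_length)
  by azure.open_ai.properties.model_name, azure.open_ai.properties.user_id // hypothetical attribution field
| where count > 100 or max_data_transferred > 1000000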

[rule]
author = ["Elastic"]
description = """
Detects patterns indicative of Denial of Service attacks on ML models, focusing on unusually high volume and frequency
Contributor

Suggested change
Detects patterns indicative of Denial of Service attacks on ML models, focusing on unusually high volume and frequency
Detects patterns indicative of Denial-of-Service (DoS) attacks on machine learning (ML) models, focusing on unusually high volume and frequency

"Domain: LLM",
"Data Source: Azure OpenAI",
"Data Source: Azure Event Hubs",
"Use Case: Insecure Output Handling"
Contributor

Any specific MITRE ATLAS tag here?

[rule]
author = ["Elastic"]
description = """
Monitors for suspicious activities that may indicate theft or unauthorized duplication of ML models, such as
Contributor

Suggested change
Monitors for suspicious activities that may indicate theft or unauthorized duplication of ML models, such as
Monitors for suspicious activities that may indicate theft or unauthorized duplication of machine learning (ML) models, such as

@Mikaayenson marked this pull request as draft on June 7, 2024 01:45

Mikaayenson (Contributor, Author)

Converting to draft until the integration includes the advanced logging fields.
