Add result aggregation for query templates #283

Open
wants to merge 21 commits into
base: develop
16 changes: 11 additions & 5 deletions docs/configuration/queries.md
@@ -130,16 +130,21 @@ The results may look like the following:
### Configuration
The `template` attribute has the following properties:

| property | required | default | description | example |
|----------|----------|---------|---------------------------------------------------------------------|-----------------------------|
| endpoint | yes | | The endpoint to query. | `http://dbpedia.org/sparql` |
| limit | no | `2000` | The maximum number of instances per query template. | `100` |
| save | no | `true` | If set to `true`, query instances will be saved in a separate file. | `false` |
| property | required | default | description | example |
|-------------------|----------|---------|-------------------------------------------------------------------------------------------------------------------------------------------------------------------------|-----------------------------|
| endpoint | yes | | The endpoint to query. | `http://dbpedia.org/sparql` |
| limit | no | `2000` | The maximum number of instances per query template. | `100` |
| save | no | `true` | If set to `true`, query instances will be saved in a separate file. | `false` |
| individualResults | no | `false` | If set to `true`, the results of each template instance are reported individually. If set to `false`, the results of all instances are aggregated and reported under their query template. | `true` |

If the `save` attribute is set to `true`,
the instances will be saved in a separate file in the same directory as the query templates.
If the query templates are stored in a folder, the instances will be saved in the parent directory.

If the `individualResults` attribute is set to `false`,
the results of all instances are aggregated and reported under their query template.
The query template then appears in the results as if it were a regular query.

Example of query configuration with query templates:
```yaml
queries:
@@ -149,4 +154,5 @@ queries:
endpoint: "http://dbpedia.org/sparql"
limit: 100
save: true
individualResults: true
```
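
As a rough illustration of the two reporting modes described above, the following Java sketch sums execution times either per instance or per template. The `Execution` record and the methods `reportId` and `totalTimePerReportedQuery` are hypothetical names made up for this example; they are not part of Iguana's API, and the real aggregation may work differently.

```java
import java.time.Duration;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

public class TemplateAggregationSketch {
    // Hypothetical result of one execution; templateId is null for plain queries.
    record Execution(int queryId, Integer templateId, Duration duration) {}

    // Picks the id under which an execution is reported.
    // Assumption: with individualResults=false, instances are reported under their template's id.
    static int reportId(Execution e, boolean individualResults) {
        if (individualResults || e.templateId() == null) {
            return e.queryId();
        }
        return e.templateId();
    }

    // Sums the execution times per reported id.
    static Map<Integer, Duration> totalTimePerReportedQuery(List<Execution> executions,
                                                            boolean individualResults) {
        Map<Integer, Duration> totals = new TreeMap<>();
        for (Execution e : executions) {
            totals.merge(reportId(e, individualResults), e.duration(), Duration::plus);
        }
        return totals;
    }

    public static void main(String[] args) {
        List<Execution> runs = List.of(
                new Execution(0, null, Duration.ofMillis(10)), // plain query
                new Execution(2, 1, Duration.ofMillis(20)),    // instance of template 1
                new Execution(3, 1, Duration.ofMillis(30)));   // instance of template 1
        System.out.println(totalTimePerReportedQuery(runs, true));  // one entry per instance
        System.out.println(totalTimePerReportedQuery(runs, false)); // instances merged under template 1
    }
}
```

With `individualResults` set to `true`, the sketch yields three entries (ids 0, 2, and 3). With `false`, it yields two (ids 0 and 1), and the two instance times are added together under the template.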
3 changes: 2 additions & 1 deletion example-suite.yml
@@ -75,10 +75,11 @@ tasks:
requestType: post query
queries:
path: "./example/query_pattern.txt"
pattern:
template:
endpoint: "https://dbpedia.org/sparql"
limit: 1000
save: false
individualResults: false
timeout: 180s
completionTarget:
duration: 1000s
9 changes: 5 additions & 4 deletions graalvm/suite.yml
@@ -6,7 +6,7 @@ connections:
- name: "Blazegraph"
version: "1.1.1"
dataset: "DatasetName"
endpoint: "http://localhost:9999/blazegraph/sparql"
endpoint: "https://dbpedia.org/sparql"
authentication:
user: "user"
password: "test"
@@ -60,13 +60,14 @@ tasks:
seed: 123
lang: "SPARQL"
template:
endpoint: "http://dbpedia.org/sparql"
endpoint: "https://dbpedia.org/sparql"
limit: 1
save: false
individualResults: false
timeout: 2s
connection: Blazegraph
completionTarget:
duration: 1s
duration: 0.5s
acceptHeader: "application/sparql-results+json"
requestType: get query
parseResults: true
@@ -78,7 +79,7 @@ tasks:
timeout: 3m
connection: Blazegraph
completionTarget:
duration: 1s
duration: 0.5s
requestType: get query
acceptHeader: "application/sparql-results+json"
- number: 1
3 changes: 3 additions & 0 deletions schema/iguana-schema.json
@@ -351,6 +351,9 @@
},
"save": {
"type": "boolean"
},
"individualResults": {
"type": "boolean"
}
},
"required": [
@@ -29,7 +29,7 @@ public AggregatedExecutionStatistics() {
public Model createMetricModel(List<HttpWorker> workers, List<HttpWorker.ExecutionStats>[][] data, IRES.Factory iresFactory) {
Model m = ModelFactory.createDefaultModel();
for (var worker : workers) {
for (int i = 0; i < worker.config().queries().getQueryCount(); i++) {
for (int i = 0; i < worker.config().queries().getRepresentativeQueryCount(); i++) {
Resource queryRes = iresFactory.getWorkerQueryResource(worker, i);
m.add(createAggregatedModel(data[(int) worker.getWorkerID()][i], queryRes));
}
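
This hunk and the ones that follow replace `getQueryCount()` with either `getRepresentativeQueryCount()` or `getExecutableQueryCount()`, but the diff does not show how those methods are implemented. The sketch below is one plausible reading, based only on the method names and the `QueryType` enum added in `QueryData.java`. The class `QueryCountSketch`, both static methods, and the filtering rules are assumptions for illustration and should not be taken as the actual implementation.

```java
import java.util.List;

public class QueryCountSketch {
    // Mirrors the QueryType enum added in QueryData.java.
    enum QueryType { DEFAULT, UPDATE, TEMPLATE, TEMPLATE_INSTANCE }

    // Assumption: a template is only a pattern, so every entry except TEMPLATE can be executed.
    static long executableQueryCount(List<QueryType> types) {
        return types.stream().filter(t -> t != QueryType.TEMPLATE).count();
    }

    // Assumption: results list either the instances (individualResults=true)
    // or the templates that stand in for them (individualResults=false), never both.
    static long representativeQueryCount(List<QueryType> types, boolean individualResults) {
        QueryType hidden = individualResults ? QueryType.TEMPLATE : QueryType.TEMPLATE_INSTANCE;
        return types.stream().filter(t -> t != hidden).count();
    }

    public static void main(String[] args) {
        // Two plain queries plus one template with three instances.
        List<QueryType> types = List.of(
                QueryType.DEFAULT, QueryType.UPDATE, QueryType.TEMPLATE,
                QueryType.TEMPLATE_INSTANCE, QueryType.TEMPLATE_INSTANCE, QueryType.TEMPLATE_INSTANCE);
        System.out.println(executableQueryCount(types));            // 5
        System.out.println(representativeQueryCount(types, false)); // 3
        System.out.println(representativeQueryCount(types, true));  // 5
    }
}
```

Under these assumptions, the statistics loops iterate over fewer entries than there are executable queries whenever instance results are aggregated, which would explain why the metrics now need two separate counts.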
@@ -26,7 +26,7 @@ public EachExecutionStatistic() {
public Model createMetricModel(List<HttpWorker> workers, List<HttpWorker.ExecutionStats>[][] data, IRES.Factory iresFactory) {
Model m = ModelFactory.createDefaultModel();
for (var worker : workers) {
for (int i = 0; i < worker.config().queries().getQueryCount(); i++) {
for (int i = 0; i < worker.config().queries().getRepresentativeQueryCount(); i++) {
Resource workerQueryResource = iresFactory.getWorkerQueryResource(worker, i);
Resource queryRes = IRES.getResource(worker.config().queries().getQueryId(i));
BigInteger run = BigInteger.ONE;
2 changes: 1 addition & 1 deletion src/main/java/org/aksw/iguana/cc/metrics/impl/QMPH.java
@@ -28,7 +28,7 @@ public Number calculateTaskMetric(List<HttpWorker> workers, List<HttpWorker.Exec
@Override
public Number calculateWorkerMetric(HttpWorker.Config worker, List<HttpWorker.ExecutionStats>[] data) {
BigDecimal successes = BigDecimal.ZERO;
BigDecimal noq = BigDecimal.valueOf(worker.queries().getQueryCount());
BigDecimal noq = BigDecimal.valueOf(worker.queries().getExecutableQueryCount());
Duration totalTime = Duration.ZERO;
for (List<HttpWorker.ExecutionStats> datum : data) {
for (HttpWorker.ExecutionStats exec : datum) {
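
The hunk above is truncated, so the rest of `calculateWorkerMetric` is not visible. Assuming QMPH means query mixes per hour and one mix means executing every executable query once, the calculation could look roughly like the sketch below. `QmphSketch` and `queryMixesPerHour` are hypothetical names, and the rounding scales are arbitrary choices for this example, not values taken from `QMPH.java`.

```java
import java.math.BigDecimal;
import java.math.RoundingMode;
import java.time.Duration;

public class QmphSketch {
    // Assumed formula: (successful executions / queries per mix) / elapsed hours.
    static BigDecimal queryMixesPerHour(long successes, long executableQueries, Duration totalTime) {
        if (executableQueries == 0 || totalTime.isZero()) {
            return BigDecimal.ZERO; // avoid division by zero
        }
        BigDecimal hours = BigDecimal.valueOf(totalTime.toNanos())
                .divide(BigDecimal.valueOf(Duration.ofHours(1).toNanos()), 20, RoundingMode.HALF_UP);
        BigDecimal mixes = BigDecimal.valueOf(successes)
                .divide(BigDecimal.valueOf(executableQueries), 20, RoundingMode.HALF_UP);
        return mixes.divide(hours, 10, RoundingMode.HALF_UP);
    }

    public static void main(String[] args) {
        // 10 successful executions of 5 executable queries in 30 minutes: 2 mixes in half an hour.
        System.out.println(queryMixesPerHour(10, 5, Duration.ofMinutes(30)));
    }
}
```

If this reading is right, dividing by the executable count rather than the representative count keeps the metric independent of `individualResults`, since the same number of queries is sent to the endpoint in both modes.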
55 changes: 55 additions & 0 deletions src/main/java/org/aksw/iguana/cc/query/QueryData.java
@@ -0,0 +1,55 @@
package org.aksw.iguana.cc.query;

import org.apache.jena.update.UpdateFactory;

import java.io.IOException;
import java.io.InputStream;
import java.util.ArrayList;
import java.util.Collection;
import java.util.List;

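/**
* This class stores extra information about a query: its id, its type, and an optional template id.
* @param queryId The id of the query
* @param type The type of the query, for example whether it is an update query
* @param templateId The id of the associated query template, or {@code null} if there is none
*/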
public record QueryData(int queryId, QueryType type, Integer templateId) {
public enum QueryType {
DEFAULT,
UPDATE,
TEMPLATE,
TEMPLATE_INSTANCE
}

public static List<QueryData> generate(Collection<InputStream> queries) {
final var queryData = new ArrayList<QueryData>();
int i = 0;
for (InputStream query : queries) {
boolean update = true;
try {
UpdateFactory.read(query); // Throws an exception if the query is not an update query
} catch (Exception e) {
update = false;
}
queryData.add(new QueryData(i++, update ? QueryType.UPDATE : QueryType.DEFAULT, null));
try {
query.close();
} catch (IOException ignored) {}
}
return queryData;
}

public static boolean checkUpdate(InputStream query) {
try {
UpdateFactory.read(query); // Throws an exception if the query is not an update query
return true;
} catch (Exception e) {
return false;
}
}

public boolean update() {
return type == QueryType.UPDATE;
}
}