Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Querying performance with synced prepared query #44

Open
Divi opened this issue Apr 2, 2015 · 5 comments
Open

Querying performance with synced prepared query #44

Divi opened this issue Apr 2, 2015 · 5 comments

Comments

@Divi
Copy link

Divi commented Apr 2, 2015

Hello,

I'm running a benchmark with your lib, and it is pretty good 👍
But, I have some performance issues with synced prepared queries :

// running this x10000, with random key value
$connection->querySync("SELECT * FROM keyspace.columnFamily where key = 'foo'");
// got response ~ 6.255 sec

and, from your example :

// running this x10000, with random key value, I known it's stupid to put the parameter directly in the query
$preparedData = $connection->prepare("SELECT * FROM keyspace.columnFamily where key = :keyValue");
$strictValues = [...];
$response = $con->executeSync($preparedData['id'], $strictValues, Cassandra\Request\Request::CONSISTENCY_ONE, [
    'page_size' => 100,
    'names_for_values' => true,
    'skip_metadata' => true,
]);
// got response ~ 23.331 sec

Why this difference ?

TL;DR: got 6.255 sec with querySync() and 23.331 sec with executeSync() on a x10000 loop for a single SELECT.

Thanks :)

@shen2
Copy link
Member

shen2 commented Apr 2, 2015

Show your full.benchmark code.

@Divi
Copy link
Author

Divi commented Apr 2, 2015

Sure, here we go :

$connection = new \Cassandra\Connection([
    '127.0.0.1'
]);

// $connection->querySync('DROP TABLE benchmark.bar');
$connection->querySync('CREATE TABLE benchmark.bar (
  userid text PRIMARY KEY,
  column1 text, column2 text, column3 text, column4 text, column5 text,
  column6 text, column7 text, column8 text, column9 text, column10 text,
  column11 text, column12 text, column13 text, column14 text, column15 text,
  column16 text, column17 text, column18 text, column19 text, column20 text
)');

// insert 10 000 rows
for ($i=0; $i<10000; $i++) {
    $connection->queryAsync("INSERT INTO benchmark.bar (
        userid,
        column1, column2, column3, column4, column5,
        column6, column7, column8, column9,column10,
        column11, column12, column13, column14, column15,
        column16, column17, column18, column19, column20) VALUES (
        'row" . $i . "',
        'foo', 'bar', 'foo', 'bar', 'foo',
        'foo', 'bar', 'foo', 'bar', 'foo',
        'foo', 'bar', 'foo', 'bar', 'foo',
        'foo', 'bar', 'foo', 'bar', 'foo'
    )");
}

// select 10 000 rows with executeSync()
for ($i=0; $i<10000; $i++) {
    $preparedData = $connection->prepare("SELECT * FROM benchmark.bar where userid = :userid");
    $strictValues =  \Cassandra\Request\Request::strictTypeValues([
            'userid' => 'row' . $i,
    ], $preparedData['metadata']['columns']);
    $response = $con->executeSync($preparedData['id'], $strictValues, Cassandra\Request\Request::CONSISTENCY_ONE, [
        'page_size' => 100,
        'names_for_values' => true,
        'skip_metadata' => true
    ]);
}

// select 10 000 rows with querySync()
for ($i=0; $i<10000; $i++) {
    $connection->querySync("SELECT * FROM benchmark.bar where userid = 'row" . $id . "'");
}

@shen2
Copy link
Member

shen2 commented Apr 3, 2015

Obviously, you prepared the query for 10000 times.
Actually, you can prepare the query once, and execute for 10000 times.

@Divi
Copy link
Author

Divi commented Apr 3, 2015

Indeed, but it was for this benchmark. In real case, I can have ~500 queries per page, and I will have to prepare a query each time because it will not the same.
I'm surprised by the difference between a prepared & not prepared query, so I'm wondering if I'm doing well (and obviously, for this benchmark : no, but see it as 10 000 different queries).

@Divi
Copy link
Author

Divi commented Apr 3, 2015

I made another test, with one unique prepared statement, here the result :

  • Prepared queries : 7.409 seconds
  • Synced queries : 6.268 seconds

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants