How to implement zero copy? #258

Open
katetsu opened this issue Feb 1, 2022 · 7 comments

katetsu commented Feb 1, 2022

I am trying to figure out zero-copy decoding with Cap'n Proto. I have the following schema file:

@0xe620bfc471ce012f;

struct Output {
    data1 @0: Data;
    data2 @1: Data;
}

Here is the decoding code, where I'm trying to read from a stream of bytes and build my struct from them without copying:

pub mod schema_capnp {
    include!(concat!(env!("OUT_DIR"), "/src/schema_capnp.rs"));
}

use crate::schema_capnp::output;
use capnp::message::{Reader, ReaderOptions};
use capnp::serialize::{self, SliceSegments};
use capnp::Result;

#[derive(Default)]
struct Output <'a>{
    data1 : &'a [u8],
    data2 : &'a [u8],
}


impl <'a> Output <'a> {
    pub fn set_data1(&mut self, value: &'a[u8]) {
        self.data1 = value;
    }

    pub fn set_data2(&mut self, value: &'a[u8]) {
        self.data2 = value;
    }
    pub fn decode(input : &mut &'a [u8]) -> Result<Self> {
        let mut result = Output::default();

        let raw_data = serialize::read_message_from_flat_slice(input, ReaderOptions::new())?;        
        let data = raw_data.get_root::<output::Reader<'a>>()?;
        let d1 = data.get_data1()?;
        let d2 = data.get_data2()?;

        result.set_data1(d1);
        result.set_data2(d2);
        Ok(result)
    }
}

But I'm getting this error:

18 | impl <'a> Output <'a> {
   |       -- lifetime `'a` defined here
...
32 |         let data = raw_data.get_root::<output::Reader<'a>>()?;
   |                    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
   |                    |
   |                    borrowed value does not live long enough
   |                    argument requires that `raw_data` is borrowed for `'a`
...
42 |     }
   |     - `raw_data` dropped here while still borrowed

Any idea what I'm doing wrong?

OliverEvans96 (Contributor) commented

I'm also perplexed by this. I would expect result to hold references to input, not raw_data. However, I'm not sure off the top of my head if or how this could be encoded in the Rust type system, and, if it is possible, whether it would require modifying the library or just your usage. But I would really like to find out!

P.S. In GitHub-flavored Markdown, you can enable language-specific syntax highlighting by opening a code fence with ```rust or ```capnp.

zopsicle commented Oct 6, 2022

I think for this to work, get_root should use the 'a from SliceSegments<'a>, rather than the 'a from &'a self. For this, the ReaderSegments trait would have to be modified to have a lifetime parameter.
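
To make the distinction concrete without depending on the library, here is a minimal std-only sketch. Message, root_tied_to_self, and root_tied_to_slice are hypothetical stand-ins and not capnp APIs. The first accessor models a signature that borrows from self, the second models one that returns data with the lifetime of the underlying slice.

```rust
/// Hypothetical stand-in for a message reader that borrows a byte slice.
/// This is NOT the capnp API; it only models the lifetimes involved.
pub struct Message<'a> {
    bytes: &'a [u8],
}

impl<'a> Message<'a> {
    pub fn new(bytes: &'a [u8]) -> Self {
        Message { bytes }
    }

    /// The result borrows from `self`, so it cannot outlive this `Message`.
    pub fn root_tied_to_self(&self) -> &[u8] {
        self.bytes
    }

    /// The result borrows from the underlying slice, so it may outlive `self`.
    pub fn root_tied_to_slice(&self) -> &'a [u8] {
        self.bytes
    }
}

/// Compiles because the returned slice is tied to `input`, not to the local
/// `message`. Swapping in `root_tied_to_self` here is rejected by the borrow
/// checker, since the result would reference a local that is about to drop.
pub fn decode<'a>(input: &'a [u8]) -> &'a [u8] {
    let message = Message::new(input);
    message.root_tied_to_slice()
}
```

In this toy model the second accessor is trivial to write because Message stores the slice directly. Whether the same can be done for a generic segment source is exactly the open question in this thread.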

Forsworns commented Nov 1, 2022

As a workaround, I think you can pass in a placeholder reader that lives longer than 'a as an extra argument. The easiest way is to always pass in a None.

#[derive(Debug, Clone)]
struct Output<'a> {
    pub data1: &'a [u8],
    pub data2: &'a [u8],
}

impl<'a> Output<'a> {
    pub fn set_data1(&mut self, value: &'a [u8]) {
        self.data1 = value;
    }

    pub fn set_data2(&mut self, value: &'a [u8]) {
        self.data2 = value;
    }

    pub fn decode<'b>(
        &'a mut self,
        input: &mut &'b [u8],
        reader: &'b mut Option<Reader<SliceSegments<'b>>>,
    ) where
        'b: 'a,
    {
        *reader = Some(
            serialize::read_message_from_flat_slice(input, ReaderOptions::new())
                .expect("fail to build reader"),
        );
        let data = reader
            .as_ref()
            .unwrap()
            .get_root::<output::Reader<'b>>()
            .expect("failed to get reader");
        let d1 = data.get_data1().expect("failed to get d1");
        let d2 = data.get_data2().expect("failed to get d2");
        self.set_data1(d1);
        self.set_data2(d2);
    }
}

To call the function, you can use:

let mut reader = None;
output.decode(&mut input, &mut reader);

Forsworns commented Nov 1, 2022

My question is: is capnp-rpc zero-copy? It uses try_read_message as the reader in capnp-rpc/src/twoparty.rs.

So what's the difference between (try_)read_message and read_message_from_flat_slice? It seems (try_)read_message does maintain a buffer, and the API exposes capnp::serialize::OwnedSegments, which can be dereferenced to &[u8].

Does read_message_from_flat_slice perform better, with even fewer copies than try_read_message? Why doesn't capnp-rpc use read_message_from_flat_slice?

Friendly ping @dwrensha

dwrensha (Member) commented Nov 1, 2022

Yes, capnp-rpc is currently hard-coded to use capnp_futures::serialize::try_read_message(), which copies the bytes of the message into an internal buffer. This is essentially a single memcpy per message, so it should be reasonably fast, but it is not free.

You might imagine that something like a shared-memory ring buffer would allow us to avoid copying those buffers, but it seems difficult to me to make that work. The problem is that user-defined objects in the RPC system may hold on to messages for arbitrary lengths of time, so we would not be able to implement a simple "sliding window" of active memory.
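
The difference between a copying reader and a borrowing reader can be sketched with std alone. The length-prefixed framing and the names read_frame_owned and read_frame_borrowed below are assumptions made for illustration. They are not Cap'n Proto's wire format or the capnp API.

```rust
use std::io::{self, Read};

/// Copying path: reads one length-prefixed frame from a stream into a freshly
/// allocated buffer, i.e. one copy of the payload per message.
pub fn read_frame_owned<R: Read>(stream: &mut R) -> io::Result<Vec<u8>> {
    let mut header = [0u8; 4];
    stream.read_exact(&mut header)?;
    let len = u32::from_le_bytes(header) as usize;
    let mut payload = vec![0u8; len];
    stream.read_exact(&mut payload)?;
    Ok(payload)
}

/// Borrowing path: returns the payload as a sub-slice of `input` and advances
/// `input` past the frame. No payload bytes are copied.
pub fn read_frame_borrowed<'a>(input: &mut &'a [u8]) -> Option<&'a [u8]> {
    let bytes: &'a [u8] = *input;
    if bytes.len() < 4 {
        return None;
    }
    let (header, rest) = bytes.split_at(4);
    let len = u32::from_le_bytes([header[0], header[1], header[2], header[3]]) as usize;
    if rest.len() < len {
        return None;
    }
    let (payload, remaining) = rest.split_at(len);
    *input = remaining;
    Some(payload)
}
```

The owned result can be kept for as long as the receiver likes, which is what makes it a natural fit for an RPC system where handlers may hold on to messages. The borrowed result is only valid while the input buffer stays alive and unmodified.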

Forsworns commented Nov 1, 2022

Thanks for your detailed explanation :)

dwrensha (Member) commented

@katetsu: To make it work, you will need to hold the outer message (raw_data) longer than the Output value that you are constructing.

It's possible that there's a way to adjust capnproto-rust's handling of lifetimes to make possible what you are attempting, but I don't see an obvious easy way to do it.
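
A minimal sketch of that arrangement, using std only: the caller creates the message first and then decodes an Output that borrows from it. The Message type, its fields accessor, and the one-length-byte layout are hypothetical stand-ins chosen for illustration. They are not the capnp API or encoding.

```rust
/// Hypothetical stand-in for the message reader (NOT the capnp API).
/// Its accessor borrows from `&self`, like the `get_root` call in the question.
pub struct Message<'s> {
    bytes: &'s [u8],
}

impl<'s> Message<'s> {
    pub fn new(bytes: &'s [u8]) -> Self {
        Message { bytes }
    }

    /// Toy layout assumed here: one length byte for `data1`, then `data1`,
    /// then `data2` as the remainder.
    pub fn fields(&self) -> Option<(&[u8], &[u8])> {
        let (len, rest) = self.bytes.split_first()?;
        let len = *len as usize;
        if rest.len() < len {
            return None;
        }
        Some(rest.split_at(len))
    }
}

#[derive(Debug, Default, PartialEq)]
pub struct Output<'m> {
    pub data1: &'m [u8],
    pub data2: &'m [u8],
}

impl<'m> Output<'m> {
    /// Borrows from a message that the caller owns and keeps alive, so the
    /// `Output` is tied to the message borrow `'m` rather than to a local.
    pub fn decode<'s>(message: &'m Message<'s>) -> Option<Self> {
        let (data1, data2) = message.fields()?;
        Some(Output { data1, data2 })
    }
}
```

A caller would write something like `let message = Message::new(&bytes);` followed by `let output = Output::decode(&message);`, keeping message in scope for as long as output is used. With the real library, the analogous step would be to call read_message_from_flat_slice in the caller and pass a reference to the resulting reader into decode.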
