Update docs to stop referring to old namespaces (#6084)
Summary:

Audit all instances of `\bexec_aten::` and `\btorch::` under `docs/`, updating where appropriate.

The only remaining `torch::` instances are for kernels, which I didn't get a chance to migrate before v0.4.0.

Also update the LLM Manual code to be consistent between the doc and main.cpp.
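
For reference, the general shape of the rename (an illustrative sketch only; the authoritative old-to-new mappings are the ones shown in the diffs below):

```cpp
// Old spellings audited out of the docs:
//   exec_aten::Tensor, exec_aten::ScalarType
//   torch::executor::bundled_program::GetProgramData(...)
//
// New spellings used after this change:
#include <executorch/runtime/core/exec_aten/exec_aten.h>

using executorch::aten::ScalarType;  // was exec_aten::ScalarType
using executorch::aten::Tensor;      // was exec_aten::Tensor

// Bundled-program helpers now live under executorch::bundled_program,
// e.g. executorch::bundled_program::get_program_data(...).
```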

Reviewed By: mergennachin

Differential Revision: D64152344
dbort authored and facebook-github-bot committed Oct 11, 2024
1 parent ba8dc28 commit 0bcf75f
Showing 10 changed files with 155 additions and 148 deletions.
docs/source/Doxyfile: 3 changes (2 additions, 1 deletion)
@@ -943,7 +943,8 @@ WARN_LOGFILE =
# spaces. See also FILE_PATTERNS and EXTENSION_MAPPING
# Note: If this tag is empty the current directory is searched.

INPUT = ../runtime/executor/memory_manager.h \
INPUT = ../devtools/bundled_program/bundled_program.h \
../runtime/executor/memory_manager.h \
../runtime/executor/method.h \
../runtime/executor/method_meta.h \
../runtime/executor/program.h \
docs/source/build-run-coreml.md: 5 changes (2 additions, 3 deletions)
@@ -147,11 +147,10 @@ libsqlite3.tbd

7. Update the code to load the program from the Application's bundle.
``` objective-c
using namespace torch::executor;

NSURL *model_url = [NBundle.mainBundle URLForResource:@"mv3_coreml_all" extension:@"pte"];

Result<util::FileDataLoader> loader = util::FileDataLoader::from(model_url.path.UTF8String);
Result<executorch::extension::FileDataLoader> loader =
executorch::extension::FileDataLoader::from(model_url.path.UTF8String);
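
// Editor's sketch, not part of this commit: the program can then be loaded
// from the data loader (the call shape below is an assumption based on the
// executorch::runtime::Program API).
Result<executorch::runtime::Program> program =
    executorch::runtime::Program::load(&loader.get());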
```
8. Use [Xcode](https://developer.apple.com/documentation/xcode/building-and-running-an-app#Build-run-and-debug-your-app) to deploy the application on the device.
docs/source/bundled-io.md: 30 changes (15 additions, 15 deletions)
@@ -201,51 +201,51 @@ This stage mainly focuses on executing the model with the bundled inputs and and
### Get ExecuTorch Program Pointer from `BundledProgram` Buffer
We need a pointer to the ExecuTorch program to execute it. To unify the process of loading and executing a `BundledProgram` and a Program flatbuffer, we create an API:

:::{dropdown} `GetProgramData`
:::{dropdown} `get_program_data`

```{eval-rst}
.. doxygenfunction:: torch::executor::bundled_program::GetProgramData
.. doxygenfunction:: ::executorch::bundled_program::get_program_data
```
:::

Here's an example of how to use the `GetProgramData` API:
Here's an example of how to use the `get_program_data` API:
```c++
// Assume that the user has read the contents of the file into file_data using
// whatever method works best for their application. The file could contain
// either BundledProgram data or Program data.
void* file_data = ...;
size_t file_data_len = ...;

// If file_data contains a BundledProgram, GetProgramData() will return a
// If file_data contains a BundledProgram, get_program_data() will return a
// pointer to the Program data embedded inside it. Otherwise it will return
// file_data, which already pointed to Program data.
const void* program_ptr;
size_t program_len;
status = torch::executor::bundled_program::GetProgramData(
status = executorch::bundled_program::get_program_data(
file_data, file_data_len, &program_ptr, &program_len);
ET_CHECK_MSG(
status == Error::Ok,
"GetProgramData() failed with status 0x%" PRIx32,
"get_program_data() failed with status 0x%" PRIx32,
status);
```
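
As a rough continuation of this example (an editor's sketch, not from the original page; the `BufferDataLoader` class and header path are assumptions), the extracted Program bytes can then be wrapped in a data loader and loaded:

```c++
#include <executorch/extension/data_loader/buffer_data_loader.h>
#include <executorch/runtime/executor/program.h>

// Wrap the raw Program bytes returned by get_program_data() in a loader...
executorch::extension::BufferDataLoader loader(program_ptr, program_len);

// ...and load the Program from it.
executorch::runtime::Result<executorch::runtime::Program> program =
    executorch::runtime::Program::load(&loader);
ET_CHECK_MSG(program.ok(), "Program::load() failed");
```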
### Load Bundled Input to Method
To execute the program on the bundled input, we need to load the bundled input into the method. Here we provided an API called `torch::executor::bundled_program::LoadBundledInput`:
To execute the program on the bundled input, we need to load the bundled input into the method. Here we provide an API called `executorch::bundled_program::load_bundled_input`:
:::{dropdown} `LoadBundledInput`
:::{dropdown} `load_bundled_input`
```{eval-rst}
.. doxygenfunction:: torch::executor::bundled_program::LoadBundledInput
.. doxygenfunction:: ::executorch::bundled_program::load_bundled_input
```
:::

### Verify the Method's Output.
We call `torch::executor::bundled_program::VerifyResultWithBundledExpectedOutput` to verify the method's output with bundled expected outputs. Here's the details of this API:
We call `executorch::bundled_program::verify_method_outputs` to verify the method's output against the bundled expected outputs. Here are the details of this API:

:::{dropdown} `VerifyResultWithBundledExpectedOutput`
:::{dropdown} `verify_method_outputs`

```{eval-rst}
.. doxygenfunction:: torch::executor::bundled_program::VerifyResultWithBundledExpectedOutput
.. doxygenfunction:: ::executorch::bundled_program::verify_method_outputs
```
:::

@@ -266,13 +266,13 @@ ET_CHECK_MSG(
method.error());

// Load testset_idx-th input in the buffer to plan
status = torch::executor::bundled_program::LoadBundledInput(
status = executorch::bundled_program::load_bundled_input(
*method,
program_data.bundled_program_data(),
FLAGS_testset_idx);
ET_CHECK_MSG(
status == Error::Ok,
"LoadBundledInput failed with status 0x%" PRIx32,
"load_bundled_input failed with status 0x%" PRIx32,
status);

// Execute the plan
@@ -283,7 +283,7 @@ ET_CHECK_MSG(
status);

// Verify the result.
status = torch::executor::bundled_program::VerifyResultWithBundledExpectedOutput(
status = executorch::bundled_program::verify_method_outputs(
*method,
program_data.bundled_program_data(),
FLAGS_testset_idx,
docs/source/concepts.md: 8 changes (4 additions, 4 deletions)
@@ -26,7 +26,7 @@ The goal of ATen dialect is to capture users’ programs as faithfully as possib

## ATen mode

ATen mode uses the ATen implementation of Tensor (`at::Tensor`) and related types, such as `ScalarType`, from the PyTorch core. This is in contrast to portable mode, which uses ExecuTorch’s smaller implementation of tensor (`torch::executor::Tensor`) and related types, such as `torch::executor::ScalarType`.
ATen mode uses the ATen implementation of Tensor (`at::Tensor`) and related types, such as `ScalarType`, from the PyTorch core. This is in contrast to ETensor mode, which uses ExecuTorch’s smaller implementation of tensor (`executorch::runtime::etensor::Tensor`) and related types, such as `executorch::runtime::etensor::ScalarType`.
- ATen kernels that rely on the full `at::Tensor` API are usable in this configuration.
- ATen kernels tend to do dynamic memory allocation and often have extra flexibility (and thus overhead) to handle cases not needed by mobile/embedded clients. e.g., CUDA support, sparse tensor support, and dtype promotion.
- Note: ATen mode is currently a WIP.
@@ -244,10 +244,10 @@ Kernels that support a subset of tensor dtypes and/or dim orders.

Parts of a model may be delegated to run on an optimized backend. The partitioner splits the graph into the appropriate sub-networks and tags them for delegation.

## Portable mode (lean mode)
## ETensor mode

Portable mode uses ExecuTorch’s smaller implementation of tensor (`torch::executor::Tensor`) along with related types (`torch::executor::ScalarType`, etc.). This is in contrast to ATen mode, which uses the ATen implementation of Tensor (`at::Tensor`) and related types (`ScalarType`, etc.)
- `torch::executor::Tensor`, also known as ETensor, is a source-compatible subset of `at::Tensor`. Code written against ETensor can build against `at::Tensor`.
ETensor mode uses ExecuTorch’s smaller implementation of tensor (`executorch::runtime::etensor::Tensor`) along with related types (`executorch::runtime::etensor::ScalarType`, etc.). This is in contrast to ATen mode, which uses the ATen implementation of Tensor (`at::Tensor`) and related types (`ScalarType`, etc.).
- `executorch::runtime::etensor::Tensor`, also known as ETensor, is a source-compatible subset of `at::Tensor`. Code written against ETensor can build against `at::Tensor`.
- ETensor does not own or allocate memory on its own. To support dynamic shapes, kernels can allocate Tensor data using the MemoryAllocator provided by the client.
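
A minimal sketch of what that source compatibility looks like in practice (an editor's illustration, not part of this commit; the build-time aliasing described in the comment is an assumption):

```cpp
#include <cstdint>

#include <executorch/runtime/core/exec_aten/exec_aten.h>

// executorch::aten::Tensor resolves to at::Tensor in ATen mode and to the
// smaller executorch::runtime::etensor::Tensor otherwise (assumed to be a
// build-time selection). Code that sticks to the shared ETensor subset of
// the API compiles in both configurations.
int64_t element_count(const executorch::aten::Tensor& tensor) {
  return tensor.numel();  // numel() is part of the shared subset
}
```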

## Portable kernels
docs/source/etdump.md: 2 changes (1 addition, 1 deletion)
@@ -15,7 +15,7 @@ Generating an ETDump is a relatively straightforward process. Users can follow t
2. ***Create*** an instance of the ETDumpGen class and pass it into the `load_method` call that is invoked in the runtime.

```C++
torch::executor::ETDumpGen etdump_gen = torch::executor::ETDumpGen();
executorch::etdump::ETDumpGen etdump_gen;
Result<Method> method =
program->load_method(method_name, &memory_manager, &etdump_gen);
```
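
After the method runs, the collected ETDump data is typically retrieved from the generator and written to a file so the devtools can consume it. A rough sketch (the result's `buf`/`size` field names are assumptions about the `get_etdump_data()` return type):

```C++
#include <cstdio>
#include <cstdlib>

// Editor's sketch: fetch the ETDump buffer and dump it to disk.
auto result = etdump_gen.get_etdump_data();
if (result.buf != nullptr && result.size > 0) {
  FILE* f = fopen("model.etdump", "wb");
  fwrite(result.buf, 1, result.size, f);
  fclose(f);
  free(result.buf);
}
```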
docs/source/executorch-runtime-api-reference.rst: 16 changes (8 additions, 8 deletions)
@@ -11,25 +11,25 @@ For detailed information on how APIs evolve and the deprecation process, please
Model Loading and Execution
---------------------------

.. doxygenclass:: executorch::runtime::DataLoader
.. doxygenclass:: executorch::runtime::Program
:members:

.. doxygenclass:: executorch::runtime::MemoryAllocator
.. doxygenclass:: executorch::runtime::Method
:members:

.. doxygenclass:: executorch::runtime::HierarchicalAllocator
.. doxygenclass:: executorch::runtime::MethodMeta
:members:

.. doxygenclass:: executorch::runtime::MemoryManager
.. doxygenclass:: executorch::runtime::DataLoader
:members:

.. doxygenclass:: executorch::runtime::Program
.. doxygenclass:: executorch::runtime::MemoryAllocator
:members:

.. doxygenclass:: executorch::runtime::Method
.. doxygenclass:: executorch::runtime::HierarchicalAllocator
:members:

.. doxygenclass:: executorch::runtime::MethodMeta
.. doxygenclass:: executorch::runtime::MemoryManager
:members:

Values
@@ -38,5 +38,5 @@
.. doxygenstruct:: executorch::runtime::EValue
:members:

.. doxygenclass:: executorch::aten::Tensor
.. doxygenclass:: executorch::runtime::etensor::Tensor
:members:
docs/source/llm/getting-started.md: 162 changes (81 additions, 81 deletions)
@@ -208,8 +208,8 @@ Create a file called main.cpp with the following contents:
#include <executorch/runtime/core/exec_aten/exec_aten.h>
#include <executorch/runtime/core/result.h>

using exec_aten::ScalarType;
using exec_aten::Tensor;
using executorch::aten::ScalarType;
using executorch::aten::Tensor;
using executorch::extension::from_blob;
using executorch::extension::Module;
using executorch::runtime::EValue;
@@ -235,56 +235,56 @@ std::string generate(
BasicSampler& sampler,
size_t max_input_length,
size_t max_output_length) {
// Convert the input text into a list of integers (tokens) that represents it,
// using the string-to-token mapping that the model was trained on. Each token
// is an integer that represents a word or part of a word.
std::vector<int64_t> input_tokens = tokenizer.encode(prompt);
std::vector<int64_t> output_tokens;

for (auto i = 0u; i < max_output_length; i++) {
// Convert the input_tokens from a vector of int64_t to EValue. EValue is a
// unified data type in the ExecuTorch runtime.
auto inputs = from_blob(
input_tokens.data(),
{1, static_cast<int>(input_tokens.size())},
ScalarType::Long);

// Run the model. It will return a tensor of logits (log-probabilities).
auto logits_evalue = llm_model.forward(inputs);

// Convert the output logits from EValue to std::vector, which is what the
// sampler expects.
Tensor logits_tensor = logits_evalue.get()[0].toTensor();
std::vector<float> logits(
logits_tensor.data_ptr<float>(),
logits_tensor.data_ptr<float>() + logits_tensor.numel());

// Sample the next token from the logits.
int64_t next_token = sampler.sample(logits);

// Break if we reached the end of the text.
if (next_token == ENDOFTEXT_TOKEN) {
break;
}

// Add the next token to the output.
output_tokens.push_back(next_token);

std::cout << tokenizer.decode({next_token});
std::cout.flush();

// Convert the input text into a list of integers (tokens) that represents
// it, using the string-to-token mapping that the model was trained on.
// Each token is an integer that represents a word or part of a word.
std::vector<int64_t> input_tokens = tokenizer.encode(prompt);
std::vector<int64_t> output_tokens;

for (auto i = 0u; i < max_output_length; i++) {
// Convert the input_tokens from a vector of int64_t to EValue.
// EValue is a unified data type in the ExecuTorch runtime.
auto inputs = from_blob(
input_tokens.data(),
{1, static_cast<int>(input_tokens.size())},
ScalarType::Long);

// Run the model. It will return a tensor of logits (log-probabilities).
auto logits_evalue = llm_model.forward(inputs);

// Convert the output logits from EValue to std::vector, which is what
// the sampler expects.
Tensor logits_tensor = logits_evalue.get()[0].toTensor();
std::vector<float> logits(logits_tensor.data_ptr<float>(),
logits_tensor.data_ptr<float>() + logits_tensor.numel());

// Sample the next token from the logits.
int64_t next_token = sampler.sample(logits);

// Break if we reached the end of the text.
if (next_token == ENDOFTEXT_TOKEN) {
break;
}

// Add the next token to the output.
output_tokens.push_back(next_token);

std::cout << tokenizer.decode({ next_token });
std::cout.flush();

// Update next input.
input_tokens.push_back(next_token);
if (input_tokens.size() > max_input_length) {
input_tokens.erase(input_tokens.begin());
}
// Update next input.
input_tokens.push_back(next_token);
if (input_tokens.size() > max_input_length) {
input_tokens.erase(input_tokens.begin());
}
}

std::cout << std::endl;
std::cout << std::endl;

// Convert the output tokens into a human-readable string.
std::string output_string = tokenizer.decode(output_tokens);
return output_string;
// Convert the output tokens into a human-readable string.
std::string output_string = tokenizer.decode(output_tokens);
return output_string;
}
```
@@ -309,32 +309,32 @@ penalties for repeated tokens, and biases to prioritize or de-prioritize specifi
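
For intuition, the simplest strategy is a greedy argmax over the logits; temperature, repetition penalties, and biases would adjust the logits before this step (editor's sketch; the tutorial's `BasicSampler` is not necessarily implemented this way):

```cpp
#include <algorithm>
#include <cstdint>
#include <iterator>
#include <vector>

// Greedy sampling: return the index of the largest logit.
int64_t greedy_sample(const std::vector<float>& logits) {
  return std::distance(
      logits.begin(), std::max_element(logits.begin(), logits.end()));
}
```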
```cpp
// main.cpp
using namespace torch::executor;
int main() {
// Set up the prompt. This provides the seed text for the model to elaborate.
std::cout << "Enter model prompt: ";
std::string prompt;
std::getline(std::cin, prompt);
// The tokenizer is used to convert between tokens (used by the model) and
// human-readable strings.
BasicTokenizer tokenizer("vocab.json");
// The sampler is used to sample the next token from the logits.
BasicSampler sampler = BasicSampler();
// Load the exported nanoGPT program, which was generated via the previous steps.
Module model("nanogpt.pte", Module::LoadMode::MmapUseMlockIgnoreErrors);
const auto max_input_tokens = 1024;
const auto max_output_tokens = 30;
std::cout << prompt;
generate(model, prompt, tokenizer, sampler, max_input_tokens, max_output_tokens);
// Set up the prompt. This provides the seed text for the model to elaborate.
std::cout << "Enter model prompt: ";
std::string prompt;
std::getline(std::cin, prompt);
// The tokenizer is used to convert between tokens (used by the model) and
// human-readable strings.
BasicTokenizer tokenizer("vocab.json");
// The sampler is used to sample the next token from the logits.
BasicSampler sampler = BasicSampler();
// Load the exported nanoGPT program, which was generated via the previous
// steps.
Module model("nanogpt.pte", Module::LoadMode::MmapUseMlockIgnoreErrors);
const auto max_input_tokens = 1024;
const auto max_output_tokens = 30;
std::cout << prompt;
generate(
model, prompt, tokenizer, sampler, max_input_tokens, max_output_tokens);
}
```

Finally, download the following files into the same directory as main.h:
Finally, download the following files into the same directory as main.cpp:

```
curl -O https://raw.githubusercontent.com/pytorch/executorch/main/examples/llm_manual/basic_sampler.h
@@ -524,20 +524,20 @@ option(EXECUTORCH_BUILD_XNNPACK "" ON) # Build with Xnnpack backend
# Include the executorch subdirectory.
add_subdirectory(
${CMAKE_CURRENT_SOURCE_DIR}/third-party/executorch
${CMAKE_BINARY_DIR}/executorch)
# include_directories(${CMAKE_CURRENT_SOURCE_DIR}/src)
${CMAKE_CURRENT_SOURCE_DIR}/third-party/executorch
${CMAKE_BINARY_DIR}/executorch
)
add_executable(nanogpt_runner main.cpp)
target_link_libraries(
nanogpt_runner
PRIVATE
executorch
extension_module_static # Provides the Module class
extension_tensor # Provides the TensorPtr class
optimized_native_cpu_ops_lib # Provides baseline cross-platform kernels
xnnpack_backend) # Provides the XNNPACK CPU acceleration backend
nanogpt_runner
PRIVATE executorch
extension_module_static # Provides the Module class
extension_tensor # Provides the TensorPtr class
optimized_native_cpu_ops_lib # Provides baseline cross-platform
# kernels
xnnpack_backend # Provides the XNNPACK CPU acceleration backend
)
```

Keep the rest of the code the same. For more details refer to [Exporting
