From 35e30a9297f7c1a96b3ed4fcb08eddb478ed83a9 Mon Sep 17 00:00:00 2001
From: Michael Gschwind <61328285+mikekgfb@users.noreply.github.com>
Date: Mon, 14 Oct 2024 21:41:13 -0700
Subject: [PATCH] Fix dangling open skip in README.md

1 - Fix an extraneous skip end that is out of order with a skip begin.
2 - Fix some typos.

PS: This might cause some README tests to fail, as they have not been run in a long time.
---
 README.md | 13 +++++++------
 1 file changed, 7 insertions(+), 6 deletions(-)

diff --git a/README.md b/README.md
index 4f58f714c..6e8eaa061 100644
--- a/README.md
+++ b/README.md
@@ -171,7 +171,7 @@ python3 torchchat.py download llama3.1
 Additional Model Inventory Management Commands
 
 ### Where
-This subcommand shows location of a particular model.
+This subcommand shows the location of a particular model.
 ```bash
 python3 torchchat.py where llama3.1
 ```
@@ -216,7 +216,6 @@ This mode generates text based on an input prompt.
 python3 torchchat.py generate llama3.1 --prompt "write me a story about a boy and his bear"
 ```
 
-[skip default]: end
 
 ### Server
 This mode exposes a REST API for interacting with a model.
@@ -286,6 +285,8 @@ First, follow the steps in the Server section above to start a local server. The
 streamlit run torchchat/usages/browser.py
 ```
 
+[skip default]: end
+
 Use the "Max Response Tokens" slider to limit the maximum number of tokens generated by the model for each response.
 
 Click the "Reset Chat" button to remove the message history and start a fresh chat.
@@ -293,7 +294,7 @@ Use the "Max Response Tokens" slider to limit the maximum number of tokens gener
 
 ### AOTI (AOT Inductor)
 [AOTI](https://pytorch.org/blog/pytorch2-2/) compiles models before execution for faster inference. The process creates a [DSO](https://en.wikipedia.org/wiki/Shared_library) model (represented by a file with extension `.so`)
-that is then loaded for inference. This can be done with both Python and C++ enviroments.
+that is then loaded for inference. This can be done with both Python and C++ environments.
 
 The following example exports and executes the Llama3.1 8B Instruct model. The first command compiles and performs the actual export.
 
@@ -308,9 +309,9 @@ python3 torchchat.py export llama3.1 --output-dso-path exportedModels/llama3.1.s
 For more details on quantization and what settings to use for your use case visit our [customization guide](docs/model_customization.md).
 
-### Run in a Python Enviroment
+### Run in a Python Environment
 
-To run in a python enviroment, use the generate subcommand like before, but include the dso file.
+To run in a python environment, use the generate subcommand like before, but include the dso file.
 
 ```
 python3 torchchat.py generate llama3.1 --dso-path exportedModels/llama3.1.so --prompt "Hello my name is"
 ```
@@ -377,7 +378,7 @@ While ExecuTorch does not focus on desktop inference, it is capable of doing so.
 This is handy for testing out PTE models without sending them to a physical device.
 
-Specifically there are 2 ways of doing so: Pure Python and via a Runner
+Specifically, there are 2 ways of doing so: Pure Python and via a Runner
 
 Deploying via Python