Hi all, I was experimenting with langchain-nvidia-ai-endpoints==0.0.4 and when I tried querying the ai-llama3-70b model, I got the following error:
---------------------------------------------------------------------------
Exception Traceback (most recent call last)
Cell In[1], line 7
5 # llm = ChatNVIDIA(model="mixtral_8x7b")
6 llm = ChatNVIDIA(model="ai-llama3-70b")
----> 7 result = llm.invoke("How do I query NVIDIA models in LangChain?")
8 print(result.content)
File ~/Documents/venvs/nvaif_env/lib/python3.12/site-packages/langchain_core/language_models/chat_models.py:158, in BaseChatModel.invoke(self, input, config, stop, **kwargs)
147 def invoke(
148 self,
149 input: LanguageModelInput,
(...)
153 **kwargs: Any,
154 ) -> BaseMessage:
155 config = ensure_config(config)
156 return cast(
157 ChatGeneration,
--> 158 self.generate_prompt(
159 [self._convert_input(input)],
160 stop=stop,
161 callbacks=config.get("callbacks"),
162 tags=config.get("tags"),
163 metadata=config.get("metadata"),
164 run_name=config.get("run_name"),
165 run_id=config.pop("run_id", None),
166 **kwargs,
167 ).generations[0][0],
168 ).message
File ~/Documents/venvs/nvaif_env/lib/python3.12/site-packages/langchain_core/language_models/chat_models.py:560, in BaseChatModel.generate_prompt(self, prompts, stop, callbacks, **kwargs)
552 def generate_prompt(
553 self,
554 prompts: List[PromptValue],
(...)
557 **kwargs: Any,
558 ) -> LLMResult:
559 prompt_messages = [p.to_messages() for p in prompts]
--> 560 return self.generate(prompt_messages, stop=stop, callbacks=callbacks, **kwargs)
File ~/Documents/venvs/nvaif_env/lib/python3.12/site-packages/langchain_core/language_models/chat_models.py:421, in BaseChatModel.generate(self, messages, stop, callbacks, tags, metadata, run_name, run_id, **kwargs)
419 if run_managers:
420 run_managers[i].on_llm_error(e, response=LLMResult(generations=[]))
--> 421 raise e
422 flattened_outputs = [
423 LLMResult(generations=[res.generations], llm_output=res.llm_output) # type: ignore[list-item]
424 for res in results
425 ]
426 llm_output = self._combine_llm_outputs([res.llm_output for res in results])
File ~/Documents/venvs/nvaif_env/lib/python3.12/site-packages/langchain_core/language_models/chat_models.py:411, in BaseChatModel.generate(self, messages, stop, callbacks, tags, metadata, run_name, run_id, **kwargs)
408 for i, m in enumerate(messages):
409 try:
410 results.append(
--> 411 self._generate_with_cache(
412 m,
413 stop=stop,
414 run_manager=run_managers[i] if run_managers else None,
415 **kwargs,
416 )
417 )
418 except BaseException as e:
419 if run_managers:
File ~/Documents/venvs/nvaif_env/lib/python3.12/site-packages/langchain_core/language_models/chat_models.py:632, in BaseChatModel._generate_with_cache(self, messages, stop, run_manager, **kwargs)
630 else:
631 if inspect.signature(self._generate).parameters.get("run_manager"):
--> 632 result = self._generate(
633 messages, stop=stop, run_manager=run_manager, **kwargs
634 )
635 else:
636 result = self._generate(messages, stop=stop, **kwargs)
File ~/Documents/venvs/nvaif_env/lib/python3.12/site-packages/langchain_nvidia_ai_endpoints/chat_models.py:155, in ChatNVIDIA._generate(self, messages, stop, run_manager, **kwargs)
148 def _generate(
149 self,
150 messages: List[BaseMessage],
(...)
153 **kwargs: Any,
154 ) -> ChatResult:
--> 155 responses = self._call(messages, stop=stop, run_manager=run_manager, **kwargs)
156 self._set_callback_out(responses, run_manager)
157 message = ChatMessage(**self.custom_postprocess(responses))
File ~/Documents/venvs/nvaif_env/lib/python3.12/site-packages/langchain_nvidia_ai_endpoints/chat_models.py:186, in ChatNVIDIA._call(self, messages, stop, run_manager, **kwargs)
184 """Invoke on a single list of chat messages."""
185 inputs = self.custom_preprocess(messages)
--> 186 responses = self.get_generation(inputs=inputs, stop=stop, **kwargs)
187 return responses
File ~/Documents/venvs/nvaif_env/lib/python3.12/site-packages/langchain_nvidia_ai_endpoints/chat_models.py:310, in ChatNVIDIA.get_generation(self, inputs, **kwargs)
308 stop = kwargs["stop"] = kwargs.get("stop") or self.stop
309 payload = self.get_payload(inputs=inputs, stream=False, **kwargs)
--> 310 out = self.client.get_req_generation(self.model, stop=stop, payload=payload)
311 return out
File ~/Documents/venvs/nvaif_env/lib/python3.12/site-packages/langchain_nvidia_ai_endpoints/_common.py:387, in NVEModel.get_req_generation(self, model_name, payload, invoke_url, stop, endpoint)
385 """Method for an end-to-end post query with NVE post-processing."""
386 invoke_url = self._get_invoke_url(model_name, invoke_url, endpoint=endpoint)
--> 387 response = self.get_req(model_name, payload, invoke_url)
388 output, _ = self.postprocess(response, stop=stop)
389 return output
File ~/Documents/venvs/nvaif_env/lib/python3.12/site-packages/langchain_nvidia_ai_endpoints/_common.py:374, in NVEModel.get_req(self, model_name, payload, invoke_url, stop, endpoint)
372 if payload.get("stream", False) is True:
373 payload = {**payload, "stream": False}
--> 374 response, session = self._post(invoke_url, payload)
375 return self._wait(response, session)
File ~/Documents/venvs/nvaif_env/lib/python3.12/site-packages/langchain_nvidia_ai_endpoints/_common.py:213, in NVEModel._post(self, invoke_url, payload)
211 session = self.get_session_fn()
212 self.last_response = response = session.post(**self.last_inputs)
--> 213 self._try_raise(response)
214 return response, session
File ~/Documents/venvs/nvaif_env/lib/python3.12/site-packages/langchain_nvidia_ai_endpoints/_common.py:285, in NVEModel._try_raise(self, response)
283 if str(status) == "401":
284 body += "\nPlease check or regenerate your API key."
--> 285 raise Exception(f"{header}\n{body}") from None
Exception: [404] Not Found
Inference error
RequestID: d39bb278-1623-4561-a556-0b43547ab10e
It turned out that all I needed to do to query Llama 3 was upgrade to 0.0.8. Is it intentional for newer models to be incompatible with older package versions, even minor ones? This is inconvenient when langchain-nvidia-ai-endpoints is a dependency of another open source package that may not keep up with the latest releases.
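For anyone hitting the same 404, this is roughly what worked for me after the upgrade; the API key value is a placeholder:

```python
# Upgrade first: pip install -U "langchain-nvidia-ai-endpoints>=0.0.8"
import os

from langchain_nvidia_ai_endpoints import ChatNVIDIA

# The client reads the key from the environment; substitute your own key.
os.environ["NVIDIA_API_KEY"] = "nvapi-..."  # placeholder

llm = ChatNVIDIA(model="ai-llama3-70b")
result = llm.invoke("How do I query NVIDIA models in LangChain?")
print(result.content)
```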
Additionally, could it fail with a more informative error that tells the user the model can be queried if they upgrade the package? An alternative would be to keep the GitHub README or LangChain documentation updated with the minimum package version required to query each model. It would also be nice if the output of llm.available_models showed the minimum package version needed to query each model.
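To make the "more informative error" idea concrete, here is a rough sketch of the kind of check I mean. The MIN_VERSION_BY_MODEL table and the helper are hypothetical, not part of the package:

```python
from importlib.metadata import version

from packaging.version import Version  # assumes packaging is available

# Hypothetical mapping from model name to the minimum package release supporting it.
MIN_VERSION_BY_MODEL = {
    "ai-llama3-70b": "0.0.8",
}

def check_model_supported(model_name: str) -> None:
    """Raise an actionable error instead of a bare 404 when the installed
    release is known to be too old for the requested model (hypothetical helper)."""
    minimum = MIN_VERSION_BY_MODEL.get(model_name)
    installed = version("langchain-nvidia-ai-endpoints")
    if minimum and Version(installed) < Version(minimum):
        raise ValueError(
            f"Model '{model_name}' needs langchain-nvidia-ai-endpoints>={minimum}, "
            f"but {installed} is installed. "
            "Try: pip install -U langchain-nvidia-ai-endpoints"
        )
```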
Note that the trace includes a "Please check or regenerate your API key" message, but I did not actually need to do that. I only needed to upgrade the package to successfully query the newer models.
@aishwaryap thank you for reporting this. Currently, supporting new models requires updating a table in the package with invocation information. We're working on automation for this.
As for the more informative error, that is also tracked in #21.
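For readers curious what that per-model table looks like, a purely illustrative entry might resemble the sketch below; the schema shown here is made up, and the real table inside the package may differ:

```python
# Illustrative only: the actual table and field names in
# langchain-nvidia-ai-endpoints may not match this sketch.
MODEL_TABLE = {
    "ai-llama3-70b": {
        "model_type": "chat",   # how the endpoint is invoked
        "api_type": "aifm",     # which backend API the model is served from
    },
}
```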