Releases: pipecat-ai/pipecat
v0.0.39
Fixed
- Fixed a regression introduced in 0.0.38 that would cause Daily transcription to stop the Pipeline.
v0.0.38
Added
- Added `force_reload`, `skip_validation` and `trust_repo` to `SileroVAD` and `SileroVADAnalyzer`. This allows caching and various GitHub repo validations (see the sketch after this list).
- Added a `send_initial_empty_metrics` flag to `PipelineParams` to request initial empty metrics (zero values). True by default.
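A minimal sketch of how these new options might be passed, assuming the import paths of this release and that the constructor keywords match the flag names above; this is not taken from the release notes.

```python
# Sketch only: import paths and keyword names are assumptions based on the
# entries above, not confirmed by the release notes.
from pipecat.pipeline.task import PipelineParams
from pipecat.vad.silero import SileroVADAnalyzer

vad = SileroVADAnalyzer(
    force_reload=False,    # reuse the cached torch.hub download
    skip_validation=True,  # skip the GitHub repo validation step
    trust_repo=True,       # suppress the torch.hub trust prompt
)

params = PipelineParams(
    enable_metrics=True,
    send_initial_empty_metrics=False,  # opt out of the initial zero-value metrics
)
```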
Fixed
- Fixed the initial metrics format. It was using the wrong keys (name/time instead of processor/value).
- STT services now use the ISO 8601 time format for transcription frames.
- Fixed an issue that would cause the Daily transport to show a stop transcription error when none actually occurred.
v0.0.37
Added
- Added `RTVIProcessor`, which implements the RTVI-AI standard. See https://github.com/rtvi-ai
- Added `BotInterruptionFrame`, which allows interrupting the bot while it is talking.
- Added `LLMMessagesAppendFrame`, which allows appending messages to the current LLM context (see the sketch after this list).
- Added `LLMMessagesUpdateFrame`, which allows replacing the current LLM context with the one provided in this new frame.
- Added `LLMModelUpdateFrame`, which allows updating the LLM model.
- Added `TTSSpeakFrame`, which causes the bot to say some text. This text will not be part of the LLM context.
- Added `TTSVoiceUpdateFrame`, which allows updating the TTS voice.
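A minimal sketch of how these frames might be queued into a running `PipelineTask`; the frame constructor arguments are assumptions based on the frame names above, not an excerpt from the library.

```python
# Sketch only: frame constructor signatures are assumed from the entries above.
from pipecat.frames.frames import (
    BotInterruptionFrame,
    LLMMessagesAppendFrame,
    TTSSpeakFrame,
)

async def nudge_bot(task):
    # Interrupt whatever the bot is currently saying.
    await task.queue_frame(BotInterruptionFrame())
    # Append a system message to the current LLM context.
    await task.queue_frame(
        LLMMessagesAppendFrame([{"role": "system", "content": "Be brief."}])
    )
    # Say some text directly; it is not added to the LLM context.
    await task.queue_frame(TTSSpeakFrame("One moment, please."))
```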
Removed
- Removed the `LLMResponseStartFrame` and `LLMResponseEndFrame` frames. These were added in the past to properly handle interruptions for the `LLMAssistantContextAggregator`, but the `LLMContextAggregator` is now based on `LLMResponseAggregator`, which handles interruptions properly by just processing the `StartInterruptionFrame`, so there's no need for these extra frames anymore.
Fixed
- Fixed an issue with `StatelessTextTransformer` where it was pushing a string instead of a `TextFrame`.
- `TTSService` end-of-sentence detection has been improved. It now works with acronyms, numbers, hours and more.
- Fixed an issue in `TTSService` that would not properly flush the current aggregated sentence if an `LLMFullResponseEndFrame` was found.
Performance
- `CartesiaTTSService` now uses websockets, which improves speed. It also leverages the new Cartesia contexts, which maintain generated audio prosody when multiple inputs are sent, greatly improving audio quality.
v0.0.36
Added
- Added `GladiaSTTService`. https://docs.gladia.io/chapters/speech-to-text-api/pages/live-speech-recognition
- Added `XTTSService`. This is a local Text-To-Speech service. https://github.com/coqui-ai/TTS
- Added `UserIdleProcessor`. This processor can be used to wait for any interaction with the user. If the user doesn't say anything within a given timeout, a provided callback is called (see the sketch after this list).
- Added `IdleFrameProcessor`. This processor can be used to wait for frames within a given timeout. If no frame is received within the timeout, a provided callback is called.
- Added a new frame, `BotSpeakingFrame`. This frame will be continuously pushed upstream while the bot is talking.
- It is now possible to specify a Silero VAD version when using `SileroVADAnalyzer` or `SileroVAD`.
- Added `AsyncFrameProcessor` and `AsyncAIService`. Some services, like `DeepgramSTTService`, need to process things asynchronously. For example, audio is sent to Deepgram but transcriptions are not returned immediately. In these cases we still require all frames (except system frames) to be pushed downstream from a single task. That's what `AsyncFrameProcessor` is for: it creates a task, and all frames should be pushed from that task. So, whenever a new Deepgram transcription is ready, that transcription will also be pushed from this internal task.
- The `MetricsFrame` now includes processing metrics if metrics are enabled. The processing metrics indicate the time a processor needs to generate all its output. Note that not all processors generate these kinds of metrics.
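A minimal sketch of wiring up `UserIdleProcessor`; the import path, callback signature and keyword names are assumptions (the `17-detect-user-idle.py` example mentioned below shows the real usage).

```python
# Sketch only: callback signature and keyword names are assumed, not confirmed.
from pipecat.frames.frames import TTSSpeakFrame
from pipecat.processors.user_idle_processor import UserIdleProcessor

async def handle_user_idle(processor):
    # Called when the user hasn't said anything within the timeout.
    await processor.push_frame(TTSSpeakFrame("Are you still there?"))

user_idle = UserIdleProcessor(callback=handle_user_idle, timeout=5.0)
```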
Changed
- `WhisperSTTService` model can now also be a string.
- Added missing `*` keyword separators in services.
Fixed
- `WebsocketServerTransport` no longer tries to send frames if the serializer returns `None`.
- Fixed an issue where exceptions that occurred inside frame processors were being swallowed and not displayed.
- Fixed an issue in `FastAPIWebsocketTransport` where it would still try to send data to the websocket after being closed.
Other
- Added a Fly.io deployment example in `examples/deployment/flyio-example`.
- Added a new `17-detect-user-idle.py` example that shows how to use the new `UserIdleProcessor`.
v0.0.35
Changed
- `FastAPIWebsocketParams` now requires a serializer.
- `TwilioFrameSerializer` now requires a `streamSid`.
Fixed
- The Silero VAD number of frames needs to be 512 for a 16000 sample rate or 256 for an 8000 sample rate.
v0.0.34
Fixed
- Fixed an issue with asynchronous STT services (Deepgram and Azure) that could cause interruptions to ignore transcriptions.
- Fixed an issue introduced in 0.0.33 that would cause the LLM to generate shorter output.
v0.0.33
Changed
- Upgraded to Cartesia's new Python library 1.0.0. `CartesiaTTSService` now expects a voice ID instead of a voice name (you can get the voice ID from Cartesia's playground). You can also specify the audio `sample_rate` and `encoding` instead of the previous `output_format` (see the sketch below).
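A hypothetical sketch of configuring `CartesiaTTSService` after this upgrade; the import path and keyword names follow the entry above but are assumptions, not an excerpt from the library.

```python
# Sketch only: constructor keywords are assumed from the entry above.
import os

from pipecat.services.cartesia import CartesiaTTSService

tts = CartesiaTTSService(
    api_key=os.getenv("CARTESIA_API_KEY"),
    voice_id="your-voice-id",  # a voice ID from Cartesia's playground
    sample_rate=16000,
    encoding="pcm_s16le",
)
```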
Fixed
- Fixed an issue with asynchronous STT services (Deepgram and Azure) that could cause static audio issues and prevent interruptions from working properly when dealing with multiple LLM sentences.
- Fixed an issue that could mix new LLM responses with previous ones when handling interruptions.
- Fixed a Daily transport blocking situation that occurred while reading audio frames after a participant left the room. Needs daily-python >= 0.10.1.
v0.0.32
Added
- Allow specifying a `DeepgramSTTService` url, which allows using on-prem Deepgram.
- Added new `FastAPIWebsocketTransport`. This is a new websocket transport that can be integrated with FastAPI websockets (see the sketch after this list).
- Added new `TwilioFrameSerializer`. This is a new serializer that knows how to serialize and deserialize audio frames from Twilio.
- Added Daily transport event: `on_dialout_answered`. See https://reference-python.daily.co/api_reference.html#daily.EventHandler
- Added new `AzureSTTService`. This allows you to use Azure Speech-To-Text.
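A hypothetical sketch of wiring `FastAPIWebsocketTransport` and `TwilioFrameSerializer` together inside a FastAPI route; the import paths, parameter names and serializer constructor are assumptions based on the entries above (the `twilio-chatbot` example below shows the real integration).

```python
# Sketch only: import paths and keyword names are assumed, not confirmed.
from fastapi import FastAPI, WebSocket
from pipecat.serializers.twilio import TwilioFrameSerializer
from pipecat.transports.network.fastapi_websocket import (
    FastAPIWebsocketParams,
    FastAPIWebsocketTransport,
)

app = FastAPI()

@app.websocket("/ws")
async def websocket_endpoint(websocket: WebSocket):
    await websocket.accept()
    stream_sid = "MZ..."  # hypothetical: read the streamSid from Twilio's start message
    transport = FastAPIWebsocketTransport(
        websocket=websocket,
        params=FastAPIWebsocketParams(
            audio_out_enabled=True,
            serializer=TwilioFrameSerializer(stream_sid),
        ),
    )
    # ...build and run a Pipeline using transport.input() / transport.output()
```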
Performance
- Converted `BaseInputTransport` and `BaseOutputTransport` to fully use asyncio and removed the use of threads.
Other
- Added `twilio-chatbot`. This is an example that shows how to integrate Twilio phone numbers with a Pipecat bot.
- Updated `07f-interruptible-azure.py` to use `AzureLLMService`, `AzureSTTService` and `AzureTTSService`.
v0.0.31
Performance
- Break long audio frames into 20ms chunks instead of 10ms.
v0.0.30
Added
- Added `report_only_initial_ttfb` to `PipelineParams`. This will make it so only the initial TTFB metrics after the user stops talking are reported.
- Added `OpenPipeLLMService`. This service will let you run OpenAI through OpenPipe's SDK.
- Allow specifying frame processors' name through a new `name` constructor argument.
- Added `DeepgramSTTService` (see the sketch after this list). This service has an ongoing websocket connection. To handle this, it subclasses `AIService` instead of `STTService`. The output of this service will be pushed from the same task, except for system frames like `StartFrame`, `CancelFrame` or `StartInterruptionFrame`.
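A minimal sketch of constructing the new service with the new `name` argument and the TTFB flag; the import path and keyword names are assumptions based on the entries above.

```python
# Sketch only: keyword names are assumed from the entries above.
import os

from pipecat.pipeline.task import PipelineParams
from pipecat.services.deepgram import DeepgramSTTService

stt = DeepgramSTTService(
    api_key=os.getenv("DEEPGRAM_API_KEY"),
    name="deepgram-stt",  # the new `name` constructor argument
)

params = PipelineParams(report_only_initial_ttfb=True)
```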
Changed
- `FrameSerializer.deserialize()` can now return `None` in case it is not possible to deserialize the given data.
- `daily_rest.DailyRoomProperties` now allows extra unknown parameters.
Fixed
- Fixed an issue where `DailyRoomProperties.exp` always had the same old timestamp unless set by the user.
- Fixed a couple of issues with `WebsocketServerTransport`: it needed to use `push_audio_frame()`, and VAD was not working properly.
- Fixed an issue that would cause the LLM aggregator to fail with small `VADParams.stop_secs` values.
- Fixed an issue where `BaseOutputTransport` would send longer audio frames, preventing interruptions.
Other
- Added a new `07h-interruptible-openpipe.py` example. This example shows how to use OpenPipe to run OpenAI LLMs and get the logs stored in OpenPipe.
- Added a new `dialin-chatbot` example. This example shows how to call the bot using a phone number.