Hi,
Can I please have some help with my ESPHome voice assistant?
It’s detecting my wake word, but I can’t hear any response through the speaker, even though the ESPHome logs suggest that it’s trying to output something.
Assist Configuration
ESPHome Configuration
substitutions:
  name: esphome-voice-satellite-dev
  friendly_name: ESPHome Voice Satellite Dev
esphome:
  name: ${name}
  friendly_name: ${friendly_name}
  name_add_mac_suffix: false
  platformio_options:
    board_build.flash_mode: dio
  project:
    name: "dan.voice_assistant"
    version: '1.0'
  min_version: 2023.11.5
esp32:
  board: esp32-s3-devkitc-1
  variant: esp32s3 # This shouldn't be needed.
  flash_size: 16MB
  framework:
    type: esp-idf #arduino
    version: recommended #4.4.6
    sdkconfig_options:
      CONFIG_ESP32S3_DEFAULT_CPU_FREQ_240: "y"
      CONFIG_ESP32S3_DATA_CACHE_64KB: "y"
      CONFIG_ESP32S3_DATA_CACHE_LINE_64B: "y"
      CONFIG_AUDIO_BOARD_CUSTOM: "y"
psram:
  mode: octal
  speed: 80MHz
# Enable logging
logger:
# Enable Home Assistant API
api:
  on_client_connected:
    then:
      - delay: 50ms
      # - light.turn_off: led_ww
      - micro_wake_word.start:
  on_client_disconnected:
    then:
      - voice_assistant.stop:
# Allow Over-The-Air updates
ota:
  - platform: esphome
    password: !secret ota_password
# Allow provisioning Wi-Fi via serial
improv_serial:
wifi:
  ssid: !secret wifi_iot_ssid
  password: !secret wifi_iot_password
  # Set up a wifi access point
  ap: {}
# In combination with the `ap` this allows the user
# to provision wifi credentials to the device via WiFi AP.
captive_portal:
dashboard_import:
  package_import_url: github://esphome/firmware/esphome-web/esp32s3.yaml@v2
  import_full_config: true
# Sets up Bluetooth LE (Only on ESP32) to allow the user
# to provision wifi credentials to the device.
esp32_improv:
  authorizer: none
# To have a "next url" for improv serial
web_server:
i2s_audio:
  - id: i2s_mic
    i2s_lrclk_pin: GPIO3 #WS
    i2s_bclk_pin: GPIO5 #SCK
  - id: i2s_speaker
    i2s_lrclk_pin: GPIO6 #LRC
    i2s_bclk_pin: GPIO7 #BCLK
    #id: i2s_main
    #i2s_lrclk_pin: GPIO7
    #i2s_bclk_pin: GPIO6
    #access_mode: duplex
microphone:
  - platform: i2s_audio
    id: va_mic
    i2s_audio_id: i2s_mic
    adc_type: external
    i2s_din_pin: GPIO4 # SD Pin of INMP441 Microphone
    channel: left # worked without this?
    pdm: false
    bits_per_sample: 32 bit
speaker:
  - platform: i2s_audio
    id: va_speaker
    i2s_audio_id: i2s_speaker
    dac_type: external
    i2s_dout_pin: GPIO8 # DIN Pin of the MAX98357A Audio Amplifier
    mode: mono
micro_wake_word:
  on_wake_word_detected:
    # then:
    - voice_assistant.start:
        wake_word: !lambda return wake_word;
        silence_detection: true # defaults to true.
    # - light.turn_on:
    #     id: led_ww
    #     red: 30%
    #     green: 30%
    #     blue: 70%
    #     brightness: 60%
    #     effect: fast pulse
  model: hey_jarvis
voice_assistant:
  # use_wake_word: false
  id: va
  microphone: va_mic
  auto_gain: 31dBFS
  noise_suppression_level: 2
  volume_multiplier: 2.0 #2.0
  speaker: va_speaker
  on_stt_end:
    then:
      # - light.turn_off: led_ww
  on_error:
    - micro_wake_word.start:
  on_end:
    then:
      # - light.turn_off: led_ww
      - wait_until:
          not:
            voice_assistant.is_running:
      - micro_wake_word.start:
ESPHome logs
“Hey Jarvis, what’s the time?”
[18:56:56][D][micro_wake_word:363]: Wake word sliding average probability is 0.574 and most recent probability is 0.957
[18:56:56][D][micro_wake_word:129]: Wake Word Detected
[18:56:56][D][micro_wake_word:178]: State changed from DETECTING_WAKE_WORD to STOP_MICROPHONE
[18:56:56][D][micro_wake_word:135]: Stopping Microphone
[18:56:56][D][micro_wake_word:178]: State changed from STOP_MICROPHONE to STOPPING_MICROPHONE
[18:56:56][D][esp-idf:000]: I (4556305) I2S: DMA queue destroyed
[18:56:56]
[18:56:56][D][micro_wake_word:178]: State changed from STOPPING_MICROPHONE to IDLE
[18:56:56][D][voice_assistant:504]: State changed from IDLE to START_MICROPHONE
[18:56:56][D][voice_assistant:510]: Desired state set to START_PIPELINE
[18:56:56][D][voice_assistant:221]: Starting Microphone
[18:56:56][D][ring_buffer:024]: Created ring buffer with size 16384
[18:56:56][D][voice_assistant:504]: State changed from START_MICROPHONE to STARTING_MICROPHONE
[18:56:56][D][esp-idf:000]: I (4556311) I2S: DMA Malloc info, datalen=blocksize=1024, dma_buf_count=4
[18:56:56]
[18:56:56][D][voice_assistant:504]: State changed from STARTING_MICROPHONE to START_PIPELINE
[18:56:56][D][voice_assistant:275]: Requesting start...
[18:56:56][D][voice_assistant:504]: State changed from START_PIPELINE to STARTING_PIPELINE
[18:56:56][D][voice_assistant:525]: Client started, streaming microphone
[18:56:56][D][voice_assistant:504]: State changed from STARTING_PIPELINE to STREAMING_MICROPHONE
[18:56:56][D][voice_assistant:510]: Desired state set to STREAMING_MICROPHONE
[18:56:56][D][voice_assistant:627]: Event Type: 1
[18:56:56][D][voice_assistant:630]: Assist Pipeline running
[18:56:56][D][voice_assistant:627]: Event Type: 3
[18:56:56][D][voice_assistant:641]: STT started
[18:56:57][D][voice_assistant:627]: Event Type: 11
[18:56:57][D][voice_assistant:781]: Starting STT by VAD
[18:56:58][D][voice_assistant:627]: Event Type: 12
[18:56:58][D][voice_assistant:785]: STT by VAD end
[18:56:58][D][voice_assistant:504]: State changed from STREAMING_MICROPHONE to STOP_MICROPHONE
[18:56:58][D][voice_assistant:510]: Desired state set to AWAITING_RESPONSE
[18:56:58][D][voice_assistant:504]: State changed from STOP_MICROPHONE to STOPPING_MICROPHONE
[18:56:58][D][esp-idf:000]: I (4558783) I2S: DMA queue destroyed
[18:56:58]
[18:56:58][D][voice_assistant:504]: State changed from STOPPING_MICROPHONE to AWAITING_RESPONSE
[18:57:04][D][voice_assistant:627]: Event Type: 4
[18:57:04][D][voice_assistant:655]: Speech recognised as: " What's the time?"
[18:57:04][D][voice_assistant:627]: Event Type: 5
[18:57:04][D][voice_assistant:660]: Intent started
[18:57:06][D][voice_assistant:627]: Event Type: 6
[18:57:06][D][voice_assistant:627]: Event Type: 7
[18:57:06][D][voice_assistant:683]: Response: "Sorry, I am not aware of any device called time?"
[18:57:06][D][voice_assistant:627]: Event Type: 98
[18:57:06][D][voice_assistant:768]: TTS stream start
[18:57:06][D][esp-idf:000][speaker_task]: I (4567203) I2S: DMA Malloc info, datalen=blocksize=512, dma_buf_count=8
[18:57:06]
[18:57:06][D][voice_assistant:627]: Event Type: 2
[18:57:06][D][voice_assistant:717]: Assist Pipeline ended
[18:57:06][D][i2s_audio.speaker:206]: Started I2S Audio Speaker
[18:57:09][D][voice_assistant:627]: Event Type: 99
[18:57:09][D][voice_assistant:776]: TTS stream end
[18:57:09][D][voice_assistant:375]: End of audio stream received
[18:57:09][D][voice_assistant:504]: State changed from STREAMING_RESPONSE to RESPONSE_FINISHED
[18:57:09][D][voice_assistant:510]: Desired state set to RESPONSE_FINISHED
[18:57:10][D][i2s_audio.speaker:210]: Stopping I2S Audio Speaker
[18:57:10][D][i2s_audio.speaker:222]: Stopped I2S Audio Speaker
[18:57:10][D][voice_assistant:407]: Speaker has finished outputting all audio
[18:57:10][D][voice_assistant:504]: State changed from RESPONSE_FINISHED to IDLE
[18:57:10][D][voice_assistant:510]: Desired state set to IDLE
[18:57:10][D][micro_wake_word:178]: State changed from IDLE to START_MICROPHONE
[18:57:10][D][micro_wake_word:116]: Starting Microphone
[18:57:10][D][micro_wake_word:178]: State changed from START_MICROPHONE to STARTING_MICROPHONE
[18:57:10][D][esp-idf:000]: I (4570425) I2S: DMA Malloc info, datalen=blocksize=1024, dma_buf_count=4
[18:57:10]
[18:57:10][D][micro_wake_word:178]: State changed from STARTING_MICROPHONE to DETECTING_WAKE_WORD
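In case it helps anyone reading the log, here is a rough Python sketch for pulling the speaker start/stop lines out of it and measuring how long the I2S speaker was active. The names `LOG_LINE`, `parse_log` and `speaker_active_seconds` are just made up for this example, and the regex assumes every log line looks like the ones above. It only shows that the speaker component ran, not that any sound actually reached the amplifier.

```python
import re
from datetime import datetime

# Assumed line shape, based only on the log above:
# [HH:MM:SS][level][component:line][optional task]: message
LOG_LINE = re.compile(
    r"^\[(\d{2}:\d{2}:\d{2})\]\[\w\]\[([\w.\-]+):\d+\](?:\[[^\]]*\])?: (.*)$"
)


def parse_log(text):
    """Return (time, component, message) tuples for lines that match."""
    events = []
    for line in text.splitlines():
        match = LOG_LINE.match(line.strip())
        if match:
            when = datetime.strptime(match.group(1), "%H:%M:%S")
            events.append((when, match.group(2), match.group(3)))
    return events


def speaker_active_seconds(events):
    """Seconds between the 'Started' and 'Stopped' speaker messages, or None."""
    start = stop = None
    for when, component, message in events:
        if component != "i2s_audio.speaker":
            continue
        if message.startswith("Started"):
            start = when
        elif message.startswith("Stopped"):
            stop = when
    if start is None or stop is None:
        return None
    return (stop - start).total_seconds()


# A few lines copied from the log above.
sample = """\
[18:57:06][D][voice_assistant:768]: TTS stream start
[18:57:06][D][i2s_audio.speaker:206]: Started I2S Audio Speaker
[18:57:09][D][voice_assistant:776]: TTS stream end
[18:57:10][D][i2s_audio.speaker:210]: Stopping I2S Audio Speaker
[18:57:10][D][i2s_audio.speaker:222]: Stopped I2S Audio Speaker
"""

print(speaker_active_seconds(parse_log(sample)))
```

For the lines above, the speaker component reports being active for a few seconds after "TTS stream start", which is why I think the firmware side is at least attempting playback.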
Hardware
- ESP32-S3 N16R8: https://www.aliexpress.com/item/1005006266375800.html?spm=a2g0o.order_list.order_list_main.5.394318022EFfZ4
- MAX98357: https://www.aliexpress.com/item/1005006382608935.html?spm=a2g0o.order_list.order_list_main.10.394318022EFfZ4
- INMP441: https://www.aliexpress.com/item/1005006109471759.html?spm=a2g0o.order_list.order_list_main.15.394318022EFfZ4
- 3W 4R speaker: https://www.aliexpress.com/item/32860336112.html?spm=a2g0o.order_list.order_list_main.20.394318022EFfZ4
Guides and Resources I used
- ESP32 & ESPHome Voice Assistant · GitHub
- How To Setup On-Device Wake Word Detection For Voice Assistant using Micro Wake Word | Smart Home Circle
- How I Created My Voice Assistant With On-Device Wake Word Detection On ESP32 Using Micro Wake Word | Smart Home Circle
I thought this bug might be relevant, but others seem to have resolved the issue, while I have not.