Catalogue of the Deutsche Nationalbibliothek
Search result for: tit all "Artificial intelligence"
Link to this record | https://d-nb.info/1374506079 |
Title | Text, Speech, and Dialogue : 28th International Conference, TSD 2025, Erlangen, Germany, August 25–28, 2025, Proceedings, Part I / edited by Kamil Ekštein, Miloslav Konopík, Ondřej Pražák, František Pártl |
Person(s) | Ekštein, Kamil (editor); Konopík, Miloslav (editor); Pražák, Ondřej (editor); Pártl, František (editor) |
Organisation(s) | SpringerLink (Online service) (other) |
Edition | 1st ed. 2026 |
Publisher | Cham : Springer Nature Switzerland, Imprint: Springer |
Time period | Publication date: 2026 |
Extent/Format | Online resource, XXI, 409 p., 86 illus., 82 illus. in color |
Other edition(s) | Printed edition: ISBN 978-3-032-02547-0; Printed edition: ISBN 978-3-032-02549-4 |
Contents | -- Speech. -- Lightweight Target-Speaker-Based Overlap Transcription for Practical Streaming ASR. -- An Empirical Analysis of Discrete Unit Representations in Speech Language Modeling Pre-training. -- Optimizing ASR Models with Semantic Information. -- Efficient Enhancement of Norwegian ASR Model. -- Towards Stable and Personalised Profiles for Lexical Alignment in Spoken Human-Agent Dialogue. -- Audio–Vision Contrastive Learning for Phonological Class Recognition. -- TOSD-Net: A CNN-Transformer Architecture for Robust Frame-Level Overlapping Speech Detection in Diverse Acoustic Conditions. -- An Exploration of ECAPA-TDNN and x-vector Speaker Representations in Zero-shot Multi-speaker TTS. -- Emotion-Aware Speech-Driven Facial Avatar Animation via Joint Blendshape Prediction and Emotion Recognition. -- Beyond Static Emotions: Leveraging Multitask Learning to Model Dynamics of Dimensional Affect in Speech. -- Implicit Speaker Group Encoding in Self-supervised Speech Recognition Models. -- Combining Temporal Visual Dynamics and Audio Representations for Robust Speaker Identification. -- Sentences vs Phrases in Neural Speech Synthesis: the Phrases Strike Back. -- Evaluating Phoneme-Level Pretraining in Czech Text-to-Speech Synthesis. -- Unifying Global and Near-Context Biasing in a Single Trie Pass. -- Synthesising Cross-Speaker Data for Low-Resource Pathological Speech Recognition with PEFT. -- Multilingual Stutter Event Detection for English, German, and Mandarin Speech. -- How Far Can Synthetic Speech Go? Enhancing ASR in Low-Resource Scenarios via Voice Cloning. -- Enhancing Detection of Parkinson-induced Dysarthria with Cross-lingual Transfer Learning. -- Vocoder-Free Non-Parallel Conversion of Whispered Speech With Masked Cycle-Consistent Generative Adversarial Networks. -- Detection of Cognitive Disorders Using ASR-Based Nonsense Words Repetition. -- Mind the Gap: Entity-Preserved Context-Aware ASR for Structured Transcriptions.
-- Boosting CTC-Based ASR Using LLM-Based Intermediate Loss Regularization. -- Robust Disfluency Labeling in Spontaneous Speech: Insights from Diverse Hungarian Corpora Including Mentally Ill Speakers. -- ParCzech4Speech: A New Speech Corpus Derived from Czech Parliamentary Data. -- Towards an Accurate Domain-Specific ASR: Transcription for Pathology. -- Automated Speaking Assessment for L2 Learners of Czech. -- Inclusive ASR for Critical Public Services: Debiasing with Actor-Simulated Speech. -- RECA-PD: A Robust Explainable Cross-Attention Method for Speech-based Parkinson's Disease Classification. -- Systematic FAIRness Assessment of Open Voice Biomarker Datasets for Mental Health and Neurodegenerative Diseases. -- When Silence Speaks: Understanding Open-Ended Responses via LLMs in Therapeutic Voice Interaction. -- Multilingual Domain Adaptation for Speech Recognition Using LLMs. -- Using Cross-attention For Conversational ASR Over The Telephone |
Persistent identifier | URN: urn:nbn:de:101:1-2508220408392.702302653722; DOI: 10.1007/978-3-032-02548-7 |
URL | https://doi.org/10.1007/978-3-032-02548-7 |
ISBN/Binding/Price | 978-3-032-02548-7 |
Language(s) | English (eng) |
Relations | Lecture Notes in Artificial Intelligence ; 16029 |
DDC notation | 006.35 (machine-assigned DDC short notation) |
Subject group(s) | 004 Computer science |
Online access | Open archive object |
