Question about VAD model parameter configuration #2965

@TracyRAN1

Deployment environment

  • OS: Ubuntu 24.04
  • Python version: 3.13.13
  • FunASR version: 1.3.9
  • ModelScope version: 1.37.1
  • PyTorch / torchaudio version: torch 2.12.0, torchaudio 2.11.0
  • Install method (pip, source, Docker): pip
  • Device (cuda, cpu, mps): cuda
  • GPU model: FunASR-Nano-2512 + fsmn-vad + cam++
  • CUDA/cuDNN version: 13.0

Problem

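I set max_end_silence_time: 500 in the VAD model's iic/speech_fsmn_vad_zh-cn-16k-common-pytorch/config.yaml, and I also pass the same value through vad_kwargs when constructing AutoModel (see the code below). However, the sentence_info returned by transcription is still not split at 500 ms silence gaps. Is this the wrong way to configure it?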

logger.info("Loading ASR model...")
self._model = AutoModel(
    model=self._settings.model_name,  # FunASR-Nano-2512
    vad_model=self._settings.vad_model,  # fsmn-vad
    vad_kwargs={
        "max_single_segment_time": self._settings.max_single_segment_time,  # 30000
        "max_end_silence_time": self._settings.max_end_silence_time,  # 500
    },
    spk_model=self._settings.spk_model,  # cam++
    device=get_device(),
    disable_pbar=True,
    disable_update=True,
    hub=self._settings.hub,
)

Response (partial)

{'start': 30890, 'end': 60020, 'sentence': '问候一下,挺长时间没联系你了啊。是是啊,你最近咋样啊?忙不忙啊?最近啊,挺好的。行行,有时间有时间,大哥你来我这坐坐,到时候来我们这喝杯茶。嗯,长睦路跟三一路交汇处就是对,就是这边有个农贸市场斜对面。', 'timestamp': [[30950, 31010], [31070, 31130], [31190, 31250], [31250, 31310], [31310, 31370], [31490, 31550], [31610, 31670], [31730, 31790], [31790, 31850], [31970, 32030], [32150, 32210], [32270, 32330], [32450, 32509], [32630, 32689], [34490, 34550], [35750, 35810], [35810, 35870], [36170, 36230], [36350, 36410], [36470, 36530], [36530, 36590], [36710, 36770], [36830, 36890], [37010, 37070], [37130, 37190], [37310, 37370], [37370, 37430], [37430, 37490], [37550, 37670], [37670, 37730], **[37910, 37970], [39470, 39530]**, [41030, 41090], [41210, 41270], [41450, 41510], [41690, 41750], [41810, 41870], [41990, 42050], [42170, 42230], [43790, 43850], [44990, 45050], [45350, 45410], [45650, 45710], [45770, 45830], [46010, 46070], [46130, 46250], [46790, 46850], [46970, 47030], [47030, 47090], [47150, 47210], [47210, 47270], [47330, 47390], [47570, 47630], [47810, 47870], [47930, 48050], [48050, 48110], [48230, 48290], [48470, 48530], [48590, 48650], [48950, 49010], [49070, 49130], [49130, 49190], [49250, 49310], [49370, 49490], [49490, 49550], [49550, 49610], [49730, 49790], [49850, 49910], [50030, 50090], [51050, 51110], [52250, 52310], [52790, 52850], [52850, 52910], [53030, 53090], [53210, 53270], [53390, 53450], [53570, 53630], [53750, 53810], [53930, 53990], [54110, 54170], [54230, 54290], [54470, 54530], [55730, 55790], [55910, 55970], [57230, 57290], [57530, 57590], [57650, 57710], [57770, 57890], [58010, 58070], [58070, 58130], [58250, 58310], [58370, 58430], [58490, 58550], [58610, 58730], [58790, 58850], [58910, 58970], [59210, 59270], [59330, 59390], [59510, 59570], [59870, 59930]], 'spk': 0}
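To make the gap concrete, here is a minimal sketch that lists the pauses longer than 500 ms between consecutive entries of the 'timestamp' list. It assumes each entry is a [start_ms, end_ms] pair in milliseconds. The helper name find_long_gaps is hypothetical (not part of FunASR), and the excerpt is copied from the response above.

```python
def find_long_gaps(timestamps, threshold_ms=500):
    """Return (prev_end, next_start, gap_ms) for consecutive entries
    whose silence gap exceeds threshold_ms.

    Assumes each entry is a [start_ms, end_ms] pair (an assumption
    about the 'timestamp' field, not confirmed by FunASR docs here).
    """
    gaps = []
    for (_, prev_end), (next_start, _) in zip(timestamps, timestamps[1:]):
        gap = next_start - prev_end
        if gap > threshold_ms:
            gaps.append((prev_end, next_start, gap))
    return gaps


# Excerpt copied from the 'timestamp' list in the response above.
excerpt = [
    [37670, 37730],
    [37910, 37970],
    [39470, 39530],
    [41030, 41090],
    [41210, 41270],
]

long_gaps = find_long_gaps(excerpt)
for prev_end, next_start, gap in long_gaps:
    print(f"{prev_end} -> {next_start}: {gap} ms")
```

On this excerpt the sketch reports two pauses of 1500 ms each (37970 -> 39470 and 39530 -> 41030), both inside a single sentence_info entry. These are the gaps I expected a 500 ms max_end_silence_time to split on.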
