I just tried to remove what appeared to be a duplicate speaker; the two entries had different numbers of segments. When I deleted the one with fewer segments, the app deleted the other speaker with the same name.
The overall speaker tagging experience has been quite buggy for me. I’m wondering about others’ experiences, and whether you’ve had issues too.
I just spent nearly an hour adding names to a 3+ hour transcript, only for the app to suddenly decide to eliminate one of the names that I had been tagging (my own, coincidentally) and lump those transcript segments in with another name. This is beyond frustrating. How is it that others are not complaining about this? Are other people not relying on the speaker identification? Are most people working with much shorter recordings? Is my device not working correctly?
If the Pocket team weren’t working so hard on improving everything, I’d be asking for a refund, because right now it simply doesn’t do what it was advertised to do.
Hey @DaveNgl, as mentioned before, speaker identification doesn’t work for recordings above 3 hours. Would love to understand what happened in your case, as the app shouldn’t be doing that.
We’ve found that models struggle significantly with recordings longer than 3 hours: they hallucinate extensively and often incorrectly split segments across 20+ speakers, rendering the output unusable. Until there’s meaningful progress in solving these accuracy issues, the quality tradeoff isn’t worth it. We’ve decided to hold off on diarization for now.