Detecting headsets

🎻 February 2022 fiddle


Tsahi: Hi everyone, and welcome to another WebRTC Fiddle of the Month. This time we’re going to detect headsets. And the one that we’re going to detect is mine, because Philipp doesn’t have a Windows machine at the moment.

Philipp: Yes. So there’s an interesting statement in the Media Capture and Streams specification, which covers media devices. Let me quote it: “for example, the audio input and output devices representing the speaker and microphone of the same headset have the same groupId”.

Tsahi: Okay, so…

Philipp: That allows you to detect headsets and that’s a bit useful sometimes, because a lot of people wear headsets.

Tsahi: I’m assuming it starts from the fact that we’ve got these pickers for selecting the microphone and the speaker, and you see different lists of devices in them. So we’re going to start there, right?

Philipp: Yes. So we’re going to start with the usual enumerateDevices API, and do a new thing: You share your screen this time. Yes.

Tsahi: So, what do we see on my screen?

Philipp: Yes, we have a simple function. First we call getUserMedia() with audio and video set to true. So we get access to the devices and get a full list of devices from enumerateDevices(), because if you don’t call getUserMedia() you get less information from enumerateDevices().

Tsahi: Okay.

Philipp: And then we call navigator.mediaDevices.enumerateDevices(), which lets you enumerate all the input and output devices. We see the output on the right, and we see we have three different kinds of devices: audio input, which are microphones; video input, which are cameras; and audio output, which are speakers.
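To make the three device kinds concrete, here is a minimal TypeScript sketch that buckets an enumerateDevices() result by its kind field. The DeviceLike interface, the groupByKind helper, and the sample IDs and the webcam and earphone labels are hypothetical stand-ins for illustration; only the kind values (audioinput, videoinput, audiooutput) and the browser calls named in the comment come from the API discussed above.

```typescript
// Minimal shape of an enumerateDevices() entry (mirrors MediaDeviceInfo).
interface DeviceLike {
  kind: string; // "audioinput", "videoinput" or "audiooutput"
  label: string;
  deviceId: string;
  groupId: string;
}

// Bucket devices by kind so microphones, cameras and speakers can be listed separately.
function groupByKind(devices: DeviceLike[]): Record<string, DeviceLike[]> {
  const byKind: Record<string, DeviceLike[]> = {};
  for (const device of devices) {
    if (!byKind[device.kind]) {
      byKind[device.kind] = [];
    }
    byKind[device.kind].push(device);
  }
  return byKind;
}

// Hypothetical sample data. In a browser the list would come from:
//   await navigator.mediaDevices.getUserMedia({ audio: true, video: true });
//   const devices = await navigator.mediaDevices.enumerateDevices();
const kindSampleDevices: DeviceLike[] = [
  { kind: "audioinput", label: "Headset Microphone (2- UC USB Headset)", deviceId: "mic-1", groupId: "group-a" },
  { kind: "videoinput", label: "Example Webcam", deviceId: "cam-1", groupId: "group-b" },
  { kind: "audiooutput", label: "Headset Earphone (2- UC USB Headset)", deviceId: "spk-1", groupId: "group-a" },
];

const groupedByKind = groupByKind(kindSampleDevices);
for (const kind of Object.keys(groupedByKind)) {
  console.log(kind, groupedByKind[kind].map((d) => d.label));
}
```

The helper takes a plain array rather than calling the browser API itself, so it can be tried outside a page as well.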

Tsahi: Okay, so let me see if I understand what’s here. I’ll start with the video part, because it’s the audio we really want to talk about. I see that I’ve got two different cameras for video input. One of them is this Chinese brand of Logitech C930 camera. And then I’ve got another one, which is an OBS virtual camera on my machine. The rest is all audio devices, right?

Philipp: Yes. And on Windows, you have things like the default input and communications device, and they are treated differently by the operating system.

Tsahi: That will be this one, the default headset microphone.

Philipp: Yes.

Tsahi: Okay.

Philipp: Yes, but we’re logging the group ID, and you can see that the group ID of the default headset and of the communications device is the same.

Tsahi: Yes. Okay.

Philipp: And we also have that device again under its real name, which is “Headset Microphone (2- UC USB Headset)”.
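Since the same physical microphone shows up three times here (default, communications, and under its real name), it can help to drop the alias entries before comparing group IDs. The sketch below assumes the behavior of Chrome on Windows, where the alias entries carry the literal device IDs "default" and "communications"; other browsers may report them differently. The helper names and the sample IDs are hypothetical.

```typescript
// Minimal shape of an enumerateDevices() entry (mirrors MediaDeviceInfo).
interface DeviceLike {
  kind: string;
  label: string;
  deviceId: string;
  groupId: string;
}

// Assumption: Chrome on Windows uses these literal deviceId values for the alias entries.
const ALIAS_DEVICE_IDS = ["default", "communications"];

// True for the "Default - ..." and "Communications - ..." duplicates of a real device.
function isAliasEntry(device: DeviceLike): boolean {
  return ALIAS_DEVICE_IDS.indexOf(device.deviceId) !== -1;
}

// Keep only the entries listed under their real device ID.
function withoutAliases(devices: DeviceLike[]): DeviceLike[] {
  return devices.filter((device) => !isAliasEntry(device));
}

// Hypothetical sample resembling the list shown on screen.
const aliasSampleDevices: DeviceLike[] = [
  { kind: "audioinput", label: "Default - Headset Microphone (2- UC USB Headset)", deviceId: "default", groupId: "group-a" },
  { kind: "audioinput", label: "Communications - Headset Microphone (2- UC USB Headset)", deviceId: "communications", groupId: "group-a" },
  { kind: "audioinput", label: "Headset Microphone (2- UC USB Headset)", deviceId: "mic-1", groupId: "group-a" },
];

const realDevicesOnly = withoutAliases(aliasSampleDevices);
console.log(realDevicesOnly.map((d) => d.label));
```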

Tsahi: This one, it’s the one that I’m using here.

Philipp: Yes.

Tsahi: Okay. And then you also know that it’s a USB one, with the cables on it.

Philipp: Yes.

Tsahi: Okay. What about the microphone that they have here?

Philipp: How so?

Tsahi: Sorry, the speakers, the output.

Philipp: They’re down below.

Tsahi: Okay. And that would be this one. The communications headset earphone, whatever…

Philipp: Yes.

Tsahi: Okay. But it’s the same device, right?

Philipp: Yes. And that’s the reason it has the same group ID.

Tsahi: So if I search here for the suffix of that ID, it’s the same for both the audio input and the output. And we see it multiple times, simply because there is the default entry and then there’s the actual device.

Philipp: Yes. And do you have a microphone on the camera as well?

Tsahi: Yes. Sadly yes.

Philipp: Yes. And you can see that it has a different group ID, because it’s not on the same physical device as the headset.

Tsahi: Yes. And then there’s one that is connected to the digital audio interface here, which is different again. And there’s also this one, the “(Intel(R) Display Audio)”, whatever that means. Probably the one that comes out of the HDMI or something.

Philipp: Yes, that’s the monitor, typically. What we’ve seen here is how we can detect headsets basically. And I mean, that might be useful if you’re recommending your users use headsets in cases where you have echo, because headsets are physically easier to manage when it comes to echo.

Tsahi: So what I would be looking for in the code is the same group ID on both the speaker and the microphone that are in use at the same time.

Philipp: Yes. And that gives you an approximation of a headset.
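The check described here can be sketched as follows: look up the microphone and the speaker currently in use and compare their group IDs. This is a heuristic, not a definitive implementation. The function name, the DeviceLike interface, and all sample IDs are hypothetical, and the sketch assumes the application already knows which deviceId values it selected for input and output.

```typescript
// Minimal shape of an enumerateDevices() entry (mirrors MediaDeviceInfo).
interface DeviceLike {
  kind: string;
  label: string;
  deviceId: string;
  groupId: string;
}

// Heuristic: the selected microphone and speaker share a non-empty groupId,
// which suggests they sit on the same physical device (for example a headset).
function selectionLooksLikeHeadset(
  devices: DeviceLike[],
  micDeviceId: string,
  speakerDeviceId: string
): boolean {
  const mic = devices.filter(
    (d) => d.kind === "audioinput" && d.deviceId === micDeviceId
  )[0];
  const speaker = devices.filter(
    (d) => d.kind === "audiooutput" && d.deviceId === speakerDeviceId
  )[0];
  if (!mic || !speaker) {
    return false; // one of the selected devices is not in the list
  }
  // An empty groupId carries no information, so it never counts as a match.
  return mic.groupId !== "" && mic.groupId === speaker.groupId;
}

// Hypothetical sample: a USB headset, a webcam microphone, and display speakers.
const headsetSampleDevices: DeviceLike[] = [
  { kind: "audioinput", label: "Headset Microphone (2- UC USB Headset)", deviceId: "mic-headset", groupId: "group-a" },
  { kind: "audiooutput", label: "Headset Earphone (2- UC USB Headset)", deviceId: "spk-headset", groupId: "group-a" },
  { kind: "audioinput", label: "Microphone (Example Webcam)", deviceId: "mic-webcam", groupId: "group-b" },
  { kind: "audiooutput", label: "Example Display Audio", deviceId: "spk-display", groupId: "group-c" },
];

console.log(selectionLooksLikeHeadset(headsetSampleDevices, "mic-headset", "spk-headset"));
console.log(selectionLooksLikeHeadset(headsetSampleDevices, "mic-webcam", "spk-display"));
```

A match only says that both devices report the same group ID; a speakerphone would pass the same check.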

Tsahi: Okay. I would also recommend looking for the word Bluetooth or BT in the label, just so you know, if someone complains, that it might be the Bluetooth interface that is interfering. Especially in call centers, where they usually don’t like Bluetooth.
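A label check along these lines might look like the sketch below. The regular expression and the function name are hypothetical, the example labels are made up, and device labels vary by operating system and driver, so treat a match as a hint for troubleshooting rather than proof that Bluetooth is involved.

```typescript
// Matches "Bluetooth" or a standalone "BT" anywhere in a label, case-insensitively.
// The word boundaries keep words that merely contain "bt" (like "Subtle") from matching.
const BLUETOOTH_HINT = /\b(bluetooth|bt)\b/i;

// Heuristic: does the device label hint at a Bluetooth connection?
function labelSuggestsBluetooth(label: string): boolean {
  return BLUETOOTH_HINT.test(label);
}

// Made-up example labels.
console.log(labelSuggestsBluetooth("Headset (Example Bluetooth Hands-Free)"));
console.log(labelSuggestsBluetooth("Headset Microphone (2- UC USB Headset)"));
```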

Philipp: Yes. And there are some difficulties with this approach. The first is that it didn’t work on Linux until Chrome 100, so that was just fixed a few days ago. The second is that it doesn’t work on macOS.

Tsahi: Nothing works on macOS.

Philipp: Well, at least camera access and microphone access works. But for headsets, you don’t have the same group ID even though the specification says you should. Instead, you have the MacBook microphone and the MacBook speaker with the same group ID. Of course, they’re on the same device but that’s not quite what the specification means.

Tsahi: Yes, because then every device would end up with the same one, since they’re all just interfaces at the end of the day.

Philipp: Yes. I mean, you also have devices like the Jabra speakerphone, which is a combination of microphone and speaker, and they will also have the same group ID. So it’s not a perfect way to detect a headset. But even those devices will at times have much better echo cancellation built into the hardware, because on these devices you know where the microphone is and where the speaker is. And it’s going to be a better experience than just hearing your Chinese Logitech camera with your display speakers.

Tsahi: Yes. Okay. So thank you for joining us for this Fiddle of the Month, and we’ll see you in the next one next month. Bye everyone.

Philipp: Bye!