SIPREC Server on AWS

Forking in nature by Bing Image

Every telco and MNO must provide tools or interfaces to third-party companies or government agencies for monitoring or lawful interception (LI) purposes. There are various 3GPP standards for LI systems and their interfaces, such as X1/X2/X3, that you can refer to. In addition, when customers complain about call and voice quality and customer service reports it, you need a tool to monitor a specific subscriber or a group of subscribers to verify what the issue is and what causes it.

Whether the network is 4G (VoLTE) or 5G, the IMS core is the place where you need those tools or interfaces. Capturing SIP signaling is not complicated, and there are various places to capture it, such as the P-CSCF or the PGW. However, the PGW is not an option for non-troubleshooting captures! The P-CSCF interfaces, after IPSec decryption, are one of the best places to fork SIP signaling for any purpose.

Recently, the use of eBPF for this purpose has been growing, and it seems there are many valuable applications for it.

For media capturing, the methods are different. The Real-Time Transport Protocol (RTP) carries the multimedia data, such as audio and video, across the network. RTP capturing in IMS means monitoring and analyzing the RTP packets exchanged between IMS endpoints during communication sessions. By capturing RTP packets, network administrators and engineers gain valuable insight into the quality and performance of real-time media streams and can troubleshoot them. The captured data can be used to analyze and diagnose issues related to voice and video call quality, jitter, latency, packet loss, and other metrics. The captured RTP packets are typically analyzed with specialized tools to extract information about the media streams, the codecs used, timing, synchronization, and other relevant parameters. In LI, this media capture maps to the X3 interface.
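To make the metrics above a bit more concrete, here is a minimal sketch of how the fixed RTP header (as defined in RFC 3550) can be parsed from a captured UDP payload. The names `RtpHeader` and `parse_rtp_header` are my own for illustration and are not taken from any particular tool. The sequence number and timestamp it extracts are the fields that loss and jitter calculations are normally built on.

```python
import struct
from dataclasses import dataclass


@dataclass
class RtpHeader:
    version: int
    padding: bool
    extension: bool
    csrc_count: int
    marker: bool
    payload_type: int
    sequence: int
    timestamp: int
    ssrc: int
    header_len: int  # offset at which the media payload starts


def parse_rtp_header(packet: bytes) -> RtpHeader:
    """Parse the RTP header of one captured packet (illustrative helper)."""
    if len(packet) < 12:
        raise ValueError("RTP packet shorter than the 12-byte fixed header")

    # Fixed header: 2 flag bytes, 16-bit sequence, 32-bit timestamp, 32-bit SSRC.
    b0, b1, sequence, timestamp, ssrc = struct.unpack("!BBHII", packet[:12])

    version = b0 >> 6
    if version != 2:
        raise ValueError(f"unexpected RTP version {version}")

    csrc_count = b0 & 0x0F
    extension = bool(b0 & 0x10)

    # Each CSRC identifier adds 4 bytes after the fixed header.
    header_len = 12 + 4 * csrc_count

    if extension:
        # Extension header: 16-bit profile id, 16-bit length in 32-bit words.
        if len(packet) < header_len + 4:
            raise ValueError("truncated RTP header extension")
        (ext_words,) = struct.unpack("!H", packet[header_len + 2:header_len + 4])
        header_len += 4 + 4 * ext_words

    if len(packet) < header_len:
        raise ValueError("truncated RTP header")

    return RtpHeader(
        version=version,
        padding=bool(b0 & 0x20),
        extension=extension,
        csrc_count=csrc_count,
        marker=bool(b1 & 0x80),
        payload_type=b1 & 0x7F,
        sequence=sequence,
        timestamp=timestamp,
        ssrc=ssrc,
        header_len=header_len,
    )
```

With the header parsed, gaps in `sequence` point to packet loss, and comparing `timestamp` deltas with arrival-time deltas gives a jitter estimate. The media payload itself starts at `packet[header_len:]`.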

Using the networking interfaces of media servers, sniffing RTP packets, barging into the media session, and call recording are some of the methods for media capturing, and each has pros and cons. As I mentioned, though, eBPF could be really interesting for this purpose!

Recently I worked on a SIPREC server that forks media sessions so that we can later record the call or intercept the voice. If you are not familiar with SIPREC, please check my earlier post here: SIPREC with RTPEngine.

This is the overall architecture:

siprec-on-aws architecture (image)

There are some facts that I didn’t mention in the diagram:

  • All communications are secured by HTTPS or WSS.

  • Authentication is managed by Cognito.

  • VPCLink and VPCE are used for more security in transport.

  • Obviously, the solution is deployed multi-region and cross-zone!

OK, good, but what exactly are we doing?

  • Calls are established normally, which is out of the scope of this post (I copied this from 3GPP docs :| )

  • The provider logic detects that it needs to record or fork a specific call or a specific subscriber.

  • A request is sent to the SIPREC service API with the call or subscriber information.

  • Based on that information, the Lambda function logic identifies the hosting media server and sends all the required information to the SIPREC server.

  • The SIPREC server prepares the forking or subscribe request for the media server (e.g. RTPEngine in our case), and then, after some negotiation of codecs and ports, it starts to receive the call legs' media streams as RTP packets.

  • The server can save the media locally or send batches of packets over WebSocket to Lambda, which then processes them or saves them to S3.

  • Please note that we have the call's SIP info, so decoding the media is not complicated.

It's good to know that the SIPREC server does not have a very complicated implementation. It understands SDP, codec negotiation, and UDP connection management.
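As a rough illustration of the "understands SDP" part, here is a minimal sketch of pulling the media ports, codecs, and stream labels out of an SDP body. It is not the code of my server. The function name `parse_sdp_media` and the sample SDP are made up for this example, and it assumes plain RTP/AVP audio streams where each stream carries an `a=label` attribute, as SIPREC (RFC 7866) expects from the recording client.

```python
def parse_sdp_media(sdp: str) -> list[dict]:
    """Return one dict per m= section: kind, port, address, codecs, label."""
    media: list[dict] = []
    session_addr = None  # session-level c= address, used as the default

    for raw in sdp.splitlines():
        line = raw.strip()
        if line.startswith("c="):
            # c=IN IP4 10.0.0.5  (the last token is the connection address)
            addr = line.split()[-1]
            if media:
                media[-1]["address"] = addr
            else:
                session_addr = addr
        elif line.startswith("m="):
            # m=audio 30000 RTP/AVP 96 8
            kind, port, proto, *formats = line[2:].split()
            media.append({
                "kind": kind,
                "port": int(port.split("/")[0]),
                "proto": proto,
                "address": session_addr,
                # payload type -> encoding; filled in by a=rtpmap lines,
                # stays None for static types that have no rtpmap
                "codecs": {int(f): None for f in formats if f.isdigit()},
                "label": None,
            })
        elif line.startswith("a=rtpmap:") and media:
            # a=rtpmap:96 AMR-WB/16000
            pt, encoding = line[len("a=rtpmap:"):].split(None, 1)
            media[-1]["codecs"][int(pt)] = encoding
        elif line.startswith("a=label:") and media:
            media[-1]["label"] = line[len("a=label:"):]

    return media


# Hypothetical offer with one stream per call leg.
EXAMPLE_SDP = """v=0
o=- 1 1 IN IP4 10.0.0.5
s=-
c=IN IP4 10.0.0.5
t=0 0
m=audio 30000 RTP/AVP 96 8
a=rtpmap:96 AMR-WB/16000
a=rtpmap:8 PCMA/8000
a=label:1
m=audio 30002 RTP/AVP 96
a=rtpmap:96 AMR-WB/16000
a=label:2
"""
```

Calling `parse_sdp_media(EXAMPLE_SDP)` gives two entries, one per call leg, each with the UDP port to listen on and the payload-type-to-codec map needed later to decode the captured RTP.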

Challenge: WebSocket APIs on Amazon API Gateway have a 32 KB frame size limit: https://docs.aws.amazon.com/apigateway/latest/developerguide/limits.html#apigateway-execution-service-websocket-limits-table

So when batching the RTP packets, you need to take this limit into account and handle fragmentation, etc.!
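One possible way to respect that limit, sketched under my own assumptions: prefix every RTP packet with a 2-byte length and pack as many as fit into each frame, starting a new frame whenever the next packet would push it over 32 KB. The names `batch_packets` and `unbatch_frame` and the length-prefix framing are illustrative choices, not the format my server actually uses.

```python
import struct

MAX_FRAME_BYTES = 32 * 1024       # frame size limit mentioned above
LEN_PREFIX = struct.Struct("!H")  # 2-byte big-endian length before each packet


def batch_packets(packets, max_frame=MAX_FRAME_BYTES):
    """Yield frames of length-prefixed packets, each at most max_frame bytes."""
    frame = bytearray()
    for pkt in packets:
        needed = LEN_PREFIX.size + len(pkt)
        if needed > max_frame or len(pkt) > 0xFFFF:
            raise ValueError("single packet does not fit in one frame")
        if len(frame) + needed > max_frame:
            # Current frame is full: emit it and start a new one.
            yield bytes(frame)
            frame = bytearray()
        frame += LEN_PREFIX.pack(len(pkt)) + pkt
    if frame:
        yield bytes(frame)


def unbatch_frame(frame: bytes) -> list[bytes]:
    """Split one frame back into the original packets (receiver side)."""
    packets = []
    offset = 0
    while offset < len(frame):
        (length,) = LEN_PREFIX.unpack_from(frame, offset)
        offset += LEN_PREFIX.size
        packets.append(frame[offset:offset + length])
        offset += length
    return packets
```

Since a single RTP packet is far smaller than 32 KB, splitting at packet boundaries is enough here and no packet ever has to be cut in half. If the frames are sent as text (for example base64), the encoded size is what counts, so `max_frame` should be lowered accordingly: base64 grows the data by about one third.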

Enjoy your day :))

Updated 2024-10-08