The importance of NTP to Genesys Cloud Physical Edges

The importance of NTP for historical & real-time dashboard reporting in contact centres.

Many of us know that NTP (Network Time Protocol) is a networking protocol for clock synchronisation between computer systems over packet-switched, variable-latency data networks. However, we should sometimes stop and think about which things in our lives depend on the accuracy delivered by NTP. The prompt for this short blog comes from several UCA customers using PureCloud Contact Centres from Genesys. Depending on customer need, Genesys, like many other cloud-based UC (Unified Communications) and CC (Contact Centre) vendors, has several physical and virtual architectural design options that provide reliable voice call delivery. The term “Voice Call” is too specific in this case, as what I am about to discuss can pertain to a Multi-Media Contact Centre handling voice, email, video, chat, webchat, SMS and social media conversations, collectively known as interactions.

Some of our customers are early adopters of PureCloud and therefore have two or more Edge Appliances on their premises or in a corporate data centre. Other, more recent adopters may have the same due to geo-redundancy needs and/or preferred available DC space. Many customers have, however, chosen to have virtual Edge Appliances hosted in AWS, with everything maintained for them in a real cloud model, and should not need to worry about the likes of NTP and how it relates to their Edge Appliances.

For those that do have Edge Appliances, there is something you may not be aware of that could severely impact the accuracy of reporting, yes, even within PureCloud’s reporting capability. To understand this, we need to understand the function of the Edge Appliance, which, although well documented by Genesys within their Resource Centre, many of us may have overlooked. The primary function of the Edge Appliance is similar to that of an SBC (Session Border Controller): it enables you to terminate various types of trunks and endpoints (telephones) to your PureCloud solution in a secure fashion. Typically, the trunks would be SIP (Session Initiation Protocol) trunks procured and managed by the customer or a designated integrator/support organisation. Some other essential functions are the registration of SIP endpoints (telephones) and WebRTC clients (softphones), Voice Call Recording and Voice Mail. Edge Appliances provide the management and network path for all RTP (Real-time Transport Protocol) voice packets on your network.

Edges and their associated trunks are configured in various logical groups and/or sites. A multisite deployment may have several edges backing up one another to provide multiple layers of transparent redundancy. With this in mind, and given that a CC agent may take calls from multiple sites serviced by edges other than the one they are registered to, a single call or interaction may traverse multiple edges as it is serviced by and transferred between agents. Welcome to call segment/interaction data (yes, it’s hard to get that word “Call” out of the vocabulary) and how we use it for reporting. To make it easy for PureCloud Supervisors to get started, PureCloud provides a lot of useful data that is already aggregated, such as Average Speed of Answer over a 30-minute interval, which is pre-rolled up for our consumption from a more detailed set of data from each individual Call Segment. This detail is what users need to analyse when troubleshooting a call flow, or simply when trying to understand the full customer journey during an interaction. This information is tagged with a Conversation ID that is unique to each instance of PureCloud and never repeated.
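
To make the idea of "pre-rolled up" data concrete, here is a minimal Python sketch of how an Average Speed of Answer figure could be derived from detailed records. It is only an illustration: the sample calls are invented, the function names are hypothetical, and it assumes each call is counted in the 30-minute interval in which it was offered, which may not match how PureCloud computes its own aggregates.

```python
from collections import defaultdict
from datetime import datetime


def interval_start(ts):
    """Floor a timestamp to the start of its 30-minute interval."""
    return ts.replace(minute=(ts.minute // 30) * 30, second=0, microsecond=0)


def average_speed_of_answer(calls):
    """Return {interval start: mean seconds to answer}.

    `calls` is a list of (offered, answered) datetime pairs. Bucketing by
    the offered time is an assumption made for this sketch.
    """
    waits = defaultdict(list)
    for offered, answered in calls:
        waits[interval_start(offered)].append((answered - offered).total_seconds())
    return {start: sum(w) / len(w) for start, w in sorted(waits.items())}


# Invented sample data: (time offered, time answered)
SAMPLE_CALLS = [
    (datetime(2024, 1, 1, 13, 5, 0), datetime(2024, 1, 1, 13, 5, 10)),
    (datetime(2024, 1, 1, 13, 20, 0), datetime(2024, 1, 1, 13, 20, 30)),
    (datetime(2024, 1, 1, 13, 40, 0), datetime(2024, 1, 1, 13, 40, 20)),
]

for start, asa in average_speed_of_answer(SAMPLE_CALLS).items():
    print(start.strftime("%H:%M"), asa)
```

The aggregate hides the individual waits, which is exactly why the segment-level detail matters when something looks wrong.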

For those people genuinely interested in this article, you have most likely been the recipient of such a challenge, and your first port of call would be the Interactions page within PureCloud. There you see a timeline graph of any interaction you may choose to investigate. If you are fortunate enough to also have the benefit of additional reporting tools such as eMite, you can quickly extract detailed tables that list vast quantities of data about every single call segment for that Interaction/Conversation ID. So, what is a call segment? A Call Segment is one leg of the interaction but in a much more granular form. It lists the series of events that have occurred within the PureCloud Interaction Processing Software to deliver and manage a fruitful interaction between agents and their customers. We at UCA have, in extreme cases, seen a single interaction generate almost 30 call segments, while the average basic call that is answered, dealt with and terminated may generate as few as 6.

When calls start to bounce around an IVR, listening to multiple announcements, choosing multiple menu options, triggering external data dips for routing decisions, being presented to multiple agents who go into “Not Responding” and even being transferred between agents, the call segment data starts to stack up. When too many calls start to abandon, and that new call flow is suspected, you need to be able to review that data ASAP. It lists out all the events, in order, for each interaction/call. WELL, AT LEAST IT SHOULD. Trust me, it will, down to millisecond accuracy, unless you have a problem with NTP.

Remember, I said that an interaction could pass through multiple Edge Appliances? Well, another key feature of the Edge Appliance is time stamping. When an interaction traverses its domain, the edge is responsible for applying the time stamp (yes, with millisecond accuracy) to the call segment data that it is controlling. That data is then sent to PureCloud for storage in AWS for later retrieval when needed. It is a far cry from the old 80 to 120-character CDR/SMDR call records we used to use for call billing in the ’90s. It contains much more data. So, what do you think happens if Edge A has its internal clock set to 1:06 PM and Edge B to 1:00 PM? Let’s say Edge B is at the correct time, but Edge A has drifted by 6 minutes, and a call presents on a trunk terminated on Edge A and is answered by an agent registered to Edge B at around 1:00 PM. The call must traverse both edges, with each edge responsible for reporting and time stamping different segments. Because the clock is set incorrectly on Edge A, when reviewing the segment data, yes, even within the PureCloud Interactions page, you may see strange things such as a call being placed on hold before it was even answered. The whole timeline graph within the Interactions page fails to make sense.
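
The effect is easy to reproduce on paper. The following Python sketch is a simplified model, not PureCloud's actual segment format: the events, times and names (`CLOCK_ERROR`, `TRUE_EVENTS`, `stamp`, `timeline`) are all invented for illustration. Each event is stamped by the edge that handles it, with Edge A running 6 minutes fast and Edge B correct, and the timeline is then rebuilt by sorting on the recorded time stamps.

```python
from datetime import datetime, timedelta

# Assumed clock error per edge: Edge A is 6 minutes fast, Edge B is correct.
CLOCK_ERROR = {"A": timedelta(minutes=6), "B": timedelta(0)}

# Invented sample data: (true time, edge that stamps the segment, event)
TRUE_EVENTS = [
    (datetime(2024, 1, 1, 13, 0, 0), "A", "call arrives on trunk"),
    (datetime(2024, 1, 1, 13, 0, 5), "A", "call enters queue"),
    (datetime(2024, 1, 1, 13, 0, 20), "B", "agent answers"),
    (datetime(2024, 1, 1, 13, 1, 0), "B", "agent places call on hold"),
    (datetime(2024, 1, 1, 13, 2, 0), "A", "call disconnects"),
]


def stamp(events, clock_error):
    """Apply each edge's clock error to the events it is responsible for."""
    return [(true_time + clock_error[edge], edge, event)
            for true_time, edge, event in events]


def timeline(stamped):
    """Rebuild the event order the way a report would: by recorded time stamp."""
    return [event for _, _, event in sorted(stamped, key=lambda s: s[0])]


for recorded, edge, event in sorted(stamp(TRUE_EVENTS, CLOCK_ERROR), key=lambda s: s[0]):
    print(recorded.strftime("%H:%M:%S"), f"Edge {edge}", event)
```

In this made-up example the reconstructed timeline starts with "agent answers" and "agent places call on hold", and only then shows the call arriving on the trunk, because every segment stamped by Edge A lands 6 minutes late.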

So, why would this happen? Many organisations only support NTP via their own internal NTP hierarchy and block NTP requests from internal servers to the internet. The Edge Appliance always tries to synchronise via the internet by default. If this is not successful, it runs off its internal clock, and this is when it begins to drift. In the early days of deployments, ININ or Genesys Support would need to resync each Edge Appliance at the OS level. Now I notice that the Telephony Administrators of your PureCloud instance can administer these changes with ease. It is a mere 5-minute task within the web GUI. So, in summary, a simple thing like clock drift confuses your data, unless you are friends with NTP. Resynchronising cannot fix the data already logged by edges that suffered from drifting clocks. However, once all edges are synchronised within your PureCloud environment, all future data becomes accurate and extremely useful.
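
For the curious, this is roughly how an NTP client works out how far its clock is off. The Python sketch below uses the standard NTP offset and delay formulas applied to the four time stamps of a single request/response exchange. It says nothing about how the Edge Appliance implements NTP internally, and the function name and sample numbers are invented for illustration.

```python
def ntp_offset_and_delay(t1, t2, t3, t4):
    """Standard NTP calculation from one request/response exchange.

    t1: client clock when the request was sent
    t2: server clock when the request was received
    t3: server clock when the reply was sent
    t4: client clock when the reply was received
    All values are in seconds. The offset is what should be added to the
    client clock to match the server; the delay is the round-trip time.
    """
    offset = ((t2 - t1) + (t3 - t4)) / 2
    delay = (t4 - t1) - (t3 - t2)
    return offset, delay


# Invented example: the client (think Edge A) is 360 seconds fast, the
# network takes 20 ms each way, and the server takes 1 ms to reply.
offset, delay = ntp_offset_and_delay(360.000, 0.020, 0.021, 360.041)
print(f"offset: {offset:.3f} s, round-trip delay: {delay:.3f} s")
```

A negative offset means the client clock is ahead of the server. In this example the result is an offset of about -360 seconds, the 6-minute drift from the scenario above, with a round-trip delay of about 40 ms.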

If you are not sure about anything I have explained here, perhaps because, like many of us, you are not the least bit technical, but your data is not making sense, I strongly recommend you ask your support organisation to ensure your Edge Appliances are securely synchronised to a reliable NTP source. I hope that this article helps at least a few of you by saving the time and energy it took me to learn the hard way. 😊