Wednesday, September 1, 2010
IFE System Health
AEEC Cabin Systems Subcommittee has developed parameters to monitor the health and performance of sophisticated in-flight entertainment systems
The AEEC Cabin Systems Subcommittee (CSS) has developed Supplement 1 to ARINC Specification 628: “Cabin Equipment Interface (CEI), Part 8, In-Flight Entertainment (IFE) Equipment Standard Availability Measurement Guidelines,” adopted by the AEEC Executive Committee in October 2009. This standard develops the concepts for monitoring IFE system performance and health, which will aid in identifying system degradation and allow proactive maintenance of the operational IFE systems prior to an actual fault flagged by integrated Built-In-Test (BIT).
Modern-day IFE systems use Internet Protocol (IP) networks over an Ethernet physical layer as the backbone and distribution architecture for system data transfer.
High-bandwidth video and audio data are streamed one way, via User Datagram Protocol (UDP), from head-end media servers to PC-equivalent seat clients over gigabit Ethernet backbones. Lower-bandwidth control information is transmitted bi-directionally over the same networks to and from the seat clients using Transmission Control Protocol (TCP).
The use of IP-based networks allows for monitoring capability not available in older non-IP-based IFE systems. The development of “enterprise” monitoring via Simple Network Management Protocol (SNMP) also aids in obtaining “real-time” network operational information from servers, routers, switches and clients in the Local Area Network (LAN). This information can be used to monitor IFE system performance and determine system health.
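As an illustration of how polled counters can become a health measure, the sketch below turns two readings of an interface octet counter into link utilization. It assumes a 32-bit counter of the kind exposed by the standard SNMP interfaces group (for example `ifOutOctets`); the function names and the sample values are hypothetical, and the article does not say which objects an IFE system would actually poll.

```python
COUNTER32_MODULUS = 2 ** 32  # a 32-bit SNMP counter wraps back to zero here


def counter_delta(previous: int, current: int,
                  modulus: int = COUNTER32_MODULUS) -> int:
    """Difference between two counter readings, allowing for one wrap."""
    return (current - previous) % modulus


def link_utilization(prev_octets: int, curr_octets: int,
                     interval_s: float, link_speed_bps: float) -> float:
    """Fraction of link capacity used between two polls (0.0 to 1.0)."""
    bits_sent = counter_delta(prev_octets, curr_octets) * 8
    return bits_sent / (interval_s * link_speed_bps)


# Two hypothetical polls, 10 seconds apart, on a gigabit link.
utilization = link_utilization(prev_octets=1_000_000,
                               curr_octets=626_000_000,
                               interval_s=10.0,
                               link_speed_bps=1e9)
print(f"{utilization:.1%}")
```

The modulo step matters on fast links, where a 32-bit octet counter can wrap within minutes; a poll interval long enough to allow more than one wrap would make the result unreliable.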
The definition, measurement and analysis of key operational parameters form the basis for implementing IFE System Performance and Health Monitoring (SPHM). This process and its associated data can then be used to identify system degradation and error trends, aiding proactive maintenance of the system before an actual line-replaceable unit (LRU) malfunction or crash.
What is important in defining these key parameters is to identify measures that can be quantified and captured internally by the IFE system. These measurements should not require human intervention (i.e., observation) or auxiliary test equipment to obtain the measurement.
It is important to distinguish between SPHM and system BIT. SPHM is not intended to determine system failures normally detected by BIT. From an operational perspective, the SPHM parameters are collected during a flight and off-loaded periodically from the IFE system for analysis. This data transfer can occur manually by airline maintenance or IFE supplier personnel or via various data link methods such as ACARS or Gatelink.
The data would then be analyzed post flight to identify trends pointing to abnormalities or degradation within the system or network. Software tools developed specifically for this analysis would then be used to assess system performance and to determine whether degradation is occurring, where it is occurring and what its source is. This would enable airline maintenance personnel to proactively address operational issues within the IFE system by identifying and replacing an intermittent or malfunctioning LRU prior to total failure of that unit.
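One simple form such post-flight analysis could take is a trend test on a parameter recorded once per flight. The sketch below fits a least-squares line to each LRU's history and flags units whose values are rising faster than a chosen limit. The seat identifiers, the packet-loss figures and the slope limit are all hypothetical, and this is only one of many possible trend measures, not a method prescribed by the specification.

```python
from statistics import mean


def trend_slope(values):
    """Least-squares slope of values against flight index 0, 1, 2, ..."""
    n = len(values)
    if n < 2:
        raise ValueError("need at least two flights to estimate a trend")
    xs = range(n)
    x_bar = mean(xs)
    y_bar = mean(values)
    numerator = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, values))
    denominator = sum((x - x_bar) ** 2 for x in xs)
    return numerator / denominator


def flag_degrading(history_by_lru, slope_limit):
    """Return the LRUs whose parameter is rising faster than slope_limit."""
    return sorted(lru for lru, values in history_by_lru.items()
                  if trend_slope(values) > slope_limit)


# Hypothetical packet loss (percent) over six consecutive flights.
history = {
    "seat-12A": [0.1, 0.1, 0.2, 0.4, 0.7, 1.1],  # steadily worsening
    "seat-14C": [0.1, 0.1, 0.1, 0.1, 0.1, 0.1],  # stable
}
print(flag_degrading(history, slope_limit=0.05))
```

A slope test of this kind flags a unit that is getting worse even while every individual flight is still below an alarm level, which is the case BIT would not catch.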
Definition of parameters that affect IFE system performance is an important facet in developing SPHM. Supplement 1 to ARINC Specification 628, Part 8, suggests a list of parameters that were identified as important in assessing health and performance within an IFE system.
These parameters, however, are not intended as an exhaustive list of performance and health monitoring criteria. Rather, they are provided as a “tool set” for SPHM: a definition of the framework for measurement and data collection, not the quantitative measures that should be used to develop SPHM for current and future IFE systems.
Because IFE system architectures continue to evolve, incorporating the latest networking technology to offer new services to passengers, the proposed list of monitoring parameters should be considered only a starting point for implementing SPHM.
The suggested parameters of merit include:
➤ Throughput and Packet Loss: Video and audio media are delivered via UDP, a best-effort delivery protocol that does not retransmit lost data. If network congestion leaves less than the full bandwidth available for the media streams, or packets are lost in transit from the server to the client, the video displayed at the client may pixelate, pause or drop out.
➤ Network Path Usage: If a network segment is overutilized, i.e., data traffic sent through that segment approaches maximum bandwidth for a sustained period, the media streams may not reach the clients at the intended data rate, causing the seat client to malfunction.
➤ Audio Video On Demand (AVOD) Buffer over/under-run: If the seat client is not receiving media at the proper data rate because of network congestion, the AVOD buffers will suffer an under-run condition and the media decoder will be “starved” for data, causing poor video display (pixelation, video pauses or dropouts). Conversely, if media arrives at a rate greater than the input video buffer can handle, data will be lost due to buffer overflow.
➤ Non-responsive Applications: An application that becomes non-responsive, e.g. slow to respond to passenger commands, or “hangs” on a continual basis, is an irritation to the passenger and is a sign that a process within the application is not executing properly.
➤ System Reset Count: An excessive number of manual or automatic system resets can be a sign of problems with an IFE system. However, an excessive number of manual resets may not be a fair indication of system degradation: the cabin crew has no tools to fix a poorly operating IFE system, so a manual reset in that situation may have no correlation to the root cause of the malfunction. All of these peripheral indicators need to be taken into consideration when performing post-flight analysis of the data.
➤ LRU/Seat Reset Count: An excessive number of manual or automatic LRU or seat resets signals possible issues with specific head-end LRUs or LRUs installed in the seat.
The above is an abbreviated list of the measures identified as “key parameters” that can indicate system abnormalities or upset, viewed from a passenger perspective, that can be recorded and analyzed to identify error trends and system degradation.
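To make the parameters above concrete, the following sketch shows one possible shape for a per-seat, per-flight record and a check against limits. The field names, the `SeatFlightRecord` class and every threshold are assumptions made for illustration; the specification, as described here, supplies the framework and not the quantitative limits.

```python
from dataclasses import dataclass

# Hypothetical limits; a real program would derive these from fleet data.
LOSS_LIMIT = 0.01      # fraction of media packets lost
UNDERRUN_LIMIT = 5     # AVOD buffer under-runs per flight
RESET_LIMIT = 2        # seat resets per flight


@dataclass
class SeatFlightRecord:
    seat: str
    packets_expected: int
    packets_received: int
    buffer_underruns: int
    resets: int

    @property
    def packet_loss(self) -> float:
        """Fraction of expected media packets that never arrived."""
        if self.packets_expected == 0:
            return 0.0
        lost = self.packets_expected - self.packets_received
        return lost / self.packets_expected


def findings(record: SeatFlightRecord) -> list:
    """List the parameters on which this seat exceeded its limit."""
    exceeded = []
    if record.packet_loss > LOSS_LIMIT:
        exceeded.append("packet loss")
    if record.buffer_underruns > UNDERRUN_LIMIT:
        exceeded.append("buffer under-run")
    if record.resets > RESET_LIMIT:
        exceeded.append("seat reset count")
    return exceeded


suspect = SeatFlightRecord("23B", packets_expected=200_000,
                           packets_received=196_000,
                           buffer_underruns=9, resets=1)
print(suspect.seat, findings(suspect))
```

Because every field is a counter the seat client or server can keep on its own, a record like this needs neither human observation nor auxiliary test equipment.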
Another important facet of SPHM is that the monitoring and collection of these parameters must not affect system performance or impact operational storage within the system. Therefore, it is very important to determine the optimum number of parameters being collected, the periodicity of data collection and the effect of the monitoring and data collection method on system performance.
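The trade-off between the number of parameters, the collection period and storage can be estimated with simple arithmetic before anything is installed. The sketch below is a rough sizing aid only; the seat count, parameter count, sample size and flight length are hypothetical, and it ignores any record headers or compression a real system would use.

```python
def log_bytes_per_flight(seats: int, parameters: int, bytes_per_sample: int,
                         interval_s: int, flight_hours: int) -> int:
    """Raw storage needed to log every parameter at every seat for a flight."""
    samples_per_flight = (flight_hours * 3600) // interval_s
    return seats * parameters * bytes_per_sample * samples_per_flight


# Hypothetical cabin: 300 seats, 6 parameters, 8 bytes per sample, 10-hour flight.
per_minute = log_bytes_per_flight(300, 6, 8, interval_s=60, flight_hours=10)
per_second = log_bytes_per_flight(300, 6, 8, interval_s=1, flight_hours=10)
print(f"60 s period: {per_minute / 1e6:.2f} MB")
print(f" 1 s period: {per_second / 1e6:.2f} MB")
```

Under these assumptions, moving from a 60-second to a 1-second collection period multiplies the stored volume by 60, which is why the periodicity has to be chosen together with the parameter count.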
SPHM will aid in determining IFE system performance and health and help airlines and IFE suppliers to proactively maintain these very complex “enterprise” type networks and systems.
Gerald Lui-Kwan is an associate technical fellow of Network Systems at Boeing Commercial Airplanes. He is co-chairman of the ARINC AEEC Cabin Systems Subcommittee.