A Deeper Dive into AT&T ECOMP

Even a superficial review of AT&T ECOMP shows it’s a whole different way of looking at the virtualization/softwarization of networking.  The master architecture diagram is a picture of start-to-finish service lifecycle management: the bottom is a redrawing of SDN and NFV concepts, and the middle is a modeling approach that seems to draw from multiple sources.  There are no specific references to any standards in the diagram, and the term “ECOMP” appears only twice.

The reason for this is simple, I think.  ECOMP is a new glue that binds and somewhat reshapes things that AT&T had always believed would be part of its Domain 2.0 (D2) architecture.  That architecture was designed to create a network where vendors fit into a specific set of silos and were strongly discouraged from slopping across multiple zones to lock AT&T into their approach, and ECOMP is designed the same way.  In fact, it goes beyond D2 in that regard.  By framing services in a D2 way, ECOMP makes D2 real and not just a set of zones and boundaries.

It’s a fascinating model, and I have to open by saying I’m reviewing ECOMP based on less information than I’d like.  Until the full source code is released we won’t have all the details of the current implementation, and I also expect that ECOMP will evolve as AT&T gains experience.  It will evolve faster, and further, if Orange and other operators now looking at (or trialing) ECOMP decide to go with it.  That could make ECOMP the most important development in SDN/NFV.

Functionally, ECOMP divides into two parallel frameworks, a design-time framework that builds and sustains the models and policies, and a runtime framework that applies them to the service lifecycle and network infrastructure.  The latter is repeatedly linked to D2 with references like “manages the full lifecycle of D2 infrastructure”, and in the various diagrams and texts, it’s possible to see many holdovers from the early D2 work.

The “left side” or design part of the service lifecycle process is rooted in the AT&T Service Design and Creation (ASDC) element, and it also includes policy and analytics application design.  The models themselves seem to be somewhat based on the TM Forum model, combined with a dash of TOSCA, but the details are still murky because the code hasn’t yet been released for public review.  ASDC feeds a related component that acts as a model and policy repository and distributor, and the common services element at the heart of ECOMP provides an interface to this repository.
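To make that repository-and-distributor idea concrete, here’s a minimal Python sketch of how a design-time model might be published and then fetched at runtime.  The names (ServiceModel, ModelRepository, and so on) are my own illustrations, not ASDC’s or the common-services layer’s actual APIs.

```python
from dataclasses import dataclass, field
from typing import Dict, List

@dataclass
class ServiceModel:
    """A hypothetical design-time artifact: a service model plus its policies."""
    name: str
    version: str
    tosca_template: str                          # TOSCA-like service description
    policies: List[str] = field(default_factory=list)

class ModelRepository:
    """Illustrative stand-in for the repository/distributor that ASDC feeds.

    Runtime components would pull (or be pushed) the models and policies they
    need through a common-services interface along these lines.
    """
    def __init__(self) -> None:
        self._store: Dict[str, ServiceModel] = {}

    def publish(self, model: ServiceModel) -> None:
        # Design-time side: ASDC-like tooling publishes a new model version.
        self._store[f"{model.name}:{model.version}"] = model

    def fetch(self, name: str, version: str) -> ServiceModel:
        # Runtime side: orchestrators and controllers retrieve what they need.
        return self._store[f"{name}:{version}"]
```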

There are two high-layer ECOMP elements, the Master Service Orchestrator (MSO) and the Active and Available Inventory (A&AI).  The former does the orchestrating based on a catalog of “recipes” and is roughly analogous to an expanded NFV MANO function.  The latter is the real-time view into the D2 environment, covering both resources and services, and in my view it represents something that’s not explicit in the ETSI model at all.
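To illustrate why a real-time inventory spanning both resources and services matters, here’s a hypothetical sketch of the kind of question such a view can answer: which services are impacted when a given resource fails.  The data model and names are mine, not A&AI’s.

```python
from dataclasses import dataclass
from typing import Dict, List

@dataclass
class ResourceRecord:
    resource_id: str
    resource_type: str        # e.g. "vm", "vnf-instance", "vlan"
    status: str               # e.g. "active", "failed"

@dataclass
class ServiceRecord:
    service_id: str
    customer: str
    resource_ids: List[str]   # the resources this service currently rides on

class ActiveInventory:
    """Toy stand-in for an A&AI-like view of both resources and services."""
    def __init__(self) -> None:
        self.resources: Dict[str, ResourceRecord] = {}
        self.services: Dict[str, ServiceRecord] = {}

    def services_impacted_by(self, resource_id: str) -> List[ServiceRecord]:
        # Answer the operational question: who is riding on this resource?
        return [s for s in self.services.values() if resource_id in s.resource_ids]

inv = ActiveInventory()
inv.resources["vm-42"] = ResourceRecord("vm-42", "vm", "failed")
inv.services["svc-1"] = ServiceRecord("svc-1", "acme", ["vm-42", "vlan-7"])
print([s.service_id for s in inv.services_impacted_by("vm-42")])
```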

The lower-layer elements of ECOMP are perhaps the most critical to vendors, because it’s here that D2 largely places outside servers, devices, and software.  Again, there are two primary components in this area.  One is the Data Collection, Analytics, and Events (DCAE) element that handles telemetry and analysis, and the other is what’s perhaps the most critical element of all, the collection of Controllers.

Orchestration in ECOMP is a two-level process.  The MSO handles high-level, end-to-end service orchestration and then hands off its needs to one of a number of Controllers, each of which is a resource-domain-specialist orchestrator that turns high-level requests into specific commitments of resources.  ECOMP defines three Controller types: Infrastructure (or cloud), Network, and Application.
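Here’s a minimal Python sketch of that two-level pattern as I read it: the MSO walks a recipe from its catalog and hands each step to the Controller type that owns the relevant domain.  The class names, recipe format, and steps are my own illustrations, not ECOMP’s.

```python
from typing import Dict, List, Tuple

class Controller:
    """Illustrative domain orchestrator (Infrastructure, Network, or Application)."""
    def __init__(self, domain: str) -> None:
        self.domain = domain

    def realize(self, request: str) -> str:
        # A real controller would run domain-specific orchestration steps or
        # distribute policies; here it just reports what it was asked to do.
        return f"{self.domain} controller realized: {request}"

class MasterServiceOrchestrator:
    """Illustrative MSO: executes a recipe by delegating steps to Controllers."""
    def __init__(self, controllers: Dict[str, Controller]) -> None:
        self.controllers = controllers
        # A "recipe" here is just an ordered list of (controller-type, request) steps.
        self.catalog: Dict[str, List[Tuple[str, str]]] = {
            "vpn-service": [
                ("infrastructure", "spin up hosting VMs"),
                ("application", "deploy and configure the vRouter VNF"),
                ("network", "stitch access and core connectivity"),
            ]
        }

    def deploy(self, recipe_name: str) -> List[str]:
        return [self.controllers[ctype].realize(request)
                for ctype, request in self.catalog[recipe_name]]

mso = MasterServiceOrchestrator(
    {d: Controller(d) for d in ("infrastructure", "network", "application")}
)
print(mso.deploy("vpn-service"))
```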

This structure divides responsibility between the MSO and the Controllers, with the former handling high-level deployment and redeployment actions that have broad scope and minute-level response requirements, and the latter handling the responses that have to happen in seconds.  The Controllers are like ETSI Infrastructure Managers, except that ECOMP explicitly assigns some level of orchestration responsibility to them, which ETSI should do but does not (so far).

AT&T seems to envision “orchestration” at the Controller level to be a combination of specific orchestration steps (DevOps- or “NetOps”-like) and policy distribution.  The combination seems appropriate given that the domains for each Controller could involve both legacy and virtual elements.  The implication of the structure is that a Controller is given a mission to deploy and sustain a service element (from its associated domain) and will be responsible for event-handling as long as the Controller can do what it’s expected to do.  If it can’t, then it would buck the case upward to the MSO for handling.
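That escalation pattern might look roughly like the sketch below.  The event types, the idea of a fixed set of local remedies, and the handler names are all assumptions on my part for illustration.

```python
class DomainController:
    """Illustrative controller-side lifecycle handling with upward escalation."""
    LOCAL_FIXES = {"vm-restart-needed", "link-flap"}   # assumed domain-local remedies

    def __init__(self, escalate_to_mso) -> None:
        self.escalate_to_mso = escalate_to_mso

    def handle_event(self, event: dict) -> str:
        if event["type"] in self.LOCAL_FIXES:
            # Seconds-scale, domain-local response: the controller acts on its own.
            return f"controller remediated {event['type']} locally"
        # Outside the controller's scope: buck the case upward to the MSO,
        # which can redeploy with broader, minute-scale actions.
        return self.escalate_to_mso(event)

def mso_handler(event: dict) -> str:
    return f"MSO redeploying service after {event['type']}"

ctrl = DomainController(escalate_to_mso=mso_handler)
print(ctrl.handle_event({"type": "vm-restart-needed"}))   # handled locally
print(ctrl.handle_event({"type": "site-power-loss"}))     # escalated
```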

The MSO seems responsible for activating individual elements, meaning that the recipes it works from would have to be fairly detailed in terms of the steps needed.  The Controllers carry out the requests of the MSO, but as I noted they also respond to lifecycle management events.  This makes the Controllers a mixture of ETSI functions.  The Infrastructure Controller is surely similar to ETSI’s Virtualized Infrastructure Manager (VIM), but the VIM is singular while the ECOMP model divides the role between the Network and Infrastructure (meaning cloud) Controllers.  The Application Controller is analogous to the ETSI VNF Manager in some ways.

This approach may sound strange, particularly to those versed in the ETSI model, but it’s more logical than the ETSI original.  Controllers are passed “policies” from the MSO, and they have their own orchestration/policy mechanisms to cycle through the lifecycles of the things they build.  All of this is supported by telemetry that’s collected everywhere and distributed to where it’s needed, using a publish-and-subscribe model.
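The publish-and-subscribe telemetry flow could be sketched along these lines.  A real DCAE-style pipeline would ride a message bus rather than in-process callbacks, and the topic names are mine, but the shape is the point: collectors publish everywhere, and analytics functions subscribe only to what they need.

```python
from collections import defaultdict
from typing import Callable, Dict, List

class TelemetryBus:
    """Toy publish-and-subscribe bus: collectors publish, consumers subscribe."""
    def __init__(self) -> None:
        self._subscribers: Dict[str, List[Callable[[dict], None]]] = defaultdict(list)

    def subscribe(self, topic: str, handler: Callable[[dict], None]) -> None:
        self._subscribers[topic].append(handler)

    def publish(self, topic: str, measurement: dict) -> None:
        # Fan each measurement out to every subscriber on its topic.
        for handler in self._subscribers[topic]:
            handler(measurement)

bus = TelemetryBus()
# A DCAE-like analytics function subscribes to the metrics it cares about...
bus.subscribe("vnf.cpu", lambda m: print("analytics saw:", m))
# ...and collectors anywhere in the infrastructure publish into the bus.
bus.publish("vnf.cpu", {"vnf": "vFW-3", "cpu_pct": 91})
```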

ECOMP is a big advance in SDN/NFV architecture, a major improvement in nearly every way.  That doesn’t mean that it’s entirely without questions, perhaps even issues.  Again I have to stress that the details here are sketchy because the code’s not released, but I think there’s enough to comment on.

The big issue in ECOMP remains the VNFs themselves.  A “virtual network function” is generally seen as a function transplanted from an appliance and strung into a service chain.  Every appliance is different, and as a result there’s no standard way of hosting one of these functions.  Each presumably has its own set of interfaces and its own parameters, and all of this variability could be handled in only two ways: require that it be shed in favor of a standard set of API dependencies (what I’ve called a “VNF PaaS” model), or nest custom code with each VNF to provide it what it needs and to interface with the outside world.  Even the latter would require some standard approach to harmonization.  Neither ETSI’s work nor ECOMP mandates either of these two approaches, and without them there’s still too much “snowflake” variability and not enough “Lego” interchangeability in the VNFs.
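To show what a “VNF PaaS” could mean in practice, here’s a hypothetical standard contract that every VNF package would have to implement, along with the adapter alternative of wrapping vendor code to meet it.  Neither ETSI nor ECOMP mandates anything like this today, which is exactly the gap I’m describing.

```python
from abc import ABC, abstractmethod

class StandardVNF(ABC):
    """Hypothetical "VNF PaaS" contract: if every VNF exposed this one API,
    onboarding would be Lego-like instead of snowflake-like."""

    @abstractmethod
    def configure(self, parameters: dict) -> None:
        """Accept parameters in a standard schema rather than a vendor CLI."""

    @abstractmethod
    def start(self) -> None:
        """Bring the function into service."""

    @abstractmethod
    def health(self) -> dict:
        """Report state in a standard form the lifecycle manager understands."""

class VendorFirewallAdapter(StandardVNF):
    """The alternative path: nest custom code that maps a vendor's native
    interfaces onto the standard contract."""
    def configure(self, parameters: dict) -> None:
        pass  # translate standard parameters into the vendor's native config

    def start(self) -> None:
        pass  # invoke the vendor's own start-up mechanism

    def health(self) -> dict:
        return {"status": "up"}  # map vendor telemetry into the standard form
```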

The second issue is in the Controller design.  It appears from the ECOMP material that while there are three types of controllers, there could be multiple instances of each type corresponding to specific domains.  I’d take “domains” here to mean resource/administrative domains, meaning areas under a common management jurisdiction.  That’s a good idea, and it could also contribute to federation if some of the domains were “foreign”, which appears to be possible given the implementation description.

What’s not quite clear is whether the instances of a given Controller type all share a common implementation.  In some places the material seems to suggest that they do, and in others that there might be different implementations to accommodate different technologies deployed in the controlled regions.  This isn’t a simple semantic point; if there is only one controller implementation for each type, then every controller would have to know a lot of things to reflect all the variability in implementation within its domain.  Or, the domains would have to be totally homogeneous from a control and functional perspective.
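One way to keep a single implementation per Controller type while still accommodating heterogeneous domains is a driver-or-plugin split, sketched hypothetically below.  Whether ECOMP actually works this way isn’t clear from the material released so far.

```python
from typing import Dict

class DomainDriver:
    """Hypothetical per-domain plugin that hides technology differences."""
    def __init__(self, technology: str) -> None:
        self.technology = technology

    def apply(self, intent: str) -> str:
        # Translate a generic intent into technology-specific actions.
        return f"[{self.technology}] applied: {intent}"

class NetworkController:
    """One controller implementation; variability is pushed into domain drivers."""
    def __init__(self) -> None:
        self.drivers: Dict[str, DomainDriver] = {}

    def register_domain(self, domain: str, driver: DomainDriver) -> None:
        self.drivers[domain] = driver

    def realize(self, domain: str, intent: str) -> str:
        return self.drivers[domain].apply(intent)

nc = NetworkController()
nc.register_domain("metro-east", DomainDriver("openflow"))
nc.register_domain("core", DomainDriver("legacy-mpls"))
print(nc.realize("metro-east", "create overlay tunnel"))
```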

The final point is that orchestration of operations functions is still undefined.  It’s not that ECOMP precludes it, but that it doesn’t presume a common modeling and orchestration flow that starts at the OSS/BSS and moves downward to the services.  Operators will vary in how much they rely on OSS/BSS tools for specific service lifecycle processes, and so it’s not clear how much operations efficiency benefit might be left on the table.

OK, overall, ECOMP isn’t perfect, but it’s a massive forward step in an industry that’s been dawdling around the key points for all too long.  I’m sure it will mature with adoption, and if ECOMP is successfully promoted beyond AT&T, that will push things further and faster.  I’ll probably revisit ECOMP down the line as it evolves and more details become available.