<?xml version="1.0" encoding="US-ASCII"?>
<!DOCTYPE rfc SYSTEM "rfc2629.dtd">
<?rfc toc="yes"?>
<?rfc tocompact="yes"?>
<?rfc tocdepth="3"?>
<?rfc tocindent="yes"?>
<?rfc symrefs="yes"?>
<?rfc sortrefs="yes"?>
<?rfc comments="yes"?>
<?rfc inline="yes"?>
<?rfc compact="yes"?>
<?rfc subcompact="no"?>
<rfc category="std" docName="draft-ietf-idr-link-bandwidth-09"
     ipr="trust200902">
  <front>
    <title abbrev="BGP Link Bandwidth Extended Community">BGP Link Bandwidth
    Extended Community</title>

    <author fullname="Pradosh Mohapatra" initials="P." surname="Mohapatra">
      <organization>Sproute Networks</organization>

      <address>
        <email>pradosh@sproute.com</email>
      </address>
    </author>

    <author fullname="Rex Fernando" initials="R." surname="Fernando">
      <organization>Cisco Systems</organization>

      <address>
        <postal>
          <street>170 W. Tasman Drive</street>

          <city>San Jose</city>

          <region>CA</region>

          <code>95134</code>

          <country>US</country>
        </postal>

        <email>rex@cisco.com</email>
      </address>
    </author>

    <author fullname="Reshma Das" initials="R." role="editor" surname="Das">
      <organization>Juniper Networks, Inc.</organization>

      <address>
        <postal>
          <street>1133 Innovation Way</street>

          <city>Sunnyvale</city>

          <region>CA</region>

          <code>94089</code>

          <country>US</country>
        </postal>

        <email>dreshma@juniper.net</email>
      </address>
    </author>

    <author fullname="Satya Mohanty" initials="S." role="editor"
            surname="Mohanty">
      <organization>Zscaler</organization>

      <address>
        <postal>
          <street>120 Holger Way</street>

          <city>San Jose</city>

          <region>CA</region>

          <code>95134</code>

          <country>US</country>
        </postal>

        <email>smohanty@zscaler.com</email>
      </address>
    </author>

    <author fullname="Mankamana Mishra" initials="M." surname="Mishra">
      <organization>Cisco Systems</organization>

      <address>
        <postal>
          <street>821 Alder Drive</street>

          <city>Milpitas</city>

          <region>CA</region>

          <code>95035</code>

          <country>US</country>
        </postal>

        <email>mankamis@cisco.com</email>
      </address>
    </author>

    <author fullname="Rafal Jan Szarecki" initials="R.J." surname="Szarecki">
      <organization>Google LLC</organization>

      <address>
        <postal>
          <street>1160 N Mathilda Ave</street>

          <city>Sunnyvale</city>

          <region>CA</region>

          <code>94089</code>

          <country>US</country>
        </postal>

        <email>rszarecki@gmail.com</email>
      </address>
    </author>

    <date day="16" month="09" year="2024"/>

    <abstract>
      <t>This document describes an application of BGP extended communities
      that allows a router to perform unequal cost load balancing.</t>
    </abstract>

    <note title="Requirements Language">
      <t>The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
      "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
      document are to be interpreted as described in <xref
      target="RFC2119"/>.</t>
    </note>
  </front>

  <middle>
    <section title="Introduction">
      <t>Load balancing is a critical aspect of network design, enabling
      efficient utilization of available bandwidth and improving overall
      network performance. Traditional equal-cost multi-path (ECMP) routing
      does not account for the varying capacities of different paths. This
      document specifies that the external link bandwidth be carried in the
      network using one of two new extended communities <xref
      target="RFC4360"/>: the transitive and the non-transitive Link Bandwidth
      Extended Community. The Link Bandwidth Extended Community provides a
      mechanism for routers to advertise the bandwidth of their downstream
      path(s), facilitating more efficient utilization of network resources.</t>
    </section>

    <section title="Link Bandwidth Extended Community">
      <t>The Link Bandwidth Extended Community is a BGP extended community
      that carries the bandwidth information of a router, represented by the
      BGP Protocol Next Hop, connecting to a remote network. This community
      can be used to inform other routers about the bandwidth available
      through a given route.</t>

      <t>The Link Bandwidth Extended Community can be either transitive or
      non-transitive. Therefore, the value of the high-order octet of the
      extended Type field can be 0x00 or 0x40, respectively. The value of the
      low-order octet of the extended Type field for this community is 0x04.
      The value of the Global Administrator subfield in the Value field SHOULD
      represent the Autonomous System of the router that attaches the Link
      Bandwidth Community, but it can be set to any 2-octet value. If the
      four-octet AS numbering scheme is used <xref target="RFC6793"/>, AS_TRANS
      should be used in the Global Administrator subfield. The bandwidth of
      the link is expressed as 4 octets in <xref target="IEEE.754-2019"/>
      floating-point format, in units of bytes (not bits) per second. It is
      carried in the Local Administrator subfield of the Value field.</t>

      <figure anchor="LBWExtCom" suppress-title="false"
              title="Link Bandwidth Extended Community">
        <artwork align="left" xml:space="preserve">
 0                   1                   2                   3
 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|Type=0x00/0x40 | SubType=0x04  |           AS Number           |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                    Link Bandwidth Value                       |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

 Type:   1-octet field MUST be set to 0x00 or 0x40 
         to indicate transitive/non-transitive.

 SubType: 1-octet field MUST be set to 0x04 
          to indicate 'Link-Bandwidth'.

 Global Administrator sub-field: 
          2-octet field representing the Autonomous System.

 Local Administrator sub-field: 
          Bandwidth value (bytes per sec) encoded as 4 octets
          in IEEE floating point format.
</artwork>
      </figure>
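
      <t>As an illustration only, the encoding described above can be
      sketched in Python. The function names, the substitution of AS_TRANS
      (23456) for any AS number above 65535, and the use of the standard
      struct module are assumptions of this sketch and are not part of this
      specification.</t>

      <figure>
        <artwork><![CDATA[
```python
import struct

# Assumption of this sketch: AS_TRANS as assigned by RFC 6793.
AS_TRANS = 23456

TYPE_TRANSITIVE = 0x00
TYPE_NON_TRANSITIVE = 0x40
SUBTYPE_LINK_BANDWIDTH = 0x04

# Network byte order: Type (1), SubType (1), Global Administrator (2),
# Local Administrator as a 4-octet IEEE 754 floating-point value.
_LAYOUT = "!BBHf"


def encode_link_bandwidth(asn, bytes_per_sec, transitive=True):
    """Return the 8-octet community; hypothetical helper name."""
    type_high = TYPE_TRANSITIVE if transitive else TYPE_NON_TRANSITIVE
    # A four-octet AS number does not fit in 2 octets, so use AS_TRANS.
    global_admin = asn if asn <= 0xFFFF else AS_TRANS
    return struct.pack(_LAYOUT, type_high, SUBTYPE_LINK_BANDWIDTH,
                       global_admin, bytes_per_sec)


def decode_link_bandwidth(data):
    """Parse 8 octets; accept both transitive and non-transitive types."""
    type_high, subtype, global_admin, bandwidth = struct.unpack(_LAYOUT, data)
    if (type_high not in (TYPE_TRANSITIVE, TYPE_NON_TRANSITIVE)
            or subtype != SUBTYPE_LINK_BANDWIDTH):
        raise ValueError("not a Link Bandwidth Extended Community")
    return {
        "transitive": type_high == TYPE_TRANSITIVE,
        "asn": global_admin,
        "bytes_per_sec": bandwidth,  # bytes, not bits, per second
    }
```
]]></artwork>
      </figure>

      <t>For example, a 10 Gbit/s link corresponds to 1.25e9 bytes per
      second, which is the value such a sketch would place in the Local
      Administrator subfield.</t>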
    </section>

    <section title="Protocol Procedures">
      <section anchor="Originator"
               title="Sender (Originating Link Bandwidth Community)">
        <t>An originator of the link bandwidth community SHOULD be able to
        originate either a transitive or a non-transitive link bandwidth
        extended community. An implementation SHOULD provide configuration to
        set the transitivity type of the link bandwidth community, as well as
        the Global Administrator field value and the bandwidth value (Local
        Administrator field), through local policy. No more than one link
        bandwidth extended community SHALL be attached to a route.</t>

        <t>An originator can attach the link bandwidth community to a BGP
        path either in egress processing, in which case it is present in the
        Adj-RIB-Out only, or in ingress processing, in which case the link
        bandwidth community is present in the Local-RIB.</t>

        <t>Note: An implementation MAY provide a configuration option (knob)
        to allow sending the non-transitive link bandwidth extended community
        on external BGP sessions.</t>
      </section>

      <section anchor="Receiver"
               title="Receiver (Receiving Link Bandwidth Community)">
        <t>A BGP receiver MUST be able to process a link bandwidth community
        of either transitive or non-transitive type. The receiver MUST NOT flap or
        treat the route as malformed based on the transitivity of the link
        bandwidth community and/or BGP session type (internal vs.
        external).</t>

        <t>Note: An implementation MAY provide a configuration option (knob)
        to accept the non-transitive link bandwidth extended community from
        external BGP sessions.</t>
      </section>

      <section anchor="Re-advertisement" title="Re-advertisement Procedures">
        <section anchor="NextHopSelf"
                 title="Re-advertisement with Next Hop Self">
          <t>When a BGP speaker re-advertises a route with the Link Bandwidth
          Extended Community and sets the next hop to itself, it SHOULD follow
          the same procedures as outlined in <xref target="Originator"/>.</t>

          <t>In the absence of any import or export policies that alter the
          Link Bandwidth Extended Community, any received Link Bandwidth
          extended community on the route will be re-advertised unchanged, in
          accordance with standard BGP procedures.</t>
        </section>

        <section anchor="NextHopUnchanged"
                 title="Re-advertisement with Next Hop Unchanged">
          <t>A BGP speaker that receives a route with a link bandwidth
          community and re-advertises or reflects it without changing its
          next hop SHOULD NOT change the link bandwidth extended community in
          any way.</t>
        </section>
      </section>

      <section title="Link Bandwidth Community Arithmetic and BGP Multipath">
        <t>In a BGP multipath ECMP environment, the value of the link
        bandwidth community that is sent or re-advertised may be calculated
        based on the link bandwidth communities of the routes contributing to
        multipath in the Local Routing Information Base (Local-RIB). This
        topic is beyond the scope of this document.</t>
      </section>
    </section>

    <section anchor="Error" title="Error Handling">
      <t>If a receiver receives a route with more than one Link Bandwidth
      Extended Community, it SHOULD:<list>
          <t>Prefer the lowest value of the attached link bandwidth community
          (irrespective of the transitivity).</t>

          <t>Prefer the transitive Link Bandwidth Extended Community when
          choosing between transitive and non-transitive types that have the
          same value.</t>
        </list></t>

      <t>Implementations MAY provide configuration options (knobs) to change
      the above preference.</t>
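
      <t>As an illustration only, the default preference above can be
      sketched in Python. The LinkBandwidth type and the function name are
      hypothetical and exist only for this sketch. A real implementation
      would operate on its own representation of received communities and
      might apply a configured preference instead.</t>

      <figure>
        <artwork><![CDATA[
```python
from typing import NamedTuple, Sequence


class LinkBandwidth(NamedTuple):
    """Hypothetical in-memory form of one received community."""
    transitive: bool
    bytes_per_sec: float


def select_link_bandwidth(communities: Sequence[LinkBandwidth]) -> LinkBandwidth:
    """Pick one community when a route carries more than one.

    Rule 1: prefer the lowest bandwidth value, whatever the transitivity.
    Rule 2: on equal values, prefer the transitive community.
    """
    if not communities:
        raise ValueError("no Link Bandwidth Extended Community present")
    # False sorts before True, so "not transitive" ranks transitive first
    # among entries whose bandwidth values are equal.
    return min(communities, key=lambda c: (c.bytes_per_sec, not c.transitive))
```
]]></artwork>
      </figure>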
    </section>

    <section anchor="History" title="Document History">
      <t>The BGP Link Bandwidth Extended Community has evolved over several
      versions of the IETF draft. In the earlier versions up to
      draft-ietf-idr-link-bandwidth-08, only the non-transitive version of the
      link bandwidth extended community was supported. However, starting from
      draft-ietf-idr-link-bandwidth-09, both transitive and non-transitive
      versions of the link bandwidth extended community are supported.</t>

      <t>An old sender/receiver is a BGP speaker that either uses the
      procedures up to draft-ietf-idr-link-bandwidth-08
      (https://datatracker.ietf.org/doc/html/draft-ietf-idr-link-bandwidth-08)
      or exhibits any undocumented behavior for the link bandwidth extended
      community.</t>

      <t>A new sender/receiver is a BGP speaker that implements procedures
      specified in this document.</t>

      <t>A receiving speaker needs to be upgraded to support the procedures
      defined in this document to provide full interoperability with both
      transitive and non-transitive versions of the link bandwidth extended
      community. To keep the changes to procedures simple, it is not a goal
      to provide interoperability between an old receiver and a new
      sender.</t>
    </section>

    <section anchor="IANA" title="IANA Considerations">
      <t>This document defines a specific application of the two-octet AS
      specific extended community. IANA is requested to assign a sub-type
      value of 0x04 for the link bandwidth extended community.<figure>
          <artwork>    Name                                           Value
    ----                                           -----
    non-transitive Link Bandwidth Ext. Community  0x4004</artwork>
        </figure><figure>
          <artwork>    Name                                           Value
    ----                                           -----
    transitive Link Bandwidth Ext. Community       0x0004</artwork>
        </figure></t>
    </section>

    <section anchor="Security" title="Security Considerations">
      <t>There are no additional security risks introduced by this design.</t>
    </section>

    <section anchor="Contributors" title="Contributors">
      <author fullname="Kaliraj Vairavakkalai" initials="K."
              surname="Vairavakkalai">
        <organization>Juniper Networks, Inc.</organization>

        <address>
          <postal>
            <street>1133 Innovation Way</street>

            <city>Sunnyvale</city>

            <region>CA</region>

            <code>94089</code>

            <country>US</country>
          </postal>

          <email>kaliraj@juniper.net</email>
        </address>
      </author>

      <author fullname="Natrajan Venkataraman" initials="N."
              surname="Venkataraman">
        <organization>Juniper Networks, Inc.</organization>

        <address>
          <postal>
            <street>1133 Innovation Way</street>

            <city>Sunnyvale</city>

            <region>CA</region>

            <code>94089</code>

            <country>US</country>
          </postal>

          <email>natv@juniper.net</email>
        </address>
      </author>
    </section>

    <section anchor="Acknowledgments" title="Acknowledgments">
      <t>The authors would like to thank Yakov Rekhter, Srihari Sangli and Dan
      Tappan for proposing unequal cost load balancing as one possible
      application of the extended community attribute.</t>

      <t>The authors would like to thank Bruno Decraene, Robert Raszuk, Joel
      Halpern, Aleksi Suhonen, Randy Bush, Jeff Haas and John Scudder for
      their comments and contributions.</t>
    </section>
  </middle>

  <back>
    <references title="Normative References">
      <?rfc include="https://xml2rfc.tools.ietf.org/public/rfc/bibxml/reference.RFC.2119.xml"?>

      <?rfc include="https://xml2rfc.tools.ietf.org/public/rfc/bibxml/reference.RFC.4360.xml"?>

      <?rfc include="https://xml2rfc.tools.ietf.org/public/rfc/bibxml/reference.RFC.6793.xml"?>

      <reference anchor="IEEE.754-2019"
                 target="https://ieeexplore.ieee.org/document/8766229">
        <front>
          <title abbrev="IEEE-754-2019">IEEE Standard for Floating-Point
          Arithmetic</title>

          <author>
            <organization showOnFrontPage="true">IEEE</organization>
          </author>

          <date day="22" month="July" year="2019"/>
        </front>
      </reference>
    </references>
  </back>
</rfc>
